Otherwise, a KeyboardInterrupt may lead to an unpulled revision being
incorrectly saved as pulled in the lastpulled file. This will lead to
the interrupted revision being incorrectly skipped at the next pull,
leading to an incorrect conversion -- one might even say corrupt.
Due to its nature of requiring a manual interrupt, this bug is
difficult to test.
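The invariant at stake can be sketched as follows. This is only an
illustration, not the hgsubversion code: pull_revisions(), the convert
callback and the state dictionary standing in for the lastpulled file
are all hypothetical names.

```python
def pull_revisions(revisions, convert, state):
    """Convert revisions in order, recording progress only after success.

    `convert` and `state` are hypothetical stand-ins for the real
    conversion routine and the lastpulled file.
    """
    try:
        for rev in revisions:
            convert(rev)
            # Only reached when convert() returned normally, so an
            # interrupted revision is never recorded as pulled.
            state["lastpulled"] = rev
    except KeyboardInterrupt:
        # Stop early; state still names the last complete revision.
        pass
    return state["lastpulled"]


def flaky_convert(rev):
    # Simulate the user pressing Ctrl-C while revision 3 is converted.
    if rev == 3:
        raise KeyboardInterrupt


state = {"lastpulled": 0}
last = pull_revisions([1, 2, 3, 4], flaky_convert, state)
```

Simulating the interrupt this way also gives a cheap regression check
for the ordering, even though a real Ctrl-C remains hard to test.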
- Since there are five handlers, move from svn_auth_ssl_server_trust_prompt() to a SubversionPrompt class.
- Moved the check that the callback function is non-None out of the handler.
If the server certificate is untrusted when connecting to a Subversion
repository over SSL, respond with a message similar to the svn client's.
The user can then choose to accept permanently, accept temporarily, or
reject.
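A minimal sketch of mapping the answer to one of the three outcomes.
parse_trust_answer() and the one-letter replies are assumptions made
for illustration; they are not taken from the SubversionPrompt code.

```python
def parse_trust_answer(answer):
    """Map a one-letter reply to a decision; unknown input rejects."""
    choices = {"p": "permanent", "t": "temporary", "r": "reject"}
    # Anything unrecognised, including an empty reply, is treated as a
    # rejection so an untrusted certificate is never accepted by accident.
    return choices.get(answer.strip().lower()[:1], "reject")
```

Defaulting to rejection is a design choice of this sketch, picked
because it is the safe outcome for an untrusted certificate.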
Missing files were stored directly in RevisionMeta and resolved after
the revision was replayed. It means the missing files set was not pruned
by delete_entry() actions or by the filemap, and some of them were
fetched for no reason.
Say you convert:
A branch/foo/bar (from trunk/foo/bar:123)
with a filemap excluding "foo/bar". Since the directory was excluded in
trunk the files cannot be found and were marked as missing even though
they were discarded afterwards.
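The pruning described above can be sketched like this. The function
name, the branch-relative paths and the prefix-based exclusion test are
assumptions for the sake of the example, not the actual RevisionMeta
or filemap interfaces.

```python
def prune_missing(missing, deleted_dirs, filemap_excludes):
    """Return the missing paths that still need to be fetched."""

    def under(path, prefix):
        # True when path is prefix itself or lives below it.
        return path == prefix or path.startswith(prefix + "/")

    kept = set()
    for path in missing:
        if any(under(path, d) for d in deleted_dirs):
            continue  # removed again by a delete_entry() action
        if any(under(path, e) for e in filemap_excludes):
            continue  # excluded by the filemap, would be discarded
        kept.add(path)
    return kept


missing = {"foo/bar/big.bin", "foo/baz.txt", "old/readme"}
to_fetch = prune_missing(missing, deleted_dirs={"old"},
                         filemap_excludes={"foo/bar"})
```

With "foo/bar" excluded, only "foo/baz.txt" is left to fetch.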
Files brought by a copied add_directory() were processed despite being
excluded by the filemap. This was also the case with added files. The
conversion was still correct because they were eventually filtered out
in the replay.convert_rev() but processing them in itself may be
problematic. Filemaps are often used to exclude large binary files and
before this change, some of them could be marked as missing and be
fetched before being discarded.
A test configuration entry named hgsubversion.failoninvalidreplayfile
was added to help testing this case. It should become the default
behaviour in the future.
Reverting a branch with a remove followed by a copy results in a branch
replacement. By default, branch replacements are handled by closing the
replaced branch and committing the new branch on top of it. But we do
not really want that when reverting a branch; we only want a linear
history with a changeset capturing the revert.
Reverts including a lot of files create many actions like:
R bar (from trunk/bar:4343)
For each of these files, an open_file() call is made and the parent
mercurial revision is loaded to detect copies in issamefile(). This
results in a big slowdown, easily reduced by caching the changectx.
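The caching idea can be sketched as below. ParentCache and load_ctx()
are hypothetical; the real change caches a Mercurial changectx, which
is replaced here by a plain dictionary so the example runs on its own.

```python
_UNSET = object()


class ParentCache:
    """Remember the last loaded parent context, keyed by revision."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical: revision -> context
        self._rev = _UNSET
        self._ctx = None
        self.loads = 0         # number of real lookups, for illustration

    def get(self, rev):
        if rev != self._rev:
            # Only hit the repository when the parent actually changes.
            self._ctx = self._loader(rev)
            self._rev = rev
            self.loads += 1
        return self._ctx


def load_ctx(rev):
    # Stand-in for an expensive repository lookup.
    return {"rev": rev, "files": {"bar": b"data"}}


cache = ParentCache(load_ctx)
for _ in range(1000):
    ctx = cache.get(4343)
```

All the open_file() calls of one revision share the same parent, so a
single-entry cache is enough in this sketch.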
When renaming a branch you get something like:
D /branch/bar
A /branch/foo (from /branch/foo:42)
Unfortunately, the branch layout for the revision being converted is
computed before starting to convert it. It means the copyfrom path
supplied in the add_directory() for /branch/foo will be considered
invalid, be added to missing and fetched the slow way despite being in
the repository history. Avoid that by checking the path looks like a
branch path and matching it with the filemap. It will be resolved
afterwards anyway.
Missing files should be the exception, not the norm. Right now, a lot of
these are caused by incorrect handling of branch updates. The
hgsubversion.failonmissing configuration entry will help chase them.
The design is a little ugly as the data stored in _openfiles will be a
string or a SimpleStringIO depending on whether the file has been edited,
but this is a simple way to avoid allocating large blocks of data.
It also bets that the output stream passed to apply_text() is only
ever written to and never seeked or read.
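A rough illustration of the two kinds of entries follows.
WriteOnlyBuffer and file_data() are hypothetical names; the buffer only
mimics the write-only contract mentioned above and is not the actual
SimpleStringIO class.

```python
class WriteOnlyBuffer:
    """Minimal output stream supporting only write()."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        # Keep the chunks as they come; they are joined only on demand.
        self._chunks.append(data)
        return len(data)

    def getvalue(self):
        return b"".join(self._chunks)


def file_data(entry):
    """Return the bytes for an open-file entry of either kind."""
    if isinstance(entry, bytes):
        return entry          # untouched file: the original string
    return entry.getvalue()   # edited file: whatever was written


openfiles = {"untouched.txt": b"original"}
buf = WriteOnlyBuffer()
buf.write(b"patched ")
buf.write(b"content")
openfiles["edited.txt"] = buf
```

Readers of the mapping have to go through something like file_data(),
which is the ugliness the message refers to.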
Branch creation is a special case of add_directory(). Until now, it was
handled by enumerating branch parent files and creating svncopy records
which were later converted into open files or pushed into the
RevisionData. By default, there is no reason to record anything: the
files are the same as in the parent changeset. The tricky part is to
correctly check that the source is the parent revision. This change massively
speeds up regular branching operations.
Before 6ba25d2444f3, the whole rebase sequence, including the update,
was executed with the encoding reset to the native one. After the
change, the final update was left out and ran with UTF-8, which
fails for some badly shaped repositories. Reset the correct encoding
context.
The commit pass only has to read files once. Removing the related data
after the read avoids keeping large temporary files around after they
have been stored.
The configuration entry defines the size of the replay or stupid edited
file store, that is the maximum amount of edited files data in megabytes
which can be kept in memory before falling back to storing it in a
temporary directory. Defaults to 200 (megabytes); use -1 to disable.
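The memory budget with an on-disk fallback can be sketched as below.
The FileStore class and its method names are hypothetical, and reading
-1 as "no limit, keep everything in memory" is an assumption about what
"disable" means here.

```python
import os
import shutil
import tempfile


class FileStore:
    """Keep file data in memory up to a budget, then spill to disk."""

    def __init__(self, maxsize_mb=200):
        # Assumption: a negative size means no limit at all.
        self._maxsize = maxsize_mb * 1024 * 1024 if maxsize_mb >= 0 else -1
        self._size = 0
        self._memory = {}
        self._ondisk = {}
        self._tempdir = None

    def setfile(self, path, data):
        # Simplification: each path is expected to be set only once.
        if self._maxsize < 0 or self._size + len(data) <= self._maxsize:
            self._memory[path] = data
            self._size += len(data)
            return
        if self._tempdir is None:
            self._tempdir = tempfile.mkdtemp(prefix="filestore-")
        name = os.path.join(self._tempdir, str(len(self._ondisk)))
        with open(name, "wb") as fp:
            fp.write(data)
        self._ondisk[path] = name

    def getfile(self, path):
        if path in self._memory:
            return self._memory[path]
        with open(self._ondisk[path], "rb") as fp:
            return fp.read()

    def in_memory(self, path):
        return path in self._memory

    def close(self):
        # Remove the temporary directory, if one was ever needed.
        if self._tempdir is not None:
            shutil.rmtree(self._tempdir)
            self._tempdir = None


store = FileStore(maxsize_mb=1)
store.setfile("small.txt", b"x" * 1024)
store.setfile("large.bin", b"y" * (2 * 1024 * 1024))
small_in_memory = store.in_memory("small.txt")
large_in_memory = store.in_memory("large.bin")
large_size = len(store.getfile("large.bin"))
store.close()
```

With a 1 megabyte budget, the small file stays in memory while the
2 megabyte one is written to the temporary directory.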
The implementation is similar to the one in mercurial.patch except the
mode and copy information are currently kept outside. It minimizes
changes to RevisionData and helps with files whose properties are
modified but not their contents, which filestore was not designed to
handle. Besides, CopiedFile pushed from the editor may later be handled
separately to resolve them at commit time, in which case we would store
the metadata outside of the file stores.
Before this change, the data of files brought by directory copies was
resolved immediately and stored for the duration of the run. Now, only
references are stored and are resolved either when opening the files or
when closing the editor. It means RevisionData file data is only set
once per file and only when the file has been handled.
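The deferred resolution can be illustrated as follows. The shape given
to CopiedFile here (a source revision, a source path and a resolve()
method) and the fetch() callback are assumptions for the example, not
the real class.

```python
class CopiedFile:
    """Reference to a file in a source revision (hypothetical shape)."""

    def __init__(self, source_rev, source_path):
        self.source_rev = source_rev
        self.source_path = source_path

    def resolve(self, fetch):
        # The data is only loaded when somebody actually asks for it.
        return fetch(self.source_rev, self.source_path)


fetched = []


def fetch(rev, path):
    # Stand-in for reading file data from the converted history.
    fetched.append((rev, path))
    return b"contents of %s@%d" % (path.encode(), rev)


# A directory copy only records references...
copies = {
    "branch/foo/a.txt": CopiedFile(123, "trunk/foo/a.txt"),
    "branch/foo/b.txt": CopiedFile(123, "trunk/foo/b.txt"),
}
fetched_after_copy = list(fetched)

# ...and opening one file resolves that file alone.
data = copies.pop("branch/foo/a.txt").resolve(fetch)
```

In this sketch the remaining references would be resolved in one pass
when the editor is closed.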
The next step is to turn RevisionData into a data store backed by the
filesystem.
The separation is not complete as we still have to update the
RevisionData deleted set when registering svn copies. This will be
cleaned up once open files are themselves separated from RevisionData.
Copied symlinks are also being prefixed with 'link '.
The concept of current.file is incorrect. svn_delta.h documents open
file lifetime as:
* 5. When the producer calls @c open_file or @c add_file, either:
*
* (a) The producer must follow with any changes to the file
* (@c change_file_prop and/or @c apply_textdelta, as applicable),
* followed by a @c close_file call, before issuing any other file
* or directory calls, or
*
* (b) The producer must follow with a @c change_file_prop call if
* it is applicable, before issuing any other file or directory
* calls; later, after all directory batons including the root
* have been closed, the producer must issue @c apply_textdelta
* and @c close_file calls.
So, an open file can be kept open until after the root directory is
closed and have deltas applied afterwards. In the meantime, other files
may have been opened and patched, overwriting the current.file variable.
This patch fixes it by introducing file batons bound to file paths, and
using them to deduce the correct target in apply_textdelta(). In theory,
open files could be put in a staging area until they are closed and
moved into the RevisionData. But the current code registers files copied
during a directory copy as open files and these will not receive a
close_file() event. This separation will be enforced later.
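The baton approach can be sketched as below. The Editor class, the
baton format and the byte concatenation standing in for a real svn
delta are all simplifications made for illustration.

```python
import itertools


class Editor:
    """Toy editor tracking open files through per-file batons."""

    def __init__(self):
        self._counter = itertools.count()
        self._batons = {}  # baton -> path
        self.data = {}     # path -> bytes

    def open_file(self, path, base=b""):
        # The baton is unique per open_file() call and bound to the path.
        baton = "f%d-%s" % (next(self._counter), path)
        self._batons[baton] = path
        self.data[path] = base
        return baton

    def apply_textdelta(self, baton, delta):
        # The target is deduced from the baton, not from a "current file".
        path = self._batons[baton]
        self.data[path] += delta  # stand-in for applying a real delta

    def close_file(self, baton):
        del self._batons[baton]


editor = Editor()
a = editor.open_file("a.txt", b"A")
b = editor.open_file("b.txt", b"B")
editor.apply_textdelta(b, b"2")
editor.apply_textdelta(a, b"1")  # deferred delta still reaches a.txt
editor.close_file(a)
editor.close_file(b)
```

With a single current.file variable, the second open_file() call would
have redirected the delta meant for a.txt to b.txt.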