The getchanges function of some converter_source classes can return
false positives, i.e. they sometimes claim that a file "foo" was
changed in some revision even though its contents are still the
same.
convert_svn is particularly bad, but I think this can also happen with
convert_cvs and, at least in theory, with mercurial_source.
For regular conversions this is not really a problem - as long as
getfile returns the right contents, we'll get a converted revision
with the right contents. But when we use --filemap, this could lead
to superfluous revisions being converted.
Instead of fixing every converter_source, I decided to change
mercurial_sink to work around this problem.
When --filemap is used, we're interested only in revisions that touch
some specific files. If a revision doesn't change any of these files,
then we're not interested in it (at least for revisions with a single
parent; merges are special).
For mercurial_sink, we abuse this property and roll back a commit if
the manifest text hasn't changed. This avoids duplicating the logic
from localrepo.filecommit to detect unchanged files.
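
A minimal sketch of the idea, not the actual mercurial_sink code: the
manifest is modelled as a plain dict from path to content hash, and
commit_filtered and manifest_text are hypothetical names. A commit
whose manifest text equals its parent's is dropped.

```python
import hashlib


def manifest_text(manifest):
    # Serialize a {path: hash} mapping deterministically (toy format).
    return "".join("%s\0%s\n" % (p, manifest[p]) for p in sorted(manifest))


def commit_filtered(parent_manifest, touched):
    """Apply `touched` ({path: bytes, or None for a removal}) to the parent.

    Returns (manifest, committed). If the manifest text is unchanged,
    the commit is discarded and the parent manifest is returned.
    """
    new = dict(parent_manifest)
    for path, data in touched.items():
        if data is None:
            new.pop(path, None)
        else:
            new[path] = hashlib.sha1(data).hexdigest()
    if manifest_text(new) == manifest_text(parent_manifest):
        # Nothing really changed: "roll back" by keeping the parent.
        return parent_manifest, False
    return new, True
```

Comparing whole manifests means a false positive from the source (a
file reported as changed with identical contents) produces no new
revision.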
To handle merges correctly, this revision adds a filemap_source class
that wraps a converter_source and does the work necessary to calculate
the subgraph we're interested in.
The wrapped converter_source must provide a new getchangedfiles method
that, given a revision rev, and an index N, returns the list of files
that are different in rev and its Nth parent.
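
As an illustration only, here is what getchangedfiles could look like
for a toy in-memory source. ToySource and its revs layout are
assumptions made for this sketch; real converter_source classes would
ask the underlying VCS instead.

```python
class ToySource:
    def __init__(self, revs):
        # revs: rev id -> (list of parent ids, {path: contents})
        self.revs = revs

    def getchangedfiles(self, rev, i):
        """Return the sorted list of files that differ between rev
        and its i-th parent (everything, if that parent is missing)."""
        parents, files = self.revs[rev]
        if i < len(parents):
            pfiles = self.revs[parents[i]][1]
        else:
            pfiles = {}
        paths = set(files) | set(pfiles)
        return sorted(p for p in paths if files.get(p) != pfiles.get(p))
```

For a merge, calling it once per parent index tells the wrapper which
side of the merge actually touched the files of interest.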
The implementation depends on the ability to skip some revisions and to
change the parents field of the commit objects that we returned earlier.
To make the conversion restartable, we assume the revisions in the
revmapfile are topologically sorted.
The --filemap support in hg convert doesn't handle merges correctly.
(And after 98d1e8c16343 I managed to break it even for simple cases
where we don't want the first revision.)
If getchanges returns a string, it's assumed to be the id of an
already converted revision. We map the current revision to the same
revision this converted revision was mapped to.
To allow skipping a root revision, getchanges can return the special
string 'hg-convert-skipped-revision' (a.k.a. common.SKIPREV), which
hopefully won't clash with any real id.
The converter_source is responsible for rewriting the parents of the
commit objects to make sure the revision graph makes sense.
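
A rough sketch of how the caller might treat the return value of
getchanges. The record function, the revmap dict and the commit
callback are hypothetical; only the SKIPREV string comes from the
description above.

```python
SKIPREV = 'hg-convert-skipped-revision'


def record(revmap, rev, changes, commit):
    """Update revmap for rev given what getchanges returned.

    A string means "already converted as that revision" (or SKIPREV);
    anything else is a real change list handed to commit().
    """
    if isinstance(changes, str):
        if changes == SKIPREV:
            dest = SKIPREV
        else:
            # Reuse whatever the referenced revision was mapped to.
            dest = revmap[changes]
    else:
        dest = commit(rev, changes)
    revmap[rev] = dest
    return dest
```

Looking up revmap[changes] only works if the referenced revision was
processed earlier, which is why the order of conversion matters.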
In non-Cygwin environments, cvsps fails to create its cache directory and redirects its output to stderr. Just ignore the error and capture stderr as well.
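
One way to do this with the standard library, shown as a sketch rather
than the code used by the converter (run_capture_all is a made-up
helper name):

```python
import subprocess
import sys


def run_capture_all(cmd):
    """Run cmd and return stdout and stderr merged into one bytes value.

    The exit status is deliberately ignored, so a tool that fails a
    non-essential step but still prints its output can be used.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out


if __name__ == "__main__":
    # Demo with a child process that writes to stderr and exits non-zero.
    child = 'import sys; sys.stderr.write("log"); sys.exit(1)'
    print(run_capture_all([sys.executable, "-c", child]))
```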
The CVS connection string regexp uses colons to separate the protocol from the path and login. Unfortunately, Windows paths contain colons and were interpreted as rsh connection strings.
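
To illustrate the ambiguity, here is a simplified sketch. The regular
expressions and parse_cvsroot are assumptions for the example, not the
patterns used by the converter: a drive-letter prefix is checked first
so that "C:\..." is not read as host "C".

```python
import re

# A single letter, a colon, then a slash or backslash: a Windows path.
_drive = re.compile(r'^[A-Za-z]:[\\/]')
# Optional "user@", then "host:", then an absolute path.
_remote = re.compile(r'^(?:([^@:/]+)@)?([^@:/]+):(/.*)$')


def parse_cvsroot(root):
    """Return (kind, user, host, path) for a simplified CVS root."""
    if _drive.match(root):
        return ('local', None, None, root)
    m = _remote.match(root)
    if m:
        user, host, path = m.groups()
        return ('rsh', user, host, path)
    return ('local', None, None, root)
```

Without the drive-letter check, "C:/cvs" would match the remote pattern
with "C" as the host name.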
- something else than "pat" followed by a number can be used as key
- something else than "/" can be used as delimiter
- "ilmsux" flags (e.g. "i" for case insensitive) can be used
There are some corner cases where we may have a copy in a file that
isn't in the added list:
- the result of a hg copy --after --force
- after a merge across a (local) rename
Clearing the dirstate before the conversion protects us from whatever
data were there (file copies in particular).
Invalidating it after the conversion avoids writing a possibly
inconsistent dirstate to disk.
During a conversion, the dirstate contents are not consistent - there
are files that may be missing from the dirstate and there may be files
that shouldn't be in the dirstate.
While this is not fixed, don't mark files as added - put them directly
in state 'n'ormal.