Since (b) is banned, we should do the same for (a) for consistency.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
Previously convert could only take one '--rev'. This change allows the user to
specify multiple --rev entries. For instance, this could allow converting
multiple branches (but not all branches) at once from git.
In this first patch, we disable support for this for all sources. Future
patches will enable it for select sources (like git).
Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.
Mass rewrite all instances of the old octal syntax to the new syntax.
This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
Conversion of a merge starts with p1 and re-adds the files that were changed in
the merge or came unmodified from p2. Files that are unmodified from p1 will
thus not be touched and take no time. Files that are unmodified from p2 would be
retrieved and rehashed. They would end up getting the same hash as in p2 and end
up reusing the filelog entry and look like the p1 case ... but it was slow.
Instead, make getchanges also return 'files that are unmodified from p2' so the
sink can reuse the existing p2 entry instead of calling getfile.
Reuse of filelog entries can make a big difference when files are big and with
long revlong chains so they take time to retrieve and hash, or when using an
expensive custom getfile function (think
http://mercurial.selenic.com/wiki/ConvertExtension#Customization with a code
reformatter).
This in combination with changes to reuse filectx entries in
localrepo._filecommit make 'unchanged from p2' almost as fast as 'unchanged
from p1'.
This is so far only implemented for the combination of hg source and hg sink.
This is a refactoring/optimization. It is covered by existing tests and show no
changes - which is a good thing.
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.
We change instead all of these one-liners to 4-liners. This was
executed with the following perlism:
find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1 $2 = $4\n$1else:\n$1 $2 = $5,' {} \;
I tweaked the following cases from the automatic Perl output:
prev = (parents and parents[0]) or nullid
port = (use_ssl and 443 or 80)
cwd = (pats and repo.getcwd()) or ''
rename = fctx and webutil.renamelink(fctx) or []
ctx = fctx and fctx or ctx
self.base = (mapfile and os.path.dirname(mapfile)) or ''
I also added some newlines wherever they seemd appropriate for readability
There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
Convert will normally only process files that were changed in a source
revision, apply the filemap, and record it has a change in the target
repository. (If it ends up not really changing anything, nothing changes.)
That means that _if_ the filemap is changed before continuing an incremental
convert, the change will only kick in when the files it affects are modified in
a source revision and thus processed.
With --full, convert will make a full conversion every time and process
all files in the source repo and remove target repo files that shouldn't be
there. Filemap changes will thus kick in on the first converted revision, no
matter what is changed.
This flag should in most cases not make any difference but will make convert
significantly slower.
Other names has been considered for this feature, such as "resync", "sync",
"checkunmodified", "all" or "allfiles", but I found that they were less obvious
and required more explanation than "full" and were harder to describe
consistently.
The internal API used IOError to indicate that a file should be marked as
removed.
There is some correlation between IOError (especially with ENOENT) and files
that should be removed, but using IOErrors to represent file removal internally
required some hacks.
Instead, use the value None to indicate that the file not is present.
Before, spurious IO errors could cause commits that silently removed files.
They will now be reported like all other IO errors so the root cause can be
fixed.
Built on top of previous patches:
- continuation-of parsing
- registered archives retrieval
- use of fully qualified revisions
This allows the converter scanning for more source revisions
following the tree versions 'leaked' through the continuation-of
informations. Coupled with the registered archives retrieval, this
makes possible to decide to follow such a hint or stop scanning for
more revisions.
This also implies some changes in the retrieval of some base-0
revisions when they're continuation-of other revisions, in that
case a 'replay' will work where a simple 'get' fails because the
dir exists already. I found the code dealing with 'replay' quite
good as it has already a fallback to 'get' in the error path.
In GNU Arch, continuation-of was often used for:
- tagging revisions
- continue working on a project in a new archive, because arch
was scaling poorly in revision numbers (cat-logs were slow
to be parsed and scanned through)
- very similar to the previous point, fork his own branch of
a project.
Parsing this header information will allow to 'follow' new history
because it often hints at older/forked/personal revision trees.
This patch however just implements the parsing of the
continuation-of header. A followup patch will implement the proper
use of this new information.
There is no need loosing information in the conversion process. This could
lead to wrong shamap mappings if different archives used the same 'version'
naming.
GNU Arch used to scale very poorly when revision number was
increasing. This was mostly caused by the huge amount of
cat-log it has to scan/read through to keep track of all
patches that were merged in a given revision.
In order to improve things, cat-log prunning was a common
admin task that would accelerate cat-log parsing at the expense
of unreachabe locally stored cat-logs.
However, these missing cat-logs are still available in the archive.
So try to get them from the archive as a fallback solution.
cat-log parsing was very wrong. It assumed the Summary header
was comming last, which is wrong. Plus the code was buggy because
it was concatenating all headers in the summary.
As parsing GNU Arch isn't trivial, and python email code does it
so well... just use that ;-)