During a commit amend there are 4 manifests being handled:
- original commit
- temporary commit
- amended commit
- merge base
This causes a manifest cache miss, which hurts performance on large repos. On a
large repo, this fix takes amend from 6 seconds to 5.5 seconds.
Previously, the manifest cache would store the last manifest parsed. We could
run into situations with operations like update where we would try parsing the
manifest for a revision r1, then r2, then r1 again. This increases the cache
size to 3 to avoid that bit of performance fragility.
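The r1, r2, r1 access pattern can be illustrated with a small sketch. The LRUCache class and count_misses helper below are hypothetical stand-ins, not Mercurial's own cache implementation; they only show why a one-entry cache reparses r1 while a three-entry cache does not.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative, not Mercurial's class)."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()
        self.misses = 0

    def get(self, key, compute):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = compute(key)  # stands in for parsing a manifest
        self._data[key] = value
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict the least recently used
        return value


def count_misses(maxsize, accesses):
    """Replay a sequence of revision lookups and count cache misses."""
    cache = LRUCache(maxsize)
    for rev in accesses:
        cache.get(rev, lambda r: "manifest for %s" % r)
    return cache.misses


pattern = ["r1", "r2", "r1"]
misses_with_one = count_misses(1, pattern)    # r2 evicts r1, so r1 is reparsed
misses_with_three = count_misses(3, pattern)  # the second r1 lookup is a hit
```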
When committing to a repo with lots of files (>170000),
manifest.py:addlistdelta takes some time because it's editing a large
array many times. Changing it to build a new array instead of editing
the old one saves around 0.04 seconds on a 1.64 second commit, a gain of
about 2.5%.
The gain here is pretty minor, but it was blatantly at the top of the
profiler report and the fix is straightforward.
I tested it by comparing the arrays produced by the new and old logic
while running all of the tests.
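A minimal sketch of the two strategies, assuming deltas are sorted, non-overlapping (start, end, content) tuples whose offsets refer to the original array. The function names are hypothetical and bytearray stands in for the array type used in manifest.py; the comparison at the end mirrors the testing approach described above.

```python
def apply_in_place(data, deltas):
    """Edit the existing buffer; each slice assignment may shift the tail."""
    # Work from the end so that earlier offsets stay valid.
    for start, end, content in reversed(deltas):
        data[start:end] = content
    return data


def apply_by_rebuilding(data, deltas):
    """Build a fresh buffer by copying unchanged runs and new content once."""
    result = bytearray()
    position = 0
    for start, end, content in deltas:
        result += data[position:start]  # unchanged bytes before this delta
        result += content               # replacement (empty means deletion)
        position = end
    result += data[position:]           # unchanged tail
    return result


original = bytearray(b"a\nb\nc\nd\n")
# Replace the "b" line with two lines and delete the "d" line.
deltas = [(2, 4, b"B1\nB2\n"), (6, 8, b"")]

old_result = apply_in_place(bytearray(original), deltas)
new_result = apply_by_rebuilding(original, deltas)
```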
Introduce manifestdict.withflags() to get a set of all files which have any
flags set, since these are likely to be a minority. Otherwise checking .flags()
for every file is a lot of dictionary lookups and is quite slow.
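As a rough sketch of the idea, assuming flags are kept in a separate dictionary that only has entries for flagged files: ManifestSketch and its _flags attribute are illustrative names, not the actual manifestdict internals.

```python
class ManifestSketch(dict):
    """Maps filename -> node; flags live in a separate, smaller dict."""

    def __init__(self, mapping=None, flags=None):
        dict.__init__(self, mapping or {})
        self._flags = dict(flags or {})

    def flags(self, f):
        return self._flags.get(f, "")

    def withflags(self):
        # Only walks the flagged files, not the whole manifest.
        return {f for f, fl in self._flags.items() if fl}


m = ManifestSketch(
    {"a.txt": "n1", "run.sh": "n2", "link": "n3"},
    {"run.sh": "x", "link": "l"},
)

# One flags() lookup per file in the manifest.
slow = {f for f in m if m.flags(f)}
# One pass over the (usually much smaller) set of flagged files.
fast = m.withflags()
```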
Though both give the same result (a NUL byte), I found that I tend to
read "\000" as "\0" + "00", which is something completely different.
I did not change the occurrence of "\000" in archival.py since there
are other octal constants in that file.
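The equivalence, and the misreading it invites, can be checked directly. The variable names are only for illustration.

```python
nul_octal = "\000"        # three-digit octal escape: a single NUL character
nul_short = "\0"          # one-digit octal escape: the same NUL character
misreading = "\0" + "00"  # NUL followed by two literal "0" characters

same = nul_octal == nul_short
lengths = (len(nul_octal), len(nul_short), len(misreading))
```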
Py3k doesn't have a global cmp() function, making this call problematic in the
py3k port. Also, calling cmp() here is not necessary, since we only want to
know if the two values are equal. A check for equality is sufficient in this
case, and this patch does that.
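A small sketch of the difference, with hypothetical function names and sample values, since the message does not name the values being compared. The first function emulates cmp() the way Python 3 code usually has to; the second uses the plain inequality test.

```python
def differs_via_cmp_emulation(a, b):
    # (a > b) - (a < b) is the usual Python 3 replacement for cmp(a, b).
    return ((a > b) - (a < b)) != 0


def differs_via_equality(a, b):
    # Only equality matters here, so no ordering is needed at all.
    return a != b


pairs = [(b"abc", b"abc"), (b"abc", b"abd"), (b"", b"x")]
agree = all(
    differs_via_cmp_emulation(a, b) == differs_via_equality(a, b)
    for a, b in pairs
)
```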
If a buffer of a mutable object is passed to revlog.addrevision(), the revlog
will happily store it in its cache. Later, when the revlog reuses the cached
entry, if the manifest modified the object in between, all kinds of bugs
appear.
We fix it by:
- passing immutable objects to addrevision() if they are already available
- only storing the text in the cache if it's of str type
Then we can remove the conversion of the cache entry to str() during
retrieval. That was probably just there hiding the bug for the common cases
but not really fixing it.
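The aliasing problem can be reproduced with a toy cache. CachingStore and its methods are hypothetical stand-ins for the revlog, and Python 3 bytes plays the role of the immutable str type mentioned above.

```python
class CachingStore:
    """Toy stand-in for a revlog that caches the last text it was given."""

    def __init__(self):
        self._cache = None

    def add_unsafe(self, node, text):
        # Caches whatever it is handed, including mutable buffers.
        self._cache = (node, text)

    def add_safe(self, node, text):
        # Only cache text that cannot change behind our back.
        if isinstance(text, bytes):
            self._cache = (node, text)

    def cached(self, node):
        if self._cache is not None and self._cache[0] == node:
            return self._cache[1]
        return None


# The caller keeps editing its working buffer after adding the revision.
work = bytearray(b"a.txt\n")
unsafe = CachingStore()
unsafe.add_unsafe("n1", work)
work += b"b.txt\n"
corrupted = unsafe.cached("n1")  # silently reflects the later edit

work2 = bytearray(b"a.txt\n")
safe = CachingStore()
safe.add_safe("n1", bytes(work2))  # pass an immutable copy
work2 += b"b.txt\n"
intact = safe.cached("n1")
```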
No need to copy the dict, dict.__init__() will do that for us.
It was responsible for a non-negligible waste of time during a qpush of an
-mm queue on the kernel repo.
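To illustrate the point about dict.__init__() copying its argument, here is a minimal sketch. ManifestDictSketch is a hypothetical name, not the real manifestdict.

```python
class ManifestDictSketch(dict):
    def __init__(self, mapping=None):
        if mapping is None:
            mapping = {}
        # dict.__init__ copies the entries; an explicit mapping.copy()
        # beforehand would just walk the whole dict a second time.
        dict.__init__(self, mapping)


source = {"a.txt": "n1"}
m = ManifestDictSketch(source)
source["b.txt"] = "n2"  # later changes to the source do not leak into m
```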
Include the offending filenames in the error message. Now this error message
is consistent with the same error issued by dirstate.py (although there is
still duplicate code).
- create error.py for exception classes to reduce demandloading
- move revlog exceptions to it
- change users to import error and drop revlog import if possible
add _checklink var to dirstate
introduce dirstate.flagfunc
switch users of util.execfunc/linkfunc to flagfunc
change manifestdict.set to take a flags string
change ctx.fileflags to ctx.flags
change gitmode func to a dict
remove util.execfunc/linkfunc
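The gitmode change can be pictured roughly as follows. This is a sketch under assumptions: the function signature and the flag letters ("l" for symlink, "x" for executable) are illustrative, while the mode strings are the standard git file modes.

```python
def gitmode_func(islink, isexec):
    """Function form: takes two booleans."""
    if islink:
        return "120000"
    if isexec:
        return "100755"
    return "100644"


# Dict form: keyed directly by a flags string, so callers that already
# have the flags for a file can index it without decoding them first.
gitmode = {"l": "120000", "x": "100755", "": "100644"}

flags = "x"
mode_from_func = gitmode_func("l" in flags, "x" in flags)
mode_from_dict = gitmode[flags]
```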