Before this change, localrepository instances that performed multiple
transactions would leak transaction objects. This could occur when
running `hg convert`, where the leak was ~90 MB per 10,000 changesets
as measured with the Mercurial repo itself.
The leak I tracked down involved the "validate" closure from
localrepository.transaction(). It appeared to be keeping a
reference to the original transaction via __closure__. __del__
semantics and a circular reference involving the repo object
may have also come into play.
Attempting to refactor the "validate" closure proved to be
difficult because the "tr" reference in that closure may
reference an object that isn't created until transaction.__init__
is called. And the "validate" closure is passed as an argument to
transaction.__init__. Plus there is a giant warning comment in
"validate" about how hacky it is. I did not want to venture into
the dragon den.
Anyway, we've had problems with transactions causing leaks before.
The solution then (8b23c334b97f) is the same as the solution in this
patch: drop references to callbacks after they are called. This
not only breaks cycles in core Mercurial but can help break cycles
in extensions that accidentally introduce them.
While I only tracked down a leak due to self.validator, since this is
the 2nd time I've tracked down leaks due to transaction callbacks I
figure enough is enough and we should prevent the class of leak from
occurring regardless of the variable. That's why all callback variables
are now nuked.
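
A minimal sketch of the pattern, assuming CPython reference counting. The
Transaction class and the maketransaction() helper are hypothetical stand-ins,
not the real Mercurial code. The closure reaches the transaction through a
captured variable, and clearing the callback after it runs is what lets the
object be freed:

```python
import gc
import weakref


class Transaction(object):
    """Hypothetical stand-in for a transaction; not Mercurial's class."""

    def __init__(self, validator):
        self.validator = validator

    def close(self):
        self.validator(self)
        # Drop the reference once the callback has run, so a closure
        # that captures this object cannot keep a cycle alive.
        self.validator = None


def maketransaction():
    holder = {}

    def validate(tr2):
        # The closure reaches the transaction through "holder",
        # mimicking the "tr" reference held via __closure__.
        return holder['tr'] is tr2

    tr = Transaction(validate)
    holder['tr'] = tr
    return tr


gc.disable()  # rely on reference counting only for this demonstration
try:
    tr = maketransaction()
    ref = weakref.ref(tr)
    tr.close()
    del tr
    freed = ref() is None
finally:
    gc.enable()
```

Without the "self.validator = None" line, the transaction would stay alive
until the cycle collector runs, because the callback and the transaction
reference each other.
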
This moves the optimization for patterns that match everything to the
caller, so we can remove it from patternmatcher.
Note that we need to teach alwaysmatcher to use relative paths now in
cases like "hg files .." from inside mercurial/, because while it
still matches everything, paths should be printed relative to the
working directory.
By passing in the result of the normalize() call, we prepare for
moving the special handling of patterns that always match out of the
patternmatcher.
It also lets us remove many of the arguments from the matcher, because
they were passed only to the normalize function (we could have
removed the arguments by binding them to the function instead of
moving the normalize() call out).
Since the caller now deals with empty pattern lists, we can drop that
support in the patternmatcher. It now gets the more logical behavior
of matching nothing when no patterns are given (although there is no
in-core caller that will pass no patterns).
In patternmatcher, we used to say that all directories should be
visited if no explicit files were listed, because the case of empty
_files usually implied that no patterns were given (which in turn
meant that everything should match). However, this made e.g. "hg files
-r . rootfilesin:." slower than necessary, because that also ended
up with an empty list in _files. Now that patternmatcher does not
handle includes, the only remaining case where its _files/_fileset
fields will be empty is when it's matching everything. We can
therefore treat the always-case specially and stop treating the empty
_files case specially. This makes the case mentioned above faster on
treemanifest repos.
Having a special matcher that always matches seems to make more sense
than making one of the other matchers handle the case. For now, we
just use this new matcher when no patterns were provided.
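
As a rough illustration of the split, here is a simplified sketch. The class
and function names mirror the ones discussed here, but the bodies are
hypothetical, and fnmatch is only a stand-in for Mercurial's own pattern
handling:

```python
import fnmatch


class alwaysmatcher(object):
    """Matches every path; no pattern evaluation needed."""

    def __call__(self, path):
        return True

    def always(self):
        return True


class patternmatcher(object):
    """Matches paths against shell-style globs (illustrative only)."""

    def __init__(self, patterns):
        self._patterns = list(patterns)

    def __call__(self, path):
        # With no patterns, any() over an empty sequence is False,
        # so an empty patternmatcher matches nothing.
        return any(fnmatch.fnmatchcase(path, p) for p in self._patterns)

    def always(self):
        return False


def match(patterns):
    # The caller decides: no patterns means "match everything".
    if not patterns:
        return alwaysmatcher()
    return patternmatcher(patterns)
```

With this shape, match([]) returns an alwaysmatcher, while constructing
patternmatcher([]) directly gives a matcher that rejects every path.
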
Should at least be useful for debugging. Would matter for correctness
too if fsmonitor or Facebook's sparse extension worked with subrepos
(which I don't know if they do).
Due to a quirk of our module importer setup on Python 3, all calls and
definitions of methods named iteritems() get rewritten at import
time. Unfortunately, this means there's not a good portable way to
access these methods from non-module-loader'ed code like our unit
tests. This change fixes that, which also unblocks test-manifest.py
from passing under Python 3.
We don't presently define any itervalues methods, or we'd need to give
those similar treatment.
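
The message does not show the fix itself. One shape it could take, purely as
an assumption, is to define the method under the name items() and expose
iteritems as an alias, so code that is not run through the module loader can
call either name. The class below is hypothetical, not the real manifest code:

```python
class manifestsketch(object):
    """Hypothetical dict-like class; not Mercurial's manifest class."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def items(self):
        # Return an iterator over (path, node) pairs in sorted path order.
        return iter(sorted(self._entries.items()))

    # An alias rather than a second "def": both names resolve to the
    # same function when the module is imported normally.
    iteritems = items


m = manifestsketch({b'b.txt': b'\x02', b'a.txt': b'\x01'})
pairs = list(m.iteritems())
```
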
This is all pure-Python code, so I'm not too worried about perf here,
but we can come back and fix it should it be a problem.
With this change, the manifest code passes most unit tests on Python 3
(once the tests are corrected with many b prefixes). I've got a little
more to sort out there and then I'll mail that change too.
This fixes `hg files 'set:(**.py)'` which makes test-check-py3-compat.t able to
run on Python 3. So if you now do `python3 ./run-tests.py
test-check-py3-compat`, the test will actually run.
This will make sure when ctx.repo.manifestlog changes, a correct new
manifestctx is returned. repo.manifestlog takes care of caching so the
manifestctx won't be reconstructed every time.
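
A small sketch of the idea, with hypothetical stand-in classes (none of these
are the real Mercurial types): the context looks the manifestctx up through
the repo on every access, and the manifestlog does the caching.

```python
class manifestlogsketch(object):
    """Hypothetical manifestlog that caches one object per node."""

    def __init__(self):
        self._cache = {}

    def __getitem__(self, node):
        # Build each manifestctx stand-in once per log instance.
        if node not in self._cache:
            self._cache[node] = object()
        return self._cache[node]


class reposketch(object):
    def __init__(self):
        self.manifestlog = manifestlogsketch()


class changectxsketch(object):
    def __init__(self, repo, manifestnode):
        self._repo = repo
        self._manifestnode = manifestnode

    @property
    def manifestctx(self):
        # Look up through the repo every time instead of caching here,
        # so a replaced repo.manifestlog is picked up.
        return self._repo.manifestlog[self._manifestnode]


repo = reposketch()
ctx = changectxsketch(repo, b'node1')
first = ctx.manifestctx
second = ctx.manifestctx          # served from the manifestlog cache
repo.manifestlog = manifestlogsketch()
third = ctx.manifestctx           # comes from the new manifestlog
```
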
The "hg bundle" command is a good place to test if the inclusion of obsmarkers
within a bundle is working well (part exists, content is correct etc). So we
add a way to have them included.
Ideally, this would be controlled by a change around bundlespec (bundlespec
"v3" + arguments). However, my main goal is to have obsmarkers included in
bundles created by the 'hg strip' command, not by 'hg bundle', so for now I'm
avoiding the detour through bundlespec rework territory.
Better debug output for obsmarkers in 'debugbundle' will be added in later
changesets. The 'test-obsolete-bundle-strip.t' test will also get updated in a
later changeset to keep the current changeset smaller.
We move it next to similar part building functions. We will need it for the
"writenewbundle" logic. This will allow us to easily include obsmarkers in
on-disk bundles, a necessary step before having `hg strip` also operate on
markers.
(Yes, the bundle2 module was already too large, but there are many
interdependencies between its components, so it is non-trivial to split. This
is a quest for another adventure.)
This is just a stub for future extension. I could add a version constant to
CFFI modules by putting it in both ffi.set_source() and ffi.cdef(), but that
doesn't seem right. So for now, cffi modules will be explicitly unversioned
(i.e. the version constant must be undefined or set to None). We can revisit it
later when we need to consider CFFI support more seriously.
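
To illustrate what "unversioned" could mean in practice, here is a sketch of a
version check. The checkversion() helper and the fake modules are hypothetical
and only model the rule stated above: a missing version attribute is treated
the same as None.

```python
import types


def checkversion(mod, expected):
    """Raise ImportError if mod's version constant is not the expected one.

    Hypothetical helper; an absent "version" attribute counts as None.
    """
    actual = getattr(mod, 'version', None)
    if actual != expected:
        raise ImportError(
            'cannot import module %s: version mismatch '
            '(expected %r, actual %r)' % (mod.__name__, expected, actual))


# Fake modules standing in for a C extension and a CFFI module.
cextmod = types.ModuleType('fakecext')
cextmod.version = 1
cffimod = types.ModuleType('fakecffi')  # deliberately has no version

checkversion(cextmod, 1)     # versioned module, matching constant
checkversion(cffimod, None)  # unversioned module, expected to be None

try:
    checkversion(cextmod, 2)
    mismatch = False
except ImportError:
    mismatch = True
```
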
The includematcher will always get at least one include pattern and
will never get any non-include patterns, so we can remove most of the
code in it. This patch does mostly straightforward deletions of
code. We will clean up further later.
At this point the includematcher is an exact copy of the main matcher
class. We will specialize and simplify both classes in the following
patches. This initial unmodified copy is just to make the differences
clearer. We also rename the main matcher to "patternmatcher" for
consistency.
I may eventually merge this new includematcher back into the main
matcher, but I think doing it this way makes the intermediate steps
clearer regardless.
Exact matching is now handled by the exactmatcher class.
We can safely remove _files from the __repr__() implementation,
because even though the field is set, the patternspat field is enough
for the representation to be unambiguous (which was not the case when
the matcher could handle exact matches).