Before this patch, annotation is re-calculated even if it is already
calculated. This may cause unexpected annotation, because already
cleared "pcache" ("pcache[f] = []") prevents from scanning ancestors.
This patch reuses already calculated annotation if it is available.
In fact, "reusable" situation should be seen only on legacy
repositories in which a filelog include the merging between the
revision and its ancestor, because:
- tree is scanned in depth-first
without such merging, annotation result should be released soon
- recent Mercurial doesn't allow such merging
changelog and manifest can include such merging someway, but
filelogs can't, because "localrepository._filecommit()" converts
such merging request to linear history.
The performance of both the old and new Python ancestor algorithms
depends on the number of revs they need to traverse. Although the
new algorithm performs far better than the old when revs are
numerically and topologically close, both algorithms become slow
under other circumstances, taking up to 1.8 seconds to give answers
in a Linux kernel repo.
This C implementation of the new algorithm is a fairly straightforward
transliteration. The only corner case of interest is that it raises
an OverflowError if the number of GCA candidates found during the
first pass is greater than 24, to avoid the dual perils of fixnum
overflow and trying to allocate too much memory. (If this exception
is raised, the Python implementation is used instead.)
Performance numbers are good: in a Linux kernel repo, time for "hg
debugancestors" on two distant revs (24bf01de7537 and c2a8808f5943)
is as follows:
Old Python: 0.36 sec
New Python: 0.42 sec
New C: 0.02 sec
For a case where the new algorithm should perform well:
Old Python: 1.84 sec
New Python: 0.07 sec
New C: measures as zero when using --time
(This commit includes a paranoid cross-check to ensure that the
Python and C implementations give identical answers. The above
performance numbers were measured with that check disabled.)
Previously, we chose a rev based on numeric ordering, which could
cause "the same merge" in topologically identical but numerically
different repos to choose different merge bases.
We now choose the lexically least node; this is stable across
different revlog orderings.
Instead of walking all the way to the root of the DAG, we generate
a set of candidate GCA revs, then figure out which ones will win
the race to the root (usually without needing to traverse all the
way to the root).
In the common case of nodes that are close to each other in both
revision number and topology, this is usually a big win: it makes
"hg --time debugancestors" up to 9 times faster than the more general
ancestor function when measured on heads of the linux-2.6 hg repo.
Victory is not assured, however. The older function can still win
by a large margin if one node is much closer to the root than the
other, or by a much smaller amount if one is an ancestor of the
other.
For now, we've also got a small paranoid harness function that calls
both ancestor functions on every input and ensures that they give
equivalent answers.
Even without the checker function, the old ancestor function needs
to stay alive for the time being, as its generality is used by
context.filectx.merge.
Update to "foreground" are no longer seen as cross branch update. "Foreground"
are descendants or successors (or successors of descendants (or descendant of
successors (etc))). This allows to update with uncommited changes that get
automatically merged.
This changeset is a small step forward. We want to allow dirty update to
"background" (precursors) and takes obsolescence in account when finding the
default update destination. But those requires deeper changes and will comes in
later changesets.
When revisions are destroyed, the `phaserevs` cache becomes invalid in most case.
This cache hold a `{rev => phase}` mapping and revision number most likely
changed.
Since cc14b89388cb, we filter unknown phases' roots after changesets
destruction. When some roots are filtered the `phaserevs` cache is invalidated.
But not if none root where destroyed.
We now invalidate the cache in all case filtered root or not.
This bug was a bit tricky to reproduce as in most case we either:
* rebase a set a draft changeset including root (phaserev invalidated)
* strip tip-most changesets (no re-numbering of revision)
Note that the invalidation of `phaserevs` are not strictly needed when only
tip-most part of the history have been destroyed. But I do not expect the
overhead to be significant.
Note that we could raise this exception even if no pattern were specified, but
the revision contained no files. However this should not happen in practice
since in that case commands.py/archive would exit earlier with an "no working
directory: please specify a revision" error message instead.
This will allow a current traceback.print_exc() call in dispatch to be replaced
with ui.traceback() even if --traceback was not given on the command line.
The tracebacks in subrepos are truncated at the point where the original
exception is caught and SubrepoAbort is raised in its place since 6c419dfc848c.
That hides the most relevant subrepo methods when an error occurs. Python 2.x
doesn't support chaining exceptions, so it is manually done here for manual
printing later.
-r has a default value of '' in the command line. The function default value of
'tip' is thus never used and any attempt at specifying revisions without -r
will fail.
It seems like then intended behavior was that 'hg debugrebuildstate' without
any parameters should set the parents to tip. That would be very confusing now
when the command primarily is used to recover from incorrect stat info.
It is apparently undocumented that '' is the same as '.' ... unless it is
passed in a place where revsets are used.
The message was not very much to the point and did not in any way help an
ordinary user.
'repository changed while preparing/uploading bundle - please try again'
is more correct, gives the user some understanding of what is going on, and
tells how to 'recover' from the situation.
The 'bundle' aspect could be seen as an implementation detail that shouldn't be
mentioned, but I think it helps giving an exact error message.
The message could still leave the user wondering why Mercurial doesn't lock the
repo and how unsafe it thus is. Explaining that is however too much detail.
A common usecase for export is to preview the patch that will be patchbombed or
to see what changed in a revision found by bisect. Showing the working
directory parent is thus a useful and obvious default.
Backout 308a153b9120. The changeset prevented closing non-head changesets but
did not provide any rationale or test case and I don't see what value it adds.
Users might have their reasons to commit something anywhere - and close it
immediately.
And contrary to the comment that is removed: The topo heads set is _not_
included in the branch heads set of the current branch. It do not include
closed topological heads.
The change thus prevented closing commits on top of closing commits. A valid
usecase for that is to merge closed heads to reduce the number of topological
heads.
The only existing test coverage for this is the failing double close in
test-revset.t. It was added in dc0e42c06b4e and seems to not be intentional.
This patch makes "_journalfiles()" return a list of pairs of journal
file and corresponded vfs instance instead of a list of journal files
in full path, to use "vfs.rename()" instead of "util.rename()" in
"aftertrans()".
"undofiles()" still returns a list of undo files in full path, because
"repair.strip()" expects such list. It'll be also made to return a
list of pairs of undo file and corresponded vfs at vfs migration for
"repair.strip()" in the near future.
In the point of view of efficiency, "vfs" instance created in this
patch should be passed to and reuse in "store.store()" invocation just
after patched code block, because "store" object is initialized by vfs
created with "self.sharedpath".
eBut to focus just on migration from direct file I/O API accessing to
vfs, this patch uses created vfs as temporary one. Refactoring around
"store.store()" invocation will be done in the future.
Before this patch, vfs constructor applies both "util.expandpath()"
and "os.path.realpath()" on "base" path, if "expand" is True.
This patch splits it into "realpath" and "expandpath", to apply each
functions separately: this splitting can allow to use vfs also where
one of each is not needed.
unionrepo is just like bundlerepo without bundles.
The implementation is very similar to bundlerepo, but I don't see any obvious
way to generalize it.
Some most obvious use cases for this would be log and diff across local repos,
as a kind of preview of pulls, for instance:
$ hg -R union:repo1+repo2 heads
$ hg -R union:repo1+repo2 log -r REPO1REV -r REPO2REV
$ hg -R union:repo1+repo2 log -r '::REPO1REV-::REPO2REV'
$ hg -R union:repo1+repo2 log -r 'ancestor(REPO1REV,REPO2REV)'
$ hg -R union:repo1+repo2 diff -r REPO1REV -r REPO2REV
This is going to be used in RhodeCode, and Bitbucket already uses something
similar. Having a core implementation would be beneficial.
This patch stops mercurial from pushing unmodified subrepos. An unmodified
subrepo is one whose store is "clean" versus a given target subrepo.
Note that subrepos may have a clean store versus a target repo but not versus another. This patch handles this scenario by individually keeping track of the state of the store versus all push targets.
Tests will be added on the following revision.
The mercurial subrepo "storeclean" method works by calculating a "store hash" of
the repository state and comparing it to a cached store hash. The store hash is
always cached when the repository is cloned from or pushed to a remote
repository, but also on pull as long as the repository already had a clean
store. If the hashes match the store is "clean" versus the selected repository.
Note that this method is currenty unused, but it will be used by a later patch.
The store hash is calculated by hashing several key repository files, such as
the bookmarks file the phaseroots file and the changelog. Note that the hash
comparison is done file by file so that we can exit early if a pair of hashes
do not match. Also the hashes are calculated starting with the file that is
most likely to be smaller upto the file that is more likely to be larger.
Currently this method is unused and it is not implemented for any specific
subrepo type (it always returns False). It receives a remote repository path
because a repository may have a clean store versus a given repository but not
versus another.