'fcntl', 'termios' and 'wcurses' are not available on the default Windows python
installation, and importing them caused widespread carnage in the test suite.
There were 29 different changed test files (on top of unrelated errors), mostly
in the form of an ImportError.
The failures weren't related to actual crecord use, and followed the import
chain:
'localrepo' -> 'subrepo' -> 'cmdutil' -> 'crecord' -> 'fcntl'
This is necessary to annotate workingctx revision. basefilectx.linkrev() can't
be used because committablefilectx has no filelog.
committablefilectx looks for parents() from self._changectx. That means fctx
is linked to self._changectx, so linkrev() can simply be aliased to rev().
The main purpose of wdir() is to annotate working-directory files.
Currently many commands and revsets cannot handle workingctx and may raise
exception. For example, -r ":wdir()" results in TypeError. This problem will
be addressed by future patches.
We could add "wdir" symbol instead, but it would conflict with the existing
tag, bookmark or branch. So I decided not to.
List of commands that will potentially support workingctx revision:
command default remarks
-------- ------- -----------------------------------------------------
annotate p1 useful
archive p1 might be useful
cat p1 might be useful on Windows (no cat)
diff p1:wdir (default)
export p1 might be useful if wctx can have draft commit message
files wdir (default)
grep tip:0 might be useful
identify wdir (default)
locate wdir (default)
log tip:0 might be useful with -p or -G option
parents wdir (default)
status wdir (default)
This patch includes minimal test of "hg status" that should be able to handle
the workingctx revision.
For now this implementation is pretty naive -- it filters out files right
before passing them into trydiff. In upcoming patches we'll add some more
smarts.
Merge copies is traversing file history in search for copies and renames.
Since 3.3 we are doing "linkrev adjustment" to ensure duplicated filelog entry
does not confuse the traversal. This "linkrev adjustment" involved ancestry
testing and walking in the changeset graph. If we do such walk in the changesets
graph for each file, we end up with a 'O(<changesets>x<files>)' complexity
that create massive issue. For examples, grafting a changeset in Mozilla's repo
moved from 6 seconds to more than 3 minutes.
There is a mechanism to reuse such ancestors computation between all files. But
it has to be manually set up in situation were it make sense to take such
shortcut. This changesets set this mechanism up and bring back the graph time
from 3 minutes to 8 seconds.
To do so, we need a bigger control on the way 'filectx' are instantiated during
each 'checkcopies' calls that 'mergecopies' is doing. We add a new 'setupctx'
that configure and return a 'filectx' factory. The function make sure the
ancestry context is properly created and the factory make sure it is properly
installed on returned 'filectx'.
When the source rev value is 'None', the ctx is a working context. We
cannot compute the ancestors from there so we directly skip to its
parents. This will be necessary to allow 'None' value for
'_descendantrev' itself necessary to make all contexts used in
'mergecopies' reuse the same '_ancestrycontext'.
The linkrev adjustment will likely do the same ancestry walking multiple time
so we already have an optional mechanism to take advantage of this. Since
4e4e9e954fae, linkrev adjustment was done lazily to prevent too bad performance
impact on rename computation. However, this laziness created a quadratic
situation in 'annotate'.
Mercurial repo: hg annotate mercurial/commands.py
before: 8.090
after: 36.300
Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 1.190
after: 290.230
So we setup sharing of the ancestry context in the annotate case too. Linkrev
adjustment still have an impact but it a much more sensible one.
Mercurial repo: hg annotate mercurial/commands.py
before: 36.300
after: 10.230
Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 290.230
after: 5.560
Paths into the subrepo are not yet supported.
The need to use the workingctx in the subrepo will likely be used more in the
future, with the proposed working directory revset symbol. It is also needed
with archive, if that code is to be reused to support 'extdiff -S'.
Unfortunately, it doesn't seem possible to put the smarts in subrepo.subrepo(),
as it breaks various status and diff tests.
I opted not to pass the desired revision into the subrepo method explicitly,
because the only ones that do pass an explicit revision are methods like status
and diff, which actually operate on two contexts- the subrepo state and the
explicitly passed revision.
This brings parity with gitsubrepo and svnsubrepo (which already reference their
parent), and will be used in an upcoming patch.
I'm a bit concerned that the parent context could get stale (consider what
happens when the parent repo is reverted for example). I tried adding the
parent context to the substate tuple so that the parent is available everywhere
a state change is possible, but that made submerge() unhappy. Even with
removing the parent context inside submerge(), I wasn't able to get all of the
test diffs fixed.
But since the other subrepos reference their parent too, if there is a problem,
it is a preexisting one (that nobody seems to be running into). It can be fixed
if/when it pops up.
This has mostly the same semantics as the files that the 'ui.portablefilenames'
config option would warn or abort about. The only difference is filenames that
case-fold to the same string -- given a set of filenames we've already
checked we can check whether a new one collides with them, but we don't have a
way to tell which filename it collided with.
A style name should not contain "/", "\", "." and "..". Otherwise, templates
could be loaded from outside of the specified templates directory by invalid
?style= parameter. hgweb should not allow such requests.
This change means subdir/name is also rejected.
ntpath.join() of Python 2.7.9 does not work as expected if root is a UNC path
to top of share.
This patch doesn't take care of os.altsep, '/' on Windows, because root should
be normalized by realpath().
Before this change, amending a merge would lose the rename information for file
adding in the second parents and implicitly renamed into a new directory.
In case of the merge, we also look for directory rename data from the second
parent. This seems to fix the bug and does not show other issues from the test
suite.
Containment checking is slower in treemanifest than it is in
manifestdict, making the current diff algorithm O(n log n). By
traversing both treemanifests in parallel, we can make it O(n). More
importantly, once we start lazily loading submanifests, we will be
able to easily skip entire submanifest if they have the same nodeid.
This change adds boolean configuration option
experimental.treemanifest. When the option is enabled, manifests are
parsed into the new treemanifest type.
Tests can be now run using treemanifest by switching the config option
default in localrepo._applyrequirements(). Tests pass even when made
to randomly choose between manifestdict and treemanifest, suggesting
that the two types produce identical manifests (so e.g. a manifest
revlog entry written from a treemanifest can be parsed by the
manifestdict code).
There are a number of problems with large and flat manifests. Copying
from http://mercurial.selenic.com/wiki/ManifestShardingPlan:
* manifest too large for RAM
* manifest resolution too much CPU (long delta chains)
* committing is slow because entire manifest has to be hashed
* impossible for narrow clone to leave out part of manifest as all is
needed to calculate new hash
* diffing two revisions involves traversing entire subdirectories
even if identical
This is a first step in a series introducing a manifest revlog per
directory.
This change adds a new manifest class: treemanifest, which is a tree
where each node has a dict of files (nodeids), a dict of flags, and a
dict of subdirectories (treemanifests). So far, it behaves just like
manifestdict, but it will later help us write one manifest revlog per
directory. The new class is still unused; it will be used after the
next change.
The code is not yet optimized. Running with it (see below) makes most
or all operations slower. Once we start storing manifest revlogs for
every directory, it should be possible to make many of these
operations much faster. The fastdelta() optimization has been
intentionally not implemented for the treemanifests. We can implement
it later if necessary.
All tests pass when run with the following patch (and without, of
couse):
--- a/mercurial/manifest.py Thu Mar 19 11:08:42 2015 -0700
+++ b/mercurial/manifest.py Thu Mar 19 11:15:50 2015 -0700
@@ -596,7 +596,7 @@ class manifest(revlog.revlog):
return None, None
def add(self, m, transaction, link, p1, p2, added, removed):
- if p1 in self._mancache:
+ if False and p1 in self._mancache:
# If our first parent is in the manifest cache, we can
# compute a delta here using properties we know about the
# manifest up-front, which may save time later for the
@@ -626,3 +626,5 @@ class manifest(revlog.revlog):
self._mancache[n] = (m, arraytext)
return n
+
+manifestdict = treemanifest
This patch adds utility function "summary()", to replace comparing
bookmarks in "commands.summary()". This replacement finishes
centralizing the logic to compare bookmarks into "bookmarks.compare()".
This patch also adds test to check summary output with
incoming/outgoing bookmarks, because "hg summary --remote" is not
tested yet on the repository with incoming/outgoing bookmarks.
This test uses "(glob)" to ignore summary about incoming/outgoing
changesets.
This replacement makes enhancement of "show outgoing bookmarks" easy,
because "compare()" can detect more detailed difference of bookmarks
between two repositories.
This replacement makes enhancement of "show incoming bookmarks" easy,
because "compare()" can detect more detailed difference of bookmarks
between two repositories.
Previously we tried to avoid manifest.intersectfiles for exact matches
with less than 100 files. However, when the left side of the "or" is false,
the right side gets evaluated, of course, and the evaluation of "util.all(fn
in self for fn in files)" is both costly in itself, and likely to be true,
causing intersectfiles() to be called after all. Fix this by moving the
check for less than 100 files outside of the "or" expression, thereby also
making it apply for a non-exact matcher, should one be passed in.
Conversion of a merge starts with p1 and re-adds the files that were changed in
the merge or came unmodified from p2. Files that are unmodified from p1 will
thus not be touched and take no time. Files that are unmodified from p2 would be
retrieved and rehashed. They would end up getting the same hash as in p2 and end
up reusing the filelog entry and look like the p1 case ... but it was slow.
Instead, make getchanges also return 'files that are unmodified from p2' so the
sink can reuse the existing p2 entry instead of calling getfile.
Reuse of filelog entries can make a big difference when files are big and with
long revlong chains so they take time to retrieve and hash, or when using an
expensive custom getfile function (think
http://mercurial.selenic.com/wiki/ConvertExtension#Customization with a code
reformatter).
This in combination with changes to reuse filectx entries in
localrepo._filecommit make 'unchanged from p2' almost as fast as 'unchanged
from p1'.
This is so far only implemented for the combination of hg source and hg sink.
This is a refactoring/optimization. It is covered by existing tests and show no
changes - which is a good thing.
It is currently only amend and debugbuilddag that will pass a filectx to
localrepo._filecommit. Amend will usually not hit the case where a the filectx
is a parent that just can be reused. Future convert changes will use it more.
As for other currently in place cycle detection, arbitrarily cut the first
obsolescence link that create a cycle avoiding the infinite loop. This will have
to be made more deterministic in the future but we do not really care right now.
The class only depends on the 'repo' variable in the closure, so let's
move the class out of the function and make it explicit that that (the
repo) is all it needs.
This has several advantages compared to resolving it relative to the root:
- '--prefix .' works as expected.
- consistent with upcoming 'hg diff' option to produce relative patches
(I made sure to put in the (glob) annotations this time!)
Nobody should start a transaction on an unlocked repository. If
developer warnings are enabled this will be reported. This use the
same config as bad locking order since this is closely related.
Transitioning to Mercurial versions with revision branch cache could be slow as
long as all operations were readonly (revset queries) and the cache would be
populated but not written back.
Instead, fall back to using the consistently slow path when readonly and the
cache doesn't exist yet. That avoids the overhead of populating the cache
without writing it back.
If not readonly, it will still populate all missing entries initially. That
avoids repeated writing of the cache file with small updates, and it also makes
sure a fully populated cache available for the readonly operations.
Even if largefiles extension is enabled in a repository, "repo"
object, which isn't "largefiles.reposetup()"-ed, is passed to
overridden functions in the cases below unexpectedly, because
extensions are enabled for each repositories strictly.
(1) clone without -U:
(2) pull with -U:
(3) pull with --rebase:
combination of "enabled@src", "disabled@dst" and
"not-required@src" cause this situation.
largefiles requirement
@src @dst @src result
-------- -------- --------------- --------------------
enabled disabled not-required aborted unexpectedly
required requirement error (intentional)
-------- -------- --------------- --------------------
enabled enabled * success
-------- -------- --------------- --------------------
disabled enabled * success (only for "pull")
-------- -------- --------------- --------------------
disabled disabled not-required success
required requirement error (intentional)
-------- -------- --------------- --------------------
(4) update/revert with a subrepo disabling largefiles
In these cases, overridden functions cause accessing to largefiles
specific fields of not "largefiles.reposetup()"-ed "repo" object, and
execution is aborted.
- (1), (2), (4) cause accessing to "_lfstatuswriters" in
"getstatuswriter()" invoked via "updatelfiles()"
- (3) causes accessing to "_lfcommithooks" in "overriderebase()"
For safe accessing to these fields, this patch examines whether passed
"repo" object is "largefiles.reposetup()"-ed or not before accessing
to them.
This patch chooses examining existence of newly introduced
"_largefilesenabled" instead of "_lfcommithooks" and
"_lfstatuswriters" directly, because the former is better name for the
generic "largefiles is enabled in this repo" mark than the latter.
In the future, all other overridden functions should avoid largefiles
specific processing for efficiency, and "_largefilesenabled" is better
also for such purpose.
BTW, "lfstatus" can't be used for such purpose, because some code
paths set it forcibly regardless of existence of it in specified
"repo" object.
The default joinfmt, "x.values()[0]", can't be used here because it picks
either 'bookmark' or 'current' randomly.
I got wrong result with PYTHONHASHSEED=1 on my amd64 machine.
showlist() is the helper to build _hybrid object from a trivial list. It can't
be applied if each value has more than one items, 'bookmark' and 'current' in
this case.
This change is necessary to fix random failure of "{join(bookmarks, sep)}".
Starting with b1e85ff3a7fc, when a clone reached the checkout stage,
the cached changelog in the filtered view was still seeing the
_delayed flag, even though the changelog had already been finalized.
My Mac test machine has 'echo' in '/opt/local/libexec/gnubin/echo' since, well,
GNU is not BSD.
Also, I feel it need to be said about using regexes:
Some people, when confronted with a problem, think "I know, I'll use regular
expressions." Now they have two problems.