af5de4d23fd4 introduced nice hexified display of missing nodes. It did however
also make missing 20 character revision specifications be shown as hex - very
confusing.
Users are often wrong and somehow specify revisions that don't exist. Nodes
will however rarely be missing ... and they will only look like a user provided
revision specification and be all ascii in 1 of 4*10**9.
With this change, missing revisions will only be hexified if they really look
like binary nodes. This change will thus improve the error reporting UI in the
common case and only very rarely make it confusing in the opposite direction of
how it was before.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
To detect change of a file without redundant comparison of file
content, dirstate recognizes a file as certainly clean, if:
(1) it is already known as "normal",
(2) dirstate entry for it has valid (= not "-1") timestamp, and
(3) mode, size and timestamp of it on the filesystem are as same as
ones expected in dirstate
This works as expected in many cases, but doesn't in the corner case
that changing a file keeps mode, size and timestamp of it on the
filesystem.
The timetable below shows steps in one of typical such situations:
---- ----------------------------------- ----------------
timestamp of "f"
----------------
dirstate file-
time action mem file system
---- ----------------------------------- ---- ----- -----
* *** ***
- 'hg transplant REV1 REV2 ...'
- transplanting REV1
....
N
- change "f", but keep size N
(via 'patch.patch()')
- 'dirstate.normal("f")' N ***
(via 'repo.commit()')
- transplanting REV2
- change "f", but keep size N
(via 'patch.patch()')
- aborted while patching
N+1
- release wlock
- 'dirstate.write()' N N N
- 'hg status' shows "r1" as "clean" N N N
---- ----------------------------------- ---- ----- -----
The most important point is that 'dirstate.write()' is executed at N+1
or later. This causes writing dirstate timestamp N of "f" out
successfully. If it is executed at N, 'parsers.pack_dirstate()'
replaces timestamp N with "-1" before actual writing dirstate out.
This issue can occur when 'hg transplant' satisfies conditions below:
- multiple revisions to be transplanted change the same file
- those revisions don't change mode and size of the file, and
- the 2nd or later revision of them fails after changing the file
The root cause of this issue is that files are changed without
flushing in-memory dirstate changes via 'repo.commit()' (even though
omitting 'dirstate.normallookup()' on files changed by 'patch.patch()'
for efficiency also causes this issue).
To detect changes of files correctly, this patch writes in-memory
dirstate changes out explicitly after marking files as clean in
'committablectx.markcommitted()', which is invoked via
'repo.commit()'.
After this change, timetable is changed as below:
---- ----------------------------------- ----------------
timestamp of "f"
----------------
dirstate file-
time action mem file system
---- ----------------------------------- ---- ----- -----
* *** ***
- 'hg transplant REV1 REV2 ...'
- transplanting REV1
....
N
- change "f", but keep size N
(via 'patch.patch()')
- 'dirstate.normal("f")' N ***
(via 'repo.commit()')
----------------------------------- ---- ----- -----
- 'dirsttate.write()' -1 -1
----------------------------------- ---- ----- -----
- transplanting REV2
- change "f", but keep size N
(via 'patch.patch()')
- aborted while patching
N+1
- release wlock
- 'dirstate.write()' -1 -1 N
- 'hg status' shows "r1" as "clean" -1 -1 N
---- ----------------------------------- ---- ----- -----
To reproduce this issue in tests certainly, this patch emulates some
timing critical actions as below:
- change "f" at N
'patch.patch()' with 'fakepatchtime.py' explicitly changes mtime
of patched files to "2000-01-01 00:00" (= N).
- 'dirstate.write()' via 'repo.commit()' at N
'fakedirstatewritetime.py' forces 'pack_dirstate()' to use
"2000-01-01 00:00" as "now", only if 'pack_dirstate()' is invoked
via 'committablectx.markcommitted()'.
- 'dirstate.write()' via releasing wlock at N+1 (or "not at N")
'pack_dirstate()' via releasing wlock uses actual timestamp at
runtime as "now", and it should be different from the "2000-01-01
00:00" of "f".
BTW, this patch doesn't test cases below, even though 'patch.patch()'
is used similarly in these cases:
1. failure of 'hg import' or 'hg qpush'
2. success of 'hg import', 'hg qpush' or 'hg transplant'
Case (1) above doesn't cause this kind of issue, because:
- if patching is aborted by conflicts, changed files are committed
changed files are marked as CLEAN, even though they are partially
patched.
- otherwise, dirstate are fully restored by 'dirstateguard'
For example in timetable above, timestamp of "f" in .hg/dirstate
is restored to -1 (or less than N), and subsequent 'hg status' can
detect changes correctly.
Case (2) always causes 'repo.status()' invocation via 'repo.commit()'
just after changing files inside same wlock scope.
---- ----------------------------------- ----------------
timestamp of "f"
----------------
dirstate file-
time action mem file system
---- ----------------------------------- ---- ----- -----
N *** ***
- make file "f" clean N
- execute 'hg foobar'
....
- 'dirstate.normal("f")' N ***
(e.g. via dirty check
or previous 'repo.commit()')
- change "f", but keep size N
- 'repo.status()' (*1)
(via 'repo.commit()')
---- ----------------------------------- ---- ----- -----
At a glance, 'repo.status()' at (*1) seems to cause similar issue (=
"changed files are treated as clean"), but actually doesn't.
'dirstate._lastnormaltime' should be N at (*1) above, because
'dirstate.normal()' via dirty check is finished at N.
Therefore, "f" changed at N (= 'dirstate._lastnormaltime') is forcibly
treated as "unsure" at (*1), and changes are detected as expected (see
'dirstate.status()' for detail).
If 'hg import' is executed with '--no-commit', 'repo.status()' isn't
invoked just after changing files inside same wlock scope.
But preceding 'dirstate.normal()' is invoked inside another wlock
scope via 'cmdutil.bailifchanged()', and in-memory changes should be
flushed at the end of that scope.
Therefore, timestamp N of clean "f" should be replaced by -1, if
'dirstate.write()' is invoked at N. It means that condition of this
issue isn't satisfied.
To detect change of a file without redundant comparison of file
content, dirstate recognizes a file as certainly clean, if:
(1) it is already known as "normal",
(2) dirstate entry for it has valid (= not "-1") timestamp, and
(3) mode, size and timestamp of it on the filesystem are as same as
ones expected in dirstate
This works as expected in many cases, but doesn't in the corner case
that changing a file keeps mode, size and timestamp of it on the
filesystem.
The timetable below shows steps in one of typical such situations:
---- ----------------------------------- ----------------
timestamp of "f"
----------------
dirstate file-
time action mem file system
---- ----------------------------------- ---- ----- -----
N -1 ***
- make file "f" clean N
- execute 'hg foobar'
- instantiate 'dirstate' -1 -1
- 'dirstate.normal("f")' N -1
(e.g. via dirty check)
- change "f", but keep size N
N+1
- release wlock
- 'dirstate.write()' N N
- 'hg status' shows "f" as "clean" N N N
---- ----------------------------------- ---- ----- -----
The most important point is that 'dirstate.write()' is executed at N+1
or later. This causes writing dirstate timestamp N of "f" out
successfully. If it is executed at N, 'parsers.pack_dirstate()'
replaces timestamp N with "-1" before actual writing dirstate out.
Occasional test failure for unexpected file status is typical example
of this corner case. Batch execution with small working directory is
finished in no time, and rarely satisfies condition (2) above.
This issue can occur in cases below;
- 'hg revert --rev REV' for revisions other than the parent
- failure of 'merge.update()' before 'merge.recordupdates()'
The root cause of this issue is that files are changed without
flushing in-memory dirstate changes via 'repo.commit()' (even though
omitting 'dirstate.normallookup()' on changed files also causes this
issue).
To detect changes of files correctly, this patch writes in-memory
dirstate changes out explicitly after marking files as clean in
'workingctx._checklookup()', which is invoked via 'repo.status()'.
After this change, timetable is changed as below:
---- ----------------------------------- ----------------
timestamp of "f"
----------------
dirstate file-
time action mem file system
---- ----------------------------------- ---- ----- -----
N -1 ***
- make file "f" clean N
- execute 'hg foobar'
- instantiate 'dirstate' -1 -1
- 'dirstate.normal("f")' N -1
(e.g. via dirty check)
----------------------------------- ---- ----- -----
- 'dirsttate.write()' -1 -1
----------------------------------- ---- ----- -----
- change "f", but keep size N
N+1
- release wlock
- 'dirstate.write()' -1 -1
- 'hg status' -1 -1 N
---- ----------------------------------- ---- ----- -----
To reproduce this issue in tests certainly, this patch emulates some
timing critical actions as below:
- timestamp of "f" in '.hg/dirstate' is -1 at the beginning
'hg debugrebuildstate' before command invocation ensures it.
- make file "f" clean at N
- change "f" at N
'touch -t 200001010000' before and after command invocation
changes mtime of "f" to "2000-01-01 00:00" (= N).
- invoke 'dirstate.write()' via 'repo.status()' at N
'fakedirstatewritetime.py' forces 'pack_dirstate()' to use
"2000-01-01 00:00" as "now", only if 'pack_dirstate()' is invoked
via 'workingctx._checklookup()'.
- invoke 'dirstate.write()' via releasing wlock at N+1 (or "not at N")
'pack_dirstate()' via releasing wlock uses actual timestamp at
runtime as "now", and it should be different from the "2000-01-01
00:00" of "f".
BTW, this patch also changes 'test-largefiles-misc.t', because adding
'dirstate.write()' makes recent dirstate changes visible to external
process.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Some code cannot handle a subrepo based on the working directory (e.g.
sub.dirty()), so the caller must opt in. This will be useful for archive, and
perhaps some other commands. The git and svn methods where this is used may
need to be fixed up on a case by case basis.
Since node is None for workingctx, it can't use the base class
implementation of 'hex(self.node())'.
It doesn't appear that there are any current callers of this, but there will be
when archive supports 'wdir()'. My first thought was to use "{p1node}+", but
that would cause headaches elsewhere [1].
We should probably fix up localrepository.__getitem__ to accept this hash for
consistency, as a followup. This works, if the full hash is specified:
@@ -480,7 +480,7 @@
return dirstate.dirstate(self.vfs, self.ui, self.root, validate)
def __getitem__(self, changeid):
- if changeid is None:
+ if changeid is None or changeid == 'ff' * 20:
return context.workingctx(self)
if isinstance(changeid, slice):
return [context.changectx(self, i)
That differs from null, where it will accept any number of 0s, as long as it
isn't ambiguous.
[1] https://www.selenic.com/pipermail/mercurial-devel/2015-June/071166.html
Ultimately, this will be used by scmutil. The subrepo module already imports
it, so it can't import the subrepo module to access the underlying method.
The uncommit command in evolve needs to create memory context with given
extra parameters. This patch allows us to do that instead of always giving them
an empty value and having to override it afterwards.
Previously, the first added test printed the following:
$ hg files -S -r '.^' sub1/sub2/folder
sub1/sub2/folder: no such file in rev 9bb10eebee29
sub1/sub2/folder: no such file in rev 9bb10eebee29
sub1/sub2/folder/test.txt
One warning occured each time a subrepo was crossed into.
The second test ensures that the matcher copy stays in place. Without the copy,
the bad() function becomes an increasingly longer chain, and no message would be
printed out for a file missing in the subrepo because the predicate would match
in one of the replaced methods. Manifest doesn't know anything about subrepos,
so it needs help ignoring subrepos when complaining about bad files.
This will work for any command that creates its matcher via scmutil.match(), but
only the files command is tested here (both workingctx and basectx based tests).
The previous behavior was to completely ignore the files in the subrepo, even
though -S was given.
My first attempt was to teach context.walk() to optionally recurse, but once
that was in place and the complete file list was built up, the predicate test
would fail with 'path in nested repo' when a file in a subrepo was accessed
through the parent context.
There are two slightly surprising behaviors with this functionality. First, any
path provided inside the fileset isn't narrowed when it is passed to the
subrepo. I dont see any clean way to do that in the matcher. Fortunately, the
'subrepo()' fileset is the only one to take a path.
The second surprise is that status predicates are resolved against the subrepo,
not the parent like 'hg status -S' is. I don't see any way to fix that either,
given the path auditor error mentioned above.
This should avoid the bad performance in the following scenario. Before this
patch, on "hg annotate -r10000", p.rev() would walk changelog from 10000 to 3
because _descendantrev was 10000. With this patch, it walks from 5 to 3.
1 -- 2 -- 4 -- 5 -- ... -- 10000
\ 'p' 'f'
- 3 (grafted 3 to 4)
'p'
repo: https://hg.mozilla.org/releases/mozilla-beta/#4f80fecda802
command: hg annotate -r b0a57152fd14 browser/app/profile/firefox.js
before: 83.120 secs
after: 3.820 secs
This patch involves extra calls of narrow _adjustlinkrev(), but the cost of
them seems relatively small compared to wide _adjustlinkrev() calls eliminated
by this patch.
repo: http://selenic.com/repo/hg/#d668bc5b9a06
command: hg annotate mercurial/commands.py
before: 7.380 secs
after: 7.320 secs
repo: https://hg.mozilla.org/mozilla-central/#f214df6ac75f
command: hg annotate layout/generic/nsTextFrame.cpp
before: 5.070 secs
after: 5.050 secs
repo: https://hg.mozilla.org/releases/mozilla-beta/#4f80fecda802
command: hg annotate -r 4954faa47dd0 gfx/thebes/gfxWindowsPlatform.cpp
before: 1.600 secs
after: 1.620 secs
_ancestrycontext is necessary for fast lookup of _changeid. Because we can't
compute the ancestors from wctx, we skip to its parents. 'None' is not needed
to be included in _ancestrycontext because it is used for a membership test
of filelog revisions.
repo: https://hg.mozilla.org/releases/mozilla-beta/#062e49bcb2da
command: hg annotate -r 'wdir()' gfx/thebes/gfxWindowsPlatform.cpp
before: 51.520 sec
after: 1.780 sec
Before this patch, annotating working directory could include wrong revisions
that were hidden or belonged to different branches. This fixes wfctx.parents()
to set _descendantrev so that all ancestors can take advantage of the linkrev
adjustment introduced at a5aaaeedd6cb. _adjustlinkrev() can handle 'None'
revision thanks to bb19d597bbcd.
This series tries to fix wrong ancestry information on annotating working
directory. This change should slightly improves the readability of the next
patch.
This class is only needed on case insensitive filesystems, and only
for wdir context matches. It allows the user to not match the case of
the items in the filesystem- especially for naming directories, which
dirstate doesn't handle[1]. Making dirstate handle mismatched
directory cases is too expensive[2].
Since dirstate doesn't apply to committed csets, this is only created by
overriding basectx.match() in workingctx, and only on icasefs. The default
arguments have been dropped, because the ctx must be passed to the matcher in
order to function.
For operations that can apply to both wdir and some other context, this ends up
normalizing the filename to the case as it exists in the filesystem, and using
that case for the lookup in the other context. See the diff example in the
test.
Previously, given a directory with an inexact case:
- add worked as expected
- diff, forget and status would silently ignore the request
- files would exit with 1
- commit, revert and remove would fail (even when the commands leading up to
them worked):
$ hg ci -m "AbCDef" capsdir1/capsdir
abort: CapsDir1/CapsDir: no match under directory!
$ hg revert -r '.^' capsdir1/capsdir
capsdir1\capsdir: no such file in rev 64dae27060b7
$ hg remove capsdir1/capsdir
not removing capsdir1\capsdir: no tracked files
[1]
Globs are normalized, so that the -I and -X don't need to be specified with a
case match. Without that, the second last remove (with -X) removes the files,
leaving nothing for the last remove. However, specifying the files as
'glob:**.Txt' does not work. Perhaps this requires 're.IGNORECASE'?
There are only a handful of places that create matchers directly, instead of
being routed through the context.match() method. Some may benefit from changing
over to using ctx.match() as a factory function:
revset.checkstatus()
revset.contains()
revset.filelog()
revset._matchfiles()
localrepository._loadfilter()
ignore.ignore()
fileset.subrepo()
filemerge._picktool()
overrides.addlargefiles()
lfcommands.lfconvert()
kwtemplate.__init__()
eolfile.__init__()
eolfile.checkrev()
acl.buildmatch()
Currently, a toplevel subrepo can be named with an inexact case. However, the
path auditor gets in the way of naming _anything_ in the subrepo if the top
level case doesn't match. That is trickier to handle, because there's the user
provided case, the case in the filesystem, and the case stored in .hgsub. This
can be fixed next cycle.
--- a/tests/test-subrepo-deep-nested-change.t
+++ b/tests/test-subrepo-deep-nested-change.t
@@ -170,8 +170,15 @@
R sub1/sub2/test.txt
$ hg update -Cq
$ touch sub1/sub2/folder/bar
+#if icasefs
+ $ hg addremove Sub1/sub2
+ abort: path 'Sub1\sub2' is inside nested repo 'Sub1'
+ [255]
+ $ hg -q addremove sub1/sub2
+#else
$ hg addremove sub1/sub2
adding sub1/sub2/folder/bar (glob)
+#endif
$ hg status -S
A sub1/sub2/folder/bar
? foo/bar/abc
The narrowmatcher class may need to be tweaked when that is fixed.
[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068183.html
[2] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068191.html
This patch extends the workaround introduced by d5844c5f6c7b. Even if the
base fctx is the same as intorrev, _ancestrycontext must be built for faster
_changeid lookup.
repo: https://hg.mozilla.org/releases/mozilla-beta
command: hg annotate -r 4954faa47dd0 gfx/thebes/gfxWindowsPlatform.cpp
before: 52.450 sec
after: 1.820 sec
When the source rev value is 'None', the ctx is a working context. We
cannot compute the ancestors from there so we directly skip to its
parents. This will be necessary to allow 'None' value for
'_descendantrev' itself necessary to make all contexts used in
'mergecopies' reuse the same '_ancestrycontext'.
The linkrev adjustment will likely do the same ancestry walking multiple time
so we already have an optional mechanism to take advantage of this. Since
4e4e9e954fae, linkrev adjustment was done lazily to prevent too bad performance
impact on rename computation. However, this laziness created a quadratic
situation in 'annotate'.
Mercurial repo: hg annotate mercurial/commands.py
before: 8.090
after: 36.300
Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 1.190
after: 290.230
So we setup sharing of the ancestry context in the annotate case too. Linkrev
adjustment still have an impact but it a much more sensible one.
Mercurial repo: hg annotate mercurial/commands.py
before: 36.300
after: 10.230
Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 290.230
after: 5.560
We're going to make rev() lazily do _adjustlinkrevs, and we don't want
that to happen when we're quickly tracing through file ancestry
without caring about revs (as we do when finding copies).
This takes us back to pre-linkrev-correction behavior, but shouldn't
regress us relative to the last stable release.
The new linkrev adjustement mechanism makes rename detection very slow, because
each file rewalks the ancestor dag. To mitigate the issue in Mercurial 3.3, we
introduce a simplistic way to share the ancestors computation for the linkrev
validation phase.
We can reuse the ancestors in that case because we do not care about
sub-branching in the ancestors graph.
The cached set will be use to check if the linkrev is valid in the search
context. This is the vast majority of the ancestors usage during copies search
since the uncached one will only be used when linkrev is invalid, which is
hopefully rare.
We are going to introduce some wider caching mechanisms during linkrev
adjustment. As there is no specific reason to not be a method and some
reasons to be a method, let's make it a method.
The logic of walking a manifest to yield files matching a match object is
currently being done by context, not the manifest itself. This moves the walk()
function to both manifestdict and treemanifest. This separate implementation
will also permit differing, optimized implementations for each manifest.
When the manifest revision is stored as a delta against a non-parent revision,
'_adjustlinkrev' could miss some file update because it was using the delta
only. We now use the 'fastread' method that uses the delta only when it makes
sense.
A test showcasing on the of possible issue have been added.
Now that the caching into _status is done in
workingctx._dirstatestatus(), which workingcommitctx._dirstatestatus()
does not call, there is no caching to prevent in _buildstatus(), so
stop overriding it.
Since it's only the dirstate status we cache, it makes more sense to
cache it in the _dirstatestatus() method. Note that this change means
the dirstate status will also be cached when status is requested
between the working copy and some other revision, while we currently
only cache the result if exactly the status between the working copy
and its parent is requested.
b04f57726c73 changed basefilectx.annotate() to directly instantiate new
filectx's instead of going through self.filectx(), this breaks extensions that
replace the filectx class, and would also break future uses that would need
memfilectx's.
This further simplifies the status code.
This simplification comes at a slight performance cost for `hg
export`. Before, on mozilla-central:
perfmanifest tip
! wall 0.265977 comb 0.260000 user 0.240000 sys 0.020000 (best of 38)
perftags
! result: 162
! wall 0.007172 comb 0.010000 user 0.000000 sys 0.010000 (best of 403)
perfstatus
! wall 0.422302 comb 0.420000 user 0.260000 sys 0.160000 (best of 24)
hgperf export tip
! wall 0.148706 comb 0.150000 user 0.150000 sys 0.000000 (best of 65)
after, same repo:
perfmanifest tip
! wall 0.267143 comb 0.270000 user 0.250000 sys 0.020000 (best of 37)
perftags
! result: 162
! wall 0.006943 comb 0.010000 user 0.000000 sys 0.010000 (best of 397)
perfstatus
! wall 0.411198 comb 0.410000 user 0.260000 sys 0.150000 (best of 24)
hgperf export tip
! wall 0.173229 comb 0.170000 user 0.170000 sys 0.000000 (best of 55)
The next set of patches introduces a new manifest type implemented
almost entirely in C, and more than makes up for the performance hit
incurred in this change.
We can do a little tiny bit better by enhancing manifest.diff to
optionally include files that are in both sides. This will be done in
a followup patch.
This is necessary to annotate workingctx revision. basefilectx.linkrev() can't
be used because committablefilectx has no filelog.
committablefilectx looks for parents() from self._changectx. That means fctx
is linked to self._changectx, so linkrev() can simply be aliased to rev().
When both '.' (the working copy root) and an explicit file (or files)
are in match.files(), we only walk the explicitly listed files. This
is because we remove the '.' from the set too early. Move later and
add a test for it. Before this change, the last test would print only
"3".
This is similar to 327902ff25df in motivation. All contexts now have this
method, so the rest of the 'ctx._repo' uses can be converted without worrying
about what type of context it is.
A couple places in context currently use "x in self._dirs" to check for the
existence of the directory, but this requires that all directories be loaded
into a dict. Calling hasdir() instead puts the work on the the manifest to
check for the existence of a directory in the most efficient manner.