A style name should not contain "/", "\", "." and "..". Otherwise, templates
could be loaded from outside of the specified templates directory by invalid
?style= parameter. hgweb should not allow such requests.
This change means subdir/name is also rejected.
ntpath.join() of Python 2.7.9 does not work as expected if root is a UNC path
to top of share.
This patch doesn't take care of os.altsep, '/' on Windows, because root should
be normalized by realpath().
Before this change, amending a merge would lose the rename information for file
adding in the second parents and implicitly renamed into a new directory.
In case of the merge, we also look for directory rename data from the second
parent. This seems to fix the bug and does not show other issues from the test
suite.
Transitioning to Mercurial versions with revision branch cache could be slow as
long as all operations were readonly (revset queries) and the cache would be
populated but not written back.
Instead, fall back to using the consistently slow path when readonly and the
cache doesn't exist yet. That avoids the overhead of populating the cache
without writing it back.
If not readonly, it will still populate all missing entries initially. That
avoids repeated writing of the cache file with small updates, and it also makes
sure a fully populated cache available for the readonly operations.
The default joinfmt, "x.values()[0]", can't be used here because it picks
either 'bookmark' or 'current' randomly.
I got wrong result with PYTHONHASHSEED=1 on my amd64 machine.
showlist() is the helper to build _hybrid object from a trivial list. It can't
be applied if each value has more than one items, 'bookmark' and 'current' in
this case.
This change is necessary to fix random failure of "{join(bookmarks, sep)}".
Starting with b1e85ff3a7fc, when a clone reached the checkout stage,
the cached changelog in the filtered view was still seeing the
_delayed flag, even though the changelog had already been finalized.
Before this patch, revset predicate "tag()" and "named('tags')" differ
from each other, because the former doesn't include "tip" but the
latter does.
For equivalence, "named('tags')" shouldn't include the revision
corresponded to "tip". But just removing "tip" from the "tags"
namespace causes breaking backward compatibility, even though "tip"
itself is planned to be eliminated, as mentioned below.
http://selenic.com/pipermail/mercurial-devel/2015-February/066157.html
To mask specific names ("tip" in this case) for "named()" predicate,
this patch introduces "deprecated" into "namespaces", and makes
"named()" predicate examine whether each names are masked by the
namespace, to which they belong.
"named()" will really work correctly after 3.3.1 (see a3c326a7f57a for
detail), and fixing this on STABLE before 3.3.1 can prevent initial
users of "named()" from expecting "named('tags')" to include "tip".
It is reason why this patch is posted for STABLE, even though problem
itself isn't so serious.
This may have to be flagged as "(BC)", if applied on DEFAULT.
There are currently no tests for change/remove conflicts of subrepos,
and it's pretty broken. Add some tests demonstrating some of the
breakages and fix the most obvious one (a KeyError when trying to look
up a subrepo in the wrong context).
There's a "p.changeset-age span" css block in style-monoblue.css with quite a
bit of rules, including position. They were all unused, since there weren't
matching span element inside the p.changeset-age.
The span was removed in 064b658181dd (as it seemed meaningless at the time?)
and since then relative changeset age text looked weird and broken.
"age" class is used for calculating relative changeset age in javascript: all
content of such element is replaced with human-friendly text (e.g.
"yesterday"). So the new span gets the age class.
Before this patch, revset predicate "named()" uses each nodes gotten
from target namespaces directly.
This causes problems below:
- combination of other predicates doesn't work correctly, because
they assume that revisions are listed up in number
- "hg log" doesn't show any revisions for "named()" result, because:
- "changeset_printer" stores formatted output for each revisions
into dict with revision number (= ctx.rev()) as a key of them
- "changeset_printer.flush(rev)" writes stored output for
the specified revision, but
- "commands.log" invokes it with the node, gotten from "named()"
- "hg debugrevspec" shows nodes (= may be binary) directly
Difference between revset predicate "tag()" and "named('tags')" in
tests is fixed in subsequent patch.
We're going to make rev() lazily do _adjustlinkrevs, and we don't want
that to happen when we're quickly tracing through file ancestry
without caring about revs (as we do when finding copies).
This takes us back to pre-linkrev-correction behavior, but shouldn't
regress us relative to the last stable release.
The new linkrev adjustement mechanism makes rename detection very slow, because
each file rewalks the ancestor dag. To mitigate the issue in Mercurial 3.3, we
introduce a simplistic way to share the ancestors computation for the linkrev
validation phase.
We can reuse the ancestors in that case because we do not care about
sub-branching in the ancestors graph.
The cached set will be use to check if the linkrev is valid in the search
context. This is the vast majority of the ancestors usage during copies search
since the uncached one will only be used when linkrev is invalid, which is
hopefully rare.
We are going to introduce some wider caching mechanisms during linkrev
adjustment. As there is no specific reason to not be a method and some
reasons to be a method, let's make it a method.
Before this patch, "bookmark()", "named()" and "tag()" predicates
raise "Abort", when the specified pattern doesn't match against
existing ones.
This prevents "present()" predicate from continuing the query, because
it only catches "RepoLookupError".
This patch raises "RepoLookupError" instead of "Abort", to make
"present()" predicate continue the query, even if "bookmark()",
"named()" or "tag()" in the sub-query of it are aborted.
This patch doesn't contain raising "RepoLookupError" for "re:" pattern
in "tag()", because "tag()" treats it differently from others. Actions
of each predicates at failure of pattern matching can be summarized as
below:
predicate "literal:" "re:"
---------- ----------- ------------
bookmark abort abort
named abort abort
tag abort continue (*1)
branch abort continue (*2)
---------- ----------- ------------
"tag()" may have to abort in the (*1) case for similarity, but this
change may break backward compatibility of existing revset queries. It
seems to have to be changed on "default" branch (with "BC" ?).
On the other hand, (*2) seems to be reasonable, even though it breaks
similarity, because "branch()" in this case doesn't check exact
existence of branches, but does pick up revisions of which branch
matches against the pattern.
This patch also adds tests for "branch()" to clarify behavior around
"present()" of similar predicates, even though this patch doesn't
change "branch()".
Changeset 427a0ac924e4 removed "showtags()" definition for "tags"
template keyword from "templatekw.py", because "namespaces" puts a
helper function for it into template keyword map automatically. This
works correctly from the point of view of templating functionality.
But on the other hand, it removed "tags" template keyword from "hg
help templates" unexpectedly, because online help text is built before
"namespaces" puts a helper function for "tags" into template keyword
map.
This patch is a kind of backing 427a0ac924e4 out, but this implements
"showtags()" with newly introduced "shownames()" instead of originally
used "showlist()".
The conditional was a bit too narrow and produced buggy result when a node was
present in both common and heads (because it pleased the discovery) and it was
locally known but filtered.
This resulted in buggy getbundle request and server side crash.
Some evolve user ignored the invalid markers for about two years and still have
some of them in some repository. This lead to plain abort whenever mercurial try
to open such repo. We need reinstall some way to clean this up in the evolve
extension. For this purpose, we need the checker code wrap-able independently.
This is scheduled for stable as this issue is blocking some evolve user.
Before this patch, failure of updating subrepos may cause inconsistent
".hgsubstate". For example:
1. dirstate entry for ".hgsubstate" of the parent repo is filled
with valid size/date (via "hg state" or so)
2. "hg update" is invoked at the parent repo
3. ".hgsubstate" of the parent repo is updated on the filesystem as
a part of "g"(et) action in "merge.applyupdates"
4. it is assumed that size/date of ".hgsubstate" on the filesystem
aren't changed from ones at (1)
this is not so difficult condition, because just changing hash
ids (every ids are same in length) in ".hgsubstate" doesn't
change the file size of it
5. "subrepo.submerge()" is invoked to update subrepos
6. failure of updating in one of subrepos raises exception
(e.g. "untracked file differs")
7. "hg update" is aborted without updating dirstate of the parent repo
dirstate entry for ".hgsubstate" still holds size/date at (1)
Then, ".hgsubstate" of the parent repo is treated as "CLEAN"
unexpectedly, because updating ".hgsubstate" at (3) doesn't change
size/date of it on the filesystem: see assumption at (4).
This inconsistent ".hgsubstate" status causes unexpected behavior, for
example:
- "hg revert" forgets to revert ".hgsubstate"
- "hg update" misunderstands that (not yet updated) subrepos diverge
(then, it shows the prompt to confirm user's decision)
To avoid inconsistent ".hgsubstate" status above, this patch marks
".hgsubstate" as possibly dirty before "submerge" invocation.
"normallookup"-ed (= dirty) dirstate should be written out, even if
processing is aborted by failure.
This patch marks ".hgsubstate" as possibly dirty before "submerge",
also when it is removed or merged while merging, for safety. This
should prevent Mercurial from misunderstanding inconsistent
".hgsubstate" as clean.
To satisfy conditions at (1) and (4) above, this patch uses "hg status
--config debug.dirstate.delaywrite=2" (to fill valid size/date into
dirstate) and "touch" (to fix date of the file).
b8552020f458 introduced automatic quoting of diffargs in a not entirely
backwards compatible way. That rendered some of the configuration in
mergetools.rc invalid. It would fail when running extdiff on a single file with
a name that needed quoting.
Before:
$ hg config merge-tools.meld.diffargs
-a --label='$plabel1' $parent --label='$clabel' $child
$ hg --config extdiff.meld= -v --debug meld
running "/usr/bin/meld -a --label=''sp ace@0'' '.../Z.b7a65a1d2f84/sp ace' --label=''sp ace'' '.../sp ace'" in ...
meld: error: too many arguments (wanted 0-3, got 4)
After:
$ hg config merge-tools.meld.diffargs
-a --label=$plabel1 $parent --label=$clabel $child
$ hg --config extdiff.meld= -v --debug meld
running "/usr/bin/meld -a --label='sp ace@0' '.../Z.b7a65a1d2f84/sp ace' --label='sp ace' '.../sp ace'" in ...
(success)
Before this patch, "_()" is used incorrectly for "tag:" and
"bookmark:" fields. "changeset_printer()" looks up keys composed at
runtime, and it prevents "xgettext" command from getting strings to be
translated from source files.
Then, *.po files merged with updated "hg.pot" lose msgids for "tag:"
and "bookmark:".
This patch introduces "logfmt" information into "namespace" to show
l10n-ed messages "hg log" (or "changeset_printer()") correctly.
For similarity to other namespaces, this patch specifies "logfmt" for
"branches" namespace, even though it isn't needed (branch information
is handled in "changeset_printer()" specially).
To find related code paths out easily, this patch adds "i18n: column
positioning ..." comment to the line composing "logfmt" by default,
even though this line itself doesn't contain any strings to be
translated.
The prefetch logic came before the actual population of the actions collection,
so it was always being passed an empty action list. This fixes it by moving it
to after that logic.
The only consumer of this function at the moment is remotefilelog, and I
verified it works with this change.
The warning has in some cases incorrectly attributed unrelated problems to rbc.
Instead, just do like the branch head cache does and stay quiet when reading
fails. The cache will be missing the first time a repo is used. It is a normal
situation and there is no reason to make a note of that.
"dirpath" in the tuple yielded by "vfs.walk()" is relative one from
the root of specified vfs, and absolute path in the warning message is
composed by "vfs.join()".
On the other hand, target file "f" exists in "dirpath", and
"reljoin()" is needed to unlink "f" by "vfs.unlink()".
To eliminate "path prefix" (= "the root of vfs") part from "dirpath"
yielded by "os.walk()" correctly, "path prefix" should have "os.sep"
at the end of own string, but it isn't easy to ensure it, because:
- examination by "path.endswith(os.sep)" isn't portable
Some problematic encodings use 0x5c (= "os.sep" on Windows) as the
tail byte of some multi-byte characters.
- "os.path.join(path, '')" isn't portable
With Python 2.7.9, this invocation doesn't add "os.sep" at the end
of UNC path (see issue4557 for detail).
Python 2.7.9 changed also behavior of "os.path.normpath()" (see *) and
"os.path.splitdrive()" for UNC path.
vfs root normpath splitdrive os.sep required
=============== ============== =================== ============
z:\ z:\ z: + \ no
z:\foo z:\foo z: + \foo yes
z:\foo\ z:\foo z: + \foo yes
[before Python 2.7.9]
\\foo\bar \\foo\bar '' + \\foo\bar yes
\\foo\bar\ \\foo\bar (*) '' + \\foo\bar yes
\\foo\bar\baz \\foo\bar\baz '' + \\foo\bar\baz yes
\\foo\bar\baz\ \\foo\bar\baz '' + \\foo\bar\baz yes
[Python 2.7.9]
\\foo\bar \\foo\bar \\foo\bar + '' yes
\\foo\bar\ \\foo\bar\ (*) \\foo\bar + \ no
\\foo\bar\baz \\foo\bar\baz \\foo\bar + \baz yes
\\foo\bar\baz\ \\foo\bar\baz \\foo\bar + \baz yes
If it is ensured that "normpath()"-ed vfs root is passed to
"splitdrive()", adding "os.sep" is required only when "path" part of
"splitdrive()" result isn't "os.sep" itself. This is just what
"pathutil.nameasprefix()" examines.
This patch applies "os.path.normpath()" on "self.join(None)"
explicitly, because it isn't ensured that vfs root is already
normalized: vfs itself is constructed with "realpath=False" (= avoid
normalizing in "vfs.__init__()") in many code paths.
This normalization should be much cheaper than subsequent file I/O for
directory traversal.
As a preparation for vfs migration of "_sanitize()", this patch passes
"wvfs" to "_sanitize()" and use "wvfs.base" instead of absolute path
to a subrepository.
An upcoming patch will establish per-filter tags caches. We'll want
to use the same cache validation logic as the branch cache. Prepare
for that by moving the logic for computing a filtered view hash
to somewhere central.
This eliminates the following test failure on Windows, as well as a similar one
in evolve's test-wireproto.t. See the previous patch for details on the
problem.
--- e:/Projects/hg/tests/test-init.t
+++ e:/Projects/hg/tests/test-init.t.err
@@ -216,10 +216,10 @@
* test 0:08b9e9f63b32
$ hg clone -e "python \"$TESTDIR/dummyssh\"" local ssh://user@dummy/remote-bookmarks
searching for changes
+ exporting bookmark test
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
- exporting bookmark test
$ hg -R remote-bookmarks bookmarks
test 0:08b9e9f63b32
There are a handful of SSH related test failures on Windows.
--- c:/Users/Matt/Projects/hg/tests/test-bundle2-exchange.t
+++ c:/Users/Matt/Projects/hg/tests/test-bundle2-exchange.t.err
@@ -305,16 +305,16 @@
remote: added 1 changesets with 1 changes to 1 files
remote: 1 new obsolescence markers
updating bookmark book_5fdd
+ pre-close-tip:02de42196ebe draft book_02de
+ postclose-tip:02de42196ebe draft book_02de
+ txnclose hook: HG_SOURCE=push-response HG_TXNNAME=push-response
+ ssh://user@dummy/other HG_URL=ssh://user@dummy/other
remote: pre-close-tip:5fddd98957c8 draft book_5fdd
remote: pushkey: lock state after "bookmarks"
remote: lock: free
remote: wlock: free
remote: postclose-tip:5fddd98957c8 draft book_5fdd
remote: txnclose hook: (env vars truncated)
- pre-close-tip:02de42196ebe draft book_02de
- postclose-tip:02de42196ebe draft book_02de
- txnclose hook: HG_SOURCE=push-response HG_TXNNAME=push-response
- ssh://user@dummy/other HG_URL=ssh://user@dummy/other
$ hg -R other log -G
o 6:5fddd98957c8 draft Nicolas Dumazet <...> book_5fdd C
|
--- c:/Users/Matt/Projects/hg/tests/test-ssh.t
+++ c:/Users/Matt/Projects/hg/tests/test-ssh.t.err
@@ -438,12 +438,12 @@
$ hg push
pushing to ssh://user@dummy/remote
searching for changes
+ local stdout
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
- remote: KABOOM
- local stdout
+ remote: KABOOM\r (esc)
$ cd ..
What is happening is that no data is available in 'sshpeer.pipee' while the
command is executing. As the command completes, local output is printed, and
then sshpeer.cleanup() is called. When it calls 'self.pipeo.close()', the child
process is shutdown, flushing stderr.
As an experiment, I printed a line to stdout and another to stderr instead this
flush(). The stdout data was immediately available to the hg client, and none
of the stderr data was until the child exited. At that point, pipee has all of
the buffered data, and it is read out and printed before the pipe is closed in
sshpeer.cleanup(). This is probably a known issue, since ui.write_err()
mentions that stderr may be buffered, and also flushes stderr.
It would be nice if there was a more general fix (there is one more test that
fails), but I'm not sure what it is. I've seen (ancient) references [1] to
setvbuf() "crashing spectacularly" on some systems if any I/O has been done
already, so it seems worth avoiding.
https://sourceware.org/ml/gdb-patches/2013-08/msg00422.html
[1] https://groups.google.com/forum/#!msg/comp.lang.python/JT8LiYzYDEY/Qg9d1HwyjScJ
Single letter properties are used to keep payload size down, as diff
representation can be quite large and longer property names can create a
lot of extra work for parsers.
Rename is not yet captured. This can be done in a follow-up.
Surpringly, the templates didn't receive an unmodified version of the
line numbers. Expose it to make implementing the JSON templates easier.
In theory, we could post-process an existing template variable. But
extra string manipulation seems quite wasteful, especially on items that
could occur hundreds or even thousands of times in output.
The goal of 'hg revert --interactive' is usually to keep some change in the
revert file, so the files -must-not- be marked as clean. We want the status
logic to do its usual job here.
For unclear reasons (probably timing related), I was unable to build an automated
test that reproduced issue4592 but manual testing shows this is fixed.
__setitem__ on the lazymanifest C type wasn't checking to see if a
line had previously been malloced before replacing it, leading to
leaks if files got updated multiple times in the course of a task.
I was able to reproduce the leak with this change to test-manifest.py:
diff --git a/tests/test-manifest.py b/tests/test-manifest.py
--- a/tests/test-manifest.py
+++ b/tests/test-manifest.py
@@ -456,6 +456,16 @@ class basemanifesttests(object):
['a/b/c/bar.txt', 'a/b/c/foo.txt', 'a/b/d/ten.txt'],
m2.keys())
+ def testManifestSetItem(self):
+ m = self.parsemanifest('')
+ for x in range(3):
+ m['file%d' % x] = BIN_HASH_1
+ for x in range(3):
+ m['file%d' % x] = BIN_HASH_2
+ import time
+ time.sleep(4)
+
+
along with the commands:
$ make local
$ PYTHONPATH=. SILENT_BE_NOISY=1 python tests/test-manifest.py testmanifestdict.testManifestSetItem &
$ sleep 4
$ leaks $(jobs -p | tee /dev/stderr | awk '{print $3}')
$ wait
in an interactive shell on OS X. As far as I can tell, it had to be an
interactive shell so that I could get the pid of the test run using
the jobs builtin. Prior to this change, I was leaking several strings,
and after this change leaks reports no leaks.
I thought there was a bug filed for this in bugzilla, but I can't find
it either in bugzilla or by searching my email.
I came across a case where an internal command was using a revset that I didn't
immediately pass in and it was difficult to debug what was going wrong with the
revset. This prints out the revset and informs the user that the error is with
a rebset so it should be more obvious what and where the error is.
In repos with many changesets, the computation of allfuturecommon
can take a significant amount of time. Since it's only used if
there's an obsstore, don't compute it otherwise.
When we start writing tree manifests with one manifest revlog per
directory, it will still be nice to be able to run tests using tree
manifests in memory but writing to a flat manifest to a single
revlog. Let's break the current '_usetreemanifest' flag on the revlog
into '_treeinmem' and '_treeondisk'. Both are populated from the same
config, but after this change, one can temporarily hard-code
_treeinmem=True to see that tests still pass.
manifestdict() creates an empty manifestdict, so let's consistently
use that instead of explicitly parsing an empty string (which does
result in an empty manifest).
The '--all' option have been introduced in 0a81b7721d8f (August 2006), most
probably to prevent user shooting themselves in the foot. As the record process
will let you, view and select the set of files and change you want to revert, I
feel like the '--all' flag is superfluous in the '--interactive' case.
The series at a17556fc1521::77b112363d48 introduced generic transaction level
hooking. This makes the experimental bundle2 specific hooks redundant, we drop
them.
That way, any new server will be ready to accept bundle2 payload. The decision
for the client to use it is still off by default so this is not turning bundle2
everywhere.
We introduce a new kill switch for this in case stuff goes wrong.
According to 6b1369445b7b introducing "windows._removedirs()":
If a hg repository including working directory is a reparse point
(directory symlinked or a junction point), then using
os.removedirs will remove the reparse point erroneously.
"windows._removedirs()" should be used instead of "os.removedirs()" on
Windows.
This patch adds "removedirs" as platform depending function to replace
"os.removedirs()" invocations for portability and safety
This duplicates "onerror()" function from "svnsubrepo.remove()" for
equivalence of replacing in subsequent patch.
This "onerror()" function for "shutil.rmtree()" was introduced by
094a056562e7, which avoids failure of removing svn repository on
Windows.
Revision 7fbf0ef28408 ('graft: record intermediate grafts in extras') introduced
'intermediate-source' extra which refers to the closest graft source.
As 'intermediate-source' extra provides more detailed information about the source
changeset than 'source' one, it is better to prefer the first one to use as a
value of HGREVISION environment variable for an editor.
It is finally time to freeze the bundle2 format! To do so we:
- rename HG2Y to HG20,
- drop "b2x:" prefix from all part names,
- rename capability to "bundle2-exp" to "bundle2"
- rename the hook flag from 'bundle2-exp' to 'bundle2'
The condition on which manifestdict.matches() and manifestdict.walk()
take the fast path of iterating over files instead of the manifest, is
slightly different. Specifically, walk() does not take the fast path
for exact matchers and it does not avoid taking the fast path when
there are more than 100 files. Let's extract the condition so we don't
have to maintain it in two places and so walk() can gain these two
missing pieces of the condition (although there seems to be no current
caller of walk() with an exact matcher).
When checking whether we can take the fast path of iterating over
matcher files instead of manifest files, we check whether
match.files() is non-empty. However, now that return early for
match.always(), it can only be empty when there are only
include/exclude patterns, but in that case anypats() will be True, so
it's already covered. This makes manifestdict.walk() more similar to
manifestdict.matches().
This cuts down the run time of
hg files -r . > /dev/null
from ~0.850s to ~0.780s on the Firefox repo. Note that
manifest.matches() already has the corresponding optimization.
This patch adds propertycache-ed "_relpath" field to
"abstractsubrepo", to centralize "subrelpath" logic into it.
Now, "subrelpath()" can always return "_relpath" field of the
specified subrepo object, because it is ensured that subrepo object
has it. To reduce changes in this patch, "subrelpath()" itself is
still kept, even though it seems to be redundant.
This is also a part of eliminating "os.path.*" API invocations for
"Windows UTF-8 Plan".
This patch doesn't create vfs object in "abstractsubrepo.__init__()"
but adds propertycache-ed "wvfs" field, because the latter can:
- delay vfs instantiation until it is actually needed
- allow to use "hgsubrepo._repo.wvfs" as "wvfs"
"hgsubrepo._repo" is initialized after
"abstractsubrepo.__init__()" invocation, and passing
"hgsubrepo._repo.wvfs" to "abstractsubrepo.__init__()" is
difficult.
This patch passes "ctx" and "path" instead of "ui" to
"abstractsubrepo.__init__()" and stores them as "_ctx" and "_path" to
use them in subsequent patches.
This also removes redundant field initializations in the constructor
of classes derived from "abstractsubrepo".
This makes treemanifest.walk() not visit submanifests that are known not to
have any matching files. It does this by calling match.visitdir() on
submanifests as it walks.
This change also updates largefiles to be able to work with this new behavior
in treemanifests. It overrides match.visitdir(), the function that dictates
how walk() and matches() skip over directories.
The greatest speed improvements are seen with narrower scopes. For example,
this commit speeds up the following command on the Mozilla repo from 1.14s
to 1.02s:
hg files -r . dom/apps/
Whereas with a wider scope, dom/, the speed only improves from 1.21s to 1.13s.
As with similar a similar optimization to treemanifest.matches(), this change
will bring out even bigger performance improvements once treemanifests are
loaded lazily. Once that happens, we won't just skip over looking at
submanifests, but we'll skip even loading them.
The _intersectfiles() method is only called from one place, it's
pretty short, and its caller has to be aware when it's appropriate to
call it (when the number of files in the matcher is not too large), so
let's inline it.
Currently __contains__ is called only by "rev()" revset, but "x in cl" is a
function that is likely to be used in hot loop. revlog.__contains__ is simple
enough to duplicate to changelog, so just inline it.
Before this patch, "hg outgoing -B" shows only difference of bookmarks
between two repositories, and it isn't user friendly.
This patch shows detailed status about outgoing bookmarks at "hg
outgoing -B".
To avoid breaking backward compatibility with other tool chains, this
patch shows status, only if --verbose is specified,
Before this patch, "hg incoming -B" shows only difference of bookmarks
between two repositories, and it isn't user friendly.
This patch shows detailed status about incoming bookmarks at "hg
incoming -B".
To avoid breaking backward compatibility with other tool chains, this
patch shows status, only if --verbose is specified,
Before this patch, "hg outgoing -B" shows only bookmarks added
locally. Then, users can't know about bookmarks below before "hg push"
execution.
- deleted locally (even though it may be added remotely from "hg pull" view)
- advanced locally
- diverged
- changed (= remote revision is unknown for local)
This patch shows such bookmarks, too.
Before this patch, "hg incoming -B" shows only bookmarks added
remotely. Then, users can't know about bookmarks below before "hg
pull" execution.
- advanced remotely
- diverged
- changed (remote revision is unknown for local)
This patch shows such bookmarks, too.
It appears that the read() in readpipe() never actually ran before (in
test-ssh.t anyway). A print of the size returned from os.fstat() is 0 for every
single print output in test-ssh.t, so the data in the pipe ends up being read
later instead of when it is available. This is the same problem as Linux, as
mentioned in e20a5309b88d.
There are several places in the Windows SSH tests where the order of local
output vs remote output differ from the other platforms. This only fixes one of
those cases (and interstingly, not the one added in order to test e20a5309b88d),
so there is more investigation needed. However, without this patch, test-ssh.t
also has this diff:
--- c:/Users/Matt/Projects/hg/tests/test-ssh.t
+++ c:/Users/Matt/Projects/hg/tests/test-ssh.t.err
@@ -397,11 +397,11 @@
$ hg push --ssh "sh ../ssh.sh"
pushing to ssh://user@dummy/*/remote (glob)
searching for changes
- remote: Permission denied
- remote: abort: prechangegroup.hg-ssh hook failed
- remote: Permission denied
- remote: pushkey-abort: prepushkey.hg-ssh hook failed
updating 6c0482d977a3 to public failed!
+ remote: Permission denied
+ remote: abort: prechangegroup.hg-ssh hook failed
+ remote: Permission denied
+ remote: pushkey-abort: prepushkey.hg-ssh hook failed
[1]
$ cd ..
Output with this change was stable over 600+ runs of test-ssh.t. I initially
tried a background thread to read the pipe[1], but this was simpler and the test
results were exactly the same. I also tried SetNamedPipeHandleState(), but the
PIPE_NOWAIT is for compatibility with LANMAN 2.0, not for async I/O (the results
were identical though).
[1] http://eyalarubas.com/python-subproc-nonblock.html
This will be used in the next patch to do nonblocking reads from the child
process, like on posix platforms. See that for why os.fstat() is insufficient.
I changed this to an explicit Py_DECREF + set to null in 4f6b7d37d06a. This was
a silly misunderstanding on my part -- for some reason I thought Py_CLEAR set
its argument to null only if its refcount reached 0. Turns out that's not
actually the case -- Py_CLEAR is just Py_DECREF + set to null with some
additional precautions around destructors that aren't relevant here.
The real bug that 4f6b7d37d06a fixed was the fact that we were mutating the
string after setting it in the Python dictionary.
This function refactors the logic that decides to use 'bundle2' during an
exchange (pull/push). This will help being consistent while transitioning from
the experimental protocol to the final frozen version.
I do not expect this function to survive on the long run when using 'bundle2'
will become a simple capability check.
This is also necessary to allow HG2Y support in an extension to ease transition
of companies using the experimental protocol in production (yeah...). Such
extension will be able to wrap this function to use the experimental protocol in
some case.
To support more bundle2 formats, we need a wider detection of bundle2-family
streams. The various places what were explicitly detecting the full magic string
are now matching on the first three characters of it.
We now take full advantage of the 'getunbundler' function by using a
'{version -> unbundler-class}' mapping. This map currently contains a single
entry but will make it easy to support more versions from an extension/the
future.
At some point, this map will probably contain bundler-class information too,
in the same fashion the packer map does. However, this is not critically required
right now so it will happen by itself when needed.
The main target is to allow HG2Y support in an extension to ease transition of
companies using the experimental protocol in production (yeah...) But I've no
doubt this will be useful when playing with a future HG21.
This refactor is a preparation for an optimization in the next commit. This
introduces a recursive element that recurses each submanifest. By using a
recursive function, the next commit can avoid walking over some subdirectories
altogether.
The logic of walking a manifest to yield files matching a match object is
currently being done by context, not the manifest itself. This moves the walk()
function to both manifestdict and treemanifest. This separate implementation
will also permit differing, optimized implementations for each manifest.
It isn't obvious which file is the problem with deep subrepos, so provide the
path. Since the parsing is done with a ctx and not a subrepo object, it isn't
possible to display a path from the root subrepo. Therefore, the path shown is
relative to cwd.
There's no test coverage for the first abort, and I couldn't figure out how to
trigger it, but it is changed for consistency.
Previously the extra field for a graft only contained the original commit hash.
This made it impossible to use graft to copy a commit more than once, because
the extras fields did not change after the second graft.
The fix is to add an extra.intermediate-source field that records the immediate
predecessor to graft. This changes hashes for commits that have been grafted
twice, which is why the test was affected.
Previously it was impossible to graft a commit onto it's own parent (i.e. create
a copy of the commit). This is useful when wanting to create a backup of the
commit before continuing to amend it. This patch enables that behavior.
The change to the histedit test is because histedit uses graft to apply commits.
The test in question moves a commit backwards onto an ancestor. Since the graft
logic now more explicitly supports this, it knows to simply accept the incoming
changes (since they are more recent), instead of prompting.
To support multiple bundle2 formats, we will need a function returning
the proper unbundler according to the header. We introduce such aa
function and change the usage in the code base. The function will get
smarter in later changesets.
This is somewhat similar to the dispatching we do for 'HG10' and 'HG11'.
The main target is to allow HG2Y support in an extension to ease transition of
companies using the experimental protocol in production (yeah...) But I've no
doubt this will be useful when playing with a future HG21.
This makes it easy to create a new bundler class that inherits from
the core one. This matches the way 'changegroup' packers work.
The main target is to allow HG2Y support in an extension to ease transition of
companies using the experimental protocol in production (yeah...) But I've no
doubt this will be useful when playing with a future HG21.
The 'format' argument was not used even when it was added in
05af7fe09faf (getbundle: pass arbitrary arguments all along the call
chain, 2014-04-17).
Note that by removing the argument, if any caller did pass a named
'format' argument, we will now pass that along to exchange.getbundle()
via the kwargs. If the idea was to remove such a key, that should have
been done explicitly.
When the 'kwargs' variable was added in 038ae1f12393 (bundle2: allow
pulling changegroups using bundle2, 2014-04-01), it could contain only
'bundlecaps', 'common' and 'heads', so the check for 'format' would
always be false. Since then, _pullbundle2extraprepare() has been added
for hooks, but it seems unlikely that they would a 'format' key.
The matches function was previously traversing all submanifests to look for
matching files, even though it was possible to know if a submanifest won't
contain any matches.
This change adds a visitdir function on the match object to decide quickly if
a directory should be visited when traversing. The function also decides if
_all_ subdirectories should be traversed.
Adding this logic as methods on the match object also makes the logic
modifiable by extensions, such as largefiles.
An example of a command this speeds up is running
hg status --rev .^ python/
on the Mozilla repo with the treemanifest experiment enabled.
It goes from 2.03s to 1.85s.
More improvements to speed from this change will happen when treemanifests are
lazily loaded. Because a flat manifest is still loaded and then converted
into treemanifests, speed improvements are limited.
This change has no negative effect on speed. For a worst-case example, this
command is not negatively impacted:
hg status --rev .^ 'relglob:*.js'
on the Mozilla repo. It goes from 2.83s to 2.82s.
An upcoming commit requires that match.py be able to call scmutil.dirs(), but
when match.py imports scmutil, a dependency cycle is created. This commit
avoids the cycle by moving dirs() and its related finddirs() function from
scmutil to util, which match.py already depends on.
Parsers.py had a reference to util.sha1 which was unused. This commit removes
this reference as well as the unused import of util to simplify the dependency
graph. This is important for the next commit which actually relocates part
of a module to eliminate a cycle.
The _computenonoverlap function takes two manifests to allow extensions to hook
in and read the manifest nodes produced by the function. The remotefilelog
extension actually needs the entire changectx instead (which includes the
manifest) so it can prefetch the subset of files necessary for a sparse checkout
(and the sparse checkout depends on which commit is being accessed, hence the
need for the changectx).
I have tests in the remotefilelog extension that cover this.
One of the rules of Python strings is that they're immutable. dirs._addpath
breaks this assumption for performance, which is fine as long as it is done
safely -- once a string is no longer internal-only it shouldn't be mutated.
Unfortunately, we weren't being safe here -- we were mutating 'key' even after
adding it to a dictionary.
This only really affects other C code that reads strings, so it's somewhat hard
to write a test for this without poking into the internal representation of the
string via ctypes or similar. There is currently no C code that reads the
output of the string, but there will likely be some soon as the bug indicates.
There's no significant difference in performance.
Since all children are processed before their parents, we can apply the following algorithm:
For each rev (descending order):
* If I'm still hidden, no children will block me,
* If I'm not hidden, I must remove my parent from the hidden set,
This allows us to dynamically change the set of 'hidden' revisions, dropping the
need for the 'actuallyhidden' dictionary and the 'blocked' boolean in the queue.
As before, we start iterating from all heads and stop at the first public
changesets. This ensures the hidden computation is 'O(not public())' instead of
'O(len(min(not public()):))'.
Since we want to process all non-public changesets from top to bottom, a heap
seems more appropriate. This will ensure any revision is processed after all
its children, opening the way to code simplification.
test-https.t was broken at d133034be253 if /usr/bin/pythonX.Y is used on
Mac OS X.
If python executable is not named as "python", run-tests.py creates a symlink
and hghave uses it. On the other hand, the installed hg executable knows the
real path to the system Python. Therefore, there was an inconsistency that
hghave said it was not an Apple python but hg knew it was.
This is a significant performance win on large repositories. perffilefoldmap:
On Linux/gcc, on a test repo with over 500,000 files:
before: wall 0.605021 comb 0.600000 user 0.560000 sys 0.040000 (best of 17)
after: wall 0.280530 comb 0.280000 user 0.250000 sys 0.030000 (best of 35)
On Mac OS X/clang, on a real-world repo with over 200,000 files:
before: wall 0.281103 comb 0.280000 user 0.260000 sys 0.020000 (best of 34)
after: wall 0.133622 comb 0.140000 user 0.120000 sys 0.020000 (best of 65)
This visibly impacts status times on case-insensitive file systems. On the Mac
OS X repo, status goes from 3.64 seconds to 3.50.
With the third-party hgwatchman extension [1], 'hg status' on the same repo
goes from 0.80 seconds to 0.65.
[1] https://bitbucket.org/facebook/hgwatchman
This is a hot path on case-insensitive filesystems -- it's guaranteed to be
called every time 'hg status' is run.
This is significantly faster than the equivalent Python code: see the following
patch for numbers.
Unlike changeset_printer, it does not hide the manifest field because JSON
output will be parsed by machine where explicit "null" will be more useful
than nothing.
When tree manifests are stored with one revlog per directory and
loaded lazily, it's unclear how much readdelta will help. If only a
few files change, then only a small part of the full manifest will be
loaded, and the delta chains should also be shorter for tree
manifests. Therefore, let's disable readdelta for tree manifests for
now.
These will be used in upcoming patches to efficiently create a dirstate
foldmap.
The Cygwin normcase behavior is more complicated than just a simple lowercasing
or uppercasing. That's why we specify 'other'.
For C code we don't want to pay the cost of calling into a Python function for
the common case of ASCII filenames. However, while on most POSIX platforms we
normalize filenames by lowercasing them, on Windows we uppercase them. We
define an enum here indicating the direction that filenames should be
normalized as. Some platforms (notably Cygwin) have more complicated
normalization behavior -- we add a case for that too.
In upcoming patches we'll also define a fallback function that is called if the
string has non-ASCII bytes.
This enum will be replicated in the C code to make foldmaps. There's
unfortunately no nice way to avoid that -- we can't have encoding import
parsers because of import cycles. One way might be to have parsers import
encoding, but accessing Python modules from C code is just awkward.
The name 'normcasespecs' was chosen to indicate that this is merely an integer
that specifies a behavior, not a function. The name was pluralized since in
upcoming patches we'll introduce 'normcasespec' which will be one of these
values.
We should consider add HTML rendering of the RST into the response as a
follow-up. I attempted to do this, but there was an empty array
returned by the rstdoc() template function. Not sure what's going on.
Will deal with it later.
These are the same dispatch function under the hood. The only difference
is the default number of entries to render and the template to use. So
it makes sense to use a shared template.
Format for {changelistentry} is similar to {changeset}. However, there
are differences to argument names and their values preventing us from
(easily) using the same template. (Perhaps there is room to consolidate
the templates as a follow-up.)
We're currently not recording some data in {changelistentry} that exists
in {changeset}. This includes the branch name. This should be added in
a follow-up. For now, something is better than nothing.
The content of "hg help templating" is largely derived from docstrings
on functions providing functionality. Template functions are the long
holdout.
Prepare for generating them dynamically by defining docstrings for all
template functions.
There are numerous ways these docs could be improved. Right now, the
help output simply shows function names and arguments. So literally
any accurate data is better than what is there now.
It's unfortunate that workingctx revision is None, which doesn't work well in
arithmetic operation or comparison. This function is trivial but will be used
in several places.
We're going to reuse this in upcoming patches.
The change to Py_ssize_t is necessary because parsers.c doesn't define
PY_SSIZE_T_CLEAN. That macro changes the behavior of PyArg_ParseTuple but not
PyBytes_GET_SIZE.
The new manifest format is designed to be smaller, in particular to
produce smaller deltas. It stores hashes in binary and puts the hash
on a new line (for smaller deltas). It also uses stem compression to
save space for long paths. The format has room for metadata, but
that's there only for future-proofing. The parser thus accepts any
metadata and throws it away. For more information, see
http://mercurial.selenic.com/wiki/ManifestV2Plan.
The current manifest format doesn't allow an empty filename, so we use
an empty filename on the first line to tell a manifest of the new
format from the old. Since we still never write manifests in the new
format, the added code is unused, but it is tested by
test-manifest.py.
While it should be safe to switch to the new manifest format on an
existing repo, let's keep it simple for now and make the configuration
have any effect only at repo creation time. If the configuration is
enabled then (at repo creation), we add an entry to requires and read
that instead of the configuration from then on.
Previously we would compute the repoview's static blockers by finding all the
children of hidden commits that were not hidden. This was O(number of commits
since first hidden change) since 'children' requires walking every commit from
tip until the first hidden change.
The new algorithm walks all heads down until it sees a public commit. This makes
the computation O(number of draft) commits, which is much faster in large
repositories with a large number of commits and a low number of drafts.
On a large repo with 1000+ obsolete markers and the earliest draft commit around
tip~200000, this improves computehidden perf by 200x (2s to 0.01s).
It's pretty surprising phase wasn't part of this template call already.
We now expose {phase} to the {changeset} template and we expose this
data to JSON.
This brings JSON output in line with the output from `hg log -Tjson`.
The lone exception is hweb doesn't print the numeric rev. As has been
stated previously, I don't believe hgweb should be exposing these
unstable identifiers. (We can add them later if we really want them.)
There is still work to bring hgweb in parity with --verbose and
--debug output from the CLI.
Constructing the dirfoldmap is expensive, so if there's a hit in the
filefoldmap, don't construct the directory foldmap.
This helps with cases like 'hg add foo' where foo is already tracked: for a
large repository, the operation goes from 1.5 seconds to 1.2 (which is still
way too much, but that's a matter for another day.)
Rev cce24e8019c8 changed the semantics of the work list to store (normalized,
non-normalized) pairs. All the tuple creation and destruction hurts perf: on a
large repo on OS X, 'hg status' went from 3.62 seconds to 3.78.
It also is unnecessary in most cases:
- it is clearly unnecessary on case-sensitive filesystems.
- it is also unnecessary when filenames have been read off of disk rather than
being supplied by the user.
The only case where the non-normalized case is required at all is when the file
is unknown.
To eliminate most of the perf cost, keep trace of whether the directory needs
to be normalized at all with a boolean called 'alreadynormed'. Pay the cost of
directory normalization only when necessary.
For the above large repo, 'hg status' goes to 3.63 seconds.
By converting treemanifest.matches() into a recursively additivie operation,
it becomes O(n).
The old matches function made a copy of the entire manifest and deleted
files that didn't match. With tree manifests, this was an O(n log n) operation
because del() was O(log n).
This change speeds up the command
"hg status --rev .^ 'relglob:*.js'
on the Mozilla repo, now taking 2.53s, down from 3.51s.
During operations that involve building up a new manifest tree, it will be
useful to be able to quickly check if a submanifest is empty, and if so, to
avoid including it in the final tree. Doing this check lets us avoid creating
treemanifest structures that contain any empty submanifests.
In preparation for the optimization in the following commit, this commit
removes treemanifest.matches()'s call to _intersectfiles(), and removes
_intersectfiles() itself since it's unused at this point.
Tags is pretty easy to implement. Let's start there.
The output is slightly different from `hg tags -Tjson`. For reference,
the CLI has the following output:
[
{
"node": "e2049974f9a23176c2addb61d8f5b86e0d620490",
"rev": 29880,
"tag": "tip",
"type": ""
},
...
]
Our output has the format:
{
"node": "0aeb19ea57a6d223bacddda3871cb78f24b06510",
"tags": [
{
"node": "e2049974f9a23176c2addb61d8f5b86e0d620490",
"tag": "tag1",
"date": [1427775457.0, 25200]
},
...
]
}
"rev" is omitted because it isn't a reliable identifier. We shouldn't
be exposing them in web APIs and giving the impression it remotely
resembles a stable identifier. Perhaps we could one day hide this behind
a config option (it might be useful to expose when running servers
locally).
The "type" of the tag isn't defined because this information isn't yet
exposed to the hgweb templater (it could be in a follow-up) and because
it is questionable whether different types should be exposed at all.
(Should the web interface really be exposing "local" tags?)
We use an object for the outer type instead of Array for a few reasons.
First, it is extensible. If we ever need to throw more global properties
into the output, we can do that without breaking backwards compatibility
(property additions should be backwards compatible). Second, uniformity
in web APIs is nice. Having everything return objects seems much saner than
a mix of array and object. Third, there are security issues with arrays
in older browsers. The JSON web services world almost never uses arrays
as the main type for this reason.
Another possibly controversial part about this patch is how dates are
defined. While JSON has a Date type, it is based on the JavaScript Date
type, which is widely considered a pile of garbage. It is a non-starter
for this reason.
Many of Mercurial's built-in date filters drop seconds resolution. So
that's a non-starter as well, since we want the API to be lossless where
possible. rfc3339date, rfc822date, isodatesec, and date are all lossless.
However, they each require the client to perform string parsing on top of
JSON decoding. While date parsing libraries are pretty ubiquitous, some
languages don't have them out of the box. However, pretty much every
programming language can deal with UNIX timestamps (which are just
integers or floats). So, we choose to use Mercurial's internal date
representation, which in JSON is modeled as float seconds since UNIX
epoch and an integer timezone offset from UTC (keep in mind
JavaScript/JSON models all "Numbers" as double prevision floating point
numbers, so there isn't a difference between ints and floats in JSON).
Many have long wanted hgweb to emit a common machine readable output.
We start the process by defining a stub json template.
Right now, each endpoint returns a stub "not yet implemented" string.
Individual templates will be implemented in subsequent patches.
Basic tests for templates have been included. Coverage isn't perfect,
but it is better than nothing.
Computing the set of directories in the dirstate is expensive. It turns out
that it isn't necessary for operations like 'hg status' at all.
Why? Consider the file 'foo/bar' on disk, which is represented in the dirstate
as 'FOO/BAR'.
On 'hg status', we'd walk down the directory tree, coming across 'foo' first.
Before: we'd normalize 'foo' to 'FOO', then add 'FOO' to our visited stack.
We'd then visit 'FOO', finding the file 'bar'. We'd normalize 'FOO/bar' to
'FOO/BAR', then add it to the results dict.
After: we wouldn't normalize 'foo' at all. We'd add it to our visited stack,
then visit 'foo', finding the file 'bar'. We'd normalize 'foo/bar' to
'FOO/BAR', then add it to the results dict.
So whether we normalize intermediate directories or not actually makes no
difference in most cases.
The only case where normalization matters at all is if a file is replaced with
a directory with the same case-folded name. In that case we can do a relatively
cheap file normalization instead and still get away with not computing the set
of directories.
This is a nice boost in status performance. On OS X with case-insensitive HFS+,
for a large repo with over 200,000 files, this brings down 'hg status' from
4.00 seconds to 3.62.
Computing the set of directories in the dirstate can be pretty expensive. For
'hg status' without arguments, it turns out we actually never need to figure
out the right case for directories in the foldmap. (An upcoming patch explains
why.)
This patch splits up the directory and file maps into separate ones, allowing
for the subsequent optimization in status.
Caches should be transparent. If a cache is damaged, it should
silently be rebuilt, much like if it were invalid. No one seems to
have ever hit this in the wild.
For manifest v2, revlog.revdiff() usually does not provide enough
information to produce a manifest. As a simple workaround, implement
readdelta() by reading both the old and the new manifest and use
manifest.diff() to find the difference. This is several times slower
than the current readdelta() for v1 manifests, but there seems to be
no other simple option, and this is still much faster than returning
the full manifest (at least for verify).
We may add support for the fastdelta optimization for manifest v2 at a
later point, but let's disable it for now, so we don't have to
implement it right away.
With tree manifests, hashes will change anyway, so now is a good time
to also take up the old plans of a new manifest format. While there
should be little or no reason to use tree manifests with the current
manifest format (v1) once the new format (v2) is supported, we'll try
to keep the two dimensions (flat/tree and v1/v2) separate.
In preparation for adding a the new format, let's add configuration
for it and propagate that configuration to the manifest revlog
subclass. The new configuration ("experimental.manifestv2") says in
what format to write the manifest data. We may later add other
configuration to choose how to hash it, either keeping the v1 hash for
BC or hashing the v2 content.
See http://mercurial.selenic.com/wiki/ManifestV2Plan for more details.
By extracting a method that generates (path, node, flags) tuples, we
can reuse the code for parsing a manifest without doing it via a
_lazymanifest like treemanifest currently does. It also prepares for
parsing the new manifest format.
Note that this makes parsing into treemanifest slower, since the
parsing is now always done in pure Python. Since treemanifests will be
expected (or even forced) to be used only with the new manifest
format, parsing via _lazymanifest was not an option anyway.
The overwhelmingly common case is running commands like 'hg diff' with no
arguments. Therefore the only file that'll be listed is the root directory.
Normalizing that's just a waste of time.
This means that for a plain 'hg diff' we'll never need to construct the
foldmap, saving us a significant chunk of time.
On case-insensitive HFS+ on OS X, for a large repository with over 200,000
files, this brings down 'hg diff' from 2.97 seconds to 2.36.
This will be useful to execute actions after the tree is parsed and
before the revset returns a match. Finding symbols in the parse tree
will later allow hashes of hidden revisions to work on the command
line without the --hidden flag.
manifest.intersectfiles() is just a utility used by manifest.matches(), and
a future commit removes intersectfiles for treemanifest for optimization
purposes.
This commit makes the intersectfiles methods on manifestdict and treemanifest
internal, and converts its test to a more generic testMatches(), which has the
exact same coverage.
With record's curses interface toggling and untoggling a newly added
file would lead to a confusing UI (the header was marked as partial
and the hunks as unselected). Tested additionally using the curses
interface with newly added, removed and modified files in a test repo.
This improves performance because it uses strchr rather than a loop.
For LLVM/clang version "Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM
3.5svn)" on OS X, for a repo with over 200,000 files, this improves perfdirs
from 0.248 seconds to 0.230 (7.3%)
For gcc 4.4.6 on Linux, for a test repo with over 500,000 files, this improves
perfdirs from 0.704 seconds to 0.658 (6.5%).
In the very early days of hg, it was possible to commit /dev/null because our
patch importer was too simple. Repos from this era may still
exist, add a note about why we ignore this name.
This patch makes wrapping "commands.update()" by largefiles extension
useless, because "cmdutil.bailifchanged()" can detect changes of
largefiles in the working directory.
This patch also changes test-update-branches.t, because
"cmdutil.bailifchanged()" shows more detailed information about
dirty-ness of the working directory than "workingctx.dirty()".
In "commands.update()", "cmdutil.bailifchanged()" isn't used for
"abort if the working directory is dirty", because it forcibly
examines about merging in progress.
"workingctx.dirty()" used in "commands.update()" can't detect changes
of largefiles in the working directory without "repo.lfstatus = True"
wrapping. This is only reason of "commands.update()" wrapping by
largefiles extension.
On the other hand, "cmdutil.bailifchanged()" already wrapped by
largefiles extension can detect changes of largefiles.
This patch is a preparations for replacing "workingctx.dirty()" and
raising Abort in "commands.update()" by "cmdutil.bailifchanged()". It
can remove redundant "commands.update()" wrapping.
This patch newly adds "dirtyreason()" to centralize composing dirty
reason message like "uncommitted changes in subrepository 'xxxx'".
There are 3 similar messages below, and this patch is a part of
preparations for unifying them into (1), too.
1. uncommitted changes in subrepository 'XXXX'
2. uncommitted changes in subrepository XXXX
3. uncommitted changes in subrepo XXXX
This patch chooses adding new method "dirtyreason()" instead of making
"dirty()" return "reason string", because:
- some of existing "dirty()" implementation is too complicated to do
so simply, and
- ill-mannered 3rd party subrepo classes, of which "dirty()" doesn't
return "reason string", cause meaningless message (even though it
is rare case)
When assigning a 22-byte hash to a nodeid in a manifest, manifestdict
drops the 22nd byte, while treemanifest keeps it. Let's make
treemanifest drop the 22nd byte as well.
Reverting to a revision where the subrepo didn't exist will now abort, and
matching subrepos against the working directory is consistent with how filesets
are evaluated since dd1c701aad4d.
The list of subrepos to revert is currently based on the given --rev, so there
is currently no way for this to fail. Using the --rev context is wrong though,
because if the subrepo doesn't exist in --rev, it is skipped, so it won't be
changed. This change makes it so that the revert aborts, which is what happens
if a plain file is reverted to -1. Finding matches based on --rev is also
inconsistent with evaluating files against the working directory (dd1c701aad4d).
This change is made now, so as to not cause breakage when the context is
switched in an upcoming patch.
This is a significant win for large repositories on OS X, especially with a
cold cache. Unfortunately we need to keep the lstat-based implementation around
for two reasons:
- Not all filesystems support this call.
- There's an edge case in which it's best to fall back to avoid a retry loop.
More about this in the comments.
The below tests are all performed on a Mac with an SSD running OS X 10.9, on a
repository with over 200k files. The results are best of 5 with simulated
best-effort conditions.
The gains with a hot cache are pretty impressive: 'hg status' goes from 5.18
seconds to 3.79 seconds.
However, a repository that large will probably already be using something like
hgwatchman [1], which helps much more (for this repo, 'hg status' with
hgwatchman is approximately 1 second). Where this really helps is when the
cache is cold [2]: hg status goes from 31.0 seconds to 9.66.
See http://lists.apple.com/archives/filesystem-dev/2014/Dec/msg00002.html for
some more discussion about this function.
This is based on a patch by Sean Farley <sean@farley.io>.
[1] https://bitbucket.org/facebook/hgwatchman
[2] There appears to be no easy way to clear the file cache (aka "vnodes") on
OS X short of rebooting. purge(8) purportedly does that but in my testing had
little effect. The workaround I came up with was to assume that vnode eviction
was LRU, make sure the kern.maxvnodes sysctl is smaller than the size of the
repository, then make sure we'd always miss the cache by running 'hg status' in
another clone of the repository before running it in the test repository.
In upcoming patches we'll add another implementation of listdir on OS X. That
implementation will have to fall back to this one under some circumstances,
though. We'll make _listdir be able to detect those circumstances and use the
right function as appropriate.
If self is a smartset and other is a fullreposet, nothing should be necessary.
A small win for trivial query in mozilla-central repo:
revset #0: (0:100000)
0) wall 0.017211 comb 0.020000 user 0.020000 sys 0.000000 (best of 163)
1) wall 0.001324 comb 0.000000 user 0.000000 sys 0.000000 (best of 2160)
Previously, it was difficult to find out how to display the status of files
relative to your current working directory. This patch adds that knowledge to
the help text.
The checkinlinesize function, which converts inline revlogs to non-inline,
uses the current transaction's "data" field to determine how to update the
transaction after the conversion.
This change works around the missing data field, which is not in the
transaction after a strip.
This speeds up 'hg revert -r .^ --all --dry-run' on the Mozilla repo
from 4.081s to 0.826s. Note that 'hg revert -r .^ .' does not get any
faster, since '.' does not make match.always() True.
I can't think of a reason it would break anything, and if it does,
it's clearly not covered by tests.
This can simplify the conversion from numeric revision to string. Without it,
we have to handle -1 specially because repo['-1'] != repo[-1].
The -1 revision is not officially documented, but this change makes sense
assuming that "rev(%d)" exists for scripting or third-party tools.
This was introduced way back in 2006 (rev e8e3582d6f80) as sys.exit(0) if
loading an extension failed when --traceback was on, then at some point morphed
into a 'return 1' in a function that otherwise returns nothing.
At this point, if ui.traceback is enabled and if loading an extension fails for
whatever reason, including one as innocent as it not being present, we leave
any extensions loaded so far in a bogus half-initialized state. That doesn't
really make any sense.
When running "hg log 'set:added()'", we create two matchers: one used
for producing the revset and one used for finding files to match. In
185b6b930e8c (graphlog: evaluate FILE/-I/-X filesets on the working
dir, 2012-02-26), we started passing a revision argument along from
what's currently in cmdutil._makelogrevset() to
revset._matchfiles(). When the revision was an empty string, it
referred to the working copy. This was subtly done with "repo[rev or
None]". Then, in 5ff5c5c9e69f (revset: avoid recalculating filesets,
2014-10-22), that conversion from empty string to None was lost. Note
that repo[''] is equivalent to repo['.'], not repo[None].
The consequence of this, to the user, is that when running "hg log
'set:added()'", the file matcher matches files added in the working
copy, while the revset matcher matches revisions that touch files
added in the parent of the working copy. As a result, only revisions
that touch any files added in the parent of the working copy will be
considered, but they will only be included if they also touch files
added in the working copy.
Fix the bug by converting '' to None again, but make it a little more
explicit this time (plus, we now have tests for it).
On IRC, rom1dep reported a traceback[1] from setting
experimental.strip-bundle2-version to True. This diff catches
unexpected values and falls back to the non-experimental bundle1
implementation after issuing a warning.
[1] http://gist.tamytro.org/_admin/gists/qXcdQLwtApgy6e3NwWgl