During a merge, each file has a current commitnode+filenode, an other
commitnode+filenode, and an ancestor commitnode+filenode. The ancestor
commitnode is not stored though, and we rely on the ability for the filectx() to
look up the commitnode by using the filenode's linkrev. In alternative backends
(like remotefilelog), linkrevs may have restriction that prevent arbitrary
linkrev look up given a filenode.
This patch accounts for that by storing the ancestor commitnode in
the merge state so that it is available later at resolve time.
This results in some test changes because the ancestor commitnode we're using at
resolve time changes slightly. Before, we used the linkrev commit, which is the
earliest commit that introduced that particular filenode (which may not be the
latest common ancestor of the commits being merged). Now we use the latest
common ancestor of the merged commits as the commitnode. This is fine though,
because that commit contains the same filenode as the linkrev'd commit.
In future commits we will want to store more data related to each file in the
merge state. This patch adds an optional record for storing a dictionary of
extras for each file.
The old _follow revset iterated over every file in the commit and checked if it
matched. For repos with large manifests, this could take 500ms. By switching to
use manifest.matches() we can take advantage of the fastpaths built in to
manifest.py that allows iterating over only the files in the matcher when it's a
simple matcher. This brings the time spent down from 500ms to 0ms during simple
operations like 'hg log -f file.txt'.
Similar to the previous patch, the .hg/store/meta/ directory does not
get copied when when using "hg clone --uncompressed". Fix by including
"meta/" in store.datafiles(). This seems safe to do, as there are only
a few users of this method. "hg manifest" already filters the paths by
"data/" prefix. The calls from largefiles also seem safe. The use in
verify needs updating to prevent it from mistaking dirlogs for
orphaned filelogs. That change is included in this patch.
Since the dirlogs will now be in the fncache when using fncachestore,
let's also update debugrebuildfncache(). That will also allow any
existing treemanifest repos to get their dirlogs into the fncache.
Also update test-treemanifest.t to use an a directory name that
requires dot-encoding and uppercase-encoding so we test that the path
encoding works.
When doing a local clone with treemanifests, the .hg/store/meta/
directory currently does not get copied. To fix it, all we need to do
is to add it to the list of directories to copy.
Forward copy detection (i.e. detecting what files have been moved/copied in
commit X since ancestor Y) previously required diff'ing the manifests of both X
and Y. This was expensive since it required reading both entire manifests and
doing a set difference (they weren't already in a set because of the
lazymanifest work). This cost almost 1 second on very large repositories, and
happens N times for a rebase of N commits.
This patch optimizes it for the case of rebase. In a rebase, we are comparing a
commit against it's immediate parent, and therefore we can know what files
changed by looking at ctx.files(). This lets us drastically decrease the size
of the set comparison, and makes it O(# of changes) instead of O(size of
manifest). This makes it take 1ms instead of 1000ms.
Before this patch, test-command-template.t is timed out on Solaris,
because "rm" on permission denied file implies prompting "override
protection 0 (yes/no)?" and blocks execution of test script.
Before this patch, test-check-config.t fails on Solaris, because
"xargs" doesn't invoke check-config.py with all filenames at once.
"xargs" may invoke specified command multiple times with part of
arguments given from stdin: according to "xargs(1)" man page, this
dividing arguments is system-dependent.
For portability of test-check-config.t, this patch adds "xargs" like
mode to check-config.py and executes it in test-check-config.t without
"xargs".
Before this patch, test-revset.t fails on Solaris using ksh as default
"sh", because nested quoting below isn't acceptable for it.
+-----------------------------+ inner quoting
"`python -c "print '|'.join(['0:1'] * 500)"`"
+-------------------------------------------+ outer quoting
This patch does below for portability.
- omit outer quoting
This should be safe, because generated string contains no white
space character.
- use '+' instead of '|' (for safety)
'|' has special meaning for many shell, but '+' isn't (at least,
for ordinary ones).
We're trying to forbid "grep -a" and unintentionally complained even
if the "-a" was part of the filename. Requiring a space before "-a" to
match is probably good enough.
Before this patch revert interactive mode unconditionally forgets
added files. This patch fixes this by asking user if he wants
to forget added file. If user doesn't want to forget given file,
it is added to matcher_opts exclude list, to not reviewing it
later with other modified files.
Previously, if you ran obsolete.createmarkers with a bunch of markers that did
not have successors (like when you do a prune), it encountered a n^2 computation
behavior because the loop would read the changelog (to get ctx.parents()), then
add a marker, in a loop. Adding a marker invalidated the computehidden cache,
and reading the changelog recomputed it.
This resulted in pruning 150 commits taking 150+ seconds in a large repo.
The fix is to break the reading part of the loop to be separate from the writing
part.
When memctx is asked for a manifest, it constructs one by merging the p1
manifest, and the changes that are on top. For the changes on top, it was
previously using p1.node() as the file entries parent, which actually returns
the commit node that the p1 linkrev points at! Which is entirely incorrect.
The fix is to use p1.filenode() instead, which returns the parent file node as
desired.
I don't know how to execute this or make it have a visible effect, so I'm not
sure how to test it. It was noticed because asking for the linkrev is an
expensive operation when using the remotefilelog extension and this was causing
performance regressions with commit.
In this case, a template is parsed recursively with no thunk for lazy
evaluation. This patch prevents recursion by putting a dummy of the same name
into a cache that will be referenced while parsing if there's a recursion.
changeset = {files % changeset}\n
~~~~~~~~~
= [(_runrecursivesymbol, 'changeset')]
It would be nice if we could detect recursion at the parsing phase, but we
can't because a template can refer to a keyword of the same name. For example,
"rev = {rev}" is valid if rev is a keyword, and we don't know if rev is a
keyword or a template while parsing.
In 7a1ccfe03f74 (treemanifests: set bundle2 part parameter indicating
treemanifest, 2016-01-08), I didn't realize I had to set the parameter
separately for getbundle and unbundle. Having the parameter there on
push allows us to push to an empty repo and have the requirements
updated correctly.
Before this patch, the help bar in crecord wouldn't be printed correctly when
the terminal window didn't have enough column to display it. This patch adds
logic to make sure that the help bar message is always displayed. We use an
ellipsis when it is not possible to display the complete message.
In the crecord help dialog, the toggle all option was wrongfully documented.
Instead of using 'a', one must use 'A' to toggle all the hunks. The crecord
header that is always displayed on the screen contains the right shortcut and
does not need to be changed.
This patch improves the error messages raised when an OSError occurs, since
simply re-raising the exception can be both confusing and misleading. For
example, if "hg identify" is run inside a repository that contains a Git
subrepository and the git binary could not be found, it'll exit with the message
"abort: No such file or directory". That implies "identify" has a problem
reading the repository itself. There's no way for the user to know what the
real problem is unless they dive into the Mercurial source, which is what I
ended up doing after spending hours debugging errors while provisioning a VM
with Ansible (turns out I forgot to install Git on it).
Descriptive errors are especially important on Windows, since it's common for
Windows users to forget to set the "Path" system variable after installing Git.
After 313b8d61b548 graph canvas width is decided once on the initial rendering.
However, after graph page gets scrolled down to load more, it might need more
horizontal space to draw, so it needs to resize the canvas dynamically.
The exact problem that this patch solves can be seen using:
hg init testfork
cd testfork
echo 0 > foo
hg ci -Am0
echo 1 > foo
hg ci -m1
hg up 0
echo 2 > foo
hg ci -m2
hg gl -T '{rev}\n'
@ 2
|
| o 1
|/
o 0
hg serve
And then by navigating to http://127.0.0.1:8000/graph/tip?revcount=1
"revcount=1" makes sure the initial graph contains only revision 2. And because
the initial canvas width takes only that one revision into count, after the
(immediate) AJAX update revision 1 will be cut off from the graph.
We can safely set canvas width to the new value we get from the AJAX request
because every time graph is updated, it is completely redrawn using all the
requested nodes (in the case above it will use /graph/2?revcount=61), so the
value is guaranteed not to decrease.
P.S.: Sorry for parsing HTML with regexes, but I didn't start it.
Laurent's commit 56cdfddbd2ed still suffers from a race: by the
time the "job" function tries to assign to channels[channel], that
list has been truncated to empty. The result is that every job
thread raises an IndexError.
Earlier, I tried an approach of correctly locking channels, but
that caused run-tests to hang on KeyboardInterrupt sometimes.
This approach is strictly hackier, but seems to actually work
reliably.
The newly created helper changegroup.safeversion() knows to pick
version 03 if the repo uses treemanifests, so just using that means we
pick the right changegroup version.
In a few places (at least repair.py and shelve.py), we want to find
the best changegroup version that we can assume users of the repo will
understand. For example, we choose version 01 by default, but if it's
a generaldelta repo, we expect clients to support version 02 anyway,
so we choose that for new bundles (for e.g. "hg strip"). Let's create
a helper for this functionality in changegroup, so we can reuse it
elsewhere later.
Since it would be terribly expensive to convert between flat manifests
and treemanifests, we have decided to simply not support changegroup
version 01 and 02 with treemanifests. Therefore, let's stop announcing
that we support these versions on treemanifest repos.
Note that this means that older clients that try to clone from a
treemanifest repo will fail. What happens is that the server, after
this patch, finds that there are no common versions and raises
"ValueError: no common changegroup version". This results in "abort:
HTTP Error 500: Internal Server Error" on the client.
Before this patch, it was no better: The server would instead find
that there were directory manifest nodes to put in the changegroup 01
or 02 and raise an AssertionError on changegroup.py#668 (assert not
tmfnodes), which would also appear as a 500 to the client.
This patch fixes a crash when both --json and --blacklist were given as
arguments of run-tests.py. Now, instead of crashing, we add an entry for
blacklisted tests in the json output to show that the tests were skipped.
Before this patch, it was possible for run-tests to crash on a race condition.
The race condition happens in the following case:
- the last test finishes and calls: done.put(None)
- the context switches to the main thread that clears the channels list
- the context switches to the last test mentioned above, it tries to access
channels[channel] and crashes
This happened to me while running run-tests.
This patch fixes the issue by clearing the channel before considering that the
test is done.
The new transaction context did not handle the case where an exception during
close should still call release. This cause pretxnclose hooks that failed to
cause the transaction to fail without aborting, thus requiring a hg recover.
I've added a test.
changegroup.getchunks() determines the end of the stream by looking
for an empty chunk group (two consecutive empty chunks). It ignores
empty groups in the first two groups. Changegroup 3 introduced an
empty chunk between the manifests and the files, which confuses
getchunks(). Since it comes after the first two, getchunks() will stop
there.
Fix by rewriting getchunks so it first counts two groups (empty or
not) and then keeps antostarts counting empty groups. With this counting,
changegroup 1 and 2 have exactly one empty group after the first two
groups, while changegroup 3 has two (one for directories and one for
files).
It's a little hard to test this at this point, but I have verified
that this patch fixes narrowhg (which was broken before this
patch). Also, future patches will fix "hg strip" with treemanifests,
and once that's done, getchunks() will be tested through tests of "hg
strip".