In Python 2:
>>> ls = [4, 2, None]
>>> sorted(ls)
[None, 2, 4]
In Python 3:
>>> ls = [4, 2, None]
>>> sorted(ls)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: NoneType() < int()
Therefore we replaced None with -1, and the safe part is that, None and -1 are
only the keys which are used for sorting so we don't need to convert the -1's
back to None.
The old merge code would call manifestmerge and calculate the complete diff
between the source to the destination. In many cases, like rebase, the vast
majority of differences between the source and destination are irrelevant
because they are differences between the destination and the common ancestor
only, and therefore don't affect the merge. Since most actions are 'keep', all
the effort to compute them is wasted.
Instead, let's compute the difference between the source and the common ancestor
and only perform the diff of those files against the merge destination. When
using treemanifest, this lets us avoid loading almost the entire tree when
rebasing from a very old ancestor. This speeds up rebase of an old stack of 27
commits by 20x.
In mozilla-central, without treemanifest, when rebasing a commit from
default~100000 to default, this speeds up the manifestmerge step from 2.6s to
1.2s. However, the additional diff adds an overhead to all manifestmerge calls,
especially for flat manifests. When rebasing a commit from default~1 to default
it appears to add 100ms in mozilla-central. While we could put this optimization
behind a flag, I think the fact that it makes merge O(number of changes being
applied) instead of O(number of changes between X and Y) justifies it.
As part of changing manifest.diff to accept a matcher, a previous patch added
matcher calls to each location in merge.manifestmerge that tested if 'x in mf'
to maintain the same behavior as before. After analyzing it further, this
matcher call isn't needed, and in fact hurts future patches ability to use the
matcher here.
Basically, all these 'if x in mf' checks were checking if a matched file's copy
source was in the matcher as well. This meant if you passed a matcher for just
file foo, it would not return file bar even if foo was a copy of bar. Since
manifestmerge cares about copy information, let's allow all lookups of copy
sources.
We also update one spot with a 'is not None' check, since it wasn't obvious that
the value could sometimes be None before, which broke when we called
matcher(None).
A future patch adds matcher optimizations to manifestmerge which causes this
code path to get covered by existing tests.
With experimental.updatecheck=noconflict set, if one checks out
e1870a31325e (tests: add execute bit and fix shbang line, 2015-12-22)
and then try to check out its parent, hg will complain about
conflicting changes, even though the working directory is clean. We
need to also allow the 'e' action in merge.py. The 'e' action is used
when moving to a commit where the only change to the file is to its
executable flag, so it's just an optimized 'g' action.
Doesn't seem to be worth writing a test for, since the existing setup
in test-update-branches.t does not set any flags.
The working directory will usually be clean or very clean, and wc will usually
have the same branch as its parent. This change will thus usually not make any
difference and is done as a separate change to show that. It will be used in a
later change.
This gets rid of the manifest.matches calls in merge.py in favor of the new api.
This is part of getting rid of manifest.matches since it is O(manifest).
With experimental.updatecheck=noconflict, if the update is aborted
because of conlicts, "uncommitted changes" is not quite
accurate. Let's use "conflicting changes" instead. Also fix the hint
to recomment --clean, not --merge, since that's what we do for other
failed updates.
Since 5a430b57b16e (update: change default destination to tipmost
descendant (issue4673) (BC), 2016-02-02), non-linear updates can no
longer happen if the user doesn't specify a destination, so we can
drop these case from the table in the docstring of merge.update().
The check is done in merge.update() already and the next few patches
will add more checks there. Some of the additional checks will need
information about the merge that will not be available in destutil.
Since commands.postincoming() catches UpdateAbort(), we need to change
merge.update() to raise that more specific exception.
This goes directly again c6bcc960108f (destupdate: move the check
related to the "clean" logic in the function, 2015-10-05), but it will
simplify the next few patches, and we can always move it out again
(preferably move, not copy) after if we still think it's better that
way.
The comment about "obsolete.background" seems to have been about
obsolete.foreground ever since it was introduced in fbdfa64ceced
(update: allow dirty update to foreground (successors), 2013-04-16),
so let's change it.
As far as I can tell, all the callers of merge.update() have been
migrated over to use destutil to find the default destination when the
revision was not specified. So it's time to delete the code for
handling a node value of None. Let's add an assertion that node is not
None, so any extensions relying on it will not silently misbehave.
The merge code works by applying the changes between an ancestor and
the working copy onto the destination. To achieve an update, it sets
the ancestor to be the parent of the working copy. To achieve a clean
update (update -C), it sets the ancestor to be the working copy itself
(so there are no changes to carry over). The logic for this was spread
out a bit. Let's move it all to one place so it's easier to follow the
reason for it. Also add some documentation.
0b5f1f2efc77 introduced handling of a crash in this case. A review comment
suggested that it was not entirely obvious that a 'dm' always would have a 'r'
for the source file.
To mitigate that risk, make the code more conservative and make less
assumptions.
Work around that 'dm' in the data model only can have one operation for the
target file, but still can have multiple and conflicting operations on the
source file where the other operation is a 'rm'. The move would thus fail with
'abort: No such file or directory'.
In this case it is "obvious" that the file should be removed, either before or
after moving it. We thus keep the 'rm' of the source file but drop the 'dm'.
This is not a pretty fix but quite "obviously" safe (famous last words...) as
it only touches a rare code path that used to crash. It is possible that it
would be better to swap the files for 'dm' as suggested on
https://bz.mercurial-scm.org/show_bug.cgi?id=5020#c13 but it is not entirely
obvious that it not just would create conflicts on the other file. That can be
revisited later.
Previously we indicated that the .hgsubstate file was dirty by adding a '+' to
the end of its hash in the wctx manifest. This made is complicated to have new
manifest implementations that rely on the node length being fixed.
In previous patches we added added and modified node placeholders, so let's use
those to indicate dirty here as well. It doesn't look like anything ever
depended on this '+' (aside from it being different to the parent), so nothing
else needed to change here.
Previously the added/modified placeholder hash for manifests generated from the
dirstate was a 21byte long string consisting of the p1 file hash plus a single
character to indicate an add or a modify. Normal hashes are only 20 bytes long.
This makes it complicated to implement more efficient manifest implementations
which rely on the hashes being fixed length.
Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being these 20 bytes (just like we rely on no
hash every being the nullid).
This changes the possible behavior slightly in that the hash for all
added/modified entries in the dirstate manifest will now be the same (so simple
node comparisons would say they are equal), but we should never be doing simple
node comparisons on these nodes even with the old hashes, because they did not
accurately represent the content (i.e. two files based off the same p1 file
node, with different working copy contents would have the same hash (even with
the appended character) in the old scheme too, so we couldn't depend on the
hashes period).
As a followup to the issue4028 series, this fixes a variant of the issue
that can occur when updating with uncommited local changes.
The duplicated .hgsub warning is coming from wc.dirty(). We would previously
skip this call because it's only relevant when we're going to perform copy
tracing, which we didn't do before.
The change to the update summary line is because we now treat the rename as a
proper rename (which counts as a change), rather than an add+delete pair
(which counts as a change and a delete).
During update directories are deleted as soon as they have no entries.
But if current working directory is deleted then it cause problems
in complex commands like 'hg split'. This commit adds a warning
that will help users figure the problem faster.
I always read the name "checkcase(path)" as "do we need to check for
case folding at this path", but it's actually (I think) meant to be
read "check if the file system cares about case at this path". I'm
clearly not the only one confused by this as the dirstate has this
property:
def _checkcase(self):
return not util.checkcase(self._join('.hg'))
Maybe we should even inverse the function and call it fscasefolding()
since that's what all callers care about?
See the comment for a detailed explanation why.
Even though this is a bug, I've sent it to 'default' rather than 'stable'
because it isn't triggered in any code paths in stock Mercurial, just with the
merge driver included. For the same reason I haven't included any tests here --
the merge driver is getting a new test.
Now that we store and display merge labels in user prompts (not just
conflict markets), we should rely on labels to clarify the two sides of a
merge operation (hg merge, hg update, hg rebase etc).
"remote" is not a great name here, as it conflates "remote" as in "remote
server" with "remote" as in "the side of the merge that's further away". In
cases where you're merging the "wrong way" around, remote can even be the
"local" commit that you're merging with something pulled from the remote
server.
Now that we persist the labels, we can consistently use the labels in
prompts for the user without risk of confusion. This changes a huge amount
of command output:
This means that merge prompts like:
no tool found to merge a
keep (l)ocal, take (o)ther, or leave (u)nresolved? u
and
remote changed a which local deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? c
become:
no tool found to merge a
keep (l)ocal [working copy], take (o)ther [destination], or leave (u)nresolved? u
and
remote [source] changed a which local [dest] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? c
where "working copy" and "destination" were supplied by the command that
requested the merge as labels for conflict markers, and thus should be
human-friendly.
In cbefa73a359814e6784a63f90b78c7afd39bc7d5, I introduced a new bug:
when a symlink points to a folder in commit A and to another folder
in commit B, while updating from A to B, Mercurial will try to use
removedir on this symlink, which will fail. This is a very bad bug,
since it basically renders symlinks to folders unusable in repos.
Added test case fails without a fix and passes with it.
This is a fix to an old problem when Mercurial got confused by an
untracked folder with the same name as one of the files in a commit
hg was trying to update to. It is pretty safe to remove this folder if
it is empty. Backing up an empty folder seems to go against Mercurial's
"don't track dirs" philosophy.