Previously, we were using Python's native 'os.path.isfile' method which follows
symlinks. In this case, since we're operating on repo contents, we don't want
to follow symlinks.
There's a behaviour change here, as shown by the second part of the added test.
Consider a symlink 'f' pointing to a file containing 'abc'. If we try and
replace it with a file with contents 'abc', previously we would have let it
though. Now we don't. Although this breaks naive inspection with tools like
'cat' and 'diff', on balance I believe this is the right change.
This means that the diff code does less work, potentially
significantly less in the case of treemanifests. It also should ease
implementation with narrowed clone cases (such as narrowhg) when we
don't always have the entire set of treemanifest revlogs locally.
As far as I can tell, this codepath is currently only used by record,
so it'll probably die in the near future, and then narrowhg won't have
to worry about composing with some unknown matching system.
This will be a chance for the merge driver to finish resolving or generating
any driver-resolved files.
As before, having a separate error state from 'unresolved' is too big a
refactoring for now, so we hack around it by setting unresolved to a positive
value when necessary.
We also need to update our internal state to whatever state driverpreprocess
leaves it in.
Adding an error state separate from the unresolved count is too big a
refactoring for now, so we hack around it by setting it to a positive value to
indicate an error state.
The exact semantics for what should happen (particularly with respect to error
handling) are still a bit hard to pin down, so I think it's better to
experiment with it as an extension for now. For now this stub will act as a
convenient point for extensions to hook on.
c67339617276 (while 3.4 code-freeze) made all 'update' hooks run after
releasing wlock for visibility of in-memory dirstate changes. But this
breaks paired invocation of 'preupdate' and 'update' hooks.
For example, 'hg backout --merge' for TARGET revision, which isn't
parent of CURRENT, consists of steps below:
1. update from CURRENT to TARGET
2. commit BACKOUT revision, which backs TARGET out
3. update from BACKOUT to CURRENT
4. merge TARGET into CURRENT
Then, we expects hooks to run in the order below:
- 'preupdate' on CURRENT for (1)
- 'update' on TARGET for (1)
- 'preupdate' on BACKOUT for (3)
- 'update' on CURRENT for (3)
- 'preupdate' on TARGET for (4)
- 'update' on CURRENT/TARGET for (4)
But hooks actually run in the order below:
- 'preupdate' on CURRENT for (1)
- 'preupdate' on BACKOUT for (3)
- 'preupdate' on TARGET for (4)
- 'update' on TARGET for (1), but actually on CURRENT/TARGET
- 'update' on CURRENT for (3), but actually on CURRENT/TARGET
- 'update' on CURRENT for (4), but actually on CURRENT/TARGET
Root cause of the issue focused by c67339617276 is that external
'update' hook process can't view in-memory changes (especially, of
dirstate), because they aren't written out until the end of
transaction (or wlock).
Now, hooks can be invoked just after updating, because previous
patches made in-memory changes visible to external process.
This patch may break backward compatibility from the point of view of
"scheduling hook execution", but should be reasonable because 'update'
hooks had been executed in this order before 3.4.
This patch tests "hg backout" and "hg unshelve", because the former
activates the transaction before 'update' hook invocation, but the
former doesn't.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This opens the door to working slightly more closely with the manifest
type and letting it optimize out some of the diff comparisons for us,
and also makes life significantly easier for narrowhg.
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
We currently allow updating and merging (with --force) when there are
unresolved merge conflicts, as long as there is only one parent of the
working copy. Even worse, when updating to another revision
(linearly), if one of the unresolved files (including any conflict
markers in the working copy) can now be merged cleanly with the target
revision, the file becomes marked as resolved.
While we could potentially allow updates that affect only files that
are not in the set of unresolved files, that's considerably more work,
and we don't have a use case for it anyway. Instead, let's keep it
simple and refuse any merge or update (without -C) when there are
unresolved conflicts.
Note that test-merge-local.t explicitly checks for conflict markers
that get carried over on update. It's unclear if that was intentional
or not, but it seems bad enough that we should forbid it. The simplest
way of fixing the test case is to leave the conflict markers in place
and just mark the files resolved, so let's just do that for now.
Currently merge.graft re-writes the dirstate so only a single
parent is kept. For some cases, like evolving a merge commit,
this behaviour is not desired. More specifically, this is
needed to fix issue4389.
We have finally laid all the groundwork to make this happen.
The only change/delete conflicts that haven't been moved are .hgsubstate
conflicts. Those are trickier to deal with and well outside the scope of this
series.
We add comprehensive testing not just for the initial selections but also for
re-resolves and all possible dirstate transitions caused by merge tools. That
testing managed to shake out several bugs in the way we were handling dirstate
transitions.
The other test changes are because we now treat change/delete conflicts as
proper merges, and increment the 'merged' counter rather than the 'updated'
counter. I believe this is the right approach here.
For third-party extensions, if they're interacting with filemerge code they
might have to deal with an absentfilectx rather than a regular filectx.
Still to come:
- add a 'leave unresolved' option to merges
- change the default for non-interactive change/delete conflicts to be 'leave
unresolved'
- add debug output to go alongside debug outputs for binary and symlink file
merges
This is somewhat different from the currently existing 'a' action, for the
following case:
- dirty working copy, with file 'fa' added and 'fm' modified
- hg merge --force with a rev that neither has 'fa' nor 'fm'
- for the change/delete conflicts we pick 'changed' for both 'fa' and 'fm'.
In this case 'branchmerge' is true, but we need to distinguish between 'fa',
which should ultimately be marked added, and 'fm', which should be marked
modified.
Our current strategy is to just not touch the dirstate at all. That works for
now, but won't work once we move change/delete conflicts to the resolve phase.
In that case we may perform repeated re-resolves, some of which might mark the
file removed or remove the file from the dirstate. We'll need to re-add the
file to the dirstate, and we need to be able to figure out whether we mark the
file added or modified. That is what the new 'am' action lets us do.
Once we move change/delete conflicts into the resolve phase, a 'dc' file might
first be resolved by picking the other side, then later be resolved by picking
the local side. For this transition we want to make sure that the file goes
back to not being in the dirstate.
This has no impact on conflicts during the initial merge.
At the moment this is a no-op (the only actions defined are 'r', 'a' and 'g'),
but soon we're going to add other sorts of actions to the dictionary returned
from mergestate.actions().
These are meant for use by custom merge drivers that might want to modify the
dirstate. Dirstate internal consistency rules require that all removes happen
before any adds -- this means that custom merge drivers shouldn't be modifying
the dirstate directly.
We're going to use this to extend the action lists in merge.applyupdates.
The somewhat funky return value is to make passing this dict directly into
recordactions easier. We're going to exploit that in an upcoming patch.
This eliminates a whole bunch of duplicate code and allows us to update the
removed count for change/delete conflicts where the delete action was chosen.
This will not only allow us to remove a bunch of duplicate code in applyupdates
in an upcoming patch, it will also allow the resolve interface to be a lot
simpler: it doesn't need to return the dirstate action to applyupdates.
We will represent a deleted file as 'nullhex' in the in-memory and on-disk
merge states. We need to be able to create absentfilectxes in that case, and
delete the file from disk rather than try to write it out.
We introduce a new record type, 'C', to indicate change/delete conflicts. This
is a separate record type because older versions of Mercurial will not be able
to handle these conflicts.
We aren't actually storing any change/delete conflicts yet -- that will come in
future patches.
This works around a bug in older Mercurial versions' handling of the v2 merge
state.
We also add a bunch of tests that make sure that
(1) we correctly abort when the merge state has an unsupported record type
(2) aborting the merge, rebase or histedit continues to work and clears out the
merge state.
This is too low-level to be the top-level documentation for mergestate. We're
restricting the top-level documentation to only be about what consumers of the
mergestate and anyone extending it need to care about.
With this patch, mergestate.clean() will no longer abort when it encounters an
unsupported merge type. However we hold off on testing it until backwards
compatibility is in place.