A 'colliding unknown file' is a file that meets all of the following
conditions:
- is untracked or ignored on disk
- is present in the changeset being merged or updated to
- has different contents
Previously, we would always abort whenever we saw such files. With this config
option we can choose to warn and back the unknown files up instead, or even
forgo the warning entirely and silently back the unknown files up.
Common use cases for this configuration include a large scale transition of
formerly ignored unknown files to tracked files. In some cases the files can be
given new names, but in other cases, external "convention over configuration"
constraints have determined that the file must retain the same name as before.
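The three conditions that define a colliding unknown file can be sketched with
plain Python containers. This is only an illustration: the function name and
its parameters ('tracked', 'on_disk', 'incoming') are hypothetical and are not
Mercurial's real objects.

```python
def is_colliding_unknown(path, tracked, on_disk, incoming):
    """Return True if `path` is a colliding unknown file.

    tracked  -- set of paths tracked in the working copy (hypothetical)
    on_disk  -- dict of path -> bytes currently on disk (hypothetical)
    incoming -- dict of path -> bytes in the target changeset (hypothetical)
    """
    # Condition 1: present on disk but not tracked (untracked or ignored).
    if path not in on_disk or path in tracked:
        return False
    # Condition 2: the changeset being merged or updated to has this path.
    if path not in incoming:
        return False
    # Condition 3: the contents differ.
    return on_disk[path] != incoming[path]
```

A file with identical contents on both sides is not a collision, so nothing
would need to be backed up for it.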
Previously, we were using Python's native 'os.path.isfile' function, which follows
symlinks. In this case, since we're operating on repo contents, we don't want
to follow symlinks.
There's a behaviour change here, as shown by the second part of the added test.
Consider a symlink 'f' pointing to a file containing 'abc'. If we try and
replace it with a file with contents 'abc', previously we would have let it
through. Now we don't. Although this breaks naive inspection with tools like
'cat' and 'diff', on balance I believe this is the right change.
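The difference between following and not following symlinks can be seen in a
small standalone sketch. The helper name 'is_regular_file_nofollow' is
hypothetical; it only illustrates the idea with 'os.lstat' and is not the code
used in the patch.

```python
import os
import stat
import tempfile


def is_regular_file_nofollow(path):
    """True only if `path` itself is a regular file (symlinks not followed)."""
    try:
        mode = os.lstat(path).st_mode
    except OSError:
        return False
    return stat.S_ISREG(mode)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target")
    link = os.path.join(tmp, "f")
    with open(target, "w") as fh:
        fh.write("abc")
    os.symlink(target, link)

    follows = os.path.isfile(link)             # follows the symlink
    nofollow = is_regular_file_nofollow(link)  # inspects the link itself
    target_is_file = is_regular_file_nofollow(target)
```

Here 'os.path.isfile' reports the symlink 'f' as a file, while the lstat-based
check does not.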
This means that the diff code does less work, potentially
significantly less in the case of treemanifests. It also should ease
implementation with narrowed clone cases (such as narrowhg) when we
don't always have the entire set of treemanifest revlogs locally.
As far as I can tell, this codepath is currently only used by record,
so it'll probably die in the near future, and then narrowhg won't have
to worry about composing with some unknown matching system.
This will be a chance for the merge driver to finish resolving or generating
any driver-resolved files.
As before, having a separate error state from 'unresolved' is too big a
refactoring for now, so we hack around it by setting unresolved to a positive
value when necessary.
We also need to update our internal state to whatever state driverpreprocess
leaves it in.
Adding an error state separate from the unresolved count is too big a
refactoring for now, so we hack around it by setting it to a positive value to
indicate an error state.
The exact semantics for what should happen (particularly with respect to error
handling) are still a bit hard to pin down, so I think it's better to
experiment with it as an extension for now. For now this stub will act as a
convenient point for extensions to hook on.
c67339617276 (made during the 3.4 code freeze) made all 'update' hooks run
after releasing wlock, so that in-memory dirstate changes are visible. But this
breaks paired invocation of 'preupdate' and 'update' hooks.
For example, 'hg backout --merge' for a TARGET revision that isn't a
parent of CURRENT consists of the steps below:
1. update from CURRENT to TARGET
2. commit BACKOUT revision, which backs TARGET out
3. update from BACKOUT to CURRENT
4. merge TARGET into CURRENT
Then, we expect hooks to run in the order below:
- 'preupdate' on CURRENT for (1)
- 'update' on TARGET for (1)
- 'preupdate' on BACKOUT for (3)
- 'update' on CURRENT for (3)
- 'preupdate' on TARGET for (4)
- 'update' on CURRENT/TARGET for (4)
But hooks actually run in the order below:
- 'preupdate' on CURRENT for (1)
- 'preupdate' on BACKOUT for (3)
- 'preupdate' on TARGET for (4)
- 'update' on TARGET for (1), but actually on CURRENT/TARGET
- 'update' on CURRENT for (3), but actually on CURRENT/TARGET
- 'update' on CURRENT for (4), but actually on CURRENT/TARGET
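The two orderings above can be reproduced with a toy model. This is only a
sketch: 'run_updates' and its 'defer_update_hooks' flag are hypothetical names,
and the "lock release" is simulated by flushing a queue at the end.

```python
def run_updates(steps, defer_update_hooks):
    """Return the order in which hooks fire for `steps`.

    When `defer_update_hooks` is true, 'update' hooks are queued and only
    run once the (simulated) wlock is released, after every step is done.
    """
    fired = []
    queued = []
    for step in steps:
        fired.append(("preupdate", step))
        # ... the working directory would be updated here ...
        if defer_update_hooks:
            queued.append(("update", step))
        else:
            fired.append(("update", step))
    # Releasing the lock runs whatever was queued.
    fired.extend(queued)
    return fired


steps = ["(1)", "(3)", "(4)"]
paired = run_updates(steps, defer_update_hooks=False)
deferred = run_updates(steps, defer_update_hooks=True)
```

With deferral, every 'preupdate' fires before any 'update', which is the
unpaired order described above.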
The root cause of the issue addressed by c67339617276 is that an external
'update' hook process can't see in-memory changes (especially those of the
dirstate), because they aren't written out until the end of the
transaction (or wlock).
Now, hooks can be invoked just after updating, because previous
patches made in-memory changes visible to external processes.
This patch may break backward compatibility from the point of view of
"scheduling hook execution", but should be reasonable because 'update'
hooks had been executed in this order before 3.4.
This patch tests "hg backout" and "hg unshelve", because the former
activates the transaction before 'update' hook invocation, but the
latter doesn't.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause this revision to be treated as "the first bad one" when
bisecting, in some cases that use an external hook process inside transaction
scope, because some external hooks and editor processes are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This opens the door to working slightly more closely with the manifest
type and letting it optimize out some of the diff comparisons for us,
and also makes life significantly easier for narrowhg.
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
We currently allow updating and merging (with --force) when there are
unresolved merge conflicts, as long as there is only one parent of the
working copy. Even worse, when updating to another revision
(linearly), if one of the unresolved files (including any conflict
markers in the working copy) can now be merged cleanly with the target
revision, the file becomes marked as resolved.
While we could potentially allow updates that affect only files that
are not in the set of unresolved files, that's considerably more work,
and we don't have a use case for it anyway. Instead, let's keep it
simple and refuse any merge or update (without -C) when there are
unresolved conflicts.
Note that test-merge-local.t explicitly checks for conflict markers
that get carried over on update. It's unclear if that was intentional
or not, but it seems bad enough that we should forbid it. The simplest
way of fixing the test case is to leave the conflict markers in place
and just mark the files resolved, so let's just do that for now.
Currently merge.graft re-writes the dirstate so only a single
parent is kept. For some cases, like evolving a merge commit,
this behaviour is not desired. More specifically, this is
needed to fix issue4389.
We have finally laid all the groundwork to make this happen.
The only change/delete conflicts that haven't been moved are .hgsubstate
conflicts. Those are trickier to deal with and well outside the scope of this
series.
We add comprehensive testing not just for the initial selections but also for
re-resolves and all possible dirstate transitions caused by merge tools. That
testing managed to shake out several bugs in the way we were handling dirstate
transitions.
The other test changes are because we now treat change/delete conflicts as
proper merges, and increment the 'merged' counter rather than the 'updated'
counter. I believe this is the right approach here.
For third-party extensions, if they're interacting with filemerge code they
might have to deal with an absentfilectx rather than a regular filectx.
Still to come:
- add a 'leave unresolved' option to merges
- change the default for non-interactive change/delete conflicts to be 'leave
unresolved'
- add debug output to go alongside debug outputs for binary and symlink file
merges
This is somewhat different from the currently existing 'a' action, for the
following case:
- dirty working copy, with file 'fa' added and 'fm' modified
- hg merge --force with a rev that neither has 'fa' nor 'fm'
- for the change/delete conflicts we pick 'changed' for both 'fa' and 'fm'.
In this case 'branchmerge' is true, but we need to distinguish between 'fa',
which should ultimately be marked added, and 'fm', which should be marked
modified.
Our current strategy is to just not touch the dirstate at all. That works for
now, but won't work once we move change/delete conflicts to the resolve phase.
In that case we may perform repeated re-resolves, some of which might mark the
file removed or remove the file from the dirstate. We'll need to re-add the
file to the dirstate, and we need to be able to figure out whether we mark the
file added or modified. That is what the new 'am' action lets us do.
Once we move change/delete conflicts into the resolve phase, a 'dc' file might
first be resolved by picking the other side, then later be resolved by picking
the local side. For this transition we want to make sure that the file goes
back to not being in the dirstate.
This has no impact on conflicts during the initial merge.
At the moment this is a no-op (the only actions defined are 'r', 'a' and 'g'),
but soon we're going to add other sorts of actions to the dictionary returned
from mergestate.actions().
These are meant for use by custom merge drivers that might want to modify the
dirstate. Dirstate internal consistency rules require that all removes happen
before any adds -- this means that custom merge drivers shouldn't be modifying
the dirstate directly.
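The ordering rule can be sketched as a small queue that replays removes before
adds. The class and method names below are hypothetical and a plain set stands
in for the dirstate; this is not the interface added by the patch.

```python
class QueuedDirstateActions:
    """Collect dirstate changes and replay them with removes first."""

    def __init__(self):
        self._removes = []
        self._adds = []

    def queue_remove(self, path):
        self._removes.append(path)

    def queue_add(self, path):
        self._adds.append(path)

    def apply(self, tracked):
        """Apply queued changes to `tracked` (a set) and return a log."""
        log = []
        for path in self._removes:   # all removes happen first
            tracked.discard(path)
            log.append(("remove", path))
        for path in self._adds:      # adds only after every remove
            tracked.add(path)
            log.append(("add", path))
        return log


queue = QueuedDirstateActions()
queue.queue_add("a")          # queued first, but applied last
queue.queue_remove("a/b")
tracked = {"a/b"}
log = queue.apply(tracked)
```

Replacing the directory 'a' with a file 'a' only works in this order, which is
why the caller should not be left to pick the order itself.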
We're going to use this to extend the action lists in merge.applyupdates.
The somewhat funky return value is to make passing this dict directly into
recordactions easier. We're going to exploit that in an upcoming patch.
This eliminates a whole bunch of duplicate code and allows us to update the
removed count for change/delete conflicts where the delete action was chosen.
This will not only allow us to remove a bunch of duplicate code in applyupdates
in an upcoming patch, it will also allow the resolve interface to be a lot
simpler: it doesn't need to return the dirstate action to applyupdates.
We will represent a deleted file as 'nullhex' in the in-memory and on-disk
merge states. We need to be able to create absentfilectxes in that case, and
delete the file from disk rather than try to write it out.
We introduce a new record type, 'C', to indicate change/delete conflicts. This
is a separate record type because older versions of Mercurial will not be able
to handle these conflicts.
We aren't actually storing any change/delete conflicts yet -- that will come in
future patches.
This works around a bug in older Mercurial versions' handling of the v2 merge
state.
We also add a bunch of tests that make sure that
(1) we correctly abort when the merge state has an unsupported record type
(2) aborting the merge, rebase or histedit continues to work and clears out the
merge state.
This is too low-level to be the top-level documentation for mergestate. We're
restricting the top-level documentation to only be about what consumers of the
mergestate and anyone extending it need to care about.
With this patch, mergestate.clean() will no longer abort when it encounters an
unsupported merge type. However we hold off on testing it until backwards
compatibility is in place.
Eventually, we'll move the read call out of the constructor. This will:
- avoid unnecessary reads when we're going to nuke the merge state anyway
- avoid raising an exception if there's an unsupported merge record
'clean' seems like a good name for it because I wanted to avoid anything with
the word 'new' in it, and 'reset' is more an action performed on a merge state
than a way to get a new merge state.
Thanks to Martin von Zweigbergk for feedback about naming this.
We're going to catch this exception in 'hg summary' to print a better error
message.
This code is pretty untested, so there are no changes to test output. In
upcoming patches we're going to test the output more thoroughly.
Now that all internal callers pre-compute and set a destination at a higher level
it feels like we can kill this API. This will allow us to simplify this
function. However I feel like this is a bit too central and critical to break
now. I'm adding a devel warning to let extension authors catch this in the next
cycle.
File/directory case folding collisions cannot be represented on case folding
systems and have to fail.
To detect this and abort early, rely on the fact that for file/directory
collisions, a sorted list of case folded manifest names will have the colliding
directory right after the file.
(This could perhaps be optimized, but this way of doing it also has
directory/directory case folding in mind ... which however is not handled yet.)
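A minimal sketch of the adjacency check, assuming 'str.lower()' as a stand-in
for the real case folding and a hypothetical function name. It compares
neighbouring entries only, so names containing characters that sort before '/'
could separate the file from the directory; treat it as an illustration of the
idea rather than a complete check.

```python
def find_file_dir_fold_collision(names):
    """Return (file, other) for the first file/directory case-folding
    collision found between adjacent sorted entries, else None."""
    folded = sorted((name.lower(), name) for name in names)
    for (pfold, pname), (cfold, cname) in zip(folded, folded[1:]):
        # The folded names say "cname lives under directory pname", but the
        # real spelling of that directory differs from the file's spelling.
        if cfold.startswith(pfold + "/") and not cname.startswith(pname + "/"):
            return pname, cname
    return None


collision = find_file_dir_fold_collision(["a", "A/b", "c"])
no_collision = find_file_dir_fold_collision(["a/b", "a/c", "d"])
```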
A driver-resolved file is a file that's handled specially by the driver. A
common use case for this state would be autogenerated files, the generation of
which should happen only after all source conflicts are resolved.
This is done with an uppercase letter because older versions of Mercurial will
not know how to treat such files at all.
A 'merge driver' is a coordinator for the overall merge process. It will be
able to control:
- tools for individual files, much like the merge-patterns configuration does
today
- tools that can work across groups of files
- the ordering of file resolution
- resolution of automatically generated files
- adding and removing additional files to and from the dirstate
Since it is a critical part of the merge process, it really is part of the
merge state.
This is a lowercase character (i.e. optional) because ignoring this is fine for
older versions of Mercurial -- however, if there are any files that are
specially treated by the driver, we should abort. That will happen in upcoming
patches.
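The uppercase/lowercase convention can be sketched as follows. The set of known
record types and all names here are hypothetical; the point is only that an
unknown lowercase record is skipped while an unknown uppercase one aborts.

```python
class UnsupportedRecord(Exception):
    """Raised for a mandatory record type this reader does not know."""


# Hypothetical set of record types understood by this (older) reader.
KNOWN_RECORD_TYPES = frozenset({"L", "O", "F"})


def filter_records(records, known=KNOWN_RECORD_TYPES):
    """Keep known records, skip optional ones, abort on mandatory ones."""
    kept = []
    for rtype, data in records:
        if rtype in known:
            kept.append((rtype, data))
        elif rtype.islower():
            # Lowercase means optional: safe to ignore if not understood.
            continue
        else:
            # Uppercase means mandatory: we cannot handle this merge state.
            raise UnsupportedRecord(rtype)
    return kept
```

Under these assumptions, a reader that does not know the merge driver record
('m') keeps working, but one that meets a driver-resolved file record ('D')
stops.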
There is a potential security issue with storing the merge driver in the merge
state. See the inline comments for more details.
For the same reason, we move the bookmark related update logic into the
'destupdate' function. This requires extending the return value of the function
to include the bookmark that needs to move (more or less) and the bookmark to
activate at the end of the function. See the function documentation for details
on these return values.
We perform all that we can non-interactively before prompting the user for input
via their merge tool. This allows for a maximally consistent state when the user
is first prompted.
The test output changes indicate the actual behavior change happening.
The section of code that writes out the version of the file cached in the merge
state should only be run at preresolve time. This is so that if the premerge
keeps around conflict markers, those don't get overwritten before the main
merge.
This means that in ms.resolve we must call merge after calling premerge. This
doesn't yet mean that all premerges happen before any merges -- however, this
does get us closer to our goal.
The output differences are because we recompute the merge tool. The only
user-visible difference caused by this patch is that if the tool is missing
we'll print the warning twice. Not a huge deal, though.
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
We can safely drop this because the very same assignment is enforced later in
the function. Dropping it will make it simpler to extract the default
destination logic in its own function.
This is another step toward having "default" destination more clear and unified.
Not all the logic is there because some bookmark related computation happened
elsewhere. It will be moved later.
The function is private because, as with the other ones, cleanup is needed
before we can proceed.
Resolving other conflicts before merge ones is better because the state before
the merge is as consistent as possible. It will also help with future work
involving automatic resolution of merge conflicts with an external merge
driver.
There are no ordering issues here because it is easy to verify that the same
file is never in both the dg/dm and the m sets.
We're going to treat these conflicts similarly to merge conflicts, and this
change to the way we store things in memory makes future code a lot simpler.
(1) These aren't currently read from anywhere, so emptying this out is
pointless.
(2) These *will* be read from later in upcoming patches, and not emptying
them out will be important then.
I actually wanted to reduce the amount of code around the call to
applyupdates(), so I tried moving these warnings a little earlier, and
I think it makes the output make a little more sense (see changes to
test cases).
This only makes a difference when a merge driver is active -- in that case we
don't want to try and merge all the files, just the ones still unresolved after
the merge driver's preprocess step is over.
Explicit 'dirstate.normallookup()' invocation via 'dirtysubstate()' in
'applyupdates()' is useless now, because previous patch fixed the
relevant issue by writing in-memory dirstate changes out at the end of
dirty check.
'dirstate.normallookup()' invocation was introduced by 13fc4cf249d9 to
avoid occasional test failure. This is partial backout of it (added
tests are still left).
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance" syntax.
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
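For reference, the rewritten form looks like the sketch below (the function is
a made-up example, not code from the tree). The old spelling is shown only in a
comment because it is a syntax error on Python 3.

```python
def parse_int(text):
    """Return (value, None) on success or (None, error name) on failure."""
    try:
        return int(text), None
    except ValueError as exc:  # valid on Python 2.6+ and Python 3
        # Old spelling, removed in Python 3:  except ValueError, exc:
        return None, type(exc).__name__
```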
There were 2 test failures in 3.4-rc when running test-hook.t with the
largefiles extension enabled. For context, the first is a commit hook:
    @@ -618,9 +621,9 @@
       $ echo 'update = hg id' >> .hg/hgrc
       $ echo bb > a
       $ hg ci -ma
    -  223eafe2750c tip
    +  d3354c4310ed+
       $ hg up 0
    -  cb9a9f314b8b
    +  223eafe2750c+ tip
       1 files updated, 0 files merged, 0 files removed, 0 files unresolved
     make sure --verbose (and --quiet/--debug etc.) are propagated to the local ui
In both cases, largefiles acquires the wlock before calling into core, which
also acquires the wlock. The first case was fixed in 4100e338a886 by ensuring
the hook only runs after the lock has been fully released. The full release is
important, because that is what writes dirstate to the disk, allowing external
hooks to see the result of the update. This simply changes how the update hook
is called, so that it too is deferred until the lock is finally released.
There are many uses of mergemod.update(), but in terms of commands, it looks
like the following commands take wlock while calling mergemod.update(), and
therefore will now have their hook fired at a later time:
backout, fetch, histedit, qpush, rebase, shelve, transplant
Unlike the others, fetch immediately unlocks after calling update(), so for all
intents and purposes, its hook invocation is not deferred (but the external hook
still sees the proper state).
Previously it was impossible to graft a commit onto its own parent (i.e. create
a copy of the commit). This is useful when wanting to create a backup of the
commit before continuing to amend it. This patch enables that behavior.
The change to the histedit test is because histedit uses graft to apply commits.
The test in question moves a commit backwards onto an ancestor. Since the graft
logic now more explicitly supports this, it knows to simply accept the incoming
changes (since they are more recent), instead of prompting.
Before this patch, failure of updating subrepos may cause inconsistent
".hgsubstate". For example:
1. dirstate entry for ".hgsubstate" of the parent repo is filled
with valid size/date (via "hg status" or so)
2. "hg update" is invoked at the parent repo
3. ".hgsubstate" of the parent repo is updated on the filesystem as
a part of "g"(et) action in "merge.applyupdates"
4. it is assumed that size/date of ".hgsubstate" on the filesystem
aren't changed from ones at (1)
this condition is not hard to meet, because just changing hash
ids (all ids have the same length) in ".hgsubstate" doesn't
change its file size
5. "subrepo.submerge()" is invoked to update subrepos
6. failure of updating in one of subrepos raises exception
(e.g. "untracked file differs")
7. "hg update" is aborted without updating dirstate of the parent repo
dirstate entry for ".hgsubstate" still holds size/date at (1)
Then, ".hgsubstate" of the parent repo is treated as "CLEAN"
unexpectedly, because updating ".hgsubstate" at (3) doesn't change
size/date of it on the filesystem: see assumption at (4).
This inconsistent ".hgsubstate" status causes unexpected behavior, for
example:
- "hg revert" forgets to revert ".hgsubstate"
- "hg update" misunderstands that (not yet updated) subrepos diverge
(then, it shows the prompt to confirm user's decision)
To avoid inconsistent ".hgsubstate" status above, this patch marks
".hgsubstate" as possibly dirty before "submerge" invocation.
"normallookup"-ed (= dirty) dirstate should be written out, even if
processing is aborted by failure.
This patch marks ".hgsubstate" as possibly dirty before "submerge",
also when it is removed or merged while merging, for safety. This
should prevent Mercurial from misunderstanding inconsistent
".hgsubstate" as clean.
To satisfy conditions at (1) and (4) above, this patch uses "hg status
--config debug.dirstate.delaywrite=2" (to fill valid size/date into
dirstate) and "touch" (to fix date of the file).
This change touches every module in which repository.wopener was being used, and
changes it for the equivalent repository.wvfs.
It should now be possible to remove localrepo.wopener.
This change touches every module in which repository.opener was being used, and
changes it for the equivalent repository.vfs. This is meant to make it easier
to split the repository.vfs into several separate vfs.
It should now be possible to remove localrepo.opener.
This moves most reading of filelogs out of manifestmerge, making it
easy for a narrow clone extension to filter out or translate unwanted
actions before any filelogs are read. The only call left is inside of
copies.mergecopies(), which can be overridden separately at a lower
level.
We still have one case of a call to _checkunknownfile() in
manifestmerge(): when force=True and branchmerge=True and the remote
side has a file that the local side doesn't. This combination of
arguments is used by 'hg merge --force', but also by rebase and
unshelve. In this scenario, we try to create the file from the
contents from the remote, but if there is already a local untracked
file in place, we merge it instead.
When a directory was renamed and a new untracked file was added in the
new directory and the remote directory added a file by the same name
in the old directory, the local untracked file gets overwritten, as
demonstrated by the broken test case in test-rename-dir-merge.
Fix by checking for unknown files for 'dg' actions too. Since
_checkunknownfile() currently expects the same filename in both
contexts, we need to add a new parameter for the remote filename to
it.
The 'c' and 'dc' actions include creating a file on disk and we need
to check that no conflicting file exists unless force=True. Move two
of the calls to _checkunknownfile() to a single place at the end of
manifestmerge(). This removes some of the reading of filelogs from the
heart of manifestmerge() and collects it in one place close to where
its output (entries in the 'aborts' list) is used.
Note that this removes the unnecessary call to _checkunknownfile()
when force=True in one of the code paths.
_checkunknownfile() reads the filelog of the remote side's file. For
narrow clones, the filelog will not exist for all files and we need a
way to avoid reading them. While it would be easier for the narrow
extension to just override _checkunknownfile() and make it ignore
files outside the narrow clone, it seems cleaner to have
manifestmerge() not care about filelogs (considering its
name).
In order to move the calls to _checkunknownfile() out, we need to be
able to tell in which cases we should check for unknown files. Let's
start by introducing a new action distinct from 'g' for this
purpose. Specifically, the new action will be just like 'g' except
that it will check for conflicting unknown files first. For now,
just add the new action type and convert it to 'g'.
This simplifies largefiles' overridecalculateupdates(), which no
longer has to do the conversion it started doing in 478d610ca1b0
(largefiles: rewrite merge code using dictionary with entry per file,
2014-12-09).
To keep this patch small, we'll leave the name 'actionbyfile' in
overrides.py. It will be renamed in the next patch.
By moving the conversion from the file->action dict after
_forgetremoved(), we make that method shorter by removing the need for
the confusing 'xactions' variable.
By moving the conversion from the file->action dict after the bid
merge code, bid merge can be simplified a little.
A few tests are affected by this change. Where we used to iterate over
the actions first in order of the action type ('g', 'r', etc.) [1], we
now iterate in order of filename. This difference affects the order of
debug log statements.
[1] And then in the non-deterministic order of files in the manifest
dictionary (the order returned from manifest.diff()).
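The change in iteration order can be illustrated with a sketch of the
conversion. The function name and the shape of the tuples are assumptions made
for this example; they are not taken from merge.py.

```python
def group_by_action(actions_by_file):
    """Convert {file: (action, args, msg)} into {action: [(file, args, msg)]}.

    Files are visited in sorted order, so each per-action list (and any debug
    output produced while iterating) is ordered by filename.
    """
    grouped = {}
    for fname in sorted(actions_by_file):
        action, args, msg = actions_by_file[fname]
        grouped.setdefault(action, []).append((fname, args, msg))
    return grouped


by_file = {
    "b": ("g", (), "remote created"),
    "a": ("r", (), "other deleted"),
    "c": ("g", (), "remote created"),
}
grouped = group_by_action(by_file)
visit_order = sorted(by_file)
```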
In the same vein as 478d610ca1b0 (largefiles: rewrite merge code using
dictionary with entry per file, 2014-12-09), rewrite manifestmerge()
itself to use a dictionary with the filename as key. This will let us
simplify some of the other code in merge.py and eventually drop the
conversion in the largefiles code.
No difference in speed could be detected (well within the noise level
when run in Mozilla repo).
When there are multiple common ancestors, we should check for case
collisions only on the resulting actions after bid merge has run. To
do this, move the code until after bid merge.
Move it past _resolvetrivial() too, since that might update
actions. If the remote changed a file and then reverted the change,
while the local side deleted the file and created a new file with a
name that case-folds like the old file, we should fail before this
patch but not after.
Although the changes to the actions caused by _forgetremoved() should
have no effect on case collisions, move it after that, too, so the
next person reading the code won't have to think about it.
Moving it past these blocks of code takes it to the end of
calculateupdates(), so let's even move it outside of the method, so we
also check collisions in actions produced by extensions overriding the
method.
By moving the cd/dc prompts out of calculateupdates(), we let
largefiles' overridecalculateupdates() see the unresolved values
(i.e. 'cd' or 'dc' rather than 'g', 'r', 'a' and missing). This allows
overridecalculateupdates() to ask the user whether to keep the normal
file or the largefile before the user gets the cd/dc prompt. Whichever
answer the user gives, we make overridecalculateupdates() replace the 'cd'
or 'dc' action, saving the user one annoying (and less clear)
question.
We would eventually like to move the resolution of modify/delete and
delete/modify conflicts to the resolve phase. However, we don't want
to move the checks for identical content that were added in
99b29d2bd5ed (merge: before cd/dc prompt, check that changed side
really changed, 2014-12-01). Let's instead move these out to a new
_resolvetrivial() function that processes the actions from
manifestmerge() and replaces any false cd/dc conflicts. The function
will also provide a natural place for us to later add code for
resolving false 'm' conflicts.
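A rough sketch of that idea, using plain dicts of file contents instead of
contexts. The name 'resolve_trivial', the data shapes, and the choice of what a
false conflict is replaced with (a remove for 'cd', dropping the action for
'dc') are assumptions for illustration, reading 'cd' as "changed locally,
deleted remotely" and 'dc' as the reverse.

```python
def resolve_trivial(actions, local, remote, ancestor):
    """Replace cd/dc actions where the "changed" side has ancestor content.

    actions  -- dict of file -> (action, args, msg)
    local, remote, ancestor -- dicts of file -> contents (hypothetical)
    """
    for fname, (action, args, msg) in list(actions.items()):
        if fname not in ancestor:
            continue
        if action == "cd" and local[fname] == ancestor[fname]:
            # Local side did not really change; honour the remote delete.
            actions[fname] = ("r", None, "content unchanged")
        elif action == "dc" and remote[fname] == ancestor[fname]:
            # Remote side did not really change; keep the local delete.
            del actions[fname]
    return actions


actions = {
    "same": ("cd", None, "prompt changed/deleted"),
    "real": ("cd", None, "prompt changed/deleted"),
    "gone": ("dc", None, "prompt deleted/changed"),
}
local = {"same": b"v1", "real": b"v2"}
remote = {"gone": b"v1"}
ancestor = {"same": b"v1", "real": b"v1", "gone": b"v1"}
resolved = resolve_trivial(actions, local, remote, ancestor)
```

Only 'real' is left as a genuine conflict that would need a prompt.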
As preparation for making 'dr' and 'rd' actions no longer actions,
move the reporting from applyupdates() to its caller update(). This
way we won't have to pass additional arguments to applyupdates() when
they are no longer actions. Also, the warnings are equally unrelated
to applyupdates() as they are to recordupdates(), as they don't result
in any changes to either the working copy or the dirstate.
See earlier patch for additional motivation.
It is easier to reason about certain algorithms in terms of a
file->action mapping than the current action->list-of-files. Bid merge
is already written this way (but with a list of actions per file), and
largefiles' overridecalculateupdates() will also benefit. However,
that requires us to have at most one action per file. That requirement
is currently violated by 'dr' (divergent rename) and 'rd' (rename and
delete) actions, which can exist for the same file as some other
action.
These actions are only used for displaying warnings to the user; they
don't change anything in the working copy or the dirstate. In this
way, they are similar to the 'k' (keep) action. However, they are even
less action-like than 'k' is: 'k' at least describes what to do with
the file ("do nothing"), while 'dr' and 'rd' are only annotations for
files for which there may exist other, "real" actions.
As a first step towards separating these actions out, stop including
them in the progress output, just like we already exclude the 'k'
action.
Most merge action messages don't describe the action itself, they
describe the reason the action was taken. The only exception is the 'k'
action, for which the message is just "keep" and instead there is a
code comment following it that says "remote unchanged". Let's move that
comment into the merge action message.
When the local side has renamed a directory from a/ to b/ and added a
file b/c in it, and the remote side has added a file a/c, we end up
overwriting the local file b/c with the contents of remote file
a/c. Add a check for this case and use the merge ('m') action in this
case instead of the directory rename get ('dg') action.
When the remote side has renamed a directory from a/ to b/ and added a
file b/c in it, and the local side has added a file a/c, we end up
moving a/c to b/c without considering the remote version of b/c. Add a
check for this case and use the merge ('m') action in this case
instead of the directory rename ('dm') action.
There are three high-level cases that are of interest in
manifestmerge(): 1) The file exists on both sides, 2) The file exists
only on the local side, and 3) The file exists only on the remote
side. Let's make this clearer in the code.
The 'if f in copied' case will be broken up into the two applicable
branches in the next patch.
The order is determined by manifest.diff(), which currently is not
sorted. There are currently no tests for this, but we will soon add
some that would be flaky without this patch.
When looking for untracked files that would conflict with a tracked
file in the target revision (or the remote side of a merge), we
explicitly exclude ignored files. The code was added in f1db75422e70
(merge: refactor unknown file conflict checking, 2012-02-09), but it
seems like only unknown, not ignored, files were considered since the
beginning of time.
Although ignored files are mostly build outputs and backup files, we
should still not overwrite them. Fix by simply removing the explicit
check.
Before, merging would in some cases ask "wrong" questions about
"changed/deleted" conflicts ... and even do it before the resolve phase where
they can be postponed, re-resolved or answered in bulk operations.
Instead, check that the content of the changed file really did change.
Reading and comparing file content is expensive and should be avoided before
the resolve phase. Prompting the user is however even more expensive. Checking
the content here is thus better.
The 'f in ancestors[0]' should not be necessary but is included to be extra
safe.
Instead of using a file that we know is not in the common ancestor's
manifest, let's use None. This is safe as the only place that cares
about the value (applyupdates) already checks if the item exists in
the ancestor.
We can further limit the scope of the 2-way merge case by breaking out
the case where the file was not created from scratch on both sides but
rather renamed in the same way (and is therefore a 3-way merge). This
involves copying some code, but it makes it clearer which case the
"Note:" in the code refers to.
When 'f' is not in 'ma', 'a' will be 'nullid' and all the if/elif
conditions that check whether some nodeid is equal to 'a' will
fail, and the else-clause will instead apply. We can make that more
explicit by creating a separate 'm' action for the case where 'a' is
'nullid'. While it does mean copying some code, perhaps it makes it a
little clearer which codepaths are possible, and which cases the
"Note:" in the code refers to. It also lets us make the debug action
messages a little more specific.
Since 4a56fba99974 (merge: don't use unknown(), 2012-02-09), untracked
files are no longer included in the manifest diff, so there is no need
to exclude them when renaming files for directory moves with the
'dm' action.
calculateupdates() happens before applyupdates(), so move it before in
the code. That also moves it close to manifestmerge(), which is a good
location as calculateupdates() is the only caller of manifestmerge().
manifestmerge() has a piece of code that's roughly:
    if not force and different:
        abort
    else:
        # if different: old untracked f may be overwritten and lost
        ...
The comment only talks about what happens when 'different' is true,
and in combination with the if-block above, that must mean that it is
only about what happens when 'force and different'. It seems quite
fine that files are overwritten when 'force' is true, so let's remove
the comment. As it stands, it can easily be interpreted as a TODO
(which is how I interpreted it at first).
As far as I and the test suite can tell, the checks in manifestmerge()
already report the errors (whether or not --check is given), so we
don't need to call merge.checkunknown(). Since this is the last call
to the method, also remove the method.
The method has been called from commands.py since 8d9ca2ac2fe8
(update: just merge unknown file collisions, 2012-02-09), so drop the
underscore prefix that suggests that it's private.
From manifest.diff(), we return a dict from filename to pairs of pairs
of file nodeids and flags (values of the form ((n1,n2),(fl1,fl2))). To
create this dict, we currently generate one dict for files (with
(n1,n2) values) and one for flags (with (fl1,fl2) values) and then
join these dicts. Missing files are represented by None and missing
flags by '', but due to the dict joining, the inner pairs themselves
can also be None. The only caller, merge.manifestmerge(), then unpacks
these values while checking for None values.
By inlining the calls to dicthelpers and simplifying it to only
iterate over files (ignoring flags-only differences), we can simplify
life for our caller.
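To show the shape of the result, here is a minimal sketch that assumes the
two manifests are given as plain dicts (filename to nodeid, and filename to
flag). 'manifestdiff' and its parameters are hypothetical, and whether an
entry is emitted when only the flag differs is an assumption of this sketch.
The point is that every value is a complete ((n1,n2),(fl1,fl2)) pair, so the
caller never sees None in place of an inner pair.

```python
def manifestdiff(files1, flags1, files2, flags2):
    """Return {filename: ((n1, n2), (fl1, fl2))} for differing files.

    A missing file is represented by nodeid None and flag ''.
    All four arguments are hypothetical plain dicts.
    """
    diff = {}
    for fn in set(files1) | set(files2):
        n1 = files1.get(fn)
        n2 = files2.get(fn)
        # Only look up flags for files that exist on that side.
        fl1 = flags1.get(fn, '') if n1 is not None else ''
        fl2 = flags2.get(fn, '') if n2 is not None else ''
        if n1 != n2 or fl1 != fl2:
            diff[fn] = ((n1, n2), (fl1, fl2))
    return diff
```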
The manifestdict class already has a method for diffing flags between two
manifests (presumably because there is no full access to the private
_flags field). The only caller is merge.manifestmerge(), which also
wants a diff of files between the same manifests. Let's combine the
code for diffing files and flags into a single method on
manifestdict. This puts all the manifest diffing in one place and will
allow for further simplification. It might also be useful for it to be
encapsulated in manifestdict if we later decide to shard
manifests. The docstring is intentionally unclear about missing
entries for now.
If a merge is attempted when another merge is already ongoing, we give
the message "outstanding uncommitted merges". Many other commands
(such as backout, rebase, histedit) give the same message in singular
form. Since the singular form also seems to make more sense, let's use
that for 'hg merge' as well.
Bid merge is now the default and it is not necessary to tell the user that an
experimental feature kicked in.
(It could however still be relevant to get a notice that it is one of the rare
criss-cross merge situations so the user is warned that the situation is more
tricky than usual.)
In most cases merges will work exactly as before.
The only difference is in criss-cross merge situations where there are multiple
ancestors. Instead of picking a more or less arbitrary ancestor, it will
consider both ancestors and pick the best bids.
Bid merge can be disabled with --config merge.preferancestor='!'.
This wraps all the locations of dirstate.setparent with the appropriate
begin/endparentchange calls. This will prevent exceptions during those calls
from causing incoherent dirstates (issue4353).
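A toy model of the wrapping pattern, for illustration only. 'ToyDirstate' and
'moveparents' are hypothetical, and the guard in write() is an assumption
about how a pending parent change could be used to avoid persisting a
half-updated state. Note that the end call is deliberately not placed in a
'finally' block: if the work in between raises, the change stays pending.

```python
class ToyDirstate(object):
    """Hypothetical stand-in that only tracks parents and a pending count."""

    def __init__(self):
        self._pending = 0
        self.parents = ('p0', None)
        self.written = None

    def beginparentchange(self):
        self._pending += 1

    def endparentchange(self):
        self._pending -= 1

    def setparents(self, p1, p2=None):
        self.parents = (p1, p2)

    def write(self):
        # Assumed guard: refuse to persist a half-finished parent change.
        if self._pending > 0:
            return False
        self.written = self.parents
        return True


def moveparents(dirstate, p1, fixup):
    """Change the parent and run the follow-up work as one unit."""
    dirstate.beginparentchange()
    dirstate.setparents(p1)
    fixup()  # may raise; endparentchange() is then skipped on purpose
    dirstate.endparentchange()
```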