As preparation for making 'dr' and 'rd' actions no longer actions,
move the reporting from applyupdates() to its caller update(). This
way we won't have to pass additonal arguments to applyupdates() when
they are no longer actions. Also, the warnings are equally unrelated
to applyupdates() as they are to recordupdates(), as they don't result
in any changes to either the working copy or the dirstate.
See earlier patch for additional motivation.
It is easier to reason about certain algorithms in terms of a
file->action mapping than the current action->list-of-files. Bid merge
is already written this way (but with a list of actions per file), and
largefiles' overridecalculateupdates() will also benefit. However,
that requires us to have at most one action per file. That requirement
is currently violated by 'dr' (divergent rename) and 'rd' (rename and
delete) actions, which can exist for the same file as some other
action.
These actions are only used for displaying warnings to the user; they
don't change anything in the working copy or the dirstate. In this
way, they are similar to the 'k' (keep) action. However, they are even
less action-like than 'k' is: 'k' at least describes what to do with
the file ("do nothing"), while 'dr' and 'rd' or only annotations for
files for which there may exist other, "real" actions.
As a first step towards separating these acitons out, stop including
them in the progress output, just like we already exclude the 'k'
action.
Most merge action messages don't describe the action itself, they
describe the reason the action was taken. The only exeption is the 'k'
action, for which the message is just "keep" and instead there is a
code comment folling it that says "remote unchanged". Let's move that
comment into the merge action message.
When the local side has renamed a directory from a/ to b/ and added a
file b/c in it, and the remote side has added a file a/c, we end up
overwriting the local file b/c with the contents of remote file
a/c. Add a check for this case and use the merge ('m') action in this
case instead of the directory rename get ('dg') action.
When the remote side has renamed a directory from a/ to b/ and added a
file b/c in it, and the local side has added a file a/c, we end up
moving a/c to b/c without considering the remote version of b/c. Add a
check for this case and use the merge ('m') action in this case
instead of the directory rename ('dm') action.
There are three high-level cases that are of interest in
manifestmerge(): 1) The file exists on both sides, 2) The file exists
only on the local side, and 3) The file exists only on the remote
side. Let's make this clearer in the code.
The 'if f in copied' case will be broken up into the two applicable
branches in the next patch.
The order is determined by manifest.diff(), which currently is not
sorted. There are currently no tests for this, but we will soon add
some that would be flaky without this patch.
When looking for untracked files that would conflict with a tracked
file in the target revision (or the remote side of a merge), we
explcitly exclude ignored files. The code was added in f1db75422e70
(merge: refactor unknown file conflict checking, 2012-02-09), but it
seems like only unknown, not ignored, files were considered since the
beginning of time.
Although ignored files are mostly build outputs and backup files, we
should still not overwrite them. Fix by simply removing the explicit
check.
Before, merging would in some cases ask "wrong" questions about
"changed/deleted" conflicts ... and even do it before the resolve phase where
they can be postponed, re"resolved" or answered in bulk operations.
Instead, check that the content of the changed file really did change.
Reading and comparing file content is expensive and should be avoided before
the resolve phase. Prompting the user is however even more expensive. Checking
the content here is thus better.
The 'f in ancestors[0]' should not be necessary but is included to be extra
safe.
Instead of using a file that we know is not in the common ancestor's
maniffest, let's use None. This is safe as the only place that cares
about the value (applyupdates) already checks if the item exists in
the ancestor.
We can further limit the scope of the 2-way merge case by breaking out
the case where the file was not created from scratch on both sides but
rather renamed in the same way (and is therefore a 3-way merge). This
involves copying some code, but it makes it clearer which case the
"Note:" in the code refers to.
When 'f' is not in 'ma', 'a' will be 'nullid' and all the if/elif
conditions that check whether some one nodeid is equal to 'a' will
fail, and the else-clause will instead apply. We can make that more
explicit by creating a separate 'm' action for the case where 'a' is
'nullid'. While it does mean copying some code, perhaps it makes it a
little clearer which codepaths are possible, and which cases the
"Note:" in the code refers to. It also lets us make the debug action
messages a little more specific.
Since 4a56fba99974 (merge: don't use unknown(), 2012-02-09), untracked
files are no longer included in the manifest diff, so there is no need
to check exclude them when renaming files for directory moves with the
'dm' action.
calculateupdates() happens before applyupdates(), so move it before in
the code. That also moves it close to manifestmerge(), which is a good
location as calculateupdates() is the only caller of manifestmerge().
manifestmerge() has a piece of code that's roughly:
if not force and different:
abort
else:
# if different: old untracked f may be overwritten and lost
...
The comment only talks about what happens when 'different' is true,
and in combination with the if-block above, that must mean that it is
only about what happens when 'force and different'. It seems quite
fine that files are overwritten when 'force' is true, so let's remove
the comment. As it stands, it can easily be interpreted as a TODO
(which is how I interpreted it at first).
As far as I and the test suite can tell, the checks in manifestmerge()
already report the errors (whether or not --check is given), so we
don't need to call merge.checkunknown(). Since this is the last call
to the method, also remove the method.
The method has been called from commands.py since 8d9ca2ac2fe8
(update: just merge unknown file collisions, 2012-02-09), so drop the
underscore prefix that suggests that it's private.
From manifest.diff(), we return a dict from filename to pairs of pairs
of file nodeids and flags (values of the form ((n1,n2),(fl1,fl2))). To
create this dict, we currently generate one dict for files (with
(n1,n2) values) and one for flags (with (fl1,fl2) values) and then
join these dicts. Missing files are represented by None and missing
flags by '', but due to the dict joining, the inner pairs themselves
can also be None. The only caller, merge.manifestmerge(), then unpacks
these values while checking for None values.
By inlining the calls to dicthelpers and simplifying it to only
iterate over files (ignoring flags-only differences), we can simplify
life for our caller.
The manifestdict class already has a method for diff flags between two
manifests (presumably because there is no full access to the private
_flags field). The only caller is merge.manifestmerge(), which also
wants a diff of files between the same manifests. Let's combine the
code for diffing files and flags into a single method on
manifestdict. This puts all the manifest diffing in one place and will
allow for further simplification. It might also be useful for it to be
encapsulated in manifestdict if we later decide to to shard
manifests. The docstring is intentionally unclear about missing
entries for now.
If a merge is attempted when another merge is already ongoing, we give
the message "outstanding uncommitted merges". Many other commands
(such as backout, rebase, histedit) give the same message in singular
form. Since the singular form also seems to make more sense, let's use
that for 'hg merge' as well.
Bid merge is now the default and it is not necessary to tell the user that an
experimental feature kicked in.
(It could however still be relevant to get a notice that it is one of the rare
criss-cross merge situations so the user is warned that the situation is more
tricky than usual.)
In most cases merges will work exactly as before.
The only difference is in criss-cross merge situations where there is multiple
ancestors. Instead of picking an more or less arbitrary ancestor, it will
consider both ancestors and pick the best bids.
Bid merge can be disabled with --config merge.preferancestor='!'.
This wraps all the locations of dirstate.setparent with the appropriate
begin/endparentchange calls. This will prevent exceptions during those calls
from causing incoherent dirstates (issue4353).
Updates with uncommited changes will always only have one ancestor - the parent
revision. Updates between existing revision should (and will) always give the
same result no matter which ancestor is used. The warning is thus only relevant
when doing a "real" merge.
This replaces the grand unified action list that had multiple action types as
tuples in one big list. That list was iterated multiple times just to find
actions of a specific type. This data model also made some code more
convoluted than necessary.
Instead we now store actions as a tuple of lists. Using multiple lists gives a
bit of cut'n'pasted code but also enables other optimizations.
This patch uses 'if True:' to preserve indentations and help reviewing. It also
limits the number of conflicts with other pending patches. It can trivially be
cleaned up later.
Adds a labels function parameter to all the functions between merge.update and
filemerge.filemerge. This will allow commands like rebase to specify custom
marker labels.
The old code had one function that could do 2 different things. First,
is was called a bunch of times to do one thing. Next, it was called a
bunch of times to do the other thing. That gave unnecessary complexity
and a dispatch overhead. Having separate functions is "obviously" better
than having a function that can do two things, depending on its parameters.
It also prepares the code for the next refactorings.
Preparing for action list split-up, making sure the final change don't have any
test changes.
The patch moves debug statements around without really changing anything.
Arguably, it temporarily makes the code worse. The only justification is that
it makes it easier to review the test changes ... and in the end the big change
will not change test output at all.
The changes to test output are due to changes in the ordering of debug output.
That is mainly because we now do the debug logging for files when we actually
process them. Files are also processed in a slightly different but still
correct order. It is now primarily ordered by action type, secondarily by
filename.
The patch introduces some redundancy. Some of it will be removed again, some of
it will in the end help code readability and efficiency. It is possible that we
later on could introduce a "process this action list and do some logging and
progress reporting and apply this function".
The "preserving X for resolve" debug statements will only have single space
indentation. It will no longer have a leading single space indented "f: msg ->
m" message. Having this message double indented would thus no longer make
sense.
The bid actions will temporarily be sorted using a custom sort key that happens
to match the sort order the simplified code will have in the end.
The ordering of actions matters. Normal file system semantics is that files
have to be removed before a directory with the same name can be created.
Before the first ordering key was to have 'r' and 'f' actions come first,
secondary key was the filename.
Because of future refactorings we want to consistently have all action types
(with a sensible priority) as separate first keys. Grouped by action type, we
sort by filename.
Not processing in strict filename order could give worse performance,
especially on spinning disks. That is however primarily an issue in the cases
where "all" actions are of the same kind and will be grouped together anyway.
When using resolve, users often have to consult with the output of |hg
resolve -l| to see if any unresolved files remain. This step is tedious
and adds overhead to resolving.
This patch will notify a user if there are no unresolved files remaining
after executing |hg resolve|::
no unresolved files; you may continue your unfinished operation
The patch stops short of telling the user exactly what command should be
executed to continue the unfinished operation. That is because this
information is not currently captured anywhere. This would make a
compelling follow-up feature.
The resolve command is only relevant when mergestate is present.
This patch will make resolve abort when no mergestate is present.
This change will let people know when they are using resolve when they
shouldn't be. This change will let people know when their use of resolve
doesn't do anything.
Previously, |hg resolve -m| would allow mergestate to be created. This
patch now forbids that. Strictly speaking, this is backwards
incompatible. The author of this patch believes creating mergestate via
resolve doesn't make much sense and this side-effect was unintended.
Some code branches and exceptional circumstances such as empty
mergestate files could result in mergestate._local and
mergestate._other not being defined or reset to None. These variables
are now correctly set to None when they should be.
The current situation is a bit of a layering violation as
merge-specific knowledge is pushed down to lower layers and leaks
merge assumptions into other code paths.
Here, we simply silence the warning with a hack. Both the warning and
the hack will probably go away in the near future when bid merge is
made the default.
Bid merge is a new rarely used feature that the user explicitly enabled - we
should tell/warn when the user actually is using it, just like we tell when we
not are using it.
Give a message like
note: merging 3b08d01b0ab5+ and adfe50279922 using bids from ancestors 0f6b37dbe527 and 40663881a6dd
The basic idea is to do the merge planning with all the available ancestors,
consider the resulting actions as "bids", make an "auction" and
automatically pick the most favourable action for each file.
This implements the basic functionality and will only consider "keep" and
"get" actions. The heuristics for picking the best action can be tweaked later
on.
By default it will only pass ctx.ancestor as the single ancestor to
calculateupdates. The code path for merging with a single ancestor is not
changed.
Such a 'keep' action will later be the preferred (non)action when there
is multiple ancestors. It is thus very convenient to have it explicitly.
The extra actions will only be emitted in the case where the local file has
changed since the ancestor but the other hasn't. That is the symmetrical
operation to a 'get' action.
This will create more action tuples that not really serve a purpose. The number
of actions will however have the number of changed files as upper bound and it
should thus not increase the memory/cpu use significantly.
test-merge-types.t changes a bit in flag merging. It relied on the
implementation detail that 100% identical revlog entries are reused. The revlog
reuse did that fctx.ancestor() saw an ancestor where there really not was one.
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that the format string and "%" aren't placed in the
same line.
This patch replaces "\s" in that regexp pattern with "[ \t\n]" to
detect "% inside _()" problems in such case.
"[\s\n]" can't be used in this purpose, because "\s" is automatically
replaced with "[ \t]" by "_preparepats()" and "\s" in "[]" causes
nested "[]" unexpectedly.
We can use the "other" data from the recorded merge state instead of inferring
what the other could be from working copy parent. This will allow resolve to
fulfil its duty even when the second parent have been dropped.
Most direct benefit is fixing a regression in backout.
This data is mostly redundant with the "other" changeset node (+ other changeset
file path). However, more data never hurt.
The old format do not store it so this require some dancing to add and remove it
on demand.
When we have to fallback to the old version of the file, we infer the
"other" from current working directory parent. The same way it is currently done
in the resolve command. This is know to have shortcoming… but we cannot do
better from the data contained in the old file format. This is actually the
motivation to add this new file format.
We need to record the merge we were merging with. This solve multiple
bug with resolve when dropping the second parent after a merge. This
happen a lot when doing special merge (overriding the ancestor).
Backout, shelve, rebase, etc. can takes advantage of it.
This changeset just add the information in the merge state. We'll use it in the
resolve process in a later changeset.
This new format will allow us to address common bugs while doing special merge
(graft, backout, rebase…) and record user choice during conflict resolution.
The format is open so we can add more record for future usage.
This file still store hexified version of node to help human willing to debug
it by hand. The overhead or oversize are not expected be an issue.
The old format is still used. It will be written to disk along side the newer
format. And at parse time we detect if the data from old version of the
mergestate are different from the one in the new version file. If its the same,
both have most likely be written at the same time and you can trust the extra
data from the new file. If it differs, the old file have been written by an
older version of mercurial that did not knew about the new file. In that case we
use the content of the old file.
The format of the file is unchanged. But we are preparing a new file with a new
format that would be record based. So we change all the read/write logic to
handle a list of record until a very low level. This will allow simple plugging
of the new format in the current code.
Merge could overwrite untracked files and cause data loss.
Instead we now handle the 'local side removed file and has untracked file
instead' case as the 'other side added file that local has untracked' case:
FILE: untracked file exists
abort: untracked files in working directory differ from files in requested revision
It could perhaps make sense to create .orig files when overwriting, either
instead of aborting or when overwriting anyway because of force ... but for now
we stay consistent with similar cases.
The tests shows no real changes because of this ... but there must be some
weird corner cases where using the right ancestor for the merge planning is
better than using the wrong one.
Previously, a bare update would ignore any successor changesets thus
potentially leaving you on an obsolete head. This happens commonly when there
is an old bookmark that hasn't been moved forward which is the motivating
reason for this patch series.
Now, we will check for successor changesets if two conditions hold: 1) we are
doing a bare update 2) *and* we are currently on an obsolete head.
If we are in this situation, then we calculate the branchtip of the successor
set and update to that changeset.
Tests coverage has been added.
Forgets need to be in the beginning of the action list, same as removes. This
lets us avoid clashes in the dirstate where a directory is forgotten and a
file with the same name is added, or vice versa.
hg update . (or equivalents) are effectively no-ops in just about all
circumstances. These sorts of updates can be especially common in a
bookmark-oriented workflow. This saves us a status check and a manifest
decompression, which means that on a repo with over 210,000 files, this brings
hg update . down from 2.5 seconds to 0.15.
There is one change in behavior: a file that was added, not committed, and then
deleted but not removed used to be removed from the dirstate. With this patch
it isn't. This is what causes the change in test-mq-qpush-exact.t. This seems
like it's enough of an edge case to not be worth handling.
The output of test-empty.t changes because those files are not yet created.
Previously, the error message for a dirty non-linear update was the same (and
relatively unhelpful) whether or not a rev was specified. This patch and an
upcoming one will introduce separate, more helpful hints.
To figure out what to do with locally unknown files, Mercurial attempts to read
them if they exist. When an attempt is made to read a file that exists but
traverses a symlink, Mercurial aborts.
With this patch, we first ensure that the file doesn't traverse a symlink
before opening it. This is fine because a file being "remote created" means the
symlink doesn't exist remotely, which means it will be deleted in the apply
phase.
Before this patch, case-folding collision detection uses
"copies.pathcopies()" before "manifestmerge()", and is not aware of
renaming in some cases.
For example, in the case of issue3452, "copies.pathcopies()" can't
detect renaming, if the file is renamed at the revision before common
ancestor of merging. So, "hg merge" is aborted unexpectedly on case
insensitive filesystem.
This patch fully rewrites case-folding collision detection, and
relocate it into "manifestmerge()".
New implementation uses list of actions held in "actions" and
"prompts" to build provisional merged manifest up.
Provisional merged manifest should be correct, if actions required to
build merge result up in working directory are listed up in "actions"
and "prompts" correctly.
This patch checks case-folding collision still before prompting for
merge, to avoid aborting after some interactions with users. So, this
assumes that user would choose not "deleted" but "changed".
This patch also changes existing abort message, because sorting before
collision detection changes order of checked files.
"merge.applyupdates()" sorts "actions" in removal first order, and
"workeractions" derived from it should be also sorted.
If each actions in "workeractions" are executed in serial, this
sorting ensures that merging/updating process is collision free,
because updating the file in target context is always executed after
removing the existing file which causes case-folding collision against
the former.
In the other hand, if each actions are executed in parallel, updating
on a worker process may be executed before removing on another worker
process, because "worker.partition()" partitions list of actions
regardless of type of each actions.
This patch divides "workeractions" into removing and updating, and
executes the former first.
This patch still scans "actions"/"workeractions" some times for ease
of patch review, even though large list may cost much in this way.
(total cost should be as same as before)
This also changes some tests, because dividing "workeractions" affects
progress indication.
Update to "foreground" are no longer seen as cross branch update. "Foreground"
are descendants or successors (or successors of descendants (or descendant of
successors (etc))). This allows to update with uncommited changes that get
automatically merged.
This changeset is a small step forward. We want to allow dirty update to
"background" (precursors) and takes obsolescence in account when finding the
default update destination. But those requires deeper changes and will comes in
later changesets.
This can happen when a file with flags is removed or deleted in the working
directory and also not present in m2. The obvious solution is to add a
__delitem__ override to manifestdict that removes the file from flags if
necessary, but that has a significant performance cost in some cases, e.g.
hg status --rev rev1 --rev rev2 <file>.
This patch improves manifestmerge performance significantly.
In a repository with 170,000 files, the following results were observed on a
clean working directory. Revision '.' adds one file.
hg perfmergecalculate -r .
- before: 0.41 seconds
- after: 0.13 seconds
hg perfmergecalculate -r .^
- before: 0.53 seconds
- after: 0.24 seconds
Comparing against '.' is much faster than comparing against '.^' because with
'.', the wctx and p2 manifest strings have the same identity, so comparisons
are simply pointer equality. With '.^', the strings have different identities
so we need to perform memcmps.
Any operation that uses manifestmerge benefits.
- hg update . goes from 2.04 seconds to 1.75
- hg update .^ goes from 2.52 seconds to 2.25
- hg rebase -r . -d .~6 (involves 4 merges) goes from 11.8 seconds to 10.8
When a series of commits first adds a file and then removes it,
hg rebase --collapse prompts whether to keep the file or delete it. This is
due to it reusing the branch merge code. In a noninteractive terminal it
defaults to keeping the file, which results in a collapsed commit that is
has a file that should be deleted. This bug resulted in developers accidentally
commiting unintentional changes to our repo twice today, so it's fairly
important to get fixed.
This change allows rebase --collapse to tell the merge code to accept the
latest version every time without prompting.
Adds a test as well.
If the manifest of an earlier revision on the same delta chain is read before
that of a later revision, the revlog remembers that we parsed the earlier
revision and continues applying deltas from there onwards. If manifests are
parsed the other way round, we have to start over from the fulltext.
For a fresh clone of mozilla-central, updating from 29dd80c95b7d to its parent
aab96936a177 requires approximately 400 fewer zlib.decompress calls, which
results in a speedup from 1.10 seconds to 1.05.
_forgetremoved can trigger manifest construction, but we only want it to
happen after manifestmerge, so that our attempt to read the manifests in the
right order in an upcoming patch actually works.
This has a big effect on the performance of working dir updates.
Here are the results of update from null to the given rev in several
repos, on a Linux 3.2 system with 32 cores running ext4, with the progress
extension enabled.
repo rev plain parallel speedup
hg 76a842fb67cc 0.9 0.3 3
mozilla-central fe1600b22c77 42.8 7.7 5.5
linux-2.6 9ef4b770e069 31.4 4.9 6.4
Instead of a monotonic count, getupdates yields the number of files it
has updated since it last reported, and its caller sums the numbers when
updating progress.
Once we run these updates in parallel, this will allow worker processes
to report progress less often, reducing overhead.
In an upcoming patch, we will update .hgsubstate in a non-interactive
worker process. Merges of subrepo contents will still need to occur in the
master process (since they may be interactive), so we move that code into
a place where it will always run in what will become the master process.
This replaces the _checkunknown call in calculateupdates with a more
performant version. On a repository with over 150,000 files, this speeds up an
update by 0.6-0.8 seconds, which is up to 25%.
This does not introduce any UI changes. There is existing test coverage for
every case, mostly in test-merge*.t.
Show messages at a point where the actions have been sorted, thus preparing for
backout of 14f4258e3526.
This makes manifestmerge more of a silent operation, just like 'copies' is.
Indent 'preserving' messages to make them subordinate to the action logging so
they fit in the new context. (The 'preserving' messages are quite redundant and
could also be removed completely.)
Preparing for backout of 14f4258e3526.
The number of prompts will for all relevant cases be significantly smaller than
the total number of files in the manifests. We can thus afford to sort the
prompts more than we can afford to sort the manifests.
Add sorted() in places found by testing with PYTHONHASHSEED=random and code
inspection.
An alternative to sprinkling sorted() all over would be to change substate to a
custom dict with sorted iterators...
The 'x' flag and the 'l' flag are very different. It is usually not a problem
to change the 'x' flag of a normal file independent of the content, but one
does not simply change the type of a file to 'l' independent of the content.
This removes the fmerge function that merged both 'x' and 'l' independent of
content early in the merge process. This correctly introduces some conflicts
instead of silent incorrect merges. 3-way flag merge will now be done in the
resolve process, right next to file content merge. Conflicts can thus be
resolved with (slightly inconvenient) resolve commands like 'resolve f --tool
internal:other'. It thus brings us closer to be able to re-solve manifest merge
after the merge and avoid prompts during merge.
This also removes the "conflicting flags for a - (n)one, e(x)ec or sym(l)ink?"
prompt that nobody could answer and that made it easy to mix symlink targets
and file contents up. Instead it will give a file merge where a sufficiently
clever merge tool can help resolving the issue.
The early prescan for move/remove and removal of moved files in applyupdates
was introduced with mergestate 5ff549be1f0c and rendered this chunk of code
irrelevant.
The impact of the chunk was reduced in 0309b0586c65 - but it could have been
removed completely.
Currently the "copy" dict contains both explicit copies/moves made by a
context and pending moves that need to happen because the other context moved
the directory the file was in. For explicit copies, the dict stores a
destination to source map, while for pending moves via directory renames, it
stores a source to destination map. The merge code uses this fact in a non-
obvious way to differentiate between these two cases.
We make this explicit by storing these pending moves in a separate dict. The
dict still has a source to destination map, but that is called out in the
docstring.
This pulls the code used to calculate the changes that need to happen
during merge.update() into a separate function. This is not useful on
its own, but is instead preparatory to performing grafts in memory
when there are no potential conflicts.
Before this patch, case-folding collision is checked simply between
manifests of each merged revisions.
So, files may be considered as colliding each other, even though one
of them is already deleted on one of merged branches: in such case,
merge causes deleting it, so case-folding collision doesn't occur.
This patch checks whether both of files colliding each other still
remain after merge or not, and ignores collision if at least one of
them is deleted by merge.
In the case that one of colliding files is deleted on one of merged
branches and changed on another, file is considered to still remain
after merge, even though it may be deleted by merge, if "deleting" of
it is chosen in "manifestmerge()".
This avoids fail to merge by case-folding collisions after choices
from "changing" and "deleting" of files.
This patch adds only tests for "removed remotely" code paths in
"_remains()", because other ones are tested by existing tests in
"test-casecollision-merge.t".
For divergent renames the following message is printed during merge:
note: possible conflict - file was renamed multiple times to:
newfile
file2
When a file is renamed in one branch and deleted in the other, the file still
exists after a merge. With this change a similar message is printed for mv+rm:
note: possible conflict - file was deleted and renamed to:
newfile
We allow rebase plus collapse, but not collapse only? I imagine people would
rebase first then collapse once they are sure the rebase is correct and it is
the right time to finish it.
I was reluctant to submit this patch for reasons detailed below, but it
improves rebase --collapse usefulness so much it is worth the ugliness.
The fix is ugly because we should be fixing the collapse code path rather than
the merge. Collapsing by merging changesets repeatedly is inefficient compared
to what commit --amend does: commitctx(), update, strip. The problem with the
latter is, to generate the synthetic changeset, copy records are gathered with
copies.pathcopies(). copies.pathcopies() is still implemented with merging in
mind and discards information like file replaced by the copy of another,
criss-cross copies and so forth. I believe this information should not be lost,
even if we decide not to interpret it fully later, at merge time.
The second issue with improving rebase --collapse is the option should not be
there to begin with. Rebasing and collapsing are orthogonal and a dedicated
command would probably enable a better, simpler ui. We should avoid advertizing
rebase --collapse, but with this fix it becomes the best shipped solution to
collapse changesets.
And for the record, available techniques are:
- revert + commit + strip: lose copies
- mq/qfold: repeated patching() (mostly correct, fragile)
- rebase: repeated merges (mostly correct, fragile)
- collapse: revert + tag rewriting wizardry, lose copies
- histedit: repeated patching() (mostly correct, fragile)
- amend: copies.pathcopies() + commitctx() + update + strip
The fix introduced in 3509b9cf8f86 was only partially successful. It is correct
to turn dirstate 'm' merge records into normal/dirty ones but copy records are
lost in the process. To adjust them as well, we need to look in the first
parent manifest to know which files were added and preserve only related
records. But the dirstate does not have access to changesets, the logic has to
moved at another level, in localrepo.
a317664437ee introduced some logic to avoid case-collision detection between
source and destination revisions when it does not make sense: clean or to be
cleaned working directories. Unfortunately, part of it was flawed and the
related test was broken by another bug.
This patch disables cross revision case collision detection for updates without
option or with --check, if the working directory is clean.
if the file in target context causes case-folding collision against
one in working context, current implementation aborts merging with it,
even thouhg collding one (in target) is the file renamed from collided
one (in working).
this patch uses file copy information to know whether colliding file
is renamed from collided one or not: if so, collision between them is
ignored.
this patch also avoids collision detection between current context and
target context, if working context is clean (with --check/-c) or will
be clean (with --clean/-C).
This fixes a regression introduced by df049e784ab6. If no file-level
merge is needed, we can update flags directly, otherwise we have a
conflict to resolve in filemerge.
Previously, we could change a normal file into a corrupt symlink when
trying to merge a symlink flag. Now, we leave the flag alone and let
filemerge deal with it (usually by a prompt).
We also drop a redundant flag setting after filemerge (now dealt with
by ms.resolve) that would cause similar corruption.
The unknown file collision rule was introduced as an extension of the
"should be clean when merging" rule. Unfortunately, it got applied to
the normal update path, which should be happy to merge local changes.
This patch gives us merges for unknown file collisions on update,
while preserving abort for merge and update -c.
This removes use of unknown files for building the synthetic working
directory manifest used by manifestmerge. Instead, we adopt the
strategy used by _checkunknown.
Side-effect: unknown files are no longer moved by remote directory
renames, and now are left alone like ignored files.
Previously, we would do a full working directory walk including
unknown files to perform a merge. In many cases, this was painful
because unknown files greatly outnumbered tracked files and generally
had no useful effect on the merge.
Here we instead wait until we find a file in the destination that's
not tracked locally and detect if it exists and is not ignored. This
is usually cheaper but can be -more- expensive in the case where we're
adding a huge number of files. On the other hand, the cost of statting
the new files should be dwarfed by the cost of eventually writing
them.
In this version, case collisions are detected implicitly by
os.path.exists and wctx[f] lookup.
When doing hg up, if there is a file conflict with untracked files,
currently only the first such conflict is reported. With this patch,
all of them are listed.
With this patch error message is now reported as
a: untracked file differs
b: untracked file differs
abort: untracked files in working directory conflict with files in
requested revision
instead of
abort: untracked file in working directory differs from file in
requested revision: 'a'
This is a follow up to an old attempt to do this here:
http://selenic.com/pipermail/mercurial-devel/2011-August/033625.html
this patch makes branch merging abort when merged changesets have same
file in different case on case insensitive filesystem.
this patch does not prevent linear update which merges between target
and working contexts, because 'branchmerge' is False in such case.
Makes the 'nothing to merge' abort messages in commands.py consistent with
those in merge.py. Also makes commands.merge() and merge.update() use hints.
The tests show the changes.
Some character sets, cp932 (known as Shift-JIS for Japanese) for
example, use 0x41('A') - 0x5A('Z') and 0x61('a') - 0x7A('z') as second
or later character.
In such character set, case collision checking recognizes different
files as CASEFOLDED same file, if filenames are treated as byte
sequence.
win32mbcs extension is not appropriate to handle this problem, because
this problem can occur on other than Windows platform only if
problematic character set is used.
Callers of util.checkcase() use known ASCII filenames as last
component of path, and string.lower() is not applied to directory part
of path. So, util.checkcase() is kept intact, even though it applies
string.lower() to filenames.
merge.update() was missing a few dirtiness checks from workingcontext,
including subrepo cleanliness checks. Using wc.dirty() instead of
one-off checks for various forms of dirtiness will be significantly
safer.
Previously, pull would not update if new branch heads were received,
whereas pull && update would move to the tipmost branch head.
Also change the "crosses branches" abort in merge.update from
"crosses branches (merge branches or use --check to force update)"
to
"crosses branches (merge branches or update --check to force update)"
since it can no longer assume the user is running hg update.
It has substantially different semantics from forget at the command
layer, so change it to avoid confusion.
We can't simply combine it with remove because we need to explicitly
drop non-added files in some cases like commit.
The Mercurial 1.9 release is moving a lot of stuff around anyway and we are
already moving path_auditor from util.py to scmutil.py for that release.
So this seems like a good opportunity to do such a rename. It also strengthens
the current project policy to avoid underbars in names.
These leaks may occur in environments that don't employ a reference
counting GC, i.e. PyPy.
This implies:
- changing opener(...).read() calls to opener.read(...)
- changing opener(...).write() calls to opener.write(...)
- changing open(...).read(...) to util.readfile(...)
- changing open(...).write(...) to util.writefile(...)
This backs out
changeset: 13158:17d1b96c0f12
user: Mads Kiilerich <mads@kiilerich.com>
date: Tue Dec 07 03:29:21 2010 +0100
summary: merge: fast-forward merge with descendant
Before named branches, the invariants were:
a) "merges" always have two parents
b) p1 is not linearly related to p2
Adding named branches made (b) problematic, so the above patch was
introduced, which fixed (b) but broke (a).
After discussion, we decided that the invariants should be:
a) "merges" always have two parents
b) p1 is not linearly related to p2 OR p1 and p2 are on different branches
Add missing calls to close() to many places where files are
opened. Relying on reference counting to catch them soon-ish is not
portable and fails in environments with a proper GC, such as PyPy.
This makes 'hg update --clean' behave the same way for both kinds of
subrepositories. Before Subversion subrepos did not take the clean
parameter into account, but just updated to the given revision and
merged uncommitted changes into that.