If the manifest of an earlier revision on the same delta chain is read before
that of a later revision, the revlog remembers that we parsed the earlier
revision and continues applying deltas from there onwards. If manifests are
parsed the other way round, we have to start over from the fulltext.
For a fresh clone of mozilla-central, updating from 29dd80c95b7d to its parent
aab96936a177 requires approximately 400 fewer zlib.decompress calls, which
results in a speedup from 1.10 seconds to 1.05.
_forgetremoved can trigger manifest construction, but we only want it to
happen after manifestmerge, so that our attempt to read the manifests in the
right order in an upcoming patch actually works.
This has a big effect on the performance of working dir updates.
Here are the results of update from null to the given rev in several
repos, on a Linux 3.2 system with 32 cores running ext4, with the progress
extension enabled.
repo rev plain parallel speedup
hg 76a842fb67cc 0.9 0.3 3
mozilla-central fe1600b22c77 42.8 7.7 5.5
linux-2.6 9ef4b770e069 31.4 4.9 6.4
Instead of a monotonic count, getupdates yields the number of files it
has updated since it last reported, and its caller sums the numbers when
updating progress.
Once we run these updates in parallel, this will allow worker processes
to report progress less often, reducing overhead.
In an upcoming patch, we will update .hgsubstate in a non-interactive
worker process. Merges of subrepo contents will still need to occur in the
master process (since they may be interactive), so we move that code into
a place where it will always run in what will become the master process.
This replaces the _checkunknown call in calculateupdates with a more
performant version. On a repository with over 150,000 files, this speeds up an
update by 0.6-0.8 seconds, which is up to 25%.
This does not introduce any UI changes. There is existing test coverage for
every case, mostly in test-merge*.t.
Show messages at a point where the actions have been sorted, thus preparing for
backout of 14f4258e3526.
This makes manifestmerge more of a silent operation, just like 'copies' is.
Indent 'preserving' messages to make them subordinate to the action logging so
they fit in the new context. (The 'preserving' messages are quite redundant and
could also be removed completely.)
Preparing for backout of 14f4258e3526.
The number of prompts will for all relevant cases be significantly smaller than
the total number of files in the manifests. We can thus afford to sort the
prompts more than we can afford to sort the manifests.
Add sorted() in places found by testing with PYTHONHASHSEED=random and code
inspection.
An alternative to sprinkling sorted() all over would be to change substate to a
custom dict with sorted iterators...
The 'x' flag and the 'l' flag are very different. It is usually not a problem
to change the 'x' flag of a normal file independent of the content, but one
does not simply change the type of a file to 'l' independent of the content.
This removes the fmerge function that merged both 'x' and 'l' independent of
content early in the merge process. This correctly introduces some conflicts
instead of silent incorrect merges. 3-way flag merge will now be done in the
resolve process, right next to file content merge. Conflicts can thus be
resolved with (slightly inconvenient) resolve commands like 'resolve f --tool
internal:other'. It thus brings us closer to be able to re-solve manifest merge
after the merge and avoid prompts during merge.
This also removes the "conflicting flags for a - (n)one, e(x)ec or sym(l)ink?"
prompt that nobody could answer and that made it easy to mix symlink targets
and file contents up. Instead it will give a file merge where a sufficiently
clever merge tool can help resolving the issue.
The early prescan for move/remove and removal of moved files in applyupdates
was introduced with mergestate 5ff549be1f0c and rendered this chunk of code
irrelevant.
The impact of the chunk was reduced in 0309b0586c65 - but it could have been
removed completely.
Currently the "copy" dict contains both explicit copies/moves made by a
context and pending moves that need to happen because the other context moved
the directory the file was in. For explicit copies, the dict stores a
destination to source map, while for pending moves via directory renames, it
stores a source to destination map. The merge code uses this fact in a non-
obvious way to differentiate between these two cases.
We make this explicit by storing these pending moves in a separate dict. The
dict still has a source to destination map, but that is called out in the
docstring.
This pulls the code used to calculate the changes that need to happen
during merge.update() into a separate function. This is not useful on
its own, but is instead preparatory to performing grafts in memory
when there are no potential conflicts.
Before this patch, case-folding collision is checked simply between
manifests of each merged revisions.
So, files may be considered as colliding each other, even though one
of them is already deleted on one of merged branches: in such case,
merge causes deleting it, so case-folding collision doesn't occur.
This patch checks whether both of files colliding each other still
remain after merge or not, and ignores collision if at least one of
them is deleted by merge.
In the case that one of colliding files is deleted on one of merged
branches and changed on another, file is considered to still remain
after merge, even though it may be deleted by merge, if "deleting" of
it is chosen in "manifestmerge()".
This avoids fail to merge by case-folding collisions after choices
from "changing" and "deleting" of files.
This patch adds only tests for "removed remotely" code paths in
"_remains()", because other ones are tested by existing tests in
"test-casecollision-merge.t".
For divergent renames the following message is printed during merge:
note: possible conflict - file was renamed multiple times to:
newfile
file2
When a file is renamed in one branch and deleted in the other, the file still
exists after a merge. With this change a similar message is printed for mv+rm:
note: possible conflict - file was deleted and renamed to:
newfile
We allow rebase plus collapse, but not collapse only? I imagine people would
rebase first then collapse once they are sure the rebase is correct and it is
the right time to finish it.
I was reluctant to submit this patch for reasons detailed below, but it
improves rebase --collapse usefulness so much it is worth the ugliness.
The fix is ugly because we should be fixing the collapse code path rather than
the merge. Collapsing by merging changesets repeatedly is inefficient compared
to what commit --amend does: commitctx(), update, strip. The problem with the
latter is, to generate the synthetic changeset, copy records are gathered with
copies.pathcopies(). copies.pathcopies() is still implemented with merging in
mind and discards information like file replaced by the copy of another,
criss-cross copies and so forth. I believe this information should not be lost,
even if we decide not to interpret it fully later, at merge time.
The second issue with improving rebase --collapse is the option should not be
there to begin with. Rebasing and collapsing are orthogonal and a dedicated
command would probably enable a better, simpler ui. We should avoid advertizing
rebase --collapse, but with this fix it becomes the best shipped solution to
collapse changesets.
And for the record, available techniques are:
- revert + commit + strip: lose copies
- mq/qfold: repeated patching() (mostly correct, fragile)
- rebase: repeated merges (mostly correct, fragile)
- collapse: revert + tag rewriting wizardry, lose copies
- histedit: repeated patching() (mostly correct, fragile)
- amend: copies.pathcopies() + commitctx() + update + strip
The fix introduced in 3509b9cf8f86 was only partially successful. It is correct
to turn dirstate 'm' merge records into normal/dirty ones but copy records are
lost in the process. To adjust them as well, we need to look in the first
parent manifest to know which files were added and preserve only related
records. But the dirstate does not have access to changesets, the logic has to
moved at another level, in localrepo.
a317664437ee introduced some logic to avoid case-collision detection between
source and destination revisions when it does not make sense: clean or to be
cleaned working directories. Unfortunately, part of it was flawed and the
related test was broken by another bug.
This patch disables cross revision case collision detection for updates without
option or with --check, if the working directory is clean.
if the file in target context causes case-folding collision against
one in working context, current implementation aborts merging with it,
even thouhg collding one (in target) is the file renamed from collided
one (in working).
this patch uses file copy information to know whether colliding file
is renamed from collided one or not: if so, collision between them is
ignored.
this patch also avoids collision detection between current context and
target context, if working context is clean (with --check/-c) or will
be clean (with --clean/-C).
This fixes a regression introduced by df049e784ab6. If no file-level
merge is needed, we can update flags directly, otherwise we have a
conflict to resolve in filemerge.
Previously, we could change a normal file into a corrupt symlink when
trying to merge a symlink flag. Now, we leave the flag alone and let
filemerge deal with it (usually by a prompt).
We also drop a redundant flag setting after filemerge (now dealt with
by ms.resolve) that would cause similar corruption.
The unknown file collision rule was introduced as an extension of the
"should be clean when merging" rule. Unfortunately, it got applied to
the normal update path, which should be happy to merge local changes.
This patch gives us merges for unknown file collisions on update,
while preserving abort for merge and update -c.
This removes use of unknown files for building the synthetic working
directory manifest used by manifestmerge. Instead, we adopt the
strategy used by _checkunknown.
Side-effect: unknown files are no longer moved by remote directory
renames, and now are left alone like ignored files.
Previously, we would do a full working directory walk including
unknown files to perform a merge. In many cases, this was painful
because unknown files greatly outnumbered tracked files and generally
had no useful effect on the merge.
Here we instead wait until we find a file in the destination that's
not tracked locally and detect if it exists and is not ignored. This
is usually cheaper but can be -more- expensive in the case where we're
adding a huge number of files. On the other hand, the cost of statting
the new files should be dwarfed by the cost of eventually writing
them.
In this version, case collisions are detected implicitly by
os.path.exists and wctx[f] lookup.
When doing hg up, if there is a file conflict with untracked files,
currently only the first such conflict is reported. With this patch,
all of them are listed.
With this patch error message is now reported as
a: untracked file differs
b: untracked file differs
abort: untracked files in working directory conflict with files in
requested revision
instead of
abort: untracked file in working directory differs from file in
requested revision: 'a'
This is a follow up to an old attempt to do this here:
http://selenic.com/pipermail/mercurial-devel/2011-August/033625.html
this patch makes branch merging abort when merged changesets have same
file in different case on case insensitive filesystem.
this patch does not prevent linear update which merges between target
and working contexts, because 'branchmerge' is False in such case.
Makes the 'nothing to merge' abort messages in commands.py consistent with
those in merge.py. Also makes commands.merge() and merge.update() use hints.
The tests show the changes.
Some character sets, cp932 (known as Shift-JIS for Japanese) for
example, use 0x41('A') - 0x5A('Z') and 0x61('a') - 0x7A('z') as second
or later character.
In such character set, case collision checking recognizes different
files as CASEFOLDED same file, if filenames are treated as byte
sequence.
win32mbcs extension is not appropriate to handle this problem, because
this problem can occur on other than Windows platform only if
problematic character set is used.
Callers of util.checkcase() use known ASCII filenames as last
component of path, and string.lower() is not applied to directory part
of path. So, util.checkcase() is kept intact, even though it applies
string.lower() to filenames.
merge.update() was missing a few dirtiness checks from workingcontext,
including subrepo cleanliness checks. Using wc.dirty() instead of
one-off checks for various forms of dirtiness will be significantly
safer.
Previously, pull would not update if new branch heads were received,
whereas pull && update would move to the tipmost branch head.
Also change the "crosses branches" abort in merge.update from
"crosses branches (merge branches or use --check to force update)"
to
"crosses branches (merge branches or update --check to force update)"
since it can no longer assume the user is running hg update.
It has substantially different semantics from forget at the command
layer, so change it to avoid confusion.
We can't simply combine it with remove because we need to explicitly
drop non-added files in some cases like commit.
The Mercurial 1.9 release is moving a lot of stuff around anyway and we are
already moving path_auditor from util.py to scmutil.py for that release.
So this seems like a good opportunity to do such a rename. It also strengthens
the current project policy to avoid underbars in names.
These leaks may occur in environments that don't employ a reference
counting GC, i.e. PyPy.
This implies:
- changing opener(...).read() calls to opener.read(...)
- changing opener(...).write() calls to opener.write(...)
- changing open(...).read(...) to util.readfile(...)
- changing open(...).write(...) to util.writefile(...)
This backs out
changeset: 13158:17d1b96c0f12
user: Mads Kiilerich <mads@kiilerich.com>
date: Tue Dec 07 03:29:21 2010 +0100
summary: merge: fast-forward merge with descendant
Before named branches, the invariants were:
a) "merges" always have two parents
b) p1 is not linearly related to p2
Adding named branches made (b) problematic, so the above patch was
introduced, which fixed (b) but broke (a).
After discussion, we decided that the invariants should be:
a) "merges" always have two parents
b) p1 is not linearly related to p2 OR p1 and p2 are on different branches
Add missing calls to close() to many places where files are
opened. Relying on reference counting to catch them soon-ish is not
portable and fails in environments with a proper GC, such as PyPy.
This makes 'hg update --clean' behave the same way for both kinds of
subrepositories. Before Subversion subrepos did not take the clean
parameter into account, but just updated to the given revision and
merged uncommitted changes into that.
issue2538 gives a case where a changeset is merged with its child (which is on
another branch), and to my surprise the result is a real merge with two
parents, not just a "fast forward" "merge" with only the child as parent.
That is essentially the same as issue619.
Is the existing behaviour as intended and correct?
Or is the following fix correct?
Some extra "created new head" pops up with this fix, but it seems to me like
they could be considered correct. The old branch head has been superseeded by
changes on the other branch, and when the changes on the other branch is merged
back to the branch it will introduce a new head not directly related to the
previous branch head.
(I guess the intention with existing behaviour could be to ensure that the
changesets on the branch are directly connected and that no new heads pops up
on merges.)
See the Hg Book on why we actually want to detect this case:
http://hgbook.red-bean.com/read/mercurial-in-daily-use.html#id364290
Before:
$ hg up deadbeef
warning: detected divergent renames of X to:
...
After:
$ hg up deadbeef
note: possible conflict - X was renamed multiple times to:
...
No functionality change.
When using "hg update" to update to a revision on another branch, if
the user has uncommitted changes in the working directory, hg aborts
with the following message:
abort: crosses branches (use 'hg merge' to merge or use 'hg update
-C' to discard changes)
If the user isn't trying to update to tip and they follow the command
examples verbatim, they would end up updating to the wrong revision.
This patch removes the command examples in favor of just telling the
user to either merge or use --clean:
abort: crosses branches (merge branches or use --clean to discard
changes)
hg also aborts if the user tries to use "hg update" to get to tip
(without specifying a revision) and tip is on another branch:
abort: crosses branches (use 'hg merge' or use 'hg update -c')
This message is changed in the same fashion:
abort: crosses branches (merge branches or use --check to force
update)