Before this patch, the result of "status()" on "workingcommitctx" may
incorrectly contain files other than ones to be committed, because
"workingctx._dirstatestatus()" returns the result of
"dirstate.status()" directly.
For correct matching, this patch overrides "_dirstatestatus" in
"workingcommitctx" and makes it return matched files only in
"self._status".
This patch uses empty list for "deleted", "unknown" and "ignored" of
status, because status between "changectx"s also makes them empty.
Before this patch, "status()" on "workingcommitctx" with "always
match" object causes breaking "self._status" in
"workingctx._buildstatus()", because "workingctx._buildstatus()"
caches the result of "dirstate.status()" into "self._status" for
efficiency, even though it should be fixed at construction for
committing.
For example, template function "diff()" without any patterns in
"committemplate" implies "status()" on "workingcommitctx" with "always
match" object, via "basectx.diff()" and "patch.diff()".
Then, broken "self._status" causes committing unexpected files.
To avoid breaking already fixed "self._status" at "ctx.status()", this
patch overrides "_buildstatus" in "workingcommitctx".
This patch doesn't write out the result of template function "diff()"
in "committemplate" in "test-commit.t", because matching against files
to be committed still has an issue fixed in subsequent patch.
Before this patch, "workingctx.status" caches the result of
"dirstate.status" directly into "self._status".
But "dirstate.status" is invoked with False "list*" arguments in
normal "self._status" accessing route, and this makes
"unknown"/"ignored"/"clean" of status empty.
This may cause unexpected result of code paths internally accessing to
them (accessors for external usage are already removed by previous patch).
This patch makes "unknown"/"ignored"/"clean" of cached status empty
for equivalence. Making them empty is executed only when at least one
of "unknown", "ignored" or "clean" has files, for efficiency.
d20af8be6a14 introduced a significant performance regression: All largefiles
were marked 'normallookup' by linear (or noop) updates and had to be rehashed
by the next command.
The previous change introduced a different solution to the problem d20af8be6a14
solved and we can thus back it out again.
d20af8be6a14 introduced a significant performance regression: All largefiles
are marked 'normallookup' in lfdirstate by linear (or noop) updates and has to
be rehashed by the next command.
To avoid such regressions, keep an eye on the dirstate content after a plain
'hg up'.
The annotate logic now use the new 'introrev' method to bootstrap its traversal.
This catches issues from linkrev-shadowing of the changeset introducing the
version of a file in source changeset.
More tests have been added to display pathological cases.
The follow revset (used by `hg log --follow`) now uses the new 'introrev'
method to bootstrap its traversal. This catches issues from linkrev-shadowing of
the changesets introducing the version of a file in source changeset.
A new test has been added to display pathological cases.
Another test is affected because it was meant to test this behavior but actually
failed to do so for half of the output. The output are now similar.
Because of the way filenodes are computed, you can have multiple changesets
"introducing" the same file revision. For example, in the changeset graph
below, changeset 2 and 3 both change a file -to- and -from- the same content.
o 3: content = new
|
| o 2: content = new
|/
o 1: content = old
In such cases, the file revision is create once, when 2 is added, and just reused
for 3. So the file change in '3' (from "old" to "new)" has no linkrev pointing
to it). We'll call this situation "linkrev-shadowing". As the linkrev is used for
optimization purposes when walking a file history, the linkrev-shadowing
results in an unexpected jump to another branch during such a walk.. This leads to
multiple bugs with log, annotate and rename detection.
One element to fix such bugs is to ensure that walking the file history sticks on
the same topology as the changeset's history. For this purpose, we extend the
logic in 'basefilectx.parents' so that it always defines the proper changeset
to associate the parent file revision with. This "proper" changeset has to be an
ancestor of the changeset associated with the child file revision.
This logic is performed in the '_adjustlinkrev' function. This function is
given the starting changeset and all the information regarding the parent file
revision. If the linkrev for the file revision is an ancestor of the starting
changeset, the linkrev is valid and will be used. If it is not, we detected a
topological jump caused by linkrev shadowing, we are going to walk the
ancestors of the starting changeset until we find one setting the file to the
revision we are trying to create.
The performance impact appears acceptable:
- We are walking the changelog once for each filelog traversal (as there should
be no overlap between searches),
- changelog traversal itself is fairly cheap, compared to what is likely going
to be perform on the result on the filelog traversal,
- We only touch the manifest for ancestors touching the file, And such
changesets are likely to be the one introducing the file. (except in
pathological cases involving merge),
- We use manifest diff instead of full manifest unpacking to check manifest
content, so it does not involve applying multiple diffs in most case.
- linkrev shadowing is not the common case.
Tests for fixed issues in log, annotate and rename detection have been
added.
But this changeset does not solve all problems. It fixes -ancestry-
computation, but if the linkrev-shadowed changesets is the starting one, we'll
still get things wrong. We'll have to fix the bootstrapping of such operations
in a later changeset. Also, the usage of `hg log FILE` without --follow still
has issues with linkrev pointing to hidden changesets, because it relies on the
`filelog` revset which implement its own traversal logic that is still to be
fixed.
Thanks goes to:
- Matt Mackall: for nudging me in the right direction
- Julien Cristau and Rémi Cardona: for keep telling me linkrev bug were an
evolution show stopper for 3 years.
- Durham Goode: for finding a new linkrev issue every few weeks
- Mads Kiilerich: for that last rename bug who raise this topic over my
anoyance limit.
Before this patch, "workingctx.status" always replaces "self._status"
by the recent result, even though:
- status isn't calculated against the parent of the working directory, or
- specified "match" isn't "always" one
(status is only visible partially)
If "workingctx" object is shared between some procedures indirectly
referring "ctx._status", this incorrect caching may cause unexpected
result: for example, "ctx._status" is used via "manifest()", "files()"
and so on.
To cache "self._status" correctly at "workingctx.status", this patch
overwrites "self._status" in "workingctx._buildstatus" only when:
- status is calculated against the parent of the working directory, and
- specified "match" is "always" one
This patch can be applied (and effective) only on default branch,
because procedure around "basectx.status" is much different between
stable and default: for example, overwriting "self._status" itself is
executed not in "workingctx._buildstatus" but in
"workingctx._poststatus", on stable branch.
This module depends on _winreg, which is windows-only. Recent versions
of setuptools load distutils.msvc9compiler and expect it to
ImportError immediately when on non-Windows platforms, so we need to
let them do that. This breaks in an especially mystifying way, because
setuptools uses vars() on the imported module. We then throw an
exception, which vars doesn't pick up on well. For example:
In [3]: class wat(object):
...: @property
...: def __dict__(self):
...: assert False
...:
In [4]: vars(wat())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-2781ada5ffe6> in <module>()
----> 1 vars(wat())
TypeError: vars() argument must have __dict__ attribute
Which is similar to the problem we run into.
If an uncommitted and deleted file was forgotten, a warning would be emitted,
even though the operation was successful. See the previous patch for
'remove -A' for the exact circumstances, and details about the cause.
The bug report doesn't mention largefiles, but the given recipe doesn't fail
unless the largefiles extension is loaded. The problem only affected normal
files, whether or not any largefiles are committed, and only files that have
not been committed yet. (Files with an 'a' state are dropped from dirstate,
not marked removed.) Further, if the named normal file never existed, the
warning would be printed out twice.
The problem is that the core implementation of remove() calls repo.status(),
which eventually triggers a dirstate.walk(). When the file isn't seen in the
filesystem during the walk, the exception handling finds the file in
dirstate, so it doesn't complain. However, the largefiles implementation
called status() again with all of the original files (including the normal
ones, just dropped). This time, the exception handler doesn't find the file
in dirstate and does complain. This simply excludes the normal files from
the second repo.status() call, which the largefiles extension has no interest
is processing anyway.
The log/graphlog revset was not producing stable results since it was
iterating over a dict. Now we sort before iterating to guarantee a fixed order.
This fixes some potential flakiness in the tests.
The revset created when -f was used with a slow path (for patterns and
directories) did not actually contain any logic to enforce follow. Instead it
was depending on the passed in subset to already be limited (which was limited
to :. but not ::.). This fixes it by adding a '& ::.' to any -f log revset.
hg log -f <file> is still broken, in that it can return results that aren't
actually ancestors of the current file, but fixing that has major perf
implications, so we'll deal with it later.
The full path is propagated to the original match object since this is often
used directly for printing a file name to the user. This is cleaner than
requiring each caller to join the prefix with the file name prior to calling it,
and will lead to not having to pass the prefix around separately. It is also
consistent with the bad() and abs() methods in terms of the required input. The
uipath() method now inherits this path building property.
There is no visible change in path style for rel() because it ultimately calls
util.pathto(), which returns an os.sep based path. (The previous os.path.join()
was violating the documented usage of util.pathto(), that its third parameter be
'/' separated.) The doctest needed to be normalized to '/' separators to avoid
test differences on Windows, now that a full path is returned for a short
filename.
The test changes are to drop globs that are no longer necessary when printing an
absolute file in a subrepo, as returned from match.uipath(). Previously when
os.path.join() was used to add the prefix, the absolute path to a file in a
subrepo was printed with a mix of '/' and '\'. The absolute path for a file not
in a subrepo, as returned from match.uipath(), is still purely '/' based.
This patch makes "posix.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety, like as the
previous patch for "windows.shellquote".
In fact, on POSIX environment, quoting itself doesn't cause issues
like issue4463. But (almost) equivalent quoting policy can avoid
examining test result differently on POSIX and Windows (even though
showing command line with "%r" causes such examination in
"test-extdiff.t").
The last hunk for "test-extdiff.t" in this patch isn't needed for the
previous patch for "windows.shellquote", because the code path of it
is executed only "#if execbit" (= avoided on Windows).
Before this patch, "windows.shellquote" (as used as "util.shellquote")
always quotes specified strings with double quotation marks, for
external process invocation.
But some problematic applications can't work correctly, when command
line arguments are quoted: see issue4463 for detail.
On the other hand, quoting itself is needed to specify arguments
containing whitespaces and/or some special characters exactly.
This patch makes "windows.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety.
Before this patch, all command line arguments for external tools are
quoted by the combination of "shlex.split" and "util.shellquote". But
this causes some problems.
- some problematic commands can't work correctly with quoted arguments
For example, 'WinMerge /r ....' is OK, but 'WinMerge "/r" ....' is
NG. See also below for detail about this problem.
https://bitbucket.org/tortoisehg/thg/issue/3978/
- quoting itself may change semantics of arguments
For example, when the environment variable CONCAT="foo bar baz':
- mydiff $CONCAT => mydiff foo bar baz (taking 3 arguments)
- mydiff "$CONCAT" => mydiff "foo bar baz" (taking only 1 argument)
For another example, single quoting (= "util.shellquote") on POSIX
environment prevents shells from expanding environment variables,
tilde, and so on:
- mydiff "$HOME" => mydiff /home/foobar
- mydiff '$HOME' => mydiff $HOME
- "shlex.split" can't handle some special characters correctly
It just splits specified command line by whitespaces.
For example, "echo foo;echo bar" is split into ["echo",
"foo;echo", "bar"].
On the other hand, if quoting itself is omitted, users can't specify
options including space characters with "--option" at runtime.
The root cause of this issue is that "shlex.split + util.shellquote"
combination loses whether users really want to quote each command line
elements or not, even though these can be quoted arbitrarily in
configurations.
To resolve this problem, this patch does:
- prevent configurations from being processed by "shlex.split" and
"util.shellquote"
only (possibly) "findexe"-ed or "findexternaltool"-ed command path
is "util.shellquote", because it may contain whitespaces.
- quote options specified by "--option" via command line at runtime
This patch also makes "dodiff()" take only one "args" argument instead
of "diffcmd" and "diffopts. It also omits applying "util.shellquote"
on "args", because "args" should be already stringified in "extdiff()"
and "mydiff()".
The last hunk for "test-extdiff.t" replaces two whitespaces by single
whitespace, because change of "' '.join()" logic causes omitting
redundant whitespaces.
Previously, revert was only possible if the '--no-backup'
switch was specified.
Now, to support backups, we explicitly go over all modified
files in the subrepo.
In the body of the loop in trydiff(), there are conditions like:
addedset or (f in modifiedset and to is None)
The second half of that expression is to account for the fact that
merge-in additions appear as additions. By instead fixing up the sets
of modified and added files to compensate for this fact, we can
simplify the body of the loop. It also fixes one case where the
addedset was checked without the additional check (the "have we
already reported a copy above?" case in the code, also see fixed test
case).
The similar condition with 'removedset' in it seems to have served no
purpose even before this change, so it could have been simplified even
before.
There is no --after for addremove, so the printing for addremove can be hoisted
out of the 'not after' check. The difference between the two remove messages
reflects the existing difference between core remove and core addremove styles
for printing the file.
There are still some pre-existing issues here. Core addremove only prints on
inexact matches or when verbose. But since the largefiles that are being
removed are passed to removelargefiles() as a pattern list, there is never an
inexact match, which would keep the largefiles from being printed at all unless
verbose is specified. Therefore, the output is a little more aggressive than
core. The addremove print style here is also inconsistent with core- it should
use matcher.uipath(f) instead of f. These can be fixed once a matcher is passed
in.
When a directory was renamed and a new untracked file was added in the
new directory and the remote directory added a file by the same name
in the old directory, the local untracked file gets overwritten, as
demonstrated by the broken test case in test-rename-dir-merge.
Fix by checking for unknown files for 'dg' actions too. Since
_checkunknownfile() currently expects the same filename in both
contexts, we need to add a new parameter for the remote filename to
it.
The pushkey operation used to be in its own wireprotocol command and (in
practice) always be lock free when running the hook. With bundle2, it happen in
a greater scheme and a hook running locking command would get stuck. We now run
such hooks after the lock release as similar hook do.
Bundle2 test are altered to ensure we are lockfree at hook running time.
This is different from latesttagdistance in that while latesttagdistance is
defined to be the length of the longest path to the latest tag,
changessincelatesttag is the number of changes contained in @ that aren't
contained in the latest tag. So, if 't' is the latest tag in the repository
below:
t
|
v
--o--o----o
\ \
..o..o..@
then latesttagdistance is 2, but changessincelatesttag is 4.
Note that changessincelatesttag is always greater than or equal to the
latesttagdistance -- that's because changessincelatesttag counts all the
changes in the longest path since the latest tag, and possibly others. This is
an important fact that we'll take advantage of in upcoming patches.
By moving the conversion from the file->action dict after the bid
merge code, bid merge can be simplified a little.
A few tests are affected by this change. Where we used to iterate over
the actions first in order of the action type ('g', 'r', etc.) [1], we
now iterate in order of filename. This difference affects the order of
debug log statements.
[1] And then in the non-deterministic order of files in the manifest
dictionary (the order returned from manifest.diff()).
This patch makes bundrepo retract the phase boundary for new commits to 'draft'
status, which is consistent with the behavior of 'hg unbundle'. The old
behavior was for commits to appear with the same phase as their nearest
ancestor in the base repository.
This affects several classes of operation:
* Inspecting a bundle with the -B flag
* Treating a bundle file as a peer (old: everything public, new: everything draft)
* Incoming command (neither old or new behavior is sensible -- fixed in next patch)
Previously these would be considered to be relative to the current working
directory. That behavior is both undocumented and doesn't really make sense.
There are two reasonable options for how to resolve relative paths:
- relative to the repo root
- relative to the config file
Resolving these files relative to the repo root matches existing behavior with
hooks. An earlier discussion about this is available at
http://mercurial.markmail.org/thread/tvu7yhzsiywgkjzl.
Thanks to Isaac Jurado <diptongo@gmail.com> for the initial patchset that
spurred the discussion.
The test case doesn't only check the commit message, but also the
patch, which can result in confusing output like
+ Revision df6f06d17100 does not comply to commit message rules
+ ------------------------------------------------------
+ 32: adds double empty line
+
+
even when there are no double blank lines in the commit message. Drop
the "commit message" part to make it less confusing.
Merged files are considered modified at commit time even if only 1 parent
differs. In this case we must use the change context of this parent for
expansion.
The issue went unnoticed for long because it is only apparent until the next
update to the merge revision - except in test-keyword where it was always
staring us in the face.
Mercurial backout command makes a commmit by default only when the backed out
revision is the parent of working directory and doesn't commit in any other
case.
The --commit option changes behaviour of backout to make a commit whenever
possible (i.e. there is no unresolved conflicts). This behaviour seems more
intuitive to many use (especially git users migrating to hg).
This patch adds the -B/--bookmarks option to the share command added by the
share extension. All it does for now is create a marker, 'bookmarks.shared',
that will be used by future code to implement the sharing functionality.
When looking for untracked files that would conflict with a tracked
file in the target revision (or the remote side of a merge), we
explcitly exclude ignored files. The code was added in f1db75422e70
(merge: refactor unknown file conflict checking, 2012-02-09), but it
seems like only unknown, not ignored, files were considered since the
beginning of time.
Although ignored files are mostly build outputs and backup files, we
should still not overwrite them. Fix by simply removing the explicit
check.