Summary: Also change the internal API so it no longer accepts the "heads" argument.
Reviewed By: ryanmce
Differential Revision: D6745865
fbshipit-source-id: 368742be49b192f7630421003552d0a10eb0b76d
# skip-blame because this was mechanically rewritten by the following script. I
ran it on both *.t and *.py, but none of the *.py changes were proper. All *.t
ones appear to be, and they run without additional failures on both Windows and
Linux.
  import argparse
  import os
  import re

  ap = argparse.ArgumentParser()
  ap.add_argument('path', nargs='+')
  opts = ap.parse_args()

  globre = re.compile(r'^(.*) \(glob\)(.*)$')

  for p in opts.path:
      tmp = p + '.tmp'
      with open(p, 'rb') as src, open(tmp, 'wb') as dst:
          for line in src:
              m = globre.match(line)
              if not m or '$LOCALIP' in line or '*' in line:
                  dst.write(line)
                  continue
              if '?' in line[:-3] or ('?' in line[:-3] and line[-3:] != '(?)'):
                  dst.write(line)
                  continue
              dst.write(m.group(1) + m.group(2) + '\n')
      os.unlink(p)
      os.rename(tmp, p)
We change subrepos.allowed from a list of allowed subrepo types to
a combination of a master switch and per-type boolean flag.
The master switch allows subrepos to be disabled wholesale.
If subrepos are globally enabled, then per-type options are
consulted. Mercurial repos are enabled by default. Everything else
is disabled by default.
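The lookup order described above can be sketched roughly as follows. This is
only an illustration: the function name `subrepo_allowed`, the dict-based
flags and the key names are assumptions made for the example, not the actual
option names or code.

```python
# Hypothetical sketch of "master switch + per-type flag" evaluation.
# Key names and defaults mirror the prose above, not a real config schema.
DEFAULTS = {"hg": True}  # Mercurial subrepos on by default, all else off


def subrepo_allowed(kind, master_enabled=True, per_type=None):
    """Return True if a subrepo of type `kind` may be used."""
    if not master_enabled:
        # The master switch disables every subrepo type at once.
        return False
    flags = dict(DEFAULTS)
    flags.update(per_type or {})
    # Unknown or unlisted types fall back to "disabled".
    return flags.get(kind, False)
```

With these assumptions, `subrepo_allowed("git")` is False until a per-type
flag such as `{"git": True}` is supplied.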
This allows us to minimize the behavior change introduced by the next patch.
I have no idea which config style is preferred from a UX point of view, but I
decided to get things done.
a) list: 'allowed = hg, git, svn'
b) sub option: 'allowed.hg = True' or 'allowed:hg = True'
c) per-type action: 'hg = allow', 'git = abort'
This is an alternative workaround for issue5730.
Perhaps this is the simplest way of disabling subrepo operations. It does
nothing clever, but just aborts if Mercurial starts accessing a subrepo.
I think Greg's patch is more useful since it allows us to at least check
out the parent repository. However, that would be confusing if the default
is flipped to checkout=False and subrepos are silently ignored.
I don't like the config name 'allowed', but I couldn't come up with a better name.
Previously, only the top level repo was shared, and then any subrepos were
cloned on demand. This is problematic because commits to the parent repo would
write an updated .hgsubstate to the share source, but the corresponding subrepo
commit would be stuck in the local subrepo. That would prevent an update in the
source repo. We already go to great lengths to avoid having inconsistent repos
(e.g., `hg push -r rev` will push _everything_ in a subrepo, even if it isn't
referenced in one of the parent's outgoing commits). Therefore, this seems like
a bug fix, and there's no option to get the old behavior. I can't imagine the
previous behavior was useful to anybody.
There shouldn't be an issue with svn, since it is centralized. Maybe --git-dir
can be used for git subrepos, but I'll leave that to someone more familiar with
git.
An integer was previously being implicitly returned from commands.share(), which
caused dispatch() to start crashing when changing over to returning the shared
repo. All error paths appear to raise, so this can be hardcoded to success.
The clone command checks for 'is None' in a similar pattern, but since
hg.clone() always returns a tuple, that seems wrong?
.. fix:: Issue 5675
Creating a share of a repository with a Mercurial subrepository will now
share the subrepository.
and
.. bc::
Mercurial subrepositories are now shared instead of cloned when the parent
repository is shared. This prevents dangling subrepository references in the
share source. Previously shared repositories with cloned subrepositories
will continue to function unchanged.
Upon pull or unbundle, we display a message with the range of new revisions
fetched. This revision range can readily be used after a pull to see
what's new with 'hg log'. The algorithm takes care of filtering "obsolete"
revisions that might be present in the transaction's "changes" but should not be
displayed to the end user.
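A rough sketch of the filtering step, under simplifying assumptions:
revisions are plain integers, and the names `new_revs_range`, `added_revs`
and `obsolete_revs` are hypothetical. The real message format may differ.

```python
def new_revs_range(added_revs, obsolete_revs):
    """Return 'min:max' for the visible new revisions, or None if none."""
    # Drop revisions that are obsolete and should not be shown to the user.
    visible = sorted(r for r in added_revs if r not in obsolete_revs)
    if not visible:
        return None
    return "%d:%d" % (visible[0], visible[-1])
```

The resulting string could then be passed to something like 'hg log -r'.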
This vulnerability was fixed by the previous patch, and there were more ways
to exploit it than using '|shellcmd'. So it doesn't make sense to reject only
the pipe character.
Test cases are updated to actually try to exploit the bug. As the SSH bridges
of git/svn subrepos are not managed by our code, the tests for non-hg subrepos
are just removed.
This may be folded into the original patches.
'ssh://' has an exploit that will pass the url blindly to the ssh
command, allowing a malicious person to have a subrepo with
'-oProxyCommand' which could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' is able to execute arbitrary
commands.
When this happens, let's throw a big abort into the user's face so that
they can inspect what's going on.
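The kind of check involved might look like the sketch below. It is an
illustration only: `check_safe_ssh_url` and `UnsafeURL` are hypothetical
names, and the exact rules are assumptions based on the description above.

```python
from urllib.parse import unquote, urlsplit


class UnsafeURL(Exception):
    """Raised when an ssh URL could be misread as a command-line option."""


def check_safe_ssh_url(url):
    """Abort on ssh:// URLs whose host part starts with '-' or has '|'."""
    parts = urlsplit(url)
    if parts.scheme != "ssh":
        return
    # Decode percent-escapes first so '%2D' cannot hide a leading dash.
    netloc = unquote(parts.netloc)
    host = netloc.rsplit("@", 1)[-1]
    if host.startswith("-") or netloc.startswith("-") or "|" in netloc:
        raise UnsafeURL("potentially unsafe url: %r" % url)
```

Aborting outright, rather than trying to sanitize the URL, matches the intent
of letting the user inspect what is going on.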
Well, mostly. The annotation on subrepo functions tacks on a parenthetical to
the abort message, which seems reasonable for a generic mechanism. But now all
messages consistently spell out 'subrepository', and double quote the name of
the repo. I noticed the inconsistency in the change for the last commit.
Currently, when we have multiple heads on the same branch, update tells us that
there are some more heads for the current branch but does not tell us the head to
which the repository has been updated. It makes more sense to show the
head we updated to and then report that there are some more heads.
The extra glob in test-command-template.t caused it to say no result was
reported. It used to be (within the past year) that both this and the missing
glob cases could be fixed simply by editing any output in the test, and
re-running it in interactive mode. But that no longer works, and I had to diff
*.t against *.t.err. I didn't dig into what changed.
The POSIX documentation about "cp" [1] says:
....
RATIONALE
....
Earlier versions of this standard included support for the -r option to
copy file hierarchies. The -r option is historical practice on BSD and
BSD-derived systems. This option is no longer specified by POSIX.1-2008
but may be present in some implementations. The -R option was added as a
close synonym to the -r option, selected for consistency with all other
options in this volume of POSIX.1-2008 that do recursive directory
descent.
The difference between -R and the removed -r option is in the treatment
by cp of file types other than regular and directory. It was
implementation-defined how the -r option treated special files to allow
both historical implementations and those that chose to support -r with
the same abilities as -R defined by this volume of POSIX.1-2008. The
original -r flag, for historic reasons, did not handle special files any
differently from regular files, but always read the file and copied its
contents. This had obvious problems in the presence of special file
types; for example, character devices, FIFOs, and sockets.
....
....
Issue 6
The -r option is marked obsolescent.
....
Issue 7
....
The obsolescent -r option is removed.
....
(No "Issue 8" yet)
Therefore it's clear that "cp -R" is strictly better than "cp -r".
The issue was discovered when running tests on OS X after 2e4d149e62aa.
[1]: pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html
The way the default marker template was defined before this patch,
the spacing before the dash in conflict markers depended on
whether the changeset was the tip or not. This is the relevant part
of the template:
'{ifeq(tags, "tip", "", "{tags} ")}'
If the revision is the tip with no other tags, this would
resolve to an empty string, but for revisions which are not tip
and don't have any other tags, this would resolve to a single
space string. In the end this causes weirdnesses like the ones
you can see in the affected tests.
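The effect can be reproduced with a small sketch. The helper names and the
text surrounding the tags part ("node", a dash, a description) are
assumptions for illustration, not the full marker template.

```python
def old_tags_part(tags):
    """Mimic '{ifeq(tags, "tip", "", "{tags} ")}' for a list of tags."""
    joined = " ".join(tags)
    return "" if joined == "tip" else joined + " "


def marker(node, tags, desc):
    # The pieces are concatenated the way a template would join them.
    return "%s %s- %s" % (node, old_tags_part(tags), desc)
```

Here `marker("abc", ["tip"], "x")` gives a single space before the dash,
while `marker("abc", [], "x")` gives two.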
This is not a big deal, but double spacing may be visually
less pleasant.
Please note that test changes where commit hashes change are
the result of marking files as resolved without removing markers.
The only remaining usage of the experimental config was enforcing bundle2 on.
These are very old remains of when bundle2 was off by default. This was also
useful to highlight the fact that this was a bundle2 run and that a bundle1 one
was nearby. However, we want a future developer working on bundle3 to notice
possible output/behavior changes in these tests and take them into account. So we
do not enforce bundle2 for these runs. We leave a comment around to make sure
developers still notice the bundle1 version.
The new option will stay around. The experimental option was only meant to be
temporary. We update various tests that validate both bundle1 and bundle2
versions side by side. This changeset only takes care of enforcing bundle1. The
other uses of 'experimental.bundle2-exp=True' will be taken care of in the next
changeset.
Previously a subrepository "sub" would cause no warnings to
be issued for a file "subnot/a", if it's not present in the
corresponding changeset when calling:
hg cat subnot/a
A concern around the user experience of Mercurial is users getting stuck on their
own topological branch forever. For example, someone pulls another topological
branch, misses the message in pull asking them to merge, and gets stuck on
their own local branch.
The current way to "address" this concern was for bare 'hg update' to target the
tipmost (also latest pulled) changesets and complain when the update was not
linear. That way, failure to merge newly pulled changesets would result in some
kind of failure.
Yet the failure was quite obscure, did not work in all cases (e.g. commit right
after pull), and the behavior was very impractical in the common case
(e.g. issue4673).
To be able to change that behavior, we need to provide other ways to alert a
user stuck on one of many topological heads. We do so with an extra message after
bare update:
1 other heads for branch "default"
Bookmarks get their own special version:
1 other divergent bookmarks for "foobar"
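As a minimal sketch of how the branch message could be derived (the function
name and arguments are hypothetical, and heads are represented by plain
strings):

```python
def other_heads_message(branch, heads, current):
    """Build the post-update hint, or return None when there is no other head."""
    others = [h for h in heads if h != current]
    if not others:
        return None
    return '%d other heads for branch "%s"' % (len(others), branch)
```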
There is significant room to improve the message itself, and we should augment
it with a hint about how to see these other heads or handle the situation (see
in-line comment). But having "a" message is already a significant improvement
compared to the existing situation. Once we have it, we can iterate on a better
version of it. As having such a message is an important step toward changing the
default destination for update and other niceties, I would like to move forward
quickly on getting such a message in.
This was discussed during London - October 2015 Sprint.
We perform all that we can non-interactively before prompting the user for input
via their merge tool. This allows for a maximally consistent state when the user
is first prompted.
The test output changes indicate the actual behavior change happening.
The current output for a failed merge with conflict markers looks something like:
merging foo
warning: conflicts during merge.
merging foo incomplete! (edit conflicts, then use 'hg resolve --mark')
merging bar
warning: conflicts during merge.
merging bar incomplete! (edit conflicts, then use 'hg resolve --mark')
We're going to change the way merges are done to perform all premerges before
all merges, so that the output above would look like:
merging foo
merging bar
warning: conflicts during merge.
merging foo incomplete! (edit conflicts, then use 'hg resolve --mark')
warning: conflicts during merge.
merging bar incomplete! (edit conflicts, then use 'hg resolve --mark')
The 'warning: conflicts during merge' line has no context, so is pretty
confusing.
This patch will change the future output to:
merging foo
merging bar
warning: conflicts while merging foo! (edit, then use 'hg resolve --mark')
warning: conflicts while merging bar! (edit, then use 'hg resolve --mark')
The hint on how to resolve the conflicts makes this a bit unwieldy, but solving
that is tricky because we already hint that people run 'hg resolve' to retry
unresolved merges. The 'hg resolve --mark' mostly applies to conflict marker
based resolution.
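The ordering that produces the output above can be sketched as a two-pass
loop. This is an illustration under assumptions: `merge_all` and the
`premerge`/`merge` callables (returning True on success) are hypothetical
stand-ins, not the real merge code.

```python
def merge_all(files, premerge, merge):
    """Run every premerge first, then the full merge for what is left."""
    output = []
    pending = []
    for f in files:
        output.append("merging %s" % f)
        if not premerge(f):  # premerge could not resolve the file
            pending.append(f)
    for f in pending:
        if not merge(f):
            # The warning names the file, so it reads fine out of context.
            output.append(
                "warning: conflicts while merging %s! "
                "(edit, then use 'hg resolve --mark')" % f)
    return output
```

Because the warning carries the file name, it stays understandable even
though it is no longer printed right after the matching "merging" line.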
This means that in ms.resolve we must call merge after calling premerge. This
doesn't yet mean that all premerges happen before any merges -- however, this
does get us closer to our goal.
The output differences are because we recompute the merge tool. The only
user-visible difference caused by this patch is that if the tool is missing
we'll print the warning twice. Not a huge deal, though.
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
To detect change of a file without redundant comparison of file
content, dirstate recognizes a file as certainly clean, if:
(1) it is already known as "normal",
(2) dirstate entry for it has valid (= not "-1") timestamp, and
(3) mode, size and timestamp of it on the filesystem are the same as
the ones expected in dirstate
This works as expected in many cases, but doesn't in the corner case
that changing a file keeps mode, size and timestamp of it on the
filesystem.
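The three conditions can be written down as a small predicate. This is a
simplified sketch: the `Entry` and `Stat` records and the function name are
hypothetical, and 'n' stands in for the "normal" state.

```python
from collections import namedtuple

# Hypothetical, simplified records; real dirstate entries carry more data.
Entry = namedtuple("Entry", "state mode size mtime")
Stat = namedtuple("Stat", "mode size mtime")


def certainly_clean(entry, st):
    """Apply conditions (1)-(3): no content comparison is performed."""
    return (entry.state == "n"                    # (1) known as "normal"
            and entry.mtime != -1                 # (2) valid timestamp
            and (entry.mode, entry.size, entry.mtime)
            == (st.mode, st.size, st.mtime))      # (3) stat data matches
```

If a file is rewritten with the same size within the same second, its stat
data is unchanged, so the predicate still returns True even though the
content differs. That is the corner case discussed here.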
The timetable below shows the steps in a typical situation of this kind:
---- ----------------------------------- ----------------
                                         timestamp of "f"
                                         ----------------
                                         dirstate   file-
time          action                     mem  file  system
---- ----------------------------------- ---- ----- -----
N                                              -1    ***
     - make file "f" clean                           N
     - execute 'hg foobar'
       - instantiate 'dirstate'           -1   -1
       - 'dirstate.normal("f")'           N    -1
         (e.g. via dirty check)
       - change "f", but keep size                   N
N+1
       - release wlock
         - 'dirstate.write()'             N    N
     - 'hg status' shows "f" as "clean"   N    N     N
---- ----------------------------------- ---- ----- -----
The most important point is that 'dirstate.write()' is executed at N+1
or later. This causes writing dirstate timestamp N of "f" out
successfully. If it is executed at N, 'parsers.pack_dirstate()'
replaces timestamp N with "-1" before actual writing dirstate out.
An occasional test failure due to unexpected file status is a typical example
of this corner case. Batch execution with a small working directory
finishes in no time, and rarely satisfies condition (2) above.
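The timestamp replacement at write time can be sketched as below. The
function name is hypothetical and this only models the rule stated above,
not the real 'parsers.pack_dirstate()' code.

```python
def packed_mtime(mtime, now):
    """Timestamp that would be written for a "normal" entry.

    An mtime equal to the write time is ambiguous, because the file could
    still change within the same second, so it is stored as -1.
    """
    return -1 if mtime == now else mtime
```

With N = 100, writing at N stores -1, while writing at N+1 stores 100 and so
makes the entry look valid on the next run.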
This issue can occur in the cases below:
- 'hg revert --rev REV' for revisions other than the parent
- failure of 'merge.update()' before 'merge.recordupdates()'
The root cause of this issue is that files are changed without
flushing in-memory dirstate changes via 'repo.commit()' (even though
omitting 'dirstate.normallookup()' on changed files also causes this
issue).
To detect changes of files correctly, this patch writes in-memory
dirstate changes out explicitly after marking files as clean in
'workingctx._checklookup()', which is invoked via 'repo.status()'.
After this change, the timetable becomes:
---- ----------------------------------- ----------------
                                         timestamp of "f"
                                         ----------------
                                         dirstate   file-
time          action                     mem  file  system
---- ----------------------------------- ---- ----- -----
N                                              -1    ***
     - make file "f" clean                           N
     - execute 'hg foobar'
       - instantiate 'dirstate'           -1   -1
       - 'dirstate.normal("f")'           N    -1
         (e.g. via dirty check)
     ----------------------------------- ---- ----- -----
       - 'dirstate.write()'               -1   -1
     ----------------------------------- ---- ----- -----
       - change "f", but keep size                   N
N+1
       - release wlock
         - 'dirstate.write()'             -1   -1
     - 'hg status'                        -1   -1    N
---- ----------------------------------- ---- ----- -----
To reproduce this issue reliably in tests, this patch emulates some
timing-critical actions as below:
- timestamp of "f" in '.hg/dirstate' is -1 at the beginning
  'hg debugrebuildstate' before command invocation ensures it.
- make file "f" clean at N
- change "f" at N
  'touch -t 200001010000' before and after command invocation
  changes mtime of "f" to "2000-01-01 00:00" (= N).
- invoke 'dirstate.write()' via 'repo.status()' at N
  'fakedirstatewritetime.py' forces 'pack_dirstate()' to use
  "2000-01-01 00:00" as "now", only if 'pack_dirstate()' is invoked
  via 'workingctx._checklookup()'.
- invoke 'dirstate.write()' via releasing wlock at N+1 (or "not at N")
  'pack_dirstate()' via releasing wlock uses actual timestamp at
  runtime as "now", and it should be different from the "2000-01-01
  00:00" of "f".
BTW, this patch also changes 'test-largefiles-misc.t', because adding
'dirstate.write()' makes recent dirstate changes visible to external
processes.
Previously, if a subrepo was added in ctx2 and then compared to another without
it (ctx1), the subrepo for ctx2 was returned amongst all of the ctx1 based
subrepos, since no subrepo exists in ctx1 to replace it in the 'subpaths' dict.
The two callers of this, basectx.status() and cmdutil.diffordiffstat(), both
compare the yielded subrepo against ctx2, and thus saw no changes when ctx2's
subrepo was returned. The tests here previously didn't mention 's/a' for the
'p1()' case.
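The previous behavior can be illustrated with a simplified sketch of the
'subpaths' mapping. Plain dicts and the strings "ctx1"/"ctx2" stand in for
the real substate and context objects, and the function name is
hypothetical.

```python
def subpath_contexts(substate1, substate2):
    """Map each subrepo path to the name of the context that supplies it."""
    subpaths = dict.fromkeys(substate2, "ctx2")
    # Entries from ctx1 replace those from ctx2 where both exist.
    subpaths.update(dict.fromkeys(substate1, "ctx1"))
    return subpaths
```

A subrepo that only exists in ctx2 keeps its "ctx2" entry, so a caller that
then compares it against ctx2 sees no changes.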
This appears to have been a known issue, because some diffordiffstat() comments
mention that the subpath disappeared, and "the best we can do is ignore it". I
originally ran into the issue with some custom convert code to flatten a tree of
subrepos causing hg.putcommit() to abort, but this new behavior seems like the
correct status and diff behavior regardless. (The abort in convert isn't
something users will see, because convert doesn't currently support subrepos in
the official repo.)
The phase of the pending commit depends on the parent of the working directory
and on the phases.newcommit configuration.
First, this information belongs on the commit line, which describes the
pending commit.
Second, we only want to be notified when the pending phase is going to be higher
than the default new commit phase.
So the format will change from
$ hg summary
parent: 2:ab91dfabc5ad
foo
parent: 3:24f1031ad244 tip
bar
branch: default
commit: 1 modified, 1 unknown, 1 unresolved (merge)
update: (current)
phases: 1 secret (secret)
to
parent: 2:ab91dfabc5ad
foo
parent: 3:24f1031ad244 tip
bar
branch: default
commit: 1 modified, 1 unknown, 1 unresolved (merge) (secret)
update: (current)
phases: 1 secret
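A sketch of the rule for the commit line follows. It assumes that the pending
phase is the highest of the configured new-commit phase and the parents'
phases, and that "draft" is the default new commit phase. All names here are
hypothetical.

```python
PHASES = ["public", "draft", "secret"]  # ordered from lowest to highest


def pending_phase(parent_phases, newcommit="draft"):
    """Phase a new commit would get: the highest of config and parents."""
    return max([newcommit] + list(parent_phases), key=PHASES.index)


def commit_phase_suffix(parent_phases, newcommit="draft", default="draft"):
    """Return ' (phase)' only when it exceeds the default commit phase."""
    phase = pending_phase(parent_phases, newcommit)
    if PHASES.index(phase) > PHASES.index(default):
        return " (%s)" % phase
    return ""
```

In the merge example above, one secret parent is enough for the commit line
to end with " (secret)".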
We are doing some strange special casing of phase push when:
- the source is a subrepo
- the destination is publishing
- some changesets are still draft on the destination
In that case we do not push phases information (to publish the draft changesets)
because it could break simple cycle of 'clone/pull/push' of subrepos. We have to
detect this case earlier to have bundle2 respecting it.
We change the test to check the behavior for both bundle1 and bundle2.
When the progress extension is not enabled, each call to 'ui.progress' used to
issue a debug message. This results in very verbose and often redundant output
in tests. Dropping it makes tests less sensitive to factors they are not meant to
test.
We had to alter the sed trick in 'test-rename-merge2.t'. Sed is used to drop all
output from a certain point, and hiding the progress output removes its anchor.
So we anchor on something else.
The numbers of draft and secret changesets are currently not summarized.
This is important information because the number of drafts gives some rough
idea of the number of outgoing changesets in typical workflows, without needing
to probe a remote repository. And a non-zero number of secrets means that
those changesets will not be pushed.
If the repository is "dirty" - some draft or secret changesets exist - then
summary will display a line like:
phases: X draft, Y secret (public)
The phase in parenthesis corresponds to the highest phase of the parents of
the working directory, i.e. the current phase.
By default, the line is not printed if the repository is "clean" - all
changesets are public - but if verbose is activated, it will display:
phases: (public)
On the other hand, nothing will be printed if quiet is in action.
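The display rules can be summarized in a short sketch. The function name and
signature are hypothetical, and omitting zero counts is an assumption.

```python
def phases_line(draft, secret, current, verbose=False, quiet=False):
    """Return the 'phases:' summary line, or None when nothing is shown."""
    if quiet:
        return None
    counts = []
    if draft:
        counts.append("%d draft" % draft)
    if secret:
        counts.append("%d secret" % secret)
    if counts:
        # "Dirty" repository: show the counts and the current phase.
        return "phases: %s (%s)" % (", ".join(counts), current)
    if verbose:
        # "Clean" repository: only shown on request.
        return "phases: (%s)" % current
    return None
```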
A few tests have been added in test-phases.t to cover the -v and -q cases.
When passing a --rev, 'hg incoming -S' previously suffered from the same output
truncation or abort that was fixed for 'hg outgoing -S' in the previous patch,
for the same reasons.
Unlike push, subrepos are currently only pulled when the outer repo is updated,
not when the outer repo is pulled. That makes matching 'hg pull' behavior
impossible. Listing all incoming csets in the subrepo seems like the most
useful behavior, and is consistent with 'hg outgoing -S'.
The previous behavior didn't reflect what would actually be pushed: push will
ignore --rev and --branch in the subrepo and push everything. Therefore,
'push -r {rev}' would not list everything, unless {rev} was also the revision of
the subrepo's tip. Worse, if a hash was passed in, the command would abort
because that hash would either not be in the outer repo or not in the subrepo.
The '' that is used to represent the state of a not-yet-committed
subrepo cannot be written to the file, because the code that parses
the file splits on ' ' and expects two parts.
Given that the .hgsubstate file is automatically rewritten on commit, it seems
a little strange that the file is written out during a merge.
This patch adds "dirtyreason()" to centralize composing the dirty
reason message, like "uncommitted changes in subrepository 'xxxx'".
There are 3 similar messages below, and this patch is also part of the
preparations for unifying them into (1).
1. uncommitted changes in subrepository 'XXXX'
2. uncommitted changes in subrepository XXXX
3. uncommitted changes in subrepo XXXX
This patch chooses adding a new method "dirtyreason()" instead of making
"dirty()" return a "reason string", because:
- some of the existing "dirty()" implementations are too complicated to do
so simply, and
- ill-mannered 3rd party subrepo classes, whose "dirty()" doesn't
return a "reason string", would cause meaningless messages (even though this
is a rare case)
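The split between the two methods might look like this minimal sketch.
`ExampleSubrepo` is a hypothetical stand-in, not one of the real subrepo
classes, and the signatures are simplified.

```python
class ExampleSubrepo(object):
    """Hypothetical stand-in for a subrepo class, for illustration only."""

    def __init__(self, path, is_dirty):
        self._path = path
        self._is_dirty = is_dirty

    def dirty(self):
        # Keeps its original boolean contract.
        return self._is_dirty

    def dirtyreason(self):
        """Return a message describing why the subrepo is dirty, or None."""
        if self.dirty():
            return "uncommitted changes in subrepository '%s'" % self._path
        return None
```

Keeping "dirty()" boolean means third-party classes that only implement
"dirty()" keep working, while callers get one consistent message from
"dirtyreason()".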
Paths into the subrepo are not yet supported.
The need to use the workingctx in the subrepo will likely be used more in the
future, with the proposed working directory revset symbol. It is also needed
with archive, if that code is to be reused to support 'extdiff -S'.
Unfortunately, it doesn't seem possible to put the smarts in subrepo.subrepo(),
as it breaks various status and diff tests.
I opted not to pass the desired revision into the subrepo method explicitly,
because the only ones that do pass an explicit revision are methods like status
and diff, which actually operate on two contexts: the subrepo state and the
explicitly passed revision.
There are currently no tests for change/remove conflicts of subrepos,
and it's pretty broken. Add some tests demonstrating some of the
breakages and fix the most obvious one (a KeyError when trying to look
up a subrepo in the wrong context).
Before this patch, a failure while updating subrepos could leave
".hgsubstate" in an inconsistent state. For example:
1. the dirstate entry for ".hgsubstate" of the parent repo is filled
   with valid size/date (via "hg status" or so)
2. "hg update" is invoked at the parent repo
3. ".hgsubstate" of the parent repo is updated on the filesystem as
   a part of the "g"(et) action in "merge.applyupdates"
4. it is assumed that size/date of ".hgsubstate" on the filesystem
   aren't changed from the ones at (1)
   this condition is not hard to meet, because just changing hash
   ids (all ids have the same length) in ".hgsubstate" doesn't
   change its file size
5. "subrepo.submerge()" is invoked to update subrepos
6. a failure while updating one of the subrepos raises an exception
   (e.g. "untracked file differs")
7. "hg update" is aborted without updating the dirstate of the parent repo
   the dirstate entry for ".hgsubstate" still holds size/date from (1)
Then, ".hgsubstate" of the parent repo is treated as "CLEAN"
unexpectedly, because updating ".hgsubstate" at (3) doesn't change
size/date of it on the filesystem: see assumption at (4).
This inconsistent ".hgsubstate" status causes unexpected behavior, for
example:
- "hg revert" forgets to revert ".hgsubstate"
- "hg update" misunderstands that (not yet updated) subrepos diverge
(then, it shows the prompt to confirm user's decision)
To avoid inconsistent ".hgsubstate" status above, this patch marks
".hgsubstate" as possibly dirty before "submerge" invocation.
"normallookup"-ed (= dirty) dirstate should be written out, even if
processing is aborted by failure.
This patch marks ".hgsubstate" as possibly dirty before "submerge",
also when it is removed or merged while merging, for safety. This
should prevent Mercurial from misunderstanding inconsistent
".hgsubstate" as clean.
To satisfy conditions at (1) and (4) above, this patch uses "hg status
--config debug.dirstate.delaywrite=2" (to fill valid size/date into
dirstate) and "touch" (to fix date of the file).