All versions of Python we support or hope to support make the hash
functions available in the same way under the same name, so we may as
well drop the util forwards.
This message has been overlooked by check-code, because it starts with
non-alphabet character ('%').
This is also a part of preparation for making "missing _() in ui
message" detection of check-code more exact.
Initializing a subrepo when one doesn't exist is the right thing to do when the
parent is being updated, but in few other cases. Unfortunately, there isn't
enough context in the subrepo module to distinguish this case. This same issue
can be caused with other subrepo aware commands, so there is a general issue
here beyond the scope of this fix.
A simpler attempt I tried was to add an '_updating' boolean to localrepo, and
set/clear it around the call to mergemod.update() in hg.updaterepo(). That
mostly worked, but doesn't handle the case where archive will clone the subrepo
if it is missing. (I vaguely recall that there may be other commands that will
clone if needed like this, but certainly not all do. It seems both handy, and a
bit surprising for what should be a read only operation. It might be nice if
all commands did this consistently, but we probably need Angel's subrepo caching
first, to not make a mess of the working directory.)
I originally handled 'Exception' in order to pick up the Aborts raised in
subrepo.state(), but this turns out to be unnecessary because that is called
once and cached by ctx.sub() when iterating the subrepos.
It was suggested in the bug discussion to skip looking at the subrepo links
unless -S is specified. I don't really like that idea because missing a subrepo
or (less likely, but worse) a corrupt .hgsubstate is a problem of the parent
repo when checking out a revision. The -S option seems like a better fit for
functionality that would recurse into each subrepo and do a full verification.
Ultimately, the default value for 'allowcreate' should probably be flipped, but
since the default behavior was to allow creation, this is less risky for now.
CVE-2016-3068 (1/1)
Git's git-remote-ext remote helper provides an ext:: URL scheme that
allows running arbitrary shell commands. This feature allows
implementing simple git smart transports with a single shell shell
command. However, git submodules could clone arbitrary URLs specified
in the .gitmodules file. This was reported as CVE-2015-7545 and fixed
in git v2.6.1.
However, if a user directly clones a malicious ext URL, the git client
will still run arbitrary shell commands.
Mercurial is similarly effected. Mercurial allows specifying git
repositories as subrepositories. Git ext:: URLs can be specified as
Mercurial subrepositories allowing arbitrary shell commands to be run
on `hg clone ...`.
The Mercurial community would like to thank Blake Burkhart for
reporting this issue. The description of the issue is copied from
Blake's report.
This commit changes submodules to pass the GIT_ALLOW_PROTOCOL env
variable to git commands with the same list of allowed protocols that
git submodule is using.
When the GIT_ALLOW_PROTOCOL env variable is already set, we just pass it
to git without modifications.
Git turned on renames by default in commit 5404c11 (diff: activate
diff.renames by default, 2016-02-25). The change is destined for
release in git 2.8.0. The change breaks test-subrepo-git, which test
specifically that a moved file is reported as a removal and an
addition. Fix by passing --no-renames (available in git since mid
2006) to the diff commands that don't use --quiet (should make no
difference for those).
Before this change, warnings were interspersed with (and easily drowned out by)
status messages.
API:
abstractsubrepo.removefiles has an extra argument warnings,
into which callees should append their warnings.
Note: Callees should not assume that there will be items in the list,
today, I'm lazily including any other subrepos warnings, but
that may change.
cmdutil.remove has an extra optional argument warnings,
into which it will place warnings.
If warnings is omitted, warnings will be reported via ui.warn()
as before this change (albeit, after any status messages).
This patch improves the error messages raised when an OSError occurs, since
simply re-raising the exception can be both confusing and misleading. For
example, if "hg identify" is run inside a repository that contains a Git
subrepository and the git binary could not be found, it'll exit with the message
"abort: No such file or directory". That implies "identify" has a problem
reading the repository itself. There's no way for the user to know what the
real problem is unless they dive into the Mercurial source, which is what I
ended up doing after spending hours debugging errors while provisioning a VM
with Ansible (turns out I forgot to install Git on it).
Descriptive errors are especially important on Windows, since it's common for
Windows users to forget to set the "Path" system variable after installing Git.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
This patch uses "wvfs of the parent repository" ('pwvfs') instead of
'wvfs' of own repository, because 'self._path' is the path to this
subrepository as seen from the parent repository.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.
Mass rewrite all instances of the old octal syntax to the new syntax.
This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
This is a step toward replacing the extdiff internals with archive, in order to
support 'extdiff -S'. Only Mercurial subrepos are supported for now.
If a file is missing from the filesystem, it is silently skipped. Perhaps it
should warn, but it cannot abort when working with extdiff because deleting a
file is a legitimate diff.
Some code cannot handle a subrepo based on the working directory (e.g.
sub.dirty()), so the caller must opt in. This will be useful for archive, and
perhaps some other commands. The git and svn methods where this is used may
need to be fixed up on a case by case basis.
While hopefully atypical, there are reasons that a subrepository revision can be
lost that aren't covered by corruption of the .hgsubstate revlog. Such things
can happen when a subrepo is amended, stripped or simply isn't pulled from
upstream because the parent repo revision wasn't updated yet. There's no way to
know if it is an error, but this will find potential problems sooner than when
some random revision is updated.
Until recently, convert made no attempt at rewriting the .hgsubstate file. The
impetuous for this is to verify the conversion of some repositories, and this is
orders of magnitude faster than a bash script from 0..tip that does an
'hg update -C $rev'. But it is equally useful to determine if everything has
been pulled down before taking a thumb drive on the go.
It feels somewhat wrong to leave this out of verifymod (mostly because the file
is already read in there, and the final summary is printed before the subrepos
are checked). But verifymod looks very low level, so importing subrepo stuff
there seems more wrong.
This code is already used in a couple of places, and will need to be used in
others where wdir() support is needed. Trying to get the wdir() revision stored
in self._substate[1] when creating the object in subrepo.subrepo() resulted in
breaking various status and diff tests. This is a far less invasive change.
This will be used in an upcoming patch. It's a one-off use, but seems better to
be contained in the subrepo module, than for the next patch to overwrite the
_ctx and _state fields in another module. '' is used as the default revision in
subrepo.state() if it can't be found, so it seems like a safe choice.
'git diff-index' by default does not recurse into subdirectories
when changes are found. As a result, the directory is shown as changed,
rather than the files in this directory.
Adding '-r' results in recursing until the blobs themselves are checked.
The previous behavior when archiving a subrepo 's' on Windows was to internally
name the file under it 's\file', due to the use of vfs.reljoin(). When printing
the file list from the archive on Windows or Linux, the file was named
's\\file'. The archive extracted OK on Windows, but if the archive was brought
to a Linux system, it created a file named 's\file' instead of a directory 's'
containing 'file'.
*.zip format achives seemed not to have the problem, but this was definitely an
issue with *.tgz archives.
Largefiles actually got this right, but a test is added to keep this from
regressing. The subrepo-deep-nested-change.t test was repurposed to archive to
a file, since there are several subsequent tests that archive to a directory.
The output change is losing the filesystem prefix '../archive_lf' and not
listing the directories 'sub1' and 'sub1/sub2'.
With many commands accepting a '-S' or an explicit path to trigger recursing
into subrepos, it seems that --hidden needs to be propagated too.
Unfortunately, many of the subrepo layer methods discard the options map, so
passing the option along explicitly isn't currently an option. It also isn't
clear if other filtered views need to be propagated, so changing all of those
commands may be insufficient anyway.
The specific jam I got into was amending an ancestor of qbase in a subrepo, and
then evolving. The patch ended up being hidden, and outgoing said it would only
push one unrelated commit. But push aborted with an 'unknown revision' that I
traced back to the patch. (Odd it didn't say 'filtered revision'.) A push with
--hidden worked from the subrepo, but that wasn't possible from the parent repo
before this.
Since the underlying problem doesn't actually require a subrepo, there's
probably more to investigate here in the discovery area. Yes, evolve + mq is
not exactly sane, but I don't know what is seeing the hidden revision.
In lieu of creating a test for the above situation (evolving mq should probably
be blocked), the test here is a marginally useful case where --hidden is needed
in a subrepo: cat'ing a file in a hidden revision. Without this change, cat
aborts with:
$ hg --hidden cat subrepo/a
skipping missing subrepository: subrepo
[1]
When passing a --rev, 'hg incoming -S' previously suffered from the same output
truncation or abort that was fixed for 'hg outgoing -S' in the previous patch,
for the same reasons.
Unlike push, subrepos are currently only pulled when the outer repo is updated,
not when the outer repo is pulled. That makes matching 'hg pull' behavior
impossible. Listing all incoming csets in the subrepo seems like the most
useful behavior, and is consistent with 'hg outgoing -S'.
The previous behavior didn't reflect what would actually be pushed- push will
ignore --rev and --branch in the subrepo and push everything. Therefore,
'push -r {rev}' would not list everything, unless {rev} was also the revision of
the subrepo's tip. Worse, if a hash was passed in, the command would abort
because that hash would either not be in the outer repo or not in the subrepo.
The '' that is used to represent the state of a not-yet-committed
subrepo cannot be written to the file, because the code that parses
the file splits on ' ' and expects two parts.
Given that the .hgsubstate file is automatically rewritten on commit, it seems
a little strange that the file is written out during a merge.
Prior to ef4d6b79dd3a, the subrelpath() (now _relpath) for hgsubrepo was
calculated by removing the root path of the outermost repo from the root path of
the subrepo. Since the root paths use platform specific separators, and the
relative path is printed by various commands, the output of these commands
require a glob (and check-code.py enforces this).
In an effort to be generic to all subrepos, ef4d6b79dd3a started calculating
this path based on the parent repo, and then joining the subrepo path in .hgsub.
One of the tests in test-subrepo.t creates a subrepo inside a directory, so the
path being joined contained '/' instead of '\'. This made the test fail with a
'~' status, because the glob is unnecessary[1]. Removing them made the test
work, but then check-code complains. We can't just drop the check-code rule,
because sub-subrepos are still joined with '\'. Presumably the other subrepo
types have this issue as well, but there likely isn't a test with git or svn
repos inside a subdirectory.
This simply restores the exact _relpath value (and output) for hgsubrepos prior
to ef4d6b79dd3a.
[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068720.html
"dirpath" in the tuple yielded by "vfs.walk()" is relative one from
the root of specified vfs, and absolute path in the warning message is
composed by "vfs.join()".
On the other hand, target file "f" exists in "dirpath", and
"reljoin()" is needed to unlink "f" by "vfs.unlink()".
As a preparation for vfs migration of "_sanitize()", this patch passes
"wvfs" to "_sanitize()" and use "wvfs.base" instead of absolute path
to a subrepository.