Commit Graph

8011 Commits

Author SHA1 Message Date
Sean Farley
1debeb5a8a namespaces: add test for log
Now that we have enough features in the namespaces api, we add a test for it.
2014-10-21 19:49:23 -07:00
Sean Farley
e1121ac0ac tests: add a i18n translation test for log output
Upcoming patches will change the way that log output is generated so we add a
test to ensure that the words 'branches', 'bookmarks', and 'tags' are still
translated.
2014-12-31 16:50:19 -06:00
Matt Harbison
863f720320 largefiles: don't print files as both large and normal in addremove dryruns 2014-12-31 18:39:41 -05:00
Matt Harbison
f54b3d5591 largefiles: align the output messages for an added file with core methods
Core addremove prints the file relative to cwd only if patterns are provided to
the command.  Core add always prints relative to cwd.  Also, both methods print
the subrepo prefix when needed.  The 'already a largefile' doesn't have an
analog in core, but follows the same rules for consistency.
2014-11-28 21:44:41 -05:00
Matt Harbison
fe27be6419 largefiles: align the output messages for a removed file with core methods
Both cmdutil.remove() and scmutil.addremove() require verbose mode or an inexact
match to print the filename.  Core addremove also prints the file relative to
cwd only if patterns are provided to the command.  And finally, both methods
print the subrepo prefix when needed.
2014-11-28 21:03:44 -05:00
Sean Farley
2a4b30c27c revset: use '%' as an operator for 'only'
With this patch, we can make it much easier to specify 'only(A,B)' ->
A%B. Similarly, 'only(A)' -> A%.

On Windows, '%' is a semi-reserved symbol in the following way: using non-bash
shells (e.g. cmd.exe but NOT PowerShell, ConEmu, and cmder), %var% is only
expanded when 'var' exists and is surrounded by '%'.

That only leaves batch scripts which could prove to be problematic. I posit
that this isn't a big issue because any developer of batch scripts already
knows that to use '%' one needs to escape it by using a double '%%'.

Alternatives to '%' could be '=' but that might be limiting our future if we
ever decide to use temporary assignments in a revset.
2014-11-06 14:55:18 -08:00
Sean Farley
dc331facee debugnamecomplete: rename from debuglabelcomplete
Now that we have decided on the use of 'name' instead of 'label' we rename this
function accordingly.

The old method 'debuglabelcomplete' has been left as a deprecated command so
that current scripts don't break.
2014-10-17 13:41:29 -07:00
Martin von Zweigbergk
f2bc0a75a2 test-bundle2-exchange: create temp script in $TESTTMP, not $TESTDIR
The bundle2-pushkey-hook.sh script is currently created in $TESTTMP,
and leaves an untracked file in that directory (tests/) after running.
2015-01-07 14:30:40 -08:00
Mads Kiilerich
7167031347 localrepo: show headline notes in commitctx before showing filenames
commitctx already showed notes with filenames but didn't provide any context.
It is just as relevant to know when manifest or changelog is committed.

So, in addition to filenames, also show headlines 'committing files:',
'committing manifest' and 'committing changelog'.
2014-04-18 13:33:20 +02:00
Mads Kiilerich
af8710d713 bundle: when verbose, show what takes up the space in the generated bundle
This is kind of similar to the debugbundle command but gives summarized actual
uncompressed number of bytes when creating the bundle. The numbers are as
usable as the bundle format is efficient. Hopefully bundle2 will make it a
better indicator of actual entropy.

This is useful when accepting pull requests to assess whether the repo size
increase seems reasonable for the diff before pushing stuff upstream, It has
helped me catching large files that should have been committed as largefiles
but was committed as regular files in intermediate changesets.

This output doesn't combine well with debug output so we only enable it when
verbose without debug.
2014-08-15 19:43:32 +02:00
Gregory Szorc
49e88a6464 templates: use CSS classes for diff styling
Use of inline style for diff styling led to significant browser memory
usage on large diffs. Moving the styling into CSS classes corrects this.

This patch is based on work from
https://bugzilla.mozilla.org/show_bug.cgi?id=766952
and
https://hg.mozilla.org/hgcustom/version-control-tools/rev/2c355a580af6
2015-01-06 15:29:02 -08:00
FUJIWARA Katsunori
fa0f66b4fb revset: introduce new operator "##" to concatenate strings/symbols at runtime
Before this patch, there is no way to concatenate strings at runtime.

For example, to search for the issue ID "1234" in descriptions against
all of "issue 1234", "issue:1234", issue1234" and "bug(1234)"
patterns, the revset below should be written fully from scratch for
each issue ID.

    grep(r"\bissue[ :]?1234\b|\bbug\(1234\)")

This patch introduces new infix operator "##" to concatenate
strings/symbols at runtime. Operator symbol "##" comes from the same
one of C pre-processor. This concatenation allows parametrizing a part
of strings in revset queries.

In the case of example above, the definition of the revset alias using
operator "##" below can search issue ID "1234" in complicated patterns
by "issue(1234)" simply:

    issue($1) = grep(r"\bissue[ :]?" ## $1 ## r"\b|\bbug\(" ## $1 ## r"\)")

"##" operator does:

  - concatenate not only strings but also symbols into the string

    Exact distinction between strings and symbols seems not to be
    convenience, because it is tiresome for users (and
    "revset.getstring" treats both similarly)

    For example of revset alias "issue()", "issue(1234)" is easier
    than "issue('1234')".

  - have higher priority than any other prefix, infix and postfix
    operators (like as "##" of C pre-processor)

    This patch (re-)assigns the priority 20 to "##", and 21 to "(",
    because priority 19 is already assigned to "-" as prefix "negate".
2015-01-06 23:46:18 +09:00
Matt Harbison
8f0f0a4fa4 largefiles: pass a matcher instead of a raw file list to removelargefiles()
This is consistent with addlargefiles(), and will make it easier to get the
paths that are printed correct when recursing into subrepos or invoking from
outside the repository.  It also now restricts the path that the addremove is
performed on if a path is given, as is done with normal files.

The repo.status() call needs to exclude clean files when performing an
addremove, because the addremove override method calling this used to pass the
list of files to delete, which caused the matcher to only consider those files
in building the status list.  Now the matcher is restricted only to the extent
that the caller requested- usually directories if at all.  There's no reason for
addremove to care about clean files anyway- we don't want them deleted.
2014-11-28 19:50:52 -05:00
Matt Mackall
80c883c82a merge with stable 2015-01-06 18:18:28 -06:00
Gregory Szorc
73fa937531 cmdutil.jsonchangeset: properly compute added and removed files
jsonchangeset._show() was computing the reverse status of the current
changeset. As a result, added files were showing up as removed and
removed files were showing up as adds.

There were existing tests for this code and they were flat out wrong.
2015-01-05 22:18:55 -08:00
Matt Harbison
b391094100 largefiles: properly sync lfdirstate after removing largefiles
The more aggressive synchronization of lfdirstate that was backed out in
4fe80f20ab06 masked the problem where lfdirstate would hold an 'R' for a
largefile that was added and then removed without a commit between.  We could
just conditionally call lfdirstate.drop() or lfdirstate.remove() here, but this
also properly updates lfdirstate if the standin doesn't exist for the file
somehow (i.e. call drop instead of remove).

Without this change, the precommit status in the commit command immediately
after the test change lists the removed (and never committed) largefile as 'R'.
It can also lead to situations where the status command reports the same, long
after the commit [1].

[1] http://www.selenic.com/pipermail/mercurial-devel/2015-January/065153.html
2015-01-04 15:26:26 -05:00
Martin von Zweigbergk
c8aa337d29 status: don't list files as both clean and deleted
Tracked files that are deleted should always be reported as such, no
matter what their state was in earlier revisions. This is encoded in
in two conditions in the loop in basectx._buildstatus() for modified
and added files, but the check is missing for clean files. We should
check for clean files too, but instead of adding the check in a third
place, move it earlier and skip most of the loop body for deleted
files.
2015-01-05 17:12:04 -08:00
Martin von Zweigbergk
b1f3f94c3b status: don't list files as both removed and deleted
When calculating status involving the working copy and a revision
other than the parent of the working copy, the files that are not in
the working context manifest ('mf2' in the basectx._buildstatus())
will be reported as removed (note that deleted files _are_ in the
working context manifest). However, if the file is reported as deleted
in the dirstate, it will get that status too (as shown by failing
tests).

Fix by removing deleted files from the 'removed' list after the main
loop in _buildstatus().
2015-01-05 16:52:12 -08:00
Pierre-Yves David
1ed0c1e70a revset-filelog: handle hidden linkrev for file missing for head (issue4490)
The fix for linkrev pointing to hidden revision was crashing when the file was
missing from head's manifest. We now properly handle this case.

(yes I feel silly)
2015-01-06 11:23:38 -08:00
Mads Kiilerich
b438c7467e largefiles: backout d20af8be6a14 - linear updates handle m -> a differently
d20af8be6a14 introduced a significant performance regression: All largefiles
were marked 'normallookup' by linear (or noop) updates and had to be rehashed
by the next command.

The previous change introduced a different solution to the problem d20af8be6a14
solved and we can thus back it out again.
2014-12-31 14:46:03 +01:00
Mads Kiilerich
d0a915efcf tests: add test coverage for lfdirstate invalidation of linear update
d20af8be6a14 introduced a significant performance regression: All largefiles
are marked 'normallookup' in lfdirstate by linear (or noop) updates and has to
be rehashed by the next command.

To avoid such regressions, keep an eye on the dirstate content after a plain
'hg up'.
2014-12-31 14:45:02 +01:00
Augie Fackler
ffd2cf1dba demandimport: blacklist distutils.msvc9compiler (issue4475)
This module depends on _winreg, which is windows-only. Recent versions
of setuptools load distutils.msvc9compiler and expect it to
ImportError immediately when on non-Windows platforms, so we need to
let them do that. This breaks in an especially mystifying way, because
setuptools uses vars() on the imported module. We then throw an
exception, which vars doesn't pick up on well. For example:

In [3]: class wat(object):
   ...:     @property
   ...:     def __dict__(self):
   ...:         assert False
   ...:

In [4]: vars(wat())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-2781ada5ffe6> in <module>()
----> 1 vars(wat())

TypeError: vars() argument must have __dict__ attribute

Which is similar to the problem we run into.
2014-12-22 17:27:31 -05:00
Matt Harbison
93382de245 largefiles: fix a spurious missing file warning with forget (issue4053)
If an uncommitted and deleted file was forgotten, a warning would be emitted,
even though the operation was successful.  See the previous patch for
'remove -A' for the exact circumstances, and details about the cause.
2014-12-21 15:06:54 -05:00
Matt Harbison
8edc086e01 largefiles: fix a spurious missing file warning with 'remove -A' (issue4053)
The bug report doesn't mention largefiles, but the given recipe doesn't fail
unless the largefiles extension is loaded.  The problem only affected normal
files, whether or not any largefiles are committed, and only files that have
not been committed yet.  (Files with an 'a' state are dropped from dirstate,
not marked removed.)  Further, if the named normal file never existed, the
warning would be printed out twice.

The problem is that the core implementation of remove() calls repo.status(),
which eventually triggers a dirstate.walk().  When the file isn't seen in the
filesystem during the walk, the exception handling finds the file in
dirstate, so it doesn't complain.  However, the largefiles implementation
called status() again with all of the original files (including the normal
ones, just dropped).  This time, the exception handler doesn't find the file
in dirstate and does complain.  This simply excludes the normal files from
the second repo.status() call, which the largefiles extension has no interest
is processing anyway.
2014-12-21 15:04:13 -05:00
Matt Mackall
8b83dfc480 pathauditor: check for Windows shortname aliases 2014-12-18 14:18:28 -06:00
Augie Fackler
13206234f4 pathauditor: check for codepoints ignored on OS X 2014-12-16 13:08:17 -05:00
Augie Fackler
c1fe89f64f darwin: omit ignorable codepoints when normcase()ing a file path
This lets us avoid some nasty case collision problems in OS X with
invisible codepoints.
2014-12-16 13:07:10 -05:00
Augie Fackler
ab5a564dd5 test-casefolding.t: demonstrate a bug with HFS+ ignoring some codepoints 2014-12-11 15:42:49 -05:00
Matt Harbison
f96d6adc63 largefiles: don't actually remove largefiles in an addremove dry run
The addlargefiles() method already properly handled dry runs.
2014-12-13 13:33:48 -05:00
Durham Goode
fd373c16b0 log: fix log revset instability
The log/graphlog revset was not producing stable results since it was
iterating over a dict. Now we sort before iterating to guarantee a fixed order.

This fixes some potential flakiness in the tests.
2014-12-08 15:41:54 -08:00
Durham Goode
74a5156e21 log: fix log -f slow path to actually follow history
The revset created when -f was used with a slow path (for patterns and
directories) did not actually contain any logic to enforce follow. Instead it
was depending on the passed in subset to already be limited (which was limited
to :. but not ::.). This fixes it by adding a '& ::.' to any -f log revset.

hg log -f <file> is still broken, in that it can return results that aren't
actually ancestors of the current file, but fixing that has major perf
implications, so we'll deal with it later.
2014-12-05 14:27:32 -08:00
Mads Kiilerich
7ebc1609d6 run-tests: automatically add (glob) to "saved backup bundle to" lines
Avoid spending too much time adding (glob) after running run-tests -i. This
doesn't handle all cases but it helps.

The run-tests tests add a bit of escaping of trailing (glob) in the output to
avoid interference from the outer test runner.

The regexp for matching the output lines contains a group for making multiline
substitute in a way that works with Python before 2.7.
2014-11-27 02:04:30 +01:00
Matt Mackall
1c9cf418be merge with stable 2015-01-05 15:46:14 -06:00
FUJIWARA Katsunori
ec7eaba601 revset: delay showing parse error for the revset alias until it is referred
Before this patch, a problematic revset alias aborts execution
immediately, even if it isn't referred in the specified revset.

If old "hg" may be used too (for example, bisecting Mercurial itself),
it is also difficult to write alias definitions using features newly
introduced by newer "hg" into configuration files, because such alias
definitions cause unexpected abortion at parsing revset aliases with
old "hg".

This patch delays showing parse error for the revset alias until it is
actually referred at runtime.

This patch detects referring problematic aliases in "_expandaliases"
by examination of "revsetalias.error", which is initialized with the
error message only when parsing fails.

For usability, this patch also warns about problematic aliases, even
if they aren't referred at runtime. This should help users to know
potential problems in their alias definitions earlier.
2015-01-05 11:02:04 +09:00
Pierre-Yves David
a533fb454c linkrev-filelog: handle filtered linkrev with no visible children (issue4307)
If the file revision with a filtered linkrev does not have any
(unfiltered) children, we cannot use it to bound the search for
another introduction. Instead, we have to look at the file revision
used by each head changeset. If one of them uses this file revision, we
know there is another occurrence and we have a starting point. See
inline comments for details.

Adding some kind of permanent reference of all the introductions of a
file revision instead of just the first one would be much better. But
this is more difficult. I hope to take that into account in the next
repository format.
2014-12-29 18:35:23 -08:00
Pierre-Yves David
e3f2a2625a linkrev: work around linkrev to filtered entry in 'filelog' revset
This revset is used by 'hg log FILENAME'. This prevent bugs when used on
a repository with hidden revisions.

Instead of just discarding file revisions whose linkrevs point to filtered
revisions, we put them aside and post-process them trying to find a non-filtered
introduction. See inline documentation for details about how it works.

This only fixes some of the problems. Once again, more will be needed when we can
cannot rely on child revisions of a file to find linkrev-shadowned revisions.

A test is added for 'hg log' catching such cases.
2014-12-29 17:23:16 -08:00
FUJIWARA Katsunori
851e75b58e context: override _dirstatestatus in workingcommitctx for correct matching
Before this patch, the result of "status()" on "workingcommitctx" may
incorrectly contain files other than ones to be committed, because
"workingctx._dirstatestatus()" returns the result of
"dirstate.status()" directly.

For correct matching, this patch overrides "_dirstatestatus" in
"workingcommitctx" and makes it return matched files only in
"self._status".

This patch uses empty list for "deleted", "unknown" and "ignored" of
status, because status between "changectx"s also makes them empty.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
7b31cd3a28 context: avoid breaking already fixed self._status at ctx.status()
Before this patch, "status()" on "workingcommitctx" with "always
match" object causes breaking "self._status" in
"workingctx._buildstatus()", because "workingctx._buildstatus()"
caches the result of "dirstate.status()" into "self._status" for
efficiency, even though it should be fixed at construction for
committing.

For example, template function "diff()" without any patterns in
"committemplate" implies "status()" on "workingcommitctx" with "always
match" object, via "basectx.diff()" and "patch.diff()".

Then, broken "self._status" causes committing unexpected files.

To avoid breaking already fixed "self._status" at "ctx.status()", this
patch overrides "_buildstatus" in "workingcommitctx".

This patch doesn't write out the result of template function "diff()"
in "committemplate" in "test-commit.t", because matching against files
to be committed still has an issue fixed in subsequent patch.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
c2e92a32b4 context: make unknown/ignored/clean of cached status empty for equivalence
Before this patch, "workingctx.status" caches the result of
"dirstate.status" directly into "self._status".

But "dirstate.status" is invoked with False "list*" arguments in
normal "self._status" accessing route, and this makes
"unknown"/"ignored"/"clean" of status empty.

This may cause unexpected result of code paths internally accessing to
them (accessors for external usage are already removed by previous patch).

This patch makes "unknown"/"ignored"/"clean" of cached status empty
for equivalence. Making them empty is executed only when at least one
of "unknown", "ignored" or "clean" has files, for efficiency.
2014-12-31 17:55:43 +09:00
Matt Mackall
00a6d1affb merge with stable 2015-01-01 16:47:14 -06:00
Pierre-Yves David
451115c9e1 linkrev: also adjust linkrev when bootstrapping annotate (issue4305)
The annotate logic now use the new 'introrev' method to bootstrap its traversal.
This catches issues from linkrev-shadowing of the changeset introducing the
version of a file in source changeset.

More tests have been added to display pathological cases.
2014-12-24 03:26:48 -08:00
Pierre-Yves David
0344ecbf5b linkrev: also adjust linkrev when bootstrapping 'follow' revset
The follow revset (used by `hg log --follow`) now uses the new 'introrev'
method to bootstrap its traversal. This catches issues from linkrev-shadowing of
the changesets introducing the version of a file in source changeset.

A new test has been added to display pathological cases.

Another test is affected because it was meant to test this behavior but actually
failed to do so for half of the output. The output are now similar.
2014-12-29 23:40:24 -08:00
Pierre-Yves David
3c79d53ced filectx.parents: enforce changeid of parent to be in own changectx ancestors
Because of the way filenodes are computed, you can have multiple changesets
"introducing" the same file revision. For example, in the changeset graph
below, changeset 2 and 3 both change a file -to- and -from- the same content.

  o 3: content = new
  |
  | o 2: content = new
  |/
  o 1: content = old

In such cases, the file revision is create once, when 2 is added, and just reused
for 3. So the file change in '3' (from "old" to "new)" has no linkrev pointing
to it).  We'll call this situation "linkrev-shadowing". As the linkrev is used for
optimization purposes when walking a file history, the linkrev-shadowing
results in an unexpected jump to another branch during such a walk.. This leads to
multiple bugs with log, annotate and rename detection.

One element to fix such bugs is to ensure that walking the file history sticks on
the same topology as the changeset's history. For this purpose, we extend the
logic in 'basefilectx.parents' so that it always defines the proper changeset
to associate the parent file revision with. This "proper" changeset has to be an
ancestor of the changeset associated with the child file revision.

This logic is performed in the '_adjustlinkrev' function. This function is
given the starting changeset and all the information regarding the parent file
revision. If the linkrev for the file revision is an ancestor of the starting
changeset, the linkrev is valid and will be used. If it is not, we detected a
topological jump caused by linkrev shadowing, we are going to walk the
ancestors of the starting changeset until we find one setting the file to the
revision we are trying to create.

The performance impact appears acceptable:

- We are walking the changelog once for each filelog traversal (as there should
  be no overlap between searches),

- changelog traversal itself is fairly cheap, compared to what is likely going
  to be perform on the result on the filelog traversal,

- We only touch the manifest for ancestors touching the file, And such
  changesets are likely to be the one introducing the file. (except in
  pathological cases involving merge),

- We use manifest diff instead of full manifest unpacking to check manifest
  content, so it does not involve applying multiple diffs in most case.

- linkrev shadowing is not the common case.

Tests for fixed issues in log, annotate and rename detection have been
added.

But this changeset does not solve all problems. It fixes -ancestry-
computation, but if the linkrev-shadowed changesets is the starting one, we'll
still get things wrong. We'll have to fix the bootstrapping of such operations
in a later changeset. Also, the usage of `hg log FILE`  without --follow still
has issues with linkrev pointing to hidden changesets, because it relies on the
`filelog` revset which implement its own traversal logic that is still to be
fixed.

Thanks goes to:
- Matt Mackall: for nudging me in the right direction
- Julien Cristau and Rémi Cardona: for keep telling me linkrev bug were an
  evolution show stopper for 3 years.
- Durham Goode: for finding a new linkrev issue every few weeks
- Mads Kiilerich: for that last rename bug who raise this topic over my
  anoyance limit.
2014-12-23 15:30:38 -08:00
FUJIWARA Katsunori
797fef3e65 context: cache self._status correctly at workingctx.status
Before this patch, "workingctx.status" always replaces "self._status"
by the recent result, even though:

  - status isn't calculated against the parent of the working directory, or

  - specified "match" isn't "always" one
    (status is only visible partially)

If "workingctx" object is shared between some procedures indirectly
referring "ctx._status", this incorrect caching may cause unexpected
result: for example, "ctx._status" is used via "manifest()", "files()"
and so on.

To cache "self._status" correctly at "workingctx.status", this patch
overwrites "self._status" in "workingctx._buildstatus" only when:

  - status is calculated against the parent of the working directory, and
  - specified "match" is "always" one

This patch can be applied (and effective) only on default branch,
because procedure around "basectx.status" is much different between
stable and default: for example, overwriting "self._status" itself is
executed not in "workingctx._buildstatus" but in
"workingctx._poststatus", on stable branch.
2014-12-31 17:55:43 +09:00
Matt Mackall
3c097d24cf test-subrepo-git: ignore global git config
This was causing a test failure for people with company-wide settings.
Still need a way to ignore local config.
2014-12-30 15:51:14 -06:00
Siddharth Agarwal
3c683f5548 tests/autodiff.py: explicitly only honor feature diffopts
This test extension manages the opts it cares about on its own anyway.
2014-11-21 16:02:26 -08:00
Matt Harbison
21ca967140 narrowmatcher: propagate the rel() method
The full path is propagated to the original match object since this is often
used directly for printing a file name to the user.  This is cleaner than
requiring each caller to join the prefix with the file name prior to calling it,
and will lead to not having to pass the prefix around separately.  It is also
consistent with the bad() and abs() methods in terms of the required input.  The
uipath() method now inherits this path building property.

There is no visible change in path style for rel() because it ultimately calls
util.pathto(), which returns an os.sep based path.  (The previous os.path.join()
was violating the documented usage of util.pathto(), that its third parameter be
'/' separated.)  The doctest needed to be normalized to '/' separators to avoid
test differences on Windows, now that a full path is returned for a short
filename.

The test changes are to drop globs that are no longer necessary when printing an
absolute file in a subrepo, as returned from match.uipath().  Previously when
os.path.join() was used to add the prefix, the absolute path to a file in a
subrepo was printed with a mix of '/' and '\'.  The absolute path for a file not
in a subrepo, as returned from match.uipath(), is still purely '/' based.
2014-11-27 10:16:56 -05:00
FUJIWARA Katsunori
f6bf07d36e posix: quote the specified string only when it may have to be quoted
This patch makes "posix.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety, like as the
previous patch for "windows.shellquote".

In fact, on POSIX environment, quoting itself doesn't cause issues
like issue4463. But (almost) equivalent quoting policy can avoid
examining test result differently on POSIX and Windows (even though
showing command line with "%r" causes such examination in
"test-extdiff.t").

The last hunk for "test-extdiff.t" in this patch isn't needed for the
previous patch for "windows.shellquote", because the code path of it
is executed only "#if execbit" (= avoided on Windows).
2014-12-25 23:33:26 +09:00
FUJIWARA Katsunori
1623722bb9 windows: quote the specified string only when it has to be quoted
Before this patch, "windows.shellquote" (as used as "util.shellquote")
always quotes specified strings with double quotation marks, for
external process invocation.

But some problematic applications can't work correctly, when command
line arguments are quoted: see issue4463 for detail.

On the other hand, quoting itself is needed to specify arguments
containing whitespaces and/or some special characters exactly.

This patch makes "windows.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety.
2014-12-25 23:33:26 +09:00
FUJIWARA Katsunori
203706516b extdiff: avoid unexpected quoting arguments for external tools (issue4463)
Before this patch, all command line arguments for external tools are
quoted by the combination of "shlex.split" and "util.shellquote". But
this causes some problems.

  - some problematic commands can't work correctly with quoted arguments

    For example, 'WinMerge /r ....' is OK, but 'WinMerge "/r" ....' is
    NG. See also below for detail about this problem.

        https://bitbucket.org/tortoisehg/thg/issue/3978/

  - quoting itself may change semantics of arguments

    For example, when the environment variable CONCAT="foo bar baz':

      - mydiff $CONCAT   => mydiff foo bar baz   (taking 3 arguments)
      - mydiff "$CONCAT" => mydiff "foo bar baz" (taking only 1 argument)

    For another example, single quoting (= "util.shellquote") on POSIX
    environment prevents shells from expanding environment variables,
    tilde, and so on:

      - mydiff "$HOME" => mydiff /home/foobar
      - mydiff '$HOME' => mydiff $HOME

  - "shlex.split" can't handle some special characters correctly

    It just splits specified command line by whitespaces.

    For example, "echo foo;echo bar" is split into ["echo",
    "foo;echo", "bar"].

On the other hand, if quoting itself is omitted, users can't specify
options including space characters with "--option" at runtime.

The root cause of this issue is that "shlex.split + util.shellquote"
combination loses whether users really want to quote each command line
elements or not, even though these can be quoted arbitrarily in
configurations.

To resolve this problem, this patch does:

  - prevent configurations from being processed by "shlex.split" and
    "util.shellquote"

    only (possibly) "findexe"-ed or "findexternaltool"-ed command path
    is "util.shellquote", because it may contain whitespaces.

  - quote options specified by "--option" via command line at runtime

This patch also makes "dodiff()" take only one "args" argument instead
of "diffcmd" and "diffopts. It also omits applying "util.shellquote"
on "args", because "args" should be already stringified in "extdiff()"
and "mydiff()".

The last hunk for "test-extdiff.t" replaces two whitespaces by single
whitespace, because change of "' '.join()" logic causes omitting
redundant whitespaces.
2014-12-25 23:33:26 +09:00