Commit Graph

12843 Commits

Author SHA1 Message Date
Pierre-Yves David
3c79d53ced filectx.parents: enforce changeid of parent to be in own changectx ancestors
Because of the way filenodes are computed, you can have multiple changesets
"introducing" the same file revision. For example, in the changeset graph
below, changesets 2 and 3 both change a file -from- and -to- the same content.

  o 3: content = new
  |
  | o 2: content = new
  |/
  o 1: content = old

In such cases, the file revision is created once, when 2 is added, and just reused
for 3. So the file change in '3' (from "old" to "new") has no linkrev pointing
to it. We'll call this situation "linkrev-shadowing". As the linkrev is used for
optimization purposes when walking a file history, linkrev-shadowing
results in an unexpected jump to another branch during such a walk. This leads to
multiple bugs with log, annotate and rename detection.

One element to fix such bugs is to ensure that walking the file history sticks to
the same topology as the changeset's history. For this purpose, we extend the
logic in 'basefilectx.parents' so that it always defines the proper changeset
to associate the parent file revision with. This "proper" changeset has to be an
ancestor of the changeset associated with the child file revision.

This logic is performed in the '_adjustlinkrev' function. This function is
given the starting changeset and all the information regarding the parent file
revision. If the linkrev for the file revision is an ancestor of the starting
changeset, the linkrev is valid and will be used. If it is not, we have detected a
topological jump caused by linkrev shadowing, and we walk the
ancestors of the starting changeset until we find one setting the file to the
revision we are trying to create.
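
The check described above can be sketched on a toy graph. This is only an
illustration, not Mercurial's '_adjustlinkrev': the 'parents' and 'introduces'
dictionaries and the name 'adjust_linkrev' are hypothetical stand-ins for the
changelog and manifest lookups.

```python
def ancestors(parents, rev):
    """Return the set of ancestors of rev, including rev itself."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def adjust_linkrev(parents, introduces, start, filenode, linkrev):
    """Pick the changeset to associate with filenode when walking from start.

    parents:    rev -> list of parent revs (toy changelog)
    introduces: rev -> set of file nodes that rev sets (toy manifest diff)
    """
    anc = ancestors(parents, start)
    if linkrev in anc:
        # The stored linkrev is an ancestor of start: use it as is.
        return linkrev
    # Linkrev shadowing: search the ancestors of start, newest first,
    # for one that sets the file to the wanted node.
    for rev in sorted(anc, reverse=True):
        if filenode in introduces.get(rev, ()):
            return rev
    return linkrev  # nothing better found; keep the stored value


# Graph from the message: 2 and 3 are both children of 1 and both produce
# the same file node, whose linkrev points at 2.
parents = {1: [], 2: [1], 3: [1]}
introduces = {1: {"old"}, 2: {"new"}, 3: {"new"}}
adjusted = adjust_linkrev(parents, introduces, start=3, filenode="new", linkrev=2)
unchanged = adjust_linkrev(parents, introduces, start=2, filenode="new", linkrev=2)
```

Starting from 3, the stored linkrev (2) is not an ancestor, so the walk settles
on 3 itself; starting from 2, the stored linkrev is kept.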

The performance impact appears acceptable:

- We are walking the changelog once for each filelog traversal (as there should
  be no overlap between searches),

- changelog traversal itself is fairly cheap, compared to what is likely going
  to be performed on the result of the filelog traversal,

- We only touch the manifest for ancestors touching the file, and such
  changesets are likely to be the ones introducing the file (except in
  pathological cases involving merges),

- We use manifest diff instead of full manifest unpacking to check manifest
  content, so it does not involve applying multiple diffs in most cases,

- linkrev shadowing is not the common case.

Tests for fixed issues in log, annotate and rename detection have been
added.

But this changeset does not solve all problems. It fixes -ancestry-
computation, but if the linkrev-shadowed changeset is the starting one, we'll
still get things wrong. We'll have to fix the bootstrapping of such operations
in a later changeset. Also, the usage of `hg log FILE` without --follow still
has issues with linkrev pointing to hidden changesets, because it relies on the
`filelog` revset, which implements its own traversal logic that is still to be
fixed.

Thanks go to:
- Matt Mackall: for nudging me in the right direction
- Julien Cristau and Rémi Cardona: for telling me for 3 years that linkrev bugs
  were an evolution show stopper.
- Durham Goode: for finding a new linkrev issue every few weeks
- Mads Kiilerich: for that last rename bug, which raised this topic over my
  annoyance limit.
2014-12-23 15:30:38 -08:00
FUJIWARA Katsunori
9e120db5e9 context: remove unreliable accessor methods from committablectx
There are two caching routes for (propertycache-ed) "_status" below in
committablectx:

  - invoking "status()":

    "dirstate.status()" is invoked, and the result of it is cached
    into "_status". In this case, any of "listignored", "listclean"
    and "listunknown" may be True.

  - accessing "_status" directly before "status()":

    Own "status()" is invoked, but all of "listignored", "listclean"
    and "listunknown" arguments are False, in this case.

"ignored"/"clean"/"unknown" accessor methods of "committablectx" use
the corresponding fields of "_status", but these fields aren't reliable,
because they are empty when:

  - "_status" method is executed before accessors, or
  - "status()" is executed with "list*=False" before accessors

In addition, these accessors aren't used in the current Mercurial
implementation. At least, removing them doesn't cause any test
failures.
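
The call-order dependence can be reproduced with a toy model. The class below
is a hypothetical sketch, not the real "committablectx"; only the caching
behaviour described above is modelled.

```python
class ToyContext:
    """Toy model of a context whose status fields depend on call order."""

    def __init__(self, unknown_files):
        self._unknown_files = list(unknown_files)
        self._cache = None

    def status(self, listunknown=False):
        # The result of every call is cached, whatever flags were used.
        unknown = list(self._unknown_files) if listunknown else []
        self._cache = {"unknown": unknown}
        return self._cache

    @property
    def _status(self):
        # A first direct access computes status with the list flag off.
        if self._cache is None:
            self.status()
        return self._cache

    def unknown(self):
        # Accessor of the kind this commit removes.
        return self._status["unknown"]


ctx_a = ToyContext(["scratch.txt"])
ctx_a.status(listunknown=True)
after_full_status = ctx_a.unknown()

ctx_b = ToyContext(["scratch.txt"])
before_any_status = ctx_b.unknown()
```

The same accessor reports the unknown file in one case and an empty list in
the other, purely depending on what ran before it.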
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
797fef3e65 context: cache self._status correctly at workingctx.status
Before this patch, "workingctx.status" always replaces "self._status"
with the latest result, even when:

  - status isn't calculated against the parent of the working directory, or

  - the specified "match" isn't an "always" matcher
    (status is only partially visible)

If a "workingctx" object is shared between procedures that indirectly
refer to "ctx._status", this incorrect caching may cause unexpected
results: for example, "ctx._status" is used via "manifest()", "files()"
and so on.

To cache "self._status" correctly at "workingctx.status", this patch
overwrites "self._status" in "workingctx._buildstatus" only when:

  - status is calculated against the parent of the working directory, and
  - the specified "match" is an "always" matcher
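
The caching rule can be sketched as follows. This is a simplified,
hypothetical model (the class, its arguments and the use of "match=None" to
mean "match everything" are assumptions), not the real "workingctx".

```python
class ToyWorkingCtx:
    """Caches status only for the full comparison against the parent."""

    def __init__(self, parent, changed_by_base):
        self._parent = parent
        self._changed_by_base = changed_by_base  # base rev -> changed files
        self._status = None

    def status(self, base=None, match=None):
        base = self._parent if base is None else base
        files = list(self._changed_by_base[base])
        if match is not None:
            files = [f for f in files if match(f)]
        # Overwrite the cache only when the result is complete and is
        # relative to the parent of the working directory.
        if base == self._parent and match is None:
            self._status = files
        return files


wctx = ToyWorkingCtx(
    "p1",
    {"p1": ["a.py", "b.txt"], "old": ["a.py", "b.txt", "c.md"]},
)
full = wctx.status()
against_old = wctx.status(base="old")
partial = wctx.status(match=lambda f: f.endswith(".py"))
cached = wctx._status
```

After the three calls, the cache still holds the full status against the
parent, not the partial or the "old"-based result.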

This patch can be applied (and is effective) only on the default branch,
because the code around "basectx.status" differs considerably between
stable and default: for example, overwriting "self._status" itself is
executed not in "workingctx._buildstatus" but in
"workingctx._poststatus", on the stable branch.
2014-12-31 17:55:43 +09:00
Pierre-Yves David
2662df0db5 filectx.parents: also fetch the filelog of rename source too
We are going to need this filelog for the linkrev adjustment, so we had better
normalise the list and have the filelog in all cases.

This is done in a separate, earlier changeset to help readability.
2014-12-23 18:30:46 -08:00
Siddharth Agarwal
34aea9120f cmdutil.changeset_printer: explicitly honor all diffopts
This is used in hg log -p so the output is expected to be the same as that of
hg diff.
2014-11-21 16:01:55 -08:00
Siddharth Agarwal
84d6328fc1 export: explicitly honor all diffopts
This is slightly more controversial than diff, but we hope that HGPLAIN=1
covers all the format-breaking ones.

A possible alternative here that breaks BC is to honor all opts except the
whitespace ones.
2014-11-18 22:21:03 -08:00
Siddharth Agarwal
ed470d2b63 webcommands.annotate: explicitly only honor whitespace diffopts
The whitespace ones are the only ones the annotate logic cares about anyway, so
there's no visible impact.
2014-11-21 16:16:03 -08:00
Pierre-Yves David
1512377552 filectx.parents: filter nullrev parent sooner
We are going to introduce a linkrev-correction phase when computing parents.
It will be more convenient to have the nullid parent filtered out earlier. I
had to make a minimal adjustment to the rename handling logic to keep it
functional.  That logic has been documented in the process since it took me
some time to check all the cases out.
2014-12-23 18:29:03 -08:00
Pierre-Yves David
a0aa329ab1 context: catch FilteredRepoLookupError instead of RepoLookupError
Now that we have a more specialised exception, let's use it when we meant to
catch the more specialised case.
2014-12-23 17:13:51 -08:00
Matt Harbison
21ca967140 narrowmatcher: propagate the rel() method
The full path is propagated to the original match object since this is often
used directly for printing a file name to the user.  This is cleaner than
requiring each caller to join the prefix with the file name prior to calling it,
and will lead to not having to pass the prefix around separately.  It is also
consistent with the bad() and abs() methods in terms of the required input.  The
uipath() method now inherits this path building property.

There is no visible change in path style for rel() because it ultimately calls
util.pathto(), which returns an os.sep based path.  (The previous os.path.join()
was violating the documented usage of util.pathto(), that its third parameter be
'/' separated.)  The doctest needed to be normalized to '/' separators to avoid
test differences on Windows, now that a full path is returned for a short
filename.

The test changes are to drop globs that are no longer necessary when printing an
absolute file in a subrepo, as returned from match.uipath().  Previously when
os.path.join() was used to add the prefix, the absolute path to a file in a
subrepo was printed with a mix of '/' and '\'.  The absolute path for a file not
in a subrepo, as returned from match.uipath(), is still purely '/' based.
2014-11-27 10:16:56 -05:00
Matt Harbison
e9dd83402c match: add the abs() method
This is a utility to make it easier for subrepos to convert a file name to the
full path rooted at the top repository.  It can replace the various path joining
lambdas, and doesn't require the prefix to be passed into the method that wishes
to build such a path.

The name is derived from the following pattern in annotate() and other places:

        name = ((pats and rel) or abs)

The pathname separator is not os.sep in part to avoid confusion with variables
named 'abs' or similar that _are_ '/' separated, and in part because some
methods like cmdutil.forget() and maybe cmdutil.add() need to build a '/'
separated path to the root repo.  This can replace the manual path building
there.
2014-11-28 20:15:46 -05:00
Matt Mackall
3406ce4956 merge with stable
2014-12-29 16:39:20 -06:00
Matt Mackall
bf17f56a67 sshpeer: more thorough shell quoting
This fixes an issue spotted by Jesse Hertz.
2014-12-29 14:27:02 -06:00
FUJIWARA Katsunori
f6bf07d36e posix: quote the specified string only when it may have to be quoted
This patch makes "posix.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety, as the
previous patch did for "windows.shellquote".

In fact, in a POSIX environment, quoting itself doesn't cause issues
like issue4463. But an (almost) equivalent quoting policy avoids having to
check test results differently on POSIX and Windows (even though
showing the command line with "%r" still requires such a platform-specific
check in "test-extdiff.t").

The last hunk for "test-extdiff.t" in this patch isn't needed for the
previous patch for "windows.shellquote", because the code path of it
is executed only under "#if execbit" (that is, it is skipped on Windows).
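
"Quote only when needed" can be sketched like this. The function name and the
set of characters treated as safe are assumptions made for illustration, not
the actual "posix.shellquote" code; the standard library's "shlex.quote"
follows the same idea and is shown for comparison.

```python
import re
import shlex

# Assumption: this conservative character set never needs quoting.
_SAFE = re.compile(r"[A-Za-z0-9_@%+=:,./-]+")


def quote_if_needed(s):
    """Return s unchanged when it is safe, single-quoted otherwise."""
    if not s:
        return "''"
    if _SAFE.fullmatch(s):
        return s
    # Close the quote, emit an escaped single quote, reopen the quote.
    return "'" + s.replace("'", "'\\''") + "'"


plain = quote_if_needed("diff")
spaced = quote_if_needed("a b")
with_quote = quote_if_needed("it's")
stdlib_plain = shlex.quote("diff")
stdlib_spaced = shlex.quote("a b")
```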
2014-12-25 23:33:26 +09:00
FUJIWARA Katsunori
1623722bb9 windows: quote the specified string only when it has to be quoted
Before this patch, "windows.shellquote" (as used as "util.shellquote")
always quotes specified strings with double quotation marks, for
external process invocation.

But some problematic applications don't work correctly when command
line arguments are quoted: see issue4463 for details.

On the other hand, quoting itself is needed to specify arguments
containing whitespace and/or some special characters exactly.

This patch makes "windows.shellquote" examine the specified string and
quote it only when it may have to be quoted for safety.
2014-12-25 23:33:26 +09:00
Mathias De Maré
55a6b1081b subrepo: add forgotten annotation for reverting git subrepos
Support for reverting git subrepos was added earlier,
but the annotation to handle any subrepo errors was forgotten.
2014-12-28 23:59:57 +01:00
Mathias De Maré
4aa7ae08d9 subrepo: add full revert support for git subrepos
Previously, revert was only possible if the '--no-backup'
switch was specified.
Now, to support backups, we explicitly go over all modified
files in the subrepo.
2014-12-28 10:42:25 +01:00
Matt Harbison
c2203f42d7 remove: use vfs instead of os.path + match.rel() for filesystem checks
2014-12-25 21:50:35 -05:00
Matt Harbison
5632d06d85 forget: use vfs instead of os.path + match.rel() for filesystem checks
2014-12-25 21:43:45 -05:00
Angel Ezquerra
9da7dc5b00 localrepo: use the vfs join method to implement the localrepo join method
This will make it possible to customize the behavior of the join method by
changing the vfs class (e.g. by using the "altvfs" class introduced recently).

Note that we could have modified the VFS join methods to accept a set of optional
paths in the same way that the localrepo join method does. However, it seemed
simpler to simply call os.path.join before calling self.vfs.join.
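
The delegation can be sketched with toy classes. All names below are
hypothetical (including "RedirectingVfs", which merely stands in for an
alternative vfs class), and "posixpath" is used instead of "os.path" only to
keep the output identical on every platform.

```python
import posixpath


class ToyVfs:
    def __init__(self, base):
        self.base = base

    def join(self, path):
        return posixpath.join(self.base, path)


class RedirectingVfs(ToyVfs):
    """Hypothetical alternative vfs that redirects paths to a subdirectory."""

    def join(self, path):
        return posixpath.join(self.base, "alt", path)


class ToyRepo:
    def __init__(self, vfs):
        self.vfs = vfs

    def join(self, f, *insidef):
        # Combine the optional components first, then delegate to the vfs,
        # so that swapping the vfs class changes the result.
        return self.vfs.join(posixpath.join(f, *insidef))


default_path = ToyRepo(ToyVfs("/repo/.hg")).join("store", "data")
redirected_path = ToyRepo(RedirectingVfs("/repo/.hg")).join("store", "data")
```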
2014-12-23 19:48:38 +01:00
Augie Fackler
ffd2cf1dba demandimport: blacklist distutils.msvc9compiler (issue4475)
This module depends on _winreg, which is windows-only. Recent versions
of setuptools load distutils.msvc9compiler and expect it to
ImportError immediately when on non-Windows platforms, so we need to
let them do that. This breaks in an especially mystifying way, because
setuptools uses vars() on the imported module. We then throw an
exception, which vars doesn't pick up on well. For example:

In [3]: class wat(object):
   ...:     @property
   ...:     def __dict__(self):
   ...:         assert False
   ...:

In [4]: vars(wat())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-2781ada5ffe6> in <module>()
----> 1 vars(wat())

TypeError: vars() argument must have __dict__ attribute

Which is similar to the problem we run into.
2014-12-22 17:27:31 -05:00
Angel Ezquerra
1a80984319 localrepo: introduce shared method to check if a repository is shared
Up until now we compared the "path" and "sharedpath" repository properties to
check if a repository is shared. This relied on an implementation detail of
shared repositories. In order to make it possible to change the way shared
repositories are implemented, we encapsulate this check into its own localrepo
method, called shared.

This new method returns None if the repository is not shared, and something else
(for now a string describing the sort of share) otherwise.

The reason why I did not call this method "isshared" and made it return a
boolean is that I plan to introduce a new type of shared repository soon.
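
A minimal sketch of such a method, assuming that None means "not shared" and
that a string names the kind of share. The class and the "store" label are
assumptions made for illustration, not a copy of the localrepo code.

```python
class ToyRepo:
    def __init__(self, path, sharedpath=None):
        self.path = path
        # An unshared repository uses its own path as sharedpath.
        self.sharedpath = path if sharedpath is None else sharedpath

    def shared(self):
        """Return the kind of share, or None when the repository is not shared."""
        if self.sharedpath != self.path:
            return "store"  # assumed label for the existing kind of share
        return None


plain_repo = ToyRepo("/a/.hg")
shared_repo = ToyRepo("/b/.hg", sharedpath="/a/.hg")
kinds = (plain_repo.shared(), shared_repo.shared())
```

Callers can still write "if repo.shared():", while a future kind of share
only needs a new return value rather than a second method.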

# NOTES:

This is the first patch in a series whose purpose is to add support for
creating "full repository shares", which are repositories that share everything
with the repository source except their current revision, branch and bookmark.

This series is RFC because I am not sure about some of the solutions I chose.
Comments are welcome!
2014-12-21 00:19:10 +01:00
Martin von Zweigbergk
d21c241927 trydiff: use 'ctx1.flags()' for symmetry with 'ctx2.flags()'
2014-12-23 16:16:26 -08:00
Martin von Zweigbergk
0154ee1c3e trydiff: use 'not in addedset' for symmetry with 'not in removedset'
With the previous change in place, we can safely use 'addedset'.
2014-12-23 16:25:00 -08:00
Martin von Zweigbergk
a69f913d1d trydiff: simplify checking for additions
In the body of the loop in trydiff(), there are conditions like:

  addedset or (f in modifiedset and to is None)

The second half of that expression is to account for the fact that
merged-in additions appear as modifications. By instead fixing up the sets
of modified and added files to compensate for this fact, we can
simplify the body of the loop. It also fixes one case where the
addedset was checked without the additional check (the "have we
already reported a copy above?" case in the code, also see fixed test
case).

The similar condition with 'removedset' in it seems to have served no
purpose even before this change, so it could have been simplified even
before.
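
The fix-up of the modified and added sets can be sketched as below. The
function name and arguments are hypothetical; the assumption is that a file
reported as modified but absent from the first context is really an addition.

```python
def fix_up_sets(modified, added, ctx1_files):
    """Move files absent from ctx1 out of 'modified' and into 'added'."""
    modifiedset = set(modified)
    addedset = set(added)
    for f in modified:
        if f not in ctx1_files:
            # Reported as modified, but there is no old version: an addition.
            addedset.add(f)
            modifiedset.discard(f)
    return modifiedset, addedset


modifiedset, addedset = fix_up_sets(
    modified=["a", "b"], added=["c"], ctx1_files={"a"}
)
```

With the sets corrected up front, the loop body only needs "f in addedset".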
2014-12-23 16:12:54 -08:00
Martin von Zweigbergk
12fa2d93ff trydiff: extract 'date2' variable like existing 'date1'
Note that there is a comment saying "ctx2 date may be dynamic". The
comment was introduced in 8ed3d2a60500 (patch: use contexts for diff,
2006-12-25), but it seems like it stopped being dynamic in that very
changeset -- before that changeset, the date seems to have been the
file's mtime, but after the changeset, it seems to be the changeset's
timestamp (current time for workingctx). Since no one seems to have
missed the "dynamicness", let's simplify and extract a date2 for
symmetry with date1.
2014-12-23 14:56:30 -08:00
Martin von Zweigbergk
c7f4a068b8 trydiff: use sets, not lists, for containment checks
This only has a noticeable effect on diffs touching A LOT of
files. For example, it takes

  hg diff -r FIREFOX_AURORA_30_BASE -r FIREFOX_AURORA_35_BASE

from 1m55.465s to 1m32.354s. That diff has over 50k files.
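
The change amounts to the pattern below: the results are identical, but each
"in" test on a list scans the list, while on a set it is a hash lookup. The
file names are made up and no timing is claimed for this toy example.

```python
def count_hits(paths, candidates):
    """Count how many paths are contained in candidates."""
    return sum(1 for p in paths if p in candidates)


modified = ["dir/file%d" % i for i in range(1000)]
modifiedset = set(modified)  # built once, before the loop
queries = ["dir/file%d" % i for i in range(0, 2000, 2)]

hits_with_list = count_hits(queries, modified)
hits_with_set = count_hits(queries, modifiedset)
```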
2014-12-23 10:41:45 -08:00
Matt Mackall
8b83dfc480 pathauditor: check for Windows shortname aliases
2014-12-18 14:18:28 -06:00
Augie Fackler
13206234f4 pathauditor: check for codepoints ignored on OS X
2014-12-16 13:08:17 -05:00
Augie Fackler
c1fe89f64f darwin: omit ignorable codepoints when normcase()ing a file path
This lets us avoid some nasty case collision problems in OS X with
invisible codepoints.
2014-12-16 13:07:10 -05:00
Augie Fackler
3c9e7fcc66 encoding: add hfsignoreclean to clean out HFS-ignored characters
According to Apple Technote 1150 (unavailable from Apple as far as I
can tell, but archived in several places online), HFS+ ignores sixteen
specific unicode runes when doing path normalization. We need to
handle those cases, so this function lets us efficiently strip the
offending characters from a UTF-8 encoded string (which is the only
way it seems to matter on OS X.)
2014-12-16 13:06:41 -05:00
Augie Fackler
1a7f98d9da manifest: disallow setting the node id of an entry to None
manifest.diff() uses None as a special value to denote the absence of
a file, so setting a file node to None means you then can't trust
manifest.diff().

This should also make future manifest work slightly easier.
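
The ambiguity is easy to reproduce with a toy diff over plain dictionaries.
Both "toy_diff" and "ToyManifest" are hypothetical illustrations, not the
manifest implementation.

```python
def toy_diff(m1, m2):
    """Map each differing file to (node in m1, node in m2); None means absent."""
    result = {}
    for f in set(m1) | set(m2):
        n1, n2 = m1.get(f), m2.get(f)
        if n1 != n2:
            result[f] = (n1, n2)
    return result


class ToyManifest(dict):
    def __setitem__(self, f, node):
        if node is None:
            raise ValueError("node id must not be None")
        super().__setitem__(f, node)


# A file stored with node None looks exactly like a missing file, so the
# diff silently reports no difference.
ambiguous = toy_diff({}, {"f": None})

strict = ToyManifest()
try:
    strict["f"] = None
    rejected = False
except ValueError:
    rejected = True
```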
2014-12-12 13:40:44 -05:00
Augie Fackler
dbb0b3f2fe context: stop setting None for modified or added nodes
Instead use a magic value, so that we can identify modified or added
nodes correctly when using manifest.diff().

Thanks to Martin von Zweigbergk for catching that we have to update
_buildstatus as well. That part eluded my debugging for some time.
2014-12-12 15:29:54 -05:00
Durham Goode
fd373c16b0 log: fix log revset instability
The log/graphlog revset was not producing stable results since it was
iterating over a dict. Now we sort before iterating to guarantee a fixed order.

This fixes some potential flakiness in the tests.
2014-12-08 15:41:54 -08:00
Durham Goode
74a5156e21 log: fix log -f slow path to actually follow history
The revset created when -f was used with a slow path (for patterns and
directories) did not actually contain any logic to enforce follow. Instead it
was depending on the passed in subset to already be limited (which was limited
to :. but not ::.). This fixes it by adding a '& ::.' to any -f log revset.
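
The difference between ':.' and '::.' can be shown on a toy graph. The
revision numbers, the 'parents' mapping and the set of pattern matches below
are made up for illustration.

```python
def ancestors(parents, rev):
    """Return the set of ancestors of rev, including rev itself."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


# 1 and 2 are both children of 0; 3 is a child of 1 and is the working
# directory parent ('.').
parents = {0: [], 1: [0], 2: [0], 3: [1]}
dot = 3
matches = {1, 2, 3}  # revisions matched by the pattern

up_to_dot = set(range(dot + 1))          # like ':.'  (by revision number)
ancestors_of_dot = ancestors(parents, dot)  # like '::.' (by ancestry)

before_fix = matches & up_to_dot
after_fix = matches & ancestors_of_dot
```

Revision 2 sits on another branch: it is numerically below '.', so ':.' lets
it through, but it is not an ancestor, so '& ::.' removes it.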

hg log -f <file> is still broken, in that it can return results that aren't
actually ancestors of the current file, but fixing that has major perf
implications, so we'll deal with it later.
2014-12-05 14:27:32 -08:00
Martin von Zweigbergk
e5d2c961b1 merge: move checking of unknown files out of manifestmerge()
This moves most reading of filelogs out of manifestmerge, making it
easy for a narrow clone extension to filter out or translate unwanted
actions before any filelogs are read. The only call left is inside of
copies.mergecopies(), which can be overridden separately at a lower
level.
2014-12-18 09:22:09 -08:00
Martin von Zweigbergk
9855f121f2 merge: extract method for checking for conflicting untracked file
Now that the functionality is collected in one place, let's extract it
to a method.
2014-12-13 23:52:22 -08:00
Martin von Zweigbergk
8c5b45ee44 merge: create 'cm' action for 'get or merge' case
We still have one case of a call to _checkunknownfile() in
manifestmerge(): when force=True and branchmerge=True and the remote
side has a file that the local side doesn't. This combination of
arguments is used by 'hg merge --force', but also by rebase and
unshelve. In this scenario, we try to create the file from the
contents from the remote, but if there is already a local untracked
file in place, we merge it instead.
2014-12-15 16:45:19 -08:00
Martin von Zweigbergk
a78e57ed87 merge: don't overwrite untracked file at directory rename target
When a directory was renamed and a new untracked file was added in the
new directory and the remote directory added a file by the same name
in the old directory, the local untracked file gets overwritten, as
demonstrated by the broken test case in test-rename-dir-merge.

Fix by checking for unknown files for 'dg' actions too. Since
_checkunknownfile() currently expects the same filename in both
contexts, we need to add a new parameter for the remote filename to
it.
2014-12-12 23:18:36 -08:00
Martin von Zweigbergk
a397b486a4 merge: remove constant tuple element from 'aborts'
The second element of the tuples in the 'aborts' list is always 'ud',
so let's remove it.
2014-11-18 20:29:25 -08:00
Martin von Zweigbergk
e31b8e4514 merge: collect checking for unknown files at end of manifestmerge()
The 'c' and 'dc' actions include creating a file on disk and we need
to check that no conflicting file exists unless force=True. Move two
of the calls to _checkunknownfile() to a single place at the end of
manifestmerge(). This removes some of the reading of filelogs from the
heart of manifestmerge() and collects it in one place close to where
its output (entries in the 'aborts' list) is used.

Note that this removes the unnecessary call to _checkunknownfile()
when force=True in one of the code paths.
2014-11-19 11:51:31 -08:00
Martin von Zweigbergk
05f68a2661 merge: introduce 'c' action like 'g', but with additional safety
_checkunknownfile() reads the filelog of the remote side's file. For
narrow clones, the filelog will not exist for all files and we need a
way to avoid reading them. While it would be easier for the narrow
extension to just override _checkunknownfile() and make it ignore
files outside the narrow clone, it seems cleaner to have
manifestmerge() not care about filelogs (considering its
name).

In order to move the calls to _checkunknownfile() out, we need to be
able to tell in which cases we should check for unknown files. Let's
start by introducing a new action distinct from 'g' for this
purpose. Specifically, the new action will be just like 'g' except
that it will check for conflicting unknown files first. For now,
just add the new action type and convert it to 'g'.
2014-11-19 11:48:30 -08:00
Martin von Zweigbergk
589fe422b9 merge: structure 'remote created' code to match table
This does duplicate the call to _checkunknownfile(), but it will
simplify future patches.
2014-11-19 11:44:00 -08:00
Pierre-Yves David
fe9585a642 pushkey: run hook after the lock release
The pushkey operation used to be in its own wireprotocol command and (in
practice) was always lock free when running the hook. With bundle2, it happens
as part of a larger operation, and a hook running a locking command would get
stuck. We now run such hooks after the lock release, as similar hooks do.

Bundle2 tests are altered to ensure we are lock free at hook running time.
2014-12-22 15:48:39 -08:00
Siddharth Agarwal
94be18d96c archive: store number of changes since latest tag as well
This is different from latesttagdistance in that while latesttagdistance is
defined to be the length of the longest path to the latest tag,
changessincelatesttag is the number of changes contained in @ that aren't
contained in the latest tag. So, if 't' is the latest tag in the repository
below:

      t
      |
      v
 --o--o----o
    \       \
     ..o..o..@

then latesttagdistance is 2, but changessincelatesttag is 4.

Note that changessincelatesttag is always greater than or equal to the
latesttagdistance -- that's because changessincelatesttag counts all the
changes in the longest path since the latest tag, and possibly others. This is
an important fact that we'll take advantage of in upcoming patches.
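
Both numbers can be recomputed for the graph above. The node names are made
up, and the sketch assumes that the lower branch forks from the first
changeset and contains exactly the two changesets drawn.

```python
# child -> parents; 't' is the tagged changeset, '@' the merge.
parents = {
    "a": [],
    "t": ["a"],
    "c": ["t"],
    "d": ["a"],
    "e": ["d"],
    "@": ["c", "e"],
}


def ancestors(parents, node):
    """Return the set of ancestors of node, including node itself."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def longest_path(parents, src, dst):
    """Length in edges of the longest path from src to dst, or None."""
    if dst == src:
        return 0
    best = None
    for p in parents[dst]:
        d = longest_path(parents, src, p)
        if d is not None and (best is None or d + 1 > best):
            best = d + 1
    return best


tag_distance = longest_path(parents, "t", "@")
changes_since_tag = len(ancestors(parents, "@") - ancestors(parents, "t"))
```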
2014-12-12 15:27:13 -08:00
Matt Mackall
25fddcdd96 merge with stable
2014-12-22 17:26:21 -06:00
Martin von Zweigbergk
b74979cf01 merge: make calculateupdates() return file->action dict
This simplifies largefiles' overridecalculateupdates(), which no
longer has to do the conversion it started doing in 478d610ca1b0
(largefiles: rewrite merge code using dictionary with entry per file,
2014-12-09).

To keep this patch small, we'll leave the name 'actionbyfile' in
overrides.py. It will be renamed in the next patch.
2014-12-11 22:07:41 -08:00
Martin von Zweigbergk
00bed96a72 merge: let _forgetremoved() work on the file->action dict
By moving the conversion from the file->action dict after
_forgetremoved(), we make that method shorter by removing the need for
the confusing 'xactions' variable.
2014-12-11 21:58:49 -08:00
Martin von Zweigbergk
bb82d70631 merge: let _resolvetrivial() work on the file->action dict
By moving the conversion from the file->action dict after
_resolvetrivial(), we greatly simplify and clarify that method.
2014-12-11 21:06:16 -08:00
Martin von Zweigbergk
b0011a70aa merge: let bid merge work on the file->action dict
By moving the conversion from the file->action dict after the bid
merge code, bid merge can be simplified a little.

A few tests are affected by this change. Where we used to iterate over
the actions first in order of the action type ('g', 'r', etc.) [1], we
now iterate in order of filename. This difference affects the order of
debug log statements.

 [1] And then in the non-deterministic order of files in the manifest
     dictionary (the order returned from manifest.diff()).
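
The two layouts and the resulting iteration orders can be sketched as
follows. The data shapes are simplified assumptions (the real action entries
carry more fields), and the file names are made up.

```python
def to_file_actions(actions_by_type):
    """Convert {type: [(file, args), ...]} into {file: (type, args)}."""
    return {
        f: (kind, args)
        for kind, entries in actions_by_type.items()
        for f, args in entries
    }


def to_actions_by_type(file_actions):
    """Convert {file: (type, args)} back into {type: [(file, args), ...]}."""
    by_type = {}
    for f, (kind, args) in sorted(file_actions.items()):
        by_type.setdefault(kind, []).append((f, args))
    return by_type


actions_by_type = {
    "g": [("b.txt", ()), ("z.txt", ())],
    "r": [("a.txt", ())],
}
file_actions = to_file_actions(actions_by_type)

# Old order: grouped by action type.  New order: sorted by file name.
order_by_type = [f for kind in ("g", "r") for f, _ in actions_by_type[kind]]
order_by_file = sorted(file_actions)
```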
2014-12-11 20:56:53 -08:00