Commit Graph

735 Commits

Author SHA1 Message Date
Gregory Szorc
91a364e88d context: use changelogrevision
Upcoming patches will make the changelogrevision object perform
lazy parsing. Let's switch to it.

Because we're switching from a tuple to an object, everthing that
accesses the internal cached attribute needs to be updated to access
via attributes. A nice side-effect is this makes the code easier to
read!

Surprisingly, this appears to make revsets accessing this data
slightly faster (values are before series, p1, this patch):

author(mpm)
0.896565
0.929984
0.914234

desc(bug)
0.887169
0.935642
0.921073

date(2015)
0.878797
0.908094
0.891980

extra(rebase_source)
0.865446
0.922624
0.912514

author(mpm) or author(greg)
1.801832
1.902112
1.860402

author(mpm) or desc(bug)
1.812438
1.860977
1.844850

date(2015) or branch(default)
0.968276
1.005824
0.994673

author(mpm) or desc(bug) or date(2015) or extra(rebase_source)
3.656193
3.743381
3.721032
2016-03-06 13:26:37 -08:00
Matt Mackall
75ceec976e changelog: backed out changeset 4ef1c9b76e22 2016-03-02 16:05:30 -06:00
Matt Mackall
b4737e5b27 changelog: backed out changeset 9f92d143bdd2
We want to avoid leaking UTF-8 to main body of code wherever possible.
2016-03-02 12:46:54 -06:00
Gregory Szorc
ab7f7f6a19 changelog: lazy decode user (API)
This appears to show a similar speedup as the previous patch.
2016-02-27 22:34:18 -08:00
Gregory Szorc
6f651cb96b changelog: lazy decode description (API)
Currently, changelog reading decodes read values. This is wasteful
because a lot of times consumers aren't interested in some of these
values.

This patch changes description decoding to occur in changectx as
needed.

revsets reading changelog entries appear to speed up slightly:

revset #7: author(lmoscovicz)
   plain
0) 0.906329
1) 0.872653

revset #8: author(mpm)
   plain
0) 0.903478
1) 0.878037

revset #9: author(lmoscovicz) or author(mpm)
   plain
0) 1.817855
1) 1.778680

revset #10: author(mpm) or author(lmoscovicz)
   plain
0) 1.837052
1) 1.764568
2016-02-27 22:25:14 -08:00
Durham Goode
a6b3658fd5 filectx: replace use of _filerev with _filenode
_filerev depends on the filelog implementation using revlogs and linkrevs.
Alternative implementations, like remotefilelog, do not have rev numbers, so
this call fails. Replacing it with _filenode means it doesn't rely on rev
numbers, and doesn't cost anything extra, since _filerev is using _filenode
under the hood anyway.
2016-02-08 14:17:11 -08:00
Martin von Zweigbergk
c84bb33a89 match: rename "narrowmatcher" to "subdirmatcher" (API)
I keep mistaking "narrowmatcher" for narrowhg's
narrowmatcher. "subdirmatcher" seems more to the point anyway.
2016-02-05 21:09:32 -08:00
Durham Goode
0660375798 memctx: fix memctx manifest file hashes
When memctx is asked for a manifest, it constructs one by merging the p1
manifest, and the changes that are on top. For the changes on top, it was
previously using p1.node() as the file entries parent, which actually returns
the commit node that the p1 linkrev points at! Which is entirely incorrect.

The fix is to use p1.filenode() instead, which returns the parent file node as
desired.

I don't know how to execute this or make it have a visible effect, so I'm not
sure how to test it.  It was noticed because asking for the linkrev is an
expensive operation when using the remotefilelog extension and this was causing
performance regressions with commit.
2016-02-03 17:44:11 -08:00
Martin von Zweigbergk
c04f1844f0 context: back out sneaky code change in documentation change
In a4119550f1e1 (context: clarify why we don't compare file contents
when nodeid differs, 2016-01-12), I also changed "node2 != _newnode"
into "self.rev() is not None". I don't remember why. They are similar,
but the former also catches the case where the file is clean in the
dirstate (so node2 is not _newnode), but different from the "other"
context. This resulted in unnecessary file content comparison a few
lines further down in the code. Let's just back out the code change.

Thanks to Durham Goode for spotting this.
2016-01-25 15:48:35 -08:00
Gregory Szorc
fbab5f0c4c context: don't use util.cachefunc due to cycle creation (issue5043)
util.cachefunc stores all arguments as the cache key. For filectxfn
functions, the arguments include the memctx instance. This creates a
cycle where memctx._filectxfn references self. This causes a memory
leak.

We break the cycle by implementing our own memoizing function that
only uses the path as the cache key. Since each memctx has its own
cache instance, there is no concern about invalid cache hits.
2016-01-17 12:10:30 -08:00
Bryan O'Sullivan
dc202a7844 with: use context manager for wlock in checklookup 2016-01-15 13:14:46 -08:00
Bryan O'Sullivan
c12335032a with: use context manager for wlock in copy 2016-01-15 13:14:46 -08:00
Bryan O'Sullivan
0d3b3841ab with: use context manager for wlock in workingctx.undelete 2016-01-15 13:14:46 -08:00
Bryan O'Sullivan
e93769dccc with: use context manager for wlock in workingctx.forget 2016-01-15 13:14:46 -08:00
Bryan O'Sullivan
521cb71d13 with: use context manager for wlock in workingctx.add 2016-01-15 13:14:46 -08:00
Martin von Zweigbergk
9df2a9ac77 context: check for differing flags a little earlier
This makes it clearer that a unchanged file whose flags have changed
will be reported as a modification. Also test this.
2016-01-12 13:10:31 -08:00
Martin von Zweigbergk
a5a17ed71f context: clarify why we don't compare file contents when nodeid differs
See previous commit for timing information.
2016-01-12 13:09:54 -08:00
Martin von Zweigbergk
a883bc954a status: back out changeset 7e679fd51132
This backs out 7e679fd51132 (status: change + back out == clean (API),
2016-01-04). Although correct, it turned out that it was just too
slow. For example, 'hg status --rev .~1000 --rev .' on the Mozilla
repo went from <1s to >30s on cold disk. So we go back to reporting
reverted changes as modified instead of clean. These are rare anyway,
as suggested by the fact that it had been broken since before
Mercurial 2.0.
2016-01-12 12:43:36 -08:00
Martin von Zweigbergk
452bda4582 status: change + back out == clean (API)
After backing out a change, so the file contents is equal to a
previous revision of itself, we currently report the status between
the two equal revisions as modified. This is because
context._buildstatus() reports any file whose new nodeid is not equal
to _newnode as modified. That magic nodeid is given only to files
added or modified in the working directory, so any file whose nodeid
has changed between two revisions will be reported as modified.

Fix by simply comparing the file contents for all cases where the
nodeid changed, whether they are in the working copy or committed.

Marking with (API) as it subtly changes the semantics of the method.
2016-01-04 10:13:29 -08:00
Martin von Zweigbergk
aba61f81ad status: revert + flag-change == modified
After just changing the flag on a file, plain 'hg status' will report
the file as modified. However, after reverting a file to a previous
revision's state and changing the flag, it will be reported as clean.

Fix by comparing the flags that were previously ignored in
context._buildstatus().
2016-01-04 09:44:58 -08:00
timeless
ebb1d48658 cleanup: remove superfluous space after space after equals (python) 2015-12-31 08:16:59 +00:00
Gregory Szorc
a12b0c85e9 context: use absolute_import 2015-12-21 21:51:31 -08:00
Pierre-Yves David
756a483011 context: use a the nofsauditor when matching file in history (issue4749)
Before this change, asking for file from history (eg: 'hg cat -r 42 foo/bar')
could fail because of the current content of the working copy (eg: current
"foo" being a symlink). As the working copy state have no influence on the
content of the history, we can safely skip these checks.

The working copy context class have a different 'match'
implementation. That implementation still use the repo.auditor will
still catch symlink traversal.

I've audited all stuff calling "match" and they all go through a ctx
in a sensible way. The most unclear case was diff which still seemed
okay. You raised my paranoid level today and I double checked through
tests. They behave properly.

The odds of someone using the wrong (matching with a changectx for
operation that will eventually touch the file system) is non-zero
because you are never sure of what people will do. But I dunno if we
can fight against that. So I would not commit to "never" for "at this
level" and "in the future" if someone write especially bad code.

However, as a last defense, the vfs itself is running path auditor in
all cases outside of .hg/. So I think anything passing the 'matcher'
for buggy reason would growl at the vfs layer.
2015-12-03 13:23:46 -08:00
Andrew Zwicky
2b7b071ce1 extdiff: correctly handle deleted subrepositories (issue3153)
Previously, when extdiff was called on two changesets where
a subrepository had been removed, an unexpected KeyError would
be raised.

Now, the missing subrepository will be ignored.  This behavior
mirrors the behavior in diffordiffstat from cmdutil.py line
~1138-1153.  The KeyError is caught and the revision is
set to None.

try/catch of LookupError around matchmod.narrowmatcher and
sub.status is removed, as LookupError is not raised anywhere
within those methods or deeper calls.
2015-11-17 16:42:52 -06:00
Gregory Szorc
79473d891d context: avoid extra parents lookups
Resolving parents requires reading from the changelog, which is a few
attributes and function calls away. Parents lookup occurs surprisingly
often. Micro optimizing the code to avoid redundant lookups of parents
appears to make `hg log` on my Firefox repo a little faster:

before: 24.91s
after:  23.76s
delta:  -1.15s (95.4% of original)
2015-11-21 19:21:01 -08:00
Gregory Szorc
b65e310257 context: optimize _parents()
This patch avoids some extra attribute lookups and list mutations.

This micro-optimization seems to result in a minor speedup for `hg log`
on my Firefox repo:

before: 25.35s
after:  24.91s
delta:  -0.44s (98% of original)

Not the biggest gain. But every little bit helps.
2015-11-21 19:04:12 -08:00
Matt Mackall
ab46ec3b99 util: drop statmtimesec
We've globablly forced stat to return integer times which agrees with
our extension code, so this is no longer needed.

This speeds up status on mozilla-central substantially:

$ hg perfstatus
! wall 0.190179 comb 0.180000 user 0.120000 sys 0.060000 (best of 53)
$ hg perfstatus
! wall 0.275729 comb 0.270000 user 0.210000 sys 0.060000 (best of 36)
2015-11-19 13:15:17 -06:00
Siddharth Agarwal
8529661292 filectx: add isabsent method
This will indicate whether this filectx represents a file that is *not* in a
changectx. This will be used by merge and filemerge code to know about when a
conflict is a change/delete conflict.

While this is kind of hacky, it is the least bad of all the alternatives. Other
options considered but rejected include:
- isinstance(fctx, ...) -- not very Pythonic, doesn't support duck typing
- fctx.size() is None -- the 'size()' call on workingfilectxes causes a disk stat
- fctx.filenode() == nullid -- the semantics around filenode are incredibly
  confusing. In particular, for workingfilectxes, filenode() is always None no
  matter whether the file is present on disk or in either parent. Having different
  behavior for None versus nullid in the merge code is just asking for pain.

Thanks to Pierre-Yves David for early review feedback here.
2015-11-16 11:27:27 -08:00
Siddharth Agarwal
6a83f7850c filectx: allow custom comparators
We're going to introduce other sorts of filectxes very soon, and we'd like the
cmp method to function properly (i.e. commutatively) for them. The only way to
make that happen is for this cmp method to call into that specialized one if
that defines a custom comparator.
2015-11-13 22:37:51 -08:00
FUJIWARA Katsunori
106983607a dirstate: make dirstate.write() callers pass transaction object to it
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.

This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.

'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:

  - just before starting transaction
  - at closing transaction, or
  - at unlocking wlock
2015-10-17 01:15:34 +09:00
Mads Kiilerich
90c21b3c76 context: don't hex encode all unknown 20 char revision specs (issue4890)
af5de4d23fd4 introduced nice hexified display of missing nodes. It did however
also make missing 20 character revision specifications be shown as hex - very
confusing.

Users are often wrong and somehow specify revisions that don't exist. Nodes
will however rarely be missing ... and they will only look like a user provided
revision specification and be all ascii in 1 of 4*10**9.

With this change, missing revisions will only be hexified if they really look
like binary nodes. This change will thus improve the error reporting UI in the
common case and only very rarely make it confusing in the opposite direction of
how it was before.
2015-10-09 01:19:37 +02:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Yuya Nishihara
ea5724ad42 util: extract stub function to get mtime with second accuracy
This function is trivial but will need a long comment why it can't use
st.st_mtime. See the next patch for details.
2015-10-04 22:25:29 +09:00
Matt Mackall
1f2f7de9a3 merge: make merge.preferancestor type and default consistent
(and mark it)
2015-06-25 17:54:55 -05:00
FUJIWARA Katsunori
1dcc27a649 context: write dirstate out explicitly at the end of markcommitted
To detect change of a file without redundant comparison of file
content, dirstate recognizes a file as certainly clean, if:

  (1) it is already known as "normal",
  (2) dirstate entry for it has valid (= not "-1") timestamp, and
  (3) mode, size and timestamp of it on the filesystem are as same as
      ones expected in dirstate

This works as expected in many cases, but doesn't in the corner case
that changing a file keeps mode, size and timestamp of it on the
filesystem.

The timetable below shows steps in one of typical such situations:

  ---- ----------------------------------- ----------------
                                           timestamp of "f"
                                           ----------------
                                           dirstate   file-
  time          action                     mem  file  system
  ---- ----------------------------------- ---- ----- -----
   *                                             ***    ***
       - 'hg transplant REV1 REV2 ...'
         - transplanting REV1
           ....
   N
           - change "f", but keep size                   N
             (via 'patch.patch()')
           - 'dirstate.normal("f")'          N   ***
             (via 'repo.commit()')

         - transplanting REV2
           - change "f", but keep size                   N
             (via 'patch.patch()')
           - aborted while patching
  N+1
         - release wlock
           - 'dirstate.write()'              N    N      N

       - 'hg status' shows "r1" as "clean"   N    N      N
  ---- ----------------------------------- ---- ----- -----

The most important point is that 'dirstate.write()' is executed at N+1
or later. This causes writing dirstate timestamp N of "f" out
successfully. If it is executed at N, 'parsers.pack_dirstate()'
replaces timestamp N with "-1" before actual writing dirstate out.

This issue can occur when 'hg transplant' satisfies conditions below:

  - multiple revisions to be transplanted change the same file
  - those revisions don't change mode and size of the file, and
  - the 2nd or later revision of them fails after changing the file

The root cause of this issue is that files are changed without
flushing in-memory dirstate changes via 'repo.commit()' (even though
omitting 'dirstate.normallookup()' on files changed by 'patch.patch()'
for efficiency also causes this issue).

To detect changes of files correctly, this patch writes in-memory
dirstate changes out explicitly after marking files as clean in
'committablectx.markcommitted()', which is invoked via
'repo.commit()'.

After this change, timetable is changed as below:

  ---- ----------------------------------- ----------------
                                           timestamp of "f"
                                           ----------------
                                           dirstate   file-
  time          action                     mem  file  system
  ---- ----------------------------------- ---- ----- -----
   *                                             ***    ***
       - 'hg transplant REV1 REV2 ...'
         - transplanting REV1
           ....
   N
           - change "f", but keep size                   N
             (via 'patch.patch()')
           - 'dirstate.normal("f")'          N    ***
             (via 'repo.commit()')
       ----------------------------------- ---- ----- -----
           - 'dirsttate.write()'            -1    -1
       ----------------------------------- ---- ----- -----
         - transplanting REV2
           - change "f", but keep size                   N
             (via 'patch.patch()')
           - aborted while patching
  N+1
         - release wlock
           - 'dirstate.write()'              -1   -1     N

       - 'hg status' shows "r1" as "clean"   -1   -1     N
  ---- ----------------------------------- ---- ----- -----

To reproduce this issue in tests certainly, this patch emulates some
timing critical actions as below:

  - change "f" at N

    'patch.patch()' with 'fakepatchtime.py' explicitly changes mtime
    of patched files to "2000-01-01 00:00" (= N).

  - 'dirstate.write()' via 'repo.commit()' at N

    'fakedirstatewritetime.py' forces 'pack_dirstate()' to use
    "2000-01-01 00:00" as "now", only if 'pack_dirstate()' is invoked
    via 'committablectx.markcommitted()'.

  - 'dirstate.write()' via releasing wlock at N+1 (or "not at N")

    'pack_dirstate()' via releasing wlock uses actual timestamp at
    runtime as "now", and it should be different from the "2000-01-01
    00:00" of "f".

BTW, this patch doesn't test cases below, even though 'patch.patch()'
is used similarly in these cases:

  1. failure of 'hg import' or 'hg qpush'

  2. success of 'hg import', 'hg qpush' or 'hg transplant'

Case (1) above doesn't cause this kind of issue, because:

  - if patching is aborted by conflicts, changed files are committed

    changed files are marked as CLEAN, even though they are partially
    patched.

  - otherwise, dirstate are fully restored by 'dirstateguard'

    For example in timetable above, timestamp of "f" in .hg/dirstate
    is restored to -1 (or less than N), and subsequent 'hg status' can
    detect changes correctly.

Case (2) always causes 'repo.status()' invocation via 'repo.commit()'
just after changing files inside same wlock scope.

  ---- ----------------------------------- ----------------
                                           timestamp of "f"
                                           ----------------
                                           dirstate   file-
  time          action                     mem  file  system
  ---- ----------------------------------- ---- ----- -----
  N                                              ***   ***
       - make file "f" clean                            N

       - execute 'hg foobar'
         ....
         - 'dirstate.normal("f")'           N    ***
           (e.g. via dirty check
            or previous 'repo.commit()')

         - change "f", but keep size                    N
         - 'repo.status()' (*1)
           (via 'repo.commit()')
  ---- ----------------------------------- ---- ----- -----

At a glance, 'repo.status()' at (*1) seems to cause similar issue (=
"changed files are treated as clean"), but actually doesn't.

'dirstate._lastnormaltime' should be N at (*1) above, because
'dirstate.normal()' via dirty check is finished at N.

Therefore, "f" changed at N (= 'dirstate._lastnormaltime') is forcibly
treated as "unsure" at (*1), and changes are detected as expected (see
'dirstate.status()' for detail).

If 'hg import' is executed with '--no-commit', 'repo.status()' isn't
invoked just after changing files inside same wlock scope.

But preceding 'dirstate.normal()' is invoked inside another wlock
scope via 'cmdutil.bailifchanged()', and in-memory changes should be
flushed at the end of that scope.

Therefore, timestamp N of clean "f" should be replaced by -1, if
'dirstate.write()' is invoked at N. It means that condition of this
issue isn't satisfied.
2015-07-08 17:01:09 +09:00
FUJIWARA Katsunori
51754ba82b context: write dirstate out explicitly after marking files as clean
To detect change of a file without redundant comparison of file
content, dirstate recognizes a file as certainly clean, if:

  (1) it is already known as "normal",
  (2) dirstate entry for it has valid (= not "-1") timestamp, and
  (3) mode, size and timestamp of it on the filesystem are as same as
      ones expected in dirstate

This works as expected in many cases, but doesn't in the corner case
that changing a file keeps mode, size and timestamp of it on the
filesystem.

The timetable below shows steps in one of typical such situations:

  ---- ----------------------------------- ----------------
                                           timestamp of "f"
                                           ----------------
                                           dirstate   file-
  time          action                     mem  file  system
  ---- ----------------------------------- ---- ----- -----
  N                                              -1    ***
       - make file "f" clean                            N

       - execute 'hg foobar'
         - instantiate 'dirstate'           -1   -1
         - 'dirstate.normal("f")'           N    -1
           (e.g. via dirty check)
         - change "f", but keep size                    N
  N+1
         - release wlock
           - 'dirstate.write()'             N    N

       - 'hg status' shows "f" as "clean"   N    N      N
  ---- ----------------------------------- ---- ----- -----

The most important point is that 'dirstate.write()' is executed at N+1
or later. This causes writing dirstate timestamp N of "f" out
successfully. If it is executed at N, 'parsers.pack_dirstate()'
replaces timestamp N with "-1" before actual writing dirstate out.

Occasional test failure for unexpected file status is typical example
of this corner case. Batch execution with small working directory is
finished in no time, and rarely satisfies condition (2) above.

This issue can occur in cases below;

  - 'hg revert --rev REV' for revisions other than the parent
  - failure of 'merge.update()' before 'merge.recordupdates()'

The root cause of this issue is that files are changed without
flushing in-memory dirstate changes via 'repo.commit()' (even though
omitting 'dirstate.normallookup()' on changed files also causes this
issue).

To detect changes of files correctly, this patch writes in-memory
dirstate changes out explicitly after marking files as clean in
'workingctx._checklookup()', which is invoked via 'repo.status()'.

After this change, timetable is changed as below:

  ---- ----------------------------------- ----------------
                                           timestamp of "f"
                                           ----------------
                                           dirstate   file-
  time          action                     mem  file  system
  ---- ----------------------------------- ---- ----- -----
  N                                              -1    ***
       - make file "f" clean                            N

       - execute 'hg foobar'
         - instantiate 'dirstate'           -1   -1
         - 'dirstate.normal("f")'           N    -1
           (e.g. via dirty check)
       ----------------------------------- ---- ----- -----
         - 'dirsttate.write()'              -1   -1
       ----------------------------------- ---- ----- -----
         - change "f", but keep size                    N
  N+1
         - release wlock
           - 'dirstate.write()'             -1   -1

       - 'hg status'                        -1   -1      N
  ---- ----------------------------------- ---- ----- -----

To reproduce this issue in tests certainly, this patch emulates some
timing critical actions as below:

  - timestamp of "f" in '.hg/dirstate' is -1 at the beginning

    'hg debugrebuildstate' before command invocation ensures it.

  - make file "f" clean at N
  - change "f" at N

    'touch -t 200001010000' before and after command invocation
    changes mtime of "f" to "2000-01-01 00:00" (= N).

  - invoke 'dirstate.write()' via 'repo.status()' at N

    'fakedirstatewritetime.py' forces 'pack_dirstate()' to use
    "2000-01-01 00:00" as "now", only if 'pack_dirstate()' is invoked
    via 'workingctx._checklookup()'.

  - invoke 'dirstate.write()' via releasing wlock at N+1 (or "not at N")

    'pack_dirstate()' via releasing wlock uses actual timestamp at
    runtime as "now", and it should be different from the "2000-01-01
    00:00" of "f".

BTW, this patch also changes 'test-largefiles-misc.t', because adding
'dirstate.write()' makes recent dirstate changes visible to external
process.
2015-07-08 17:01:09 +09:00
Yuya Nishihara
0340e6a83e workingctx: use node.wdirid constant 2015-06-22 22:05:10 +09:00
Matt Harbison
9f8b7aa09e workingctx: don't report the tags for its parents
This fixes the bad distance calculation for '{latesttagdistance}' mentioned in
the previous patch.
2015-06-28 13:38:03 -04:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
2015-06-23 22:20:08 -07:00
Matt Harbison
ba46b0e533 subrepo: allow a representation of the working directory subrepo
Some code cannot handle a subrepo based on the working directory (e.g.
sub.dirty()), so the caller must opt in.  This will be useful for archive, and
perhaps some other commands.  The git and svn methods where this is used may
need to be fixed up on a case by case basis.
2015-06-16 23:03:36 -04:00
Matt Harbison
a1b56b9f59 context: override workingctx.hex() to avoid a crash
Since node is None for workingctx, it can't use the base class
implementation of 'hex(self.node())'.

It doesn't appear that there are any current callers of this, but there will be
when archive supports 'wdir()'.  My first thought was to use "{p1node}+", but
that would cause headaches elsewhere [1].

We should probably fix up localrepository.__getitem__ to accept this hash for
consistency, as a followup.  This works, if the full hash is specified:

  @@ -480,7 +480,7 @@
           return dirstate.dirstate(self.vfs, self.ui, self.root, validate)

       def __getitem__(self, changeid):
  -        if changeid is None:
  +        if changeid is None or changeid == 'ff' * 20:
               return context.workingctx(self)
           if isinstance(changeid, slice):
               return [context.changectx(self, i)

That differs from null, where it will accept any number of 0s, as long as it
isn't ambiguous.


[1] https://www.selenic.com/pipermail/mercurial-devel/2015-June/071166.html
2015-06-14 22:04:17 -04:00
Matt Harbison
f758dd3fee context: add an optional constructor parameter for a match.bad() override
Most matcher creation is done by way of a context.
2015-06-05 19:01:04 -04:00
Matt Harbison
5159a795ed context: replace match.bad() monkey patching with match.badmatch()
No known issues with the previous code since it restored the original method,
but this is cleaner.
2015-06-04 21:37:59 -04:00
Matt Harbison
23a0164ef3 context: introduce the nullsub() method
Ultimately, this will be used by scmutil.  The subrepo module already imports
it, so it can't import the subrepo module to access the underlying method.
2015-06-03 13:51:27 -04:00
Laurent Charignon
9aa6695e8b patch: add 'extra' argument to makememctx
The uncommit command in evolve needs to create memory context with given
extra parameters. This patch allows us to do that instead of always giving them
an empty value and having to override it afterwards.
2015-05-22 13:06:45 -07:00
Matt Mackall
7e1cf5444c merge with stable 2015-05-19 07:17:57 -05:00
Matt Harbison
dc85d51beb context: don't complain about a matcher's subrepo paths in changectx.walk()
Previously, the first added test printed the following:

  $ hg files -S -r '.^' sub1/sub2/folder
  sub1/sub2/folder: no such file in rev 9bb10eebee29
  sub1/sub2/folder: no such file in rev 9bb10eebee29
  sub1/sub2/folder/test.txt

One warning occured each time a subrepo was crossed into.

The second test ensures that the matcher copy stays in place.  Without the copy,
the bad() function becomes an increasingly longer chain, and no message would be
printed out for a file missing in the subrepo because the predicate would match
in one of the replaced methods.  Manifest doesn't know anything about subrepos,
so it needs help ignoring subrepos when complaining about bad files.
2015-05-17 01:06:10 -04:00
Matt Harbison
9311abf92a match: resolve filesets in subrepos for commands given the '-S' argument
This will work for any command that creates its matcher via scmutil.match(), but
only the files command is tested here (both workingctx and basectx based tests).
The previous behavior was to completely ignore the files in the subrepo, even
though -S was given.

My first attempt was to teach context.walk() to optionally recurse, but once
that was in place and the complete file list was built up, the predicate test
would fail with 'path in nested repo' when a file in a subrepo was accessed
through the parent context.

There are two slightly surprising behaviors with this functionality.  First, any
path provided inside the fileset isn't narrowed when it is passed to the
subrepo.  I dont see any clean way to do that in the matcher.  Fortunately, the
'subrepo()' fileset is the only one to take a path.

The second surprise is that status predicates are resolved against the subrepo,
not the parent like 'hg status -S' is.  I don't see any way to fix that either,
given the path auditor error mentioned above.
2015-05-16 00:36:35 -04:00
Yuya Nishihara
d99ad9e9b8 annotate: always adjust linkrev before walking down to parents (issue4623)
This should avoid the bad performance in the following scenario. Before this
patch, on "hg annotate -r10000", p.rev() would walk changelog from 10000 to 3
because _descendantrev was 10000. With this patch, it walks from 5 to 3.

  1 -- 2 -- 4 -- 5 -- ... -- 10000
    \      'p'  'f'
     - 3   (grafted 3 to 4)
      'p'

repo:    https://hg.mozilla.org/releases/mozilla-beta/#4f80fecda802
command: hg annotate -r b0a57152fd14 browser/app/profile/firefox.js
before:  83.120 secs
after:    3.820 secs

This patch involves extra calls of narrow _adjustlinkrev(), but the cost of
them seems relatively small compared to wide _adjustlinkrev() calls eliminated
by this patch.

repo:    http://selenic.com/repo/hg/#d668bc5b9a06
command: hg annotate mercurial/commands.py
before:  7.380 secs
after:   7.320 secs

repo:    https://hg.mozilla.org/mozilla-central/#f214df6ac75f
command: hg annotate layout/generic/nsTextFrame.cpp
before:  5.070 secs
after:   5.050 secs

repo:    https://hg.mozilla.org/releases/mozilla-beta/#4f80fecda802
command: hg annotate -r 4954faa47dd0 gfx/thebes/gfxWindowsPlatform.cpp
before:  1.600 secs
after:   1.620 secs
2015-04-25 15:38:06 +09:00
Yuya Nishihara
0a37922d8d annotate: prepare ancestry context of workingfilectx
_ancestrycontext is necessary for fast lookup of _changeid. Because we can't
compute the ancestors from wctx, we skip to its parents. 'None' is not needed
to be included in _ancestrycontext because it is used for a membership test
of filelog revisions.

repo:    https://hg.mozilla.org/releases/mozilla-beta/#062e49bcb2da
command: hg annotate -r 'wdir()' gfx/thebes/gfxWindowsPlatform.cpp
before:  51.520 sec
after:    1.780 sec
2015-04-18 15:27:03 +09:00
Yuya Nishihara
5b51ae23c8 committablefilectx: propagate ancestry info to parent to fix annotation
Before this patch, annotating working directory could include wrong revisions
that were hidden or belonged to different branches. This fixes wfctx.parents()
to set _descendantrev so that all ancestors can take advantage of the linkrev
adjustment introduced at a5aaaeedd6cb. _adjustlinkrev() can handle 'None'
revision thanks to bb19d597bbcd.
2015-04-18 14:10:55 +09:00
Yuya Nishihara
d32c454372 filectx: extract function to create parent fctx keeping ancestry info
committablefilectx.parents() should use this to take advantage of the linkrev
adjustment.
2015-04-18 14:03:41 +09:00
Yuya Nishihara
921251cb12 filectx: factor out creation of parent fctx
This series tries to fix wrong ancestry information on annotating working
directory. This change should slightly improves the readability of the next
patch.
2015-04-18 13:46:24 +09:00
Matt Harbison
041a91f971 match: add a subclass for dirstate normalizing of the matched patterns
This class is only needed on case insensitive filesystems, and only
for wdir context matches. It allows the user to not match the case of
the items in the filesystem- especially for naming directories, which
dirstate doesn't handle[1]. Making dirstate handle mismatched
directory cases is too expensive[2].

Since dirstate doesn't apply to committed csets, this is only created by
overriding basectx.match() in workingctx, and only on icasefs.  The default
arguments have been dropped, because the ctx must be passed to the matcher in
order to function.

For operations that can apply to both wdir and some other context, this ends up
normalizing the filename to the case as it exists in the filesystem, and using
that case for the lookup in the other context.  See the diff example in the
test.

Previously, given a directory with an inexact case:

  - add worked as expected

  - diff, forget and status would silently ignore the request

  - files would exit with 1

  - commit, revert and remove would fail (even when the commands leading up to
    them worked):

        $ hg ci -m "AbCDef" capsdir1/capsdir
        abort: CapsDir1/CapsDir: no match under directory!

        $ hg revert -r '.^' capsdir1/capsdir
        capsdir1\capsdir: no such file in rev 64dae27060b7

        $ hg remove capsdir1/capsdir
        not removing capsdir1\capsdir: no tracked files
        [1]

Globs are normalized, so that the -I and -X don't need to be specified with a
case match.  Without that, the second last remove (with -X) removes the files,
leaving nothing for the last remove.  However, specifying the files as
'glob:**.Txt' does not work.  Perhaps this requires 're.IGNORECASE'?

There are only a handful of places that create matchers directly, instead of
being routed through the context.match() method.  Some may benefit from changing
over to using ctx.match() as a factory function:

  revset.checkstatus()
  revset.contains()
  revset.filelog()
  revset._matchfiles()
  localrepository._loadfilter()
  ignore.ignore()
  fileset.subrepo()
  filemerge._picktool()
  overrides.addlargefiles()
  lfcommands.lfconvert()
  kwtemplate.__init__()
  eolfile.__init__()
  eolfile.checkrev()
  acl.buildmatch()

Currently, a toplevel subrepo can be named with an inexact case.  However, the
path auditor gets in the way of naming _anything_ in the subrepo if the top
level case doesn't match.  That is trickier to handle, because there's the user
provided case, the case in the filesystem, and the case stored in .hgsub.  This
can be fixed next cycle.

  --- a/tests/test-subrepo-deep-nested-change.t
  +++ b/tests/test-subrepo-deep-nested-change.t
  @@ -170,8 +170,15 @@
     R sub1/sub2/test.txt
     $ hg update -Cq
     $ touch sub1/sub2/folder/bar
  +#if icasefs
  +  $ hg addremove Sub1/sub2
  +  abort: path 'Sub1\sub2' is inside nested repo 'Sub1'
  +  [255]
  +  $ hg -q addremove sub1/sub2
  +#else
     $ hg addremove sub1/sub2
     adding sub1/sub2/folder/bar (glob)
  +#endif
     $ hg status -S
     A sub1/sub2/folder/bar
     ? foo/bar/abc

The narrowmatcher class may need to be tweaked when that is fixed.


[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068183.html
[2] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068191.html
2015-04-12 01:39:21 -04:00
Matt Mackall
4b6771a9f8 linkrev: fix issue with annotate of working copy
The introrev was appearing as None in new annotate tests, which the
code from the stable branch wasn't expecting.
2015-04-16 18:30:08 -05:00
Matt Mackall
9483b9f214 merge with stable 2015-04-16 17:30:01 -05:00
Yuya Nishihara
3af3ee9d07 annotate: always prepare ancestry context of base fctx (issue4600)
This patch extends the workaround introduced by d5844c5f6c7b. Even if the
base fctx is the same as intorrev, _ancestrycontext must be built for faster
_changeid lookup.

repo:    https://hg.mozilla.org/releases/mozilla-beta
command: hg annotate -r 4954faa47dd0 gfx/thebes/gfxWindowsPlatform.cpp
before:  52.450 sec
after:    1.820 sec
2015-04-16 22:33:53 +09:00
Pierre-Yves David
ea1a0fd29f adjustlinkrev: handle 'None' value as source
When the source rev value is 'None', the ctx is a working context. We
cannot compute the ancestors from there so we directly skip to its
parents. This will be necessary to allow 'None' value for
'_descendantrev' itself necessary to make all contexts used in
'mergecopies' reuse the same '_ancestrycontext'.
2015-03-19 23:57:34 -07:00
Pierre-Yves David
7605b9a4cf adjustlinkrev: prepare source revs for ancestry only once
We'll need some more complex initialisation to handle workingfilectx
case. We do this small change in a different patch for clarity.
2015-03-19 23:52:26 -07:00
Pierre-Yves David
78f35efc70 annotate: reuse ancestry context when adjusting linkrev (issue4532)
The linkrev adjustment will likely do the same ancestry walking multiple time
so we already have an optional mechanism to take advantage of this. Since
4e4e9e954fae, linkrev adjustment was done lazily to prevent too bad performance
impact on rename computation. However, this laziness created a quadratic
situation in 'annotate'.

Mercurial repo: hg annotate mercurial/commands.py
before:   8.090
after:  36.300

Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before:   1.190
after:  290.230


So we setup sharing of the ancestry context in the annotate case too. Linkrev
adjustment still have an impact but it a much more sensible one.

Mercurial repo: hg annotate mercurial/commands.py
before:  36.300
after:   10.230

Mozilla repo: hg annotate layout/generic/nsTextFrame.cpp
before: 290.230
after:    5.560
2015-03-19 19:52:23 -07:00
Matt Mackall
b09693fbd8 filectx: use _descendantrev in parents()
This lets us be lazy about linkrev adjustments when tracing history.
2015-02-01 16:33:45 -06:00
Matt Mackall
66f6b10d5f filectx: if we have a _descendantrev, use it to adjust linkrev
This lets us use _adjustlinkrev lazily.
2015-02-01 16:26:35 -06:00
Matt Mackall
9b9eada68d filectx: use linkrev to sort ancestors
We're going to make rev() lazily do _adjustlinkrevs, and we don't want
that to happen when we're quickly tracing through file ancestry
without caring about revs (as we do when finding copies).

This takes us back to pre-linkrev-correction behavior, but shouldn't
regress us relative to the last stable release.
2015-02-01 16:23:07 -06:00
Pierre-Yves David
e94f338ab6 _adjustlinkrev: reuse ancestors set during rename detection (issue4514)
The new linkrev adjustement mechanism makes rename detection very slow, because
each file rewalks the ancestor dag. To mitigate the issue in Mercurial 3.3, we
introduce a simplistic way to share the ancestors computation for the linkrev
validation phase.

We can reuse the ancestors in that case because we do not care about
sub-branching in the ancestors graph.

The cached set will be use to check if the linkrev is valid in the search
context. This is the vast majority of the ancestors usage during copies search
since the uncached one will only be used when linkrev is invalid, which is
hopefully rare.
2015-01-30 16:02:28 +00:00
Pierre-Yves David
a0008f62ee filectx: move _adjustlinkrev to a method
We are going to introduce some wider caching mechanisms during linkrev
adjustment. As there is no specific reason to not be a method and some
reasons to be a method, let's make it a method.
2015-01-30 14:39:03 +00:00
Yuya Nishihara
0301f78f70 committablectx: override manifestnode() to return None
wctx.manifestnode() crashed before because it has no _changeset. Instead of
crashing, just return None like wctx.node().
2015-04-09 22:18:55 +09:00
Drew Gottlieb
ee2eebcb93 manifest: move changectx.walk() to manifests
The logic of walking a manifest to yield files matching a match object is
currently being done by context, not the manifest itself. This moves the walk()
function to both manifestdict and treemanifest. This separate implementation
will also permit differing, optimized implementations for each manifest.
2015-04-07 15:18:52 -07:00
Martin von Zweigbergk
6c7d935363 changectx.walk: drop unnecessary call to match function
If all the files in match.files() are in the context/manifest, we
already know that the matcher will match each file.
2015-04-06 17:03:35 -07:00
Pierre-Yves David
dc490d9ff6 linkrev: use the right manifest content when adjusting linrev (issue4499)
When the manifest revision is stored as a delta against a non-parent revision,
'_adjustlinkrev' could miss some file update because it was using the delta
only. We now use the 'fastread' method that uses the delta only when it makes
sense.

A test showcasing on the of possible issue have been added.
2015-01-14 17:21:09 -08:00
Martin von Zweigbergk
01d503fc7e status: don't override _buildstatus() in workingcommitctx
Now that the caching into _status is done in
workingctx._dirstatestatus(), which workingcommitctx._dirstatestatus()
does not call, there is no caching to prevent in _buildstatus(), so
stop overriding it.
2015-01-08 13:29:06 -08:00
Martin von Zweigbergk
370c0e4b47 status: cache dirstate status in _dirstatestatus()
Since it's only the dirstate status we cache, it makes more sense to
cache it in the _dirstatestatus() method. Note that this change means
the dirstate status will also be cached when status is requested
between the working copy and some other revision, while we currently
only cache the result if exactly the status between the working copy
and its parent is requested.
2015-01-08 13:12:44 -08:00
Durham Goode
d73818aad4 filectx: fix annotate to not directly instantiate filectx
b04f57726c73 changed basefilectx.annotate() to directly instantiate new
filectx's instead of going through self.filectx(), this breaks extensions that
replace the filectx class, and would also break future uses that would need
memfilectx's.
2015-01-09 11:21:29 -08:00
Augie Fackler
b539edc70e context: use new manifest.diff(clean=True) support
This further simplifies the status code.

This simplification comes at a slight performance cost for `hg
export`. Before, on mozilla-central:

perfmanifest tip
! wall 0.265977 comb 0.260000 user 0.240000 sys 0.020000 (best of 38)
perftags
! result: 162
! wall 0.007172 comb 0.010000 user 0.000000 sys 0.010000 (best of 403)
perfstatus
! wall 0.422302 comb 0.420000 user 0.260000 sys 0.160000 (best of 24)
hgperf export tip
! wall 0.148706 comb 0.150000 user 0.150000 sys 0.000000 (best of 65)

after, same repo:
perfmanifest tip
! wall 0.267143 comb 0.270000 user 0.250000 sys 0.020000 (best of 37)
perftags
! result: 162
! wall 0.006943 comb 0.010000 user 0.000000 sys 0.010000 (best of 397)
perfstatus
! wall 0.411198 comb 0.410000 user 0.260000 sys 0.150000 (best of 24)
hgperf export tip
! wall 0.173229 comb 0.170000 user 0.170000 sys 0.000000 (best of 55)

The next set of patches introduces a new manifest type implemented
almost entirely in C, and more than makes up for the performance hit
incurred in this change.
2014-12-15 16:06:04 -05:00
Augie Fackler
509875a2fe context: use manifest.diff() to compute most of status
We can do a little tiny bit better by enhancing manifest.diff to
optionally include files that are in both sides. This will be done in
a followup patch.
2014-12-15 15:33:55 -05:00
Yuya Nishihara
dd57e3c688 committablefilectx: override linkrev() to point to the associated changectx
This is necessary to annotate workingctx revision. basefilectx.linkrev() can't
be used because committablefilectx has no filelog.

committablefilectx looks for parents() from self._changectx. That means fctx
is linked to self._changectx, so linkrev() can simply be aliased to rev().
2015-03-19 23:31:53 +09:00
Matt Mackall
db55434dfb merge with stable 2015-03-20 17:30:38 -05:00
Martin von Zweigbergk
500314e378 context.walk: walk all files when file and '.' given
When both '.' (the working copy root) and an explicit file (or files)
are in match.files(), we only walk the explicitly listed files. This
is because we remove the '.' from the set too early. Move later and
add a test for it. Before this change, the last test would print only
"3".
2015-03-18 11:42:09 -07:00
Martin von Zweigbergk
2be30ae8a2 context.walk: call with util.all() a generator, not a list
The file set can be large, so avoid going through the entire file set
when a file happens not to be in the context.
2015-03-18 09:26:26 -07:00
Matt Harbison
dee58afe9f filectx: add a repo accessor
This is similar to 327902ff25df in motivation.  All contexts now have this
method, so the rest of the 'ctx._repo' uses can be converted without worrying
about what type of context it is.
2015-03-13 20:34:52 -04:00
Drew Gottlieb
c9f0f58f01 manifest: have context use self.hasdir()
A couple places in context currently use "x in self._dirs" to check for the
existence of the directory, but this requires that all directories be loaded
into a dict. Calling hasdir() instead puts the work on the the manifest to
check for the existence of a directory in the most efficient manner.
2015-03-13 15:36:11 -07:00
Drew Gottlieb
1b0e10bb02 manifest: add hasdir() to context
This is a convenience method that calls to its manifest's hasdir(). There are
parts of context that check to see if a directory exists, and this method will
let implementations of manifest provide an optimal way to find a particular
directory.
2015-03-13 15:32:45 -07:00
Drew Gottlieb
788c192400 manifest: have context's dirs() call its manifest's dirs()
This lets the context's dirs() method be agnostic towards any alternate
manifest implementations.
2015-03-13 15:23:02 -07:00
Jordi Gutiérrez Hermoso
8eb132f5ea style: kill ersatz if-else ternary operators
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.

We change instead all of these one-liners to 4-liners. This was
executed with the following perlism:

    find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1    $2 = $4\n$1else:\n$1    $2 = $5,' {} \;

I tweaked the following cases from the automatic Perl output:

    prev = (parents and parents[0]) or nullid
    port = (use_ssl and 443 or 80)
    cwd = (pats and repo.getcwd()) or ''
    rename = fctx and webutil.renamelink(fctx) or []
    ctx = fctx and fctx or ctx
    self.base = (mapfile and os.path.dirname(mapfile)) or ''

I also added some newlines wherever they seemd appropriate for readability

There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
2015-03-13 17:00:06 -04:00
Matt Harbison
c4ec144647 context: add a repo accessor
There are 29 instances of 'ctx._repo' in the code, so make the ability
to access more official.
2015-03-12 22:54:53 -04:00
Augie Fackler
c0d9e7c859 context: don't sort manifest entries
The manifest iterator is now pre-sorted, so we can skip this check.
2014-11-17 00:00:25 -05:00
Durham Goode
bd91d1c63e workingctx: use normal dirs() instead of dirstate.dirs()
The workingctx class was using dirstate.dirs() as it's implementation. The
sparse extension maintains a pruned down version of the dirstate, so this
resulted in the workingctx reporting an incorrect listing of directories
during merge calculations (it was detecting directory renames when it
shouldn't have).

The fix is to use the default implementation, which uses workingctx._manifest,
which unions the manifest with the dirstate to produce the correct overall
picture. This also produces more accurate output since it will no longer
return directories that have been entirely deleted in the dirstate.

Tests will be added to the sparse extension to detect regressions for this.
2015-03-05 22:16:28 -08:00
Mads Kiilerich
b2b60414f6 spelling: fixes from proofreading of spell checker issues 2015-01-18 02:38:57 +01:00
Martin von Zweigbergk
ba5ff3b5e7 context: use unfiltered repo for '.'
There is no reason to read obsolescence markers when doing a plain 'hg
status' without --rev. Use the unfiltered repo when initializing
context._rev to speed things up. This speeds up 'hg status' from
1.342s to 0.080s on my repo with ~110k markers.
2014-11-20 12:15:12 -08:00
Martin von Zweigbergk
c8aa337d29 status: don't list files as both clean and deleted
Tracked files that are deleted should always be reported as such, no
matter what their state was in earlier revisions. This is encoded in
in two conditions in the loop in basectx._buildstatus() for modified
and added files, but the check is missing for clean files. We should
check for clean files too, but instead of adding the check in a third
place, move it earlier and skip most of the loop body for deleted
files.
2015-01-05 17:12:04 -08:00
Martin von Zweigbergk
b1f3f94c3b status: don't list files as both removed and deleted
When calculating status involving the working copy and a revision
other than the parent of the working copy, the files that are not in
the working context manifest ('mf2' in the basectx._buildstatus())
will be reported as removed (note that deleted files _are_ in the
working context manifest). However, if the file is reported as deleted
in the dirstate, it will get that status too (as shown by failing
tests).

Fix by removing deleted files from the 'removed' list after the main
loop in _buildstatus().
2015-01-05 16:52:12 -08:00
FUJIWARA Katsunori
851e75b58e context: override _dirstatestatus in workingcommitctx for correct matching
Before this patch, the result of "status()" on "workingcommitctx" may
incorrectly contain files other than ones to be committed, because
"workingctx._dirstatestatus()" returns the result of
"dirstate.status()" directly.

For correct matching, this patch overrides "_dirstatestatus" in
"workingcommitctx" and makes it return matched files only in
"self._status".

This patch uses empty list for "deleted", "unknown" and "ignored" of
status, because status between "changectx"s also makes them empty.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
7b31cd3a28 context: avoid breaking already fixed self._status at ctx.status()
Before this patch, "status()" on "workingcommitctx" with "always
match" object causes breaking "self._status" in
"workingctx._buildstatus()", because "workingctx._buildstatus()"
caches the result of "dirstate.status()" into "self._status" for
efficiency, even though it should be fixed at construction for
committing.

For example, template function "diff()" without any patterns in
"committemplate" implies "status()" on "workingcommitctx" with "always
match" object, via "basectx.diff()" and "patch.diff()".

Then, broken "self._status" causes committing unexpected files.

To avoid breaking already fixed "self._status" at "ctx.status()", this
patch overrides "_buildstatus" in "workingcommitctx".

This patch doesn't write out the result of template function "diff()"
in "committemplate" in "test-commit.t", because matching against files
to be committed still has an issue fixed in subsequent patch.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
56e025176b context: add workingcommitctx for exact context to be committed
Before this patch, "workingctx" is also used for the context to be
committed. But "workingctx" works incorrectly in some cases.

For example, even when only some of changed files in the working
directory are committed, "status()" on "workingctx" object for
committing recognizes files not to be committed as changed, too.

As the preparation for fixing these issues, this patch chooses adding
new class "workingcommitctx" for exact context to be committed,
because switching by the flag (like "self._fixedstatus" or so) in some
code paths of "workingctx" is less readable and maintenancable.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
c2e92a32b4 context: make unknown/ignored/clean of cached status empty for equivalence
Before this patch, "workingctx.status" caches the result of
"dirstate.status" directly into "self._status".

But "dirstate.status" is invoked with False "list*" arguments in
normal "self._status" accessing route, and this makes
"unknown"/"ignored"/"clean" of status empty.

This may cause unexpected result of code paths internally accessing to
them (accessors for external usage are already removed by previous patch).

This patch makes "unknown"/"ignored"/"clean" of cached status empty
for equivalence. Making them empty is executed only when at least one
of "unknown", "ignored" or "clean" has files, for efficiency.
2014-12-31 17:55:43 +09:00
Pierre-Yves David
451115c9e1 linkrev: also adjust linkrev when bootstrapping annotate (issue4305)
The annotate logic now use the new 'introrev' method to bootstrap its traversal.
This catches issues from linkrev-shadowing of the changeset introducing the
version of a file in source changeset.

More tests have been added to display pathological cases.
2014-12-24 03:26:48 -08:00
Pierre-Yves David
ca713b2bdd linkrev: introduce an 'introrev' method on filectx
The previous changeset properly fixed the ancestors computation, but we need to
ensure that the initial filectx is also using the right changeset.

When asking for log or annotation from a certain point, the first step is to
define the changeset that introduced the current file version. We cannot just
pick the "starting point" changesets as it may just "use" the file revision,
unchanged.

Currently, we were using 'linkrev' for this purpose, but this exposes us to
unexpected branch-jumping when the revision introducing the starting point
version is itself linkrev-shadowed. So we need to take the topology into
account again. Therefore, we introduce an 'introrev' function, returning the
changeset which introduced the file change in the current changeset.

This function will be used to fix linkrev-related issues when bootstrapping 'hg
log --follow' and 'hg annotate'.

It reuses the '_adjustlinkrev' function, extending it to allow introspection of
the initial changeset too. In the previous usage of the '_adjustlinkrev' the
starting rev was always using a children file revisions, so it could be safely
ignored in the search. In this case, the starting point is using the revision
of the file we are looking, and may be the changeset we are looking for.
2014-12-23 16:14:39 -08:00
Pierre-Yves David
3c79d53ced filectx.parents: enforce changeid of parent to be in own changectx ancestors
Because of the way filenodes are computed, you can have multiple changesets
"introducing" the same file revision. For example, in the changeset graph
below, changeset 2 and 3 both change a file -to- and -from- the same content.

  o 3: content = new
  |
  | o 2: content = new
  |/
  o 1: content = old

In such cases, the file revision is create once, when 2 is added, and just reused
for 3. So the file change in '3' (from "old" to "new)" has no linkrev pointing
to it).  We'll call this situation "linkrev-shadowing". As the linkrev is used for
optimization purposes when walking a file history, the linkrev-shadowing
results in an unexpected jump to another branch during such a walk.. This leads to
multiple bugs with log, annotate and rename detection.

One element to fix such bugs is to ensure that walking the file history sticks on
the same topology as the changeset's history. For this purpose, we extend the
logic in 'basefilectx.parents' so that it always defines the proper changeset
to associate the parent file revision with. This "proper" changeset has to be an
ancestor of the changeset associated with the child file revision.

This logic is performed in the '_adjustlinkrev' function. This function is
given the starting changeset and all the information regarding the parent file
revision. If the linkrev for the file revision is an ancestor of the starting
changeset, the linkrev is valid and will be used. If it is not, we detected a
topological jump caused by linkrev shadowing, we are going to walk the
ancestors of the starting changeset until we find one setting the file to the
revision we are trying to create.

The performance impact appears acceptable:

- We are walking the changelog once for each filelog traversal (as there should
  be no overlap between searches),

- changelog traversal itself is fairly cheap, compared to what is likely going
  to be perform on the result on the filelog traversal,

- We only touch the manifest for ancestors touching the file, And such
  changesets are likely to be the one introducing the file. (except in
  pathological cases involving merge),

- We use manifest diff instead of full manifest unpacking to check manifest
  content, so it does not involve applying multiple diffs in most case.

- linkrev shadowing is not the common case.

Tests for fixed issues in log, annotate and rename detection have been
added.

But this changeset does not solve all problems. It fixes -ancestry-
computation, but if the linkrev-shadowed changesets is the starting one, we'll
still get things wrong. We'll have to fix the bootstrapping of such operations
in a later changeset. Also, the usage of `hg log FILE`  without --follow still
has issues with linkrev pointing to hidden changesets, because it relies on the
`filelog` revset which implement its own traversal logic that is still to be
fixed.

Thanks goes to:
- Matt Mackall: for nudging me in the right direction
- Julien Cristau and Rémi Cardona: for keep telling me linkrev bug were an
  evolution show stopper for 3 years.
- Durham Goode: for finding a new linkrev issue every few weeks
- Mads Kiilerich: for that last rename bug who raise this topic over my
  anoyance limit.
2014-12-23 15:30:38 -08:00
FUJIWARA Katsunori
9e120db5e9 context: remove unreliable accessor methods from committablectx
There are two caching routes for (propertycache-ed) "_status" below in
committablectx:

  - invoking "status()":

    "dirstate.status()" is invoked, and the result of it is cached
    into "_status". In this case, any of "listignored", "listclean"
    and "listunknown" may be True.

  - accessing "_status" directly before "status()":

    Own "status()" is invoked, but all of "listignored", "listclean"
    and "listunknown" arguments are False, in this case.

"ignored"/"clean"/"unknown" accessor methods of "committablectx" use
corresponded fields of "_status", but these fields aren't reliable,
because these fields are empty when:

  - "_status" method is executed before accessors, or
  - "status()" is executed with "list*=False" before accessors

In addition to it, these accessors aren't used in the recent Mercurial
implementation. At least, removing them doesn't cause any test
failures.
2014-12-31 17:55:43 +09:00
FUJIWARA Katsunori
797fef3e65 context: cache self._status correctly at workingctx.status
Before this patch, "workingctx.status" always replaces "self._status"
by the recent result, even though:

  - status isn't calculated against the parent of the working directory, or

  - specified "match" isn't "always" one
    (status is only visible partially)

If "workingctx" object is shared between some procedures indirectly
referring "ctx._status", this incorrect caching may cause unexpected
result: for example, "ctx._status" is used via "manifest()", "files()"
and so on.

To cache "self._status" correctly at "workingctx.status", this patch
overwrites "self._status" in "workingctx._buildstatus" only when:

  - status is calculated against the parent of the working directory, and
  - specified "match" is "always" one

This patch can be applied (and effective) only on default branch,
because procedure around "basectx.status" is much different between
stable and default: for example, overwriting "self._status" itself is
executed not in "workingctx._buildstatus" but in
"workingctx._poststatus", on stable branch.
2014-12-31 17:55:43 +09:00
Pierre-Yves David
2662df0db5 filectx.parents: also fetch the filelog of rename source too
we are going to need this filelog for the linkrev adjustment, so we better
normalise the list and have the filelog in all case.

This is done in a previous changeset to help readability.
2014-12-23 18:30:46 -08:00