390 Commits

Author SHA1 Message Date
Matt Mackall
da0586aaf9 merge with stable 2015-03-05 15:52:07 -06:00
Yuya Nishihara
d646851eb8 dirstate: make sure rootdir ends with directory separator (issue4557)
ntpath.join() in Python 2.7.9 does not work as expected if root is a UNC path
to the top of a share.

This patch doesn't take care of os.altsep, '/' on Windows, because root should
be normalized by realpath().
2015-03-06 00:14:22 +09:00
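The idea behind d646851eb8 can be sketched as follows. This is a minimal illustration, not the Mercurial code: the helper names `rootdir` and `join` and the explicit `sep` parameter are assumptions made for the example. With a root that always ends in the separator, plain concatenation is enough and ntpath.join() is not needed.

```python
def rootdir(root, sep="\\"):
    """Return root, guaranteed to end with the directory separator."""
    return root if root.endswith(sep) else root + sep


def join(root, relpath, sep="\\"):
    # Plain concatenation; does not depend on how ntpath.join() treats UNC roots.
    return rootdir(root, sep) + relpath
```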
Mads Kiilerich
e51a6aa2aa dirstate: clarify comment about leaving normal files undef if changed 'now'
Clarify that they are only saved as undef if they were marked as normal and
changed in the same second.
2015-01-14 01:15:26 +01:00
Mads Kiilerich
3c0558f97b dirstate: ignore negative debug.dirstate.delaywrite values - they crashed it
Sleep can only travel forward in time, not back.
2015-01-14 01:15:26 +01:00
Siddharth Agarwal
5bc4775669 ignore: resolve ignore files relative to repo root (issue4473) (BC)
Previously these would be considered to be relative to the current working
directory. That behavior is both undocumented and doesn't really make sense.
There are two reasonable options for how to resolve relative paths:
- relative to the repo root
- relative to the config file

Resolving these files relative to the repo root matches existing behavior with
hooks. An earlier discussion about this is available at
http://mercurial.markmail.org/thread/tvu7yhzsiywgkjzl.

Thanks to Isaac Jurado <diptongo@gmail.com> for the initial patchset that
spurred the discussion.
2014-12-16 14:34:53 -08:00
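The resolution rule described in 5bc4775669 might look like the sketch below. The function name `resolve_ignore_path` is hypothetical, and `posixpath` is used only to keep the example deterministic across platforms; absolute paths pass through unchanged because of how join() treats them.

```python
import posixpath


def resolve_ignore_path(repo_root, configured_path):
    """Resolve a configured ignore-file path against the repo root,
    not against the current working directory."""
    # posixpath.join() discards repo_root when configured_path is absolute.
    return posixpath.normpath(posixpath.join(repo_root, configured_path))
```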
Pierre-Yves David
dd01dca5ec dirstate: use the 'nogc' decorator
Now that we have a generic way to disable the gc, we use it. However, we have to
use it in a baroque way. See the inline comment for details.
2014-12-04 05:43:15 -08:00
Martin von Zweigbergk
c71ba3444e dirstate: speed up repeated missing directory checks
In a mozilla repo with tip at bb3ff09f52fe,

  hg update tip~1000 && time hg revert -nq -r tip .

displays ~4:20 minutes. With tip~100, it runs in ~11 s. With revision
100000, it did not finish in 12 minutes.

Revert calls dirstate.status() with a matcher that matches each file
in the target revision. The main problem [1] lies in
dirstate._walkexplicit(), which looks for matching deleted directories
by checking whether each path is a prefix of any path in the
dirstate. With m files in the dirstate and n files in the target
revision that are not in the dirstate, this is clearly O(m*n). Let's
improve by keeping a lazily initialized set of all the directories in
the dirstate, so the time becomes O(m+n).

After this patch, the 4:20 minutes become 5.5 s, while for a single
missing path, it slows down from 1.092 s to 1.150 s (best of 4). The
>12 min case becomes 5.8 s.

 [1] A narrower optimization would be to make revert take the fast
     path for '.' and '--all'.
2014-11-19 23:15:07 -08:00
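A rough sketch of the approach in c71ba3444e, assuming '/'-separated paths: build the set of all directories once, on first use, and answer each "is this a directory in the dirstate?" query with a set lookup instead of a prefix scan. The names `parent_dirs`, `DirIndex` and `hasdir` are illustrative, not taken from the patch.

```python
def parent_dirs(path):
    """Yield every ancestor directory of a '/'-separated path."""
    pos = path.rfind("/")
    while pos != -1:
        yield path[:pos]
        pos = path.rfind("/", 0, pos)


class DirIndex:
    def __init__(self, files):
        self._files = files
        self._dirs = None  # built lazily, on the first hasdir() call

    def hasdir(self, d):
        if self._dirs is None:
            # One pass over the m tracked files.
            self._dirs = {p for f in self._files for p in parent_dirs(f)}
        # Each of the n queries is then a constant-time set lookup.
        return d in self._dirs
```

A caller that never asks about directories pays nothing for the index, which is consistent with the small slowdown reported above for the single-path case.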
Martin von Zweigbergk
8b968ecfe2 status: update and move documentation of status types to status class
The various status types are currently documented on the
dirstate.status() method. Now that we have a class for the status
types, it makes sense to document the status types there
instead. Only leave the bits related to lookup/unsure in the status()
method documentation.
2014-10-10 10:14:35 -07:00
Martin von Zweigbergk
41a4138ec7 status: create class for status lists
Callers of various status() methods (on dirstate, context, repo) get a
tuple of 7 elements, where each element is a list of files. This
results in lots of uses of indexes where names would be much more
readable. For example, "status.ignored" seems clearer than "status[4]"
[1]. So, let's introduce a simple named tuple containing the 7 status
fields: modified, added, removed, deleted, unknown, ignored, clean.

This patch introduces the class and updates the status methods to
return instances of it. Later patches will update the callers.

 [1] Did you even notice that it should have been "status[5]"?

(tweaked by mpm to introduce the class in scmutil and only change one user)
2014-10-10 14:32:36 -07:00
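The readability argument in 41a4138ec7 can be illustrated with a standard-library named tuple. This is only a sketch: the class actually introduced lives in scmutil and may be implemented differently, and the file names below are made up.

```python
from collections import namedtuple

# The seven status fields, in the order listed in the commit message.
Status = namedtuple(
    "Status",
    ["modified", "added", "removed", "deleted", "unknown", "ignored", "clean"],
)

st = Status([], ["new.txt"], [], [], [], ["build.log"], [])
```

Existing callers that index or unpack the result keep working, since `st[5]` and `st.ignored` refer to the same list.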
Martin von Zweigbergk
1a4e0a3d51 dirstate: separate 'lookup' status field from others
The status tuple returned from dirstate.status() has an additional
field compared to the other status tuples: lookup/unsure. This field
is just an optimization and not something most callers care about
(they want the resolved value of 'modified' or 'clean'). To prepare
for a single future status type, let's separate out the 'lookup' field
from the rest by having dirstate.status() return a pair: (lookup,
status).
2014-10-03 21:44:10 -07:00
Matt Mackall
8e8234eecc dirstate: merge falls through to otherparent
This lets us more correctly fix the state when we use setparents, as
demonstrated in the change in test-graft.t.
2014-10-11 14:05:09 -05:00
Matt Mackall
f7a8e82c62 dirstate: use 'm' state in otherparent to reduce ambiguity
In rebase-like operations where we abandon the second parent, we can
correctly fix up the state in setparents.
2014-10-10 13:31:06 -05:00
Matt Mackall
a44416ab0f dirstate: properly clean-up some more merge state on setparents 2014-10-10 13:05:50 -05:00
Siddharth Agarwal
0b13ea03ab dirstate: cache util.normcase while constructing the foldmap
This is a small win on OS X. hg perfdirstatefoldmap:

before: wall 0.399708 comb 0.410000 user 0.390000 sys 0.020000 (best of 25)
after:  wall 0.386331 comb 0.390000 user 0.370000 sys 0.020000 (best of 25)
2014-10-03 18:48:09 -07:00
Siddharth Agarwal
1394c04c63 dirstate: copyedit exception for no beginparentchange call 2014-09-17 13:08:03 -07:00
Durham Goode
1d292c944f dirstate: add exception when calling setparent without begin/end (API)
Adds an exception when calling dirstate.setparent without having first called
dirstate.beginparentchange. This will prevent people from writing code that
modifies the dirstate parent without considering the transactionality of their
change.

This will break third party extensions that call setparents.
2014-09-05 11:37:44 -07:00
Durham Goode
7a67b9c913 dirstate: add begin/endparentchange to dirstate
It's possible for the dirstate to become incoherent (issue4353) if there is an
exception in the middle of the dirstate parent and entries being written (like
if the user ctrl+c's). This change adds begin/endparentchange which a future
patch will require to be set before changing the dirstate parent.  This will
allow us to prevent writing the dirstate in the event of an exception while
changing the parent.
2014-09-05 11:34:29 -07:00
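One way to picture the begin/endparentchange protocol from 7a67b9c913, together with the guard added by the follow-up change 1d292c944f, is a nesting counter. The class name, the attribute `_parentwriters`, the `write()` return value and the exception type are assumptions for illustration only.

```python
class DirstateSketch:
    def __init__(self):
        self._parents = (None, None)
        self._parentwriters = 0  # depth of open parent-change sections

    def beginparentchange(self):
        self._parentwriters += 1

    def endparentchange(self):
        if self._parentwriters <= 0:
            raise ValueError("endparentchange without beginparentchange")
        self._parentwriters -= 1

    def setparents(self, p1, p2=None):
        if self._parentwriters == 0:
            raise ValueError("setparents called outside begin/endparentchange")
        self._parents = (p1, p2)

    def write(self):
        # Refuse to persist while a parent change is still in progress,
        # so an interrupted change cannot leave an incoherent dirstate.
        return self._parentwriters == 0
```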
Siddharth Agarwal
e621257516 dirstate: add a method to efficiently filter by match
Current callers that require just this data call workingctx.walk, which calls
dirstate.walk, which stats all the files. Even worse, workingctx.walk looks for
unknown files, significantly slowing things down, even though callers might not
be interested in them at all.
2014-08-01 22:05:16 -07:00
FUJIWARA Katsunori
2fdcc0328a dirstate: delay writing out to ensure timestamp of each entries explicitly
Even though "dirstate.write()" is invoked explicitly after "normal"
invocations, the timestamp field of entries may still be "unset" in the
"dirstate" file itself, because "pack_dirstate" drops it when it is
equal to the timestamp of the "dirstate" file itself.

This avoids overlooking modifications to files that are updated within
the same second. On the other hand, it may hide timing-critical
problems.

For example, incorrect "normal"-ing (or lack of "normallookup"-ing on
the already "normal"-ed entry) is visible only when:

  - the target file is modified in the working directory at T1, and
  - "dirstate" file is written out at T2 (!= T1)

Otherwise, T1 is dropped by "pack_dirstate" in "dirstate.write()"
invocation, and "unset" is stored into "dirstate" file.

The Mercurial test suite often fails to reproduce problems caused by
incorrect "normal"-ing, because automated actions in a small repository
almost always make T1 and T2 the same.

This patch adds a debug feature that delays writing out the dirstate so
that the timestamp of each entry is stored explicitly.

This feature is used to make timing-critical "dirstate" problems
reproducible in subsequent patches.
2014-07-22 23:59:30 +09:00
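The timestamp-dropping behaviour that 2fdcc0328a works around can be sketched like this. The function name `pack_entries`, the entry layout (state, mode, size, mtime) and the use of -1 for "unset" are assumptions for the example, not the real "pack_dirstate" code.

```python
def pack_entries(entries, now):
    """Return a copy of entries with mtime unset (-1) for normal files
    whose mtime equals the time at which the dirstate is written."""
    packed = {}
    for name, (state, mode, size, mtime) in entries.items():
        if state == "n" and mtime == now:
            # The file could still change within this second without its
            # mtime changing, so the recorded timestamp cannot be trusted.
            mtime = -1
        packed[name] = (state, mode, size, mtime)
    return packed
```

Delaying the write until `now` has moved past the files' mtimes (T2 != T1) means no entry hits the `mtime == now` branch, so every timestamp is stored explicitly.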
Siddharth Agarwal
337e9e8db0 dirstate.status: assign members one by one instead of unpacking the tuple
With this patch, hg status and hg diff regain their previous speed.

The following tests are run against a working copy with over 270,000 files.
Here, 'before' means without this or the previous patch applied.

Note that in this case `hg perfstatus` isn't representative since it doesn't
take dirstate parsing time into account.

$ time hg status  # best of 5
before: 2.03s user 1.25s system 99% cpu 3.290 total
after:  2.01s user 1.25s system 99% cpu 3.261 total

$ time hg diff    # best of 5
before: 1.32s user 0.78s system 99% cpu 2.105 total
after:  1.27s user 0.79s system 99% cpu 2.066 total
2014-05-27 21:02:16 -07:00
Siddharth Agarwal
f40a94a790 parsers: inline fields of dirstate values in C version
Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:

- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all

In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.

This patch eliminates most object creation in these cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.

The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.

parse_dirstate becomes significantly faster:

$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after:  wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)

and as a result, several commands benefit:

$ time hg status  # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after:  0.34s user 0.12s system 99% cpu 0.471 total

$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after:  0.76s user 0.17s system 99% cpu 0.931 total

There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
2014-05-27 14:27:41 -07:00
Siddharth Agarwal
64ffd83be6 dirstate: add dirstatetuple to create dirstate values
Upcoming patches will switch away from using Python tuples for dirstate values
in compiled builds.  Make that easier by introducing a variable called
dirstatetuple, currently set to tuple. In upcoming patches, this will be set to
an object from the parsers module.
2014-05-27 17:10:28 -07:00
Mads Kiilerich
eb39238a97 dirstate: report bad subdirectories as match.bad, not just a warning (BC)
This seems simpler and more correct.

The only test coverage for this is test-permissions.t when it says:
  dir: Permission denied
2013-10-03 18:01:21 +02:00
Mads Kiilerich
67a2ed988d dirstate: improve documentation and readability of match and ignore in the walker 2013-10-03 18:01:21 +02:00
Mads Kiilerich
020377d078 dirstate: inline local finish function
Having it as a local function adds no value.
2013-04-27 23:19:52 +02:00
Yuya Nishihara
0cf2db5155 dirstate: remove double imports of errno 2014-03-03 15:50:41 +09:00
Pierre-Yves David
b563457396 rebase: do not crash in panic when cwd disapear in the process (issue4121)
Before this patch, rebase crashed badly when this happened (a crash, not an abort).

Fix courtesy of Matt Mackall.
2014-01-31 15:13:15 -08:00
Augie Fackler
213fff305a pathutil: tease out a new library to break an import cycle from canonpath use 2013-11-06 18:19:04 -05:00
Siddharth Agarwal
7decbaaad4 dirstate.status: return explicit unknown files even when not asked
dirstate.walk will return unknown files that were explicitly requested, even
if listunknown is false. There's no point in dropping these files on the
floor in dirstate.status.

This has no effect on any current callers, because all of them assume the
unknown list is empty and ignore it. Future callers may find it useful,
though.
2013-10-14 00:25:29 -04:00
Siddharth Agarwal
d8f5823b88 dirstate.status: don't ignore symlink placeholders in the normal set
On Windows, there are two ways symlinks can manifest themselves:
1. As placeholders: text files containing the symlink's target. This is what
   usually happens with fresh clones on Windows.
2. With their dereferenced contents. This happens with clones accessed over NFS
   or Samba.

In order to handle case 2, 28af3e0c54f0 made dirstate.status ignore all symlink
placeholders on Windows. It doesn't ignore symlinks in the lookup set, though,
since those don't have the link bit set. This is problematic because it
violates the invariant that `hg status` with every file in the normal set
produces the same output as `hg status` with every file in the lookup set.

With this change, symlink placeholders in the normal set are no longer ignored.
We instead rely on code in localrepo.status that uses heuristics to look for
suspect placeholders.

An upcoming patch will test this out by no longer adding files written in the
last second of an update to the lookup set.
2013-08-31 10:20:15 -07:00
Matt Mackall
0e295e1642 merge with stable 2013-05-17 17:22:08 -05:00
Matt Mackall
870fe2303d dirstate: don't overnormalize for ui.slash
This should fix the issue exposed by debugpathcomplete on the buildbot.
2013-05-17 14:31:06 -05:00
Durham Goode
c5a04c46b0 hgignore: fix regression with hgignore directory matches (issue3921)
If a directory matched a regex in hgignore but the files inside the directory
did not match the regex, they would appear as deleted in hg status. This
change fixes them to appear normally in hg status.

Removing the ignore(nf) conditional here is ok because it just means we might
stat more files than we had before. My testing on a large repo shows this
causes no performance regression since the only additional files being stat'd
are the ones that are missing (i.e. status=!), which are generally rare.
2013-05-03 09:44:50 -07:00
FUJIWARA Katsunori
364377a413 icasefs: ignore removed files at building "dirstate._foldmap" up on icasefs
Before this patch, all files in the dirstate were used to build
"_foldmap" on case-insensitive filesystems, regardless of their status.

For example, when the dirstate contains both removed file 'a' and added
file 'A', the final "_foldmap" entry may come from removed file 'a'. This
causes unexpected status information for added file 'A' at "hg status"
invocation.

This patch ignores removed files when building "_foldmap" on
case-insensitive filesystems.

This patch doesn't add any test, because this issue is difficult to
reproduce intentionally: it depends on iteration order of "dirstate._map".
2013-04-30 05:00:48 +09:00
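The fix in 364377a413 amounts to a filter while building the map. In this sketch the function name `build_foldmap`, the entry layout (state first) and the use of `str.lower` as the case-folding function are assumptions.

```python
def build_foldmap(dirstate_map, normcase=str.lower):
    """Map case-folded names to tracked names, skipping removed files."""
    foldmap = {}
    for name, entry in dirstate_map.items():
        if entry[0] != "r":  # 'r' marks a removed file
            foldmap[normcase(name)] = name
    return foldmap
```

With the filter in place, the result no longer depends on whether 'a' or 'A' is visited last.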
Siddharth Agarwal
65a08dce5b dirstate.status: avoid full walks when possible 2013-04-23 14:16:33 -07:00
Siddharth Agarwal
b35c53428c dirstate.walk: add a flag to let extensions avoid full walks
Consider a hypothetical extension that implements walk in a more efficient
manner and skips some known-clean files. However, that can only be done under
some situations, such as when clean files are not being asked for and a
match.traversedir callback is not set. The full flag lets walk tell these two
cases apart.
2013-04-22 17:11:18 -07:00
Siddharth Agarwal
d9060ea694 dirstate._walkexplicit: inline dirsnotfound.append 2013-05-07 14:20:34 -07:00
Siddharth Agarwal
c1be698d70 dirstate._walkexplicit: rename work to dirsfound
Now that this code is factored out, work is too specific a name.
2013-05-07 14:19:04 -07:00
Siddharth Agarwal
819d02613a dirstate.walk: refactor explicit walk into separate function
This enables this code to be reused by extensions that implement the other,
more time-consuming bits of walk in different ways.
2013-05-07 10:02:55 -07:00
Siddharth Agarwal
b485dd87ab dirstate.walk: pull skipstep3 out of the explicit walk code
This is a move towards factoring out this code into a separate function.
2013-05-07 09:31:00 -07:00
Siddharth Agarwal
c0d29e5fe5 dirstate.walk: move dirignore filter out of explicit walk code
This is a move towards factoring this code out into a separate function.
2013-05-07 09:47:10 -07:00
Siddharth Agarwal
a143c1d00c dirstate.walk: maintain a list of dirs not found
Upcoming patches will factor out the walk over explicit files done in step 1.
This helps us get there.
2013-05-07 09:29:43 -07:00
Siddharth Agarwal
8e7066e5f2 match: make explicitdir and traversedir None by default
With this, extensions can easily tell when traversedir and/or explicitdir don't
need to be called.
2013-05-03 14:41:58 -07:00
Siddharth Agarwal
053c6e8fe3 dirstate.walk: cache match.explicitdir and traversedir locally 2013-05-03 14:39:28 -07:00
Siddharth Agarwal
cad3e8cacb dirstate.walk: call match.explicitdir or traversedir as appropriate 2013-04-28 21:25:41 -07:00
Bryan O'Sullivan
4a4a5dde94 scmutil: use new dirs class in dirstate and context
The multiset-of-directories code was open coded in each of these
modules; this change gets rid of the duplication.
2013-04-10 15:08:26 -07:00
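A multiset of directories, as mentioned in 4a4a5dde94, can be sketched with a counter keyed by directory: each tracked path increments the count of all its ancestors, and a directory disappears only when its last file is removed. The names `finddirs`, `Dirs`, `addpath` and `delpath` follow the commit subjects loosely, and the bodies are illustrative rather than the scmutil code.

```python
from collections import Counter


def finddirs(path):
    """Yield every ancestor directory of a '/'-separated path."""
    pos = path.rfind("/")
    while pos != -1:
        yield path[:pos]
        pos = path.rfind("/", 0, pos)


class Dirs:
    def __init__(self, paths=()):
        self._counts = Counter()
        for p in paths:
            self.addpath(p)

    def addpath(self, path):
        for d in finddirs(path):
            self._counts[d] += 1

    def delpath(self, path):
        for d in finddirs(path):
            self._counts[d] -= 1
            if self._counts[d] <= 0:
                del self._counts[d]  # last file under d is gone

    def __contains__(self, d):
        return d in self._counts
```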
Bryan O'Sullivan
00ff38c9c3 scmutil: migrate finddirs from dirstate 2013-04-10 15:08:25 -07:00
Bryan O'Sullivan
6f6047415e dirstate: only call lstat once per flags invocation
This makes a big difference to performance in some cases.

hg --time locate 'set:symlink()'

mozilla-central (70,000 files):

  before: 2.92 sec
  after:  2.47

another repo (170,000 files):

  before: 7.87 sec
  after:  6.86
2013-04-03 11:35:27 -07:00
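The saving in 6f6047415e comes from reusing one stat result for both the symlink and the executable check. A minimal sketch, assuming a standalone function named `flags` that returns 'l', 'x' or '' (the real method and its fallbacks may differ):

```python
import os
import stat


def flags(path):
    """Return 'l' for a symlink, 'x' for an executable file, else ''."""
    try:
        st = os.lstat(path)  # single system call, reused for both checks
    except OSError:
        return ""
    if stat.S_ISLNK(st.st_mode):
        return "l"
    if st.st_mode & stat.S_IXUSR:
        return "x"
    return ""
```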
Siddharth Agarwal
a6bf485dee dirstate.walk: fast path none-seen + match-always case for step 3
This case is a common one -- e.g. `hg diff`.

For a repository with 170,000 files, this speeds up perfstatus from 0.95
seconds to 0.88.
2013-03-22 17:03:49 -07:00
Siddharth Agarwal
5e52c298d1 dirstate.walk: fast path match-always case during traversal
This case is a common one -- e.g. `hg status`.

For a repository with 170,000 files, this speeds up perfstatus --unknown from
2.15 seconds to 2.09.
2013-03-22 17:03:00 -07:00