Commit Graph

23292 Commits

Author SHA1 Message Date
Pierre-Yves David
9fc6abae03 localrepo: add a currenttransaction method
This method returns the current transaction or None: it will allow a
cache writer to hook in an existing transaction.
2014-11-13 11:12:47 +00:00
Pierre-Yves David
8f926bef9a repoview: extract actual hidden cache writing in its own function
This will allow the generation of this cache within the transaction. Relying on
the transaction will reduce the chance of a reader seeing a bad cache.
2014-11-13 11:11:17 +00:00
Martin von Zweigbergk
c71ba3444e dirstate: speed up repeated missing directory checks
In a mozilla repo with tip at bb3ff09f52fe,

  hg update tip~1000 && time hg revert -nq -r tip .

displays ~4:20 minutes. With tip~100, it runs in ~11 s. With revision
100000, it did not finish in 12 minutes.

Revert calls dirstate.status() with a matcher that matches each file
in the target revision. The main problem [1] lies in
dirstate._walkexplicit(), which looks for matching deleted directories
by checking whether each path is a prefix of any path in the
dirstate. With m files in the dirstate and n files in the target
revision that are not in the dirstate, this is clearly O(m*n). Let's
improve by keeping a lazily initialized set of all the directories in
the dirstate, so the time becomes O(m+n).

After this patch, the 4:20 minutes become 5.5 s, while for a single
missing path, it slows down from 1.092 s to 1.150 s (best of 4). The
>12 min case becomes 5.8 s.

 [1] A narrower optimization would be to make revert take the fast
     path for '.' and '--all'.
2014-11-19 23:15:07 -08:00
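The complexity argument in c71ba3444e can be illustrated with a minimal sketch. It is not Mercurial's code: `TrackedFiles`, `all_parent_dirs`, `hasdir` and `hasdir_slow` are hypothetical names, and paths are assumed to be slash-separated. The slow lookup scans every tracked file per query, while the fast one builds the set of directories once, on first use, and then answers each query with a set lookup.

```python
def all_parent_dirs(paths):
    """Collect every ancestor directory of the given slash-separated paths."""
    dirs = set()
    for path in paths:
        pos = path.rfind('/')
        while pos != -1:
            d = path[:pos]
            if d in dirs:
                break  # the shorter prefixes of d were recorded with d
            dirs.add(d)
            pos = path.rfind('/', 0, pos)
    return dirs


class TrackedFiles:
    """Hypothetical stand-in for the dirstate, for illustration only."""

    def __init__(self, files):
        self._files = set(files)
        self._dirs = None  # built lazily on the first directory lookup

    def hasdir_slow(self, d):
        # Prefix scan over all m tracked files for every query: O(m*n) overall.
        prefix = d + '/'
        return any(f.startswith(prefix) for f in self._files)

    def hasdir(self, d):
        # One pass over the tracked files, then a set lookup per query.
        if self._dirs is None:
            self._dirs = all_parent_dirs(self._files)
        return d in self._dirs


tracked = TrackedFiles(['a/b/c.txt', 'a/d.txt', 'e.txt'])
print(tracked.hasdir('a/b'), tracked.hasdir('x'))
```

The lazy initialization matters for the single-path case mentioned in the message: the set is only paid for when a directory lookup actually happens.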
Martin von Zweigbergk
2916dab85b revert: access status fields by name rather than index
For better readability.
2014-11-19 17:07:27 -08:00
FUJIWARA Katsunori
4a1c867054 subrepo: remove "_getstorehashcachepath" referred by no other code paths 2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
ba433385a7 subrepo: replace direct file APIs around "writelines" by "vfs.writelines"
This patch also replaces "self._getstorehashcachepath" (building
absolute path up) by "self._getstorehashcachename" (building relative
path up), because "vfs.writelines" requires relative path.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
86bc73ffa9 vfs: add "writelines"
This patch allows "writelines" to take "mode" and "notindexed"
arguments, because subsequent patch for subrepo requires both.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
f60bafa1b3 vfs: add "notindexed" argument to invoke "ensuredir" with it in write mode
This patch uses "False" as default value of "notindexed" argument,
even though "vfs.makedir()" uses "True" for it, because "os.mkdir()"
doesn't set "_FILE_ATTRIBUTE_NOT_CONTENT_INDEXED" attribute to newly
created directories.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
59f23cabee subrepo: replace direct file APIs around "readlines" by "vfs.tryreadlines"
This patch also replaces "self._getstorehashcachepath" (building
absolute path up) by "self._getstorehashcachename" (building relative
path up), because "vfs.tryreadlines" requires relative path.

This patch makes "_readstorehashcache()" return "[]" (returned by
"vfs.tryreadlines()"), when cache file doesn't exist, even though
"_readstorehashcache()" returned '' (empty string) in such case before
this patch.

"_readstorehashcache()" is invoked only by the code path below in
"_storeclean()":

            for filehash in self._readstorehashcache(path):
                if filehash != itercache.next():
                    clean = False
                    break

In this case, "[]" and '' don't differ from each other, because both
of them cause the body of the "for" loop to be skipped.
2014-11-19 18:35:14 +09:00
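The equivalence claimed in 59f23cabee can be checked with a small sketch. Only the loop quoted in the message is reproduced, ported to Python 3; `store_is_clean` is a hypothetical wrapper and does not include whatever else "_storeclean()" does around the loop.

```python
def store_is_clean(cached_hashes, current_hashes):
    """Compare cached hashes with current ones, mirroring the quoted loop."""
    itercache = iter(current_hashes)
    clean = True
    for filehash in cached_hashes:
        if filehash != next(itercache, None):
            clean = False
            break
    return clean


# An empty list and an empty string are both empty iterables,
# so the loop body never runs for either of them.
print(store_is_clean([], ['abc']), store_is_clean('', ['abc']))
```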
FUJIWARA Katsunori
b1ff97d24c vfs: add "readlines" and "tryreadlines"
This patch allows "readlines" and "tryreadlines" to take a "mode"
argument, because "subrepo" needs to read files not in "rb"
(binary, default for vfs) but in "r" (text) mode in a subsequent patch.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
38f496a6bb subrepo: add "_cachestorehashvfs" to handle cache store hash files via vfs
This "vfs" object will be used by subsequent patches to handle cache
store hash files without direct file APIs.

This patch decorates "_cachestorehashvfs" with "@propertycache" to
delay vfs creation, because it is used only for cooperation with other
repositories.

In this patch, "/" is used as the path separator, even though
"self._repo.join" uses platform specific path separator (e.g. "\\" on
Windows). But it is reasonable enough, because "store" and other
management file handling already include such implementation, and they
work well.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
82551da514 subrepo: remove "_calcfilehash" referred by no other code paths 2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
d95af73d10 subrepo: replace "_calcfilehash" invocation by "vfs.tryread"
"_calcfilehash" can be completely replaced by simple "vfs.tryread"
invocation.

    def _calcfilehash(filename):
        data = ''
        if os.path.exists(filename):
            fd = open(filename, 'rb')
            data = fd.read()
            fd.close()
        return util.sha1(data).hexdigest()

Building absolute path "absname" up by "self._repo.join" for files in
"filelist" is avoided, because "vfs.tryread" does so internally.
2014-11-19 18:35:14 +09:00
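The replacement described in d95af73d10 relies on a read that returns empty data for a missing file, so the separate existence check disappears. A standalone sketch in Python 3, assuming nothing about the real vfs class: `tryread` and `filehash` here are local illustrative functions, and `hashlib.sha1` stands in for `util.sha1`.

```python
import hashlib


def tryread(path):
    """Return the file's bytes, or b'' when the file does not exist."""
    try:
        with open(path, 'rb') as fd:
            return fd.read()
    except FileNotFoundError:
        return b''


def filehash(path):
    """SHA-1 of the file contents; a missing file hashes like an empty one."""
    return hashlib.sha1(tryread(path)).hexdigest()


print(filehash('/nonexistent/definitely-missing-file'))
```

Catching the exception instead of calling `os.path.exists` first also avoids the window in which the file could disappear between the check and the open.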
FUJIWARA Katsunori
617a0e35a6 subrepo: replace "os.path.exists" by "exists" via wvfs of the parent
Existence of specified "path" should be examined by "exists" via wvfs
of the parent repository, because the working directory of the parent
repository may be in UTF-8 mode. Wide API should be used via wvfs in
such case.

In this patch, "/" is used as the path separator, even though "path"
uses platform specific path separator (e.g. "\\" on Windows). But it
is reasonable enough, because "store" and other management file
handling already include such implementation, and they work well.
2014-11-19 18:35:14 +09:00
FUJIWARA Katsunori
42cf1cdb87 subrepo: avoid redundant "util.makedirs" invocation
"util.makedirs" for the (sub-)repository root of "hgsubrepo" is also
executed in the constructor of "localrepository", if "create" is True
and ".hg" of it doesn't exist.

This patch avoids redundant "util.makedirs" invocation in the
constructor of "hgsubrepo".
2014-11-19 18:35:14 +09:00
Martin von Zweigbergk
1d09a87f4e merge: remove confusing comment about --force
manifestmerge() has a piece of code that's roughly:

  if not force and different:
      abort
  else:
      # if different: old untracked f may be overwritten and lost
      ...

The comment only talks about what happens when 'different' is true,
and in combination with the if-block above, that must mean that it is
only about what happens when 'force and different'. It seems quite
fine that files are overwritten when 'force' is true, so let's remove
the comment. As it stands, it can easily be interpreted as a TODO
(which is how I interpreted it at first).
2014-11-19 08:50:08 -08:00
Augie Fackler
11da8b36af test-run-tests: accept more levels of precision and trailing ws (issue4440)
simplejson produces slightly different output from the built-in json
module, specifically:
  * It uses 0.000 instead of 0.0000
  * It likes to put a trailing space after a comma

This change works around both of those variations.
2014-11-06 10:57:13 -05:00
Matt Harbison
eae366b6a1 hgweb: fix a crash when using web.archivesubrepos
A matcher is required when enabling the subrepo option on archival.archive(),
because that calls match.narrowmatcher(), which accesses fields on the object.
It's therefore probably a bad idea to default the matcher to None on archive(),
but that's a fix for default.
2014-11-05 21:33:45 -05:00
Matt Harbison
6daa20236a tests: introduce a subrepository to test-archive.t
This will be used in an upcoming patch to add coverage for web.archivesubrepos.
2014-11-05 20:31:58 -05:00
Gregory Szorc
04eeb85285 changegroup: sparsely populate fnodes
Previously, fnodes had a key and empty dict value for every element in
changedfiles. This is somewhat wasteful. Empty dicts in CPython consume
a lot more memory than you would expect - 280 bytes.

On mozilla-central, which has ~190,000 files/fnodes keys, the previous
loop populating fnodes allocated 91,924 KB of memory, most of that for
the empty dicts.

With this patch in place, our peak RSS during mozilla-central clone
drops:

before:  364,356 KB
after:   326,008 KB
delta:   -38,348 KB

When combined with the previous patch, total peak RSS decrease is now
190,116 KB.
2014-11-06 22:48:20 -08:00
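The difference between the dense and sparse shapes of fnodes in 04eeb85285 can be sketched as follows. This is an illustration under assumptions, not the changegroup code: `populate_dense` and `record` are hypothetical helpers, and the per-dict size printed at the end depends on the interpreter version and platform (the 280 bytes quoted in the message refer to the CPython of that time).

```python
import sys


def populate_dense(changedfiles):
    """Old shape: one (usually empty) dict per changed file, allocated up front."""
    return {f: {} for f in changedfiles}


def record(fnodes, fname, filenode, clnode):
    """New shape: create the inner dict only when a file node is recorded."""
    fnodes.setdefault(fname, {}).setdefault(filenode, clnode)


changedfiles = ['file%d' % i for i in range(1000)]
dense = populate_dense(changedfiles)

sparse = {}
record(sparse, 'file7', 'node-a', 'changeset-1')

print(len(dense), len(sparse))   # number of inner dicts allocated in each shape
print(sys.getsizeof({}))         # size of one empty dict on this interpreter
```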
Gregory Szorc
c6e3c6fb27 changegroup: don't store unused value on fnodes (issue4443)
The contents of fnodes are only accessed once per key. It is wasteful to
cache the value since nobody will use it.

Before this patch, the caching of unused data in fnodes was effectively
causing a memory leak during the file streaming part of bundle creation.

On mozilla-central (which has ~190,000 entries in fnodes), this patch
has a significant impact on RSS at the end of generate():

before:  516,124 KB
after:   364,356 KB
delta:  -151,768 KB

The origin of this code can be traced back to 1f567a607f1f and has been
with us since the 2.7 release.
2014-11-06 22:33:48 -08:00
Gregory Szorc
0bfb4de7ec changegroup: don't define lookupmf() until it is needed
lookupmf() is currently defined earlier than when it is needed. Future
patches further refactoring this code will be easier to read when
lookupmf() is in its new home.
2014-11-06 20:57:12 -08:00
Pierre-Yves David
4919d2a337 mail: actually use the verifycert config value
The mail module only verifies the smtp ssl certificate if 'verifycert' is enabled
(the default). The 'verifycert' can take three possible values:

- 'strict'
- 'loose'
- any "False" value, eg: 'false' or '0'

We tested the validity of the third value, but never converted it to actual
falseness, making 'False' an equivalent for 'loose'.

This changeset fixes it.
2014-11-05 18:31:39 +00:00
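The bug fixed in 4919d2a337 comes down to a non-empty string such as 'false' being truthy. A minimal sketch of the conversion step, under stated assumptions: `normalize_verifycert` is a hypothetical helper, and the set of strings treated as false is an assumption for illustration rather than a statement of what the mail module accepts.

```python
# Assumed spellings of "false" for this sketch only.
_FALSE_STRINGS = {'0', 'no', 'false', 'off', 'never'}


def normalize_verifycert(value):
    """Map the raw config string to 'strict', 'loose' or an actual False."""
    if value in ('strict', 'loose'):
        return value
    if value.lower() in _FALSE_STRINGS:
        return False
    raise ValueError('invalid verifycert value: %r' % value)


raw = 'false'
print(bool(raw))                        # the raw string is truthy
print(bool(normalize_verifycert(raw)))  # the converted value is not
```

Validating the value without converting it leaves later `if verifycert:` style checks seeing a truthy string, which is how 'False' ended up behaving like 'loose'.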
Thomas Arendsen Hein
64014abd9c convert: use git diff-tree -Cn% instead of --find-copies=n% for older git
The option --find-copies was added in a later git version than the one included
in Debian squeeze-lts (1.7.2.5), probably around 1.7.4.
2014-11-06 09:36:39 +01:00
Pierre-Yves David
9984e5c699 bookmarks: fix formatting of exchange message (issue4439)
The message formatting was crashing when doing explicit pulling `hg pull -B X`.
This changeset fixes it and improves the test coverage.
2014-11-05 17:25:00 +00:00
Mads Kiilerich
c5488ba34c discovery: indices between sample and yesno must match (issue4438)
2ec3e28dea6b changed 'sample' from a list to a set. The iteration order is thus
undefined and the yesno indices are not stable.

To solve this, repeat the listification and comment from elsewhere in the code.

Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
2014-11-05 13:05:32 +01:00
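The "listification" mentioned in c5488ba34c can be sketched like this. It is not the discovery code: `classify` and `remote_known` are hypothetical names, and the remote is modelled as a function that returns one boolean per node, in the order the nodes were sent. The point is that the sample is frozen into a list once, and that same list is used both for the query and for pairing with the answers.

```python
def classify(sample, remote_known):
    """Split a sample into (common, missing) using positional yes/no answers."""
    sample = list(sample)          # fix one order and reuse it below
    yesno = remote_known(sample)   # answers are positional
    common = {n for n, known in zip(sample, yesno) if known}
    missing = {n for n, known in zip(sample, yesno) if not known}
    return common, missing


remote_has = {'a', 'c'}
common, missing = classify({'a', 'b', 'c'},
                           lambda nodes: [n in remote_has for n in nodes])
print(sorted(common), sorted(missing))
```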
Mads Kiilerich
8079358ce3 discovery: limit 'all local heads known remotely' to real 'all' (issue4438)
2ec3e28dea6b made it possible that the initial head check didn't include all
heads. If that is the case, don't use the early exit just because this random
sample happened to be 'all known'.

Note: the randomness in the discovery protocol can make this problem hard to
reproduce.
2014-11-05 13:05:29 +01:00
Matt Harbison
298c02c65a templater: don't overwrite the keyword mapping in runsymbol() (issue4362)
This keyword remapping was introduced in 236440938a03 as part of converting
generator based iterators into list based iterators, mentioning "undesired
behavior in template" when a generator is exhausted, but doesn't say what and
introduces no tests.

The problem with the remapping was that it corrupted the output for keywords
like 'extras', 'file_copies' and 'file_copies_switch' in templates such as:

    $ hg log -r 82a4f5557c6b --template "{file_copies % ' File: {file_copy}\n'}"
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)
    File: mercurial/changelog.py (mercurial/hg.py)

What was happening was that in the first call to runtemplate() inside runmap(),
'lm' mapped the keyword (e.g. file_copies) to the appropriate showxxx() method.
On each subsequent call to runtemplate() in that loop however, the keyword was
mapped to a list of the first item's pieces, e.g.:

   'file_copy': ['mercurial/changelog.py', ' (', 'mercurial/hg.py', ')']

Therefore, the dicts for the second and any subsequent items were not processed
through the corresponding showxxx() method, and the first item's data was
reused.

The 'extras' keyword regressed in 56b014c52204, and 'file_copies' regressed in
4e182fb53989 for other reasons.  The common thread of things fixed by this seems
to be when a list of dicts is passed to the templatekw._hybrid class.
2014-11-03 12:08:03 -05:00
Pierre-Yves David
160c394fe7 phases: read pending data when appropriate
If we are called by a hook and pending data exists, read those.
2014-10-17 22:23:06 -07:00
Pierre-Yves David
4012eb31b0 bookmark: read pending data when appropriate
If we are called by a hook and pending data exists, read it.
2014-09-28 21:27:48 -07:00
Pierre-Yves David
0e44aeb8c0 test-bundle2: check visible data in pre/post-transaction hooks
We are about to make bookmarks and phases available for hooks.
Therefore we need a witness for this new availability. We introduce
the new hooks in a distinct changeset to reduce the noise in the ones
with actual changes.
2014-11-12 16:54:57 +00:00
Pierre-Yves David
3ace7493d7 transaction: write pending generated files
Such files are generated with a .pending suffix. It is up to the reader to
implement the necessary logic for reading pending files.

We add a test to ensure pending files are properly cleaned-up in both success and
error cases.
2014-10-17 22:19:05 -07:00
Pierre-Yves David
58e32f1eeb transaction: have _generatefile return a boolean
The function returns True if any files were generated. This will be
used to know if any pending files have been written.
2014-10-17 21:57:32 -07:00
Pierre-Yves David
81a1fe4d5b transaction: allow generating files with a suffix
This will allow us to generate temporary pending files. Files
generated with a suffix are assumed temporary and will be cleaned up
at the end of the transaction.
2014-09-29 01:29:08 -07:00
Matt Mackall
3f845e51cb transaction: fix some docstring grammar 2014-11-19 09:52:05 -06:00
Pierre-Yves David
ecac877d99 transaction: accept a 'location' argument for registertmp
This will allow generation of temporary files outside of store. This will be
useful for bookmarks.
2014-11-12 14:57:41 +00:00
Matt Harbison
984e26eb0f tests: handle differences between missing file error strings on Windows and Unix 2014-11-18 23:51:58 -05:00
Matt Harbison
a1af7899a6 run-tests: don't warn on unnecessary globs mandated by check-code.py
When test output is processed, if os.altsep is defined (i.e. on Windows),
TTest.globmatch() will cause a warning later on if a line has a glob that isn't
necessary.  Unfortunately, the regex checking in check-code.py doesn't have this
context.  Therefore we ended up with cases where the test would get flagged with
a warning only on Windows because a glob was present, because check-code.py
would warn if it wasn't.  For example, from test-subrepo.t:

    $ hg -R issue1852a push `pwd`/issue1852c
    pushing to $TESTTMP/issue1852c (glob)

The glob isn't necessary here because the slash is shown as it was provided.
However, check-code mandates one to handle the case where the default path has
backslashes in it.

Break the cycle by checking against a subset of the check-code rules before
flagging the test with a warning, and ignore the superfluous glob if it matches
a rule.  This change fixes warnings in test-largefiles-update.t, test-subrepo.t,
test-tag.t, and test-rename-dir-merge.t on Windows.

I really hate that the rules are copy/pasted here (minus the leading two spaces)
because it would be nice to only update the rules once, in a single place.  But
I'm not sure how else to do it.  I'm open to suggestions.  Splitting some of the
rules out of check-code.py seems wrong, but so does moving check-code.py out of
contrib, given that other checking scripts live there.

There are other glob patterns that could be copied over, but this is enough to
make the current tests run on Windows.
2014-11-18 22:02:00 -05:00
Martin von Zweigbergk
f29370d747 update: remove unnecessary check for unknown files with --check
As far as I and the test suite can tell, the checks in manifestmerge()
already report the errors (whether or not --check is given), so we
don't need to call merge.checkunknown(). Since this is the last call
to the method, also remove the method.
2014-11-18 16:14:32 -08:00
Matt Mackall
cbe34eb85a merge with stable 2014-11-18 12:29:30 -06:00
Matt Harbison
ca194aa932 tests: move a multi-statement debuglocks hook into a shell script for Windows
Before this patch, a part of "test-push-hook-lock.t" fails unexpectedly in a
Windows environment, because a semicolon (";") isn't recognized as the command
separator by "cmd.exe".  This is fixed the same way as a similar issue in
7c253c23de3b.
2014-11-16 22:03:57 -05:00
Matt Harbison
d1e1ea0e07 tests: fix globs for Windows
test-largefiles-update.t, test-subrepo.t, test-tag.t, and
test-rename-dir-merge.t still warn about no result returned because of
unnecessary globs that test-check-code-hg.t wants, relating to output for
pushing to, pulling from and moving X to Y.
2014-11-16 16:26:15 -05:00
Matt Harbison
46cd7c6aa4 run-tests: include quotes in the HGEDITOR value when storing sys.executable
This fixes test-install.t on Windows that broke in 97300cee8fc0 when
shlex.split() was added to the debuginstall command:

    @@ -7,8 +7,11 @@
       checking installed modules (*mercurial)... (glob)
       checking templates (*mercurial?templates)... (glob)
       checking commit editor...
    +   Can't find editor 'c:\Python27\python.exe -c "(omitted)"' in PATH
    +   (specify a commit editor in your configuration file)
       checking username...
    -  no problems detected
    +  1 problems detected, please check your install!
    +  [1]

What happens is that shlex.split() on Windows turns this:

    c:\Python27\python.exe -c "import sys; sys.exit(0)"

into this:

    ['c:Python27python.exe', '-c', 'import sys; sys.exit(0)']

While technically a regression, most programs on Windows live in some flavor of
'Program Files', and therefore the environment variable needs to contain quotes
anyway to handle the space.  This wasn't handled prior to the shlex() change,
because it tested the whole environment variable to see if it was an executable,
or split on the first space and tested again.
2014-11-04 12:46:00 -05:00
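The splitting behaviour described in 46cd7c6aa4 can be reproduced with the standard `shlex` module in its default POSIX mode. The two command strings below are illustrative, built from the example in the message: outside quotes a backslash escapes the following character and is dropped, while inside double quotes it is kept unless it precedes a double quote or another backslash.

```python
import shlex

# Unquoted Windows path: the backslashes are consumed as escape characters.
unquoted = r'c:\Python27\python.exe -c "import sys; sys.exit(0)"'
# Same command with the executable path wrapped in double quotes.
quoted = r'"c:\Python27\python.exe" -c "import sys; sys.exit(0)"'

unquoted_argv = shlex.split(unquoted)
quoted_argv = shlex.split(quoted)

print(unquoted_argv)
print(quoted_argv)
```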
Siddharth Agarwal
fba9f14547 setdiscovery: avoid a full changelog graph traversal
We were definitely being suboptimal here: we were constructing two full sets,
one with the full set of common nodes (i.e. a graph traversal) and one with all
nodes. Then we subtract one set from the other. This whole process is
O(commits) and causes discovery to be significantly slower than it should be.

Instead, keep track of common incrementally and keep undecided as small as
possible.

This makes discovery massively faster on large repos: on one such repo, 'hg
debugdiscovery' over SSH with one commit missing on the client and five on the
server went from 4.5 seconds to 1.5. (An 'hg debugdiscovery' with no commits
missing on the client, i.e. connection startup time, was 1.2 seconds.)
2014-11-16 00:40:29 -08:00
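The idea in fba9f14547 of tracking common nodes incrementally, rather than rebuilding two full sets and subtracting them, can be sketched on a toy graph. This is an illustration under assumptions and not the setdiscovery or ancestor module code: the graph is a plain dict from node to parents, and `undecided_full` and `IncrementalUndecided` are hypothetical names.

```python
def ancestors_inclusive(parents, nodes):
    """All ancestors of the given nodes, including the nodes themselves."""
    seen = set()
    stack = list(nodes)
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(parents.get(n, ()))
    return seen


def undecided_full(parents, allnodes, common_heads):
    """Full recomputation: traverse the common part of the graph every time."""
    return set(allnodes) - ancestors_inclusive(parents, common_heads)


class IncrementalUndecided:
    """Keep 'undecided' small by removing ancestors as common nodes are learned."""

    def __init__(self, parents, undecided):
        self._parents = parents
        self._common = set()  # nodes whose ancestors were already processed
        self.undecided = set(undecided)

    def addcommon(self, nodes):
        stack = list(nodes)
        while stack:
            n = stack.pop()
            if n in self._common:
                continue  # this part of the graph was walked before
            self._common.add(n)
            self.undecided.discard(n)
            stack.extend(self._parents.get(n, ()))


# Toy history: 0 <- 1 <- 2 <- 3 <- 4, with 5 branching off 2.
parents = {1: [0], 2: [1], 3: [2], 4: [3], 5: [2]}
inc = IncrementalUndecided(parents, range(6))
inc.addcommon([2])
inc.addcommon([5])
print(sorted(inc.undecided))
```

Each call only walks the part of the graph that was not already known to be common, so repeated updates do not repeat the traversal.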
Siddharth Agarwal
1a87e8b8c3 ancestor: add a way to remove ancestors of bases from a given set
This and missingancestors can share state, which will turn out to be perfect
for set discovery.
2014-11-14 19:40:30 -08:00
Siddharth Agarwal
0d3efeefd2 ancestor: add a way to add to bases of a missing ancestor object
This will be useful for setdiscovery, since with that we incrementally add to
our knowledge of common nodes.
2014-11-14 17:21:00 -08:00
Siddharth Agarwal
8c7869477d ancestor: add a way to test whether a missing ancestor object has bases
This is pretty trivial so there's no unit test coverage for it.

This will be used by setdiscovery.
2014-11-16 00:39:29 -08:00
Siddharth Agarwal
078961d745 ancestor: remove now-unused missingancestors function
Callers should use revlog.incrementalmissingrevs instead.
2014-11-14 16:53:40 -08:00
Siddharth Agarwal
2d669c474b revlog: switch findmissing* methods to incrementalmissingrevs
This will allow us to remove ancestor.missingancestors in an upcoming patch.
2014-11-14 16:52:40 -08:00
Siddharth Agarwal
5692148f49 revlog: add a method to get missing revs incrementally
This will turn out to be useful for discovery.
2014-11-16 00:39:48 -08:00