Commit Graph

17491 Commits

Author SHA1 Message Date
Bryan O'Sullivan
a150198558 store: implement fncache basic path encoding in C
(This is not yet enabled; it will be turned on in a followup patch.)

The path encoding performed by fncache is complex and (perhaps
surprisingly) slow enough to negatively affect the overall performance
of Mercurial.

For a short path (< 120 bytes), the Python code can be reduced to a fairly
tractable state machine that either determines that nothing needs to be
done in a single pass, or performs the encoding in a second pass.

For longer paths, we avoid the more complicated hashed encoding scheme
for now, and fall back to Python.

Raw performance: I measured in a repo containing 150,000 files in its tip
manifest, with a median path name length of 57 bytes, and 95th percentile
of 96 bytes.

In this repo, the Python code takes 3.1 seconds to encode all path
names, while the hybrid C-and-Python code (called from Python) takes
0.21 seconds, for a speedup of about 14x.

Across several other large repositories, I've measured the speedup from
the C code at between 26x and 40x.

For path names above 120 bytes where we must fall back to Python for
hashed encoding, the speedup is about 1.7x.  Thus absolute performance
will depend strongly on the characteristics of a particular repository.
2012-09-18 15:42:19 -07:00
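The two-pass structure described in a150198558 can be sketched as follows. This is a minimal Python illustration, not the C code from the patch. It covers only a simplified subset of the encoding (uppercase letters and '_' escaped with a leading '_'), and the names needs_encoding and encode_two_pass are hypothetical.

```python
def _must_escape(c):
    # Simplified rule set (assumption): only uppercase ASCII letters and '_'
    # are rewritten. The real fncache encoding has more rules.
    return c == "_" or "A" <= c <= "Z"


def needs_encoding(path):
    """First pass: report whether any character would be rewritten."""
    return any(_must_escape(c) for c in path)


def encode_two_pass(path):
    """Return path unchanged when it is clean, else build the encoded form."""
    if not needs_encoding(path):
        return path  # common case: nothing to do after a single scan
    out = []
    for c in path:  # second pass: only reached when encoding is required
        out.append("_" + c.lower() if _must_escape(c) else c)
    return "".join(out)
```

A path such as "data/foo.i" comes back untouched after the first scan, while "data/Foo_bar.i" goes through the second pass and becomes "data/_foo__bar.i".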
Pierre-Yves David
7a06ce262e rebase: ensure rebase does not revive extinct revision
Here, we exclude hidden changesets from a rebase operation. If we
don't, a rewritten version of the hidden changesets will be created
by rebase. Those rewritten versions won't be hidden and will likely
conflict with other rewrites or revive pruned changesets. Moreover,
rewriting hidden revisions will surprise the user.

This change would not be necessary if changelog filtering were
already in core. But it's fairly cheap and helps to increase the
test-suite for such filtering.

Once changelog level filtering is added, hidden changes will be
automatically excluded or included according to the global --hidden
flag. Simply ignoring them is good enough for now.
2012-09-18 23:32:42 +02:00
Pierre-Yves David
6fa51bbd6d rebase: remove useless list around repo.revs
repo.revs already returns a list.
2012-09-18 23:29:05 +02:00
Pierre-Yves David
4e2a2d8f61 rebase: properly handle --collapse when creating obsolescence marker
In collapse mode, the content of state is not suitable for computing obsolescence
markers. We explicitly pass the resulting revision instead and use it as the
successor for all elements of the rebased set.
2012-09-18 23:42:27 +02:00
Pierre-Yves David
be6378e49d rebase: allow creation obsolescence relation instead of stripping
When the obsolescence feature is enabled, we now create markers from the rebased
set to the resulting set instead of stripping. The "state" mapping built by
rebase holds all necessary data.

Changesets "deleted" by the rebase are marked as "succeeded" by the changeset they
would have been rebased on. That is the best guess of "successors" we have. Getting
successors that are as meaningful as possible is important for automatic resolution of
obsolescence troubles. In other words, emptied changesets will look collapsed
with their former parents. (See the "empty changeset" section of the test if you are
still confused.)
2012-09-18 23:13:31 +02:00
Pierre-Yves David
f0bddab386 rebase: extract final changesets cleanup logic in a dedicated function
At the end of the rebase, rebased changesets are currently stripped. This
behavior will eventually be dropped in favor of obsolescence marker creation.

The main rebase function is already big and branchy enough. This changeset moves
the clean-up logic into a dedicated function before we make it more complex.
2012-09-18 22:58:12 +02:00
Bryan O'Sullivan
024c511c7b store: refactor hashed encoding into its own function
2012-09-18 14:37:32 -07:00
Adrian Buehlmann
fc4c657eab store: reuse direncoded path in _hybridencode
For a netbeans clone on Windows 7 x64:

  Before:
    $ hg perffncacheencode
    ! wall 3.516000 comb 3.525623 user 3.525623 sys 0.000000 (best of 3)

  After:
    $ hg perffncacheencode
    ! wall 3.443000 comb 3.447622 user 3.447622 sys 0.000000 (best of 3)
2012-09-18 19:51:59 +02:00
Adrian Buehlmann
6935b60c2a store: extract functions _encodefname and _decodefname
2012-09-18 19:51:48 +02:00
Adrian Buehlmann
6123a70068 store: use fast C implementation of encodedir() if it's available
For a netbeans clone on Windows 7 x64:

  Encoding all paths in the fncache:

    Before:
      $ hg perffncacheencode
      ! wall 3.639000 comb 3.634823 user 3.634823 sys 0.000000 (best of 3)
    After:
      $ hg perffncacheencode
      ! wall 3.470000 comb 3.463222 user 3.463222 sys 0.000000 (best of 3)

  Writing fncache:

    Before:
      $ hg perffncachewrite
      ! wall 0.103000 comb 0.093601 user 0.093601 sys 0.000000 (best of 95)
    After:
      $ hg perffncachewrite
      ! wall 0.081000 comb 0.078001 user 0.062400 sys 0.015600 (best of 100)
2012-09-18 11:44:16 +02:00
Adrian Buehlmann
fd6785ba1c pathencode: new C module with fast encodedir() function
Not yet used (will be enabled in a later patch).

This patch is a stripped down version of patches originally created by
Bryan O'Sullivan <bryano@fb.com>
2012-09-18 11:43:30 +02:00
Adrian Buehlmann
2dfd0c409a store: add multiline doctest case for encodedir()
A follow-up to 7090b12b599b.
2012-09-18 07:58:50 +02:00
Adrian Buehlmann
7bf349fd77 store: optimize fncache._load a bit by dirdecoding the contents in one go
For a netbeans clone on Windows 7 x64:

  Before:
    $ hg perffncacheload
    ! wall 0.124000 comb 0.124801 user 0.124801 sys 0.000000 (best of 76)

  After:
    $ hg perffncacheload
    ! wall 0.096000 comb 0.093601 user 0.078001 sys 0.015600 (best of 97)
2012-09-17 11:00:38 +02:00
Thomas Arendsen Hein
41c1ca4d45 wireproto: workaround for yield inside try/finally incompatible with python2.4
2012-09-18 17:00:58 +02:00
Thomas Arendsen Hein
53ccf3443a merge with stable
2012-09-18 15:36:58 +02:00
Thomas Arendsen Hein
658554b7a9 merge with main
2012-09-18 15:30:22 +02:00
Thomas Arendsen Hein
a5ec5f8a67 largefiles: fix trailing spaces in test-largefiles.t
With the default branch this will cause warnings from check-code.
2012-09-18 15:29:43 +02:00
Matt Mackall
b0dd832e65 merge with stable
2012-09-17 15:13:17 -05:00
Matt Mackall
c4e6f2ab1f merge with crew
2012-09-17 15:13:03 -05:00
Patrick Mezard
cad875fb81 Merge with stable
2012-09-17 21:53:50 +02:00
Tim Delaney
0872da322b hgweb: fix incorrect graph padding calculation (issue3626)
hgweb has an incorrect padding calculation, causing the text to move further
away from the graph the more branches there are (issue3626). This patch fixes
all existing templates (gitweb, monoblue, paper and spartan).

Tests updated by Patrick Mezard <patrick@mezard.eu>
2012-09-17 21:33:16 +02:00
Adrian Buehlmann
b335701920 test-hybridencode: add a case for direncode
2012-09-16 22:43:24 +02:00
Adrian Buehlmann
039dd48b84 store: optimize fncache._write by direncoding the contents in one go
For a netbeans clone on Windows 7 x64:

  Before:
    $ hg perffncachewrite
    ! wall 0.210000 comb 0.218401 user 0.202801 sys 0.015600 (best of 47)

  After:
    $ hg perffncachewrite
    ! wall 0.104000 comb 0.109201 user 0.078000 sys 0.031200 (best of 95)
2012-09-17 08:58:35 +02:00
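The idea behind 039dd48b84 (direncoding the whole fncache contents in one go rather than entry by entry) can be sketched as below. This is a minimal sketch, not the code from the patch: encodedir here is assumed to amount to three string replacements, and write_per_entry / write_in_one_go are hypothetical names for the old and new approaches.

```python
def encodedir(path):
    # Assumption: directory encoding boils down to these replacements, which
    # protect directory names ending in .hg, .i or .d.
    return (path.replace(".hg/", ".hg.hg/")
                .replace(".i/", ".i.hg/")
                .replace(".d/", ".d.hg/"))


def write_per_entry(entries):
    """Baseline: one encodedir() call per entry."""
    return "".join(encodedir(e) + "\n" for e in entries)


def write_in_one_go(entries):
    """Join first, then encode the whole text with a single call."""
    if not entries:
        return ""
    return encodedir("\n".join(entries)) + "\n"
```

Both functions produce the same text because none of the replaced patterns contains a newline, so a match can never span two entries. The single-call form simply trades many small Python-level calls for a few passes over one large string.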
Adrian Buehlmann
c239e1837f store: move encode lambda logic into fncachestore
and define two named functions at module scope.

This also speeds up perffncacheencode a little.
2012-09-16 11:41:02 +02:00
Adrian Buehlmann
e773964ca6 store: eliminate one level of lambda functions on _hybridencode
2012-09-16 11:36:14 +02:00
Adrian Buehlmann
bbb1196b99 store: parameter path of _auxencode is now a list of strings
2012-09-16 11:36:06 +02:00
Adrian Buehlmann
55185b8e33 store: keep an accumulated length for the shortened dirs in _hybridencode
so we don't have to repeatedly do '/'.join(sdirs) inside the loop
2012-09-16 11:36:00 +02:00
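The optimization in 55185b8e33 replaces a repeated '/'.join(sdirs) with a running length. A minimal sketch of that idea follows; the function names and the prefixlen and limit parameters are hypothetical, and the loop is a simplified stand-in for the one in _hybridencode.

```python
def collect_joining(dirs, prefixlen, limit):
    """Baseline: rebuild the joined string on every iteration."""
    sdirs = []
    for d in dirs:
        short = d[:prefixlen]
        if len("/".join(sdirs + [short])) > limit:
            break
        sdirs.append(short)
    return sdirs


def collect_accumulating(dirs, prefixlen, limit):
    """Track the joined length in an integer instead of re-joining."""
    sdirs = []
    total = 0  # length that '/'.join(sdirs) would have
    for d in dirs:
        short = d[:prefixlen]
        # Every element after the first costs one extra '/' separator.
        t = len(short) if not sdirs else total + 1 + len(short)
        if t > limit:
            break
        sdirs.append(short)
        total = t
    return sdirs
```

The baseline does work proportional to the accumulated length on each iteration, whereas the accumulating version does a constant amount of arithmetic per directory.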
Adrian Buehlmann
1184227af0 store: reorder basename assignment in _hybridencode
2012-09-16 11:35:55 +02:00
Adrian Buehlmann
40f2dc614d store: remove unneeded startswith('data/') checks in encodedir() and decodedir()
I don't think we will ever have anything in the store that resides inside a
directory that ends in .i or .d under store/ that we wouldn't want to have
direncoded. The files not under data/ surely don't need direncoding, but it
does no harm to let these few run through it. It hurts more to check whether the
thousands of other files start with 'data/'. They do anyway.

See also 67e6074ba430 (fixed with 0c522fe42894), which moved the direncoding
from filelog into store
2012-09-15 21:44:08 +02:00
Adrian Buehlmann
574c96ecb1 store: remove unneeded startswith('data/') check in _hybridencode()
2012-09-15 21:43:56 +02:00
Adrian Buehlmann
5b06fb5a39 store: refactor splitting off of "data/" in _hybridencode()
encodefilename() already calls encodedir(). Note that encodedir() skips the
encoding if the path doesn't start with "data/".
2012-09-15 21:43:14 +02:00
Adrian Buehlmann
007324cd1e store: let _auxencode() return the list of path segments
so we can avoid splitting the path again in _hybridencode()
2012-09-15 21:43:05 +02:00
Adrian Buehlmann
95efee1bf9 store: eliminate unneeded last assignment to n in _auxencode()
The check for a period or space at the end of the string is the last one, so the
local variable n is not used afterwards.
2012-09-15 21:42:58 +02:00
Adrian Buehlmann
d900f301da store: unindent most of the contents of the for loop in _auxencode()
by refactoring

    for i, n in enumerate(res):
        if n:
            <main code block>

to

    for i, n in enumerate(res):
        if not n:
            continue
        <main code block>

(no functional change)
2012-09-15 21:42:52 +02:00
Adrian Buehlmann
77d1f7ae29 store: optimize _auxencode() by assigning to the list elements of the path
2012-09-15 21:42:43 +02:00
Adrian Buehlmann
c774df44ee store: optimize _auxencode() a bit by grouping the reserved names by length
This reduces perffncacheencode wall time on Windows 7 x64 for my netbeans clone
here from 4.3 to 4.0 seconds (7% faster).
2012-09-15 21:41:09 +02:00
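Grouping the reserved names by length, as c774df44ee describes, lets the check bail out on length alone before any tuple lookup. The sketch below is an assumption about the shape of such a check, not the code from the patch; is_reserved and the two tuples are hypothetical names, and input is assumed to be lowercase already.

```python
# Windows reserved device names, grouped by the length of the base name.
_RES3 = ("aux", "con", "prn", "nul")  # exactly three characters
_RES4 = ("com", "lpt")                # three characters plus a digit 1..9


def is_reserved(segment):
    """Report whether the part before the first '.' is a reserved name."""
    dot = segment.find(".")
    baselen = len(segment) if dot == -1 else dot
    if baselen == 3:
        return segment[:3] in _RES3
    if baselen == 4:
        return "1" <= segment[3] <= "9" and segment[:3] in _RES4
    return False  # any other length cannot be reserved
```

Under these rules "aux.foo" counts as reserved while "foo.aux" does not, and "com0" and "lpt0" fall outside the 1..9 range, matching the cases called out in the neighbouring doctest commits.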
Adrian Buehlmann
8149bc6545 store: explain "aux.foo" versus "foo.aux" in doc of _auxencode()
2012-09-15 21:41:53 +02:00
Adrian Buehlmann
93e8048296 store: add 'com0' and 'lpt0' doctest cases for _auxencode()
These are already covered by test-hybridencode.py, but they are so noteworthy
that I think they deserve being shown right in that doctest.
2012-09-15 21:41:45 +02:00
Patrick Mezard
ed389c0023 wireproto: fix check-code.py breakage introduced by 6e0f31b6f59a
2012-09-15 08:38:02 +02:00
Nikolaj Sjujskij
88edfbe265 record: fix display of non-ASCII names in chunk selection
46cdcb89086f fixed display of non-ASCII names in file-selecting prompt, but
display in chunk selection remained broken. The reason is that using '%r' in
string formatting results in calling `repr` on file names, thus mangling
non-ASCII ones.
2012-09-15 00:06:08 +04:00
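The '%r' problem fixed by 88edfbe265 is easy to reproduce. The original code ran under Python 2, where file names are byte strings; the sketch below uses a Python 3 bytes object to show the same effect. The prompt text and variable names are illustrative only.

```python
# A UTF-8 encoded file name, as a byte string.
name = "caf\u00e9.txt".encode("utf-8")

# '%r' calls repr() on the name, so non-ASCII bytes become \x escapes.
with_repr = "examine changes to %r?" % name

# Formatting the decoded text keeps the name readable.
with_str = "examine changes to '%s'?" % name.decode("utf-8")
```

Here with_repr contains the escaped form b'caf\xc3\xa9.txt', while with_str shows the name as the user typed it.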
Patrick Mezard
a4c03168e1 tests: enable even more Windows server tests
2012-09-14 21:05:24 +02:00
Patrick Mezard
0d4dcbceb6 test-obsolete-checkheads: fix on windows
2012-09-14 20:40:52 +02:00
Bryan O'Sullivan
9c2cc273fb sshserver: avoid a multi-dot attribute lookup in a hot loop
This improves stream_out performance by about 3%.
2012-09-14 12:09:44 -07:00
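The technique named in 9c2cc273fb, hoisting a multi-dot attribute lookup out of a hot loop, looks roughly like this. The Server class and its method names are hypothetical and stand in for the real sshserver code.

```python
import io


class Server:
    def __init__(self):
        self.fout = io.BytesIO()

    def send_slow(self, chunks):
        for chunk in chunks:
            # Looks up self.fout, then .write, on every iteration.
            self.fout.write(chunk)

    def send_fast(self, chunks):
        write = self.fout.write  # resolve the bound method once
        for chunk in chunks:
            write(chunk)
```

Both methods write identical output. The second only avoids repeating two attribute lookups per chunk, which matters when the loop body is otherwise tiny.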
Bryan O'Sullivan
b5119e949d store: reduce string concatenation when joining
This improves stream_out performance by a couple of percent.
2012-09-14 12:09:05 -07:00
Bryan O'Sullivan
1dc1178198 scmutil: use the new faster path split
Combined with a few other patches in this series, this contributes
to improving stream_out performance by 10%.
2012-09-14 12:08:55 -07:00
Bryan O'Sullivan
e3555667b8 util: implement a faster os.path.split for posix systems
This is not yet used.
2012-09-14 12:08:17 -07:00
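A specialised POSIX path split of the kind e3555667b8 describes can be sketched as follows. This is an assumption about what such a function might look like, not the code from the patch. It aims to return the same results as posixpath.split for str paths while skipping that function's generic handling.

```python
def split(p):
    """Split a POSIX path into (head, tail), like posixpath.split."""
    ht = p.rsplit("/", 1)
    if len(ht) == 1:
        return "", p  # no slash at all: everything is the tail
    nh = ht[0].rstrip("/")
    if nh:
        return nh, ht[1]  # drop redundant trailing slashes from the head
    # The head consisted only of slashes (or was empty): keep the root.
    return ht[0] + "/", ht[1]
```

For example, split("a//b") yields ("a", "b") and split("/a") yields ("/", "a").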
Bryan O'Sullivan
7df4ae95dc scmutil: make join cheaper
Combined with a few followup patches, this contributes to improving
stream_out performance by 10%.
2012-09-14 12:07:33 -07:00
Bryan O'Sullivan
10380b320f wireproto: don't format a debug string inside a hot loop
This improves stream_out performance by about 5%.
2012-09-14 12:06:40 -07:00
Bryan O'Sullivan
55e3685d62 wireproto: bypass filechunkiter for small files when streaming
Merely creating and using a generator has a measurable impact,
particularly since the common case for stream_out is generators that
yield just once. Avoiding generators improves stream_out performance
by about 7%.
2012-09-14 12:05:37 -07:00
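The approach in 55e3685d62, bypassing a chunk generator for small files, can be sketched as below. The names iter_chunks and stream_file and the 64 KiB threshold are assumptions for illustration and are not taken from the patch.

```python
import io


def iter_chunks(fp, chunksize=65536):
    """Generic path: yield successive chunks until the file is exhausted."""
    while True:
        data = fp.read(chunksize)
        if not data:
            break
        yield data


def stream_file(fp, size, write, chunksize=65536):
    """Send a file of known size, avoiding the generator when it is small."""
    if size <= chunksize:
        write(fp.read(size))  # a single read, no generator object created
    else:
        for chunk in iter_chunks(fp, chunksize):
            write(chunk)
```

A usage example: stream_file(io.BytesIO(b"hello"), 5, out.append) performs a single read and a single write. Files larger than the threshold still go through the chunked path.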
Bryan O'Sullivan
0b5b186a96 wireproto: don't audit local paths during stream_out
Auditing at this stage is both pointless (paths are already trusted by
the local repo) and expensive. Skipping the audits improves stream_out
performance by about 15%.
2012-09-14 12:05:12 -07:00