Commit Graph

423 Commits

Author SHA1 Message Date
Siddharth Agarwal
cca6ff3076 revlog: remove incancestors since it is no longer used 2012-12-17 15:08:37 -08:00
Siddharth Agarwal
6d5198a5a3 revlog.ancestors: add support for including revs
This is in preparation for an upcoming refactoring. This also fixes a bug in
incancestors, where if an element of revs was an ancestor of another it would
be generated twice.
2012-12-17 15:13:51 -08:00
Pierre-Yves David
9ac120f569 revlog: allow reverse iteration with revlog.revs
We often need to perform rev iteration in reverse order. This
changeset makes it possible to do so, in order to avoid costly reverse
or reversed() calls later.
2012-11-21 00:42:05 +01:00
Siddharth Agarwal
b05b94a300 revlog: add rev-specific variant of findmissing
This will be used by rebase in an upcoming commit.
2012-11-26 10:48:24 -08:00
Siddharth Agarwal
76a23a18f8 revlog: switch findmissing to use ancestor.missingancestors
This also speeds up other commands that use findmissing, like
incoming and merge --preview. With a large linear repository (>400000
commits) and with one incoming changeset, incoming is sped up from
around 4-4.5 seconds to under 3.
2012-11-26 11:02:48 -08:00
Durham Goode
cce0517fb6 commit: increase perf by avoiding unnecessary filteredrevs check
When commiting to a repo with lots of history (>400000 changesets)
the filteredrevs check (added with 373606589de5) in changelog.py
takes a bit of time even if the filteredrevs set is empty. Skipping
the check in that case shaves 0.36 seconds off a 2.14 second commit.
A 17% gain.
2012-11-16 15:39:12 -08:00
Pierre-Yves David
6d6a3d27a5 clfilter: split revlog.headrevs C call from python code
Make the pure python implementation of headrevs available to derived classes. It
is important because filtering logic applied by `revlog` derived class won't
have effect on `index`. We want to be able to bypass this C call to implement
our own.
2012-09-03 14:19:45 +02:00
Pierre-Yves David
23fb63d637 clfilter: handle non contiguous iteration in revlov.headrevs
This prepares changelog level filtering.  We can't assume that any revision can
be heads because filtered revisions need to be excluded.

New algorithm:
- All revisions now start as "non heads",
- every revision we iterate over is made candidate head,
- parents of iterated revisions are definitely not head.

Filtered revisions are never iterated over and never considered as candidate
head.
2012-09-03 14:12:45 +02:00
Pierre-Yves David
6981326b92 clfilter: make the revlog class responsible of all its iteration
This prepares changelog level filtering. We need the algorithms used in revlog to
work on a subset of revisions.  To achieve this, the use of explicit range of
revision is banned. `range` and `xrange` calls are replaced by a `revlog.irevs`
method. Filtered super class can then overwrite the `irevs` method to filter out
revision.
2012-09-20 19:00:59 +02:00
Mads Kiilerich
2f4504e446 fix trivial spelling errors 2012-08-15 22:38:42 +02:00
Matt Mackall
5b06da939f backout 94ae81a4e338
This may have allowed unbounded I/O sizes with the current chunk
retrieval code.
2012-07-12 14:20:34 -05:00
Martin Geisler
c52341ae3f merge with main 2012-07-12 10:03:50 +02:00
Friedrich Kastner-Masilko
a6245a11d3 revlog: fix for generaldelta distance calculation
The decision whether or not to store a full snapshot instead of a delta is done
based on the distance value calculated in _addrevision.builddelta(rev).

This calculation traditionally used the fact of deltas only using the previous
revision as base. Generaldelta mechanism is changing this, yet the calculation
still assumes that current-offset minus chainbase-offset equals chain-length.
This appears to be wrong.

This patch corrects the calculation by means of using the chainlength function
if Generaldelta is used.
2012-07-11 12:38:42 +02:00
Bryan O'Sullivan
26f2c363fd revlog: make compress a method
This allows an extension to optionally use a new compression type based
on the options applied by the repo to the revlog's opener.

(decompress doesn't need the same treatment, as it can be replaced using
extensions.wrapfunction, and can figure out which compression algorithm
is in use based on the first byte of the compressed payload.)
2012-06-25 13:56:13 -07:00
Joshua Redstone
09130c5cf2 revlog: remove reachable and switch call sites to ancestors
This change does a trivial conversion of callsites to ancestors.
Followon diffs will switch the callsites over to revs.
2012-06-08 08:39:44 -07:00
Joshua Redstone
70aeee0070 revlog: add incancestors, a version of ancestors that includes revs listed
ancestors() returns the ancestors of revs provided. This func is like
that except it also includes the revs themselves in the total set of
revs generated.
2012-06-08 07:59:37 -07:00
Thomas Arendsen Hein
01adf7776d merge heads 2012-06-07 15:55:12 +02:00
Brad Hall
f20c06750f revlog: zlib.error sent to the user (issue3424)
Give the user the zlib error message instead of a backtrace when decompression
fails.
2012-06-04 14:46:42 -07:00
Joshua Redstone
e38b770424 revlog: add optional stoprev arg to revlog.ancestors()
This will be used as a step in removing reachable() in a future diff.
Doing it now because bryano is in the process of rewriting ancestors in
C.  This depends on bryano's patch to replace *revs with revs in the
declaration of revlog.ancestors.
2012-06-01 15:44:13 -07:00
Bryan O'Sullivan
141bd09daa revlog: descendants(*revs) becomes descendants(revs) (API)
Once again making the API more rational, as with ancestors.
2012-06-01 12:45:16 -07:00
Bryan O'Sullivan
6ba97b40c1 revlog: ancestors(*revs) becomes ancestors(revs) (API)
Accepting a variable number of arguments as the old API did is
deeply ugly, particularly as it means the API can't be extended
with new arguments.  Partly as a result, we have at least three
different implementations of the same ancestors algorithm (!?).

Most callers were forced to call ancestors(*somelist), adding to
both inefficiency and ugliness.
2012-06-01 12:37:18 -07:00
Bryan O'Sullivan
abdf4a8227 util: subclass deque for Python 2.4 backwards compatibility
It turns out that Python 2.4's deque type is lacking a remove method.
We can't implement remove in terms of find, because it doesn't have
find either.
2012-06-01 17:05:31 -07:00
Bryan O'Sullivan
bef5b61512 cleanup: use the deque type where appropriate
There have been quite a few places where we pop elements off the
front of a list.  This can turn O(n) algorithms into something more
like O(n**2).  Python has provided a deque type that can do this
efficiently since at least 2.4.

As an example of the difference a deque can make, it improves
perfancestors performance on a Linux repo from 0.50 seconds to 0.36.
2012-05-15 10:46:23 -07:00
Bryan O'Sullivan
a49ea963d7 revlog: switch to a C version of headrevs
The C implementation is more than 100 times faster than the Python
version (which is still available as a fallback).

In a repo with 330,000 revs and a stale .hg/cache/tags file, this
patch improves the performance of "hg tip" from 2.2 to 1.6 seconds.
2012-05-19 19:44:58 -07:00
Matt Mackall
42c30757a2 revlog: don't handle long for revision matching
The underlying C code doesn't support indexing by longs, there are no
legitimate reasons to use a long, and longs should generally be
converted to ints at a higher level by context's constructor.
2012-05-21 16:36:09 -05:00
Brodie Rao
a7ef0a0cc5 cleanup: "not x in y" -> "x not in y" 2012-05-12 16:00:57 +02:00
Bryan O'Sullivan
058dfb801d revlog: speed up prefix matching against nodes
The radix tree already contains all the information we need to
determine whether a short string is an unambiguous node identifier.
We now make use of this information.

In a kernel tree, this improves the performance of
"hg log -q -r24bf01de75" from 0.27 seconds to 0.06.
2012-05-12 10:55:08 +02:00
Matt Mackall
a97dbbe308 revlog: backout df8c4d732869
This regresses performance of 'hg branches', presumably because it's
visiting the revlog in the wrong order. This suggests we either need
to fix the branch code or add some read-behind to mitigate the effect.
2012-04-27 13:07:29 -05:00
Patrick Mezard
e9454c243f revlog: fix partial revision() docstring (from f4a6c9197dbd) 2012-04-13 10:14:59 +02:00
Matt Mackall
a6546db90e revlog: drop some unneeded rev.node calls in revdiff 2012-04-13 22:55:46 -05:00
Bryan O'Sullivan
62554752c6 revlog: avoid an expensive string copy
This showed up in a statprof profile of "hg svn rebuildmeta", which
is read-intensive on the changelog.  This two-line patch improved
the performance of that command by 10%.
2012-04-12 20:26:33 -07:00
Matt Mackall
4e0b41f193 revlog: increase readahead size 2012-04-13 21:35:48 -05:00
Bryan O'Sullivan
dc46676e81 parsers: use base-16 trie for faster node->rev mapping
This greatly speeds up node->rev lookups, with results that are
often user-perceptible: for instance, "hg --time log" of the node
associated with rev 1000 on a linux-2.6 repo improves from 0.3
seconds to 0.03.  I have not found any instances of slowdowns.

The new perfnodelookup command in contrib/perf.py demonstrates the
speedup more dramatically, since it performs no I/O.  For a single
lookup, the new code is about 40x faster.

These changes also prepare the ground for the possibility of further
improving the performance of prefix-based node lookups.
2012-04-12 14:05:59 -07:00
Matt Mackall
055cba03a8 revlog: allow retrieving contents by revision number 2012-04-08 12:38:02 -05:00
Matt Mackall
30645d82e7 revlog: add hasnode helper method 2012-04-07 15:43:18 -05:00
Pierre-Yves David
15ab7ccd15 revlog: make addgroup returns a list of node contained in the added source
This list will contains any node see in the source, not only the added one.
This is intended to allow phase to be move according what was pushed by client
not only what was added.
2012-01-13 01:29:03 +01:00
Pierre-Yves David
a51dc67424 revlog: improve docstring for findcommonmissing 2012-01-09 04:15:31 +01:00
Steven Brown
3ebdb5ed19 revlog: clarify strip docstring "readd" -> "re-add"
I misread it as "read".
2012-01-10 22:35:25 +08:00
Matt Mackall
864ce9da04 misc: adding missing file close() calls
Spotted by Victor Stinner <victor.stinner@haypocalc.com>
2011-11-03 11:24:55 -05:00
Greg Ward
bc1dfb1ac9 atomictempfile: make close() consistent with other file-like objects.
The usual contract is that close() makes your writes permanent, so
atomictempfile's use of close() to *discard* writes (and rename() to
keep them) is rather unexpected. Thus, change it so close() makes
things permanent and add a new discard() method to throw them away.
discard() is only used internally, in __del__(), to ensure that writes
are discarded when an atomictempfile object goes out of scope.

I audited mercurial.*, hgext.*, and ~80 third-party extensions, and
found no one using the existing semantics of close() to discard
writes, so this should be safe.
2011-08-25 20:21:04 -04:00
Augie Fackler
ea2e868e0f revlog: use getattr instead of hasattr 2011-07-25 15:43:55 -05:00
Matt Mackall
1b52b02896 check-code: catch misspellings of descendant
This word is fairly common in Mercurial, and easy to misspell.
2011-06-07 17:02:54 -05:00
Sune Foldager
7db447dd4c revlog: bail out earlier in group when we have no chunks 2011-06-03 20:32:54 +02:00
Martin Geisler
af8a35e078 check-code: flag 0/1 used as constant Boolean expression 2011-06-01 12:38:46 +02:00
Matt Mackall
66805ccfed revlog: stop exporting node.short 2011-05-21 15:01:28 -05:00
Matt Mackall
a6f2ad6f1e revlog: drop base() again
deltaparent does what's needed, and more "portably".
2011-05-18 17:05:30 -05:00
Sune Foldager
9a73f9bed3 revlog: linearize created changegroups in generaldelta revlogs
This greatly improves the speed of the bundling process, and often reduces the
bundle size considerably. (Although if the repository is already ordered, this
has little effect on both time and bundle size.)

For non-generaldelta clients, the reduced bundle size translates to a reduced
repository size, similar to shrinking the revlogs (which uses the exact same
algorithm). For generaldelta clients the difference is minor.

When the new bundle format comes, reordering will not be necessary since we
can then store the deltaparent relationsships directly. The eventual default
behavior for clients and servers is presented in the table below, where "new"
implies support for GD as well as the new bundle format:

                    old client                    new client
old server          old bundle, no reorder        old bundle, no reorder
new server, non-GD  old bundle, no reorder[1]     old bundle, no reorder[2]
new server, GD      old bundle, reorder[3]        new bundle, no reorder[4]

[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra
    work on the server
[4] reordering isn't needed because client can get deltabase in bundle
    format

Currently, the default is to reorder on GD-servers, and not otherwise. A new
setting, bundle.reorder, has been added to override the default reordering
behavior. It can be set to either 'auto' (the default), or any true or false
value as a standard boolean setting, to either force the reordering on or off
regardless of generaldelta.


Some timing data from a relatively branch test repository follows. All
bundling is done with --all --type none options.

Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M

Without reorder (default):
Bundle time: 14.4 seconds
Bundle size: 939M

With reorder:
Bundle time: 1 minute, 29.3 seconds
Bundle size: 381M

Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M

Without reorder:
Bundle time: 2 minutes, 1.4 seconds
Bundle size: 939M

With reorder (default):
Bundle time: 25.5 seconds
Bundle size: 381M
2011-05-18 23:26:26 +02:00
Sune Foldager
c222fc4662 changelog: don't use generaldelta 2011-05-16 13:06:48 +02:00
Sune Foldager
d7f01e602b revlog: get rid of defversion
defversion was a property (later option) on the store opener, used to propagate
the changelog revlog format to the other revlogs, so they would be created with
the same format.

This required that the changelog instance was created before any other revlog;
an invariant that wasn't directly enforced (or documented) anywhere.

We now use the revlogv1 requirement instead, which is transfered to the store
opener options. If this option is missing, v0 revlogs are created.
2011-05-16 12:44:34 +02:00
Matt Mackall
608041d55e revlog: restore the base method 2011-05-15 11:50:15 -05:00