Commit Graph

537 Commits

Author SHA1 Message Date
Yuya Nishihara
b67b0a75ea revset: drop pre-lazyset optimization for stringset of subset == entire repo
It was introduced at deb42ca4dd93, where spanset.__contains__() did not exist.
Nowadays, we have to pay a huge penalty for len(subset).

The following example showed that OR operation could be O(n * m^2)
(n: len(repo), m: number of OR operators, m >= 2) probably because of
filteredset.__len__.

revset #0: 0|1|2|3|4|5|6|7|8|9
0) wall 8.092713 comb 8.090000 user 8.090000 sys 0.000000 (best of 3)
1) wall 0.445354 comb 0.450000 user 0.430000 sys 0.020000 (best of 22)
2) wall 0.000389 comb 0.000000 user 0.000000 sys 0.000000 (best of 7347)
(0: 3.2.4, 1: 3.1.2, 2: this patch)
2015-01-03 10:25:08 +09:00
Pierre-Yves David
56b039c98c revset: fix first and last for generatorset (issue4465)
The code was just plain wrong.
2014-12-01 05:18:12 -08:00
Pierre-Yves David
2463533597 addset: fix first and last on sorted addset (issue4426)
Lazy sorting was not enforced on addset. This was made visible through MQ.
2014-11-01 22:58:30 +00:00
Martin von Zweigbergk
ef6448aa8b revset: don't recreate matcher for every revision
The matcher variable 'm' in checkstatus() is reset to None on each
call, so the caching of the matcher no longer happens as it was
intended. This seems to be a regression in 6b9fbae54476 (revset: added
lazyset implementation to checkstatus, 2014-01-03).

Fix by moving the cached matcher into the enclosing function so it's
actually cached across calls. This speeds up

  hg log -r 'modifies(mercurial/context.py)' >/dev/null

from 7.5s to 4s.

Also see similar fix in 5ff5c5c9e69f (revset: avoid recalculating
filesets, 2014-10-22).
2014-10-31 10:41:36 -07:00
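
The caching bug described in ef6448aa8b can be illustrated with a small sketch. It is not the Mercurial code: `build_matcher`, `checkstatus_uncached`, `checkstatus_cached`, and `count_builds` are hypothetical names, and the "matcher" is a trivial path comparison standing in for an expensive object.

```python
def build_matcher(pattern):
    """Hypothetical stand-in for an expensive matcher constructor."""
    build_matcher.calls += 1
    return lambda path: path == pattern


build_matcher.calls = 0


def checkstatus_uncached(revfiles, pattern):
    def matches(files):
        m = None  # reset on every call, so nothing is ever reused
        if m is None:
            m = build_matcher(pattern)
        return any(m(f) for f in files)

    return [rev for rev, files in revfiles if matches(files)]


def checkstatus_cached(revfiles, pattern):
    cache = []  # lives in the enclosing scope, shared across calls

    def matches(files):
        if not cache:
            cache.append(build_matcher(pattern))
        return any(cache[0](f) for f in files)

    return [rev for rev, files in revfiles if matches(files)]


def count_builds(func, revfiles, pattern):
    """Return (result, number of matcher builds) for one run of func."""
    build_matcher.calls = 0
    result = func(revfiles, pattern)
    return result, build_matcher.calls
```

Both variants select the same revisions; the uncached one builds one matcher per revision, the cached one builds a single matcher for the whole run.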
Durham Goode
c9e0ce83ec revset: fix O(2^n) perf regression in addset
hg log -r 1 ... -r 100 was never returning due to a regression in the
way addset computes __nonzero__. It used 'bool(self._r1 or self._r2)'
which required executing self._r1.__nonzero__ twice (once for the or,
once for the bool). hg log with a lot of -r's happens to build a one
sided addset tree of N length, which ends up being 2^N performance.

This patch fixes it by converting to bool before or'ing.

This problem can be repro'd with something as simple as:

hg log `for x in $(seq 1 50) ; do echo "-r $x "; done`

Adding '1 + 2 + ... + 20' to the revsetbenchmark.txt didn't seem to repro the
problem, so I wasn't able to add a revset benchmark for this issue.
2014-10-28 14:06:06 -07:00
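
A minimal sketch of the exponential blow-up described in c9e0ce83ec, written for Python 3 (`__bool__` instead of `__nonzero__`). The classes `Leaf`, `SlowAdd`, and `FastAdd` and the helper `leaf_checks` are illustrative only and are not the real addset implementation.

```python
class Leaf:
    calls = 0  # counts every truthiness check on any leaf

    def __init__(self, revs):
        self.revs = revs

    def __bool__(self):
        Leaf.calls += 1
        return bool(self.revs)


class SlowAdd:
    def __init__(self, r1, r2):
        self._r1, self._r2 = r1, r2

    def __bool__(self):
        # `or` tests self._r1 and returns it when truthy;
        # the outer bool() then tests that same object a second time.
        return bool(self._r1 or self._r2)


class FastAdd(SlowAdd):
    def __bool__(self):
        # Convert to bool first, so each operand is tested at most once.
        return bool(self._r1) or bool(self._r2)


def leaf_checks(add_cls, depth):
    """Build a one-sided tree of `depth` additions and count leaf checks."""
    Leaf.calls = 0
    node = Leaf([0])
    for rev in range(1, depth + 1):
        node = add_cls(node, Leaf([rev]))
    bool(node)
    return Leaf.calls
```

With a one-sided tree of depth 10, `leaf_checks(SlowAdd, 10)` performs 1024 leaf checks while `leaf_checks(FastAdd, 10)` performs 1, because each `SlowAdd` level doubles the work of the level below it.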
Yuya Nishihara
740a18d819 revset: avoid O(n) lookup of invalid revision in rev()
0cc5c10d5dc7 was not the final version of that patch.  It was really slow
because `l not in repo.changelog` iterates revisions up to `l`.  Instead,
rev() should utilize spanset.__contains__().

revset #0: rev(210000)
0) wall 0.000039 comb 0.000000 user 0.000000 sys 0.000000 (best of 67978)
1) wall 0.002721 comb 0.000000 user 0.000000 sys 0.000000 (best of 1055)
2) wall 0.000059 comb 0.000000 user 0.000000 sys 0.000000 (best of 45599)
(0: 3.2-rc, 1: 0cc5c10d5dc7, 2: this patch)

Note that the benchmark result described in 0cc5c10d5dc7 is wrong because
it is the one of the initial version.
2014-10-23 21:53:37 +09:00
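
The difference between the two membership tests in 740a18d819 can be sketched as follows. This assumes a container that only supports iteration (so `in` falls back to walking it) versus a span with a direct range check; `IterOnly` and `Span` are hypothetical classes, not the changelog or spanset themselves.

```python
class IterOnly:
    """Container without __contains__: `in` falls back to iteration."""

    def __init__(self, n):
        self.n = n
        self.steps = 0  # how many revisions were visited

    def __iter__(self):
        for rev in range(self.n):
            self.steps += 1
            yield rev


class Span:
    """Contiguous range of revisions with a constant-time membership test."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __contains__(self, rev):
        return self.start <= rev < self.end
```

Checking `500 in IterOnly(1000)` visits 501 revisions before answering, while `500 in Span(0, 1000)` is a pair of comparisons regardless of the revision number.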
Yuya Nishihara
bac0595bd4 revset: have rev() drop out-of-range or filtered rev explicitly (issue4396)
The recent optimization of "and" operation relies on the assumption that
the rhs set does not contain invalid revisions.  So rev() has to remove
invalid revisions.

This is still faster than using `.filter(lambda r: r == l)`.

revset #0: rev(25)
0) wall 0.026341 comb 0.020000 user 0.020000 sys 0.000000 (best of 113)
1) wall 0.000038 comb 0.000000 user 0.000000 sys 0.000000 (best of 66567)
2) wall 0.000062 comb 0.000000 user 0.000000 sys 0.000000 (best of 43699)
(0: 428fa22fb2d1^, 1: 3.2-rc, 2: this patch)
2014-10-19 16:48:33 +09:00
Matt Mackall
56b374dd4e revset: avoid recalculating filesets
This fixes a regression in ea41ca830940 that moved matcher building
into a callback, thus causing it to be rebuilt for each revision matched
against.
2014-10-22 15:47:27 -05:00
Pierre-Yves David
0d2e3a1dee revset-phases: prefetch attributes in phase-related revsets
Pre-fetching attributes gives a significant performance boost. Such is Python.


draft()
0) wall 0.011661 comb 0.010000 user 0.010000 sys 0.000000 (best of 205)
1) wall 0.009804 comb 0.000000 user 0.000000 sys 0.000000 (best of 231)

draft() - ::bookmark()
0) wall 0.014173 comb 0.010000 user 0.010000 sys 0.000000 (best of 177)
1) wall 0.012966 comb 0.010000 user 0.010000 sys 0.000000 (best of 182)
2014-10-16 17:46:58 -07:00
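
The attribute prefetching mentioned in 0d2e3a1dee amounts to binding a method to a local name once instead of resolving it on every iteration. A rough sketch, assuming a hypothetical `PhaseCache` class and a `DRAFT` constant that are not taken from Mercurial:

```python
DRAFT = 1  # assumed phase value, for illustration only


class PhaseCache:
    def __init__(self, phases):
        self._phases = phases

    def phase(self, rev):
        return self._phases[rev]


def draft_lookup_each_time(cache, revs):
    # `cache.phase` is resolved again for every revision.
    return [r for r in revs if cache.phase(r) == DRAFT]


def draft_prefetched(cache, revs):
    getphase = cache.phase  # attribute lookup done once, outside the loop
    return [r for r in revs if getphase(r) == DRAFT]
```

Both functions return the same revisions; the second only saves the repeated attribute lookup, which is the kind of small gain the benchmarks above show.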
Pierre-Yves David
8347f164c0 revset-phases: do not cache phase-related filters
The phase retrieval is fast enough to not require caching the result of the
functions.

draft()
0) wall 0.017209 comb 0.020000 user 0.020000 sys 0.000000 (best of 149)
1) wall 0.011654 comb 0.010000 user 0.010000 sys 0.000000 (best of 186)

public()
0) wall 0.018687 comb 0.010000 user 0.010000 sys 0.000000 (best of 128)
1) wall 0.013290 comb 0.010000 user 0.010000 sys 0.000000 (best of 181)

secret()
0) wall 0.017464 comb 0.020000 user 0.020000 sys 0.000000 (best of 127)
1) wall 0.011499 comb 0.000000 user 0.000000 sys 0.000000 (best of 196)

draft() - ::bookmark()
0) wall 0.020099 comb 0.020000 user 0.020000 sys 0.000000 (best of 127)
1) wall 0.014399 comb 0.020000 user 0.020000 sys 0.000000 (best of 169)
2014-10-11 01:21:47 -07:00
Pierre-Yves David
4de6496309 revset-node: speedup by a few hundred fold
Instead of checking all elements of the subset against a single rev, just check
if this rev is in the subset. The old way was inherited from when the subset was
a list.

Unsurprisingly, this provides a massive speedup.


id("b7dc31e4baa4")
before) wall 0.008205 comb 0.000000 user 0.000000 sys 0.000000 (best of 302)
after)  wall 0.000069 comb 0.000000 user 0.000000 sys 0.000000 (best of 34518)

revset #1: public() and id("b7dc31e4baa4")
before) wall 0.019763 comb 0.020000 user 0.020000 sys 0.000000 (best of 124)
after)  wall 0.000101 comb 0.000000 user 0.000000 sys 0.000000 (best of 20130)
2014-10-11 01:39:20 -07:00
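
The change in 4de6496309 can be pictured as replacing a scan of the subset with a single membership test. A minimal sketch, using Python's `range` as a stand-in for a subset with fast containment; `by_filter` and `by_membership` are hypothetical helpers:

```python
def by_filter(subset, rev):
    # Old approach: compare every element of the subset against one rev.
    return [r for r in subset if r == rev]


def by_membership(subset, rev):
    # New approach: ask the subset directly whether it contains the rev.
    return [rev] if rev in subset else []
```

Both return the same result, but the first walks the whole subset while the second relies on the container's own `__contains__`.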
Pierre-Yves David
7268257aae revset-only: use subset & instead of filtering
The & version is more likely to be optimised.

only(.)
before) wall 0.003216 comb 0.000000 user 0.000000 sys 0.000000 (best of 768)
after)  wall 0.001086 comb 0.000000 user 0.000000 sys 0.000000 (best of 2231)

only(default, stable)
before) wall 0.018469 comb 0.020000 user 0.020000 sys 0.000000 (best of 138)
after)  wall 0.015888 comb 0.010000 user 0.010000 sys 0.000000 (best of 156)
2014-10-10 17:28:18 -07:00
Pierre-Yves David
4e015a4853 revset-_ancestor: use & instead of filter
The & operation is more likely to be optimised.

::10
before) wall 0.028189 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
after)  wall 0.001050 comb 0.000000 user 0.000000 sys 0.000000 (best of 2326)

::tip
before) wall 0.081132 comb 0.080000 user 0.080000 sys 0.000000 (best of 100)
after)  wall 0.055418 comb 0.050000 user 0.050000 sys 0.000000 (best of 100)
2014-09-30 15:03:54 -05:00
Pierre-Yves David
98eb7704b3 revset-only: use __nonzero__ to check if a revset is empty
For some smartsets, computing length is more expensive than checking if the set
is empty.
2014-10-08 02:45:21 -07:00
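
Why an emptiness check can be cheaper than a length is easiest to see on a lazily filtered set. The sketch below is an assumption-laden toy (`LazyFiltered` is not a Mercurial class) written for Python 3, where `__bool__` plays the role of `__nonzero__`:

```python
class LazyFiltered:
    def __init__(self, revs, condition):
        self._revs = revs
        self._condition = condition
        self.evaluated = 0  # how many revisions the condition was run on

    def __iter__(self):
        for rev in self._revs:
            self.evaluated += 1
            if self._condition(rev):
                yield rev

    def __len__(self):
        # Must run the condition on every revision.
        return sum(1 for _ in self)

    def __bool__(self):
        # Stops at the first revision that passes the condition.
        for _ in self:
            return True
        return False
```

On `LazyFiltered(range(1000), lambda r: r % 2 == 0)`, `bool()` evaluates the condition once, whereas `len()` evaluates it 1000 times.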
Pierre-Yves David
4ec24e2ba6 _spanset: drop __getitem__ implementation
It is expensive and not part of the official smartset API.
2014-10-15 12:38:47 -07:00
Pierre-Yves David
b634e6fb8f filteredset: drop __getitem__ implementation
It is expensive and not part of the official smartset API.
2014-10-15 12:38:32 -07:00
Pierre-Yves David
8f9f017c39 generatorset: implement __len__
It was the only smartset class without a `__len__` implementation.
2014-10-15 04:28:55 -07:00
Pierre-Yves David
d4bf12d496 revset: make __len__ part of the official API
It is common for code to ask for the length of a revset. In fact, all but
generatorset already implement it.
2014-10-15 04:26:23 -07:00
Mads Kiilerich
4353d6acbb revset: better naming of variables containing the value of a single argument
Calling them args is not helpful.
2014-10-15 04:08:06 +02:00
Pierre-Yves David
551481efc6 spanset: remove .set() definition
All my friends are dead.
2014-10-10 13:09:22 -07:00
Pierre-Yves David
cdaf453077 generatorset: remove .set() definition
All my friends are dead.
2014-10-10 13:08:49 -07:00
Pierre-Yves David
9531e16b0d addset: remove .set() definition
All my friends are dead.
2014-10-10 13:08:28 -07:00
Pierre-Yves David
f957be2403 filteredset: remove .set() definition
All my friends are dead.
2014-10-10 13:08:10 -07:00
Pierre-Yves David
fcdeb29add baseset: remove set() definition
All my friends are dead.
2014-10-10 13:07:35 -07:00
Pierre-Yves David
a9fcdb25c5 abstractsmartset: remove set() method definition
Now that all usages have been removed, we can drop this not so useful part of
the API. We can note that the name was wrong all along...
2014-10-10 11:27:57 -07:00
Pierre-Yves David
70851c278d match: check if an object is a baseset using isascending instead of set
The `set()` method is going away.
2014-10-10 14:27:05 -07:00
Pierre-Yves David
0de25934dc getset: check if an object is a baseset using isascending instead of set
The `set()` method is going away.
2014-10-10 14:22:23 -07:00
Pierre-Yves David
c249a728eb fullreposet: detect smartset using "isascending" instead of "set"
The `.set()` function is going away.
2014-10-10 13:24:57 -07:00
Pierre-Yves David
be86e2f6f1 fullreposet: drop custom sets but not smartsets detection
All custom classes used by revsets are smartsets now. We drop the special-casing.
2014-10-10 13:21:05 -07:00
Pierre-Yves David
e0b5b0a127 addset: drop .set() usage during iteration
We can use the containment check directly.
2014-10-10 12:30:00 -07:00
Pierre-Yves David
efff35ee9d baseset: access _set directly for containment check
The `.set()` method is going away.
2014-10-10 12:31:22 -07:00
Pierre-Yves David
5049b858d9 baseset: make _set a property cache
This will remove the need for `baseset.set()`.
2014-10-10 12:30:56 -07:00
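
A property cache of the kind 5049b858d9 describes can be sketched with the standard library's `functools.cached_property` (Python 3.8+). Mercurial has its own helper for this; the `ListSet` class below is only an illustration of the idea, not the baseset code.

```python
from functools import cached_property


class ListSet:
    def __init__(self, revs):
        self._list = list(revs)

    @cached_property
    def _set(self):
        # Built on first access, then stored on the instance.
        return set(self._list)

    def __contains__(self, rev):
        return rev in self._set

    def __iter__(self):
        return iter(self._list)
```

The set is only materialised when a containment check actually needs it, and callers no longer need a separate `set()` method to get fast lookups.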
Pierre-Yves David
a54d940320 revset-_hexlist: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:52:10 -07:00
Pierre-Yves David
76a0476bf7 revset-_intlist: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:51:54 -07:00
Pierre-Yves David
3094e008ed revset-_list: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:51:16 -07:00
Pierre-Yves David
2c5bccb146 revset-roots: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:50:20 -07:00
Pierre-Yves David
b1e5f6cb89 revset-origin: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:49:17 -07:00
Pierre-Yves David
29984785df revset-last: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:48:56 -07:00
Pierre-Yves David
113095a6b7 revset-limit: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:48:24 -07:00
Pierre-Yves David
6847074b2d revset-destination: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:46 -07:00
Pierre-Yves David
1ecbe47993 revset-children: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:24 -07:00
Pierre-Yves David
4186e2d344 revset-branch: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:00 -07:00
Pierre-Yves David
afe4b27987 revset-rangeset: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:45:53 -07:00
Pierre-Yves David
ca06344dab revset-only: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:45:43 -07:00
Pierre-Yves David
4e9488a8f8 revset: cache most conditions used in filter
Except when stated otherwise, the condition used in `smartset.filter` will be
cached. A new argument has been introduced to disable that behavior. We use it
for filters created from `and` and `sub` operations.

This gives massive performance boosts for revsets with expensive conditions.

revset: branch(stable) or branch(default)
before) wall 4.329070 comb 4.320000 user 4.310000 sys 0.010000 (best of 3)
after)  wall 2.356451 comb 2.360000 user 2.330000 sys 0.030000 (best of 4)

revset: author(mpm) or author(lmoscovicz)
before) wall 4.434719 comb 4.440000 user 4.440000 sys 0.000000 (best of 3)
after)  wall 2.321720 comb 2.320000 user 2.320000 sys 0.000000 (best of 4)
2014-10-09 22:57:52 -07:00
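
The caching behaviour introduced in 4e9488a8f8 can be approximated as follows. This is a sketch under assumptions: `cachefunc` and `FilteredSketch` are hypothetical names, and the `cache` keyword mirrors the "new argument" mentioned in the message without claiming to match its real signature.

```python
def cachefunc(func):
    """Memoize a one-argument condition."""
    results = {}

    def wrapper(rev):
        if rev not in results:
            results[rev] = func(rev)
        return results[rev]

    return wrapper


class FilteredSketch:
    def __init__(self, revs, condition, cache=True):
        self._revs = revs
        # Cheap conditions (e.g. from `and` / `sub`) can opt out of caching.
        self._condition = cachefunc(condition) if cache else condition

    def __contains__(self, rev):
        return rev in self._revs and self._condition(rev)

    def __iter__(self):
        return (r for r in self._revs if self._condition(r))
```

With caching enabled, iterating the same filtered set twice runs an expensive condition once per revision; with `cache=False` it runs on every pass.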
Pierre-Yves David
372cc7c36d baseset: empty or one-element sets are ascending and descending
The empty set is full of interesting properties. In the ordering case, the one
element set is too.
2014-10-09 04:12:20 -07:00
Pierre-Yves David
090fe27a36 filteredset: drop explicit order management
Now that all low-level smartset classes have proper ordering and fast iteration
management, we can just rely on the subset in filteredset.
2014-10-07 01:33:05 -07:00
Pierre-Yves David
d521f34fda revset: restore order of or operation as in Mercurial 2.9
Lazy revsets broke the ordering of the `or` revset. We now stop assuming that
two ascending revsets combine into an ascending one.

Behavior in 3.0:

  3:4 or 2:5 == [2, 3, 4, 5]

Behavior in 2.9:

  3:4 or 2:5 == [3, 4, 2, 5]

We are adding a test for it.

For unclear reasons, the performance of `or` revsets with expensive filters is
getting even worse than it used to be. This is probably caused by extra
uncached containment checks or iteration.

revset #9: author(lmoscovicz) or author(mpm)
before) wall 3.487583 comb 3.490000 user 3.490000 sys 0.000000 (best of 3)
after)  wall 4.481486 comb 4.480000 user 4.470000 sys 0.010000 (best of 3)


revset #10: author(mpm) or author(lmoscovicz)
before) wall 3.164839 comb 3.170000 user 3.160000 sys 0.010000 (best of 3)
after)  wall 4.574965 comb 4.570000 user 4.570000 sys 0.000000 (best of 3)
2014-10-09 04:24:51 -07:00
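
The ordering restored by d521f34fda, where `3:4 or 2:5` yields `[3, 4, 2, 5]`, corresponds to a union that keeps first-seen order rather than sorting. A minimal sketch (the function name `ordered_union` is hypothetical):

```python
from itertools import chain


def ordered_union(first, second):
    """Union that yields `first` in order, then unseen items of `second`."""
    seen = set()
    result = []
    for rev in chain(first, second):
        if rev not in seen:
            seen.add(rev)
            result.append(rev)
    return result
```

`ordered_union([3, 4], [2, 3, 4, 5])` gives `[3, 4, 2, 5]` (the 2.9 behaviour), whereas sorting the same union gives `[2, 3, 4, 5]` (the 3.0 behaviour).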
Pierre-Yves David
2213bfcae4 revset-_descendant: rework the whole sorting and combining logic
We use the & operator to combine with subset (since this is more likely to be
optimised than filter) and we enforce the sorting of the result. Without this
enforced sorting, we may end up with a different iteration order than the set
_descendant was computed from.

This reverts a bad `test-glog.t` change from 7904906883bd.

Another side effect is that `test-mq.t` shows `qparent::` including `-1` if
`qparent is -1`. This sounds like a positive change.

This has good and bad impacts on the benchmarks. Here are some good ones:

revset: 0::
before) wall 0.045489 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
after)  wall 0.034330 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)

revset: roots((0::) - (0::tip))
before) wall 0.134090 comb 0.140000 user 0.140000 sys 0.000000 (best of 63)
after)  wall 0.128346 comb 0.130000 user 0.130000 sys 0.000000 (best of 69)

revset: ::p1(p1(tip))::
before) wall 0.143892 comb 0.140000 user 0.140000 sys 0.000000 (best of 55)
after)  wall 0.124502 comb 0.130000 user 0.130000 sys 0.000000 (best of 65)

revset: roots((0:tip)::)
before) wall 0.204966 comb 0.200000 user 0.200000 sys 0.000000 (best of 43)
after)  wall 0.184455 comb 0.180000 user 0.180000 sys 0.000000 (best of 47)

Here is a bad one:

revset: (20000::) - (20000)
before) wall 0.009592 comb 0.010000 user 0.010000 sys 0.000000 (best of 222)
after)  wall 0.029837 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
2014-10-09 09:12:54 -07:00
Pierre-Yves David
b33f0a62a0 addset: do lazy sorting
The previous implementation was consuming the whole revset when asked for any
sort. The addset class now does lazy sorting like all other smartset classes.

This has no significant impact on the benchmarks as-is, but it is important
for later changes.
2014-10-09 20:15:41 -07:00
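
Lazy sorting of a union, as described in b33f0a62a0, can be sketched with `heapq.merge`, assuming both operands already iterate in ascending order. This is a swapped-in standard-library technique for illustration, not the addset implementation, and `lazy_sorted_union` is a hypothetical name.

```python
import heapq
from itertools import count, islice


def lazy_sorted_union(r1, r2):
    """Yield the ascending union of two ascending iterables, on demand."""
    last = None
    for rev in heapq.merge(r1, r2):
        if rev != last:  # skip revisions present in both operands
            yield rev
            last = rev


# Works even on unbounded inputs, since nothing is consumed eagerly.
first_five = list(islice(lazy_sorted_union(count(0, 2), count(1, 2)), 5))
```

Because the merge pulls from each operand only as needed, asking for the first few revisions does not consume the whole revset, which is the property the commit message is after.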