Commit Graph

712 Commits

Author SHA1 Message Date
Pierre-Yves David
70851c278d match: check if an object is a baseset using isascending instead of set
The `set()` method is going away.
2014-10-10 14:27:05 -07:00
Pierre-Yves David
0de25934dc getset: check if an object is a baseset using isascending instead of set
The `set()` method is going away.
2014-10-10 14:22:23 -07:00
Pierre-Yves David
c249a728eb fullreposet: detect smartset using "isascending" instead of "set"
The `.set()` function is going away.
2014-10-10 13:24:57 -07:00
Pierre-Yves David
be86e2f6f1 fullreposet: drop custom sets but not smartsets detection
All custom classes used by revsets are smartsets now. We drop the special-casing.
2014-10-10 13:21:05 -07:00
Pierre-Yves David
e0b5b0a127 addset: drop .set() usage during iteration
We can use the containment check directly.
2014-10-10 12:30:00 -07:00
Pierre-Yves David
efff35ee9d baseset: access _set directly for containment check
The `.set()` method is going away.
2014-10-10 12:31:22 -07:00
Pierre-Yves David
5049b858d9 baseset: make _set a property cache
This will remove the need for `baseset.set()`.
2014-10-10 12:30:56 -07:00
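
A minimal sketch of what "making `_set` a property cache" could look like. The `propertycache` descriptor and the toy `baseset` below are illustrative assumptions, not the actual Mercurial code; the `builds` counter exists only to show that the set is built once.

```python
class propertycache(object):
    """Hypothetical descriptor: compute the value once, then keep it on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Storing the value in the instance dict shadows this non-data
        # descriptor, so later accesses are plain attribute lookups.
        obj.__dict__[self.name] = value
        return value


class baseset(object):
    """Toy stand-in: a list of revisions plus a lazily built set."""

    def __init__(self, data=()):
        self._list = list(data)
        self.builds = 0  # illustration only: counts how often the set is built

    @propertycache
    def _set(self):
        self.builds += 1
        return set(self._list)

    def __contains__(self, rev):
        # Containment goes straight to the cached set; no set() method needed.
        return rev in self._set
```

With this shape, callers never need an explicit `set()` method: the first containment check pays for building the set and later checks reuse it.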
Pierre-Yves David
a54d940320 revset-_hexlist: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:52:10 -07:00
Pierre-Yves David
76a0476bf7 revset-_intlist: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:51:54 -07:00
Pierre-Yves David
3094e008ed revset-_list: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:51:16 -07:00
Pierre-Yves David
2c5bccb146 revset-roots: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:50:20 -07:00
Pierre-Yves David
b1e5f6cb89 revset-origin: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:49:17 -07:00
Pierre-Yves David
29984785df revset-last: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:48:56 -07:00
Pierre-Yves David
113095a6b7 revset-limit: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:48:24 -07:00
Pierre-Yves David
6847074b2d revset-destination: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:46 -07:00
Pierre-Yves David
1ecbe47993 revset-children: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:24 -07:00
Pierre-Yves David
4186e2d344 revset-branch: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:47:00 -07:00
Pierre-Yves David
afe4b27987 revset-rangeset: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:45:53 -07:00
Pierre-Yves David
ca06344dab revset-only: remove usage of set()
All smartset classes have fast lookup, so this function will be removed soon.
2014-10-08 02:45:43 -07:00
Pierre-Yves David
4e9488a8f8 revset: cache most conditions used in filter
Except when stated otherwise, the condition used in `smartset.filter` will be
cached. A new argument has been introduced to disable that behavior. We use it
for filters created from `and` and `sub` operations.

This gives massive performance boosts for revsets with expensive conditions.

revset: branch(stable) or branch(default)
before) wall 4.329070 comb 4.320000 user 4.310000 sys 0.010000 (best of 3)
after)  wall 2.356451 comb 2.360000 user 2.330000 sys 0.030000 (best of 4)

revset: author(mpm) or author(lmoscovicz)
before) wall 4.434719 comb 4.440000 user 4.440000 sys 0.000000 (best of 3)
after)  wall 2.321720 comb 2.320000 user 2.320000 sys 0.000000 (best of 4)
2014-10-09 22:57:52 -07:00
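
The caching described in the commit above can be sketched as follows. This is a toy `filteredset` with a hypothetical `cache` argument, written under the assumption that the cache simply memoizes the condition per revision; the real class and argument names may differ.

```python
class filteredset(object):
    """Toy filter over a subset; names are illustrative, not Mercurial's API."""

    def __init__(self, subset, condition, cache=True):
        self._subset = subset
        self._condition = condition
        self._usecache = cache
        self._cache = {}  # rev -> bool, filled on demand

    def __contains__(self, rev):
        if rev not in self._subset:
            return False
        if not self._usecache:
            return bool(self._condition(rev))
        if rev not in self._cache:
            # The expensive condition runs at most once per revision.
            self._cache[rev] = bool(self._condition(rev))
        return self._cache[rev]

    def __iter__(self):
        for rev in self._subset:
            if rev in self:
                yield rev
```

Iterating twice over a cached filter calls the condition once per revision, while `cache=False` re-evaluates it every time, which is only reasonable when the condition is itself a cheap containment check.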
Pierre-Yves David
372cc7c36d baseset: empty or one-element sets are ascending and descending
The empty set is full of interesting properties. In the ordering case, the one
element set is too.
2014-10-09 04:12:20 -07:00
Pierre-Yves David
090fe27a36 filteredset: drop explicit order management
Now that all low-level smartset classes have proper ordering and fast iteration
management, we can just rely on the subset in filteredset.
2014-10-07 01:33:05 -07:00
Pierre-Yves David
d521f34fda revset: restore order of or operation as in Mercurial 2.9
Lazy revset broke the ordering of the `or` revset. We now stop assuming that
two ascending revsets are combined into an ascending one.

Behavior in 3.0:

  3:4 or 2:5 == [2, 3, 4, 5]

Behavior in 2.9:

  3:4 or 2:5 == [3, 4, 2, 5]

We are adding a test for it.

For unclear reasons, the performance of `or` revsets with expensive filters is
getting even worse than it used to be. This is probably caused by extra
uncached containment checks or iteration.

revset #9: author(lmoscovicz) or author(mpm)
before) wall 3.487583 comb 3.490000 user 3.490000 sys 0.000000 (best of 3)
after)  wall 4.481486 comb 4.480000 user 4.470000 sys 0.010000 (best of 3)


revset #10: author(mpm) or author(lmoscovicz)
before) wall 3.164839 comb 3.170000 user 3.160000 sys 0.010000 (best of 3)
after)  wall 4.574965 comb 4.570000 user 4.570000 sys 0.000000 (best of 3)
2014-10-09 04:24:51 -07:00
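
To make the two orderings concrete, here is a small sketch of an order-preserving union. The helper name `ordered_union` is hypothetical; it only reproduces the 2.9 result quoted in the commit message, not the addset implementation.

```python
def ordered_union(first, second):
    """Yield-order union: everything from `first`, then the unseen part of `second`."""
    seen = set()
    result = []
    for rev in list(first) + list(second):
        if rev not in seen:
            seen.add(rev)
            result.append(rev)
    return result


# `3:4 or 2:5` with inclusive ranges, as in the commit message.
legacy_order = ordered_union(range(3, 5), range(2, 6))  # 2.9 behavior
sorted_order = sorted(legacy_order)                     # 3.0 behavior
```

Here `legacy_order` is `[3, 4, 2, 5]` and `sorted_order` is `[2, 3, 4, 5]`, matching the two behaviors listed above.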
Pierre-Yves David
2213bfcae4 revset-_descendant: rework the whole sorting and combining logic
We use the & operator to combine with subset (since this is more likely to be
optimised than filter) and we enforce the sorting of the result. Without this
enforced sorting, we may end up with a different iteration order than that of
the set _descendent was computed from.

This reverts a bad `test-glog.t` change from 7904906883bd.

Another side effect is that `test-mq.t` shows `qparent::` including `-1` if
`qparent is -1`. This sounds like a positive change.

This has good and bad impacts on the benchmarks. Here are some good ones:

revset: 0::
before) wall 0.045489 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
after)  wall 0.034330 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)

revset: roots((0::) - (0::tip))
before) wall 0.134090 comb 0.140000 user 0.140000 sys 0.000000 (best of 63)
after)  wall 0.128346 comb 0.130000 user 0.130000 sys 0.000000 (best of 69)

revset: ::p1(p1(tip))::
before) wall 0.143892 comb 0.140000 user 0.140000 sys 0.000000 (best of 55)
after)  wall 0.124502 comb 0.130000 user 0.130000 sys 0.000000 (best of 65)

revset: roots((0:tip)::)
before) wall 0.204966 comb 0.200000 user 0.200000 sys 0.000000 (best of 43)
after)  wall 0.184455 comb 0.180000 user 0.180000 sys 0.000000 (best of 47)

Here is a bad one:

revset: (20000::) - (20000)
before) wall 0.009592 comb 0.010000 user 0.010000 sys 0.000000 (best of 222)
after)  wall 0.029837 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
2014-10-09 09:12:54 -07:00
Pierre-Yves David
b33f0a62a0 addset: do lazy sorting
The previous implementation was consuming the whole revset when asked for any
sort. The addset class is now doing lazy sorting like all other smartset classes.

This has no significant impact on the benchmarks as-is, but it is important
for later changes.
2014-10-09 20:15:41 -07:00
Pierre-Yves David
cdcb3820d1 baseset: drop custom __sub__ method
This custom method is enforcing non-laziness, disabling multiple optimisations.

Benchmarks do not spot any significant difference but real use cases may. This
will also be important for further improvements to addset later in this series.
2014-10-09 04:29:18 -07:00
Pierre-Yves David
32d4f6ce1e baseset: drop custom __and__ method
This custom method is enforcing non-laziness, disabling multiple optimisations.

Benchmarks do not spot any significant regression but real use cases may. This
even gives some speedup in some cases:

revset #15: min(0::)
before) wall 0.001247 comb 0.000000 user 0.000000 sys 0.000000 (best of 1814)
after)  wall 0.000942 comb 0.000000 user 0.000000 sys 0.000000 (best of 2367)

This will also be important for further improvement to addset later in this series.
2014-10-09 04:27:25 -07:00
Pierre-Yves David
35f0c6215a baseset: drop custom __add__ method
This add method is enforcing non-laziness, disabling multiple optimisations.

Benchmarks do not spot any significant differences but real use cases may. This
will also be important for further improvements to addset later in this series.
2014-10-09 04:27:01 -07:00
Pierre-Yves David
89697960fb smartset: drop infamous ascending, descending
All your friends are dead.
2014-10-07 01:46:53 -07:00
Pierre-Yves David
1f08d3a119 fullreposet: use isascending instead of ascending to recognise smartsets
`ascending` is going to be removed.
2014-10-07 01:41:14 -07:00
Pierre-Yves David
6da3ca17ab fullreposet: use sort to enforce the order
The `ascending` and `descending` methods are useless.
2014-10-07 01:41:26 -07:00
Pierre-Yves David
40789b5325 revancestors: replace descending with sort(reverse=False) 2014-10-07 01:48:34 -07:00
Pierre-Yves David
9ae79aaf3a _descendants: replace ascending() with sort() 2014-10-07 01:41:02 -07:00
Pierre-Yves David
59a9933f96 _descendants: directly use smartset
As `addset` objects are proper smartset objects, we do not need to make any
transformation of the result.
2014-10-07 01:36:53 -07:00
Pierre-Yves David
eb55591aca baseset: explicitly track order of the baseset
A baseset starts without an explicit order. But as soon as a sort is requested,
we simply register that the baseset has an order and use the ordered version of
the list to behave accordingly.

We will want to properly record the order at creation time in the future. This
would unlock more optimisation and avoid some sorting.
2014-10-03 03:29:55 -05:00
Pierre-Yves David
cf7077b249 baseset: fix isascending and isdescending
We now have sufficient information to return the proper value there.
2014-10-03 03:31:05 -05:00
Pierre-Yves David
ff023f566d baseset: prepare lazy ordering in __iter__
We'll explicitly track the order of the baseset to take advantage of the
ascending and descending lists during iteration.
2014-10-03 03:26:18 -05:00
Pierre-Yves David
d2f9fa68fe baseset: implement a fastasc and fastdesc
Baseset contains already-computed revisions. It is considered "cheap" to do
operations on an already-computed set. So we add attributes to hold versions of
the list in ascending and descending order and use them for `fastasc` and
`fastdesc`. Having distinct lists is important to provide correct iteration in
all cases. Altering a python list will impact an iterator connected to it.

eg: not preserving order at iterator creation time

    >>> l = [0, 1]
    >>> i = iter(l)
    >>> l.reverse()
    >>> list(i)
    [1, 0]

eg: corrupting in progress iteration

    >>> l = [0, 1]
    >>> i = iter(l)
    >>> i.next()
    0
    >>> l.reverse()
    >>> i.next()
    0
2014-10-03 03:19:23 -05:00
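
The point about distinct lists can be sketched like this. The class name `orderedrevs` and its attributes are assumptions made for illustration; the idea shown is only that each direction gets its own list, so creating one iterator never disturbs another.

```python
class orderedrevs(object):
    """Toy container keeping one list per iteration direction."""

    def __init__(self, revs):
        self._asclist = sorted(revs)
        # A separate reversed copy: reversing in place would affect
        # any iterator already connected to the ascending list.
        self._desclist = self._asclist[::-1]

    def fastasc(self):
        return iter(self._asclist)

    def fastdesc(self):
        return iter(self._desclist)


revs = orderedrevs([1, 0, 2])
asc = revs.fastasc()
first = next(asc)                 # start consuming the ascending iterator
desc = list(revs.fastdesc())      # asking for the other order is harmless
rest = list(asc)                  # the ascending iteration continues intact
```

The cost of this choice is a second list in memory, which the commit message treats as acceptable because the revisions are already computed.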
Pierre-Yves David
b09ad7ecb4 baseset: stop inheriting from built-in list class
The baseset is doing more and more smartset magic and using its list-like
property less and less. So we store the list of revisions in an explicit
attribute and stop inheriting.

This requires reimplementing some basic methods.
2014-10-06 11:03:30 -07:00
Pierre-Yves David
734133be26 rangeset: use first and last instead of direct indexing
This makes it compatible with all smartset classes.
2014-10-06 23:45:07 -07:00
Pierre-Yves David
38a691e7c7 filteredset: implement first and last 2014-10-07 00:18:08 -07:00
Pierre-Yves David
51a5e9775c baseset: implement first and last methods 2014-10-06 14:42:00 -07:00
Pierre-Yves David
98513b6f74 generatorset: implement first and last methods 2014-10-06 12:52:36 -07:00
Pierre-Yves David
0a7da549e2 addset: implement first and last methods
The implementation is non-lazy for now. One may want to make it more lazy in the
future.
2014-10-06 11:57:59 -07:00
Pierre-Yves David
0567fe01bf spanset: implement first and last methods 2014-10-06 11:54:53 -07:00
Pierre-Yves David
798e6b00b0 smartset: add first and last methods
In multiple places in the code, we use `someset[0]` or `someset[-1]`. This
works only because `someset` is usually a baseset. For this reason we
introduce `first` and `last` methods to be implemented by all smartset
classes.
2014-10-06 11:46:53 -07:00
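
A rough sketch of why `first` and `last` are friendlier than indexing for lazy sets. The classes below are hypothetical, and returning `None` for an empty set is an assumption made for this example.

```python
class abstractset(object):
    """Toy base class: first/last only require iteration, not indexing."""

    def first(self):
        for rev in self:
            return rev
        return None  # assumption: empty sets yield None

    def last(self):
        result = None
        for rev in self:
            result = rev
        return result


class lazyrevs(abstractset):
    """Generator-backed set: it cannot support someset[0] or someset[-1]."""

    def __init__(self, genfunc):
        self._genfunc = genfunc

    def __iter__(self):
        return self._genfunc()


evens = lazyrevs(lambda: (r for r in range(10) if r % 2 == 0))
```

`evens.first()` stops after producing a single revision, whereas `evens.last()` in this naive form walks the whole generator; a class with a fast descending iterator could do better.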
Pierre-Yves David
a814455e99 revset-last: remove user of baseset.append
A `baseset` has multiple cached results and will get even more in the future.
Making it an object "populated once" like the other smartsets makes it both safer
and simpler. The append method will be removed at some point.
2014-10-08 00:55:09 -07:00
Pierre-Yves David
e7663a53cd revset-limit: remove user of baseset.append
A `baseset` has multiple cached results and will get even more in the future.
Making it an object "populated once" like the other smartsets makes it both safer
and simpler. The append method will be removed at some point.
2014-10-06 10:57:01 -07:00
Pierre-Yves David
9f5274f4bc baseset: use default value instead of [] when possible
For pure cleanup purposes, we replace all the occurrences of `baseset([])` with
`baseset()`.
2014-10-06 10:41:43 -07:00
Pierre-Yves David
a046529f73 generatorset: implement isascending and isdescending 2014-10-04 06:17:18 -07:00
Pierre-Yves David
d4b459bd9f generatorset: explicitly track iteration order
The expected iteration order may be different than the fast iteration order (eg:
ancestors(42) is expected to be iterated upward but is fast/lazy to compute
downward).

So we explicitly track the iteration order and enforce it if the manual
iteration is requested.

The default expected iteration order of a generatorset is ascending because I'm
not aware of any descending revset that needs a generatorset. The first to find
such a descending revset will have the pleasure of making this configurable.
2014-10-03 21:11:56 -07:00
Pierre-Yves David
a82beab10d addset: drop caching through generatorset
The utility of this cache is debatable (no visible benchmark impact) and using
generatorset for such purpose makes the code complicated.

We drop it for now. Someone can reintroduce a smart version of it in the future
if it is detected to be relevant.
2014-10-03 20:23:02 -07:00
Pierre-Yves David
e6831448a3 generatorset: get list-based fast iterations after the generator is consumed
When all revisions are known, we shortcut most of the class logic to use list
iteration instead. The cost of the sort is expected to be non-significant. The
list creation and sorting could be done lazily in the future. We have to copy
the list to not break existing iterator created before we finished consuming the
generator.
2014-10-03 21:01:30 -07:00
Pierre-Yves David
1d2cf353fc generatorset: move iteration code into _iterator
_iterator handles the generator iteration. The `__iter__` method will need
changes to handle ordering-related information.
2014-10-03 20:48:28 -07:00
Pierre-Yves David
a0f1c697e2 generatorset: stop using a base as the _genlist
It does not add anything and makes it more complicated to have a simple baseset
implementation.
2014-10-03 20:43:48 -07:00
Pierre-Yves David
9d3c052ee3 generatorset: drop the leading underscore in the class name
This is a real smart set now.
2014-10-03 20:12:02 -07:00
Pierre-Yves David
dc2b8470bf generatorset: update the docstring now that it is a smartset
The documentation was still stating that this class was not a smartset. We drop
that part.
2014-10-03 20:14:43 -07:00
Pierre-Yves David
ccc0b916ad addset: drop the leading underscore from the class name
This class is now a real smartset.
2014-10-03 20:18:48 -07:00
Pierre-Yves David
e434af74be addset: this is a smartset, update the docstring
The documentation was still stating that this class is not a smartset. We drop
that part.
2014-10-03 20:17:12 -07:00
Pierre-Yves David
d1e22facbe addset: use the ascending argument in _iterordered
Fix a bug where fastasc and fastdesc were both iterating in the same order as
self._ascending.
2014-10-09 05:27:23 -07:00
Pierre-Yves David
cc531eaf7c revset: remove the now unused _descgeneratorset class 2014-10-03 12:54:56 -05:00
Pierre-Yves David
5660381e46 revset: use _generatorset in _revancestors
The _descgeneratorset class is going away.
2014-10-03 12:53:41 -05:00
Pierre-Yves David
0064df5af6 revset: remove now unused class _ascgeneratorset 2014-10-03 12:52:49 -05:00
Pierre-Yves David
2c0f15affd revset: use _generatorset directly in _revdescendant
_ascgeneratorset is going away.
2014-10-03 12:52:17 -05:00
Pierre-Yves David
d8ee591ede generatorset: move membership testing on ordered gen to the main class
We are phasing out the ordered version of the class to simplify the code.
2014-10-03 12:46:34 -05:00
Pierre-Yves David
f6fa8eb009 generatorset: make use of the new mechanism in the subclass
Until we remove them, we use the new parameter of _generatorset to make sure
the code is run.
2014-10-03 12:36:57 -05:00
Pierre-Yves David
8182bcd552 generatorset: make it possible to use gen as fastasc or fastdesc
We gain a parameter to inform that the generator is ascending or descending. If
the generator is ordered, it is also used for the `fastasc` or `fastdesc`
version.

The _ascgeneratorset and _descgeneratorset class will be removed soon.
2014-10-03 12:36:08 -05:00
Pierre-Yves David
360df469a0 baseset: rely on the abstractsmartset implementation for filter 2014-10-03 03:19:00 -05:00
Pierre-Yves David
1173000a7c _orderedsetmixin: drop this now unused class
All my friends are dead.
2014-10-02 19:48:14 -05:00
Pierre-Yves David
214e70e3ed spanset: drop _orderedsetmixin inheritance
The min/max method are as well provided by abstractsmartset.
2014-10-02 19:47:33 -05:00
Pierre-Yves David
5c0b91dc51 orderedlazyset: drop this now unused class
All my friends are dead.
2014-10-03 01:44:52 -05:00
Pierre-Yves David
2d5a7f7706 _descendant: use filteredset instead of orderedlazyset
The orderedlazyset class is going away. Filteredset gives the same service.
2014-10-02 19:43:42 -05:00
Pierre-Yves David
5081443516 addset: use the base implementation for ascending and descending 2014-10-03 01:37:13 -05:00
Pierre-Yves David
c77388089d addset: use base implementation for __filter__ 2014-10-03 01:34:25 -05:00
Pierre-Yves David
02ce29364d addset: use base implementation for __add__ 2014-10-03 01:33:32 -05:00
Pierre-Yves David
355c9d986e addset: use base implementation for __sub__ 2014-10-03 01:32:50 -05:00
Pierre-Yves David
a24bd6fb5b addset: use base implementation for __and__ 2014-10-03 01:31:46 -05:00
Pierre-Yves David
c65b8b42bd addset: promote to real smartset
Better revset performance is also achieved with less overlay. There is no good
reason for addset to not be a smartset. We can replace the `_orderedsetmixin`
inheritance since `abstractsmartset` has efficient min and max too.
2014-10-02 19:42:06 -05:00
Pierre-Yves David
7a25a7121b addset: add a __nonzero__ method
This is required to be a full smartset (not sure what was happening before
that...)
2014-10-03 00:12:22 -05:00
Pierre-Yves David
6a1c6ffa59 addset: offer a fastasc and fastdesc methods
If the underlying object offers fast iterators, we use them to provide fast
iterators too.
2014-10-02 23:38:30 -05:00
Pierre-Yves David
89b6f70699 addset: split simple and ordered iteration
We have two goals here. First, we would like to restore the former iteration
order we had in 2.9. Second, we want this logic to be reusable for `fastasc`
and `fastdesc` methods.
2014-10-02 23:28:18 -05:00
Pierre-Yves David
5559000069 generatorset: promote to smartset
This is not going to be efficient but we need all basic set classes to be smartsets
for the other classes to work.
2014-10-03 01:55:09 -05:00
Pierre-Yves David
58b382b0f7 generatorset: implement __nonzero__
This is necessary to become a real smartset.
2014-10-03 01:56:57 -05:00
Pierre-Yves David
a14781af28 spanset: use base implementation for __add__ 2014-10-03 00:31:33 -05:00
Pierre-Yves David
215016c505 spanset: use base implementation for __sub__ 2014-10-03 00:31:18 -05:00
Pierre-Yves David
8f595a844a spanset: use base implementation for __and__ 2014-10-03 00:30:58 -05:00
Pierre-Yves David
12baf0e606 spanset: use base implementation for filter 2014-10-03 00:39:57 -05:00
Pierre-Yves David
5d23f77ec3 filteredset: use base implementation for filter 2014-10-03 01:27:00 -05:00
Pierre-Yves David
1be20553d2 filteredset: use base implementation for __add__ 2014-10-03 01:25:35 -05:00
Pierre-Yves David
23bcf240b5 filteredset: use base implementation for __sub__ 2014-10-03 01:24:30 -05:00
Pierre-Yves David
24ee9a4abf filteredset: use base implementation for __and__ 2014-10-03 01:23:12 -05:00
Pierre-Yves David
13924bc45b abstractsmartset: add default implementation for __sub__ 2014-10-02 19:22:17 -05:00
Pierre-Yves David
67a9c485c6 abstractsmartset: add default implementation for __add__ 2014-10-02 19:22:03 -05:00
Pierre-Yves David
47e527a95f abstractsmartset: add default implementation for __and__ 2014-10-02 19:21:40 -05:00
Pierre-Yves David
8a3b420ade abstractsmartset: add default implementation for filter 2014-10-01 00:26:50 -05:00
Pierre-Yves David
133cc5824b lazyset: rename the class to filteredset
All smartsets try to be lazy. The purpose of this class is to apply a
filter on another set. So we rename the class (and all its occurrences) to
`filteredset`.
2014-10-03 01:16:23 -05:00
Pierre-Yves David
c7274e7678 lazyset: add order awareness to the class
Just a bit of extra code makes the lazyset aware of order. This renders
orderedlazyset useless.

At some point, the `subset` will become responsible for this ordering logic. But
we are not there yet because the various objects used as subsets are not good enough.
2014-10-02 19:14:03 -05:00
Pierre-Yves David
0b05dee60c lazyset: remove min/max
This is now handled by abstractsmartset.
2014-10-02 19:03:14 -05:00
Pierre-Yves David
76604324cd baseset: remove min/max methods
This is now handled by the base class.
2014-10-02 19:02:50 -05:00
Pierre-Yves David
fb4c81e11e abstractsmartset: add a default implementation for min and max
This default implementation takes advantage of the fast iterator if available.
2014-10-02 18:59:41 -05:00
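
A sketch of a default `min`/`max` that uses a fast iterator when one exists and falls back to full iteration otherwise. Class names and the `None` convention for "no fast iterator" are assumptions for this example.

```python
class abstractsmartset(object):
    """Toy base class; subclasses may provide fastasc/fastdesc methods."""

    fastasc = None   # None means "no cheap ascending iterator available"
    fastdesc = None  # None means "no cheap descending iterator available"

    def min(self):
        if self.fastasc is not None:
            for rev in self.fastasc():
                return rev  # first item of an ascending walk is the minimum
            raise ValueError('arg is an empty sequence')
        return min(self)  # builtin min: full iteration fallback

    def max(self):
        if self.fastdesc is not None:
            for rev in self.fastdesc():
                return rev  # first item of a descending walk is the maximum
            raise ValueError('arg is an empty sequence')
        return max(self)  # builtin max: full iteration fallback


class span(abstractsmartset):
    """Contiguous range with cheap iteration in both directions."""

    def __init__(self, start, end):
        self._start, self._end = start, end

    def __iter__(self):
        return self.fastasc()

    def fastasc(self):
        return iter(range(self._start, self._end))

    def fastdesc(self):
        return iter(range(self._end - 1, self._start - 1, -1))


class plainset(abstractsmartset):
    """Unordered data without fast iterators."""

    def __init__(self, revs):
        self._revs = list(revs)

    def __iter__(self):
        return iter(self._revs)
```

With this default in the base class, subclasses such as the toy `span` no longer need their own `min` and `max`.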
Pierre-Yves David
d810f109b3 lazyset: drop now useless ascending/descending definition 2014-10-02 18:52:09 -05:00
Pierre-Yves David
4be4f3fe52 lazyset: inherit the fastasc and fastdesc method from subset
When the filtered subset has such methods, we can use them. They are implemented
as properties to be able to quickly return None if no corresponding fastasc exists
on the subset.
2014-09-30 23:36:57 -05:00
Pierre-Yves David
b0f4537a2a lazyset: split the iteration logic from the condition filtering logic
So that the filter can be reused by `fastasc` or `fastdesc`.
2014-10-02 18:25:37 -05:00
Pierre-Yves David
5310fa5e65 spanset: do a single range check in __contains__
Now that `start <= end` is always true, we can simplify this function.
2014-10-02 17:53:55 -05:00
Pierre-Yves David
9d76d87327 spanset: enforce the order lazily to gain fastasc and fastdesc methods
Instead of having the direction of iteration enforced through the ordering of
`start` and `end` attributes of spanset, we encode the iteration direction in
an explicit attribute and always store start < end.  The logic for sort and
reverse has to be updated. The __iter__ is now based on the newly introduced
`fastasc` and `fastdesc` methods.

This will allow other code simplifications in the future.
2014-10-02 18:02:17 -05:00
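
The normalisation described above can be sketched as follows. This toy `spanset` assumes the constructor takes a first included revision and a first excluded revision, with `start > end` meaning descending iteration; those semantics and all attribute names are assumptions for the example.

```python
class spanset(object):
    """Toy span: always stores start <= end, direction kept separately."""

    def __init__(self, start, end):
        # Assumption: `start` is included, `end` is excluded, and
        # start > end requests descending iteration.
        self._ascending = start <= end
        if not self._ascending:
            start, end = end + 1, start + 1
        self._start = start
        self._end = end

    def fastasc(self):
        return iter(range(self._start, self._end))

    def fastdesc(self):
        return iter(range(self._end - 1, self._start - 1, -1))

    def __iter__(self):
        return self.fastasc() if self._ascending else self.fastdesc()

    def sort(self, reverse=False):
        self._ascending = not reverse  # no data is moved, only the flag

    def reverse(self):
        self._ascending = not self._ascending

    def __contains__(self, rev):
        return self._start <= rev < self._end  # single range check
```

Because the bounds are always ordered, sorting and reversing only flip a flag, and containment needs a single comparison chain.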
Pierre-Yves David
82a1d861c5 abstractsmartset: document the fastasc and fastdesc attributes/methods
See the in-line documentation for details. (This is the beginning of a massive
overhaul of revset).
2014-09-30 22:26:34 -05:00
Pierre-Yves David
177faece69 spanset: remove ascending/descending implementation
We can rely on their implementation in abstractsmartset.
2014-10-02 18:35:56 -05:00
Pierre-Yves David
2b0dd7c610 baseset: remove ascending/descending redefinition
We can rely on the abstractsmartset implementation.
2014-10-02 18:35:00 -05:00
Pierre-Yves David
765584f8b2 abstractsmartset: default implementation for ascending and descending
These two methods are actually silly aliases for `sort()` and
`sort(reverse=True)`. So we get that aliasing at the abstractsmartset level. We
will slowly phase out all the custom implementations and eventually remove any
mentions of it from the code.
2014-10-02 18:34:18 -05:00
Pierre-Yves David
1b61d96256 revert: bring back usage of subset & ps in parents
Changeset 1440ec8e33c0 switched the order of the operands of the "&" computation
to work around an issue from repo-wide spanset. The need for a workaround has been
alleviated by the introduction of `fullreposet`. So we restore it to normal.

The benchmark shows no significant changes as expected.

We also revert the bogus test change introduced by 1440ec8e33c0. The order is
actually important.
2014-09-17 04:55:55 -07:00
Pierre-Yves David
166e755bd8 revset: introduce an abstractsmartset class
This class documents all methods required by a smartset. This makes it easier
for people to respect the API and ensure we fail loudly when something does
not. It will later also contain common default implementations for multiple
methods, making it easier to have smartset classes with minimal work.
2014-10-01 15:14:36 -05:00
Pierre-Yves David
63c5d3af3b revset: add a __nonzero__ to baseset
We are about to add a base class for `baseset` with an abstract `__nonzero__`
method. So we need this method to be explicitly defined to avoid issues. The
built-in list object apparently does not have a `__nonzero__` and relies on
`__len__` for this purpose?
2014-10-01 15:03:16 -05:00
Pierre-Yves David
a44ac390ee revset: drop isinstance(baseset) in spanset.__sub__
As baseset now has a fast __contains__ operator, this `baseset.set()` dance is no
longer needed. No regressions are visible in the benchmark.
2014-10-01 15:50:54 -05:00
Pierre-Yves David
43c1a7c7b7 revset: drop isinstance(baseset) in spanset.__and__
As baseset now has a fast __contains__ operator, this `baseset.set()` dance is no
longer needed. No regressions are visible in the benchmark.
2014-10-01 15:50:40 -05:00
Pierre-Yves David
3c21cdaa54 revset: drop isinstance(baseset) from baseset.__and__
As baseset now has a fast __contains__ operator, this `baseset.set()` dance is
no longer needed. No regressions are visible in the benchmark.
2014-09-30 23:09:59 -05:00
Pierre-Yves David
3a3ec48d95 revset: use direct access to __contains__ in spanset.__sub__
Using `x.__contains__(r)` instead of `r in x` does not matter for built-in types
(set) but has a positive impact for all other classes. This will let us drop
some usage of baseset.set() in future patches. This also probably improves
performance.
2014-10-01 15:53:42 -05:00
Pierre-Yves David
d93cff0448 revset: use a single return statement in matcher function
This makes it easy to insert post processing and debug code on the returned
value.
2014-09-30 12:39:21 -05:00
Pierre-Yves David
2a8200e655 revset: rely on built in iterator when possible in _generatorset.__iter__
Doing manual iteration is expensive. We rely on built-in list iteration
whenever possible. The other case has to become a closure because we cannot
have both yield and return in the same function.
2014-04-30 16:56:23 -07:00
Pierre-Yves David
c6262025ca revset: prefetch an attribute in _generatorset.__iter__
Python's attribute lookups are expensive, let's do fewer of them.

This gives us a 7% speedup on this revset iteration (from 0.063403 to 0.059032)
2014-09-18 15:52:45 -07:00
Pierre-Yves David
a7bd255d53 revset: use subset & in bare p2()
This takes advantage of the `fullreposet` smartness with a nice
speedup. It's a similar speedup to `p1()` when a merge is in progress
(the non merge case is already lightning fast anyway.)
2014-09-17 11:00:03 -07:00
Pierre-Yves David
85eb5c83a9 revset: use subset & in bare p1()
This takes advantage of the `fullreposet` smartness and yields a nice
speedup.

revset #0: p1()
0) wall 0.003256 comb 0.010000 user 0.010000 sys 0.000000 (best of 527)
1) wall 0.000066 comb 0.000000 user 0.000000 sys 0.000000 (best of 23224)
2014-09-17 10:59:52 -07:00
Pierre-Yves David
8cb4d64b32 revset: use subset & in rev
This takes advantage of the `fullreposet` smartness and yields a nice
speedup.

revset #0: rev(25)
0) wall 0.005480 comb 0.000000 user 0.000000 sys 0.000000 (best of 305)
1) wall 0.000052 comb 0.000000 user 0.000000 sys 0.000000 (best of 21891)
2014-09-17 11:00:09 -07:00
Pierre-Yves David
375f152fed revset: use subset & in origin
This takes advantage of the `fullreposet` smartness.

revset #0: origin(tip)
0) wall 0.005353 comb 0.000000 user 0.000000 sys 0.000000 (best of 354)
1) wall 0.003080 comb 0.000000 user 0.000000 sys 0.000000 (best of 446)
2014-09-17 19:52:34 -07:00
Pierre-Yves David
fe68969345 revset: use subset & in follow
This takes advantage of the `fullreposet` smartness.


revset #0: follow(COPYING)
0) wall 0.002446 comb 0.000000 user 0.000000 sys 0.000000 (best of 735)
1) wall 0.000331 comb 0.000000 user 0.000000 sys 0.000000 (best of 5672)
2014-09-17 10:59:16 -07:00
Pierre-Yves David
a3297e9a12 revset: use subset & in filelog
This takes advantage of the `fullreposet` smartness.

revset #0: file(COPYING)
0) wall 3.179066 comb 3.180000 user 3.140000 sys 0.040000 (best of 3)
1) wall 2.723699 comb 2.730000 user 2.690000 sys 0.040000 (best of 4)
2014-09-17 10:58:50 -07:00
Pierre-Yves David
244ffda42c revset: use subset & in divergent
This takes advantage of the `fullreposet` smartness.

revset #0: divergent()
0) wall 0.002047 comb 0.000000 user 0.000000 sys 0.000000 (best of 813)
1) wall 0.000052 comb 0.000000 user 0.000000 sys 0.000000 (best of 22757)
2014-09-17 10:58:39 -07:00
Pierre-Yves David
4cc5660e43 revset: use subset & in bisect
This takes advantage of the `fullreposet` smartness.

revset #0: bisect(range)
0) wall 0.014007 comb 0.010000 user 0.010000 sys 0.000000 (best of 115)
1) wall 0.005556 comb 0.010000 user 0.010000 sys 0.000000 (best of 235)
2014-09-17 10:57:57 -07:00
Pierre-Yves David
35c1eba9e6 revset: use subset & in ancestorspec
This takes advantage of the `fullreposet` smartness.


revset #0: tip~25
0) wall 0.004800 comb 0.010000 user 0.010000 sys 0.000000 (best of 259)
1) wall 0.002475 comb 0.000000 user 0.000000 sys 0.000000 (best of 717)
2014-09-17 10:57:47 -07:00
Pierre-Yves David
25b1e1b399 revset: use subset & in bookmark
Speedup, Weeeeeee!

revset #0: bookmark()
0) wall 0.002240 comb 0.000000 user 0.000000 sys 0.000000 (best of 571)
1) wall 0.000132 comb 0.000000 user 0.000000 sys 0.000000 (best of 14059)
2014-09-17 19:57:09 -07:00
Pierre-Yves David
cefa7eaabc revset: use subset & in outgoing
This should give us the same benefit as elsewhere. Result is simpler (and
"faster").

Outgoing is dominated by the discovery so no benchmark is provided.
2014-09-17 10:59:40 -07:00
Pierre-Yves David
490d3a84ce revset: avoid in loop lookup in _generatorset._consumegen
Python lookups are slow, so do all lookups outside of the for loop.

This provides a small but still significant speedup:

revset #0: 0::
0) wall 0.063258 comb 0.060000 user 0.060000 sys 0.000000 (best of 100)
1) wall 0.057776 comb 0.050000 user 0.050000 sys 0.000000 (best of 100)
2014-04-30 16:56:48 -07:00
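
The micro-optimisation described above can be sketched as follows. The function `consumegen` and its return shape are hypothetical; the sketch only shows the pattern of binding a method to a local name before the loop.

```python
def consumegen(gen):
    """Drain `gen`, recording each item in a list and a membership cache."""
    cache = {}
    genlist = []
    # Look the bound method up once instead of on every iteration.
    append = genlist.append
    for item in gen:
        cache[item] = True
        append(item)
    return genlist, cache
```

The gain per iteration is tiny, so this only pays off in hot loops such as walking every descendant of revision 0.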
Pierre-Yves David
ae357027fd revset: reduce dict lookup in lazyset.__contains__
Avoid an extra dict lookup when we have to compute the value. No
visible performance impact but this shaves the yak a few extra
nanometers.
2014-04-25 14:51:24 -07:00
Pierre-Yves David
e291b5bdba revset: do less lookup during spanset.__contains__
Attribute lookup is slow in python. So this version is going to be a bit
faster. This does not have a visible impact since the rest of the stack is much
slower but this shaves the yak a few extra nanometers.

Moreover, the new version is more readable, so it is worth making this change
for code quality purposes.

This optimisation was approved by a core python dev.
2014-04-25 17:53:58 -07:00
Pierre-Yves David
477ee214de revset: fast implementation for fullreposet.__and__
"And" operation with something that contains the whole repo should be super
cheap. Check method docstring for details.

This provides a massive boost to simple revsets that use `subset & xxx`:

revset #0: p1(20000)
0) wall 0.002447 comb 0.010000 user 0.010000 sys 0.000000 (best of 767)
1) wall 0.000529 comb 0.000000 user 0.000000 sys 0.000000 (best of 3947)

revset #1: p2(10000)
0) wall 0.002464 comb 0.000000 user 0.000000 sys 0.000000 (best of 913)
1) wall 0.000530 comb 0.000000 user 0.000000 sys 0.000000 (best of 4226)

No other regression spotted.

More performance improvements are expected in the future as more
revset predicates are converted to use `subset & xxx`.

The relaxed way `fullreposet` handles the "&" operation may cause some trouble for
people comparing smartsets from different filter levels. I'm not sure such people
exist and we can improve that aspect in later patches.
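
As a rough illustration of why "and" with a set that contains the whole repo can be cheap, here is a sketch under stated assumptions. `FullRange` is a hypothetical stand-in, not Mercurial's `fullreposet`, and it still filters the other operand for validity, which the real class may not do.

```python
class FullRange:
    """Hypothetical stand-in for a set holding every revision 0..n-1."""

    def __init__(self, n):
        self._n = n

    def __contains__(self, rev):
        return 0 <= rev < self._n

    def __iter__(self):
        return iter(range(self._n))

    def __and__(self, other):
        # Everything valid is already in `self`, so the result is just
        # `other` restricted to valid revisions. The cost depends on
        # len(other), not on the size of the repo.
        return sorted(rev for rev in other if rev in self)


full = FullRange(100000)
result = full & {7, 42, 99999}
```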
2014-09-24 20:11:36 -07:00
Pierre-Yves David
acb7d8cae1 revset: turn spanset into a factory function
We rename the `spanset` class to `_spanset`. `spanset` is now a function that
builds either a `fullreposet` or a `_spanset` according to the argument passed.

At some point, we may force people to explicitly use the `fullreposet`
constructor, but the current approach makes it easier to ensure we use the new
class whenever possible and focus on the benefits of this class.
2014-09-18 13:04:02 -07:00
Pierre-Yves David
8febe1e995 revert: add a fullreposet class
Every revset evaluation starts from `subset = spanset(repo)` and a lot of
revset predicates build a `spanset(repo)` for their internal needs.

`spanset` is a generic class that can handle any situation. As a result, many
operations between spansets produce an `orderedlazyset`, a safe object but
suboptimal in many situations.

So we introduce a `fullreposet` class where some of the operations will be
overridden to produce more interesting results.
2014-04-29 19:06:15 -07:00
Matt Mackall
0e7a6163da merge with stable 2014-09-27 13:18:10 -05:00
Pierre-Yves David
537742ab10 revset: remove nullrev from the bookmark computation
As for the other revsets, we sanitize the content of the set to be able to rely
on it more.
2014-09-17 19:56:59 -07:00
Pierre-Yves David
a06107e4f9 revset: unify code flow in bookmark
We refactor the code of the bookmark revset to have a single return. This will
allow us to sanitize the content of the set.
2014-09-17 10:58:25 -07:00
Pierre-Yves David
9cdb9e5a5d revset: remove invalid value in the origin set
Like the parents-related revsets, origin had some invalid values in the
computed set. We remove them.
2014-09-17 10:59:30 -07:00
Pierre-Yves David
5005f69963 revset: remove nullrev from set computed in parents()
The old code relied on the subset contents to get rid of invalid values. We would
like to be able to rely more on the computation in parents() so we filter out
the invalid value.
2014-09-17 19:49:26 -07:00
Pierre-Yves David
1601426b0b revset: refactor parents() into a single return point
Both paths do similar things in the end. We refactor the function so that
the `ps` set is used by both at the end.

This will end up excluding `nullrev` from this set in a future patch.
2014-09-17 19:44:03 -07:00
Pierre-Yves David
3e189cc195 revset: remove nullrev from set computed in p1() and p2()
The old code relied on the subset contents to get rid of invalid values. We would
like to be able to rely more on the computation in p1() and p2() so we filter out
the invalid value
2014-09-17 04:40:30 -07:00
Pierre-Yves David
d0e7545de8 revset: add an optimised baseset.__contains__ (issue4371)
The baseset class is based on a python list. This means that baseset.__contains__
was absolutely as crappy as list.__contains__. We now rely on __contains__ from
the underlying set.

This will avoid having to explicitly convert the baseset to a set (using
baseset.set()) whenever one wants a fast membership test.

Apparently there is already code that forgot to do such conversions, since we
observe a massive speedup in some tests.

revset #25: roots((0::) - (0::tip))
0) wall 2.079454 comb 2.080000 user 2.080000 sys 0.000000 (best of 5)
1) wall 0.132970 comb 0.130000 user 0.130000 sys 0.000000 (best of 65)

No regression is observed in benchmarks.

This change brings issue4371 back to an acceptable situation (but it is still
slower than manual subtraction).
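
A minimal sketch of a list-based collection that answers membership tests through a lazily built set. `CachedList` and `_set_cache` are hypothetical names, and the sketch assumes the list is not mutated after the first lookup.

```python
class CachedList(list):
    """Hypothetical list subclass with set-backed membership tests."""

    _set_cache = None

    @property
    def _set(self):
        # Built on first use. This sketch assumes the list is not
        # mutated afterwards; a real class would need invalidation.
        if self._set_cache is None:
            self._set_cache = set(self)
        return self._set_cache

    def __contains__(self, item):
        # O(1) on average, instead of the O(n) scan done by list.
        return item in self._set


revs = CachedList(range(0, 100000, 2))
```

Iteration order is still the list's order; only the membership test changes.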
2014-09-16 23:59:29 -07:00
Pierre-Yves David
27c652135f revset: document the choice made in __generatorset.__iter__
The method code looks a bit ugly but has good reasons to. We document them
to prevent naive refactoring in the future.
2014-09-16 23:42:41 -07:00
Pierre-Yves David
f7b33aee4f revset: stop using a baseset instead of a plain list in _revsbetween
The function's internal code needs a list. Let's use a list.
2014-09-16 22:55:49 -07:00
Pierre-Yves David
10f312d751 revset: simplify orderedlazyset creation in spanset method
We can simply use the `self.isascending` value instead of a more complex if/else
clause. This makes the code simpler.

Benchmarks show no performance regression.
2014-09-16 23:47:34 -07:00
Pierre-Yves David
88ccbff938 revset: use spanset.isdescending in multiple simple places
We call the method directly instead of duplicating checks.

Benchmarks show no performance regression.
2014-09-16 23:37:03 -07:00
Pierre-Yves David
440efac9e4 revset: wider definition of ascending and descending for spanset
Before this patch, empty spansets were seen as neither ascending nor
descending. This is mathematically wrong and creates some edge cases. We put
`isascending` and `isdescending` back on track so we can use them to simplify
some of the spanset code.

Benchmarks show no performance regression.
2014-09-16 23:34:18 -07:00
Durham Goode
ce250e375a revset: lower weight for _intlist function
The histedit command uses a revset like:

(_intlist('1234\x001235')) and merge()

Previously the optimizer gave a weight of 1.5 to the _intlist side (1 for the
function, 0.5 for the string) which caused it to process the merge() side first.
This caused it to evaluate merge against every commit in the repo, which took
2.5 seconds on a large repo.

I changed the weight of _intlist to 0, since it's a trivial calculation. This
makes the optimizer process _intlist first, so merge() applies only to the revs
in the list. The revset now takes 0.15 seconds, cutting 2.4 seconds off
our histedit time.

From the revset benchmark:
revset #25: (_intlist('20000\x0020001')) and merge()
0) obsolete feature not enabled but 54243 markers found!
! wall 0.036767 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.000198 comb 0.000000 user 0.000000 sys 0.000000 (best of 9084)
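
The effect of the weight change can be sketched as follows. This is not the real optimizer: `order_and` is a hypothetical helper, and the weight of 1.0 used for merge() is an assumption chosen only so that it sits between 0 and 1.5.

```python
def order_and(left, right):
    """Return the operands of an `and`, cheapest first.

    Each operand is a hypothetical (weight, name) pair.
    """
    # sorted() is stable, so equal weights keep their written order.
    return sorted([left, right], key=lambda operand: operand[0])


# _intlist weights come from the message above (1.5 before, 0 after);
# the merge() weight is an assumed value.
before = order_and((1.5, "_intlist"), (1.0, "merge"))
after = order_and((0.0, "_intlist"), (1.0, "merge"))
```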
2014-09-12 14:21:18 -07:00
Durham Goode
79319b785d revset: make parents() O(number of parents)
Strip executes a revset like this:

max(parents(_intlist('1234\x001235')) - _intlist('1234\x001235'))

Previously the parents() revset would do 'subset & parents' which iterates over
each item in the subset and checks if it's in parents.  subset is usually the
entire repo (a spanset) so this takes a while.

Reversing the parameters to be 'parents & subset' means the operation becomes
O(number of parents) instead of O(size of repo). It also means the result gets
evaluated immediately (since parents isn't a lazy set), but I think this is a
win in most scenarios.

This shaves 0.3 seconds off strip (amend/histedit/rebase/etc) for large repositories.

revset #0: parents(20000)
0) obsolete feature not enabled but 54243 markers found!
! wall 0.006256 comb 0.010000 user 0.010000 sys 0.000000 (best of 289)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.000391 comb 0.000000 user 0.000000 sys 0.000000 (best of 4323)
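
The operand swap boils down to iterating over the small side and testing membership in the large one. A minimal sketch, with hypothetical names and a plain `range` standing in for the repo-wide subset:

```python
def intersect(small, large):
    """Keep the members of `small` that are also in `large`.

    Only `small` is iterated, so the cost is O(len(small)) as long as
    `large` supports fast membership tests.
    """
    return [rev for rev in small if rev in large]


subset = range(1000000)      # stands in for the whole repo
parents = [1233, 1234]       # hypothetical parent revisions
result = intersect(parents, subset)
```

As the message notes, the result is computed eagerly here, since the small side is not lazy.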
2014-09-12 15:00:51 -07:00
Durham Goode
9d2bd7f0b2 revset: make descendants() lazier
Previously descendants() would force the provided subset to become a set.  In
the case of revsets like '(%ld::) - (%ld)' (as used by histedit) this would
force the '- (%ld)' set to be evaluated, which produced a set containing every
commit in the repo (except %ld). This takes 0.6s on large repos.

This changes descendants to trust the subset to implement __contains__
efficiently, which improves the above revset to 0.16s, shaving 0.4 seconds off
histedit.

revset #27: (20000::) - (20000)
0) obsolete feature not enabled but 54243 markers found!
! wall 0.023640 comb 0.020000 user 0.020000 sys 0.000000 (best of 100)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.019589 comb 0.020000 user 0.020000 sys 0.000000 (best of 100)

This commit removes the final revset related perf hotspot from histedit.
Combined with the previous two patches, they shave a little over 3 seconds off
histedit on large repos.
2014-09-12 16:21:13 -07:00
Michael O'Connor
306b55bcc9 revset: bookmark revset interprets 'literal:' prefix correctly (issue4329) 2014-08-11 23:45:08 -04:00
Gregory Szorc
27315bd014 revset: optimize baseset.__sub__ (issue4313)
f5a63a5506d2 regressed performance of baseset.__sub__ by introducing
a lazyset. This patch restores that lost performance by eagerly
evaluating baseset.__sub__ if the other set is a baseset.

revsetbenchmark.py results impacted by this change:

revset #6: roots(0::tip)
0) wall 2.923473 comb 2.920000 user 2.920000 sys 0.000000 (best of 4)
1) wall 0.077614 comb 0.080000 user 0.080000 sys 0.000000 (best of 100)

revset #23: roots((0:tip)::)
0) wall 2.875178 comb 2.880000 user 2.880000 sys 0.000000 (best of 4)
1) wall 0.154519 comb 0.150000 user 0.150000 sys 0.000000 (best of 61)

On the author's machine, this slowdown manifested during evaluation of
'roots(%ln::)' in phases.retractboundary after unbundling the Firefox
repository. Using `time hg unbundle firefox.hg` as a benchmark:

Before: 8:00
After:  4:28
Delta: -3:32

For reference, the subset and cs baseset instances impacted by this
change were of lengths 193634 and 193627, respectively.

Explicit test coverage of roots(%ln::), while similar to the existing
roots(0::tip) benchmark, has been added.
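
For illustration only, an eager, order-preserving subtraction between two concrete collections can be sketched like this. `subtract` is a hypothetical helper, not the `baseset.__sub__` implementation.

```python
def subtract(revs, other):
    """Return the items of `revs` that are not in `other`, in order."""
    # Build the set once so each membership test is O(1) on average;
    # testing against a list would make the operation quadratic.
    excluded = set(other)
    return [rev for rev in revs if rev not in excluded]


result = subtract([5, 3, 8, 1], [3, 1])
```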
2014-07-24 12:12:12 -07:00
Matt Harbison
f0308c64dd revset: avoid a ValueError when 'only()' is given an empty set
This previously died in _revdescendants() taking the min() of the first set to
only(), when it was empty.  An empty second set already worked.  Likewise,
descendants() already handled an empty set.
2014-07-18 19:46:56 -04:00
Siddharth Agarwal
5801040e81 revset: remove no longer used _missingancestors revset
This was undocumented.
2014-07-12 00:37:08 -07:00
Siddharth Agarwal
9ea47f6848 revset: replace _missingancestors optimization with only revset
(::a - ::b) is equivalent to only(a, b).
2014-07-12 00:31:36 -07:00
Matt Mackall
5157edb3f5 revset: maintain ordering when subtracting from a baseset (issue4289) 2014-07-14 17:55:31 -05:00
Pierre-Yves David
0a422cd2f2 revset: cosmetic changes in spanset range comparison
We use the python syntax for range comparison: `a < x < c`. This is shorter,
more readable and less error prone. This comparison escaped the cleanup made in
166d6dde9310
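
A small example of the chained comparison syntax, using a hypothetical `in_span` helper and a half-open range:

```python
def in_span(rev, start, end):
    """True when start <= rev < end (half-open range)."""
    # Chained form of `start <= rev and rev < end`; `rev` is
    # evaluated only once.
    return start <= rev < end
```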
2014-04-28 15:14:11 -07:00
Pierre-Yves David
f9553b0f97 revset: drop spanset._contained
All its users inlined it for performance reasons.
(See fcccbf073394 and 166d6dde9310)
2014-04-25 23:38:24 -07:00
Pierre-Yves David
e654410fbc revset: directly use __contains__ instead of a lambda
We got rid of lambdas in a bunch of other places. This is equivalent and much
faster. (No new timing, as this is the same change as three other changesets.)
2014-05-01 14:07:04 -07:00
Pierre-Yves David
b3f19fad6e orderedlazyset: directly use __contains__ instead of a lambda
We apply the same speedup as in spanset, getting rid of the useless lambda.
(No new timing, as this is the very same change)
2014-05-01 12:15:28 -07:00
Pierre-Yves David
68eca34f8d lazyset: directly use __contains__ instead of a lambda
We apply the same speedup as in spanset, getting rid of the useless lambda.
(No new timing, as this is the very same change)
2014-05-01 12:15:00 -07:00
Pierre-Yves David
7a5b8cbf4b spanset: directly use __contains__ instead of a lambda
Spansets are used massively in revsets, first of all because the initial subset
itself is a repo-wide spanset. We speed up the __and__ operation by getting rid of a
gratuitous lambda call. A longer-term solution would be to:

1. speed up operation between spansets,
2. have a special smartset for `all` revisions.

In the meantime, this is a very simple fix that buys back some of the performance
regression.

Below is a performance benchmark for a trivial `and` operation between two spansets.
(Run on an unspecified fairly large repository.)

revset tip:0
2.9.2)  wall 0.282543 comb 0.280000 user 0.260000 sys 0.020000 (best of 35)
before) wall 0.819181 comb 0.820000 user 0.820000 sys 0.000000 (best of 12)
after)  wall 0.645358 comb 0.650000 user 0.650000 sys 0.000000 (best of 16)

Proof of concept implementation of an `all` smartset brings this to 0.10 but it's
too invasive for stable.
2014-04-26 00:38:02 -07:00
Pierre-Yves David
a4f88556f4 revset: also inline spanset._contained in __len__
For consistency with what happens in `__contains__`, we inline the range test
into `__len__` too.
2014-04-25 18:00:07 -07:00
Pierre-Yves David
b42c62324c revset: inline spanset containment check (fix perf regression)
Calling a function is super expensive in python. We inline the trivial range
comparison to get back to more sensible performance on common revset operations.

Benchmark result below:

Revision mapping:
0) bced32a3fd6c 2.9.2 release
1) 2ab64f462d81 current @
2) This revision


revset #0: public()
0) wall 0.010890 comb 0.010000 user 0.010000 sys 0.000000 (best of 201)
1) wall 0.012109 comb 0.010000 user 0.010000 sys 0.000000 (best of 199)
2) wall 0.012211 comb 0.020000 user 0.020000 sys 0.000000 (best of 197)

revset #1: :10000 and public()
0) wall 0.007141 comb 0.010000 user 0.010000 sys 0.000000 (best of 361)
1) wall 0.014139 comb 0.010000 user 0.010000 sys 0.000000 (best of 186)
2) wall 0.008334 comb 0.010000 user 0.010000 sys 0.000000 (best of 308)

revset #2: draft()
0) wall 0.009610 comb 0.010000 user 0.010000 sys 0.000000 (best of 279)
1) wall 0.010942 comb 0.010000 user 0.010000 sys 0.000000 (best of 243)
2) wall 0.011036 comb 0.010000 user 0.010000 sys 0.000000 (best of 239)

revset #3: :10000 and draft()
0) wall 0.006852 comb 0.010000 user 0.010000 sys 0.000000 (best of 383)
1) wall 0.014641 comb 0.010000 user 0.010000 sys 0.000000 (best of 183)
2) wall 0.008314 comb 0.010000 user 0.010000 sys 0.000000 (best of 299)

We can see this changeset wins back the regression for the `and` operation on
spansets. We are still a bit slower for `public()` and `draft()`, predicates
not touched by this changeset.
2014-04-28 15:15:36 -07:00
Pierre-Yves David
4274fa0b04 revset: fix revision filtering in spanset.contains (regression)
The argument is `x` but the variable tested for filtering is `rev`. `rev`
happens to be a revset method, ... never part of the filtered revs. This
method is now using `rev` for everything.
2014-04-28 16:28:52 -07:00
Greg Hurrell
89c96d28b3 help: clarify distinction among contains/file/filelog
For a Mercurial newcomer, the distinction between `contains(x)`,
`file(x)`, and `filelog(x)` in the "revsets" help page may not be
obvious. This commit tries to make things more obvious (text based on
an explanation from Matt in an FB group thread).
2014-04-28 15:09:23 -07:00
Wagner Bruna
0db6df4ed7 revset, i18n: add translator comment to "only" 2014-04-22 10:12:13 -03:00
Mads Kiilerich
0e8795ccd6 spelling: fixes from spell checker 2014-04-13 19:01:00 +02:00
Mads Kiilerich
b117005b7c revlog: use context ancestor instead of changelog ancestor
We want to move in this direction.
2014-04-07 23:17:51 +02:00
Durham Goode
eac8ba4613 revset: improve roots revset performance
Previously we would iterate over every item in the subset, checking if it was in
the provided args. This often meant iterating over every rev in the repo.

Now we iterate over the args provided, checking if they exist in the subset.
On a large repo this brings setting phase boundaries (which use this revset
roots(X:: - X::Y)) down from 0.8 seconds to 0.4 seconds.

The "roots((tip~100::) - (tip~100::tip))" revset in revsetbenchmarks shows it
going from 0.12s to 0.10s, so we should be able to catch regressions here in the
future.

This actually introduces a regression in 'roots(all())' (0.2s to 0.26s) since
we're now using spansets, which are slightly slower to do containment checks on.
I believe this trade off is worth it, since it makes the revset O(number of
args) instead of O(size of repo).
2014-03-31 16:03:34 -07:00
Durham Goode
13db32b575 revset: improve _descendants performance
Previously revset._descendants would iterate over the entire subset (which is
often the entire repo) and test if each rev was in the descendants list. This is
really slow on large repos (3+ seconds).

Now we iterate over the descendants and test if they're in the subset.
This affects advancing and retracting the phase boundary (3.5 seconds down to
0.8 seconds, which is even faster than it was in 2.9). Also affects commands
that move the phase boundary (commit and rebase, presumably).

The new revsetbenchmark indicates an improvement from 0.2 to 0.12 seconds. So
future revset changes should be able to notice regressions.

I removed a bad test. It was recently added and tested '1:: and reverse(all())',
which has an ambiguous output direction.  Previously it printed in reverse order,
because we iterated over the subset (the reverse part). Now it prints in normal
order because we iterate over the 1:: . Since the revset itself doesn't imply an
order, I removed the test.
2014-03-25 14:10:01 -07:00
Pierre-Yves David
3a1c934d6a revset: raise ValueError when calling min or max on empty smartset
min([]) raises a ValueError, so we do the same thing in smartset.min() and
smartset.max() for the sake of consistency.

The min/max tests are greatly improved in the process to prevent this family
of regressions.
2014-03-28 17:00:13 -07:00
Pierre-Yves David
4e0cea0c36 _addset: add a __len__ method
Back when repo.revs(...) returned a list, calling `len(...)` on the
result was quite common. We reinstate this on _addset.

There is absolutely no easy way to test this from the command line. The commands
using this in the evolve extension will eventually land into core.
2014-03-20 18:55:28 -07:00
Durham Goode
dd74d4bd47 revset: fix generatorset race condition
If two things were iterating over a generatorset at the same time, they could
miss out on the things the other was generating, resulting in incomplete
results. This fixes it by making it possible for two things to iterate at once,
by always checking the _genlist at the beginning of each iteration.

I was only able to repro it with pending changes from my other commits, but they
aren't ready yet. So I'm unable to add a test for now.
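
The fix described above can be sketched as follows, under stated assumptions: `CachingGen` is a hypothetical wrapper, and only the `_genlist` name is taken from the message. Each iterator checks the shared list before pulling from the generator, so values produced for one consumer are still seen by the other.

```python
class CachingGen:
    """Hypothetical wrapper that remembers what a generator yielded."""

    def __init__(self, gen):
        self._gen = gen
        self._genlist = []   # values produced so far, in order

    def __iter__(self):
        i = 0
        while True:
            # Always serve from the shared cache first, so values
            # pulled by another iterator are not skipped.
            if i < len(self._genlist):
                yield self._genlist[i]
            else:
                try:
                    item = next(self._gen)
                except StopIteration:
                    return
                self._genlist.append(item)
                yield item
            i += 1


shared = CachingGen(iter([10, 20, 30]))
a, b = iter(shared), iter(shared)
interleaved = [next(a), next(b), next(a), next(b), next(a), next(b)]
```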
2014-03-25 16:10:07 -07:00
Matt Mackall
b465bcd596 merge with stable 2014-03-25 16:17:16 -05:00
Gregory Szorc
2e930ae769 revset: improve performance of _generatorset.__contains__ (issue 4201)
_generatorset.__contains__ and __contains__ from child classes were
calling into __iter__ to look for values. Since all
previously-encountered values from the generator were cached and checked
in __contains__ before this iteration, __contains__ was effectively
performing iteration busy work which could lead to an explosion of
redundant work.

This patch changes __contains__ to be more intelligent. Instead of
looking at all values via __iter__, __contains__ will instead go
straight to "new" values from the underlying generator.

On a clone of the Firefox repository with around 200,000 changesets,
this patch decreases the execution time of the revset '::(200067)::'
from ~100s to ~4s on the author's machine. Rebase operations (which use
the aforementioned revset), speed up accordingly.
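
A minimal sketch of a membership test that consults the cache first and then pulls only new values from the underlying generator. `LazyMembers` and `source` are hypothetical names, not the `_generatorset` code.

```python
class LazyMembers:
    """Hypothetical generator wrapper with a cached membership test."""

    def __init__(self, gen):
        self._gen = gen
        self._seen = set()

    def __contains__(self, item):
        if item in self._seen:
            return True
        # Not cached: pull only values the generator has not produced
        # yet, instead of re-walking everything from the start.
        for value in self._gen:
            self._seen.add(value)
            if value == item:
                return True
        return False


pulled = []   # records what the generator was actually asked for


def source():
    for value in range(5):
        pulled.append(value)
        yield value


members = LazyMembers(source())
first = 2 in members     # consumes 0, 1, 2
second = 1 in members    # answered from the cache
```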
2014-03-24 20:00:18 -07:00
Matt Harbison
60d20455df revset: document the regular expression support for tag(name)
This has been supported since 0041ea008c64, in 2.3.
2014-03-24 21:27:40 -04:00
Matt Mackall
4436b1c4a4 revset: try to handle hyphenated symbols if lookup callback is available
Formerly an expression like "2.4-rc::" was tokenized as 2.4|-|rc|::.
This allows dashes in symbols iff the whole symbol-like string can be
looked up. Otherwise, it's tokenized as a series of symbols and
operators.

No attempt is made to accept dashed symbols inside larger symbol-like
string, for instance foo-bar or bar-baz inside foo-bar-baz.
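
A much-simplified sketch of the rule, not the real revset tokenizer: a symbol-like run containing dashes is kept whole only when the `lookup` callback (a hypothetical function returning True for known names) accepts it; otherwise it is split into symbols and `-` operators.

```python
import re


def tokenize(text, lookup=None):
    """Split `text` into symbols and operators.

    `lookup` is a hypothetical callback returning True when a name
    exists (for instance as a tag or bookmark).
    """
    tokens = []
    # A symbol-like run may contain dashes; "::" and any other
    # non-space character are treated as operators.
    for chunk in re.findall(r"[\w.\-]+|::|\S", text):
        if "-" in chunk and chunk != "-" and not (lookup and lookup(chunk)):
            # Unknown as a whole: fall back to symbols and "-" operators.
            tokens.extend(part for part in re.split(r"(-)", chunk) if part)
        else:
            tokens.append(chunk)
    return tokens
```

With no callback, `2.4-rc::` is split the old way; with a callback that knows `2.4-rc`, the symbol survives intact.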
2014-03-18 17:54:42 -05:00
Matt Mackall
a9a943eddd revset: pass a lookup function to the tokenizer 2014-03-18 17:19:44 -05:00
Lucas Moscovicz
1ff08e4b20 revset: changed minrev and maxrev implementations to use ordered sets
Performance Benchmarking:

0) max(tip:0)
1) min(0:tip)
2) min(0::)

c6d901b5cf89 (2.9.1 release)

    0) ! wall 0.005699 comb 0.000000 user 0.000000 sys 0.000000 (best of 450)
    1) ! wall 0.005414 comb 0.010000 user 0.010000 sys 0.000000 (best of 493)
    2) ! wall 0.025951 comb 0.030000 user 0.030000 sys 0.000000 (best of 107)

a9da3f4c0086 (public tip at submission time)

    0) ! wall 0.015177 comb 0.020000 user 0.020000 sys 0.000000 (best of 175)
    1) ! wall 0.014779 comb 0.010000 user 0.010000 sys 0.000000 (best of 189)
    2) ! wall 12.345179 comb 12.350000 user 12.350000 sys 0.000000 (best of 3)

Current patches:

    0) ! wall 0.001911 comb 0.000000 user 0.000000 sys 0.000000 (best of 1357)
    1) ! wall 0.001943 comb 0.010000 user 0.010000 sys 0.000000 (best of 1406)
    2) ! wall 0.000405 comb 0.000000 user 0.000000 sys 0.000000 (best of 6761)
2014-02-18 11:35:03 -08:00
Lucas Moscovicz
3bb3a9b599 revset: changed addset to extend _orderedsetmixin
Now _addset can use the lazy min and max implementation.
2014-03-14 14:43:44 -07:00
Pierre-Yves David
5b4ec0c02e revset: add a default argument for baseset.__init__
We are now able to create empty baseset using `baseset()` as we are able to
create empty list with `list()`.
2014-03-14 11:41:26 -07:00
Lucas Moscovicz
b693f7e6e2 revset: changed orderedlazyset to also extend _orderedsetmixin
Now orderedlazyset can use the lazy min and max implementation.
2014-03-13 11:36:45 -07:00
Lucas Moscovicz
6cf940b8a2 revset: changed spanset to extend _orderedsetmixin
Now spanset can use the lazy min and max methods implementation.
2014-03-13 11:36:11 -07:00
Lucas Moscovicz
732d8995dd revset: added _orderedsetmixin class
This class has utility methods for any ordered class to get the min and the
max values.
2014-03-12 16:40:18 -07:00
Lucas Moscovicz
829c2686af revset: added min and max methods to baseset and lazyset
These classes have no particular order, so they rely on Python's min() and max()
implementations. These methods will be implemented in every smartset class in
future patches. For other classes, lazy implementations of these methods are
possible.
2014-02-19 09:28:17 -08:00
Pierre-Yves David
74abf98daf revset: add documentation and comment for _generatorset
(clean up some old irrelevant comment in the process)
2014-03-14 10:57:04 -07:00
Pierre-Yves David
170d61865e revset: add some documentation for lazyset 2014-03-14 10:55:03 -07:00
Lucas Moscovicz
6f3e54ea87 revset: added documentation and comment for spanset class 2014-03-14 10:59:51 -07:00
Lucas Moscovicz
432476d8c0 revset: changed smartset methods to return ordered addsets
Now when adding two structures that are ordered, they are wrapped into an
_addset and they get added lazily while keeping the order.
2014-03-11 17:25:53 -07:00
Lucas Moscovicz
dd9f8534d9 revset: added isascending and isdescending methods to _addset
These methods are intended to duck-type baseset, so we will still have _addset
as a private class but now we can return it without wrapping it into an
orderedlazyset or a lazyset.

These were the last methods to add for smartset compatibility.
2014-03-14 10:24:09 -07:00
Lucas Moscovicz
c5a9f97171 revset: added __add__ method to _addset
This method is intended to duck-type baseset, so we will still have _addset as a
private class but we will be able to return it without wrapping it into an
orderedlazyset or a lazyset.
2014-03-14 10:23:54 -07:00
Lucas Moscovicz
a79e6ad5a0 revset: added __sub__ method to _addset
This method is intended to duck-type baseset, so we will still have _addset as a
private class but now will be able to return it without wrapping it into an
orderedlazyset or a lazyset.
2014-03-14 10:22:51 -07:00
Lucas Moscovicz
02e069f77d revset: added __and__ method to _addset
This method is intended to duck-type baseset, so we will still have _addset as a
private class but we will be able to return it without wrapping it into an
orderedlazyset or a lazyset.
2014-03-14 10:22:29 -07:00
Lucas Moscovicz
73a28cdab8 revset: added ascending and descending methods to _addset
These methods are intended to duck-type baseset, so we will still have _addset
as a private class but will be able to return it without wrapping it into an
orderedlazyset or a lazyset.
2014-03-14 10:21:56 -07:00
Lucas Moscovicz
e603775f94 revset: added filter method to _addset
This method is intended to duck-type baseset, so we will still have _addset
as a private class but we will be able to return it without wrapping it into an
orderedlazyset or a lazyset.
2014-03-13 19:12:36 -07:00
Lucas Moscovicz
8adde22fdb revset: added comments to all methods needed to duck-type from baseset
All these methods are required to duck-type for any class that works as a smart
set.
2014-03-14 09:18:14 -07:00
Lucas Moscovicz
23a1573060 revset: use more explicit argument names for baseset methods
Use other instead of x and condition instead of l
2014-03-14 10:10:18 -07:00
Lucas Moscovicz
1e32ffe0b9 revset: added isascending and isdescending methods to smartset classes
These methods state whether the class is sorted in ascending or descending order.

We need this to implement methods based on order on smartset classes in order
to be able to create new objects with a given order.

We cannot just rely on a simple boolean since unordered sets are neither
ascending nor descending.
2014-03-11 17:09:23 -07:00
Lucas Moscovicz
8004a27cbf revset: added sort method in addset
We need this method to duck-type generatorset since this class is not going to
be used outside revset.py and we don't need to duck-type baseset.

This sort method will only do something when the addset is not already sorted
or is not sorted in the way we want it to be.
2014-03-11 17:03:43 -07:00
Lucas Moscovicz
62c7bb9117 revset: added reverse method to addset
This method is needed to duck type generatorset.
2014-03-13 18:57:30 -07:00
Lucas Moscovicz
e5f79a44be revset: changed _iterator() method on addset to work with a given order
If the two collections are in ascending order, yield their values in an
ordered way by iterating both at the same time and picking the values to
yield.
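
The ordered iteration described above amounts to a lazy merge of two ascending sequences. A minimal sketch, assuming ascending inputs and yielding shared values once; `merge_ascending` is a hypothetical helper, not the `_iterator()` code.

```python
def merge_ascending(left, right):
    """Lazily yield the union of two ascending iterables, in order."""
    it1, it2 = iter(left), iter(right)
    done = object()                    # sentinel: iterator exhausted
    a, b = next(it1, done), next(it2, done)
    while a is not done or b is not done:
        if b is done or (a is not done and a < b):
            yield a
            a = next(it1, done)
        elif a is done or b < a:
            yield b
            b = next(it2, done)
        else:
            # Same value on both sides: yield it once, advance both.
            yield a
            a, b = next(it1, done), next(it2, done)
```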
2014-03-13 13:29:04 -07:00
Lucas Moscovicz
f2b80f803a revset: changed _iterator() in addset to use the generated list when available
Now when all the elements have been generated, the iterator will just use the
generated list instead of going through all the elements again.
2014-03-13 14:51:04 -07:00
Lucas Moscovicz
1a5e79571b revset: added cached generated list to addset
This way when all the values have been generated the list can be sorted
without having to generate them all again.
2014-03-11 16:59:42 -07:00
Lucas Moscovicz
67d42c8519 revset: changed sort method to use native sort implementation of smartsets
When sort is done by revision or reversed revision number it can just call
sort on the set and doesn't have to iterate it all over again.
2014-03-13 17:15:21 -07:00
Lucas Moscovicz
6da2a5ca22 revset: fixed sorting issue with spanset
When a spanset was being sorted, it didn't take into account its current
state (ascending or descending) and it reversed itself every time the reverse
parameter was True.

This is not yet used but it will be as soon as the sort revset is changed to
directly use the structures sort method.
2014-03-13 17:16:58 -07:00
Lucas Moscovicz
edbf51ad8f revset: added __nonzero__ method to spanset class
Implemented it in a lazy way, just look for the first non-filtered revision
and return True if there's any revision at all.
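
A minimal sketch of such a lazy truth test. `FilteredSpan` is a hypothetical class; in Python 3 the hook is named `__bool__` rather than `__nonzero__`.

```python
class FilteredSpan:
    """Hypothetical range of revisions with some of them hidden."""

    def __init__(self, start, end, hidden=()):
        self._range = range(start, end)
        self._hidden = frozenset(hidden)

    def __bool__(self):          # spelled __nonzero__ in Python 2
        # any() stops at the first visible revision, so the whole
        # range is scanned only when every revision is hidden.
        return any(rev not in self._hidden for rev in self._range)
```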
2014-03-14 09:07:59 -07:00
Lucas Moscovicz
04e9ab0a9b revset: optimized sort method in lazyset class
We are taking advantage of the smartset classes sort method when it exists and
converting the set to a baseset otherwise.
2014-03-06 09:41:47 -08:00
Durham Goode
f906635299 revset: improve head revset performance
Previously the head() revset would iterate over every item in the subset and
check if it was a head.  Since the subset is often the entire repo, this was
slow on large repos. Now we iterate over each item in the head list and check if
it's in the subset, which results in much less work.

hg log -r 'head()' on a large repo:
Before: 0.95s
After: 0.28s
2014-03-13 13:47:21 -07:00
Lucas Moscovicz
36702af6f9 revset: added ascending attribute to addset class
In case both collections are in an ascending/descending order then we will be
able to iterate them lazily while keeping the order.
2014-03-11 16:52:15 -07:00
Lucas Moscovicz
7da57d658a revset: added set method to addset to duck type generatorset
Since this class is only going to be used inside revset.py (it does not duck
type baseset) it needs to duck type only a few more methods for the next
patches.
2014-03-10 10:49:04 -07:00
Matt Mackall
064d711c69 revsets: backout f497f83593d8 due to performance regressions 2014-03-13 14:34:32 -05:00
Lucas Moscovicz
efb319fb76 revset: made addset a private class
This class is not supposed to be used outside revset.py since it only
wraps content that is used by baseset typed classes.

It only gets created by revset operations or private methods.
2014-03-12 17:20:26 -07:00
Lucas Moscovicz
a50567190d revset: made descgeneratorset a private class
This class is not supposed to be used outside revset.py since it only
wraps content that is used by baseset typed classes.

It only gets created by revset operations or private methods.
2014-03-12 17:19:46 -07:00
Lucas Moscovicz
e6b1cc49b9 revset: made ascgeneratorset a private class
This class is not supposed to be used outside revset.py since it only
wraps content that is used by baseset typed classes.

It only gets created by revset operations or private methods.
2014-03-12 17:18:54 -07:00
Lucas Moscovicz
7e33d4c691 revset: made generatorset a private class
This class is not supposed to be used outside revset.py since it only
wraps content that is used by baseset typed classes.

It only gets created by revset operations or private methods.
2014-03-12 17:07:38 -07:00
Lucas Moscovicz
2c2b32444e revset: added sort methods to generatorsets
Method needed to propagate sort calls amongst lazy structures.
The generated list (stored in the object) is sorted.

If the generated list did not contain all elements from the generator, we
take care of that before sorting the list.
2014-02-24 16:36:17 -08:00
Lucas Moscovicz
9db776218f revset: changed __add__ methods on lazy sets to return addsets (issue4191)
Performance Benchmarking:

$ hg --time log --graph --style compact --limit 6 -r 'sort((::. or bookmark()
or heads(public())), "-rev")'
time: real 1.540 secs (user 1.510+0.000 sys 0.020+0.000)

$ ./hg --time log --graph --style compact --limit 6 -r 'sort((::. or
bookmark() or heads(public())), "-rev")'
time: real 1.240 secs (user 1.190+0.000 sys 0.040+0.010)
2014-03-07 14:06:49 -08:00
Lucas Moscovicz
82e3bb4e2c revset: added addset class with its basic methods
This class addresses the problem of losing performance on the __contains__
method when adding two smart structures with fast membership testing.
2014-03-07 13:48:31 -08:00
Lucas Moscovicz
a47624e9b5 revset: changed _children method to use lazy structures 2014-02-11 14:03:43 -08:00
Lucas Moscovicz
c7b531ba29 revset: changed descendants revset to use lazy generators
Performance Benchmarking:

$ time hg log -qr "0:: and 0:5"
...

real  0m3.665s
user  0m3.364s
sys 0m0.289s

$ time ./hg log -qr "0:: and 0:5"
...

real  0m0.492s
user  0m0.394s
sys 0m0.097s
2014-02-10 12:26:45 -08:00
Lucas Moscovicz
ead9ef7efc revset: optimized _revancestors method based on order of revisions
If the revisions for which the ancestors are required are in descending order,
it lazily loads them into a heap to be able to yield values faster.
2014-02-07 13:44:57 -08:00
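A heap-based ancestor walk like the one ead9ef7efc describes might look like the sketch below. It is an assumption-laden illustration, not the real _revancestors: `lazy_ancestors` and the `parents` mapping are hypothetical, parents are assumed to have lower revision numbers than their children, and the heap is seeded with all input revisions at once for simplicity.

```python
import heapq


def lazy_ancestors(parents, revs):
    """Yield revs and their ancestors in descending order, lazily.

    parents is a hypothetical dict mapping a revision number to a tuple
    of its parent revision numbers.
    """
    # heapq is a min-heap, so store negated revision numbers to pop the
    # highest revision first.
    heap = [-rev for rev in set(revs)]
    heapq.heapify(heap)
    seen = set()
    while heap:
        rev = -heapq.heappop(heap)
        if rev in seen:
            continue
        seen.add(rev)
        yield rev
        for parent in parents.get(rev, ()):
            if parent not in seen:
                heapq.heappush(heap, -parent)


# Small hypothetical DAG: 4 merges 2 and 3, which both descend from 1.
parents = {0: (), 1: (0,), 2: (1,), 3: (1,), 4: (2, 3)}
print(list(lazy_ancestors(parents, [4])))
```

Because the function is a generator, a caller that only needs the first few ancestors never pays for walking the rest of the graph.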
Lucas Moscovicz
8456e8d71a revset: changed ancestors revset to return lazy generators
This will not improve revsets like "::tip" on their own, but it will help when
such a revset is combined with another revset by intersection or subtraction.

Performance Benchmarking:

$ time hg log -qr "draft() and ::tip"
...

real  0m3.961s
user  0m3.640s
sys 0m0.313s

$ time ./hg log -qr "draft() and ::tip"
...

real  0m1.080s
user  0m0.987s
sys 0m0.083s
2014-02-07 10:32:02 -08:00
Lucas Moscovicz
4a4c0782a7 revset: changed methods in spanset to return ordered sets
Now __sub__ and __and__ can smartly return ordered lazysets.
2014-02-18 13:07:08 -08:00
Lucas Moscovicz
82070b57ca revset: added sort method to orderedlazyset 2014-02-25 10:36:23 -08:00
Lucas Moscovicz
9ade74466e revset: added order methods to lazyset classes
This will allow revsets to ask for an ordered set when possible to be able to
work lazily with it.
2014-02-07 08:44:18 -08:00
Lucas Moscovicz
e87d3ac6ae revset: added ordered generatorset classes with __contains__ method
They stop iterating as soon as they go past the value they are looking for,
so, for values not in the generator they return faster.
2014-02-27 17:27:03 -08:00
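The early-exit membership test from e87d3ac6ae can be illustrated with a small sketch. The class `AscendingGenSketch` is hypothetical and assumes the wrapped generator yields values in strictly ascending order; the real classes in revset.py differ in detail.

```python
class AscendingGenSketch:
    """Hypothetical wrapper around an ascending generator."""

    def __init__(self, gen):
        self._gen = gen
        self._seen = []  # values consumed so far, in ascending order

    def __contains__(self, x):
        # Anything at or below the last consumed value has already been
        # produced if it exists at all.
        if self._seen and x <= self._seen[-1]:
            return x in self._seen
        for value in self._gen:
            self._seen.append(value)
            if value == x:
                return True
            if value > x:
                # Went past x: it cannot appear later in ascending order.
                return False
        return False


s = AscendingGenSketch(iter([1, 3, 5, 7, 9]))
print(4 in s, s._seen)
```

Looking up a missing value consumes the generator only up to the first element greater than it, rather than exhausting it.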
Lucas Moscovicz
d710c5103e revset: changed generatorset code to remove unnecessary function call
Removed _nextitem() method, now __iter__ has that logic and __contains__ uses
__iter__ to check for membership.
2014-03-03 12:54:46 -08:00
Durham Goode
f2e7078d1a revset: add 'only' revset
Adds an only() revset that has two forms:

only(<set>) is equivalent to "::<set> - ::(heads() - heads(<set>::))"

only(<include>,<exclude>) is equivalent to "::<include> - ::<exclude>"

On a large repo, this implementation can process/traverse 50,000 revs in 0.7
seconds, versus 4.2 seconds using "::<include> - ::<exclude>".

This is useful for performing histedits on your branch:
hg histedit -r 'first(only(.))'

Or lifting branch foo off of branch bar:
hg rebase -d @ -s 'only(foo, bar)'

Or a variety of other uses.
2013-11-16 08:57:08 -08:00
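The two-argument equivalence stated in f2e7078d1a, only(<include>,<exclude>) = "::<include> - ::<exclude>", can be spelled out with plain sets. This sketch shows the naive definition only, not the faster traversal the commit implements; `ancestors`, `only` and the `parents` mapping are hypothetical names.

```python
def ancestors(parents, revs):
    """Return revs plus all of their ancestors (the '::revs' set)."""
    seen = set()
    stack = list(revs)
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents.get(rev, ()))
    return seen


def only(parents, include, exclude):
    """Naive '::include - ::exclude' as a set difference."""
    return ancestors(parents, include) - ancestors(parents, exclude)


# Hypothetical DAG: 2 and 3 both branch off 1; 4 continues from 3.
parents = {0: (), 1: (0,), 2: (1,), 3: (1,), 4: (3,)}
print(sorted(only(parents, [4], [2])))
```

The naive form walks both ancestor sets in full, which is the cost the dedicated only() implementation is meant to avoid.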
Lucas Moscovicz
fbd41fa2f4 revset: added basic operators to orderedlazyset
Now __and__ and __sub__ return orderedlazyset.
2014-02-06 17:42:08 -08:00
Lucas Moscovicz
66c7004b8c revset: changed revset code to use filter method
Revset methods now use the filter code to apply a condition.
2014-02-06 09:28:41 -08:00
Lucas Moscovicz
e70d62798f revset: added filter method to revset classes
This method will replace the creation of lazysets inside the revset methods.
Instead, the classes that handle lazy structures will create them based on
their current order.
2014-02-06 17:18:11 -08:00
Lucas Moscovicz
2e8fa99be8 revset: added orderedlazyset class 2014-02-05 15:24:08 -08:00
Lucas Moscovicz
5d74d40ff4 revset: changed spanset __add__ implementation to work lazily
$ time hg log -qr "first(0:tip or draft())"
...

real  0m1.032s
user  0m0.841s
sys 0m0.179s

$ time ./hg log -qr "first(0:tip or draft())"
...

real  0m0.378s
user  0m0.291s
sys 0m0.085s
2014-02-13 09:18:16 -08:00
Lucas Moscovicz
3c92337f5c revset: changed lazyset __add__ implementation to work lazily
Performance Benchmarking:

$ time hg log -qr "first(author(mpm) or branch(default))"
0:3a6a38229d41

real  0m3.875s
user  0m3.818s
sys 0m0.051s

$ time ./hg log -qr "first(author(mpm) or branch(default))"
0:3a6a38229d41

real  0m0.213s
user  0m0.174s
sys 0m0.038s
2014-02-13 09:00:25 -08:00
Lucas Moscovicz
df2d5bd9ed revset: added _hexlist method to replace _list for %ln
Now the %ln expression goes through _hexlist and doesn't do any unnecessary
processing anymore.
2014-02-26 17:15:55 -08:00
Lucas Moscovicz
d6c47fdc4c revset: added _intlist method to replace _list for %ld
Now the %ld expression goes through _intlist and doesn't do any unnecessary
processing anymore.
2014-02-26 12:36:36 -08:00
Lucas Moscovicz
59ef037f26 revset: added __nonzero__ method to lazyset
Now it doesn't have to go through the whole set and can return lazily as soon
as it finds one element.
2014-02-20 10:15:38 -08:00
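The lazy truth test from 59ef037f26 amounts to stopping at the first element. A minimal sketch, assuming a hypothetical `LazyFilterSketch` class that filters a subset with a predicate (Python 3 spells the hook `__bool__`; `__nonzero__` is the Python 2 name used in the commit):

```python
class LazyFilterSketch:
    """Hypothetical lazily filtered set."""

    def __init__(self, subset, condition):
        self._subset = subset
        self._condition = condition

    def __iter__(self):
        for rev in self._subset:
            if self._condition(rev):
                yield rev

    def __bool__(self):
        # Stop at the first matching element instead of building the
        # whole filtered result.
        for _ in self:
            return True
        return False

    __nonzero__ = __bool__  # Python 2 spelling


checked = []


def is_multiple_of_three(rev):
    checked.append(rev)
    return rev % 3 == 0


s = LazyFilterSketch(range(1, 1000), is_multiple_of_three)
print(bool(s), checked)
```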
Lucas Moscovicz
b6ae0f9720 revset: added cached generated list on generatorset
This allows the generatorset to be iterated more than once.
2014-02-12 18:45:49 -08:00
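Caching generated values, as b6ae0f9720 describes, could be sketched like this. `CachedGenSketch` is a hypothetical name and the real generatorset is organised differently; the point is only that values pulled from the generator are stored so later iterations can replay them.

```python
class CachedGenSketch:
    """Hypothetical re-iterable wrapper around a one-shot generator."""

    def __init__(self, gen):
        self._gen = gen
        self._cache = []  # every value pulled from the generator so far

    def __iter__(self):
        i = 0
        while True:
            if i < len(self._cache):
                # Replay values that some iteration already produced.
                yield self._cache[i]
                i += 1
                continue
            try:
                value = next(self._gen)
            except StopIteration:
                return
            # Store the new value; the next loop pass yields it.
            self._cache.append(value)


s = CachedGenSketch(iter([5, 6, 7]))
print(list(s), list(s))
```

Reading through the shared cache by index also keeps two partially consumed iterations consistent with each other.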
Lucas Moscovicz
e97b3cbf9e revset: fixed bug where log -f was taking too long to return 2014-02-21 13:16:17 -08:00
Lucas Moscovicz
bee2c6a618 revset: added generatorset class with cached __contains__ method 2014-02-05 15:23:11 -08:00
Lucas Moscovicz
cf5a2af3df revset: changed last implementation to use lazy classes
Instead of using getitem just reverse the revision list and get the first
'lim' elements. With classes like spanset which are easily reversible this
will work faster.

Performance Benchmarking:

$ time hg log -qr "last(all())"
...

real  0m0.569s
user  0m0.447s
sys 0m0.122s

$ time ./hg log -qr "last(all())"
...

real  0m0.215s
user  0m0.150s
sys 0m0.063s
2014-02-19 12:56:41 -08:00
Lucas Moscovicz
131799c6de revset: changed mfunc and getset to work with old style revset methods
Now extensions shouldn't break when adding new revsets.
2014-02-18 15:54:46 -08:00
Lucas Moscovicz
ba7cfe4ed4 revset: changed revsets to use spanset
Performance Benchmarking:

$ hg perfrevset "first(all())"
! wall 0.304936 comb 0.300000 user 0.280000 sys 0.020000 (best of 33)

$ ./hg perfrevset "first(all())"
! wall 0.175640 comb 0.180000 user 0.160000 sys 0.020000 (best of 56)
2014-02-03 10:15:15 -08:00
Lucas Moscovicz
6e0e0d512b revset: changed spanset to take a repo argument
This way, we can have by default the length of the repo as the end argument
and less code has to be aware of hidden revisions.
2014-02-18 11:38:03 -08:00
Lucas Moscovicz
4961b39b44 revset: changed spanset implementation to take hidden revisions into account
Hidden revisions are now excluded from the spanset.
Now this doesn't break for people using changeset evolution.
2014-02-10 17:38:43 -08:00
Lucas Moscovicz
3d8c71a8cc revset: added cache to lazysets
This allows __contains__ to return faster when asked for same value twice.
2014-02-04 15:31:57 -08:00
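The membership cache from 3d8c71a8cc can be pictured as a memoised __contains__. A minimal sketch under assumptions: `LazySetSketch`, its `subset` and `condition` arguments, and the dict-based cache are illustrative choices, not the actual lazyset code.

```python
class LazySetSketch:
    """Hypothetical lazily filtered set with a membership cache."""

    def __init__(self, subset, condition):
        self._subset = subset
        self._condition = condition
        self._cache = {}  # rev -> result of an earlier membership test

    def __contains__(self, rev):
        if rev not in self._cache:
            # Evaluate the (possibly expensive) condition only once.
            self._cache[rev] = rev in self._subset and self._condition(rev)
        return self._cache[rev]


calls = []


def expensive_check(rev):
    calls.append(rev)
    return rev % 2 == 0


s = LazySetSketch(range(10), expensive_check)
print(4 in s, 4 in s, len(calls))
```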
Siddharth Agarwal
2c51509bb0 revset: optimize missing ancestor expressions
A missing ancestor expression is any expression of the form (::x - ::y) or
equivalent. Such expressions are remarkably common, and so far have involved
multiple walks down the DAG, followed by a set difference operation.

With this patch, such expressions will be transformed into uses of the fast
algorithm at ancestor.missingancestor.

For a repository with over 600,000 revisions, perfrevset for '::tip - ::-10000'
returns:

Before: ! wall 3.999575 comb 4.000000 user 3.910000 sys 0.090000 (best of 3)
After:  ! wall 0.132423 comb 0.130000 user 0.130000 sys 0.000000 (best of 75)
2014-02-13 14:04:47 -08:00