255 Commits

Author SHA1 Message Date
Lucas Moscovicz
082b56b0fb revset: added lazyset implementation to contains revset 2014-02-04 15:07:03 -08:00
Lucas Moscovicz
7a6459e30d revset: added lazyset implementation to secret revset 2014-02-04 09:29:19 -08:00
Lucas Moscovicz
c27606d6ba revset: added lazyset implementation to matching revset
Performance Benchmarking:

$ time hg log -qr "first(matching(0))"
0:3a6a38229d41

real  0m2.213s
user  0m2.149s
sys 0m0.055s

$ time ./hg log -qr "first(matching(0))"
0:3a6a38229d41

real  0m0.177s
user  0m0.137s
sys 0m0.038s
2014-02-04 09:14:45 -08:00
Lucas Moscovicz
2f1f581a8c revset: added lazyset implementation to _matchfiles
Performance Benchmarking:

$ time hg log -qr "first(file(README))"
0:3a6a38229d41

real  0m2.234s
user  0m2.180s
sys 0m0.044s

$ time ./hg log -qr "first(file(README))"
0:3a6a38229d41

real  0m0.172s
user  0m0.129s
sys 0m0.042s
2014-02-04 08:51:07 -08:00
Lucas Moscovicz
3bfbd34602 revset: added lazyset implementation to checkstatus
This improves the performance of the revsets 'adds' 'modifies' and 'removes'

Performance benchmarking:

$ time hg log -qr "first(adds(README))"
0:3a6a38229d41

real  0m2.279s
user  0m2.222s
sys 0m0.053s

$ time ./hg log -qr "first(adds(README))"
0:3a6a38229d41

real  0m0.172s
user  0m0.131s
sys 0m0.041s

$ time hg log -qr "first(modifies(README))"
1:692bee203f23

real  0m2.292s
user  0m2.227s
sys 0m0.061s

$ time ./hg log -qr "first(modifies(README))"
1:692bee203f23

real  0m0.178s
user  0m0.130s
sys 0m0.038s

$ time hg log -qr "first(removes(README))"
2379:f24de2acd560

real  0m2.297s
user  0m2.235s
sys 0m0.058s

$ time ./hg log -qr "first(removes(README))"
2379:f24de2acd560

real  0m0.975s
user  0m0.797s
sys 0m0.056s
2014-01-31 10:47:51 -08:00
Lucas Moscovicz
8cb1ccfe44 revset: added lazyset implementation to public revset
Performance Benchmarking:

$ time hg log -qr "first(public())"
...

real  0m1.184s
user  0m1.051s
sys 0m0.130s

$ time ./hg log -qr "first(public())"
...

real  0m0.548s
user  0m0.427s
sys 0m0.118s
2014-01-30 17:46:08 -08:00
Lucas Moscovicz
69d62676c5 revset: added lazyset implementation to merge revset
Performance benchmarking:

$ time hg log -qr "first(merge())"
102:1634643e0cd8

real  0m0.276s
user  0m0.208s
sys 0m0.047s

$ time ./hg log -qr "first(merge())"
102:1634643e0cd8

real  0m0.192s
user  0m0.154s
sys 0m0.027s
2014-01-30 16:47:29 -08:00
Lucas Moscovicz
f18b7c26e0 revset: added lazyset implementation to grep revset
Performance benchmarking:

$ time hg log -qr "first(grep(hg))"
0:3a6a38229d41

real  0m2.214s
user  0m2.163s
sys 0m0.045s

$ time ./hg log -qr "first(grep(hg))"
0:3a6a38229d41

real  0m0.211s
user  0m0.146s
sys 0m0.035s
2014-01-30 16:03:18 -08:00
Lucas Moscovicz
ab42b4bbc5 revset: added lazyset implementation to desc revset
Performance benchmarking:

$ time hg log -qr "first(desc(hg))"
changeset:   0:3a6a38229d41

real  0m2.210s
user  0m2.158s
sys 0m0.049s

$ time ./hg log -qr "first(desc(hg))"
changeset:   0:3a6a38229d41

real  0m0.171s
user  0m0.131s
sys 0m0.035s
2014-01-30 15:39:56 -08:00
Lucas Moscovicz
6ba591c77e revset: added lazyset implementation to draft revset 2014-02-03 16:15:25 -08:00
Lucas Moscovicz
57e67b73a7 revset: added lazyset implementation to bookmark revset 2014-01-29 15:23:16 -08:00
Lucas Moscovicz
28688b6d54 revset: added lazyset implementation to date revset
Performance Benchmarking:

$ time hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41

real  0m3.157s
user  0m2.994s
sys 0m0.087s

$ time ./hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41

real  0m0.509s
user  0m0.289s
sys 0m0.070s
2014-02-03 16:02:48 -08:00
Lucas Moscovicz
d80d16c98c revset: added lazyset implementation to author revset
Performance benchmarking:

$ time hg log -qr "first(author(mpm))"
0:3a6a38229d41

real  0m3.486s
user  0m3.317s
sys 0m0.077s

$ time ./hg log -qr "first(author(mpm))"
0:3a6a38229d41

real  0m0.551s
user  0m0.295s
sys 0m0.072s
2014-01-29 09:22:31 -08:00
Lucas Moscovicz
fe40abc599 revset: added lazyset implementation to keyword revset
Performance benchmarking:

$ time hg log -qr "first(keyword(changeset))"
0:3a6a38229d41

real  0m3.466s
user  0m3.345s
sys 0m0.072s

$ time ./hg log -qr "first(keyword(changeset))"
0:3a6a38229d41

real  0m0.365s
user  0m0.199s
sys 0m0.083s
2014-01-29 09:04:03 -08:00
Lucas Moscovicz
303bd9d554 revset: changed limit revset implementation to work with lazy revsets
Performance benchmarking:

$ time hg log -qr "first(branch(default))"
0:3a6a38229d41

real  0m3.130s
user  0m3.025s
sys 0m0.074s

$ time ./hg log -qr "first(branch(default))"
0:3a6a38229d41

real  0m0.300s
user  0m0.198s
sys 0m0.069s
2014-01-28 16:19:30 -08:00
Lucas Moscovicz
7a6ce89407 revset: added lazyset implementation to branch revset
Performance Benchmarking:

$ time hg log -l1 -qr "branch(default)"
0:3a6a38229d41

real  0m3.366s
user  0m3.217s
sys 0m0.095s

$ time ./hg log -l1 -qr "branch(default)"
0:3a6a38229d41

real  0m0.389s
user  0m0.199s
sys 0m0.061s
2014-02-05 16:12:03 -08:00
Lucas Moscovicz
0ff716dae6 revset: changed getset so that it can return a lazyset
Not converting it manually to a baseset anymore. At this point every revset
method should return a baseset typed structure.
2014-01-28 15:19:14 -08:00
Lucas Moscovicz
fb15b99ea8 revset: added operations to duck type baseset
Added more operations which are not lazy but only used so far to duck type
baseset.

Their implementations will be changed in future patches.
2014-02-06 14:29:37 -08:00
Lucas Moscovicz
dd14a88eaa revset: added basic operations to lazyset
Added methods __add__, __sub__ and __and__ to duck type more methods in
baseset
2014-02-06 14:25:37 -08:00
Lucas Moscovicz
939eba25eb revset: added lazyset class with basic operations
This class allows us to return values from large revsets as soon as they are
computed instead of having to wait for the entire revset to be calculated.
2014-02-06 14:19:40 -08:00
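A minimal sketch of the idea described in 939eba25eb, assuming a predicate-filtered view over an ordered subset. `LazySet`, `expensive` and `calls` are hypothetical names used for illustration; this is not Mercurial's actual class.

```python
class LazySet:
    """Hypothetical lazy filter over an ordered iterable of revisions."""

    def __init__(self, subset, condition):
        self._subset = subset
        self._condition = condition
        self._cache = {}  # rev -> bool, so each rev is tested at most once

    def __contains__(self, rev):
        # Assumption: membership means "passes the condition"; a fuller
        # version would also check that rev belongs to the subset.
        if rev not in self._cache:
            self._cache[rev] = self._condition(rev)
        return self._cache[rev]

    def __iter__(self):
        # Yield each matching revision as soon as it is found.
        for rev in self._subset:
            if rev in self:
                yield rev


calls = []  # records which revisions the predicate actually inspected


def expensive(rev):
    calls.append(rev)
    return rev % 2 == 0


lazy = LazySet(range(1000000), expensive)
first = next(iter(lazy))  # stops after the first match
```

In this sketch only one revision is ever tested, which is the kind of saving the `first(...)` benchmarks in this log measure.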
Lucas Moscovicz
5ea4d9b527 revset: minor changes adding baseset to revsets
Changed bits of code to work with baseset implementations.
2014-02-06 14:57:25 -08:00
Lucas Moscovicz
c5d089f8a5 revset: minor changes adding baseset to revsets
Changed bits of code to work with baseset implementations.
2014-02-06 14:57:25 -08:00
Lucas Moscovicz
91dcdf4e40 revset: added __add__ method to baseset class 2014-02-06 11:37:16 -08:00
Lucas Moscovicz
03857b594a revset: added docstring to baseset class 2014-02-06 11:33:36 -08:00
Lucas Moscovicz
0c945a320b revset: fixed bug where revset returning order was being changed
Some revsets were unnecessarily turning the subset into a set before iterating
over it. This changed the order of the returned revisions in some cases.
2014-02-07 15:01:33 -08:00
Lucas Moscovicz
3d0344901f revset: added intersection to baseset class
Added the method __and__ to the baseset class to be able to intersect with
other objects in a more efficient way.
2014-01-24 16:57:44 -08:00
Lucas Moscovicz
f1e6aec1ef revset: added subtraction to baseset class
Added __sub__ method to the baseset class to be able to compare it with other
subsets more efficiently.
2014-01-23 14:20:58 -08:00
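The two operator commits above (3d0344901f and f1e6aec1ef) can be illustrated with a small sketch. It assumes a list subclass that converts the other operand to a set once and then filters in order. `OrderedRevs` is a hypothetical name, not the real baseset.

```python
class OrderedRevs(list):
    """Hypothetical ordered collection with set-like operators."""

    def __sub__(self, other):
        exclude = set(other)  # one pass over other, then O(1) lookups
        return OrderedRevs(r for r in self if r not in exclude)

    def __and__(self, other):
        keep = set(other)
        return OrderedRevs(r for r in self if r in keep)


revs = OrderedRevs([5, 3, 1, 4])
difference = revs - [3, 4]        # keeps the order of revs
intersection = revs & [4, 5, 9]   # keeps the order of revs
```

Filtering the left operand, rather than using plain `set` arithmetic, is one way to keep the original ordering of the revisions.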
Lucas Moscovicz
e0e3b9efa2 revset: implemented set caching for revset evaluation
Added set caching to the baseset class. It lazily builds the set whenever it's
needed and keeps a reference which is returned when the set is requested
instead of being built again.
2014-01-22 10:46:02 -08:00
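The caching described in e0e3b9efa2 might look roughly like the sketch below, which assumes the list is not mutated once its set form has been requested. `CachingList` is a hypothetical name.

```python
class CachingList(list):
    """Hypothetical list that memoizes its set form."""

    def __init__(self, data=()):
        super().__init__(data)
        self._set = None

    def set(self):
        # Build the set on first request, then return the same object.
        # Assumption: the list is not mutated after the set is cached.
        if self._set is None:
            self._set = set(self)
        return self._set


revs = CachingList([3, 1, 2])
first_call = revs.set()
second_call = revs.set()  # no second set() construction
```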
Lucas Moscovicz
ef8bd69f5f revset: added baseset class (still empty) to improve revset performance
This class is going to be used to cache the set that is created from this list
in many cases while evaluating a revset.
2014-01-21 11:39:26 -08:00
FUJIWARA Katsunori
9baf47cf1a revset: add explanation about the pattern without explicit kind
Before this patch, the online help of the "adds()", "contains()", "filelog()",
"file()", "modifies()" and "removes()" predicates doesn't explain
how a pattern without an explicit kind like "glob:" is treated,
even though each predicate treats it differently:

  - as "relpath:" by "adds()", "modifies()" and "removes()"

  - as "glob:" by "file()"

  - as special by "contains()" and "filelog()"
    - be relative to cwd, and
    - match against a file exactly
      ("relpath:" matches also against a directory)

This may confuse users.

This patch adds explanation about the pattern without explicit kind
to these predicates.
2014-01-17 23:55:11 +09:00
FUJIWARA Katsunori
92ff577d38 revset: use "canonpath()" for "filelog()" pattern without explicit kind
Before this patch, revset predicate "filelog()" uses "match.files()"
to get filename also for the pattern without explicit kind.

But in such case, only canonicalization of relative path is required,
and other initializations of "match" object including regexp
compilation are meaningless.

This patch uses "pathutil.canonpath()" directly for "filelog()"
pattern without explicit kind like "glob:", for efficiency.

This patch also does below as a part of introducing "canonpath()":

  - move location of "matchmod.match()" invocation, because "m" is no
    longer used in "if not matchmod.patkind(pat)" code path

  - omit passing "default" argument to "matchmod.match()", because
    "pat" should have explicit kind of pattern in this code path
2014-01-17 23:55:03 +09:00
FUJIWARA Katsunori
4d5d9b1517 revset: avoid loop for "match.files()" having always one element for efficiency
This patch avoids the loop for "match.files()" having always one
element in revset predicate "filelog()" for efficiency: "match" object
"m" is constructed with "[pat]" as "patterns" argument.
2014-01-17 23:42:12 +09:00
FUJIWARA Katsunori
7e4fbdb87b revset: make default kind of pattern for "contains()" rooted at cwd
Before this patch, default kind of pattern for revset predicate
"contains()" is treated as the exact file path rooted at the root of
the repository. This decreases usability, because:

  - all other predicates taking pattern argument (also "filelog()")
    treat such pattern as the path rooted at the current working
    directory

  - "contains()" doesn't describe this difference in its help

  - this difference may confuse users

    for example, this prevents revset aliases from sharing same
    argument between "contains()" and other predicates


This patch makes default kind of pattern for revset predicate
"contains()" be rooted at the current working directory.

This patch uses "pathutil.canonpath()" instead of creating "match"
object for efficiency.
2014-01-17 23:42:12 +09:00
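To illustrate what "rooted at the current working directory" means in 7e4fbdb87b, here is a rough sketch built on the standard `posixpath` module. The `canon` helper and the `/repo` paths are hypothetical; this is not `pathutil.canonpath()` itself, only the general idea of resolving a name against the cwd and expressing it relative to the repository root.

```python
import posixpath


def canon(root, cwd, name):
    """Resolve name against cwd (given relative to root); return a root-relative path."""
    full = posixpath.normpath(posixpath.join(root, cwd, name))
    rel = posixpath.relpath(full, root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("%s is outside %s" % (name, root))
    return rel


from_subdir = canon("/repo", "src", "main.c")     # 'src/main.c'
from_parent = canon("/repo", "src", "../README")  # 'README'
```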
FUJIWARA Katsunori
43338be810 revset: narrow scope of the variable referred only in specific code path
This patch narrows scope of the variable "m" in the function for
revset predicate "contains()", because it is referred only in "else"
code path of "if not matchmod.patkind(pat)" examination.
2014-01-17 23:42:12 +09:00
Yuya Nishihara
cb7a1dd14c fileset, revset: do not use global parser object for thread safety
parse() cannot be called at the same time because a parser object keeps its
states.  This is no problem for command-line hg client, but it would cause
strange errors in multi-threaded hgweb.

Creating parser object is not too expensive.

original:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 11.3 usec per loop

thread-safe:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 13.1 usec per loop
2013-12-21 12:44:19 +09:00
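The thread-safety point in cb7a1dd14c can be sketched as follows, assuming a toy parser whose cursor lives on the instance. `Parser`, `parse`, and `worker` are hypothetical stand-ins, not the Mercurial parser module.

```python
import threading


class Parser:
    """Hypothetical parser that keeps its cursor on the instance."""

    def __init__(self):
        self._tokens = []
        self._pos = 0

    def parse(self, text):
        self._tokens = text.split()
        self._pos = 0
        out = []
        while self._pos < len(self._tokens):
            out.append(self._tokens[self._pos])
            self._pos += 1
        return out


def parse(text):
    # A fresh Parser per call means no parser state is shared between threads.
    return Parser().parse(text)


results = {}


def worker(text):
    results[text] = parse(text)


threads = [threading.Thread(target=worker, args=(t,))
           for t in ("0 :: tip", "a or b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If both threads used one module-level `Parser`, they could overwrite each other's `_tokens` and `_pos` mid-parse. Constructing the object per call trades a small setup cost for isolation, in line with the timings quoted above.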
Alexander Plavin
d519eef8ec revset: add a whitelist of DoS-safe symbols
'Safe' here means that they can't be used for a DoS attack for any given input.
2013-09-06 13:30:56 +04:00
Alexander Plavin
75dc3d55d6 revset: add helper function to get functions used in a revset parse tree
Will be used to determine whether all functions used in a hgweb search query
are allowed there.
2013-08-07 01:21:31 +04:00
Alexander Plavin
573b3e69b6 revset: add helper function to get revset parse tree depth
Will be used to determine if a hgweb search query actually uses
any revset syntax.
2013-08-09 22:52:58 +04:00
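The two helpers introduced in 75dc3d55d6 and 573b3e69b6 could be sketched like this, assuming the parse tree is a nested tuple of the form `('func', ('symbol', name), argument)`. That shape and the names `depth` and `funcs_used` are assumptions made for illustration.

```python
def depth(tree):
    """Nesting depth of a tuple-based tree; non-tuples count as 0."""
    if isinstance(tree, tuple):
        return max((depth(child) for child in tree), default=0) + 1
    return 0


def funcs_used(tree):
    """Collect the names of all 'func' nodes in the tree."""
    if not isinstance(tree, tuple) or not tree:
        return set()
    found = set()
    if tree[0] == "func":
        found.add(tree[1][1])  # assumed layout: ('func', ('symbol', name), ...)
    for child in tree[1:]:
        found |= funcs_used(child)
    return found


# Hand-written tree for something like first(branch(default)).
tree = ("func", ("symbol", "first"),
        ("func", ("symbol", "branch"), ("symbol", "default")))
query_depth = depth(tree)
used = funcs_used(tree)
plain_depth = depth(("symbol", "tip"))
```

A caller could then compare `used` against a whitelist of allowed function names and use the depth to tell a bare symbol from a structured query.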
Alexander Plavin
fd04e86dd0 revset: fix wrong keyword() behaviour for strings with spaces
Some changesets can be wrongly reported as matched by this predicate
due to searching in a string joined with spaces and not individually.
A test case added, which fails without this fix.
2013-08-06 00:52:06 +04:00
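A small sketch of the false positive described in fd04e86dd0, assuming the searched fields are a description, a user name, and a file name. The field values and both function names are made up for illustration.

```python
def matches_joined(keyword, fields):
    # Behaviour described as wrong: search one space-joined string.
    return keyword in " ".join(fields)


def matches_each(keyword, fields):
    # Behaviour described as the fix: search each field individually.
    return any(keyword in field for field in fields)


fields = ["fix parser", "alice", "README"]  # description, user, file (made up)
joined_hit = matches_joined("parser alice", fields)
each_hit = matches_each("parser alice", fields)
```

The keyword `"parser alice"` exists only across the boundary between two fields, so it matches the joined string but none of the individual fields.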
Alexander Plavin
829cf92d16 log: fix behavior with empty repositories (issue3497)
Make output in this special case consistent with general case one.
2013-04-17 00:29:54 +04:00
Kevin Bullock
8d329cb6b6 revset: don't abort when regex to tag() matches nothing (issue3850)
This makes the tag("re:...") revset consistent with branch("re:...").
2013-03-18 16:04:10 -05:00
Paul Cavallaro
4a3134830c revset: change ancestor to accept 0 or more arguments (issue3750)
Change ancestor to accept 0 or more arguments. The greatest common ancestor of a
single changeset is that changeset. If passed no arguments, the empty list is
returned.
2013-01-28 12:19:21 -08:00
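The argument handling described in 4a3134830c can be sketched by intersecting ancestor sets. This is a simplification that assumes revision numbers are topologically ordered, so the highest-numbered common ancestor is treated as the greatest one. The `parents` mapping and both function names are hypothetical.

```python
def ancestors_inclusive(parents, rev):
    """All ancestors of rev, including rev itself."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def ancestor(parents, *revs):
    common = None
    for rev in revs:
        anc = ancestors_inclusive(parents, rev)
        common = anc if common is None else common & anc
    if not common:  # no arguments, or no shared ancestor
        return []
    return [max(common)]


# 0 <- 1 <- 2, and 1 <- 3 (two heads sharing revision 1)
parents = {0: [], 1: [0], 2: [1], 3: [1]}
two_args = ancestor(parents, 2, 3)
one_arg = ancestor(parents, 2)
no_args = ancestor(parents)
```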
Kevin Bullock
921b868783 bookmarks: don't use bookmarks.listbookmarks in local computations
bookmarks.listbookmarks is for wire-protocol use. The normal way to get
all the bookmarks on a local repository is repo._bookmarks.
2013-01-27 14:24:37 -06:00
FUJIWARA Katsunori
1fc2644404 revset: evaluate sub expressions correctly (issue3775)
Before this patch, sub expression may return unexpected result, if it
is joined with another expression by "or":

  - "^"/parentspec():
    "R or R^1" is not equal to "R^1 or R". the former returns only "R".

  - "~"/ancestorspec():
    "R or R~1" is not equal to "R~1 or R". the former returns only "R".

  - ":"/rangeset():
    "10 or (10 or 15):" is not equal to "(10 or 15): or 10". the
    former returns only 10 and 15 or greater (11 to 14 are not
    included).

In "or"-ed expression "A or B", the "subset" passed to evaluation of
"B" doesn't contain revisions gotten from evaluation of "A", for
efficiency.

On the other hand, "stringset()" fails to look up the revision
corresponding to the specified string/symbol if "subset" doesn't contain
that revision.

So, predicates that look revisions up indirectly should evaluate their
own sub expressions not with the passed "subset" but with "entire
revisions in the repository", to prevent "stringset()" from unexpectedly
failing to look up the symbols in them.

But the predicates in the examples above don't do so. For example, in
the case of "R or R^1":

  1. "R^1" is evaluated with "subset" containing revisions other than
     "R", because "R" is already gotten by the former of "or"-ed
     expressions

  2. "parentspec()" evaluates "R" of "R^1" with such "subset"

  3. "stringset()" fails to look "R" up, because "R" is not contained
     in "subset"

  4. so, evaluation of "R^1" returns no revision

This patch evaluates sub expressions for predicates above with "entire
revisions in the repository".
2013-01-23 22:52:55 +09:00
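The failure mode walked through in 1fc2644404 can be reproduced with a toy model. It assumes `stringset` simply checks membership in the subset it is given, and that "or" passes the second operand only the revisions not already produced by the first. All function names and the `parents` map are hypothetical simplifications.

```python
def stringset(subset, rev):
    # Looks a revision up, but only succeeds if subset contains it.
    return [rev] if rev in subset else []


def parentspec_buggy(parents, subset, rev):
    base = stringset(subset, rev)  # sub expression evaluated with subset
    return [p for r in base for p in parents[r] if p in subset]


def parentspec_fixed(parents, allrevs, subset, rev):
    base = stringset(allrevs, rev)  # sub expression evaluated with all revs
    return [p for r in base for p in parents[r] if p in subset]


parents = {0: [], 1: [0], 2: [1], 3: [2]}
allrevs = [0, 1, 2, 3]
R = 3

# "R or R^1": the right-hand side only sees revisions not yet returned.
left = stringset(allrevs, R)
remaining = [r for r in allrevs if r not in left]
buggy = left + parentspec_buggy(parents, remaining, R)
fixed = left + parentspec_fixed(parents, allrevs, remaining, R)
```

In the buggy variant the lookup of `R` inside `R^1` fails because `R` was removed from the subset, so the union contains only `R`. The fixed variant still filters the final parents through the subset.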
Mads Kiilerich
7bc546e4e0 bundlerepo: improve performance for bundle() revset expression
Create the set of revision numbers directly instead of creating a list of nodes
first.
2013-01-16 20:41:34 +01:00
Kevin Bullock
93f9cb7f25 filtering: rename filters to their antonyms
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.

This changes the names of the filters to reflect the revs that _are_
included:

  hidden -> visible
  unserved -> served
  mutable -> immutable
  impactable -> base

repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:

  filterrevs('visible') # filter revs according to what's visible
2013-01-13 01:39:16 -06:00
Pierre-Yves David
10fc5e09ff revset: retrieve hidden from filteredrevs
This prepares for dropping the `repo.hiddenrevs` property.
2013-01-03 18:48:14 +01:00
Pierre-Yves David
f3faf259c5 obsolete: add revset and test for divergent changesets
This changeset adds a new `divergent()` revset similar to the `unstable()` and
`bumped()` ones. Introducing this revset allows actual testing of the divergent
detection.
2012-12-12 03:12:55 +01:00
Siddharth Agarwal
86e87e5ede revset.children: ignore rev numbers that are too low
This replaces unnecessary parentrevs() calls with calculating min(parentset).
Even though the min operation is O(size of parentset), since parentrevs is
relatively expensive, this tradeoff almost always works in our favour. In a
repository with over 400,000 changesets, hg perfrevset "children(X)" takes:

       Set X       Before    After
       -1           0.51s    0.06s
    -1000:          0.55s    0.08s
   -10000:          0.56s    0.10s
  -100000:          0.60s    0.25s
  -100000:-99000    0.55s    0.19s
        0:100000    0.60s    0.61s
      all()         0.72s    0.74s

The relative performance is similar for Mercurial's own repository -- several
times faster in most cases, slightly slower for revisions close to 0 and
all().
2012-12-07 10:37:43 -08:00
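A rough sketch of the optimization described in 86e87e5ede. It relies on the assumption that a child's revision number is always greater than its parents', so any revision at or below `min(parentset)` can be skipped without a parent lookup. The `children` signature, the `parents` map, and the `lookups` counter are hypothetical.

```python
def children(parentrevs, subset, parentset):
    if not parentset:
        return set()
    found = set()
    minrev = min(parentset)  # O(len(parentset)), paid once
    for rev in subset:
        if rev <= minrev:
            continue  # too low to be a child; skip the costly lookup
        if any(p in parentset for p in parentrevs(rev)):
            found.add(rev)
    return found


parents = {0: [], 1: [0], 2: [1], 3: [1], 4: [3]}
lookups = []  # records which revisions needed a parent lookup


def parentrevs(rev):
    lookups.append(rev)
    return parents[rev]


result = children(parentrevs, range(5), {3})
```

This also matches the shape of the table above: when the parent set sits near revision 0, almost nothing is skipped and the extra `min()` becomes pure overhead.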
Matt Mackall
cf4605492a merge with stable 2012-11-28 16:15:05 -06:00