Instead of using getitem just reverse the revision list and get the first
'lim' elements. With classes like spanset which are easily reversible this
will work faster.
Performance Benchmarking:
$ time hg log -qr "last(all())"
...
real 0m0.569s
user 0m0.447s
sys 0m0.122s
$ time ./hg log -qr "last(all())"
...
real 0m0.215s
user 0m0.150s
sys 0m0.063s
A missing ancestor expression is any expression of the form (::x - ::y) or
equivalent. Such expressions are remarkably common, and so far have involved
multiple walks down the DAG, followed by a set difference operation.
With this patch, such expressions will be transformed into uses of the fast
algorithm at ancestor.missingancestor.
For a repository with over 600,000 revisions, perfrevset for '::tip - ::-10000'
returns:
Before: ! wall 3.999575 comb 4.000000 user 3.910000 sys 0.090000 (best of 3)
After: ! wall 0.132423 comb 0.130000 user 0.130000 sys 0.000000 (best of 75)
Performance Benchmarking:
$ time hg log -qr "first(matching(0))"
0:3a6a38229d41
real 0m2.213s
user 0m2.149s
sys 0m0.055s
$ time ./hg log -qr "first(matching(0))"
0:3a6a38229d41
real 0m0.177s
user 0m0.137s
sys 0m0.038s
Performance Benchmarking:
$ time hg log -qr "first(file(README))"
0:3a6a38229d41
real 0m2.234s
user 0m2.180s
sys 0m0.044s
$ time ./hg log -qr "first(file(README))"
0:3a6a38229d41
real 0m0.172s
user 0m0.129s
sys 0m0.042s
This improves the performance of the revsets 'adds' 'modifies' and 'removes'
Performance benchmarking:
$ time hg log -qr "first(adds(README))"
0:3a6a38229d41
real 0m2.279s
user 0m2.222s
sys 0m0.053s
$ time ./hg log -qr "first(adds(README))"
0:3a6a38229d41
real 0m0.172s
user 0m0.131s
sys 0m0.041s
$ time hg log -qr "first(modifies(README))"
1:692bee203f23
real 0m2.292s
user 0m2.227s
sys 0m0.061s
$ time ./hg log -qr "first(modifies(README))"
1:692bee203f23
real 0m0.178s
user 0m0.130s
sys 0m0.038s
$ time hg log -qr "first(removes(README))"
2379:f24de2acd560
real 0m2.297s
user 0m2.235s
sys 0m0.058s
$ time ./hg log -qr "first(removes(README))"
2379:f24de2acd560
real 0m0.975s
user 0m0.797s
sys 0m0.056s
Performance Benchmarking:
$ time hg log -qr "first(public())"
...
real 0m1.184s
user 0m1.051s
sys 0m0.130s
$ time ./hg log -qr "first(public())"
...
real 0m0.548s
user 0m0.427s
sys 0m0.118s
Performance benchmarking:
$ time hg log -qr "first(merge())"
102:1634643e0cd8
real 0m0.276s
user 0m0.208s
sys 0m0.047s
$ time ./hg log -qr "first(merge())"
102:1634643e0cd8
real 0m0.192s
user 0m0.154s
sys 0m0.027s
Performance benchmarking:
$ time hg log -qr "first(grep(hg))"
0:3a6a38229d41
real 0m2.214s
user 0m2.163s
sys 0m0.045s
$ time ./hg log -qr "first(grep(hg))"
0:3a6a38229d41
real 0m0.211s
user 0m0.146s
sys 0m0.035s
Performance benchmarking:
$ time hg log -qr "first(desc(hg))"
changeset: 0:3a6a38229d41
real 0m2.210s
user 0m2.158s
sys 0m0.049s
$ time ./hg log -qr "first(desc(hg))"
changeset: 0:3a6a38229d41
real 0m0.171s
user 0m0.131s
sys 0m0.035s
Performance Benchmarking:
$ time hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41
real 0m3.157s
user 0m2.994s
sys 0m0.087s
$ time ./hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41
real 0m0.509s
user 0m0.289s
sys 0m0.070s
Performance benchmarking:
$ time hg log -qr "first(author(mpm))"
0:3a6a38229d41
real 0m3.486s
user 0m3.317s
sys 0m0.077s
$ time ./hg log -qr "first(author(mpm))"
0:3a6a38229d41
real 0m0.551s
user 0m0.295s
sys 0m0.072s
Performance benchmarking:
$ time hg log -qr "first(keyword(changeset))"
0:3a6a38229d41
real 0m3.466s
user 0m3.345s
sys 0m0.072s
$ time ./hg log -qr "first(keyword(changeset))"
0:3a6a38229d41
real 0m0.365s
user 0m0.199s
sys 0m0.083s
Performance benchmarking:
$ time hg log -qr "first(branch(default))"
0:3a6a38229d41
real 0m3.130s
user 0m3.025s
sys 0m0.074s
$ time ./hg log -qr "first(branch(default))"
0:3a6a38229d41
real 0m0.300s
user 0m0.198s
sys 0m0.069s
Performance Benchmarking:
$ time hg log -l1 -qr "branch(default)"
0:3a6a38229d41
real 0m3.366s
user 0m3.217s
sys 0m0.095s
$ time ./hg log -l1 -qr "branch(default)"
0:3a6a38229d41
real 0m0.389s
user 0m0.199s
sys 0m0.061s
This class allows us to return values from large revsets as soon as they are
computed instead of having to wait for the entire revset to be calculated.
Added set caching to the baseset class. It lazily builds the set whenever it's
needed and keeps a reference which is returned when the set is requested
instead of being built again.
Before this patch, online help of "adds()", "contains()", "filelog()",
"file()", "modifies()" and "removes()" predicates doesn't explain
about how the pattern without explicit kind like "glob:" is treated,
even though each predicates treat it differently:
- as "relpath:" by "adds()", "modifies()" and "removes()"
- as "glob:" by "file()"
- as special by "contains()" and "filelog()"
- be relative to cwd, and
- match against a file exactly
("relpath:" matches also against a directory)
This may confuse users.
This patch adds explanation about the pattern without explicit kind
to these predicates.
Before this patch, revset predicate "filelog()" uses "match.files()"
to get filename also for the pattern without explicit kind.
But in such case, only canonicalization of relative path is required,
and other initializations of "match" object including regexp
compilation are meaningless.
This patch uses "pathutil.canonpath()" directly for "filelog()"
pattern without explicit kind like "glob:", for efficiency.
This patch also does below as a part of introducing "canonpath()":
- move location of "matchmod.match()" invocation, because "m" is no
more used in "if not matchmod.patkind(pat)" code path
- omit passing "default" argument to "matchmod.match()", because
"pat" should have explicit kind of pattern in this code path
This patch avoids the loop for "match.files()" having always one
element in revset predicate "filelog()" for efficiency: "match" object
"m" is constructed with "[pat]" as "patterns" argument.
Before this patch, default kind of pattern for revset predicate
"contains()" is treated as the exact file path rooted at the root of
the repository. This decreases usability, because:
- all other predicates taking pattern argument (also "filelog()")
treat such pattern as the path rooted at the current working
directory
- "contains()" doesn't describe this difference in its help
- this difference may confuse users
for example, this prevents revset aliases from sharing same
argument between "contains()" and other predicates
This patch makes default kind of pattern for revset predicate
"contains()" be rooted at the current working directory.
This patch uses "pathutil.canonpath()" instead of creating "match"
object for efficiency.