Performance Benchmarking:
$ time hg log -qr "first(matching(0))"
0:3a6a38229d41
real 0m2.213s
user 0m2.149s
sys 0m0.055s
$ time ./hg log -qr "first(matching(0))"
0:3a6a38229d41
real 0m0.177s
user 0m0.137s
sys 0m0.038s
Performance Benchmarking:
$ time hg log -qr "first(file(README))"
0:3a6a38229d41
real 0m2.234s
user 0m2.180s
sys 0m0.044s
$ time ./hg log -qr "first(file(README))"
0:3a6a38229d41
real 0m0.172s
user 0m0.129s
sys 0m0.042s
This improves the performance of the revsets 'adds' 'modifies' and 'removes'
Performance benchmarking:
$ time hg log -qr "first(adds(README))"
0:3a6a38229d41
real 0m2.279s
user 0m2.222s
sys 0m0.053s
$ time ./hg log -qr "first(adds(README))"
0:3a6a38229d41
real 0m0.172s
user 0m0.131s
sys 0m0.041s
$ time hg log -qr "first(modifies(README))"
1:692bee203f23
real 0m2.292s
user 0m2.227s
sys 0m0.061s
$ time ./hg log -qr "first(modifies(README))"
1:692bee203f23
real 0m0.178s
user 0m0.130s
sys 0m0.038s
$ time hg log -qr "first(removes(README))"
2379:f24de2acd560
real 0m2.297s
user 0m2.235s
sys 0m0.058s
$ time ./hg log -qr "first(removes(README))"
2379:f24de2acd560
real 0m0.975s
user 0m0.797s
sys 0m0.056s
Performance Benchmarking:
$ time hg log -qr "first(public())"
...
real 0m1.184s
user 0m1.051s
sys 0m0.130s
$ time ./hg log -qr "first(public())"
...
real 0m0.548s
user 0m0.427s
sys 0m0.118s
Performance benchmarking:
$ time hg log -qr "first(merge())"
102:1634643e0cd8
real 0m0.276s
user 0m0.208s
sys 0m0.047s
$ time ./hg log -qr "first(merge())"
102:1634643e0cd8
real 0m0.192s
user 0m0.154s
sys 0m0.027s
Performance benchmarking:
$ time hg log -qr "first(grep(hg))"
0:3a6a38229d41
real 0m2.214s
user 0m2.163s
sys 0m0.045s
$ time ./hg log -qr "first(grep(hg))"
0:3a6a38229d41
real 0m0.211s
user 0m0.146s
sys 0m0.035s
Performance benchmarking:
$ time hg log -qr "first(desc(hg))"
changeset: 0:3a6a38229d41
real 0m2.210s
user 0m2.158s
sys 0m0.049s
$ time ./hg log -qr "first(desc(hg))"
changeset: 0:3a6a38229d41
real 0m0.171s
user 0m0.131s
sys 0m0.035s
Performance Benchmarking:
$ time hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41
real 0m3.157s
user 0m2.994s
sys 0m0.087s
$ time ./hg log -qr "first(date(05/03/2005))"
0:3a6a38229d41
real 0m0.509s
user 0m0.289s
sys 0m0.070s
Performance benchmarking:
$ time hg log -qr "first(author(mpm))"
0:3a6a38229d41
real 0m3.486s
user 0m3.317s
sys 0m0.077s
$ time ./hg log -qr "first(author(mpm))"
0:3a6a38229d41
real 0m0.551s
user 0m0.295s
sys 0m0.072s
Performance benchmarking:
$ time hg log -qr "first(keyword(changeset))"
0:3a6a38229d41
real 0m3.466s
user 0m3.345s
sys 0m0.072s
$ time ./hg log -qr "first(keyword(changeset))"
0:3a6a38229d41
real 0m0.365s
user 0m0.199s
sys 0m0.083s
Performance benchmarking:
$ time hg log -qr "first(branch(default))"
0:3a6a38229d41
real 0m3.130s
user 0m3.025s
sys 0m0.074s
$ time ./hg log -qr "first(branch(default))"
0:3a6a38229d41
real 0m0.300s
user 0m0.198s
sys 0m0.069s
Performance Benchmarking:
$ time hg log -l1 -qr "branch(default)"
0:3a6a38229d41
real 0m3.366s
user 0m3.217s
sys 0m0.095s
$ time ./hg log -l1 -qr "branch(default)"
0:3a6a38229d41
real 0m0.389s
user 0m0.199s
sys 0m0.061s
This class allows us to return values from large revsets as soon as they are
computed instead of having to wait for the entire revset to be calculated.
Added set caching to the baseset class. It lazily builds the set whenever it's
needed and keeps a reference which is returned when the set is requested
instead of being built again.
Before this patch, online help of "adds()", "contains()", "filelog()",
"file()", "modifies()" and "removes()" predicates doesn't explain
about how the pattern without explicit kind like "glob:" is treated,
even though each predicates treat it differently:
- as "relpath:" by "adds()", "modifies()" and "removes()"
- as "glob:" by "file()"
- as special by "contains()" and "filelog()"
- be relative to cwd, and
- match against a file exactly
("relpath:" matches also against a directory)
This may confuse users.
This patch adds explanation about the pattern without explicit kind
to these predicates.
Before this patch, revset predicate "filelog()" uses "match.files()"
to get filename also for the pattern without explicit kind.
But in such case, only canonicalization of relative path is required,
and other initializations of "match" object including regexp
compilation are meaningless.
This patch uses "pathutil.canonpath()" directly for "filelog()"
pattern without explicit kind like "glob:", for efficiency.
This patch also does below as a part of introducing "canonpath()":
- move location of "matchmod.match()" invocation, because "m" is no
more used in "if not matchmod.patkind(pat)" code path
- omit passing "default" argument to "matchmod.match()", because
"pat" should have explicit kind of pattern in this code path
This patch avoids the loop for "match.files()" having always one
element in revset predicate "filelog()" for efficiency: "match" object
"m" is constructed with "[pat]" as "patterns" argument.
Before this patch, default kind of pattern for revset predicate
"contains()" is treated as the exact file path rooted at the root of
the repository. This decreases usability, because:
- all other predicates taking pattern argument (also "filelog()")
treat such pattern as the path rooted at the current working
directory
- "contains()" doesn't describe this difference in its help
- this difference may confuse users
for example, this prevents revset aliases from sharing same
argument between "contains()" and other predicates
This patch makes default kind of pattern for revset predicate
"contains()" be rooted at the current working directory.
This patch uses "pathutil.canonpath()" instead of creating "match"
object for efficiency.
This patch narrows scope of the variable "m" in the function for
revset predicate "contains()", because it is referred only in "else"
code path of "if not matchmod.patkind(pat)" examination.
parse() cannot be called at the same time because a parser object keeps its
states. This is no problem for command-line hg client, but it would cause
strange errors in multi-threaded hgweb.
Creating parser object is not too expensive.
original:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 11.3 usec per loop
thread-safe:
% python -m timeit -s 'from mercurial import revset' 'revset.parse("0::tip")'
100000 loops, best of 3: 13.1 usec per loop
Some changesets can be wrongly reported as matched by this predicate
due to searching in a string joined with spaces and not individually.
A test case added, which fails without this fix.
Change ancestor to accept 0 or more arguments. The greatest common ancestor of a
single changeset is that changeset. If passed no arguments, the empty list is
returned.
Before this patch, sub expression may return unexpected result, if it
is joined with another expression by "or":
- "^"/parentspec():
"R or R^1" is not equal to "R^1 or R". the former returns only "R".
- "~"/ancestorspec():
"R or R~1" is not equal to "R~1 or R". the former returns only "R".
- ":"/rangeset():
"10 or (10 or 15):" is not equal to "(10 or 15): or 10". the
former returns only 10 and 15 or grater (11 to 14 are not
included).
In "or"-ed expression "A or B", the "subset" passed to evaluation of
"B" doesn't contain revisions gotten from evaluation of "A", for
efficiency.
In the other hand, "stringset()" fails to look corresponding revision
for specified string/symbol up, if "subset" doesn't contain that
revision.
So, predicates looking revisions up indirectly should evaluate sub
expressions of themselves not with passed "subset" but with "entire
revisions in the repository", to prevent "stringset()" from unexpected
failing to look symbols in them up.
But predicates in above example don't so. For example, in the case of
"R or R^1":
1. "R^1" is evaluated with "subset" containing revisions other than
"R", because "R" is already gotten by the former of "or"-ed
expressions
2. "parentspec()" evaluates "R" of "R^1" with such "subset"
3. "stringset()" fails to look "R" up, because "R" is not contained
in "subset"
4. so, evaluation of "R^1" returns no revision
This patch evaluates sub expressions for predicates above with "entire
revisions in the repository".
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.
This changes the names of the filters to reflect the revs that _are_
included:
hidden -> visible
unserved -> served
mutable -> immutable
impactable -> base
repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:
filterrevs('visible') # filter revs according to what's visible