Commit Graph

576 Commits

Author SHA1 Message Date
Mads Kiilerich
56207b4242 revisionbranchcache: fall back to slow path if starting readonly (issue4531)
Transitioning to Mercurial versions with revision branch cache could be slow as
long as all operations were readonly (revset queries) and the cache would be
populated but not written back.

Instead, fall back to using the consistently slow path when readonly and the
cache doesn't exist yet. That avoids the overhead of populating the cache
without writing it back.

If not readonly, it will still populate all missing entries initially. That
avoids repeated writing of the cache file with small updates, and it also makes
sure a fully populated cache available for the readonly operations.
2015-02-06 02:52:10 +01:00
FUJIWARA Katsunori
8a439b3cc6 revset: mask specific names for named() predicate
Before this patch, revset predicate "tag()" and "named('tags')" differ
from each other, because the former doesn't include "tip" but the
latter does.

For equivalence, "named('tags')" shouldn't include the revision
corresponded to "tip". But just removing "tip" from the "tags"
namespace causes breaking backward compatibility, even though "tip"
itself is planned to be eliminated, as mentioned below.

    http://selenic.com/pipermail/mercurial-devel/2015-February/066157.html

To mask specific names ("tip" in this case) for "named()" predicate,
this patch introduces "deprecated" into "namespaces", and makes
"named()" predicate examine whether each names are masked by the
namespace, to which they belong.

"named()" will really work correctly after 3.3.1 (see a3c326a7f57a for
detail), and fixing this on STABLE before 3.3.1 can prevent initial
users of "named()" from expecting "named('tags')" to include "tip".

It is reason why this patch is posted for STABLE, even though problem
itself isn't so serious.

This may have to be flagged as "(BC)", if applied on DEFAULT.
2015-02-05 14:45:49 +09:00
FUJIWARA Katsunori
c3172b4737 revset: get revision number of each node from target namespaces
Before this patch, revset predicate "named()" uses each nodes gotten
from target namespaces directly.

This causes problems below:

  - combination of other predicates doesn't work correctly, because
    they assume that revisions are listed up in number

  - "hg log" doesn't show any revisions for "named()" result, because:

    - "changeset_printer" stores formatted output for each revisions
      into dict with revision number (= ctx.rev()) as a key of them

    - "changeset_printer.flush(rev)" writes stored output for
      the specified revision, but

    - "commands.log" invokes it with the node, gotten from "named()"

  - "hg debugrevspec" shows nodes (= may be binary) directly

Difference between revset predicate "tag()" and "named('tags')" in
tests is fixed in subsequent patch.
2015-02-03 21:56:29 +09:00
FUJIWARA Katsunori
6a05d7fab8 revset: raise RepoLookupError to make present() predicate continue the query
Before this patch, "bookmark()", "named()" and "tag()" predicates
raise "Abort", when the specified pattern doesn't match against
existing ones.

This prevents "present()" predicate from continuing the query, because
it only catches "RepoLookupError".

This patch raises "RepoLookupError" instead of "Abort", to make
"present()" predicate continue the query, even if "bookmark()",
"named()" or "tag()" in the sub-query of it are aborted.

This patch doesn't contain raising "RepoLookupError" for "re:" pattern
in "tag()", because "tag()" treats it differently from others. Actions
of each predicates at failure of pattern matching can be summarized as
below:

  predicate  "literal:"  "re:"
  ---------- ----------- ------------
  bookmark   abort       abort
  named      abort       abort
  tag        abort       continue (*1)

  branch     abort       continue (*2)
  ---------- ----------- ------------

"tag()" may have to abort in the (*1) case for similarity, but this
change may break backward compatibility of existing revset queries. It
seems to have to be changed on "default" branch (with "BC" ?).

On the other hand, (*2) seems to be reasonable, even though it breaks
similarity, because "branch()" in this case doesn't check exact
existence of branches, but does pick up revisions of which branch
matches against the pattern.

This patch also adds tests for "branch()" to clarify behavior around
"present()" of similar predicates, even though this patch doesn't
change "branch()".
2015-01-31 01:00:50 +09:00
Yuya Nishihara
b5f973788a revset: fix ancestors(null) to include null revision (issue4512)
Since fe39bbbf31f0, null parent is explicitly excluded. So, there is no reason
to have nullrev in the initial seen set.
2015-01-25 20:20:27 +09:00
Yuya Nishihara
78d778b5ef revset: allow rev(-1) to indicate null revision (BC)
This can simplify the conversion from numeric revision to string. Without it,
we have to handle -1 specially because repo['-1'] != repo[-1].

The -1 revision is not officially documented, but this change makes sense
assuming that "rev(%d)" exists for scripting or third-party tools.
2015-01-10 12:56:38 +09:00
Martin von Zweigbergk
4b40ac0110 log: evaluate filesets on working copy, not its parent
When running "hg log 'set:added()'", we create two matchers: one used
for producing the revset and one used for finding files to match. In
185b6b930e8c (graphlog: evaluate FILE/-I/-X filesets on the working
dir, 2012-02-26), we started passing a revision argument along from
what's currently in cmdutil._makelogrevset() to
revset._matchfiles(). When the revision was an empty string, it
referred to the working copy. This was subtly done with "repo[rev or
None]". Then, in 5ff5c5c9e69f (revset: avoid recalculating filesets,
2014-10-22), that conversion from empty string to None was lost. Note
that repo[''] is equivalent to repo['.'], not repo[None].

The consequence of this, to the user, is that when running "hg log
'set:added()'", the file matcher matches files added in the working
copy, while the revset matcher matches revisions that touch files
added in the parent of the working copy. As a result, only revisions
that touch any files added in the parent of the working copy will be
considered, but they will only be included if they also touch files
added in the working copy.

Fix the bug by converting '' to None again, but make it a little more
explicit this time (plus, we now have tests for it).
2015-01-21 15:23:13 -08:00
FUJIWARA Katsunori
ac41d830e2 revset: check for collisions between alias argument names in the declaration
Before this patch, collisions between alias argument names in the
declaration are ignored, and this silently causes unexpected alias
evaluation.

This patch checks for such collisions, and aborts (or shows a warning) when
collisions are detected.

This patch doesn't add a test to "test-revset.t", because a doctest is
enough to test the collisions detection itself.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
e416b72fc5 revset: parse alias declaration strictly by _parsealiasdecl
Before this patch, alias declaration is parsed by string base
operations: matching against "^([^(]+)\(([^)]+)\)$" and splitting by
",".

This overlooks many syntax errors like below (see the previous patch
introducing "_parsealiasdecl" for detail):

  - un-closed parenthesis causes being treated as "alias symbol"
  - symbol/function name aren't examined whether they are valid or not
  - invalid argument list causes unexpected argument names

To parse alias declaration strictly, this patch replaces parsing
implementation by "_parsealiasdecl".

This patch tests only one typical declaration error case, because
error detection itself is already tested in the doctest of
"_parsealiasdecl".

This also removes class property "args" and "error", because these are
certainly initialized in "revsetalias.__init__".
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
87958c780f revset: introduce "_parsealiasdecl" to parse alias declarations strictly
This patch introduces "_parsealiasdecl" to parse alias declarations
strictly. For example, "_parsealiasdecl" can detect problems below,
which current implementation can't.

  - un-closed parenthesis causes being treated as "alias symbol"

    because all of declarations not in "func(....)" style are
    recognized as "alias symbol".

    for example, "foo($1, $2" is treated as the alias symbol.

  - alias symbol/function names aren't examined whether they are valid
    as symbol or not

    for example, "foo bar" can be treated as the alias symbol, but of
    course such invalid symbol can't be referred in revset.

  - just splitting argument list by "," causes overlooking syntax
    problems in the declaration

    for example, all of invalid declarations below are overlooked:

    - foo("bar")     => taking one argument named as '"bar"'
    - foo("unclosed) => taking one argument named as '"unclosed'
    - foo(bar::baz)  => taking one argument named as 'bar::baz'
    - foo(bar($1))   => taking one argument named as 'bar($1)'

To decrease complication of patch, current implementation for alias
declarations is replaced by "_parsealiasdecl" in the subsequent
patch. This patch just introduces it.

This patch defines "_parsealiasdecl" not as a method of "revsetalias"
class but as a one of "revset" module, because of ease of testing by
doctest.

This patch factors some helper functions for "tree" out, because:

  - direct accessing like "if tree[0] == 'func' and len(tree) > 1"
    decreases readability

  - subsequent patch (and also existing code paths, in the future) can
    use them for readability

This patch also factors "_tokenizealias" out, because it can be used
also for parsing alias definitions strictly.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
883b1f7edf revset: store full detail into revsetalias.error for error source distinction
Before this patch, any errors in the declaration of revset alias
aren't detected at all, and there is no information about error source
in the error message.

As a part of preparation for parsing alias declarations and
definitions more strictly, this patch stores full detail into
"revsetalias.error" for error source distinction.

This makes raising "Abort" and warning potential errors just use
"revsetalias.error" without any message composing.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
ae25ee95c4 revset: factor out composing error message for ParseError to reuse
This patch defines the composing function not in "ParseError" class but
in "revset" module, because:

  - "_()" shouldn't be used in "ParseError", to avoid adding "from
    i18n import _" i18n" to "error" module

  - generalizing message composition of"ParseError" for all code paths
    other than revset isn't the purpose of this patch

    we should also take care of showing "unexpected leading
    whitespace" for some code paths, to generalize widely.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
48233206c2 revset: make tokenize extensible to parse alias declarations and definitions
Before this patch, "tokenize" doesn't recognize the symbol starting
with "$" as a valid one.

This prevents revset alias declarations and definitions from being
parsed with "tokenize", because "$" may be used as the initial letter
of alias arguments.

BTW, the alias argument name doesn't require leading "$" itself, in
fact. But we have to assume that users may use "$" as the initial
letter of argument names in their aliases, because examples in "hg
help revsets" uses such names for a long time.

To make "tokenize" extensible to parse alias declarations and
definitions, this patch introduces optional arguments "syminitletters"
and "symletters". Giving these sets can change the policy of "valid
symbol" in tokenization easily.

This patch keeps original examination of letter validity for
reviewability, even though there is redundant interchanging between
"chr"/"ord" at initialization of "_syminitletters" and "_symletters".
At most 256 times examination (per initialization) is cheaper enough
than revset evaluation itself.

This patch is a part of preparation for parsing alias declarations and
definitions more strictly.
2015-01-10 23:18:11 +09:00
Sean Farley
f534500188 namespaces: add revset for 'named(namespace)'
This patch adds functionality for listing all changesets in a given namespace
via the revset language.
2015-01-13 15:07:08 -08:00
Matt Mackall
d829c30933 merge with stable 2015-01-14 12:50:46 -08:00
Yuya Nishihara
47e97cb140 revset: fix spanset.isascending() to honor sort() or reverse() request
Because spanset.isascending() ignored the ascending flag, the result of
"fullreposet() & x" was always sorted in ascending order.

The test case is carefully chosen to call fullreposet.__and__.
2015-01-10 21:31:59 +09:00
Yuya Nishihara
bf8b92850c revset: simplify fullreposet.__and__ to call sort() with boolean flag
Note that sort() takes a boolean flag, so other.sort(reverse) was wrong.
It just worked fine because there is a top-level function, reverse().
2015-01-10 21:36:42 +09:00
Martin von Zweigbergk
c91c749e46 filelog: fix backwards comment for 'backrevref' 2015-01-12 09:46:56 -08:00
Martin von Zweigbergk
63a49977e2 filelog: remove trailing "form feed" character 2015-01-12 09:49:25 -08:00
Martin von Zweigbergk
3e0453e7c4 filelog: remove unused variable 'lkr'
It's used further down, but it's overwritten before, so it's
technically a dead assignment, but unnecessary nevertheless.
2015-01-12 09:48:05 -08:00
Mads Kiilerich
61a36ea4fe revset: use localrepo revbranchcache for branch name filtering
Branch name filtering in revsets was expensive. For every rev it created a
changectx and called .branch() which retrieved the branch name from the
changelog.

Instead, use the revbranchcache.

The revbranchcache is used read-only. The revset implementation with generators
and callbacks makes it hard to figure out when we are done using/updating the
cache and could write it back. It would also be 'tricky' to lock the repo for
writing from within a revset execution. Finally, the branchmap update will
usually make sure that the cache is updated before any revset can be run.
The revbranchcache is used without any locking but is short-lived and used in a
tight loop where we can assume that the changelog doesn't change ... or where
it not is relevant to us if it does.

perfrevset 'branch(mobile)' on mozilla-central.
Before:
! wall 10.989637 comb 10.970000 user 10.940000 sys 0.030000 (best of 3)
After, no cache:
! wall 7.368656 comb 7.370000 user 7.360000 sys 0.010000 (best of 3)
After, with cache:
! wall 0.528098 comb 0.530000 user 0.530000 sys 0.000000 (best of 18)

The performance improvement even without cache come from being based on
branchinfo on the changelog instead of using ctx.branch().

Some tests are added to verify that the revbranchcache works and keep an eye on
when the cache files actually are updated.
2015-01-08 00:01:03 +01:00
Yuya Nishihara
b67b0a75ea revset: drop pre-lazyset optimization for stringset of subset == entire repo
It was introduced at deb42ca4dd93, where spanset.__contains__() did not exist.
Nowadays, we have to pay huge penalty for len(subset).

The following example showed that OR operation could be O(n * m^2)
(n: len(repo), m: number of OR operators, m >= 2) probably because of
filteredset.__len__.

revset #0: 0|1|2|3|4|5|6|7|8|9
0) wall 8.092713 comb 8.090000 user 8.090000 sys 0.000000 (best of 3)
1) wall 0.445354 comb 0.450000 user 0.430000 sys 0.020000 (best of 22)
2) wall 0.000389 comb 0.000000 user 0.000000 sys 0.000000 (best of 7347)
(0: 3.2.4, 1: 3.1.2, 2: this patch)
2015-01-03 10:25:08 +09:00
Pierre-Yves David
56b039c98c revset: fix first and last for generatorset (issue4465)
The code was just plain wrong.
2014-12-01 05:18:12 -08:00
Sean Farley
2a4b30c27c revset: use '%' as an operator for 'only'
With this patch, we can make it much easier to specify 'only(A,B)' ->
A%B. Similarly, 'only(A)' -> A%.

On Windows, '%' is a semi-reserved symbol in the following way: using non-bash
shells (e.g. cmd.exe but NOT PowerShell, ConEmu, and cmder), %var% is only
expanded when 'var' exists and is surrounded by '%'.

That only leaves batch scripts which could prove to be problematic. I posit
that this isn't a big issue because any developer of batch scripts already
knows that to use '%' one needs to escape it by using a double '%%'.

Alternatives to '%' could be '=' but that might be limiting our future if we
ever decide to use temporary assignments in a revset.
2014-11-06 14:55:18 -08:00
FUJIWARA Katsunori
fa0f66b4fb revset: introduce new operator "##" to concatenate strings/symbols at runtime
Before this patch, there is no way to concatenate strings at runtime.

For example, to search for the issue ID "1234" in descriptions against
all of "issue 1234", "issue:1234", issue1234" and "bug(1234)"
patterns, the revset below should be written fully from scratch for
each issue ID.

    grep(r"\bissue[ :]?1234\b|\bbug\(1234\)")

This patch introduces new infix operator "##" to concatenate
strings/symbols at runtime. Operator symbol "##" comes from the same
one of C pre-processor. This concatenation allows parametrizing a part
of strings in revset queries.

In the case of example above, the definition of the revset alias using
operator "##" below can search issue ID "1234" in complicated patterns
by "issue(1234)" simply:

    issue($1) = grep(r"\bissue[ :]?" ## $1 ## r"\b|\bbug\(" ## $1 ## r"\)")

"##" operator does:

  - concatenate not only strings but also symbols into the string

    Exact distinction between strings and symbols seems not to be
    convenience, because it is tiresome for users (and
    "revset.getstring" treats both similarly)

    For example of revset alias "issue()", "issue(1234)" is easier
    than "issue('1234')".

  - have higher priority than any other prefix, infix and postfix
    operators (like as "##" of C pre-processor)

    This patch (re-)assigns the priority 20 to "##", and 21 to "(",
    because priority 19 is already assigned to "-" as prefix "negate".
2015-01-06 23:46:18 +09:00
Pierre-Yves David
1ed0c1e70a revset-filelog: handle hidden linkrev for file missing for head (issue4490)
The fix for linkrev pointing to hidden revision was crashing when the file was
missing from head's manifest. We now properly handle this case.

(yes I feel silly)
2015-01-06 11:23:38 -08:00
Matt Mackall
1c9cf418be merge with stable 2015-01-05 15:46:14 -06:00
FUJIWARA Katsunori
ec7eaba601 revset: delay showing parse error for the revset alias until it is referred
Before this patch, a problematic revset alias aborts execution
immediately, even if it isn't referred in the specified revset.

If old "hg" may be used too (for example, bisecting Mercurial itself),
it is also difficult to write alias definitions using features newly
introduced by newer "hg" into configuration files, because such alias
definitions cause unexpected abortion at parsing revset aliases with
old "hg".

This patch delays showing parse error for the revset alias until it is
actually referred at runtime.

This patch detects referring problematic aliases in "_expandaliases"
by examination of "revsetalias.error", which is initialized with the
error message only when parsing fails.

For usability, this patch also warns about problematic aliases, even
if they aren't referred at runtime. This should help users to know
potential problems in their alias definitions earlier.
2015-01-05 11:02:04 +09:00
Pierre-Yves David
a533fb454c linkrev-filelog: handle filtered linkrev with no visible children (issue4307)
If the file revision with a filtered linkrev does not have any
(unfiltered) children, we cannot use it to bound the search for
another introduction. Instead, we have to look at the file revision
used by each head changeset. If one of them uses this file revision, we
know there is another occurrence and we have a starting point. See
inline comments for details.

Adding some kind of permanent reference of all the introductions of a
file revision instead of just the first one would be much better. But
this is more difficult. I hope to take that into account in the next
repository format.
2014-12-29 18:35:23 -08:00
Pierre-Yves David
e3f2a2625a linkrev: work around linkrev to filtered entry in 'filelog' revset
This revset is used by 'hg log FILENAME'. This prevent bugs when used on
a repository with hidden revisions.

Instead of just discarding file revisions whose linkrevs point to filtered
revisions, we put them aside and post-process them trying to find a non-filtered
introduction. See inline documentation for details about how it works.

This only fixes some of the problems. Once again, more will be needed when we can
cannot rely on child revisions of a file to find linkrev-shadowned revisions.

A test is added for 'hg log' catching such cases.
2014-12-29 17:23:16 -08:00
Pierre-Yves David
0344ecbf5b linkrev: also adjust linkrev when bootstrapping 'follow' revset
The follow revset (used by `hg log --follow`) now uses the new 'introrev'
method to bootstrap its traversal. This catches issues from linkrev-shadowing of
the changesets introducing the version of a file in source changeset.

A new test has been added to display pathological cases.

Another test is affected because it was meant to test this behavior but actually
failed to do so for half of the output. The output are now similar.
2014-12-29 23:40:24 -08:00
Yuya Nishihara
72198a2056 revset: add 'only' to DoS-safe symbols
There is no reason to disable it in hgweb because the same query can be
written without using 'only()'.
2014-12-03 22:56:42 +09:00
Yuya Nishihara
f62323518b revset: move 'only' so that functions are sorted alphabetically 2014-12-03 22:52:54 +09:00
Matt Mackall
289d6b53bc merge with stable 2014-12-01 19:34:11 -06:00
Siddharth Agarwal
b3be1bdf41 revset.only: use cl.findmissingrevs
ancestor.missingancestors is really an implementation detail.
2014-11-14 16:38:58 -08:00
Pierre-Yves David
515b17f5ef revset-matching: call 'getset' on a 'fullreposet'
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.

This gives us the usual benefits of the fullreposet but it is less visible
because the matching process itself is very expensive:

revset) matching(100)
before) wall 6.413281 comb 6.420000 user 5.910000 sys 0.510000 (best of 3)
after)  wall 6.173608 comb 6.170000 user 5.750000 sys 0.420000 (best of 3)

However for some complex list, this provide a massive speedup

revset) matching(parents(100))
before) wall 23.890740 comb 23.890000 user 23.450000 sys 0.440000 (best of 3)
after)  wall 6.382280 comb 6.390000 user 5.930000 sys 0.460000 (best of 3)
2014-10-16 23:15:35 -07:00
Pierre-Yves David
7b5edd0cf5 revset-parentspec: call 'getset' on a 'fullreposet'
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.

This gives us the usual benefits of the fullreposet:

revset) 100^1
before) wall 0.002694 comb 0.000000 user 0.000000 sys 0.000000 (best of 897)
after)  wall 0.000997 comb 0.000000 user 0.000000 sys 0.000000 (best of 2324)

revset) parents(100)^1
before) wall 0.003832 comb 0.000000 user 0.000000 sys 0.000000 (best of 587)
after)  wall 0.001034 comb 0.000000 user 0.000000 sys 0.000000 (best of 2309)

revset) (100^1)^1
before) wall 0.005616 comb 0.000000 user 0.000000 sys 0.000000 (best of 405)
after)  wall 0.001030 comb 0.000000 user 0.000000 sys 0.000000 (best of 2258)
2014-10-16 23:15:06 -07:00
Pierre-Yves David
f1982bfd1d revset-children: call 'getset' on a 'fullreposet'
Calling 'baseset(repo.changelog)' builds a list for all revisions in the
repo. And we already have the lazy and efficient 'fullreposet' class
for this purpose.

This gives us the usual benefits of the fullreposet:

revset) children(tip~100)
before) wall 0.007469 comb 0.010000 user 0.010000 sys 0.000000 (best of 338)
after)  wall 0.003356 comb 0.000000 user 0.000000 sys 0.000000 (best of 755)
2014-10-16 23:14:17 -07:00
Pierre-Yves David
593610531e revset-ancestorspec: call 'getset' on a 'fullreposet'
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.

This gives us the usual benefits of the fullreposet:

revset) 100~5
before) wall 0.002712 comb 0.000000 user 0.000000 sys 0.000000 (best of 918)
after)  wall 0.000996 comb 0.000000 user 0.000000 sys 0.000000 (best of 2493)

revset) parents(100)~5
before) wall 0.003812 comb 0.010000 user 0.010000 sys 0.000000 (best of 667)
after)  wall 0.001038 comb 0.000000 user 0.000000 sys 0.000000 (best of 2361)

revset) (100~5)~5
before) wall 0.005614 comb 0.000000 user 0.000000 sys 0.000000 (best of 446)
after)  wall 0.001035 comb 0.000000 user 0.000000 sys 0.000000 (best of 2424)
2014-10-16 23:11:25 -07:00
Pierre-Yves David
985922c845 revset-rangeset: call 'getset' on a 'fullreposet'
Calling 'baseset(repo.changelog)' builds a list for all revisions in
the repo. And we already have the lazy and efficient 'fullreposet'
class for this purpose.

This gives us the usual benefit ofs the fullreposet:

revset) 10:100
before) wall 0.002774 comb 0.000000 user 0.000000 sys 0.000000 (best of 797)
after)  wall 0.001977 comb 0.000000 user 0.000000 sys 0.000000 (best of 1244)

revset) parents(10):parents(100)
before) wall 0.005054 comb 0.000000 user 0.000000 sys 0.000000 (best of 481)
after)  wall 0.002060 comb 0.000000 user 0.000000 sys 0.000000 (best of 1056)
2014-10-16 23:10:44 -07:00
Mads Kiilerich
523c87c1fe spelling: fixes from proofreading of spell checker issues 2014-04-17 22:47:38 +02:00
Pierre-Yves David
2463533597 addset: fix first and last on sorted addset (issue4426)
The lazy sorting were not enforced on addset. This was made visible through MQ.
2014-11-01 22:58:30 +00:00
Martin von Zweigbergk
ef6448aa8b revset: don't recreate matcher for every revision
The matcher variable 'm' in checkstatus() is reset to None on each
call, so the caching of the matcher no longer happens as it was
intended. This seems to be a regression in 6b9fbae54476 (revset: added
lazyset implementation to checkstatus, 2014-01-03).

Fix by moving the cached matcher into the enclosing function so it's
actually cached across calls. This speeds up

  hg log -r 'modifies(mercurial/context.py)' >/dev/null

from 7.5s to 4s.

Also see similar fix in 5ff5c5c9e69f (revset: avoid recalculating
filesets, 2014-10-22).
2014-10-31 10:41:36 -07:00
Durham Goode
c9e0ce83ec revset: fix O(2^n) perf regression in addset
hg log -r 1 ... -r 100 was never returning due to a regression in the
way addset computes __nonzero__. It used 'bool(self._r1 or self._r2)'
which required executing self._r1.__nonzero__ twice (once for the or,
once for the bool). hg log with a lot of -r's happens to build a one
sided addset tree of N length, which ends up being 2^N performance.

This patch fixes it by converting to bool before or'ing.

This problem can be repro'd with something as simple as:

hg log `for x in $(seq 1 50) ; do echo "-r $x "; done`

Adding '1 + 2 + ... + 20' to the revsetbenchmark.txt didn't seem to repro the
problem, so I wasn't able to add a revset benchmark for this issue.
2014-10-28 14:06:06 -07:00
Yuya Nishihara
740a18d819 revset: avoid O(n) lookup of invalid revision in rev()
0cc5c10d5dc7 was not the final version of that patch.  It was really slow
because `l not in repo.changelog` iterates revisions up to `l`.  Instead,
rev() should utilize spanset.__contains__().

revset #0: rev(210000)
0) wall 0.000039 comb 0.000000 user 0.000000 sys 0.000000 (best of 67978)
1) wall 0.002721 comb 0.000000 user 0.000000 sys 0.000000 (best of 1055)
2) wall 0.000059 comb 0.000000 user 0.000000 sys 0.000000 (best of 45599)
(0: 3.2-rc, 1: 0cc5c10d5dc7, 2: this patch)

Note that the benchmark result described in 0cc5c10d5dc7 is wrong because
it is the one of the initial version.
2014-10-23 21:53:37 +09:00
Yuya Nishihara
bac0595bd4 revset: have rev() drop out-of-range or filtered rev explicitly (issue4396)
The recent optimization of "and" operation relies on the assumption that
the rhs set does not contain invalid revisions.  So rev() has to remove
invalid revisions.

This is still faster than using `.filter(lambda r: r == l)`.

revset #0: rev(25)
0) wall 0.026341 comb 0.020000 user 0.020000 sys 0.000000 (best of 113)
1) wall 0.000038 comb 0.000000 user 0.000000 sys 0.000000 (best of 66567)
2) wall 0.000062 comb 0.000000 user 0.000000 sys 0.000000 (best of 43699)
(0: 428fa22fb2d1^, 1: 3.2-rc, 2: this patch)
2014-10-19 16:48:33 +09:00
Matt Mackall
56b374dd4e revset: avoid recalculating filesets
This fixes a regression in ea41ca830940 that moved matcher building
into a callback, thus causing it be rebuilt for each revision matched
against.
2014-10-22 15:47:27 -05:00
Pierre-Yves David
0d2e3a1dee revset-phases: prefetch attributes in phasesrelated revsets
Pre-fetching attributes gives a significant performance boost. Such is Python.


draft()
0) wall 0.011661 comb 0.010000 user 0.010000 sys 0.000000 (best of 205)
1) wall 0.009804 comb 0.000000 user 0.000000 sys 0.000000 (best of 231)

draft() - ::bookmark()
0) wall 0.014173 comb 0.010000 user 0.010000 sys 0.000000 (best of 177)
1) wall 0.012966 comb 0.010000 user 0.010000 sys 0.000000 (best of 182)
2014-10-16 17:46:58 -07:00
Pierre-Yves David
8347f164c0 revset-phases: do not cache phase-related filters
The phase retrieval is fast enough to not require caching the result of the
functions.

draft()
0) wall 0.017209 comb 0.020000 user 0.020000 sys 0.000000 (best of 149)
1) wall 0.011654 comb 0.010000 user 0.010000 sys 0.000000 (best of 186)

public()
0) wall 0.018687 comb 0.010000 user 0.010000 sys 0.000000 (best of 128)
1) wall 0.013290 comb 0.010000 user 0.010000 sys 0.000000 (best of 181)

secret()
0) wall 0.017464 comb 0.020000 user 0.020000 sys 0.000000 (best of 127)
1) wall 0.011499 comb 0.000000 user 0.000000 sys 0.000000 (best of 196)

draft() - ::bookmark()
0) wall 0.020099 comb 0.020000 user 0.020000 sys 0.000000 (best of 127)
1) wall 0.014399 comb 0.020000 user 0.020000 sys 0.000000 (best of 169)
2014-10-11 01:21:47 -07:00
Pierre-Yves David
4de6496309 revset-node: speedup by a few hundred fold
Instead of checking all elements of the subset against a single rev, just check
if this rev is in the subset. The old way was inherited from when the subset was
a list.

Non surprise, this provide massive speedup.


id("b7dc31e4baa4")
before) wall 0.008205 comb 0.000000 user 0.000000 sys 0.000000 (best of 302)
after)  wall 0.000069 comb 0.000000 user 0.000000 sys 0.000000 (best of 34518)

revset #1: public() and id("b7dc31e4baa4")
before) wall 0.019763 comb 0.020000 user 0.020000 sys 0.000000 (best of 124)
after)  wall 0.000101 comb 0.000000 user 0.000000 sys 0.000000 (best of 20130)
2014-10-11 01:39:20 -07:00