Commit Graph

712 Commits

Author SHA1 Message Date
Pierre-Yves David
5269ce9dff revset: speed up '_matchfiles'
File matching is done by applying the matcher to all elements in the 'file'
field of all changesets in the repository. This requires reading and parsing all
changesets in the repository and doing a lot of matching. However, about 1/3 of the
time of the function is spent creating 'changectx' objects and retrieving their
'file' field.

This is far too much overhead, so we skip the changectx layer and access the
data directly from the changelog. This provides a significant speedup:

repository: mozilla central 252524 revisions
command: hg perfrevset '_matchfiles("p:browser")'
Before: 15.899687s
After:  10.011705s

The slowdown from the changectx layer is even more significant if you have a
lot of namespaces that slow down lookup.

The time is now spent with this approximate breakdown:

  Matcher: 20%
    regexp matching: 10%
  changelog.read: 80%
    reading revision: 60%
      checking hash: 15%
      decompression: 15%
      reading chunk: 30%
    changelog parsing: 20%
      decoding to local: 10%

The next easy win is probably to have more of the changelog stack implemented
using the CPython api.
2015-11-18 23:23:03 -08:00
timeless@mozdev.org
716b455ed5 l10n: use %d instead of %s for numbers 2015-10-14 22:29:03 -04:00
Pierre-Yves David
4664d62d42 revset: rename and test '_destmerge'
We make the name consistent with the one used by '_destupdate' and we ensure the
code is run by testing it (abort is expected and merge would).
2015-10-15 01:47:28 +01:00
Pierre-Yves David
614a2e0418 destutil: move default merge destination into a function
Functions in destutil are much simpler to wrap and more flexible than revsets.
This also helps consistency, as 'destupdate' lives here and cannot become a pure
revset anyway.
2015-10-15 01:11:00 +01:00
Pierre-Yves David
fb61444d77 revset: reintroduce an experimental revset for update destination
The revset is not ready for prime time yet. However, it is useful to have some
version of it exposed so that prospective users can play with it and provide
feedback on what we should aim at.

We add a small test to make sure the code runs.
2015-10-15 01:35:44 +01:00
Yuya Nishihara
c7bc2fcfb9 revset: add optional offset argument to limit() predicate
It's common for a GUI or web frontend to fetch revisions in chunks of a fixed
batch size. Previously this was possible only if revisions were sorted by
revision number.

  $ hg log -r 'limit({revspec} & :{last_known}, 101)'

So this patch introduces a general way to retrieve a chunk of revisions after
skipping 'offset' revisions.

  $ hg log -r 'limit({revspec}, 100, {last_count})'

This is a dumb implementation. We can optimize it for baseset and spanset
later.
2015-03-24 00:28:28 +09:00
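The paging behaviour described in c7bc2fcfb9 can be sketched in plain Python. This is only an illustration of the idea (skip 'offset' items, then take at most 'n'), not Mercurial's implementation; the function name and signature below are assumptions chosen to mirror the revset predicate.

```python
from itertools import islice


def limit(revs, n=1, offset=0):
    """Return up to n items of revs after skipping the first offset items.

    Hypothetical helper: it walks the iterable once, like the "dumb
    implementation" the commit message mentions.
    """
    if n < 0 or offset < 0:
        raise ValueError("negative number or offset")
    return list(islice(revs, offset, offset + n))


# Second page of three revisions from a newest-first stream 10, 9, ..., 1.
page = limit(iter(range(10, 0, -1)), n=3, offset=3)
print(page)  # [7, 6, 5]
```

Because the sketch only relies on iteration order, it works for any ordering of the input, which is the property the commit message is after.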
Yuya Nishihara
772342e4a1 revset: port limit() to support keyword arguments
The next patch will introduce the third 'offset' argument. This allows us
to specify 'offset' without 'n' argument.
2015-10-12 17:19:22 +09:00
Yuya Nishihara
516c2f1e3b revset: eliminate temporary reference to subset in limit() and last() 2015-10-12 17:14:47 +09:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Pierre-Yves David
450d49c4b8 revset: delete _updatedefaultdest as it has no users
The revset is not used anywhere anymore. We delete the function until we use
(and therefore test) it again.
2015-10-05 02:33:45 -07:00
Pierre-Yves David
295091442c update: move default destination computation to a function
We ultimately want this to be accessible through a revset, but there is too
much complexity here for that to work. In particular, we'll have to return more
than just the destination to control the behavior (e.g. bookmarks to activate,
etc).

To prevent a cycle, a new module is created; it will receive other
destination/behavior functions in the future.
2015-10-05 01:46:47 -07:00
Yuya Nishihara
e482010d79 revset: strip off "literal:" prefix from bookmark not found error
This is what branch() and tag() do.
2015-10-07 23:04:31 +09:00
Yuya Nishihara
af229fbc94 revset: do not fall through to revspec for literal: branch (issue4838)
If "literal:" is specified, it must not be a revset expression. It should
error out with a better message.
2015-10-07 23:00:29 +09:00
Matt Harbison
bb1dafe069 util: extract stringmatcher() from revset
This is used to match against tags, bookmarks, etc in revsets.  It will be used
in a future patch to do the same tag matching in templater.
2015-08-22 22:52:18 -04:00
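The prefix handling that bb1dafe069, af229fbc94 and e482010d79 refer to can be sketched as follows. This is a simplified stand-in written for illustration, assuming the two prefixes "re:" and "literal:" and a (kind, pattern, matcher) return value; the real function in 'util' may differ in details and error handling.

```python
import re


def stringmatcher(pattern):
    """Return (kind, pattern, matcher) for a possibly prefixed pattern.

    Illustrative sketch only:
      "re:<regex>"      -> regular expression search
      "literal:<text>"  -> exact comparison, prefix stripped
      anything else     -> exact comparison
    """
    if pattern.startswith("re:"):
        pattern = pattern[3:]
        try:
            regex = re.compile(pattern)
        except re.error as exc:
            raise ValueError("invalid regular expression: %s" % exc)
        return "re", pattern, regex.search
    if pattern.startswith("literal:"):
        pattern = pattern[8:]
    return "literal", pattern, pattern.__eq__


kind, pat, match = stringmatcher("literal:re:foo")
print(kind, pat, match("re:foo"))  # literal re:foo True
```

Stripping the prefix before returning the pattern is what lets a caller report "re:foo" rather than "literal:re:foo" in a not-found error.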
Pierre-Yves David
980dfb1fe1 revset: avoid implicit None testing in revset
Implicit None testing is a very good way to get in trouble. We explicitly test
for None.
2015-09-23 00:41:07 -07:00
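A short sketch of why implicit None testing is risky with revision numbers: revision 0 is a valid revision but is falsy in Python. The function names here are hypothetical and exist only for the illustration.

```python
def describe_implicit(rev):
    # Buggy: revision 0 is falsy, so it is treated like "no revision".
    if rev:
        return "rev %d" % rev
    return "no revision"


def describe_explicit(rev):
    # Correct: only None means "no revision".
    if rev is not None:
        return "rev %d" % rev
    return "no revision"


print(describe_implicit(0))  # no revision
print(describe_explicit(0))  # rev 0
```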
Durham Goode
b02cd211f8 revset: speed up existence checks for ordered filtered sets
Previously, calling 'if foo:' on an ordered filtered set would start iterating in
whatever the current direction was and return if a value was available. If the
current direction was ascending, but the set had a fastdesc available, this
meant we did a lot more work than necessary.

If this was applied without my previous max/min fixes, it would improve max()
performance (this was my first attempt at fixing the issue). Since those
previous fixes went in though, this doesn't have a visible benefit in the
benchmarks, but it does seem clearly better than it was before so I think it
should still go in.
2015-09-20 16:53:42 -07:00
Durham Goode
dcc5c5ec45 revset: remove existence check from min() and max()
min() and max() would first do an existence check. Unfortunately existence
checks can be slow in certain situations (like if the smartset is a list, and
quickly iterable in both ascending and descending directions, then doing an
existence check will start from the bottom, even if you want to check the
max()).

The fix is to not do the check, and just handle the error if it happens. In a
large repo, this speeds up:

hg log -r 'max(parents(. + .^) - (. + .^)  & ::master)'

from 3.5s to 0.85s. That revset is contrived and just for testing. In our
real case we used 'bundle()' in place of '. + .^'

Interesting perf numbers for the revset benchmarks:

max(draft() and ::tip) =>  0.027s to 0.0005s
max(author(lmoscovicz)) => 2.48s to 0.57s

min doesn't show any perf changes, but changing it as well will prevent a perf
regression in my next patch.

Result from revset benchmark

revset #0: draft() and ::tip
   min           max
0) 0.001971      0.001991
1) 0.001965      0.000428  21%

revset #1: ::tip and draft()
   min           max
0) 0.002017      0.001912
1) 0.001896  94% 0.000421  22%

revset #2: author(lmoscovicz)
   min           max
0) 1.049033      1.358913
1) 1.042508      0.319824  23%

revset #3: author(lmoscovicz) or author(mpm)
   min           max
0) 1.042512      1.367432
1) 1.019750      0.327750  23%

revset #4: author(mpm) or author(lmoscovicz)
   min           max
0) 1.050135      0.324924
1) 1.070698      0.319913

revset #5: roots((tip~100::) - (tip~100::tip))
   min           max
0) 0.000671      0.001018
1) 0.000605  90% 0.000946  92%

revset #6: roots((0::) - (0::tip))
   min           max
0) 0.149714      0.152369
1) 0.098677  65% 0.100374  65%

revset #7: (20000::) - (20000)
   min           max
0) 0.051019      0.042747
1) 0.035586  69% 0.016267  38%
2015-09-20 19:27:53 -07:00
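The cost described in dcc5c5ec45 can be made visible with a toy filtered set. The class below is a hypothetical stand-in, not Mercurial's smartset: it counts how many revisions the filter condition is evaluated on, and compares a max() that first performs an ascending existence check with one that simply handles the empty case as an error.

```python
_MISSING = object()


class FilteredSet:
    """Toy ordered, lazily filtered set (illustrative assumption only)."""

    def __init__(self, revs, condition):
        self._revs = sorted(revs)
        self._condition = condition
        self.checked = 0  # number of condition evaluations so far

    def _filter(self, iterator):
        for rev in iterator:
            self.checked += 1
            if self._condition(rev):
                yield rev

    def fastasc(self):
        return self._filter(iter(self._revs))

    def fastdesc(self):
        return self._filter(reversed(self._revs))

    def max_with_check(self):
        # Old style: the existence check walks in ascending order first.
        if next(self.fastasc(), _MISSING) is _MISSING:
            raise ValueError("empty set")
        return next(self.fastdesc())

    def max(self):
        # New style: no check, just handle the failure if it happens.
        try:
            return next(self.fastdesc())
        except StopIteration:
            raise ValueError("empty set") from None


old = FilteredSet(range(1000), lambda r: r >= 990)
new = FilteredSet(range(1000), lambda r: r >= 990)
print(old.max_with_check(), old.checked)  # 999 992
print(new.max(), new.checked)             # 999 1
```

The same reasoning applies to b02cd211f8: when an existence check is unavoidable, starting it from the cheaper direction avoids most of the wasted work.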
Pierre-Yves David
182447758e update: move default destination into a revset
This is another step toward making the "default" destination clearer and more unified.
Not all the logic is there because some bookmark-related computation happens
elsewhere. It will be moved later.

The function is private because, as with the other ones, cleanup is needed before
we can proceed.
2015-09-18 17:23:10 -07:00
Pierre-Yves David
14615c5363 merge: move default destination computation in a revset
This is another step toward making the "default" destination clearer and more unified.
2015-09-17 14:03:15 -07:00
Yuya Nishihara
b5477ed9b3 revset: handle error of string unescaping 2015-09-10 23:29:55 +09:00
Yuya Nishihara
4c181af0c1 revset: uncache filteredset.__contains__
Since ca895be75c36, condition function returns a cached value, so there's
little benefit to cache __contains__.

No measurable difference found in contrib/base-revsets.txt.
2015-09-05 12:56:53 +09:00
Durham Goode
151bf91f0c revset: fix resolving strings from a list
When using multiple revsets that get optimized into a list (like
hg log -r r1235 -r r1237 in hgsubversion), the revset list code was assuming the
strings were resolvable via repo[X]. hgsubversion and other extensions override
def stringset() to allow processing different revision identifiers (such as
r1235 or g<githash>), and therefore the _list() implementation was circumventing
that resolution.

The fix is to just call stringset(). The default implementation does the same
thing that _list was already doing (namely repo[X]).

This has always been broken, but it was recently exposed by ad142c72c6db which
made "--rev X --rev Y" produce a combined revset "X | Y".
2015-09-01 16:46:05 -07:00
liscju
e42f8565a1 revsets: make follow() support file patterns (issue4757) (BC)
Before this patch, follow() only supported full, exact filenames.
This patch makes follow() treat its argument as a file pattern,
the same way log treats its arguments.

It preserves the current behaviour of follow() matching paths
relative to the repository root by default.
2015-08-20 17:19:32 +02:00
Pierre-Yves David
1983912bde revset: cache smartset's min/max
As the content of a smartset never changes, min and max will never change
either.  This will save us time when this function is called multiple times.
This is relevant for issue4782 but does not fix it.
2015-08-27 17:57:33 -07:00
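The caching argument in 1983912bde (the content never changes, so neither do min and max) amounts to simple memoization. A minimal sketch, assuming a hypothetical immutable set class with a scan counter added purely for demonstration:

```python
class CachedSet:
    """Toy immutable set that remembers its max (illustration only)."""

    def __init__(self, revs):
        self._revs = tuple(revs)
        self._max = None
        self.scans = 0  # how many times the content was actually scanned

    def max(self):
        # Explicit None test: a cached maximum of 0 must still count as cached.
        if self._max is None:
            self.scans += 1
            self._max = max(self._revs)
        return self._max


s = CachedSet([4, 9, 2])
print(s.max(), s.max(), s.scans)  # 9 9 1
```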
Yuya Nishihara
ef5e39e49c revset: mark reachablerootspure as private 2015-08-28 11:15:31 +09:00
Yuya Nishihara
d0b6532f54 reachableroots: construct and sort baseset in revset module
This removes the dependency from changelog to revset, which seems a bit awkward
to me.
2015-08-28 11:14:24 +09:00
Pierre-Yves David
e12322b5c9 reachableroots: use smartset min
smartset min is likely to be optimised, cached, or to have some other magical property.
2015-08-21 16:12:24 -07:00
Pierre-Yves David
ceddd0bffc reachableroots: sort the smartset in the pure version too
Changeset 79b4c33e868f uses smartset lazy sorting for the C version. We need to
apply the same to the pure version for consistency. This fixes the tests
with --pure.
2015-08-24 15:40:42 -07:00
Pierre-Yves David
dfa99ba207 baseset: keep the input set around
Baseset needs a list to operate, but will convert that list back to a set for
membership testing. It seems a bit silly to convert the set into a list to
convert it back afterward.
2015-08-20 17:19:56 -07:00
Yuya Nishihara
7f0aba37f0 reachableroots: use internal "revstates" array to test if rev is a root
The main goal of this patch series is to reduce the use of PyXxx() functions
that are likely to require ugly error handling and inc/decref. Plus, this is
faster than using PySet_Contains().

  revset #0: 0::tip
  0) 0.004168
  1) 0.003678  88%

This patch ignores out-of-range roots as they are in the pure implementation.
Because reachable sets are calculated from heads, and out-of-range heads raise
IndexError, we can just take out-of-range roots as unreachable. Otherwise,
the test of "hg log -Gr '. + wdir()'" would fail.

"heads" argument is changed to a list. Should we have to rename the C function
as its signature is changed?
2015-08-14 15:43:29 +09:00
Laurent Charignon
2884b0bddb reachableroots: default to the C implementation
This patch is part of a series of patches to speed up the computation of
revset.reachableroots by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of reachableroots is 10-50x faster and smartlog is
2x-5x faster.

Before this patch, reachableroots was computed in pure Python by default. This
patch makes the C implementation the default and provides a speedup for
reachableroots.
2015-08-06 22:11:20 -07:00
Laurent Charignon
d803ae95b1 revset: rename revsbetween to reachableroots and add an argument
This patch is part of a series of patches to speed up the computation of
revset.revsbetween by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of revsbetween is 10-50x faster and smartlog is
2x-5x faster.

This patch renames 'revsbetween' to 'reachableroots' and makes the computation of
the full path optional. This will allow graphlog to compute grandparents using
'reachableroots' and remove the need for a dedicated grandparent function.
2015-06-19 20:18:54 -07:00
Laurent Charignon
c508574727 revset: make revsbetween public
This patch is part of a series of patches to speed up the computation of
revset.revsbetween by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of revsbetween is 10-50x faster and smartlog is
2x-5x faster.

Later in this series, we want to reuse the implementation of revsbetween in the
changelog module; therefore, we make it public.
2015-08-07 02:13:42 -07:00
Matt Mackall
6c04738a65 merge with stable 2015-08-10 15:30:28 -05:00
Gregory Szorc
754028767d revset: use absolute_import 2015-08-08 18:36:58 -07:00
Yuya Nishihara
6bf30cb038 revset: prevent crash caused by empty group expression while optimizing "or"
An empty group expression "()" generates None in AST, so it should be tested
before destructuring a tuple.

"A | ()" is still evaluated to an error because I'm not sure whether "()"
represents an empty set or an empty expression (= a unit value). They are
identical in "or" operation, but they should be evaluated differently in
"and" operation.

  expression  empty set  unit value
  ----------  ---------  ----------
  ()          {}         A
  A & ()      {}         A
  A | ()      A          A
2015-08-09 16:09:41 +09:00
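The distinction drawn in 6bf30cb038 between "()" as an empty set and "()" as a unit value can be illustrated with ordinary Python sets. This sketch makes an assumption that is not in the commit message: it models the unit value as the identity element of the operator it appears under (the full universe for "and", the empty set for "or").

```python
UNIVERSE = frozenset(range(5))  # hypothetical set of all revisions
A = frozenset({1, 3})
EMPTY = frozenset()

# Interpretation 1: "()" is the empty set.
and_as_empty_set = A & EMPTY  # nothing survives the intersection
or_as_empty_set = A | EMPTY   # A is unchanged

# Interpretation 2: "()" is a unit value, i.e. it leaves A unchanged.
and_as_unit = A & UNIVERSE    # identity for "and" is the universe
or_as_unit = A | EMPTY        # identity for "or" is the empty set

print(sorted(and_as_empty_set), sorted(and_as_unit))  # [] [1, 3]
```

Both interpretations agree for "or" and disagree for "and", which matches the last two rows of the table in the commit message.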
Yuya Nishihara
1ebcb08eb6 revset: prevent crash caused by empty group expression while optimizing "and"
An empty group expression "()" generates None in AST, so the optimizer has
to test it before destructuring a tuple. The error message, "missing argument",
is somewhat obscure, but it should be better than a crash.
2015-08-09 16:06:36 +09:00
Yuya Nishihara
59f7e7c7df revset: make balanced addsets by orset() without using _combinesets()
As scmutil.revrange() was rewritten to not use _combinesets(), we no longer
need _combinesets().
2015-07-05 12:50:09 +09:00
Yuya Nishihara
cfbb764a2a revset: add matchany() to construct OR expression from a list of specs
This will allow us to optimize "-rREV1 -rREV2 ..." command-line options.
2015-08-07 21:39:38 +09:00
Yuya Nishihara
938e25baff revset: split post-parsing stage from match()
_makematcher() will be reused by the new matchany(ui, specs, repo=None) function
that I'll add in the next patch.
2015-08-07 21:31:16 +09:00
Yuya Nishihara
f7a6661b37 revset: parse nullary ":" operator as "0:tip"
This is necessary for compatibility with the old-style parser that will be
removed by future patches.
2015-07-05 12:15:54 +09:00
Yuya Nishihara
b4caf94446 parser: separate actions for primary expression and prefix operator
This will allow us to define both a primary expression, ":", and a prefix
operator, ":y". The ambiguity will be resolved by the next patch.

Prefix actions in elements table are adjusted as follows:

  original prefix      primary  prefix
  -----------------    -------- -----------------
  ("group", 1, ")") -> n/a      ("group", 1, ")")
  ("negate", 19)    -> n/a      ("negate", 19)
  ("symbol",)       -> "symbol" n/a
2015-07-05 12:02:13 +09:00
Yuya Nishihara
a85993b95a revset: port parsing rule of old-style ranges from scmutil.revrange()
The old-style parser will be removed soon.
2015-07-18 23:30:17 +09:00
Yuya Nishihara
4645c24be5 parser: fill invalid infix and suffix actions by None
This can simplify the expansion of (prefix, infix, suffix) actions.
2015-07-05 11:17:22 +09:00
Yuya Nishihara
b677e35b5b parser: add comment about structure of elements to each table 2015-07-05 11:06:58 +09:00
Yuya Nishihara
b9bc142035 revset: rename getkwargs() to getargsdict()
This function was added recently at c1a643334daf, but its name was misleading
because it processes both positional and keyword arguments.
2015-07-02 21:39:31 +09:00
Yuya Nishihara
33dcb19532 revset: work around x:y range where x or y is wdir()
All revisions must be contiguous in spanset, so we need the special case
for the wdir revision.
2015-06-28 16:08:07 +09:00
Yuya Nishihara
3732960ab3 revset: use integer representation of wdir() in revset
This is the simplest way to handle wdir() revision in revset. None didn't
work well because revset heavily depends on integer operations such as min(),
max(), sorted(), x:y, etc.

One downside is that we cannot do "wctx.rev() in set" because wctx.rev() is
still None. We could wrap the result set by wdirproxyset that translates None
to wdirrev, but it seems overengineered at this point.

    result = getset(repo, subset, tree)
    if 'wdir' in funcsused(tree):
        result = wdirproxyset(result)

Test cases need the '(all() + wdir()) &' hack because we have yet to fix the
bootstrapping issue of null and wdir.
2015-03-16 16:17:06 +09:00
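The reasoning in 3732960ab3 (None does not play well with integer operations) can be demonstrated with a small sketch. The sentinel value and the helper name are assumptions made for this illustration. On Python 3, None cannot be ordered against integers at all; on the Python 2 of the time it compared as smaller than every integer, which would have placed the working directory before revision 0.

```python
WDIR_REV = 0x7FFFFFFF  # assumed integer sentinel for the working directory


def torev(rev):
    """Translate a context's rev() (None for the working directory) to an int."""
    return WDIR_REV if rev is None else rev


revs_with_none = [3, None, 1]
revs_with_int = [torev(r) for r in revs_with_none]

try:
    sorted(revs_with_none)
    none_is_sortable = True
except TypeError:
    none_is_sortable = False  # Python 3: None and int are unorderable

ordered = sorted(revs_with_int)  # the working directory sorts last
newest = max(revs_with_int)      # and is the maximum

# Membership needs the translation, because rev() itself still returns None.
found = torev(None) in set(revs_with_int)
```

The last line is the downside the commit message mentions: a bare "wctx.rev() in set" test fails unless the None is translated first.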
Pierre-Yves David
09075e5712 revset: prefetch method in "parents"
As already demonstrated, saving attribute lookup gains us some minor but
noticeable performance improvements.

revset #0: parents(all())
before) 0.024169
after ) 0.022756  94%
2015-07-02 23:46:18 -07:00
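The "prefetch method" technique from 09075e5712 means binding a method to a local name once, outside the loop, so the attribute lookup is not repeated per iteration. A minimal sketch with a toy 'parentrevs' function (hypothetical; it stands in for the changelog lookup) showing that both forms compute the same result:

```python
def toy_parentrevs(rev):
    """Hypothetical linear history: each revision's parent is rev - 1."""
    return (rev - 1,) if rev > 0 else ()


def parents_plain(parentrevs, revs):
    ps = set()
    for r in revs:
        ps.update(parentrevs(r))  # attribute lookup on every iteration
    return ps


def parents_prefetched(parentrevs, revs):
    ps = set()
    up = ps.update  # look the bound method up once
    for r in revs:
        up(parentrevs(r))
    return ps


plain = parents_plain(toy_parentrevs, range(5))
prefetched = parents_prefetched(toy_parentrevs, range(5))
print(sorted(prefetched))  # [0, 1, 2, 3]
```

The gain is small per call, which is consistent with the roughly 6% improvement reported in the commit message; no timing is claimed for this sketch.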
Yuya Nishihara
329cd61d62 revset: port extra() to support keyword arguments
This is an example to show how keyword arguments are processed.
2015-06-28 22:57:33 +09:00