Commit Graph

824 Commits

Author SHA1 Message Date
Pierre-Yves David
38693c451d destutil: allow to specify an explicit source for the merge
We can now specify from where the merge is performed. The experimental revset
is updated to take revisions as an argument, which makes the feature testable.

This will become very useful for picking the 'rebase' default destination. For this
reason, we also exclude all descendants of the rebased set from the candidate
destinations. This descendant exclusion was not necessary for merge, as the default
destination would not be picked from anything other than a head.

I'm not super excited about the current error messages, but I would prefer to
delay an overall message rework until 'hg rebase' is done getting a default
destination aligned with 'hg merge'.
2016-02-08 19:32:29 +01:00
Matt Mackall
fd4d3ffdae merge with stable 2016-02-07 00:49:31 -06:00
Yuya Nishihara
6b8c99a6d3 revset: flatten chained 'list' operations (aka function args) (issue5072)
The internal _matchfiles() function can take a bunch of arguments, which would
lead to a maximum recursion depth error. This patch avoids the excessive
stack use by flattening 'list' nodes beforehand.

Since getlist() no longer takes nested 'list' nodes, _parsealiasdecl()
also needs to flatten the argument list, "aliasname($1, $2, ...)".
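
As an illustration of the flattening idea, here is a minimal sketch. It assumes
parse trees are nested tuples whose first element is the node type; the name
'flattenlist' is hypothetical and this is not necessarily how the patch itself
is written.

```python
def flattenlist(tree):
    """Flatten ('list', ('list', a, b), c) into ('list', a, b, c).

    An explicit stack replaces recursion, so a deeply chained argument
    list cannot hit the interpreter's recursion limit.
    """
    if not (isinstance(tree, tuple) and tree and tree[0] == 'list'):
        return tree
    items = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple) and node and node[0] == 'list':
            # Push children in reverse so they are popped in source order.
            stack.extend(reversed(node[1:]))
        else:
            items.append(node)
    return ('list',) + tuple(items)
```

The result is a single 'list' node whose children are the arguments in their
original order, however deeply the input was nested.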
2016-02-02 23:49:49 +09:00
Matt Mackall
e2cfbb7c54 log: speed up single file log with hidden revs (issue4747)
On repos with lots of heads, the filelog() code could spend several
minutes decompressing manifests. This change instead tries to
efficiently scan the changelog for candidates and decompress as few
manifests as possible. This is a regression introduced in 3.3 by the
linkrev adjustment code. Prior to that, filelog was nearly instant.

For the repo in the bug report, this improves time of a simple log
command from ~3 minutes to ~.5 seconds, a 360x speedup.

For the main Mercurial repo, a log of commands.py slows down from
1.14s to 1.45s, a 27% slowdown. This is still faster than the file()
revset, which takes 2.1 seconds.
2016-01-22 12:08:20 -06:00
Durham Goode
82c3cb9aed revset: use manifest.matches in _follow revset
The old _follow revset iterated over every file in the commit and checked if it
matched. For repos with large manifests, this could take 500ms. By switching to
use manifest.matches() we can take advantage of the fastpaths built in to
manifest.py that allow iterating over only the files in the matcher when it's a
simple matcher. This brings the time spent down from 500ms to 0ms during simple
operations like 'hg log -f file.txt'.
2016-02-05 13:30:25 -08:00
timeless
ebb1d48658 cleanup: remove superfluous space after space after equals (python) 2015-12-31 08:16:59 +00:00
FUJIWARA Katsunori
3a913aa7a9 revset: use decorator to mark a predicate as safe
Using a decorator can localize changes for adding (or removing) a "safe"
revset predicate function in source code.

To avoid accidentally treating unsuitable predicates as safe, this
patch uses False as the default value of the "safe" argument. This forces safe
predicates to be decorated with an explicit 'safe=True'.
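
A minimal sketch of the opt-in pattern described above. The table names and the
two example predicates are illustrative assumptions, not a copy of the actual
revset module.

```python
symbols = {}         # predicate name -> function
safesymbols = set()  # names explicitly marked as safe

def predicate(decl, safe=False):
    """Register a predicate; it is treated as unsafe unless safe=True."""
    name = decl.split('(', 1)[0]
    def decorator(func):
        symbols[name] = func
        if safe:
            safesymbols.add(name)
        return func
    return decorator

@predicate('all()', safe=True)
def getall(repo, subset, x):
    return subset

@predicate('grep(regex)')  # no explicit safe=True, so not marked safe
def grep(repo, subset, x):
    return subset
```

Because the default is False, forgetting the argument can only make a predicate
more restricted, never less.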
2015-12-29 23:58:30 +09:00
FUJIWARA Katsunori
4d06739a86 revset: use delayregistrar to register predicate in extension easily
The previous patch introduced the 'revset.predicate' decorator to register
revset predicate functions easily.

But it shouldn't be used in an extension directly, because it registers
the specified function immediately. The registration can't be undone,
even if extension loading fails after that.

Therefore, registration should be delayed until 'uisetup()' or so.

This patch uses the 'extpredicate' decorator derived from 'delayregistrar'
to register predicates in extensions easily.

This patch also tests whether 'registrar.delayregistrar' avoids
function registration if 'setup()' isn't invoked on it, because
'extpredicate' is the first user of it.
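
The delayed registration idea can be sketched as follows. 'DelayedRegistrar' is
a hypothetical stand-in written for this illustration; it is not the actual
'registrar.delayregistrar' class.

```python
class DelayedRegistrar(object):
    """Collect decorated functions; add them to the table only on setup()."""

    def __init__(self, table):
        self._table = table
        self._pending = []

    def __call__(self, name):
        def decorator(func):
            # Remember the function, but do not touch the table yet.
            self._pending.append((name, func))
            return func
        return decorator

    def setup(self):
        # Called once the extension is known to have loaded successfully.
        for name, func in self._pending:
            self._table[name] = func
        self._pending = []

table = {}
extpredicate = DelayedRegistrar(table)

@extpredicate('mypred')
def mypred(repo, subset, x):
    return subset
```

Until 'extpredicate.setup()' is called, 'table' stays empty, so a failed
extension load leaves no stray registration behind.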
2015-12-29 23:58:30 +09:00
FUJIWARA Katsunori
3a36e78620 revset: use decorator to register a function as revset predicate
Using a decorator can localize changes for adding (or removing) a revset
predicate function in source code.

It is also useful for picking out predicates for a specific purpose. For
example, a subsequent patch marks predicates as "safe" by decorator.

This patch defines 'parsefuncdecl()' in the 'funcregistrar' class, because
this implementation can be used by other decorator classes for fileset
predicates and template functions.
2015-12-29 23:58:30 +09:00
timeless
60432cef00 revset: add hint for list error to use or 2015-12-23 17:54:03 +00:00
Laurent Charignon
5e7ee9a128 log: speed up hg log <file|folder>
This patch makes hg log <file|folder> faster by using changelog.readfiles
instead of changelog.read.
On our large repos for hg log <file|folder> -l5 operations that were taking:
- ~8s I see a 25% improvement
- ~15s, I see a 35% improvement
For recently modified folder/file, the difference is negligible as we don't
have to consider many revisions.
2015-12-18 12:54:45 -08:00
timeless
7451a5cbc6 grammar: favor zero, one, two over ... or no 2015-11-30 19:30:16 +00:00
Pierre-Yves David
5269ce9dff revset: speed up '_matchfiles'
File matching is done by applying the matcher to all elements in the 'file'
field of all changesets in the repository. This requires reading and parsing all
changesets in the repository and doing a lot of matching. However, about 1/3 of the
time of the function is used to create 'changectx' objects and retrieve their
'file' field.

This is far too much overhead, so we skip the changectx layer and
access the data directly from the changelog. This provides a significant
speedup:

repository: mozilla central 252524 revisions
command: hg perfrevset '_matchfiles("p:browser")'
Before: 15.899687s
After:  10.011705s

The overhead avoided is even larger if you have a lot of namespaces that slow
down lookup.

The time is now spent with this approximate breakdown:

  Matcher: 20%
    regexp matching: 10%
  changelog.read: 80%
    reading revision: 60%
      checking hash: 15%
      decompression: 15%
      reading chunk: 30%
    changelog parsing: 20%
      decoding to local: 10%

The next easy win is probably to have more of the changelog stack implemented
using the CPython api.
2015-11-18 23:23:03 -08:00
timeless@mozdev.org
716b455ed5 l10n: use %d instead of %s for numbers 2015-10-14 22:29:03 -04:00
Pierre-Yves David
4664d62d42 revset: rename and test '_destmerge'
We make the name consistent with the one used by '_destupdate' and we ensure the
code is run by testing it (an abort is expected, just as merge would abort).
2015-10-15 01:47:28 +01:00
Pierre-Yves David
614a2e0418 destutil: move default merge destination into a function
Functions in destutil are much simpler to wrap and more flexible than revsets.
This also helps consistency, as 'destupdate' lives here and cannot become a pure
revset anyway.
2015-10-15 01:11:00 +01:00
Pierre-Yves David
fb61444d77 revset: reintroduce an experimental revset for update destination
The revset is not ready for prime time yet. However it is useful to have some
version of it exposed to help candidate users to play with it and provide
feedback on what we should aim at.

We add a small test to make sure the code runs.
2015-10-15 01:35:44 +01:00
Yuya Nishihara
c7bc2fcfb9 revset: add optional offset argument to limit() predicate
It's common for a GUI or web frontend to fetch revisions in chunks of a given
batch size. Previously this was possible only if revisions were sorted by revision
number.

  $ hg log -r 'limit({revspec} & :{last_known}, 101)'

So this patch introduces a general way to retrieve a chunk of revisions after
skipping 'offset' revisions.

  $ hg log -r 'limit({revspec}, 100, {last_count})'

This is a dumb implementation. We can optimize it for baseset and spanset
later.
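
For illustration, the semantics of the offset argument can be sketched over a
plain iterable. This is a simplified stand-in that assumes nothing about
smartsets and is not the implementation in the patch.

```python
from itertools import islice

def limit(revs, n=1, offset=0):
    """Return up to n revisions after skipping the first 'offset' ones."""
    return list(islice(revs, offset, offset + n))
```

With this shape, a frontend can request page k of size 100 with
'limit(revs, 100, 100 * k)' regardless of how 'revs' is ordered.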
2015-03-24 00:28:28 +09:00
Yuya Nishihara
772342e4a1 revset: port limit() to support keyword arguments
The next patch will introduce the third 'offset' argument. This allows us
to specify 'offset' without 'n' argument.
2015-10-12 17:19:22 +09:00
Yuya Nishihara
516c2f1e3b revset: eliminate temporary reference to subset in limit() and last() 2015-10-12 17:14:47 +09:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Pierre-Yves David
450d49c4b8 revset: delete _updatedefaultdest as it has no users
The revset is not used anywhere anymore. We delete the function until we use
(and therefore test) it again.
2015-10-05 02:33:45 -07:00
Pierre-Yves David
295091442c update: move default destination computation to a function
We ultimately want this to be accessible through a revset, but there is too
much complexity here for that to work. In particular, we'll have to return more
than just the destination to control the behavior (e.g. bookmarks to activate,
etc.).

To prevent an import cycle, a new module is created. It will receive other
destination/behavior functions in the future.
2015-10-05 01:46:47 -07:00
Yuya Nishihara
e482010d79 revset: strip off "literal:" prefix from bookmark not found error
This is what branch() and tag() do.
2015-10-07 23:04:31 +09:00
Yuya Nishihara
af229fbc94 revset: do not fall through to revspec for literal: branch (issue4838)
If "literal:" is specified, it must not be a revset expression. It should
error out with a better message.
2015-10-07 23:00:29 +09:00
Matt Harbison
bb1dafe069 util: extract stringmatcher() from revset
This is used to match against tags, bookmarks, etc in revsets.  It will be used
in a future patch to do the same tag matching in templater.
2015-08-22 22:52:18 -04:00
Pierre-Yves David
980dfb1fe1 revset: avoid implicit None testing in revset
Implicit None testing is a very good way to get in trouble. We explicitly test
for None.
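
A short illustration of why the implicit test is risky: revision 0 is a valid
revision but is falsy. The function names here are hypothetical.

```python
def describe_implicit(rev):
    # Implicit test: revision 0 is wrongly treated like "no revision".
    if rev:
        return 'rev %d' % rev
    return 'no revision'

def describe_explicit(rev):
    # Explicit test: only None means "no revision".
    if rev is not None:
        return 'rev %d' % rev
    return 'no revision'
```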
2015-09-23 00:41:07 -07:00
Durham Goode
b02cd211f8 revset: speed up existence checks for ordered filtered sets
Previously, calling 'if foo:' on an ordered filtered set would start iterating in
whatever the current direction was and return if a value was available. If the
current direction was ascending, but the set had a fastdesc available, this
meant we did a lot more work than necessary.

If this was applied without my previous max/min fixes, it would improve max()
performance (this was my first attempt at fixing the issue). Since those
previous fixes went in though, this doesn't have a visible benefit in the
benchmarks, but it does seem clearly better than it was before so I think it
should still go in.
2015-09-20 16:53:42 -07:00
Durham Goode
dcc5c5ec45 revset: remove existence check from min() and max()
min() and max() would first do an existence check. Unfortunately existence
checks can be slow in certain situations (like if the smartset is a list, and
quickly iterable in both ascending and descending directions, then doing an
existence check will start from the bottom, even if you want to check the
max()).

The fix is to not do the check, and just handle the error if it happens. In a
large repo, this speeds up:

hg log -r 'max(parents(. + .^) - (. + .^)  & ::master)'

from 3.5s to 0.85s. That revset is contrived and just for testing. In our
real case we used 'bundle()' in place of '. + .^'
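
The "skip the check and handle the error" approach can be sketched on plain
Python iterables. This is an assumption-laden illustration using the builtin
max(), not the smartset code itself.

```python
def max_or_none(revs):
    """Return the largest revision, or None for an empty collection."""
    try:
        # No existence check up front: just try, and handle the failure.
        return max(revs)
    except ValueError:
        # The builtin max() raises ValueError on an empty iterable.
        return None
```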

Interesting perf numbers for the revset benchmarks:

max(draft() and ::tip) =>  0.027s to 0.0005s
max(author(lmoscovicz)) => 2.48s to 0.57s

min doesn't show any perf changes, but changing it as well will prevent a perf
regression in my next patch.

Result from revset benchmark

revset #0: draft() and ::tip
   min           max
0) 0.001971      0.001991
1) 0.001965      0.000428  21%

revset #1: ::tip and draft()
   min           max
0) 0.002017      0.001912
1) 0.001896  94% 0.000421  22%

revset #2: author(lmoscovicz)
   min           max
0) 1.049033      1.358913
1) 1.042508      0.319824  23%

revset #3: author(lmoscovicz) or author(mpm)
   min           max
0) 1.042512      1.367432
1) 1.019750      0.327750  23%

revset #4: author(mpm) or author(lmoscovicz)
   min           max
0) 1.050135      0.324924
1) 1.070698      0.319913

revset #5: roots((tip~100::) - (tip~100::tip))
   min           max
0) 0.000671      0.001018
1) 0.000605  90% 0.000946  92%

revset #6: roots((0::) - (0::tip))
   min           max
0) 0.149714      0.152369
1) 0.098677  65% 0.100374  65%

revset #7: (20000::) - (20000)
   min           max
0) 0.051019      0.042747
1) 0.035586  69% 0.016267  38%
2015-09-20 19:27:53 -07:00
Pierre-Yves David
182447758e update: move default destination into a revset
This is another step toward having "default" destination more clear and unified.
Not all the logic is there because some bookmark related computation happened
elsewhere. It will be moved later.

The function is private because as for the other ones, cleanup is needed before
we can proceed.
2015-09-18 17:23:10 -07:00
Pierre-Yves David
14615c5363 merge: move default destination computation in a revset
This is another step toward having "default" destination more clear and unified.
2015-09-17 14:03:15 -07:00
Yuya Nishihara
b5477ed9b3 revset: handle error of string unescaping 2015-09-10 23:29:55 +09:00
Yuya Nishihara
4c181af0c1 revset: uncache filteredset.__contains__
Since ca895be75c36, condition function returns a cached value, so there's
little benefit to cache __contains__.

No measurable difference found in contrib/base-revsets.txt.
2015-09-05 12:56:53 +09:00
Durham Goode
151bf91f0c revset: fix resolving strings from a list
When using multiple revsets that get optimized into a list (like
hg log -r r1235 -r r1237 in hgsubversion), the revset list code was assuming the
strings were resolvable via repo[X]. hgsubversion and other extensions override
def stringset() to allow processing different revision identifiers (such as
r1235 or g<githash>), and therefore the _list() implementation was circumventing
that resolution.

The fix is to just call stringset(). The default implementation does the same
thing that _list was already doing (namely repo[X]).

This has always been broken, but it was recently exposed by ad142c72c6db which
made "--rev X --rev Y" produce a combined revset "X | Y".
2015-09-01 16:46:05 -07:00
liscju
e42f8565a1 revsets: make follow() support file patterns (issue4757) (BC)
Before this patch, follow() only supported full, exact filenames.
This patch makes the follow() argument be treated as a file
pattern, the same way log treats its arguments.

It preserves the current behaviour of follow() matching paths
relative to the repository root by default.
2015-08-20 17:19:32 +02:00
Pierre-Yves David
1983912bde revset: cache smartset's min/max
As the content of a smartset never changes, min and max will never change
either.  This will save us time when this function is called multiple times.
This is relevant for issue4782 but does not fix it.
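
The caching idea can be sketched like this. 'CachedMinSet' and its 'computed'
counter are hypothetical names used only to make the effect observable.

```python
class CachedMinSet(object):
    """Immutable collection of revisions that computes min() only once."""

    def __init__(self, revs):
        self._revs = tuple(revs)
        self._min = None
        self.computed = 0  # how many times the real computation ran

    def min(self):
        # Safe to cache: the content never changes after construction.
        if self._min is None:
            self.computed += 1
            self._min = min(self._revs)
        return self._min
```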
2015-08-27 17:57:33 -07:00
Yuya Nishihara
ef5e39e49c revset: mark reachablerootspure as private 2015-08-28 11:15:31 +09:00
Yuya Nishihara
d0b6532f54 reachableroots: construct and sort baseset in revset module
This removes the dependency from changelog to revset, which seems a bit awkward
to me.
2015-08-28 11:14:24 +09:00
Pierre-Yves David
e12322b5c9 reachableroots: use smartset min
A smartset's min is likely to be optimised, cached, or to have some other magical property.
2015-08-21 16:12:24 -07:00
Pierre-Yves David
ceddd0bffc reachableroots: sort the smartset in the pure version too
Changeset 79b4c33e868f uses smartset lazy sorting for the C version. We need to
apply the same to the pure version for consistency. This is fixing the tests
with --pure.
2015-08-24 15:40:42 -07:00
Pierre-Yves David
dfa99ba207 baseset: keep the input set around
Baseset needs a list to operate, but will convert that list back to a set for
membership testing. It seems a bit silly to convert the set into a list to
convert it back afterward.
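
A minimal sketch of the idea, assuming a much simplified class. 'SimpleBaseSet'
is hypothetical and far smaller than the real baseset.

```python
class SimpleBaseSet(object):
    """Keep the caller's set for membership tests instead of rebuilding it."""

    def __init__(self, data=()):
        if isinstance(data, (set, frozenset)):
            self._set = data   # reuse the input set as-is
        else:
            self._set = None   # built lazily on first membership test
        self._list = list(data)

    def __contains__(self, rev):
        if self._set is None:
            self._set = set(self._list)
        return rev in self._set

    def __iter__(self):
        return iter(self._list)
```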
2015-08-20 17:19:56 -07:00
Yuya Nishihara
7f0aba37f0 reachableroots: use internal "revstates" array to test if rev is a root
The main goal of this patch series is to reduce the use of PyXxx() functions
that are likely to require ugly error handling and inc/decref. Plus, this is
faster than using PySet_Contains().

  revset #0: 0::tip
  0) 0.004168
  1) 0.003678  88%

This patch ignores out-of-range roots as they are in the pure implementation.
Because reachable sets are calculated from heads, and out-of-range heads raise
IndexError, we can just take out-of-range roots as unreachable. Otherwise,
the test of "hg log -Gr '. + wdir()'" would fail.

The "heads" argument is changed to a list. Should we rename the C function
now that its signature has changed?
2015-08-14 15:43:29 +09:00
Laurent Charignon
2884b0bddb reachableroots: default to the C implementation
This patch is part of a series of patches to speed up the computation of
revset.reachableroots by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of reachableroots is 10-50x faster and smartlog is
2x-5x faster.

Before this patch, reachableroots was computed in pure Python by default. This
patch makes the C implementation the default and provides a speedup for
reachableroots.
2015-08-06 22:11:20 -07:00
Laurent Charignon
d803ae95b1 revset: rename revsbetween to reachableroots and add an argument
This patch is part of a series of patches to speed up the computation of
revset.revsbetween by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of revsbetween is 10-50x faster and smartlog is
2x-5x faster.

This patch renames 'revsbetween' to 'reachableroots' and makes the computation of
the full path optional. This will allow graphlog to compute grandparents using
'reachableroots' and remove the need for a dedicated grandparent function.
2015-06-19 20:18:54 -07:00
Laurent Charignon
c508574727 revset: make revsbetween public
This patch is part of a series of patches to speed up the computation of
revset.revsbetween by introducing a C implementation. The main motivation is to
speed up smartlog on big repositories. At the end of the series, on our big
repositories the computation of revsbetween is 10-50x faster and smartlog is
2x-5x faster.

Later in this series, we want to reuse the implementation of revsbetween in the
changelog module; therefore, we make it public.
2015-08-07 02:13:42 -07:00
Matt Mackall
6c04738a65 merge with stable 2015-08-10 15:30:28 -05:00
Gregory Szorc
754028767d revset: use absolute_import 2015-08-08 18:36:58 -07:00
Yuya Nishihara
6bf30cb038 revset: prevent crash caused by empty group expression while optimizing "or"
An empty group expression "()" generates None in AST, so it should be tested
before destructuring a tuple.

"A | ()" is still evaluated to an error because I'm not sure whether "()"
represents an empty set or an empty expression (= a unit value). They are
identical in "or" operation, but they should be evaluated differently in
"and" operation.

  expression  empty set  unit value
  ----------  ---------  ----------
  ()          {}         A
  A & ()      {}         A
  A | ()      A          A
2015-08-09 16:09:41 +09:00
Yuya Nishihara
1ebcb08eb6 revset: prevent crash caused by empty group expression while optimizing "and"
An empty group expression "()" generates None in AST, so the optimizer has
to test it before destructuring a tuple. The error message, "missing argument",
is somewhat obscure, but it should be better than a crash.
2015-08-09 16:06:36 +09:00
Yuya Nishihara
59f7e7c7df revset: make balanced addsets by orset() without using _combinesets()
As scmutil.revrange() was rewritten to not use _combinesets(), we no longer
need _combinesets().
2015-07-05 12:50:09 +09:00
Yuya Nishihara
cfbb764a2a revset: add matchany() to construct OR expression from a list of specs
This will allow us to optimize "-rREV1 -rREV2 ..." command-line options.
2015-08-07 21:39:38 +09:00
Yuya Nishihara
938e25baff revset: split post-parsing stage from match()
_makematcher() will be reused by the new matchany(ui, specs, repo=None) function
that I'll add in the next patch.
2015-08-07 21:31:16 +09:00
Yuya Nishihara
f7a6661b37 revset: parse nullary ":" operator as "0:tip"
This is necessary for compatibility with the old-style parser that will be
removed by future patches.
2015-07-05 12:15:54 +09:00
Yuya Nishihara
b4caf94446 parser: separate actions for primary expression and prefix operator
This will allow us to define both a primary expression, ":", and a prefix
operator, ":y". The ambiguity will be resolved by the next patch.

Prefix actions in elements table are adjusted as follows:

  original prefix      primary  prefix
  -----------------    -------- -----------------
  ("group", 1, ")") -> n/a      ("group", 1, ")")
  ("negate", 19)    -> n/a      ("negate", 19)
  ("symbol",)       -> "symbol" n/a
2015-07-05 12:02:13 +09:00
Yuya Nishihara
a85993b95a revset: port parsing rule of old-style ranges from scmutil.revrange()
The old-style parser will be removed soon.
2015-07-18 23:30:17 +09:00
Yuya Nishihara
4645c24be5 parser: fill invalid infix and suffix actions by None
This can simplify the expansion of (prefix, infix, suffix) actions.
2015-07-05 11:17:22 +09:00
Yuya Nishihara
b677e35b5b parser: add comment about structure of elements to each table 2015-07-05 11:06:58 +09:00
Yuya Nishihara
b9bc142035 revset: rename getkwargs() to getargsdict()
This function was added recently at c1a643334daf, but its name was misleading
because it processes both positional and keyword arguments.
2015-07-02 21:39:31 +09:00
Yuya Nishihara
33dcb19532 revset: work around x:y range where x or y is wdir()
All revisions must be contiguous in spanset, so we need the special case
for the wdir revision.
2015-06-28 16:08:07 +09:00
Yuya Nishihara
3732960ab3 revset: use integer representation of wdir() in revset
This is the simplest way to handle wdir() revision in revset. None didn't
work well because revset heavily depends on integer operations such as min(),
max(), sorted(), x:y, etc.

One downside is that we cannot do "wctx.rev() in set" because wctx.rev() is
still None. We could wrap the result set by wdirproxyset that translates None
to wdirrev, but it seems overengineered at this point.

    result = getset(repo, subset, tree)
    if 'wdir' in funcsused(tree):
        result = wdirproxyset(result)

Test cases need the '(all() + wdir()) &' hack because we have yet to fix the
bootstrapping issue of null and wdir.
2015-03-16 16:17:06 +09:00
Pierre-Yves David
09075e5712 revset: prefetch method in "parents"
As already demonstrated, saving attribute lookup gains us some minor but
noticeable performance improvements.

revset #0: parents(all())
before) 0.024169
after ) 0.022756  94%
2015-07-02 23:46:18 -07:00
Yuya Nishihara
329cd61d62 revset: port extra() to support keyword arguments
This is an example to show how keyword arguments are processed.
2015-06-28 22:57:33 +09:00
Yuya Nishihara
d1927459b6 revset: add function to build dict of positional and keyword arguments
Keyword arguments will be convenient for functions that take more than
one optional or boolean flag. For example,

  file(pattern[, subrepos=false])
  subrepo([[pattern], status])

Because I don't think all functions should accept key=value syntax, getkwargs()
does not support variadic functions such as 'ancestor(*changeset)'.

The core logic is placed in the parser module because keyword arguments will
be more useful in the templater, where functions take more options. Test cases
will be added by the next patch.
2015-06-27 17:25:01 +09:00
Yuya Nishihara
411c9c1693 revset: add parsing rule for key=value pair
It will be used as a keyword argument.

Note that our "=" operator is left-associative. In general, the assignment
operator is right-associative, but we don't care because it isn't allowed to
chain "=" operations.
2015-06-27 17:05:28 +09:00
Matt Harbison
b41110155b revset: fix a crash in parents() when 'wdir()' is in the set
The crash was "TypeError: expected string or Unicode object, NoneType found"
down in revlog.parentrevs().  This fixes heads() too (which is where I found
it.)
2015-06-29 10:34:56 -04:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
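
For reference, the modern form looks like this. The function is a made-up
example; the old spelling would have been 'except ValueError, inst:'.

```python
def parse_rev(text):
    try:
        return int(text)
    except ValueError as inst:  # valid on Python 2.6+ and Python 3
        return 'invalid: %s' % inst
```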
2015-06-23 22:20:08 -07:00
Yuya Nishihara
fe462ed8ac parser: accept iterator of tokens instead of tokenizer function and program
This can simplify the interface of parse() function. Our tokenizer tends to
have optional arguments other than the message to be parsed.

Before this patch, the "lookup" argument existed only for the revset, and the
templater had to pack [program, start, end] to be passed to its tokenizer.
2015-06-21 00:49:26 +09:00
Pierre-Yves David
4686ca0994 revset: rework 'filteredset.last'
'isascending' and 'isdescending' are methods, not attributes. This led 'last()'
to misbehave on some non-ascending filtered sets.
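
The class of bug described above can be reproduced in a few lines. 'FakeSet' is
a hypothetical stand-in for a filtered set.

```python
class FakeSet(object):
    def __init__(self, ascending):
        self._ascending = ascending

    def isascending(self):
        return self._ascending

s = FakeSet(ascending=False)
buggy = bool(s.isascending)      # bound method object: always truthy
correct = bool(s.isascending())  # calling the method gives the real answer
```

Testing the method object instead of calling it silently takes the "ascending"
branch for every set.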
2015-06-22 13:48:01 -07:00
Pierre-Yves David
9c4de6ba91 revset: improve time complexity of 'roots(xxx)'
The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
implementation used to be. However, computing children is expensive because it
is unbounded. Any changeset in the repository may be a child of '0', so you
have to look at all changesets in the repository to compute children(0).
Moreover, the current revset implementation of children is not lazy, leading to
bad performance when fetching the first result.


There is a more restricted algorithm to compute roots:

    roots(X) = [r for r in X if not parents(r) & X]

This achieves the same result while only looking for parent/children relations in
the X set itself, making the algorithm 'O(len(X))' membership operations.
Another advantage is that it turns the check into a simple filter, preserving
all laziness properties of the underlying revsets.
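
The restricted algorithm can be sketched on a toy DAG. The 'parentrevs' callable
and the sample graph are assumptions made for this illustration; the sketch
expects 'revs' to be a re-iterable sequence.

```python
def roots(revs, parentrevs):
    """Yield revisions of 'revs' that have no parent inside 'revs'."""
    members = set(revs)
    for r in revs:
        # Only membership tests against X itself: O(len(X)) lookups.
        if not any(p in members for p in parentrevs(r)):
            yield r

# Toy DAG: revision -> list of parent revisions.
PARENTS = {0: [], 1: [0], 2: [1], 3: [1], 4: [2, 3]}

def toyparents(rev):
    return PARENTS[rev]
```

Since the function is a generator, the first root is available without scanning
the rest of the set.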

The speedup is very significant and some laziness is restored.

-) revset without 'roots(...)' to compare to base line
0) before this change
1) after this change

revset #0: roots((tip~100::) - (tip~100::tip))
   plain         min           last
-) 0.001082      0.000993      0.000790
0) 0.001366      0.001385      0.001339
1) 0.001257  92% 0.001028  74% 0.000821  61%

revset #1: roots((0::) - (0::tip))
   plain         min           last
-) 0.134551      0.144682      0.068453
0) 0.161822      0.171786      0.157683
1) 0.137583  85% 0.146204  85% 0.070012  44%

revset #2: roots(tip~100:)
   plain         min           first         last
-) 0.000219      0.000225      0.000231      0.000229
0) 0.000513      0.000529      0.000507      0.000539
1) 0.000463  90% 0.000269  50% 0.000267  52% 0.000463  85%

revset #3: roots(:42)
   plain         min           first         last
-) 0.000119      0.000146      0.000146      0.000146
0) 0.000231      0.000254      0.000253      0.000260
1) 0.000216  93% 0.000186  73% 0.000184  72% 0.000244  93%

revset #4: roots(not public())
   plain         min           first
-) 0.000478      0.000502      0.000504
0) 0.000611      0.000639      0.000634
1) 0.000604      0.000560  87% 0.000558

revset #5: roots((0:tip)::)
   plain         min           max           first         last
-) 0.057795      0.004905      0.058260      0.004908      0.038812
0) 0.132845      0.118931      0.130306      0.114280      0.127742
1) 0.111659  84% 0.005023   4% 0.111658  85% 0.005022   4% 0.092490  72%

revset #6: roots(0::tip)
   plain         min           max           first         last
-) 0.032971      0.033947      0.033460      0.032350      0.033125
0) 0.083671      0.081953      0.084074      0.080364      0.086069
1) 0.074720  89% 0.035547  43% 0.077025  91% 0.033729  41% 0.083197

revset #7: 42:68 and roots(42:tip)
   plain         min           max           first         last
-) 0.006827      0.000251      0.006830      0.000254      0.006771
0) 0.000337      0.000353      0.000366      0.000350      0.000366
1) 0.000318  94% 0.000297  84% 0.000353      0.000293  83% 0.000351

revset #8: roots(0:tip)
   plain         min           max           first         last
-) 0.002119      0.000145      0.000147      0.000147      0.000147
0) 0.047441      0.040660      0.045662      0.040284      0.043435
1) 0.038057  80% 0.000187   0% 0.034919  76% 0.000186   0% 0.035097  80%

revset #0: roots(:42 + tip~42:)
   plain         min           max           first         last          sort
-) 0.000321      0.000317      0.000319      0.000308      0.000369      0.000343
0) 0.000772      0.000751      0.000811      0.000750      0.000802      0.000783
1) 0.000632  81% 0.000369  49% 0.000617  76% 0.000358  47% 0.000601  74% 0.000642  81%
2015-06-22 10:19:12 -07:00
Pierre-Yves David
e0e715574b revsets: use '&' instead of '.filter' in head
Higher-level operations are more likely to be optimised.
2014-10-10 17:30:09 -07:00
Matt Harbison
198604740e revset: don't suggest private or undocumented queries
I noticed when I mistyped 'matching', that it suggested '_matchfiles' as well.
Rather than simply exclude names that start with '_', this excludes anything
without a docstring.  That way, if it isn't in the help text, it isn't
suggested, such as 'wdir()'.
2015-06-20 10:59:56 -04:00
Pierre-Yves David
b62d73dc41 devel-warn: issue a warning for old style revsets
We moved to the smartset class more than a year ago; we now have the tool to
aggressively nudge developers into upgrading their extensions.
2015-06-19 11:17:11 -07:00
Pierre-Yves David
da0f39bd8a revset: make use of natively-computed set for 'draft()' and 'secret()'
If the computation of a set for each phase (done in C) is available,
we use it directly instead of applying a simple filter. This gives a
massive speed-up in the vast majority of cases.

On my mercurial repo with about 15000 out of 40000 draft changesets:

revset: draft()
   plain         min           first         last
0) 0.011201      0.019950      0.009844      0.000074
1) 0.000284   2% 0.000312   1% 0.000314   3% 0.000315 x4.3

The bad performance for "last" comes from handling the 15000-element set
(memory allocation, filtering hidden changesets (99% of it), etc.) compared to
applying the filter only on a handful of revisions (the first draft changesets
being close to tip).

This is not seen as an issue since:

* Timing is still pretty good and in line with all the other ones,
* Current users of vanilla Mercurial will not have 1/3 of their repo draft.

This bad effect disappears when the phase's set is smaller (about 200 secrets):

revset: secret()
   plain         min           first         last
0) 0.011181      0.022228      0.010851      0.000452
1) 0.000058   0% 0.000084   0% 0.000087   0% 0.000087  19%
2015-06-10 19:18:51 -07:00
Pierre-Yves David
71823c86d9 revset: refactor the non-public phase code
The code for draft and secret is the same. We'll make it more complex to
take advantage of the set computed in C, so we first refactor the
code to only have one place to update (and make sure everything behaves
properly).

We do not refactor the 'public()' code because it does not have a natively
computed set.
2015-06-17 19:19:57 -07:00
Pierre-Yves David
7dc4b61365 revset: translate node directly with changelog in 'head'
Using 'repo[X]' is much slower because it creates a 'changectx' object and goes
through multiple layers of code to do so. It is also error prone if there are
tags, bookmarks, branches or other names that could map to a node hash and take
precedence (users are wicked).

This provides a significant performance boost on repositories with a lot of
heads. Benchmark results for a repo with 1181 heads:

revset: head()
   plain         min           last          reverse
0) 0.014853      0.014371      0.014350      0.015161
1) 0.001402   9% 0.000975   6% 0.000874   6% 0.001415   9%

revset: head() - public()
   plain         min           last          reverse
0) 0.015121      0.014420      0.014560      0.015028
1) 0.001674  11% 0.001109   7% 0.000980   6% 0.001693  11%

revset: draft() and head()
   plain         min           last          reverse
0) 0.015976      0.014490      0.014214      0.015892
1) 0.002335  14% 0.001018   7% 0.000887   6% 0.002340  14%

The speed-up is visible even when other, more costly revsets are in use:

revset: head() and author("mpm")
   plain         min           last          reverse
0) 0.105419      0.090046      0.017169      0.108180
1) 0.090721  86% 0.077602  86% 0.003556  20% 0.093324  86%
2015-06-16 19:47:46 -07:00
Pierre-Yves David
2c584c4990 revset: use a baseset in _notpublic()
The '_notpublic()' internal revset was "returning" a set. That was wrong. We now
return a 'baseset' as appropriate. This has no effect on performance in most cases,
because we do the exact same operation as the combination with a
'fullreposet' was doing. This has a small effect on some operations when combined
with other sets, because we now apply the filtering in all cases. I think the
correctness is worth the impact on some corner cases. The optimizer should take
care of these corner cases anyway.

revset #0: not public()
   plain         min           max           first         last          reverse
0) 0.000465      0.000491      0.000495      0.000500      0.000494      0.000479
1) 0.000484      0.000503      0.000498      0.000505      0.000504      0.000491

revset #1: (tip~1000::) - public()
   plain         min           max           first         last          reverse
0) 0.002765      0.001742      0.002767      0.001730      0.002761      0.002782
1) 0.002847      0.001777      0.002776      0.001741      0.002764      0.002858

revset #2: not public() and branch("default")
   plain         min           max           first         last          reverse
0) 0.012104      0.011138      0.011189      0.011138      0.011166      0.011578
1) 0.011387  94% 0.011738 105% 0.014220 127% 0.011223      0.011184      0.012077

revset #3: (not public() - obsolete())
   plain         min           max           first         last          reverse
0) 0.000583      0.000556      0.000552      0.000555      0.000552      0.000610
1) 0.000613 105% 0.000559      0.000557      0.000573      0.000558      0.000613

revset #4: head() - public()
   plain         min           max           first         last          reverse
0) 0.010869      0.010800      0.011547      0.010843      0.010891      0.010891
1) 0.011031      0.011497 106% 0.011087      0.011100      0.011100      0.011085
2015-06-10 19:58:27 -07:00
Pierre-Yves David
d05c4e4635 revset: ensure we have loaded phases data in '_notpublic()'
If we are the very first rev access (or if the phase cache just got
invalidated) the phasesets will be None even if we support the native
computation. So we explicitly trigger a computation if needed.

This was not an issue before because requesting any phase information
would have triggered such computation.
2015-06-15 16:16:02 -07:00
Pierre-Yves David
48aa2b29f7 revset: use parentsets.min in _children
As stated in the comment, using the smartset 'min' will give more opportunity to
be smart. It gives a small but significant boost to the performance. Most of the
time is still spent doing the actual computation, but at least we can scrape some
performance when it makes sense.

revset #0: roots(0:tip)
   plain
0) 0.046600
1) 0.044109  94%
2015-06-11 19:02:24 -07:00
Pierre-Yves David
ae45bb4c69 revset: prefetch all attributes before loop in _revsbetween
Python is slow at attribute lookup. No, really, I mean -slow-. Prefetching
these three methods gives us a measurable performance boost.

revset #0: 0::tip
   plain
0) 0.037655
1) 0.034290  91%
2015-06-11 11:42:46 -07:00
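The attribute-prefetching idea in the commit above can be illustrated with a minimal sketch. The function `reachable`, the `parentrevs` callable and the toy graph are hypothetical and are not taken from `_revsbetween`; the only point is that bound methods are looked up once, before the loop, instead of on every iteration.

```python
def reachable(parentrevs, start):
    """Collect `start` and all of its ancestors in a toy DAG.

    `parentrevs` is a hypothetical callable mapping a revision number
    to a list of parent revision numbers.
    """
    seen = set()
    stack = [start]
    # Prefetch the bound methods once; inside the loop they are plain
    # local variables, which are cheaper to access than attributes.
    seen_add = seen.add
    stack_pop = stack.pop
    stack_extend = stack.extend
    while stack:
        rev = stack_pop()
        if rev in seen:
            continue
        seen_add(rev)
        stack_extend(parentrevs(rev))
    return seen


# Toy graph: 0 <- 1 <- 2, and 3 merges 1 and 2.
parents = {0: [], 1: [0], 2: [1], 3: [1, 2]}
```

Whether the gain is measurable depends on how hot the loop is; the commit reports roughly 9% on `0::tip`, and this sketch makes no claim about its own speed.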
Pierre-Yves David
062add8bae revset: mark spots that use 'set' instead of 'smartset'
Using smartset is better because we can do more optimisation on it. So we are
marking the faulty spots for later processing.
2015-06-11 15:45:02 -07:00
Pierre-Yves David
fb5a589cc8 revset: mark spot that feeds a set to a baseset
Sets have non-defined order and this should break stuff, but as we are lucky
fullreposet is also broken so the result is "not too bad".

We should fix it anyway, but it is too much for my current plate.
2015-06-11 15:43:11 -07:00
Pierre-Yves David
f5b6e24d29 revset: mark the fact we should use '&' instead of 'filter' in 'head'
I do not have time to fix all this now, let's mark it for later.
2015-06-11 15:37:17 -07:00
Pierre-Yves David
b62dbb27c2 revset: gratuitous formatting fix in keyword
You will be aligned.
2015-06-11 15:36:03 -07:00
Pierre-Yves David
eb81242d78 revset: gratuitous code move in '_children'
As 'cs' is empty at the time of the conditional, we can just return an empty
'baseset' and create the variable later.
2015-06-11 14:27:52 -07:00
Pierre-Yves David
64a4e299bb revset: mark spots that should use 'smartset.min()'
Using smartset's min will be significantly faster when the input set can provide
an optimised answer. I do not have time to fix all of them but I'm marking the
spots.
2015-06-11 14:26:44 -07:00
Pierre-Yves David
53bc8a61ba revset: mark the place where we are combining sets in the wrong direction
We should always combine with subset as the left operand (to preserve the
order). I do not have time to fix all of them so I'm just marking the spot.
2015-06-11 14:21:21 -07:00
Pierre-Yves David
9bcc184424 revset: point out wrong behavior in fullreposet
I cannot fix all issues in revset because I've got other things to do,
but let's write down all the brokenness to help other people reading
and fixing.
2015-06-11 14:00:13 -07:00
Yuya Nishihara
0fcfb8834b revset: add fast path for _list() of integer revisions
This can greatly speed up chained 'or' of integer revisions.

1) reduce nesting of chained 'or' operations
2) optimize to a list
3) fast path for integer revisions (this patch)

revset #0: 0 + 1 + 2 + ... + 1000
1) wall 0.483341 comb 0.480000 user 0.480000 sys 0.000000 (best of 20)
2) wall 0.025393 comb 0.020000 user 0.020000 sys 0.000000 (best of 107)
3) wall 0.008371 comb 0.000000 user 0.000000 sys 0.000000 (best of 317)

revset #1: sort(0 + 1 + 2 + ... + 1000)
1) wall 0.035240 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
2) wall 0.026432 comb 0.030000 user 0.030000 sys 0.000000 (best of 102)
3) wall 0.008418 comb 0.000000 user 0.000000 sys 0.000000 (best of 322)

revset #2: first(0 + 1 + 2 + ... + 1000)
1) wall 0.028949 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
2) wall 0.025503 comb 0.030000 user 0.030000 sys 0.000000 (best of 106)
3) wall 0.008423 comb 0.010000 user 0.010000 sys 0.000000 (best of 319)

But I admit that it is still slower than the spanset.

revset #3: 0:1000
3) wall 0.000132 comb 0.000000 user 0.000000 sys 0.000000 (best of 19010)
2015-05-17 15:16:13 +09:00
Yuya Nishihara
f19a0ca9bc revset: optimize 'or' operation of trivial revisions to a list
As seen in issue4565 and issue4624, GUI wrappers and automated scripts are
likely to generate a long query that just has numeric revisions joined by 'or'.
One reason why is that they allow users to choose arbitrary revisions from
a list. Because this use case isn't handled well by smartset, let's optimize
it to a plain old list.

Benchmarks:

1) reduce nesting of chained 'or' operations
2) optimize to a list (this patch)

revset #0: 0 + 1 + 2 + ... + 1000
1) wall 0.483341 comb 0.480000 user 0.480000 sys 0.000000 (best of 20)
2) wall 0.025393 comb 0.020000 user 0.020000 sys 0.000000 (best of 107)

revset #1: sort(0 + 1 + 2 + ... + 1000)
1) wall 0.035240 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
2) wall 0.026432 comb 0.030000 user 0.030000 sys 0.000000 (best of 102)

revset #2: first(0 + 1 + 2 + ... + 1000)
1) wall 0.028949 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
2) wall 0.025503 comb 0.030000 user 0.030000 sys 0.000000 (best of 106)
2015-05-17 15:11:38 +09:00
Yuya Nishihara
15ea334802 revset: make "null" able to appear in internal _list() expression
This is the same workaround introduced at 3aa0e4733e6f. Without this patch,
"null or x" can't be optimized to _list(null x).

Test case will be added by the next patch.
2015-05-29 21:31:00 +09:00
Yuya Nishihara
003219cbf5 revset: make internal _list() expression remove duplicated revisions
This allows us to optimize chained 'or' operations to _list() expression.

Unlike _intlist() or _hexlist(), it's difficult to remove duplicates by the
caller of _list() because different symbols can point to the same revision.
If the caller knows all symbols are unique, that probably means revisions or
nodes are known, therefore, _intlist() or _hexlist() should be used instead.
So, it makes sense to check duplicates by _list() function.

'%ls' is no longer used in core, this won't cause performance regression.
2015-05-24 14:49:41 +09:00
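Why the de-duplication has to happen after symbol resolution, as the commit above argues, can be sketched as follows. The helper `resolve_list`, the `lookup` callable and the `names` table are hypothetical stand-ins, not the actual `_list()` implementation: two different symbols may resolve to the same revision, so comparing the symbols themselves is not enough.

```python
def resolve_list(symbols, lookup):
    """Resolve symbols to revisions, keeping first occurrences only.

    `lookup` is a hypothetical callable mapping a symbol to a revision
    number. Order of first appearance is preserved.
    """
    seen = set()
    revs = []
    for sym in symbols:
        rev = lookup(sym)
        if rev in seen:
            # A different symbol already produced this revision.
            continue
        seen.add(rev)
        revs.append(rev)
    return revs


# Hypothetical name table: "tip" and "default" point at the same revision.
names = {"tip": 5, "default": 5, "0": 0, "stable": 3}
```

With this table, `resolve_list(["tip", "0", "default", "stable"], names.__getitem__)` keeps revision 5 only once even though the symbols `tip` and `default` differ.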
Yuya Nishihara
d23a5c2002 revset: reduce nesting of chained 'or' operations (issue4624)
This reduces the stack depth of chained 'or' operations:
 - from O(n) to O(1) at the parsing, alias expansion and optimization phases
 - from O(n) to O(log(n)) at the evaluation phase

simplifyinfixops() must be applied immediately after the parsing phase.
Otherwise, alias expansion would crash by "maximum recursion depth exceeded"
error.

Test cases use 'x:y|y:z' instead of 'x|y' because I'm planning to optimize
'x|y' in a different way.

Benchmarks:

0) a2acce8dcd95
1) this patch

revset #0: 0 + 1 + 2 + ... + 200
0) wall 0.026347 comb 0.030000 user 0.030000 sys 0.000000 (best of 101)
1) wall 0.023858 comb 0.030000 user 0.030000 sys 0.000000 (best of 112)

revset #1: 0 + 1 + 2 + ... + 1000
0) maximum recursion depth exceeded
1) wall 0.483341 comb 0.480000 user 0.480000 sys 0.000000 (best of 20)

revset #2: sort(0 + 1 + 2 + ... + 200)
0) wall 0.013404 comb 0.010000 user 0.010000 sys 0.000000 (best of 196)
1) wall 0.006814 comb 0.010000 user 0.010000 sys 0.000000 (best of 375)

revset #3: sort(0 + 1 + 2 + ... + 1000)
0) maximum recursion depth exceeded
1) wall 0.035240 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
2015-04-26 18:13:48 +09:00
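The flattening step described in the commit above can be sketched roughly as follows. This is not the real `simplifyinfixops()`; the tuple layout `('or', left, right)` and the helper names `flatten_or` and `build_chain` are assumptions made for illustration. The sketch only handles binary, left-nested nodes, and it walks them with a loop so that the flattening itself needs no recursion.

```python
def flatten_or(tree):
    """Turn left-nested ('or', ('or', a, b), c) into ('or', a, b, c).

    Expects binary 'or' nodes as a parser would emit them. Anything
    that is not an 'or' node is returned unchanged.
    """
    if not isinstance(tree, tuple) or tree[0] != 'or':
        return tree
    operands = []
    node = tree
    # Walk down the left spine iteratively, collecting right operands.
    while isinstance(node, tuple) and node[0] == 'or':
        operands.append(node[2])
        node = node[1]
    operands.append(node)  # the left-most leaf
    operands.reverse()     # restore left-to-right order
    return ('or',) + tuple(operands)


def build_chain(n):
    """Build the tree a naive parser might produce for 0 + 1 + ... + n."""
    tree = ('symbol', '0')
    for i in range(1, n + 1):
        tree = ('or', tree, ('symbol', str(i)))
    return tree
```

A chain of 2000 operands, which would be 2000 levels deep before flattening, becomes a single node with one operand per revision, so later passes over the tree no longer recurse once per operand.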
Yuya Nishihara
c29c74ec96 revset: add helper to build balanced addsets from chained 'or' operations
This function will be used by revset.orset() and scmutil.revrange() to reduce
the stack depth from O(n) to O(log(n)).

We've bikeshed the interface of this function, but we couldn't come to an
agreement. So we decided to attempt to make it move forward.

 marmoute:
 - new factory function isn't necessary for balanced addsets
 - addset.__init__ can just recurse, should handle "len(subsets) == 2+"

 yuja:
 - want to write all "len(subsets) == 0, 1, 2, 3+" cases in the same function
 - no recursion in __init__ for cosmetic reason: can't return, can't call
   __init__ directly

I've changed it to a private function so that nobody would be tempted to
utilize it.
2015-05-24 14:10:52 +09:00
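The O(n) to O(log(n)) depth reduction mentioned above comes from combining operands pairwise in a balanced way. A minimal sketch, assuming a generic two-argument `combine` callable in place of the real `addset` constructor (the names `combine_balanced`, `depth` and `leaves` are hypothetical):

```python
def combine_balanced(subsets, combine):
    """Combine a list of operands into a balanced binary tree.

    `combine` stands in for whatever builds a two-operand union.
    The nesting depth of the result is about log2(len(subsets)).
    """
    if not subsets:
        raise ValueError("need at least one subset")
    if len(subsets) == 1:
        return subsets[0]
    mid = len(subsets) // 2
    left = combine_balanced(subsets[:mid], combine)
    right = combine_balanced(subsets[mid:], combine)
    return combine(left, right)


def depth(tree):
    """Nesting depth of a tree made of 2-tuples (leaves have depth 0)."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))


def leaves(tree):
    """Leaves of the tree in left-to-right order."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


# 1000 operands combined with plain tuples as a stand-in for unions.
tree = combine_balanced(list(range(1000)), lambda a, b: (a, b))
```

The left-to-right order of the operands is preserved, which matters because swapping 'or' operands would change the order of revisions.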
Yuya Nishihara
cc8b59754d revset: comment that we can't swap 'or' operands by weight
Though the original code did nothing, it tried to optimize the calculation
order by weight. But we can't simply swap 'ta' and 'tb' because it would
change the order of revisions.

For future reference, this patch keeps the modified version of the original
code as comment.
2015-04-26 18:27:32 +09:00
Matt Mackall
07ca4361bb merge with stable 2015-05-26 07:44:37 -05:00
Yuya Nishihara
799e490a3c revset: drop magic of fullreposet membership test (issue4682)
This patch partially backs out a9a86cbbc5b2 and adds an alternative workaround
to functions that evaluate "null" and "wdir()". Because the new workaround is
incomplete, "first(null)" and "min(null)" don't work as expected. But they were
since they were not usable until 3.4 and "null" isn't commonly used, we can
postpone a complete fix for 3.5.

The issue4682 was caused because "branch(default)" is evaluated to
"<filteredset <fullreposet>>", keeping fullreposet magic. The next patch will
fix crash on "branch(null)", but without this patch, it would make
"null in <branch(default)>" be True, which means "children(branch(default))"
would return all revisions but merge (p2 != null).

I believe the right fix is to stop propagating fullreposet magic on filter(),
but it wouldn't fit to stable release. Also, we should discuss how to handle
"null" and "wdir()" in revset before.
2015-05-24 10:29:33 +09:00
Yuya Nishihara
cc6506056b revset: map postfix '%' to only() to optimize operand recursively (issue4670)
Instead of keeping 'onlypost' as a method, this patch rewrites it to 'only'
function. This way, 'x%' always has the same weight as 'only(x)'.
2015-05-15 22:32:31 +09:00
Yuya Nishihara
09759e9679 parser: move prettyformat() function from revset module
I want to use it in doctests that I'll add by future patches. Also, it can
be used in "hg debugfileset" command.
2015-04-26 22:20:03 +09:00
Yuya Nishihara
0aa2a7df4d revset: move validation of incomplete parsing to parse() function
revset.parse() should be responsible for all parsing errors. Perhaps it wasn't
because 'revset.parse' was not a real function when the validation code was
added at ac01134d0a40.
2015-04-26 19:42:47 +09:00
Alexander Drozdov
f95ceba6f9 revset: id() called with 40-byte strings should give the same results as for short strings
The patch solves two issues:
1. id(unknown_full_hash) aborts, but id(unknown_short_hash) doesn't
2. id(40byte_tag_or_bookmark) returns tagged/bookmarked revision,
   but id(non-40byte_tag_or_bookmark) doesn't

After the patch:
1. id(unknown_full_hash) doesn't abort
2. id(40byte_tag_or_bookmark) returns empty set
2015-04-20 10:52:20 +03:00
Yuya Nishihara
eae9526460 revset: undocument wdir() until its command outputs get stable
wdir() implementation is still incomplete and shouldn't be advertised to
users. This patch will be backed out when

 - template values such as {rev} and {node} are settled
 - major commands and revsets work without crashing
2015-04-12 19:00:31 +09:00
Gregory Szorc
7498d10d47 revset: don't import discovery at module level
discovery.py imports a lot of the world. Pierre-Yves told me to move it
to a function-level import to avoid an import cycle in a future patch.
2015-04-14 12:54:16 -04:00
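A function-level import, as used in the commit above, defers loading a module until the function is first called. The sketch below uses the standard-library `json` module purely as a stand-in for a heavy or cycle-prone dependency; the function `dump` is hypothetical.

```python
def dump(data):
    """Serialize `data`, importing the serializer only when needed."""
    # Deferred import: nothing is loaded when this module is imported,
    # which keeps start-up cheap and sidesteps import cycles between
    # modules that depend on each other.
    import json
    return json.dumps(data, sort_keys=True)
```

After the first call the module is cached in `sys.modules`, so later calls pay only for a dictionary lookup.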
Ryan McElroy
0bcb80448a revsets: more informative syntax error message
I came across a case where an internal command was using a revset that I didn't
immediately pass in and it was difficult to debug what was going wrong with the
revset. This prints out the revset and informs the user that the error is with
a revset so it should be more obvious what and where the error is.
2015-04-13 20:53:05 -07:00
Laurent Charignon
dfc226357c revset: add hook after tree parsing
This will be useful to execute actions after the tree is parsed and
before the revset returns a match. Finding symbols in the parse tree
will later allow hashes of hidden revisions to work on the command
line without the --hidden flag.
2015-03-24 14:24:55 -07:00
Yuya Nishihara
499c2ed6e7 revset: optimize "x & fullreposet" case
If self is a smartset and other is a fullreposet, nothing should be necessary.

A small win for trivial query in mozilla-central repo:

revset #0: (0:100000)
0) wall 0.017211 comb 0.020000 user 0.020000 sys 0.000000 (best of 163)
1) wall 0.001324 comb 0.000000 user 0.000000 sys 0.000000 (best of 2160)
2015-03-16 17:11:25 +09:00
Yuya Nishihara
bc9e0dc64b debugrevspec: show nesting structure of smartsets if verbose
This shows how smartsets are constructed from the query. It will be somewhat
useful to track problems such as stack overflow.
2015-03-16 18:36:53 +09:00
Yuya Nishihara
2c5f3cb86d revset: add __repr__ to all smartset classes
This is sometimes useful for debugging.
2015-03-16 18:15:06 +09:00
Matt Harbison
1abecad109 revset: add the 'subrepo' symbol
This returns the csets where matching subrepos have changed with respect to the
containing repo's first parent.  The second parent shouldn't matter, because it
is either syncing up to the first parent (i.e. it hasn't changed from the
current branch's POV), or the merge changed it with respect to the first parent
(which already adds it to the set).

There's already a 'subrepo' fileset, but it is prefixed with 'set:', so there
should be no ambiguity (in code anyway).  The only test I see for it is to
revert subrepos named by a glob pattern (in test-subrepo.t, line 58).  Since it
doesn't return a tracked file, neither 'log "set:subrepo()"' nor
'files "set:subrepo()"' print anything.  Therefore, it seems useful to have a
revset that will return something for log (and can be added to a revsetalias to
be chained with 'file' revsets.)

It might be nice to be able to filter for added, modified and removed
separately, but add/remove should be rare.  It might also be nice to be able to
do a 'contains' check, in addition to this mutated check.  Maybe it is possible
to get those with the existing 'adds', 'contains', 'modifies' and 'removes' by
teaching them to chase explicit paths into subrepos.

I'm not sure if this should be added to the 'modifies adds removes' line in
revset.optimize() (since it is doing an AMR check on .hgsubstate), or if it is
OK to put into 'safesymbols' (things like 'file' are on the list, and that takes
a regex, among other patterns).
2015-03-25 14:56:54 -04:00
Yuya Nishihara
e50dcd48fb revset: drop translation marker from error message of _notpublic()
It is a kind of an internal error. End user won't see it.
2015-05-19 23:29:20 +09:00
Yuya Nishihara
9dda746eb9 revset: drop docstring from internal _notpublic() function
It shouldn't be listed in "hg help revset".
2015-05-19 23:26:25 +09:00
Laurent Charignon
be38dd4abe revset: optimize not public revset
This patch speeds up the computation of the not public() changesets
and incidentally speeds up the computation of divergents() changesets on our big
repo by 100x, from 50% to 0.5% of the time spent in smartlog with evolve.

In this patch we optimize not public() to _notpublic() (new revset) and use
the work on phaseset (from the previous commit) to be able to compute
_notpublic() quickly.

We use a non-lazy approach, making the assumption that the number of notpublic
changes will not be in the order of magnitude of the repo size. Adopting a
lazy approach gives a speedup of only 5x (vs 100x) due to the overhead of the
code for lazy generation.
2015-04-24 14:30:30 -07:00
Augie Fackler
a5b17bd9d1 cleanup: use __builtins__.any instead of util.any
any() is available in all Python versions we support now.
2015-05-16 14:30:07 -04:00
Pierre-Yves David
49fcb055f4 generatorset: use 'next()' to simplify the code
The 'next()' built-in accepts a default value. This removes the need to check
whether self is non-empty before returning a value.
2015-05-17 18:06:09 -07:00
Pierre-Yves David
db5d2b0a12 revset: use 'next()' to detect end of iteration in 'last'
The 'next()' built-in can return a default value, allowing us to get rid of the
confusing try/except code flow.
2015-05-17 18:00:38 -07:00
Pierre-Yves David
95c104ceb7 revset: use 'next()' to detect end of iteration in 'limit'
The 'next()' built-in can return a default value, allowing us to get rid of the
confusing try/except code flow.
2015-05-17 17:58:39 -07:00
Pierre-Yves David
3e86398675 _revancestors: use 'next' to remove the verbose try except clauses
The 'next()' built-in can return a default value, making the final iteration
case simpler and clearer.
2015-05-17 17:54:58 -07:00
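The four 'next()' cleanups above rely on the same built-in behaviour: `next(iterator, default)` returns `default` instead of raising `StopIteration` when the iterator is exhausted. A minimal before/after sketch, with hypothetical function names that do not appear in revset.py:

```python
def first_or_none_verbose(revs):
    """The older style: catch StopIteration explicitly."""
    it = iter(revs)
    try:
        return next(it)
    except StopIteration:
        return None


def first_or_none(revs):
    """The simplified style: let next() supply the default."""
    return next(iter(revs), None)
```

Both functions behave the same; the second one has a single code path. If `None` can be a legitimate element, a private sentinel object is a safer default.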
Yuya Nishihara
498cbbe4ca revset: extract addset._iterordered to free function
It never uses self, so let's make it less dependent on variables.
2015-05-16 21:42:09 +09:00
Yuya Nishihara
4f23717a35 revset: use fastasc/fastdesc switch consistently in addset.__iter__ 2015-05-16 14:05:02 +09:00
Yuya Nishihara
be1e504983 revset: drop redundant filteredset from right-hand side set of "or" operation
Since b4681ae82d4a, it should no longer be necessary because the addset can
remove duplicates correctly.
2015-03-30 20:56:37 +09:00
Pierre-Yves David
fc3daddf63 revset: fix iteration over ordered addset composed of non-ordered operands
Before this change, doing ordered iteration over an 'addset' object composed of
operands without fastasc or fastdesc method could result in duplicated entries.
This was the result of applying '_iterordered' on an unordered set.

We fix it by ensuring we iterate over the set in a sorted order. Using the fast
iterator when it exists on any operand. We kill the '_iterator' method in the
process because it did not make a lot of sense independently.

Thanks goes to Yuya Nishihara for reporting the issue and analysing the cause.
2015-05-15 00:25:43 -07:00
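The fix described above amounts to merging two operands only after each of them is known to be sorted. The following sketch is not the real `addset` code; `iter_ordered_union` is a hypothetical helper that sorts each operand first (standing in for "use the fast iterator when it exists, otherwise sort") and then merges them in ascending order while dropping duplicates.

```python
def iter_ordered_union(a, b):
    """Yield the ascending union of two iterables without duplicates."""
    # Sorting here stands in for a fast ascending iterator. Merging
    # unsorted inputs directly is what could produce duplicates.
    it1 = iter(sorted(set(a)))
    it2 = iter(sorted(set(b)))
    done = object()  # private sentinel marking an exhausted operand
    v1 = next(it1, done)
    v2 = next(it2, done)
    while v1 is not done or v2 is not done:
        if v2 is done or (v1 is not done and v1 < v2):
            yield v1
            v1 = next(it1, done)
        elif v1 is done or v2 < v1:
            yield v2
            v2 = next(it2, done)
        else:
            # Same revision in both operands: emit once, advance both.
            yield v1
            v1 = next(it1, done)
            v2 = next(it2, done)
```

The duplicate check in the final branch is only valid because both inputs are sorted; that precondition is exactly what the commit restores.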
Yuya Nishihara
7c8dcbb212 revset: remove unused 'only' from methods table
The infix 'only' operator is mapped to 'only()' function by optimize(), so
it won't be looked up as a method. The test shows it.
2015-05-15 22:38:24 +09:00
Matt Mackall
e71173010b merge with stable 2015-05-15 11:52:09 -05:00
Yuya Nishihara
17d1eb2f7e revset: test current behavior of addset class
The addset class isn't simple and it has a hidden bug that will be fixed by
future patches. So let's test the current behavior.
2015-03-30 19:51:40 +09:00
Yuya Nishihara
5780b536c6 revset: remove duplicated definition of choice() from addset._iterordered()
choice() is already defined before val1 = None. Perhaps there was a merge or
rebase error.
2015-04-27 23:03:20 +09:00
Yuya Nishihara
4290eff5ce revset: add wdir() function to specify workingctx revision by command
The main purpose of wdir() is to annotate working-directory files.

Currently many commands and revsets cannot handle workingctx and may raise
exception. For example, -r ":wdir()" results in TypeError. This problem will
be addressed by future patches.

We could add "wdir" symbol instead, but it would conflict with the existing
tag, bookmark or branch. So I decided not to.

List of commands that will potentially support workingctx revision:

  command   default  remarks
  --------  -------  -----------------------------------------------------
  annotate  p1       useful
  archive   p1       might be useful
  cat       p1       might be useful on Windows (no cat)
  diff      p1:wdir  (default)
  export    p1       might be useful if wctx can have draft commit message
  files     wdir     (default)
  grep      tip:0    might be useful
  identify  wdir     (default)
  locate    wdir     (default)
  log       tip:0    might be useful with -p or -G option
  parents   wdir     (default)
  status    wdir     (default)

This patch includes minimal test of "hg status" that should be able to handle
the workingctx revision.
2014-08-16 13:44:16 +09:00
Durham Goode
23a18a419d revbranchcache: store repo on the object
Previously we would instantiate the revbranchcache with a repo object, use it
briefly, then require it be passed in every time we wanted to fetch any
information. This seems unnecessary since it's obviously specific to that repo
(since it was constructed with it).

This patch stores the repo on the revbranchcache object, and removes the repo
parameter from the various functions on that class. This has the other nice
benefit of removing the double-revbranchcache-read that existed before (it was
read once for the branch revset, and once for the repo.revbranchcache).
2015-02-10 19:57:51 -08:00
Yuya Nishihara
34da300653 revset: replace "working copy" with "working directory" in function help 2015-03-17 20:50:19 +09:00
Jordi Gutiérrez Hermoso
8eb132f5ea style: kill ersatz if-else ternary operators
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.

We change instead all of these one-liners to 4-liners. This was
executed with the following perlism:

    find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1    $2 = $4\n$1else:\n$1    $2 = $5,' {} \;

I tweaked the following cases from the automatic Perl output:

    prev = (parents and parents[0]) or nullid
    port = (use_ssl and 443 or 80)
    cwd = (pats and repo.getcwd()) or ''
    rename = fctx and webutil.renamelink(fctx) or []
    ctx = fctx and fctx or ctx
    self.base = (mapfile and os.path.dirname(mapfile)) or ''

I also added some newlines wherever they seemed appropriate for readability.

There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
2015-03-13 17:00:06 -04:00
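Beyond readability, the `COND and Y or Z` idiom has a well-known pitfall that the explicit form avoids: when `Y` is falsy, the expression falls through to `Z` even though the condition is true. A small sketch with hypothetical helper names:

```python
def pick_ersatz(cond, yes, no):
    """The old idiom: silently wrong when `yes` is falsy."""
    return cond and yes or no


def pick_explicit(cond, yes, no):
    """The expanded four-line form produced by the cleanup."""
    if cond:
        result = yes
    else:
        result = no
    return result
```

For truthy values such as `443` the two agree; with `yes=0` or `yes=''` only the explicit version returns the intended value.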
Augie Fackler
0b9e6790bf revset: use UnknownIdentifier where appropriate 2015-01-26 14:32:30 -05:00
Yuya Nishihara
3adf9bf0f3 revset: extend fullreposet to make "null" revision magically appear in set
As per fullreposet.__and__, it can omit the range check of rev.  Therefore,
"null" revision is accepted automagically.

It seems this can fix many query results involving null symbol.  Originally,
the simplest "(null)" query did fail if there were hidden revisions.  Tests
are randomly chosen.

fullreposet mimics the behavior of localrepo, where "null" revision is not
listed but contained.
2015-01-08 23:05:45 +09:00
Yuya Nishihara
f6f2cc07d6 revset: duplicate spanset.__contains__ to fullreposet for modification
fcccbf073394 says we should avoid function calls in __contains__, so
super(fullreposet, self).__contains__(rev) is not an option.

Actually the super call doubled the benchmark result of trivial query:

revisions:
0) 6aa81b0c4658 (tip when I wrote this patch)
1) rev == node.nullrev or super(fullreposet, self).__contains__(rev)

revset #0: tip:0
0) wall 0.008441 comb 0.010000 user 0.010000 sys 0.000000 (best of 282)
1) wall 0.016152 comb 0.010000 user 0.010000 sys 0.000000 (best of 146)
2015-01-10 18:09:25 +09:00
Yuya Nishihara
ee4ca20b38 revset: have all() filter out null revision
I'm not sure if "all()" should filter out "null", but "all()" is stated as
'the same as "0:tip"' (except that it doesn't reorder the subset, I think.)

This patch is intended to avoid exposing a fullreposet to graphmod.dagwalker(),
which would result in strange drawing in future version:

  |
  o  changeset:   0:f8035bb17114
  |  user:        test
  |  date:        Thu Jan 01 00:00:00 1970 +0000
  |  summary:     add a

caused by:

    parents = sorted(set([p.rev() for p in ctx.parents()
                          if p.rev() in revs]))

We cannot add "and p.rev() != nullrev" here because revs may actually include
"null" revision.
2015-01-10 14:49:50 +09:00
Yuya Nishihara
bc28702606 revset: drop unnecessary calls of getall() with empty argument
If x is None, getall(repo, subset, x) == subset.
2015-01-10 16:41:36 +09:00
Matt Mackall
b907416f7b merge with stable 2015-03-02 01:20:14 -06:00
Mads Kiilerich
56207b4242 revisionbranchcache: fall back to slow path if starting readonly (issue4531)
Transitioning to Mercurial versions with revision branch cache could be slow as
long as all operations were readonly (revset queries) and the cache would be
populated but not written back.

Instead, fall back to using the consistently slow path when readonly and the
cache doesn't exist yet. That avoids the overhead of populating the cache
without writing it back.

If not readonly, it will still populate all missing entries initially. That
avoids repeated writing of the cache file with small updates, and it also makes
sure a fully populated cache is available for the readonly operations.
2015-02-06 02:52:10 +01:00
FUJIWARA Katsunori
8a439b3cc6 revset: mask specific names for named() predicate
Before this patch, revset predicate "tag()" and "named('tags')" differ
from each other, because the former doesn't include "tip" but the
latter does.

For equivalence, "named('tags')" shouldn't include the revision
corresponding to "tip". But just removing "tip" from the "tags"
namespace would break backward compatibility, even though "tip"
itself is planned to be eliminated, as mentioned below.

    http://selenic.com/pipermail/mercurial-devel/2015-February/066157.html

To mask specific names ("tip" in this case) for the "named()" predicate,
this patch introduces "deprecated" into "namespaces", and makes the
"named()" predicate examine whether each name is masked by the
namespace to which it belongs.

"named()" will really work correctly after 3.3.1 (see a3c326a7f57a for
detail), and fixing this on STABLE before 3.3.1 can prevent initial
users of "named()" from expecting "named('tags')" to include "tip".

This is the reason why this patch is posted for STABLE, even though the problem
itself isn't so serious.

This may have to be flagged as "(BC)", if applied on DEFAULT.
2015-02-05 14:45:49 +09:00
FUJIWARA Katsunori
c3172b4737 revset: get revision number of each node from target namespaces
Before this patch, the revset predicate "named()" used the nodes obtained
from target namespaces directly.

This causes problems below:

  - combination of other predicates doesn't work correctly, because
    they assume that revisions are listed up in number

  - "hg log" doesn't show any revisions for "named()" result, because:

    - "changeset_printer" stores formatted output for each revisions
      into dict with revision number (= ctx.rev()) as a key of them

    - "changeset_printer.flush(rev)" writes stored output for
      the specified revision, but

    - "commands.log" invokes it with the node, gotten from "named()"

  - "hg debugrevspec" shows nodes (= may be binary) directly

Difference between revset predicate "tag()" and "named('tags')" in
tests is fixed in subsequent patch.
2015-02-03 21:56:29 +09:00
FUJIWARA Katsunori
6a05d7fab8 revset: raise RepoLookupError to make present() predicate continue the query
Before this patch, "bookmark()", "named()" and "tag()" predicates
raise "Abort", when the specified pattern doesn't match against
existing ones.

This prevents "present()" predicate from continuing the query, because
it only catches "RepoLookupError".

This patch raises "RepoLookupError" instead of "Abort", to make
"present()" predicate continue the query, even if "bookmark()",
"named()" or "tag()" in the sub-query of it are aborted.

This patch doesn't contain raising "RepoLookupError" for "re:" pattern
in "tag()", because "tag()" treats it differently from others. Actions
of each predicates at failure of pattern matching can be summarized as
below:

  predicate  "literal:"  "re:"
  ---------- ----------- ------------
  bookmark   abort       abort
  named      abort       abort
  tag        abort       continue (*1)

  branch     abort       continue (*2)
  ---------- ----------- ------------

"tag()" may have to abort in the (*1) case for similarity, but this
change may break backward compatibility of existing revset queries. It
seems to have to be changed on "default" branch (with "BC" ?).

On the other hand, (*2) seems to be reasonable, even though it breaks
similarity, because "branch()" in this case doesn't check exact
existence of branches, but does pick up revisions of which branch
matches against the pattern.

This patch also adds tests for "branch()" to clarify behavior around
"present()" of similar predicates, even though this patch doesn't
change "branch()".
2015-01-31 01:00:50 +09:00
Yuya Nishihara
b5f973788a revset: fix ancestors(null) to include null revision (issue4512)
Since fe39bbbf31f0, null parent is explicitly excluded. So, there is no reason
to have nullrev in the initial seen set.
2015-01-25 20:20:27 +09:00
Yuya Nishihara
78d778b5ef revset: allow rev(-1) to indicate null revision (BC)
This can simplify the conversion from numeric revision to string. Without it,
we have to handle -1 specially because repo['-1'] != repo[-1].

The -1 revision is not officially documented, but this change makes sense
assuming that "rev(%d)" exists for scripting or third-party tools.
2015-01-10 12:56:38 +09:00
Martin von Zweigbergk
4b40ac0110 log: evaluate filesets on working copy, not its parent
When running "hg log 'set:added()'", we create two matchers: one used
for producing the revset and one used for finding files to match. In
185b6b930e8c (graphlog: evaluate FILE/-I/-X filesets on the working
dir, 2012-02-26), we started passing a revision argument along from
what's currently in cmdutil._makelogrevset() to
revset._matchfiles(). When the revision was an empty string, it
referred to the working copy. This was subtly done with "repo[rev or
None]". Then, in 5ff5c5c9e69f (revset: avoid recalculating filesets,
2014-10-22), that conversion from empty string to None was lost. Note
that repo[''] is equivalent to repo['.'], not repo[None].

The consequence of this, to the user, is that when running "hg log
'set:added()'", the file matcher matches files added in the working
copy, while the revset matcher matches revisions that touch files
added in the parent of the working copy. As a result, only revisions
that touch any files added in the parent of the working copy will be
considered, but they will only be included if they also touch files
added in the working copy.

Fix the bug by converting '' to None again, but make it a little more
explicit this time (plus, we now have tests for it).
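
As a rough sketch of the conversion described above (the function
name is hypothetical and nothing here is taken from the actual patch),
the empty string is mapped to None explicitly instead of relying on
"rev or None":

```python
def normalize_rev(rev):
    """Map the empty string to None; keep every other value as is.

    Hypothetical helper for illustration. In the scenario described
    above, None stands for the working copy, while '' would resolve
    like '.', the parent of the working copy.
    """
    if rev == '':
        return None
    return rev
```

Spelling out the comparison keeps the intent visible, which the
"rev or None" idiom did not.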
2015-01-21 15:23:13 -08:00
Yuya Nishihara
4c1f7f24d7 revset: drop factory that promotes spanset to fullreposet
All callers use fullreposet where appropriate.

Backed out changeset 6c2c046ac382
2015-01-08 23:43:15 +09:00
Yuya Nishihara
878c8b67df revset: specify fullreposet without using spanset factory
The factory function will be removed because the subsequent patches will
make fullreposet(repo) not fully compatible with spanset(repo).
2015-01-08 23:46:54 +09:00
Yuya Nishihara
25fac1a15b revset: make match function initiate query from full set by default
This change is intended to avoid exposing the implementation detail to
callers. I'm going to extend fullreposet to support "null" revision, so
these mfunc calls will have to use fullreposet() instead of spanset().
2015-02-02 22:21:07 +09:00
FUJIWARA Katsunori
ac41d830e2 revset: check for collisions between alias argument names in the declaration
Before this patch, collisions between alias argument names in the
declaration are ignored, and this silently causes unexpected alias
evaluation.

This patch checks for such collisions, and aborts (or shows a warning) when
collisions are detected.

This patch doesn't add a test to "test-revset.t", because a doctest is
enough to test the collisions detection itself.
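
A collision check of this kind might look like the following sketch.
The function name and return shape are assumptions made for
illustration, not the code added by this patch.

```python
def find_duplicate_args(args):
    """Return argument names that appear more than once, in order.

    Hypothetical helper: 'args' is a list of argument names as they
    appear in an alias declaration such as "foo($1, $2, $1)".
    """
    seen = set()
    dups = []
    for name in args:
        if name in seen and name not in dups:
            dups.append(name)
        seen.add(name)
    return dups
```

A caller could abort or warn whenever the returned list is non-empty.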
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
e416b72fc5 revset: parse alias declaration strictly by _parsealiasdecl
Before this patch, alias declarations were parsed by string-based
operations: matching against "^([^(]+)\(([^)]+)\)$" and splitting by
",".

This overlooks many syntax errors like below (see the previous patch
introducing "_parsealiasdecl" for detail):

  - an unclosed parenthesis causes the declaration to be treated as
    an "alias symbol"
  - symbol/function names aren't checked for validity
  - an invalid argument list yields unexpected argument names

To parse alias declarations strictly, this patch replaces the parsing
implementation with "_parsealiasdecl".

This patch tests only one typical declaration error case, because
error detection itself is already tested in the doctest of
"_parsealiasdecl".

This also removes the class properties "args" and "error", because
they are always initialized in "revsetalias.__init__".
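
The weakness of the string-based approach can be reproduced with a
small sketch that uses the pattern quoted above. The function name
and return shape are assumptions made for illustration; this is not
the removed Mercurial code.

```python
import re

# Pattern quoted in the commit message above.
_FUNCRE = re.compile(r'^([^(]+)\(([^)]+)\)$')


def naive_parse_decl(decl):
    """Return (name, args); args is None for a plain alias symbol.

    Hypothetical re-creation of regex-plus-split parsing.
    """
    m = _FUNCRE.match(decl)
    if m is None:
        # Anything not shaped like "func(...)" is taken as a symbol.
        return decl, None
    return m.group(1), [a.strip() for a in m.group(2).split(',')]
```

With this sketch, 'foo($1, $2' (unclosed) is accepted as a symbol,
and 'foo("bar")' is accepted as a function taking an argument named
'"bar"'.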
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
87958c780f revset: introduce "_parsealiasdecl" to parse alias declarations strictly
This patch introduces "_parsealiasdecl" to parse alias declarations
strictly. For example, "_parsealiasdecl" can detect the problems
below, which the current implementation can't.

  - an unclosed parenthesis causes the declaration to be treated as
    an "alias symbol"

    because all declarations not in "func(....)" style are
    recognized as "alias symbol".

    for example, "foo($1, $2" is treated as the alias symbol.

  - alias symbol/function names aren't checked for validity as
    symbols

    for example, "foo bar" can be treated as the alias symbol, but of
    course such an invalid symbol can't be referred to in a revset.

  - just splitting the argument list by "," overlooks syntax
    problems in the declaration

    for example, all of invalid declarations below are overlooked:

    - foo("bar")     => taking one argument named as '"bar"'
    - foo("unclosed) => taking one argument named as '"unclosed'
    - foo(bar::baz)  => taking one argument named as 'bar::baz'
    - foo(bar($1))   => taking one argument named as 'bar($1)'

To keep this patch simple, the current implementation for alias
declarations is replaced by "_parsealiasdecl" in the subsequent
patch. This patch just introduces it.

This patch defines "_parsealiasdecl" not as a method of the
"revsetalias" class but as a function of the "revset" module, for
ease of testing by doctest.

This patch factors some helper functions for "tree" out, because:

  - direct access like "if tree[0] == 'func' and len(tree) > 1"
    hurts readability

  - subsequent patch (and also existing code paths, in the future) can
    use them for readability
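
Helpers of this kind might look like the sketch below. The function
names and the tuple layout ('func', ('symbol', name), args) are
assumptions made for illustration and may differ from the helpers
this patch actually adds.

```python
def isfunc(tree):
    """True if 'tree' looks like a function-call node (assumed shape)."""
    return tree[0] == 'func' and len(tree) > 1


def getfuncname(tree):
    """Return the function name of a call node, or None otherwise."""
    if isfunc(tree) and tree[1][0] == 'symbol':
        return tree[1][1]
    return None


def getfuncargs(tree):
    """Return the argument sub-tree of a call node, or None if absent."""
    if isfunc(tree) and len(tree) > 2:
        return tree[2]
    return None
```

Call sites can then test "getfuncname(tree) is not None" instead of
indexing into the tuple directly.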

This patch also factors "_tokenizealias" out, because it can also be
used for parsing alias definitions strictly.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
883b1f7edf revset: store full detail into revsetalias.error for error source distinction
Before this patch, errors in the declaration of a revset alias aren't
detected at all, and the error message carries no information about
the error source.

As a part of preparation for parsing alias declarations and
definitions more strictly, this patch stores full detail into
"revsetalias.error" for error source distinction.

This lets the code that raises "Abort" or warns about potential errors
use "revsetalias.error" as is, without composing any message.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
ae25ee95c4 revset: factor out composing error message for ParseError to reuse
This patch defines the composing function not in "ParseError" class but
in "revset" module, because:

  - "_()" shouldn't be used in "ParseError", to avoid adding "from
    i18n import _" to "error" module

  - generalizing message composition of "ParseError" for all code paths
    other than revset isn't the purpose of this patch

    we should also take care of showing "unexpected leading
    whitespace" for some code paths, to generalize widely.
2015-01-10 23:18:11 +09:00
FUJIWARA Katsunori
48233206c2 revset: make tokenize extensible to parse alias declarations and definitions
Before this patch, "tokenize" doesn't recognize the symbol starting
with "$" as a valid one.

This prevents revset alias declarations and definitions from being
parsed with "tokenize", because "$" may be used as the initial letter
of alias arguments.

BTW, the alias argument name doesn't actually require a leading "$".
But we have to assume that users may use "$" as the initial letter of
argument names in their aliases, because examples in "hg help
revsets" have used such names for a long time.

To make "tokenize" extensible to parse alias declarations and
definitions, this patch introduces optional arguments "syminitletters"
and "symletters". Giving these sets can change the policy of "valid
symbol" in tokenization easily.

This patch keeps the original check of letter validity for
reviewability, even though it converts redundantly between
"chr"/"ord" at initialization of "_syminitletters" and "_symletters".
At most 256 checks (per initialization) are cheap enough compared to
revset evaluation itself.

This patch is a part of preparation for parsing alias declarations and
definitions more strictly.
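
The idea of passing the letter sets in can be sketched as follows.
This is a toy scanner, not Mercurial's "tokenize"; the function name
and the default letter sets are assumptions made for illustration,
and only the parameter names "syminitletters" and "symletters" come
from the description above.

```python
import string

# Assumed defaults for this sketch only.
DEFAULT_INIT = frozenset(string.ascii_letters + string.digits + '._@')
DEFAULT_REST = frozenset(string.ascii_letters + string.digits + '-._/@')


def scan_symbol(text, pos=0, syminitletters=DEFAULT_INIT,
                symletters=DEFAULT_REST):
    """Return the symbol starting at 'pos', or None if none starts there."""
    if pos >= len(text) or text[pos] not in syminitletters:
        return None
    end = pos + 1
    while end < len(text) and text[end] in symletters:
        end += 1
    return text[pos:end]


# Letter set for alias parsing: additionally allow a leading "$".
ALIAS_INIT = DEFAULT_INIT | {'$'}
```

With the default sets "$1" is not a symbol, while passing
"syminitletters=ALIAS_INIT" makes the same scanner accept it.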
2015-01-10 23:18:11 +09:00