Commit Graph

824 Commits

Author SHA1 Message Date
Augie Fackler
efaaf08415 revset: build _syminitletters from a saner source: the string module
For now, these sets will be unicode characters in Python 3, which is
probably wrong, but it un-blocks importing the module so we can get
further along. In the future we'll have to come up with a reasonable
encoding strategy for revsets in Python 3.

This patch was originally pair-programmed with Martijn.
2016-10-07 08:32:40 -04:00
Augie Fackler
a4faa822dc revset: define _symletters in terms of _syminitletters 2016-10-07 08:09:23 -04:00
Augie Fackler
71a3469685 revset: remove doubled space 2016-10-07 08:03:30 -04:00
Yuya Nishihara
7078b29941 revset: do not rewrite ':y' to '0:y' (issue5385)
That's no longer valid since the revision 0 may be hidden. Bypass validating
the existence of '0' and filter it by spanset.
2016-10-01 20:20:11 +09:00
Yuya Nishihara
b94864e756 revset: extract function that creates range set from computed revisions
So we can pass m=0 to _makerangeset() even if the revision 0 is hidden.
Hidden revisions are filtered by spanset.
2016-10-01 20:11:48 +09:00
Yuya Nishihara
a667f16896 revset: add option to make matcher take the ordering of the input set
This allows us to evaluate match(subset) as if 'subset & expr', which will
be the complete fix for the issue5100.
2016-05-03 14:18:28 +09:00
Yuya Nishihara
6fa461cbb8 revset: make sort() noop depending on ordering requirement (BC)
See the previous patch for why.
2016-05-03 13:36:12 +09:00
Yuya Nishihara
6a7529851f revset: make reverse() noop depending on ordering requirement (BC)
Because smartset.reverse() may modify the underlying subset, it should be
called only if the set can define the ordering.

In the following example, 'a' and 'c' are the same object, so 'b.reverse()'
would reverse 'a' unexpectedly.

  # '0:2 & reverse(all())'
  <filteredset
    <spanset- 0:2>,    # a
    <filteredset       # b
      <spanset- 0:2>,  # c
      <spanset+ 0:9>>>
2016-05-03 13:36:12 +09:00
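The aliasing problem described above can be reproduced with toy classes. This is a minimal sketch, assuming an in-place `reverse()` as the message says smartset.reverse() may perform; `SpanSet` and `FilteredSet` here are simplified stand-ins, not the real smartset classes.

```python
class SpanSet:
    """Toy stand-in for a span of revisions with an in-place reverse()."""

    def __init__(self, start, end):
        self._revs = list(range(start, end + 1))

    def reverse(self):
        self._revs.reverse()  # mutates the shared list

    def __iter__(self):
        return iter(self._revs)


class FilteredSet:
    """Toy filtered set: iterates 'subset', keeping revisions found in 'other'."""

    def __init__(self, subset, other):
        self._subset = subset
        self._other = set(other)

    def reverse(self):
        self._subset.reverse()  # forwarded to the underlying subset

    def __iter__(self):
        return (r for r in self._subset if r in self._other)


a = SpanSet(0, 2)                  # the subset '0:2'
b = FilteredSet(a, SpanSet(0, 9))  # inner set; its subset 'c' is the same object as 'a'
outer = FilteredSet(a, b)          # '0:2 & reverse(all())'

before = list(a)
b.reverse()                        # an unconditional reverse() on 'b'
after = list(a)                    # 'a' has been reversed as a side effect
```

Because `b` holds a reference to `a` rather than a copy, reversing `b` also changes the order of the outer set, which is why the patch only calls reverse() when the set is allowed to define the ordering.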
Yuya Nishihara
2ef98dedc5 revset: fix order of nested 'range' expression (BC)
Enforce range order only if necessary as the comment says "carrying the
sorting over would be more efficient."
2016-05-03 12:52:50 +09:00
Yuya Nishihara
d7145726a7 revset: forward ordering requirement to argument of present()
present() is special in that it returns the argument set with no
modification, so the ordering requirement should be forwarded.

We could make present() fix the order like orset(), but that would be silly
because we know the extra filtering cost is unnecessary.
2016-06-01 20:54:04 +09:00
Yuya Nishihara
eb51a746be revset: fix order of nested '_(|int|hex)list' expression (BC)
This fixes the order of 'x & (y + z)' where 'y' and 'z' are trivial, and the
other uses of _list()-family functions. The original functions are renamed to
'_ordered(|int|hex)list' to say clearly that they do not follow the subset
ordering.
2016-06-26 18:41:28 +09:00
Yuya Nishihara
46051dc16a revset: fix order of nested 'or' expression (BC)
This fixes the order of 'x & (y + z)' where 'y' and 'z' are not trivial.

The follow-order 'or' operation is slower than the ordered operation if
an input set is large:

       #0           #1           #2           #3
    0) 0.002968     0.002980     0.002982     0.073042
    1) 0.004513     0.004485     0.012029     0.075261

    #0: 0:4000 & (0:1099 + 1000:2099 + 2000:3099)
    #1: 4000:0 & (0:1099 + 1000:2099 + 2000:3099)
    #2: 10000:0 & (0:1099 + 1000:2099 + 2000:3099)
    #3: file("path:hg") & (0:1099 + 1000:2099 + 2000:3099)

I've tried another implementation, but it appeared to be slower than
this version.

    ss = [getset(repo, fullreposet(repo), x) for x in xs]
    return subset.filter(lambda r: any(r in s for s in ss), cache=False)
2016-06-26 18:17:12 +09:00
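The difference between the two orderings of 'x & (y + z)' can be sketched with plain lists. This is an illustration only: `or_define` and `or_follow` are hypothetical names, and `or_follow` mirrors the filter-based alternative quoted in the message rather than the implementation that was committed.

```python
def or_define(subset, operands):
    """The union defines the order: operands are listed one after another."""
    allowed = set(subset)
    seen = set()
    out = []
    for revs in operands:
        for r in revs:
            if r in allowed and r not in seen:
                seen.add(r)
                out.append(r)
    return out


def or_follow(subset, operands):
    """The union follows the order of 'subset' and only filters it."""
    members = [set(revs) for revs in operands]
    return [r for r in subset if any(r in m for m in members)]


subset = [5, 4, 3, 2, 1, 0]   # like '5:0'
operands = [[0, 1], [3, 4]]   # like '(0:1 + 3:4)'

defined = or_define(subset, operands)
followed = or_follow(subset, operands)
```

The follow-order variant has to walk the whole input set, which is consistent with the slowdown the benchmark shows when the input set is large.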
Yuya Nishihara
7706430a92 revset: add 'takeorder' attribute to mark functions that need ordering flag
Since most functions shouldn't need 'order' flag, it is passed only when
explicitly required. This avoids large API breakage.
2016-08-07 17:58:50 +09:00
Yuya Nishihara
a8ff6aeb43 revset: pass around ordering flags to operations
Some operations and functions will need them to fix ordering bugs.
2016-08-07 17:46:12 +09:00
Yuya Nishihara
bc25911069 revset: add stub to handle parentpost operation
All operations will take 'order' flag, but p1() function won't.
2016-08-07 17:48:52 +09:00
Yuya Nishihara
b341e7372c revset: infer ordering flag to teach if operation should define/follow order
New flag 'order' is the hint to determine if a function or operation can
enforce its ordering requirement or take the ordering already defined. It
will be used to fix a couple of ordering bugs, such as:

 a) 'x & (y | z)' disregards the order of 'x' (issue5100)
 b) 'x & y:z' is listed from 'y' to 'z'
 c) 'x & y' can be rewritten as 'y & x' if weight(x) > weight(y)

(a) and (b) are bugs of the revset core. Before this, there was no way to
tell if 'orset()' and 'rangeset()' can enforce their ordering. These bugs
could be addressed by overriding __and__() of the initial set to take the
ordering of the other set:

    class fullreposet:
        def __and__(self, other):
            # allow other to enforce its ordering
            return other

but it would expose (c), which is a hidden bug of optimize(). So, either
way, optimize() has to know the current ordering requirement. Otherwise,
it couldn't rewrite expressions by weights with no output change, nor tell
how a revset function or operation should order the entries.

'order' is tri-state. It starts with 'define', and shifts to 'follow' by
'x & y'. It changes back to 'define' on function call 'f(x)' or function-like
operation 'x (f) y' because 'f' may have its own ordering requirement for 'x'
and 'y'. The state 'any' will allow us to avoid extra cost that would be
necessary to constrain ordering where it isn't important, 'not x'.
2016-02-16 22:02:16 +09:00
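The state transitions described in this message can be sketched as a small tree walk. It assumes a simplified tree of tuples with plain strings as leaves; `annotate` is a hypothetical name, and the handling of operators other than 'and', 'not' and 'func' (passing the flag down unchanged) is an assumption of this sketch.

```python
DEFINE, FOLLOW, ANY = 'define', 'follow', 'any'


def annotate(tree, order=DEFINE):
    """Return a copy of 'tree' with an order flag appended to each operator node."""
    if not isinstance(tree, tuple):
        return tree  # leaf (a plain string in this sketch)
    op = tree[0]
    if op == 'and':
        # 'x & y': x keeps the current requirement, y follows it
        return (op, annotate(tree[1], order), annotate(tree[2], FOLLOW), order)
    if op == 'not':
        # the order of a negated set is not important
        return (op, annotate(tree[1], ANY), order)
    if op == 'func':
        # 'f(x)': f may have its own ordering requirement for x
        return (op, tree[1], annotate(tree[2], DEFINE), order)
    # assumption: every other operator passes the requirement to its operands
    return (op,) + tuple(annotate(x, order) for x in tree[1:]) + (order,)


# 'x & (y | z)': the 'or' node is told to follow the order of 'x'
flagged = annotate(('and', 'x', ('or', 'y', 'z')))
```

With the flag attached to each node, an operation such as 'or' can tell whether it is free to impose its own order or has to preserve the order of the input set.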
Yuya Nishihara
ec5675abe6 revset: wrap arguments of 'or' by 'list' node
This makes the number of 'or' arguments deterministic so we can attach
additional ordering flag to all operator nodes. See the next patch.

We rewrite the tree immediately after chained 'or' operations are flattened
by simplifyinfixops(), so we don't need to care if arguments are stored in
x[1] or x[1:].
2016-08-07 17:04:05 +09:00
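The rewrite described above can be sketched as follows, assuming a simplified tree of tuples with plain strings as leaves. `wrap_or_args` is a hypothetical name, and the sketch expects chained 'or' operations to have been flattened already.

```python
def wrap_or_args(tree):
    """Wrap the operands of every 'or' node in a single 'list' node."""
    if not isinstance(tree, tuple):
        return tree  # leaf
    op = tree[0]
    args = tuple(wrap_or_args(x) for x in tree[1:])
    if op == 'or':
        # ('or', a, b, c) becomes ('or', ('list', a, b, c)):
        # the node now always has exactly one argument
        return (op, ('list',) + args)
    return (op,) + args


wrapped = wrap_or_args(('and', 'x', ('or', 'a', 'b', 'c')))
```

Once every 'or' node has a fixed number of arguments, an extra flag can be appended at a known position without checking how many operands precede it.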
Yuya Nishihara
c883bfb39b revset: remove showwarning option from expandaliases()
Now all callers pass showwarning=ui.warn, so we no longer need the option to
suppress warnings.
2016-09-08 22:44:10 +09:00
Yuya Nishihara
5fdde91bb9 revset: add public function to create matcher from evaluatable tree
"hg debugrevspec" will use it to evaluate unoptimized tree.
2016-08-21 11:37:00 +09:00
Yuya Nishihara
c0d8e08ed6 revset: make analyze() a separate step from optimize()
This will allow us to evaluate unoptimized tree and compare the result with
optimized one.

The private _analyze() function isn't renamed since I'll add more parameters
to it.
2016-08-21 11:29:57 +09:00
Yuya Nishihara
48b8bb9c60 revset: extract tree transformation from optimize()
This patch separates the simple tree transformation from the optimization step,
which is called _analyze() since I'll extend this function to infer ordering
flags. I want to avoid making _optimize() more complicated.

This will also allow us to evaluate unoptimized tree.
2016-08-07 14:35:03 +09:00
Yuya Nishihara
d8c35aee14 revset: do not partial-match operator and function names in optimize()
It was error-prone, and actually there was a typo, s/ancestorspec/ancestor/.
2016-08-07 16:36:08 +09:00
Yuya Nishihara
a63338644c revset: remove false condition to process 'negate' operator
'negate' is mapped to 'string' at the above clause.
2016-08-07 14:13:27 +09:00
Yuya Nishihara
02a180fc73 revset: make optimize() reject unknown operators
This should have caught the bug of 'keyvalue' operator fixed at 910346866463.
The catch-all pattern is useless since optimize() should be aware of all known
operators.
2016-08-07 15:01:42 +09:00
Gábor Stefanik
71039079d7 revset: support "follow(renamed.py, e22f4f3f06c3)" (issue5334)
v2: fixes from review
2016-08-18 17:25:10 +02:00
Augie Fackler
cb268cbd2f merge with stable 2016-08-15 12:26:02 -04:00
Yuya Nishihara
320973b5ef revset: fix keyword arguments to go through optimization process
Before, a keyvalue node was processed by the last catch-all condition of
_optimize(). Therefore, topo.firstbranch=expr would bypass tree rewriting
and would crash if an expr wasn't trivial.
2016-08-07 14:58:49 +09:00
FUJIWARA Katsunori
5f2b407a05 revset: refactor to make xgettext put i18n comments into hg.pot file
xgettext expects both "_()" and (a part of) text to be placed at just
next line of "i18n:" comment.
2016-08-01 06:08:26 +09:00
Yuya Nishihara
1cc6421086 revset: also parse x^: as (x^):
Given x^:y is (x^):y, this seems sensible.
2016-08-06 20:37:48 +09:00
Yuya Nishihara
992f4bdde9 revset: resolve ambiguity of x^:y before alias expansion
This is purely a parsing problem, which should be resolved before alias
expansion.
2016-08-06 20:21:00 +09:00
Yuya Nishihara
ba3291048d revset: check invalid function syntax "func-name"() explicitly
Before, the error was caught at func() as an unknown identifier, and the
optimizer failed to detect the syntax error. This patch introduces getsymbol()
helper to ensure that a string is not allowed as a function name.
2016-06-27 20:44:14 +09:00
Yuya Nishihara
7323118189 revset: get rid of redundant error checking from match()
Actually there was no additional error checking. It should be caught by
"not all(specs)".
2016-06-26 17:16:57 +09:00
Gregory Szorc
aa5486b692 revset: implement match() in terms of matchany()
match() is the special case of a single element list being passed
to matchany() with the additional error checking that the revset
spec is defined. Change the implementation to remove the redundant
code and have match() call matchany().
2016-06-25 19:10:46 -07:00
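The delegation described above can be sketched with toy matchers. This is not Mercurial's API: the `PREDICATES` table, the simplified signatures and the use of `ValueError` are assumptions made to keep the example self-contained.

```python
# Hypothetical predicate table standing in for real revset parsing.
PREDICATES = {
    'even': lambda r: r % 2 == 0,
    'small': lambda r: r < 3,
}


def matchany(specs):
    """Build a matcher accepting revisions that satisfy any of 'specs'."""
    if not specs:
        return lambda revs: []
    if not all(specs):
        # an empty spec in the list is rejected here, once, for all callers
        raise ValueError('empty query')
    preds = [PREDICATES[s] for s in specs]
    return lambda revs: [r for r in revs if any(p(r) for p in preds)]


def match(spec):
    """Single-spec case, implemented as a one-element list for matchany()."""
    return matchany([spec])


evens = match('even')(range(6))
either = matchany(['even', 'small'])(range(6))
```

Since matchany() already rejects empty specs through the all(specs) check, match() needs no check of its own, which is the point made by the follow-up patch listed above this one.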
Martin von Zweigbergk
59e258644a revset: make head() honor order of subset
The ordering of 'x & head()' was broken in 329d82866742 (revset:
improve head revset performance, 2014-03-13). Presumably due to other
optimizations since then, undoing that change to fix the order does
not slow down the simple case of "hg log -r 'head()'" mentioned in
that commit. I see a small slowdown from ~0.16s to ~0.19s with
'not 0 & head()', but I'd say it's worth it for the correct output.
2016-06-23 12:37:09 -07:00
Martin von Zweigbergk
f44cccc475 revsets: use itervalues() where only values are needed
I don't think there will be a noticeable speedup, but it removes an
unused variable.
2016-06-23 13:08:10 -07:00
Martin von Zweigbergk
792eb4d1ec revsets: passing a set to baseset() is not wrong
Since 303be3afebae (revset: force ascending order for baseset
initialized from a set, 2016-04-04), it is safe to pass a revset to a
baseset.
2016-06-23 12:39:05 -07:00
liscju
c7ec9d159e i18n: translate abort messages
I found a few places where the message given to abort is
not translated. I don't see any reason not to translate
them.
2016-06-14 11:53:55 +02:00
Yuya Nishihara
1c12bcd4ad revset: extract function that validates sort() arguments
This function will be used in _optimize() to get rid of noop sort() call while
validating its arguments.
2016-06-11 10:17:49 +09:00
Yuya Nishihara
2ddc948200 revset: build dict of extra sort options before evaluating set
Prepares for extracting a function that only validates sort options.
2016-06-15 21:26:45 +09:00
Yuya Nishihara
1ee4615445 revset: build list of (key, reverse) pairs before sorting
Prepares for extracting a function that only validates sort options.
2016-06-11 10:15:40 +09:00
Yuya Nishihara
4659949bd1 revset: fix crash on empty sort key
Make it noop as before ddf6bfe09ab2. We could change it to an error, but
allowing empty key makes some sense for scripting that builds a key string
programmatically.
2016-06-15 20:37:24 +09:00
Martijn Pieters
6cc53d84c9 revset: add new topographical sort
Sort revisions in reverse revision order but grouped by topographical branches.
Visualised as a graph, instead of:

  o  4
  |
  | o  3
  | |
  | o  2
  | |
  o |  1
  |/
  o  0

revisions on a 'main' branch are emitted before 'side' branches:

  o  4
  |
  o  1
  |
  | o  3
  | |
  | o  2
  |/
  o  0

where what constitutes a 'main' branch is configurable, so the sort could also
result in:

  o  3
  |
  o  2
  |
  | o  4
  | |
  | o  1
  |/
  o  0

This sort was already available as an experimental option in the graphmod
module, from which it is now removed.

This sort is best used with hg log -G:

  $ hg log -G "sort(all(), topo)"
2016-06-13 18:20:00 +01:00
Martijn Pieters
57bf8caf56 revset: move groupbranchiter over from graphmod
This move is to prepare the adaptation of this function into a toposort
predicate.
2016-06-13 18:20:00 +01:00
Martijn Pieters
ffccd3fc81 revset: record if a set is in topographical order
A later revision adds actual topographical sorting. Recording if a set is in
this order allows hg log -G to avoid re-sorting the revset.
2016-06-14 11:05:36 +01:00
Kostia Balytskyi
5501c91461 revset: make filteredset.__nonzero__ respect the order of the filteredset
This fix allows __nonzero__ to respect the direction of iteration of the
whole filteredset. Here's the case when it matters. Imagine that we have a
very large repository and we want to execute a command like:

    $ hg log --rev '(tip:0) and user(ikostia)' --limit 1

(we want to get the latest commit by me).

Mercurial will evaluate a filteredset lazy data structure, an
instance of the filteredset class, which will know that it has to iterate
in a descending order (isdescending() will return True if called). This
means that when some code iterates over the instance of this filteredset,
the 'and user(ikostia)' condition will be first checked on the latest
revision, then on the second latest and so on, allowing Mercurial to
print matches as it finds them. However, cmdutil.getgraphlogrevs
contains the following code:

    revs = _logrevs(repo, opts)
    if not revs:
        return revset.baseset(), None, None

The "not revs" expression is evaluated by calling filteredset.__nonzero__,
which in its current implementation will try to iterate the filteredset
in ascending order until it finds a revision that matches the 'and user(..'
condition. If the condition is only true on late revisions, a lot of
useless iterations will be done. These iterations could be avoided if
__nonzero__ followed the order of the filteredset, which in my opinion
is a sensible thing to do here.

The problem gets even worse when instead of 'user(ikostia)' some more
expensive check is performed, like grepping the commit diff.


I tested this fix on a very large repo where tip is my commit and my very
first commit comes fairly late in the revision history. Timing results for
the above command on that repo:

-with my fix:
real    0m1.795s
user    0m1.657s
sys     0m0.135s

-without my fix:
real    1m29.245s
user    1m28.223s
sys     0m0.929s

I understand that this is a very specific kind of problem that presents
itself very rarely, only on very big repositories and with expensive
checks and so on. But I don't see any disadvantages to this kind of fix
either.
2016-06-02 22:39:01 +01:00
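The effect described in this message can be reproduced with a toy class that counts how often the condition is evaluated. It is a minimal sketch, not the real filteredset: the class layout, the `nonzero_ascending` method (standing in for the old behaviour) and the `by_me` predicate are assumptions.

```python
class FilteredSet:
    """Toy lazy set: revisions are filtered by 'condition' on iteration."""

    def __init__(self, revs, condition, descending=False):
        self._revs = sorted(revs)
        self._condition = condition
        self._descending = descending

    def isdescending(self):
        return self._descending

    def __iter__(self):
        revs = reversed(self._revs) if self._descending else self._revs
        return (r for r in revs if self._condition(r))

    def nonzero_ascending(self):
        # old behaviour as described: always probe from the lowest revision
        return any(True for r in self._revs if self._condition(r))

    def __bool__(self):
        # fixed behaviour: probe in the set's own iteration order
        for _ in self:
            return True
        return False

    __nonzero__ = __bool__  # Python 2 spelling


calls = []


def by_me(rev):
    """Expensive check that only matches late revisions."""
    calls.append(rev)
    return rev >= 99990


revs = FilteredSet(range(100000), by_me, descending=True)

revs.nonzero_ascending()
ascending_calls = len(calls)   # walks up from revision 0

del calls[:]
bool(revs)
descending_calls = len(calls)  # stops at the first match from the top
```

In this toy setup the ascending probe evaluates the condition on almost every revision before finding a match, while the order-respecting probe stops after the first one.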
Yuya Nishihara
e2af538ed7 revset: define table of sort() key functions
This should be more readable than big "if" branch.
2016-05-14 19:52:00 +09:00
Yuya Nishihara
9079f0d3ff revset: factor out reverse flag of sort() key
Prepares for making a table of sort keys. This assumes 'k' has at least one
character, which should be guaranteed by keys.split().
2016-05-14 19:46:18 +09:00
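A table of key functions combined with a separate reverse flag might look like the sketch below. It assumes that a leading '-' marks a reversed key, as in Mercurial's sort() syntax; the `SORT_KEYS` entries, the dictionary-based revisions and the helper names are hypothetical.

```python
from operator import itemgetter

# Hypothetical key table; the entries of the real table are not shown in the log.
SORT_KEYS = {
    'rev': itemgetter('rev'),
    'user': itemgetter('user'),
    'date': itemgetter('date'),
}


def parse_keys(keys):
    """Turn a string such as 'user -rev' into a list of (key, reverse) pairs."""
    pairs = []
    for k in keys.split():
        reverse = k[0] == '-'  # safe: split() never yields an empty string
        if reverse:
            k = k[1:]
        if k not in SORT_KEYS:
            raise ValueError('unknown sort key %r' % k)
        pairs.append((k, reverse))
    return pairs


def sort_revs(revs, keys):
    """Sort by several keys; an empty key string leaves the order unchanged."""
    out = list(revs)
    # apply keys from last to first: list.sort() is stable, so earlier keys win
    for k, reverse in reversed(parse_keys(keys)):
        out.sort(key=SORT_KEYS[k], reverse=reverse)
    return out


revs = [
    {'rev': 0, 'user': 'bob', 'date': 3},
    {'rev': 1, 'user': 'alice', 'date': 2},
    {'rev': 2, 'user': 'bob', 'date': 1},
]
by_user_then_newest = [r['rev'] for r in sort_revs(revs, 'user -rev')]
```

Validating the keys in `parse_keys` before any sorting happens keeps the validation step separable from the evaluation step.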
Martijn Pieters
4501d84df3 revset: use getargsdict for sort()
This makes it possible to use keyword arguments to specify per-sort options.
For example, a hypothetical 'first' option for the user sort could sort certain
users first with:

    sort(all(), user, user.first=mpm@selenic.com)
2016-05-23 14:09:50 -07:00
timeless
a1cb3173a2 py3: convert to next() function
next(..) was introduced in py2.6 and .next() is not available in py3

https://docs.python.org/2/library/functions.html#next
2016-05-16 21:30:53 +00:00
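As a short illustration of the conversion, the built-in next() works on Python 2.6+ and Python 3 alike, while the method spelling exists only on Python 2 iterators. The variable names are illustrative.

```python
revs = iter([3, 2, 1])

first = next(revs)        # portable spelling
# first = revs.next()     # Python 2 only; AttributeError on Python 3

rest = list(revs)         # consume what is left
last = next(revs, None)   # optional default instead of raising StopIteration
```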
timeless
db06992202 revset: rename variable to avoid shadowing with builtin next() function
https://docs.python.org/2/library/functions.html#next
2016-05-16 21:30:32 +00:00