Commit Graph

854 Commits

Author SHA1 Message Date
Denis Laxalde
a70f2fcec7 revset: factor out linerange processing into a utility function
Similar processing will be done in hgweb.webutil in forthcoming changeset.
2017-02-24 18:39:08 +01:00
Denis Laxalde
ca5e4eec65 context: also return ancestor's line range in blockancestors 2017-01-16 17:14:36 +01:00
Yuya Nishihara
b2229f5117 revset: split language services to revsetlang module (API)
New revsetlang module hosts parser, tokenizer, and miscellaneous functions
working on parsed tree. It does not include functions for evaluation such as
getset() and match().

  2288 mercurial/revset.py
   684 mercurial/revsetlang.py
  2972 total

get*() functions are aliased since they are common in revset.py.
2017-02-19 18:19:33 +09:00
Jun Wu
bc5a0cb908 revset: use phasecache.getrevset
This is part of a refactoring that moves some phase query optimization from
revset.py to phases.py. See the previous patch for motivation.

This patch changes revset code to use phasecache.getrevset so it no longer
accesses the private field: _phasecache._phasesets directly.

For performance impact, this patch was tested using the following query, on
my hg-committed repo:

    for i in 'public()' 'not public()' 'draft()' 'not draft()'; do
        echo $i;
        hg perfrevset "$i";
        hg perfrevset "$i" --hidden;
    done

For the CPython implementation, most operations are unchanged (within
+/- 1%), while "not public()" and "draft()" are noticeably faster on an
unfiltered repo. It may be because the new code avoids a set copy if
filteredrevs is empty.

  revset  | public()      | not public() | draft()    | not draft()
  hidden  |  yes  |  no   |   yes |  no  | yes |  no  | yes  |  no
  ------------------------------------------------------------------
  before  | 19006 | 17352 |   239 |  286 | 180 |  228 | 7690 | 5745
  after   | 19137 | 17231 |   240 |  207 | 182 |  150 | 7687 | 5658
  delta   |                       | -38% |     | -52% |

  (timed in microseconds)

For the pure Python implementation, some operations are faster while "not
draft()" is noticeably slower:

  revset  | public()      | not public()  | draft()       | not draft()
  hidden  |  yes  |  no   |   yes |  no   | yes   |  no   | yes   |  no
  ------------------------------------------------------------------------
  before  | 18852 | 17183 | 17758 | 15921 | 17505 | 15973 | 41521 | 39822
  after   | 18924 | 17380 | 17558 | 14545 | 16727 | 13593 | 48356 | 43992
  delta   |                       |   -9% |   -5% |  -15% |  +16% |  +10%

That may be due to the different performance characteristics of generatorset vs.
filteredset. The "not draft()" query could be optimized in this case where
both "public" and "secret" are passed to "getrevsets" so it won't iterate
the whole repo twice.
2017-02-18 00:39:31 -08:00
Martin von Zweigbergk
7ddb655b81 destutil: drop now-unused "check" parameter from destupdate() 2017-02-13 11:32:09 -08:00
Yuya Nishihara
2e50d5587f smartset: move set classes and related functions from revset module (API)
These classes are pretty large and independent from revset computation.

  2961 mercurial/revset.py
   973 mercurial/smartset.py
  3934 total

revset.prettyformatset() is renamed to smartset.prettyformat(). Smartset
classes are aliased since they are quite common in revset.py.
2016-10-16 17:28:51 +09:00
Yuya Nishihara
74023f2b13 revset: prevent using outgoing() and remote() in hgweb session (BC)
outgoing() and remote() may stall for a long time due to network I/O, which seems
unsafe per definition, "whether a predicate is safe for DoS attack." But I'm
not 100% sure about this. If our concern isn't elapsed time but CPU resource,
these predicates are considered safe. Perhaps that would be up to the
web/application server configuration?

Anyway, outgoing() and remote() wouldn't be useful in hgweb, so I think
it's okay to ban them.
2017-01-20 21:33:18 +09:00
Yuya Nishihara
5ade140d5c revset: abuse x:y syntax to specify line range of followlines()
This slightly complicates the parsing (see the previous patch), but the
overall result seems not bad.

I keep x:, :y and : for future extension.
2017-01-09 17:58:19 +09:00
Yuya Nishihara
615f3c1669 revset: do not transform range* operators in parsed tree
This allows us to handle x:y range as a general range object. A primary user
of it is followlines().
2017-01-09 16:55:56 +09:00
Yuya Nishihara
0f4a24bbbf revset: add default value to getinteger() helper
This seems handy.
2017-01-09 17:45:11 +09:00
Yuya Nishihara
49d42c696d revset: factor out getinteger() helper
We have 4 revset functions that take integer arguments, and they handle
their arguments in slightly different ways. This patch unifies them:

 - getstring() in place of getsymbol(), which is more consistent with the
   handling of integer revisions (both 1 and '1' are valid)
 - say "expects" instead of "requires" for type errors

We don't need to catch TypeError since getstring() must return a string.
2017-01-09 17:39:44 +09:00
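Sketch for 49d42c696d (illustrative only, not the code from revset.py): one way a unified integer helper along the lines described above could look. The tuple node shapes, the ParseError stand-in, and the _notset sentinel are assumptions made for this example; the optional default mirrors the idea of 0f4a24bbbf.

```python
class ParseError(Exception):
    """Stand-in for Mercurial's parse error type (assumed for this sketch)."""


# Sentinel so that any value, including None, can be passed as a default.
_notset = object()


def getstring(x, err):
    # Accept both quoted strings and bare symbols, so 1 and '1' are both valid.
    if x and x[0] in ("string", "symbol"):
        return x[1]
    raise ParseError(err)


def getinteger(x, err, default=_notset):
    # A missing argument falls back to the default, if one was given.
    if not x and default is not _notset:
        return default
    try:
        return int(getstring(x, err))
    except ValueError:
        # getstring() always returns a string, so only ValueError can occur.
        raise ParseError(err)
```

With this shape, callers only supply the error message, for example getinteger(arg, "limit expects a number").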
Yuya Nishihara
a73b0aaf6b revset: rename rev argument of followlines() to startrev
The rev argument has the same meaning as startrev of follow(), and I think
startrev is more informative.

followlines() is a new function, so we can make a BC-breaking change now.
2017-01-09 16:16:26 +09:00
Yuya Nishihara
a0c3bc199a help: use :hg: role and canonical name to point to revset string patterns
Follows up ae418afed3f6. Now revisions.txt and revsets.txt have been merged,
so use revisions.* as a pointer.
2017-01-13 23:48:21 +09:00
Matt Harbison
d3bfb5a06a help: eliminate duplicate text for revset string patterns
There's no reason to duplicate this so many times, and it's likely an instance
will be missed if support for a new pattern is added and documented.  The
stringmatcher is mostly used by revsets, though it is also used for the 'tag'
related templates, and namespace filtering in the journal extension.  So maybe
there's a better place to document it.  `hg help patterns` seems inappropriate,
because that is all file pattern matching.

While here, indicate how to perform case insensitive regex searches.
2017-01-07 23:35:35 -05:00
Matt Harbison
e0b76f5323 revset: add regular expression support to 'desc'
This is a case insensitive predicate like 'author', so it conforms to the
existing behavior of performing a case insensitive regex.
2017-01-07 21:26:32 -05:00
Matt Harbison
840ab22fff revset: stop lowercasing the regex pattern for 'author'
It was probably unintentional for regex, as the meaning of some sequences like
\S and \s is actually inverted by changing the case.  For backward compatibility
however, the matching is forced to case insensitive.
2017-01-11 22:42:10 -05:00
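Sketch for 840ab22fff: a small demonstration, using only Python's re module, of why lowercasing a regex pattern changes its meaning. The pattern and author string are made up for illustration, and this is not the code from revset.py.

```python
import re

pattern = r"smith\S+"      # "smith" followed by non-whitespace characters
lowered = pattern.lower()  # r"smith\s+": now "smith" followed by whitespace

author = "Smithson"

# Lowercasing both the pattern and the text silently inverts \S into \s.
old_match = re.search(lowered, author.lower())

# Keeping the pattern intact and matching case-insensitively preserves it.
new_match = re.search(pattern, author, re.IGNORECASE)
```

Here old_match is None because nothing in "smithson" is whitespace, while new_match matches the whole name.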
Matt Harbison
762a49215b revset: point to 'grep' in the 'keyword' help for regex searches
The help for 'grep' already points to 'keyword'.
2017-01-11 23:13:51 -05:00
Yuya Nishihara
d04abe7517 revset: parse variable-length arguments of followlines() by getargsdict() 2017-01-09 16:02:56 +09:00
Yuya Nishihara
b1575d5948 parser: extend buildargsdict() to support variable-length positional args
This can simplify the argument parsing of followlines(). Tests are added by
the next patch.
2017-01-09 15:25:52 +09:00
Denis Laxalde
20d1dad252 revset: add a followlines(file, fromline, toline[, rev]) revset
This revset returns the history of a range of lines (fromline, toline) of a
file starting from `rev` or the current working directory.

Added tests in test-annotate.t which already contains a reasonably complex
repository.
2017-01-04 16:47:49 +01:00
Yuya Nishihara
a7a60a2e43 revset: drop TODO comment about sorting issue of fullreposet
The bootstrapping issue was addressed at the parsing phase and we expect
that fullreposet.__and__() fully complies with the smartset API, in which
'self & other' should return a result set in self's order. See also
ab938e7ae803.
2016-05-14 20:52:44 +09:00
Yuya Nishihara
2fa6a1e65e revset: document wdir() as an experimental function
Let's resurrect the docstring since our help module can detect the EXPERIMENTAL
tag and display it only if -v is specified.

This patch updates the test added by bbdfa2d5aaa2 since wdir() is now
documented.
2017-01-05 22:53:42 +09:00
Yuya Nishihara
ec99971228 revset: categorize wdir() as very fast function
The cost of wdir() should be identical to or cheaper than _intlist().
2016-08-20 17:50:23 +09:00
Yuya Nishihara
14fa3ba925 revset: make children() not look at p2 if null (issue5439)
Unlike p1 = null, p2 = null denotes the revision has only one parent, which
shouldn't be considered a child of the null revision. This was spotted while
fixing the issue4682 and rediscovered as issue5439.
2015-05-23 11:04:11 +09:00
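Sketch for 14fa3ba925: a simplified model of the rule described above, assuming a plain dict that maps each revision to its (p1, p2) pair and -1 as the null revision. The function and data layout are hypothetical and much simpler than the real implementation.

```python
NULLREV = -1  # assumed encoding of the null revision


def children(parents, parentset):
    """Return revisions whose parents are in parentset.

    parents maps rev -> (p1, p2). A null p2 only means "no second parent",
    so it must never make a revision a child of the null revision.
    """
    result = set()
    for rev, (p1, p2) in parents.items():
        if p1 in parentset:
            result.add(rev)
        elif p2 != NULLREV and p2 in parentset:
            result.add(rev)
    return result


# 0 is a root; 1 and 2 branch off 0; 3 merges 1 and 2.
history = {0: (-1, -1), 1: (0, -1), 2: (0, -1), 3: (1, 2)}
roots = children(history, {NULLREV})
```

Without the p2 check, revisions 1 and 2 would also be reported as children of null.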
Augie Fackler
f6cdc4e606 revset: avoid shadowing a variable with a list comprehension 2016-11-10 16:35:10 -05:00
Mads Kiilerich
38cb771268 spelling: fixes of non-dictionary words 2016-10-17 23:16:55 +02:00
Mads Kiilerich
b4b748a9ed revset: don't cache abstractsmartset min/max invocations infinitely
There was a "leak", apparently introduced in b37a67b41690. When running:

    import hglib
    hg = hglib.open('repo')
    while True:
        hg.log("max(branch('default'))")

all filteredset instances from branch() would be cached indefinitely by the
@util.cachefunc annotation on the max() implementation.

util.cachefunc seems dangerous as a method decorator and is barely used elsewhere
in the code base. Instead, just open code caching by having the min/max
methods replace themselves with a plain lambda returning the result.
2016-10-25 18:56:27 +02:00
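Sketch for b4b748a9ed: the "method replaces itself with a plain lambda" idea, shown on a hypothetical class rather than on abstractsmartset. The class name and the computations counter are assumptions added only to make the behaviour observable.

```python
class RevSet:
    """Hypothetical stand-in for a smartset holding revision numbers."""

    def __init__(self, revs):
        self._revs = list(revs)
        self.computations = 0  # counts how often max() is really computed

    def max(self):
        self.computations += 1
        result = max(self._revs)
        # Shadow the method on this instance only. The cached value lives in
        # the instance's own attributes and is freed together with it, unlike
        # a memoizing decorator that keeps every instance alive as a cache key.
        self.max = lambda: result
        return result


s = RevSet([3, 1, 2])
first = s.max()
second = s.max()
```

The second call hits the lambda stored on the instance, so the underlying computation runs once per object.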
Mads Kiilerich
20a4281d3a revset: optimize for destination() being "inefficient"
destination() will scan through the whole subset and read extras for each
revision to get its source.
2016-10-17 19:48:36 +02:00
Yuya Nishihara
7e790cf836 revset: for x^2, do not take null as a valid p2 revision
Since we don't count null p2 revision as a parent, x^2 should never return
null even if null is explicitly populated.
2016-10-14 23:33:00 +09:00
Yuya Nishihara
e0c2008a2f revset: make follow() reject more than one start revision
Taking only the last revision is inconsistent because ancestors(set) follows
all revisions given, and theoretically follow(startrev=set) == ancestors(set).
I'm planning to add support for multiple start revisions, but that won't fit
into the 4.0 time frame. So reject multiple revisions now to avoid future BC.

len(revs) might be slow if revs were large, but we don't care since a valid
revs should have only one element.
2016-10-10 22:30:09 +02:00
Augie Fackler
efaaf08415 revset: build _syminitletters from a saner source: the string module
For now, these sets will be unicode characters in Python 3, which is
probably wrong, but it un-blocks importing the module so we can get
further along. In the future we'll have to come up with a reasonable
encoding strategy for revsets in Python 3.

This patch was originally pair-programmed with Martijn.
2016-10-07 08:32:40 -04:00
Augie Fackler
a4faa822dc revset: define _symletters in terms of _syminitletters 2016-10-07 08:09:23 -04:00
Augie Fackler
71a3469685 revset: remove doubled space 2016-10-07 08:03:30 -04:00
Yuya Nishihara
7078b29941 revset: do not rewrite ':y' to '0:y' (issue5385)
That's no longer valid since the revision 0 may be hidden. Bypass validating
the existence of '0' and filter it by spanset.
2016-10-01 20:20:11 +09:00
Yuya Nishihara
b94864e756 revset: extract function that creates range set from computed revisions
So we can pass m=0 to _makerangeset() even if the revision 0 is hidden.
Hidden revisions are filtered by spanset.
2016-10-01 20:11:48 +09:00
Yuya Nishihara
a667f16896 revset: add option to make matcher take the ordering of the input set
This allows us to evaluate match(subset) as if 'subset & expr', which will
be the complete fix for the issue5100.
2016-05-03 14:18:28 +09:00
Yuya Nishihara
6fa461cbb8 revset: make sort() noop depending on ordering requirement (BC)
See the previous patch for why.
2016-05-03 13:36:12 +09:00
Yuya Nishihara
6a7529851f revset: make reverse() noop depending on ordering requirement (BC)
Because smartset.reverse() may modify the underlying subset, it should be
called only if the set can define the ordering.

In the following example, 'a' and 'c' are the same object, so 'b.reverse()'
would reverse 'a' unexpectedly.

  # '0:2 & reverse(all())'
  <filteredset
    <spanset- 0:2>,    # a
    <filteredset       # b
      <spanset- 0:2>,  # c
      <spanset+ 0:9>>>
2016-05-03 13:36:12 +09:00
Yuya Nishihara
2ef98dedc5 revset: fix order of nested 'range' expression (BC)
Enforce range order only if necessary as the comment says "carrying the
sorting over would be more efficient."
2016-05-03 12:52:50 +09:00
Yuya Nishihara
d7145726a7 revset: forward ordering requirement to argument of present()
present() is special in that it returns the argument set with no
modification, so the ordering requirement should be forwarded.

We could make present() fix the order like orset(), but that would be silly
because we know the extra filtering cost is unnecessary.
2016-06-01 20:54:04 +09:00
Yuya Nishihara
eb51a746be revset: fix order of nested '_(|int|hex)list' expression (BC)
This fixes the order of 'x & (y + z)' where 'y' and 'z' are trivial, and the
other uses of _list()-family functions. The original functions are renamed to
'_ordered(|int|hex)list' to say clearly that they do not follow the subset
ordering.
2016-06-26 18:41:28 +09:00
Yuya Nishihara
46051dc16a revset: fix order of nested 'or' expression (BC)
This fixes the order of 'x & (y + z)' where 'y' and 'z' are not trivial.

The follow-order 'or' operation is slower than the ordered operation if
an input set is large:

       #0           #1           #2           #3
    0) 0.002968     0.002980     0.002982     0.073042
    1) 0.004513     0.004485     0.012029     0.075261

    #0: 0:4000 & (0:1099 + 1000:2099 + 2000:3099)
    #1: 4000:0 & (0:1099 + 1000:2099 + 2000:3099)
    #2: 10000:0 & (0:1099 + 1000:2099 + 2000:3099)
    #3: file("path:hg") & (0:1099 + 1000:2099 + 2000:3099)

I've tried another implementation, but it appeared to be slower than
this version.

    ss = [getset(repo, fullreposet(repo), x) for x in xs]
    return subset.filter(lambda r: any(r in s for s in ss), cache=False)
2016-06-26 18:17:12 +09:00
Yuya Nishihara
7706430a92 revset: add 'takeorder' attribute to mark functions that need ordering flag
Since most functions shouldn't need 'order' flag, it is passed only when
explicitly required. This avoids large API breakage.
2016-08-07 17:58:50 +09:00
Yuya Nishihara
a8ff6aeb43 revset: pass around ordering flags to operations
Some operations and functions will need them to fix ordering bugs.
2016-08-07 17:46:12 +09:00
Yuya Nishihara
bc25911069 revset: add stub to handle parentpost operation
All operations will take 'order' flag, but p1() function won't.
2016-08-07 17:48:52 +09:00
Yuya Nishihara
b341e7372c revset: infer ordering flag to teach if operation should define/follow order
New flag 'order' is the hint to determine if a function or operation can
enforce its ordering requirement or take the ordering already defined. It
will be used to fix a couple of ordering bugs, such as:

 a) 'x & (y | z)' disregards the order of 'x' (issue5100)
 b) 'x & y:z' is listed from 'y' to 'z'
 c) 'x & y' can be rewritten as 'y & x' if weight(x) > weight(y)

(a) and (b) are bugs of the revset core. Before this, there was no way to
tell if 'orset()' and 'rangeset()' can enforce their ordering. These bugs
could be addressed by overriding __and__() of the initial set to take the
ordering of the other set:

    class fullreposet:
        def __and__(self, other):
            # allow other to enforce its ordering
            return other

but it would expose (c), which is a hidden bug of optimize(). So, either
way, optimize() has to know the current ordering requirement. Otherwise,
it couldn't rewrite expressions by weights with no output change, nor tell
how a revset function or operation should order the entries.

'order' is tri-state. It starts with 'define', and shifts to 'follow' by
'x & y'. It changes back to 'define' on function call 'f(x)' or function-like
operation 'x (f) y' because 'f' may have its own ordering requirement for 'x'
and 'y'. The state 'any' will allow us to avoid extra cost that would be
necessary to constrain ordering where it isn't important, 'not x'.
2016-02-16 22:02:16 +09:00
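Sketch for b341e7372c: a toy pass that attaches an ordering flag to each operator node of a tuple-based tree, following the transitions described above ('define' at the top, 'follow' for the right side of 'x & y', 'define' again for function arguments, 'any' under 'not'). The node shapes and the annotate() name are assumptions for this example; it is a simplification, not the analysis code in revset.py.

```python
def annotate(tree, order="define"):
    """Return a copy of tree with an order flag appended to operator nodes."""
    op = tree[0]
    if op == "symbol":
        # Leaves carry no flag in this sketch.
        return tree
    if op == "and":
        # The left operand keeps the current requirement; the right follows it.
        return (op, annotate(tree[1], order), annotate(tree[2], "follow"), order)
    if op == "or":
        return (op, annotate(tree[1], order), annotate(tree[2], order), order)
    if op == "not":
        # Ordering of the excluded set does not matter.
        return (op, annotate(tree[1], "any"), order)
    if op == "func":
        # ('func', name, arg): a function may define its own ordering for arg.
        return (op, tree[1], annotate(tree[2], "define"), order)
    raise ValueError("unknown node type: %r" % (op,))


# 'x & (y | z)': the 'or' node must follow the order of 'x' (issue5100).
tree = ("and", ("symbol", "x"), ("or", ("symbol", "y"), ("symbol", "z")))
flagged = annotate(tree)
```

In flagged, the outer 'and' is tagged 'define' and the nested 'or' is tagged 'follow', which is the information an optimizer would need before reordering operands by weight.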
Yuya Nishihara
ec5675abe6 revset: wrap arguments of 'or' by 'list' node
This makes the number of 'or' arguments deterministic so we can attach
additional ordering flag to all operator nodes. See the next patch.

We rewrite the tree immediately after chained 'or' operations are flattened
by simplifyinfixops(), so we don't need to care if arguments are stored in
x[1] or x[1:].
2016-08-07 17:04:05 +09:00
Yuya Nishihara
c883bfb39b revset: remove showwarning option from expandaliases()
Now all callers pass showwarning=ui.warn, so we no longer need the option to
suppress warnings.
2016-09-08 22:44:10 +09:00
Yuya Nishihara
5fdde91bb9 revset: add public function to create matcher from evaluatable tree
"hg debugrevspec" will use it to evaluate an unoptimized tree.
2016-08-21 11:37:00 +09:00
Yuya Nishihara
c0d8e08ed6 revset: make analyze() a separate step from optimize()
This will allow us to evaluate an unoptimized tree and compare the result with
the optimized one.

The private _analyze() function isn't renamed since I'll add more parameters
to it.
2016-08-21 11:29:57 +09:00