Commit Graph

19 Commits

Author SHA1 Message Date
Jun Wu
eaaeecf0d5 revset: optimize "draft() & ::x" pattern
The `draft() & ::x` type query could be common for selecting one or more
draft feature branches being worked on.

Before this patch, `::x` may travel through the changelog DAG for a long
distance until it gets a smaller revision number than `min(draft())`. It
could be very slow on long changelog with distant (in terms of revision
numbers) drafts.

This patch adds a fast path for this situation, and will stop traveling the
changelog DAG once `::x` hits a non-draft revision.

The fast path also works for `secret()` and `not public()`.

To measure the performance difference, I used drawdag to create a repo that
emulates distant drafts:

          DRAFT4
           |
          DRAFT3 # draft
          /
  PUBLIC9999 # public
      |
  PUBLIC9998
      |
      .   DRAFT2
      .    |
      .   DRAFT1 # draft
      |   /
  PUBLIC0001 # public

And measured the performance using the repo:

  (BEFORE)
  $ hg perfrevset 'draft() & ::(DRAFT2+DRAFT4)'
  ! wall 0.017132 comb 0.010000 user 0.010000 sys 0.000000 (best of 156)
  $ hg perfrevset 'draft() & ::(all())'
  ! wall 0.024221 comb 0.030000 user 0.030000 sys 0.000000 (best of 113)
  (AFTER)
  $ hg perfrevset 'draft() & ::(DRAFT2+DRAFT4)'
  ! wall 0.000243 comb 0.000000 user 0.000000 sys 0.000000 (best of 9303)
  $ hg perfrevset 'draft() & ::(all())'
  ! wall 0.004319 comb 0.000000 user 0.000000 sys 0.000000 (best of 655)

Differential Revision: https://phab.mercurial-scm.org/D441
2017-08-28 14:49:00 -07:00
Denis Laxalde
fbe693e88b followlines: join merge parents line ranges in blockdescendants() (issue5595)
In blockdescendants(), we had an assertion when line range of a merge
changeset was not consistent depending on which parent was considered for
computation. For instance, this might occur when file content (in lookup
range) is significantly different between parent branches of the merge as
demonstrated in added tests (where we almost completely rewrite the "baz" file
while also introducing similarities with its content in the other branch we
later merge to).

Now, in such case, we combine line ranges from all parents by storing the
envelope of both line ranges. This is conservative (the line range is
extended, possibly unnecessarily) but at least this should avoid missing
descendants with changes in a range that would fall in that of one parent but
not in another one (the case of "baz: narrow change (2->2+)" changeset in
tests).
2017-07-05 13:54:53 +02:00
Yuya Nishihara
44aa43c0dc revset: add depth limit to descendants() (issue5374)
This is naive implementation using two-pass scanning. Tracking descendants
isn't an easy problem if both start and stop depths are specified. It's
impractical to remember all possible depths of each node while scanning from
roots to descendants because the number of depths explodes. Instead, we could
cache (min, max) depths as a good approximation and track ancestors back when
needed, but that's likely to have off-by-one bug.

Since this implementation appears not significantly slower, and is quite
straightforward, I think it's good enough for practical use cases. The time
and space complexity is O(n) ish.

  revisions:
  0) 1-pass scanning with (min, max)-depth cache (worst-case quadratic)
  1) 2-pass scanning (this version)

  repository:
  mozilla-central

  # descendants(0) (for reference)
  *) 0.430353

  # descendants(0, depth=1000)
  0) 0.264889
  1) 0.398289

  # descendants(limit(tip:0, 1, offset=10000), depth=1000)
  0) 0.025478
  1) 0.029099

  # descendants(0, depth=2000, startdepth=1000)
  0) painfully slow (due to quadratic backtracking of ancestors)
  1) 1.531138
2017-06-24 23:05:57 +09:00
Yuya Nishihara
34373a25d4 dagop: make walk direction switchable so it can track descendants
# ancestors(tip) using hg repo
  2) 0.068527
  3) 0.069097
2017-06-24 23:35:03 +09:00
Yuya Nishihara
06592918ad dagop: factor out generator of ancestor nodes
# ancestors(tip) using hg repo
  1) 0.068976
  2) 0.068527
2017-06-24 23:30:51 +09:00
Yuya Nishihara
568f49d319 dagop: factor out pfunc from revancestors() generator
This generator will be reused for tracking descendants with depth limit.

  # ancestors(tip) using hg repo
  0) 0.065868
  1) 0.068976
2017-06-24 23:22:45 +09:00
Yuya Nishihara
1103437683 dagop: use smartset.min() in revdescendants() generator
All callers pass the result of revset.getset(), which should be a smartset.
2017-06-23 21:15:10 +09:00
Yuya Nishihara
c6472824e6 dagop: change revdescendants() to include all root revisions
Prepares for adding depth support. I want to process depth=0 in
revdescendants() to make things simpler.

only() also calls dagop.revdescendants(), but it filters out root revisions
explicitly. So this should cause no problem.

  # descendants(0) using hg repo
  0) 0.052380
  1) 0.051226

  # only(tip) using hg repo
  0) 0.001433
  1) 0.001425
2017-06-20 22:26:52 +09:00
Yuya Nishihara
0a45222557 dagop: unnest inner generator of revdescendants()
This just moves iterate() to module-level function.
2017-06-18 17:02:03 +09:00
Martin von Zweigbergk
dc5aabddf4 dagop: raise ProgrammingError if stopdepth < 0
revset.py should never send such a value.
2017-06-23 22:15:22 -07:00
Yuya Nishihara
c47ec16b6a revset: add startdepth limit to ancestors() as internal option
This is necessary to implement the set{gen} (set subscript) operator. For
example, set{-n} will be translated to ancestors(set, depth=n, startdepth=n).

https://www.mercurial-scm.org/wiki/RevsetOperatorPlan#ideas_from_mpm

The UI is undecided and I doubt if the startdepth option would be actually
useful, so the option is hidden for now. 'depth' could be extended to take
min:max range, in which case, integer depth should select a single generation.

  ancestors(set, depth=:y)  # scan up to y-th generation
  ancestors(set, depth=x:)  # skip until (x-1)-th generation
  ancestors(set, depth=x)   # select only x-th generation

Any ideas are welcomed.

  # reverse(ancestors(tip)) using hg repo
  3) 0.075951
  4) 0.076175
2017-06-18 00:40:58 +09:00
Yuya Nishihara
3a18a16767 revset: add depth limit to ancestors()
This is proposed by the issue5374, and will be a building block of set{gen}
(set subscript) operator.

https://www.mercurial-scm.org/wiki/RevsetOperatorPlan#ideas_from_mpm

  # reverse(ancestors(tip)) using hg repo
  2) 0.075408
  3) 0.075951
2017-06-18 00:22:41 +09:00
Yuya Nishihara
af9646319c dagop: compute depth in revancestors() generator
Surprisingly, this makes revset benchmark slightly faster. I don't know why,
but it appears that wrapping -inputrev by tuple is the key. So I decided to
just enable depth computation by default.

  # reverse(ancestors(tip)) using hg repo
  1) 0.081051
  2) 0.075408
2017-06-18 00:11:48 +09:00
Yuya Nishihara
538d2e426b dagop: just compare with the last value to deduplicate input of revancestors()
Since we're using a max heap, the current rev should be a duplicate only
if it equals to the previous one. We don't have to maintain the whole seen
set.

  # reverse(ancestors(tip)) using hg repo
  0) 0.086420
  1) 0.081051
2017-06-18 08:59:09 +09:00
Yuya Nishihara
a08fb00a93 dagop: bulk rename variables in revancestors() generator
- h -> pendingheap: "h" seems too short for variable of long lifetime
 - current -> currev: future patches will add current "depth" variable
 - parent -> prev or pctx: short lifetime, follows common naming rules
2017-06-18 17:22:57 +09:00
Yuya Nishihara
6ade9d6bff dagop: comment why revancestors() doesn't heapify input revs at once
I wondered why we're doing this complicated stuff without noticing the input
revs may be iterated lazily in descending order. e9a070fa585b showed why.
2017-06-18 17:16:02 +09:00
Yuya Nishihara
038a677e17 dagop: unnest inner generator of revancestors()
This just moves iterate() to module-level function.
2017-06-17 22:33:23 +09:00
Yuya Nishihara
e75a42ecc9 dagop: move blockancestors() and blockdescendants() from context
context.py seems not a good place to host these functions.

  % wc -l mercurial/context.py mercurial/dagop.py
    2306 mercurial/context.py
     424 mercurial/dagop.py
    2730 total
2017-02-19 19:37:14 +09:00
Yuya Nishihara
54b39af2a1 dagop: split module hosting DAG-related algorithms from revset
This module hosts the following functions. They are somewhat similar (e.g.
scanning revisions using heap queue or stack) and seem non-trivial in
algorithmic point of view.

 - _revancestors()
 - _revdescendants()
 - reachableroots()
 - _toposort()

I was thinking of adding revset._fileancestors() generator for better follow()
implementation, but it would be called from context.py as well. So I decided
to create new module.

Naming is hard. I couldn't come up with any better module name, so it's called
"dag operation" now. I rejected the following candidates:

 - ancestor.py - existing, revlog-level DAG algorithm
 - ancestorset.py - doesn't always return a set
 - dagalgorithm.py - hard to type
 - dagutil.py - existing
 - revancestor.py - I want to add fileancestors()

  % wc -l mercurial/dagop.py mercurial/revset.py
    339 mercurial/dagop.py
   2020 mercurial/revset.py
   2359 total
2016-10-16 18:03:24 +09:00