sapling/eden
Jun Wu e12b6c81de debugbenchmark: add a command to benchmark revsets
Summary:
Provide a way to benchmark revsets, optionally on different backends.

Some example benchmarks:

On the linux.git repo:

  $ git clone https://github.com/torvalds/linux --filter=tree:0 -n
  # might need edit .git/config, set repositoryformat to 0
  $ hg debuginitgit --git-dir=linux/.git linux-hg
  $ hg debugbenchmarkrevsets --cwd linux-hg -x v2.6.26 -Y v5.8  -m
  # x:  bce7f793daec3e65ec5c5705d2457b81fe7b5725  (v2.6.26)
  # y:  bcf876870b95592b52519ed4aafcf9d95999bc9c  (v5.8)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |   10ms |       11ms |
  | ancestors(x)     |    0.2ms |   10ms |      264ms |
  | ancestors(y)     |    0.2ms |  175ms |      3.0 s |
  | children(x)      |    0.2ms |   12ms |      955ms |
  | children(y)      |    0.2ms |  0.3ms |       54ms |
  | descendants(x)   |     75ms |  164ms |       69ms |
  | descendants(y)   |    1.6ms |  0.6ms |      0.7ms |
  | y % x            |    0.2ms |   18ms |      863ms |
  | x::y             |     75ms |  160ms |       68ms |
  | heads(_all())    |    0.1ms |  9.8ms |      843ms |
  | roots(_all())    |    0.5ms |   15ms |      1.6 s |

On the git.git repo with lots of merges but relatively short history:

  # x:  a3eb250f996bf5e12376ec88622c4ccaabf20ea8  (v0.99)
  # y:  4d4165b80d6b91a255e2847583bd4df98b5d54e1  (v2.9.5)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.7ms |  0.6ms |      0.6ms |
  | ancestors(x)     |    0.2ms |  0.4ms |      1.7ms |
  | ancestors(y)     |    0.8ms |  4.4ms |      140ms |
  | children(x)      |    0.2ms |  1.1ms |       75ms |
  | children(y)      |    0.2ms |  0.4ms |       20ms |
  | descendants(x)   |     16ms |  8.2ms |      2.9ms |
  | descendants(y)   |    4.2ms |  1.8ms |      0.9ms |
  | y % x            |    0.8ms |  1.2ms |       42ms |
  | x::y             |     13ms |  5.8ms |      1.7ms |
  | heads(_all())    |    0.2ms |  0.6ms |       46ms |
  | roots(_all())    |    0.4ms |  1.0ms |      102ms |

On large repo 1 with lots of drafts (and heads):

  # x:  94fccdcc90d52995bf47f1d9259372c290257420  (94fccdcc90 & public())
  # y:  afa87d815d528afadbe5622278e285346d5376f4  (afa87d81 & draft())

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.2ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |   40ms |       62ms |
  | ancestors(x)     |    0.2ms |  1.2 s |      6.8 s |
  | ancestors(y)     |    0.2ms |  2.7 s |       16 s |
  | children(x)      |    0.2ms |   52ms |      5.2 s |
  | children(y)      |    0.2ms |  5.4ms |      357ms |
  | descendants(x)   |    6.0ms |  616ms |      149ms |
  | descendants(y)   |    1.0ms |  0.9ms |      1.5ms |
  | y % x            |    0.2ms |   73ms |      4.2 s |
  | x::y             |    2.3ms |  557ms |      159ms |
  | heads(_all())    |    184ms |   87ms |       10 s |
  | roots(_all())    |     22ms |  110ms |       16 s |

On large repo 2 with mostly linear history:

  # x:  a5b69b059257f732c3b06e5af4ace9fd58ba87e4  (10000)
  # y:  e1e93ca550a89f7803e5a8fe5d388342c44bdd13  (e1e93ca5)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |  354ms |      541ms |
  | ancestors(x)     |    0.1ms |  1.1ms |       13ms |
  | ancestors(y)     |    0.1ms |   16 s |       59 s |
  | children(x)      |    0.1ms |  371ms |       32 s |
  | children(y)      |    0.1ms |  0.1ms |      1.3 s |
  | descendants(x)   |    0.3ms |  5.7 s |      1.3 s |
  | descendants(y)   |    0.2ms |  0.2ms |      5.5ms |
  | y % x            |    0.1ms |  583ms |       30 s |
  | x::y             |    0.3ms |  5.7 s |      1.4 s |
  | heads(_all())    |    0.1ms |  317ms |       28 s |
  | roots(_all())    |    0.1ms |  493ms |       47 s |

Notes about the segments backend:
- Optimized for (common) ancestors calculation.
- x::y, or descendants are sensitive to the number of merges.
- descendants or heads are sensitive to the number of heads.
- Not optimized for too many heads. But with narrow-heads, `descendants(x)` is re-written to `x::visible_heads()` and it could be less of an issue if heads are "narrowed".
- More efficient IdDag implementation would improve performance by a constant time factor.
  Namely, having the Index pre-checksum the byte range would make it about 2x faster.

Reviewed By: DurhamG

Differential Revision: D23106173

fbshipit-source-id: b88770e2fc9f0f626bb65e214a83da1a0b927344
2020-08-26 15:32:25 -07:00
..
fs Daily arc lint --take CLANGFORMAT 2020-08-26 13:47:00 -07:00
integration win: conditionally enable negative path caching 2020-08-14 17:35:50 -07:00
locale add a copyright header to glibc_en.po 2019-04-26 14:38:27 -07:00
mononoke mononoke: add admin command to find rejected blames 2020-08-26 11:54:08 -07:00
scm debugbenchmark: add a command to benchmark revsets 2020-08-26 15:32:25 -07:00
scripts packaging: use scheduled tasks 2020-06-10 19:29:15 -07:00
test_support Manually upgrading eden, and fixing their config 2020-08-06 12:37:04 -07:00
test-data enable treemanifest in snapshots 2019-08-28 18:46:03 -07:00
.clang-format Cut FOR_EACH_KV 2020-06-10 19:29:43 -07:00
.gitignore eden: wire up mac contbuild 2019-02-05 21:52:30 -08:00
Eden.project.toml Eden.project.toml file for Nuclide 2018-04-26 11:05:23 -07:00