2503 Commits

Author SHA1 Message Date
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Jun Wu
9aa9d022ae util: stop using time.perf_counter() for timer()
Summary:
Some code paths (ex. metalog.commit) use `util.timer()` as a way to get
seconds since epoch, and get 0 for tests. Other use-cases of `util.timer()`
are ad-hoc time measurements for displaying speed / progress. They do not need high
precision or a strong guarantee that the clock does not go backwards. Drop the
`time.perf_counter()` to meet the first use-case's expectation.
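
The difference between the two clocks can be sketched as follows. This is a minimal illustration, not the actual `util.timer()` code: the `timer` function below is a hypothetical stand-in that assumes wall-clock time is what callers expect.

```python
import time


def timer():
    """Hypothetical stand-in for util.timer(): seconds since the Unix epoch."""
    return time.time()


# time.perf_counter() has an undefined reference point, so only the
# difference between two readings is meaningful. time.time() can be
# compared against timestamps stored elsewhere.
epoch_seconds = timer()
start = time.perf_counter()
interval = time.perf_counter() - start  # valid only as a duration
```

A caller that stores `epoch_seconds` as a commit timestamp gets a sensible date, which a `perf_counter()` reading would not provide.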

Reviewed By: singhsrb

Differential Revision: D23431253

fbshipit-source-id: 8bf2d1ed32e284e17285742e1d0fd7178f181fb3
2020-08-31 13:04:54 -07:00
Jun Wu
9f33746b31 histedit: do not show revision numbers
Summary:
With the segments backend, the revision numbers will be longer than commit hashes
and are confusing.

Reviewed By: DurhamG

Differential Revision: D23408971

fbshipit-source-id: e2057fa644fc7b6be4291f879eee3235bb4e687b
2020-08-31 11:57:53 -07:00
Jun Wu
96548cade8 remotefilelog: do not assume range(len(cl)) are valid revs in _linkrev
Summary: `range(len(cl))` contains invalid revs with the segments backend.

Reviewed By: DurhamG

Differential Revision: D23411209

fbshipit-source-id: 2f83a5402bb46824cf38871926c1954507b64b56
2020-08-31 11:57:53 -07:00
Jun Wu
ff2d572717 changelog2: avoid excessive memory usage during large pulls
Summary:
Pulling from older repos (e.g. from years ago) could require GBs of commit text data.
Flush commit data if it exceeds a certain size.

This is for revlog compatibility.
In the future we will probably just make commit text lazy to avoid this kind of issue.
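
A size-triggered flush of this kind might look like the sketch below. The class name, the `flush` callback and the byte limit are all hypothetical; the real changelog2 code and its threshold are not shown in this commit message.

```python
class CommitTextBuffer:
    """Hypothetical buffer that flushes once pending data exceeds a limit."""

    def __init__(self, flush, limit_bytes):
        self._flush = flush          # callback receiving a list of texts
        self._limit = limit_bytes    # assumed threshold, in bytes
        self._pending = []
        self._size = 0

    def add(self, text):
        self._pending.append(text)
        self._size += len(text)
        if self._size > self._limit:
            self.flush()

    def flush(self):
        if self._pending:
            self._flush(self._pending)
            # Start a new list so the flushed batch is not mutated later.
            self._pending = []
            self._size = 0


written = []
buf = CommitTextBuffer(written.append, limit_bytes=10)
for text in [b"aaaa", b"bbbb", b"cccc", b"dd"]:
    buf.add(text)
buf.flush()  # flush whatever is left at the end of the pull
```

Memory use is then bounded by roughly the limit plus one commit text, rather than by the size of the whole pull.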

Reviewed By: DurhamG

Differential Revision: D23408834

fbshipit-source-id: 273384f5a05be07877bb1c9871c17b53ba436233
2020-08-31 11:57:53 -07:00
Jun Wu
01c551bb30 hgcommits: add flush_commit_data API
Summary: This would be used to avoid excessive memory usage during pull.

Reviewed By: DurhamG

Differential Revision: D23408833

fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
2020-08-31 11:57:53 -07:00
Jun Wu
fee02d78e0 changelog2: only call addcommits once in addgroup
Summary:
`addcommits` is designed to be more efficient when called with a batch of
commits. So let's buffer the commits to add, then call it only once.

This avoids some N^2 behaviors. For example, the NameDag internally will
prepare a "snapshot" of itself, which involves copying the pending Rust vecs
about the segments and id <-> hash map.

This change takes `pull` from unusably slow to usable:

Original Python Revlog backend:

```
In [1]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 5191   +466      | Apply Changegroup                                   edenscm.mercurial.bundle2 line 516
                  | - Commits = 125                                     :
                  | - Range = a1d1b3ade136:2e3fe78af189                 :
 5191   +466      | changegroup.cg1unpacker.apply                       edenscm.mercurial.changegroup line 313
 5192   +416      | Progress Bar: commits                               (progressbar)
 5192   +415      | changelog.changelog.addgroup                        edenscm.mercurial.changelog line 536
 5192   +409      | revlog.revlog.addgroup                              edenscm.mercurial.revlog line 2116
 5215   +371      | changelog.changelog._addrevision (125 times)        edenscm.mercurial.changelog line 558
```

DoubleWrite (Segments + Revlog) backend, Before:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
  2396 +154059   | Apply Changegroup                            edenscm.mercurial.bundle2 line 516
                 | - Commits = 323                              :
                 | - Range = cb0b100180ba:5fb57c74f72e          :
  2396 +154059   | changegroup.cg1unpacker.apply                edenscm.mercurial.changegroup line 313
  2397 +151433    \ Progress Bar: commits                       (progressbar)
  2397 +151433     | changelog2.changelog.addgroup              edenscm.mercurial.changelog2 line 334
```

DoubleWrite (Segments + Revlog) backend, After:

```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
 4629   +512      | Apply Changegroup                                       edenscm.mercurial.bundle2 line 516
                  | - Commits = 45                                          :
                  | - Range = cf23c6972934:1ff0c5f0e7ad                     :
 4629   +512      | changegroup.cg1unpacker.apply                           edenscm.mercurial.changegroup line 313
 4630   +494      | changelog2.changelog.addgroup                           edenscm.mercurial.changelog2 line 334
```
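
The quadratic cost described above can be illustrated with a toy model. `ToyDag` is hypothetical and only mimics the "copy all pending items on every call" behavior; it is not the NameDag implementation.

```python
class ToyDag:
    """Toy store whose add step copies every pending item (the 'snapshot')."""

    def __init__(self):
        self.pending = []
        self.copied = 0  # total number of items copied by snapshots

    def addcommits(self, commits):
        self.pending.extend(commits)
        snapshot = list(self.pending)  # cost grows with the pending size
        self.copied += len(snapshot)


commits = list(range(1000))

one_by_one = ToyDag()
for commit in commits:
    one_by_one.addcommits([commit])  # N calls, each copying up to N items

batched = ToyDag()
batched.addcommits(commits)          # one call, one copy of N items
```

In this model the per-commit calls copy 500500 items in total, while the batched call copies 1000.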

Reviewed By: DurhamG

Differential Revision: D23390435

fbshipit-source-id: dd97a5008dedd844d4134b87bfef190fa739a80b
2020-08-31 11:57:52 -07:00
Jun Wu
e5a4533622 revlog: drop addrevisioncb from addgroup
Summary:
The users of addrevisioncb are gone.
This also removes the "alwayscache" parameter of "_addrevision".

Reviewed By: DurhamG

Differential Revision: D23390437

fbshipit-source-id: 7edd9dd0b93d4cb9d4f35d088a1aef719b450ec1
2020-08-31 11:57:52 -07:00
Jun Wu
1199790982 upgrade: remove the upgrade module
Summary: It is about legacy revlog formats that are no longer relevant.

Reviewed By: DurhamG

Differential Revision: D23390436

fbshipit-source-id: 58c2c432804181bcc6517d6c988777b843fc9ba4
2020-08-31 11:57:52 -07:00
Stanislau Hlebik
2d5000293e sparse: disallow changing profiles if it includes bad file
Summary:
We have a few safeguards against creating full checkouts. However we have
sparse profiles that are not full, but that include very large directories
which normally should not be included.

This diff adds logic that checks whether a new sparse profile includes any of the "marker"
files, i.e. some files from a folder that should not be included. The operation
aborts if that is the case; however, there is always a way to work around it.
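
A minimal sketch of such a check, assuming the profile is available as a path predicate. The function names, the marker list and the override flag are hypothetical, not the actual sparse extension API.

```python
def find_marker_violations(matcher, marker_files):
    """Return the marker files that the profile's matcher would include."""
    return [path for path in marker_files if matcher(path)]


def check_profile(matcher, marker_files, allow_override=False):
    """Abort if the profile includes a marker file, unless overridden."""
    bad = find_marker_violations(matcher, marker_files)
    if bad and not allow_override:
        raise ValueError("profile includes marker files: %s" % ", ".join(bad))
    return bad


# Hypothetical profile that accidentally pulls in a very large directory.
def profile(path):
    return path.startswith(("tools/", "huge/"))


markers = ["huge/.marker", "other/.marker"]
```

Here `check_profile(profile, markers)` raises, while passing `allow_override=True` models the workaround mentioned above.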

Reviewed By: DurhamG

Differential Revision: D23414200

fbshipit-source-id: 626f392319eb1be8b35f39cadafb61f3c1dfefe3
2020-08-31 11:38:16 -07:00
Stanislau Hlebik
7bbf044a49 sparse: fix --sparse to work on eden
Summary:
"hg diff" has --sparse option which diffs only files inside a sparse checkout.
The problem is that it doesn't work on eden checkouts because eden repo doesn't
have sparsematch() function.

This diff makes it so that if sparsematch() function doesn't exist then
--sparse option is just ignored.

The motivation for this change is
https://fb.workplace.com/groups/corehg/?post_id=687768245151742. There are some
diff calls that are triggered by arc lint that race with "hg update" and might download
loads of data on people's laptops. This diff doesn't fix the race, but it:
1) Makes sure we don't download too much data that is not in sparse profiles.
2) arc lint doesn't care about files outside of sparse profiles anyway, so
running --sparse makes sense.
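
The "ignore the option when the function is missing" behavior can be sketched with `getattr`. The repo classes below are hypothetical stand-ins; only the `sparsematch` name comes from the summary above.

```python
def sparse_matcher_or_none(repo):
    """Return repo.sparsematch() if the repo provides it, else None."""
    sparsematch = getattr(repo, "sparsematch", None)
    if sparsematch is None:
        # For example an eden checkout: silently ignore --sparse.
        return None
    return sparsematch()


class SparseRepo:
    def sparsematch(self):
        return lambda path: path.startswith("included/")


class EdenRepo:
    pass  # no sparsematch() here


sparse = sparse_matcher_or_none(SparseRepo())
eden = sparse_matcher_or_none(EdenRepo())
```

A caller would then restrict the diff to matching files only when the result is not `None`.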

Reviewed By: DurhamG

Differential Revision: D23396918

fbshipit-source-id: 2a386fdbeab85187e2c2acab69cb86b74124d46f
2020-08-28 23:47:40 -07:00
Jun Wu
fbc9b865b6 changegroup: do not calculate how many files received commits include
Summary:
This is practically just 0 in our production setup during `pull`s. In the
future, when the commit data becomes lazy, it will no longer be possible to read the
files locally. So let's just not scan the commits.

Reviewed By: DurhamG

Differential Revision: D23390438

fbshipit-source-id: 4c54c4aac5fd840205296ab86955ec1b8ab76607
2020-08-28 13:40:18 -07:00
root@sandcastle5869.frc3.facebook.com
5f749ee470 suppress errors in eden - batch 1
Differential Revision: D23401295

fbshipit-source-id: 01fe0ff888d074c503a445c6d97f17bf0ec2b79c
2020-08-28 12:46:36 -07:00
Durham Goode
08c938e859 dirstate: block addition of paths containing "." and ".."
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.

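A path check of this kind could look like the following sketch. It assumes "/"-separated repo-relative paths; `check_path` is a hypothetical name, not the actual dirstate code.

```python
def check_path(path):
    """Reject repo-relative paths with a '.' or '..' component."""
    for component in path.split("/"):
        if component in (".", ".."):
            raise ValueError("invalid path component %r in %r" % (component, path))
    return path


accepted = [check_path(p) for p in ["a/b.txt", "a/.hidden", "a/..b"]]
```

Only whole components are rejected, so names such as `.hidden` or `..b` still pass.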
Reviewed By: quark-zju

Differential Revision: D23375469

fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
2020-08-28 09:42:25 -07:00
Durham Goode
2f5130c882 py3: fix extdiff
Summary:
extdiff uses shutil.rmtree which calls os.rmdir with new python 3
options. Since we patch os.rmdir, we need to support those options.
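
As an illustration, a replacement for `os.rmdir` has to accept the Python 3 keyword-only `dir_fd` argument. The wrapper below is a hedged sketch (`patched_rmdir` is a hypothetical name, and the real patch may do more work), and it is not installed over `os.rmdir` here.

```python
import os
import tempfile

_original_rmdir = os.rmdir


def patched_rmdir(path, *, dir_fd=None):
    """Wrapper with the same signature as Python 3's os.rmdir."""
    # Any extra behavior of the patch would go here.
    if dir_fd is None:
        return _original_rmdir(path)
    # Forward dir_fd only when given; some platforms do not support it.
    return _original_rmdir(path, dir_fd=dir_fd)


parent = tempfile.mkdtemp()
child = os.path.join(parent, "child")
os.mkdir(child)
patched_rmdir(child)
child_removed = not os.path.exists(child)
os.rmdir(parent)
```

A wrapper that accepted only `path` would fail with a `TypeError` as soon as a caller passed `dir_fd=...`.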

Reviewed By: quark-zju

Differential Revision: D23350968

fbshipit-source-id: 081d179dcd67b51ffdeb6b85899adf4e574a8d0f
2020-08-27 19:15:22 -07:00
Jun Wu
f271d882e6 hgcommands: make commands! macro define modules
Summary: Similar to D18528858 so module names do not need to be spelled twice.

Reviewed By: markbt

Differential Revision: D23091380

fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
2020-08-27 19:02:27 -07:00
Arun Kulshreshtha
cb3f95d06e configparser: make code compile without "fb" feature
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.

Reviewed By: quark-zju

Differential Revision: D23386786

fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
2020-08-27 18:28:46 -07:00
Jun Wu
d586a40ada hgcommands: add debugfsync
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.
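
The command itself lives in Rust, but the idea can be sketched in Python. `fsync_paths` and the path list are hypothetical, and opening read-only before `os.fsync` is assumed to work on the platform (it does on Linux).

```python
import os
import tempfile


def fsync_paths(paths):
    """fsync each existing path and return the ones that were synced."""
    synced = []
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except FileNotFoundError:
            continue  # nothing to sync for files that do not exist
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
        synced.append(path)
    return synced


with tempfile.TemporaryDirectory() as store:
    existing = os.path.join(store, "metalog")
    with open(existing, "wb") as f:
        f.write(b"data")
    missing = os.path.join(store, "missing")
    result = fsync_paths([existing, missing])
    only_existing_synced = result == [existing]
```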

The fsync logic is put in a separate crate to avoid slow compiles.

Reviewed By: DurhamG

Differential Revision: D23124169

fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
2020-08-27 18:26:03 -07:00
Xavier Deguillard
eb57ebb4d8 eden: decrease verbosity of "fetching tree" message
Summary:
A warning means that every tree fetched will be printed in the edenfs log,
which is way too much. Let's decrease this to a debug message.

Reviewed By: genevievehelsel

Differential Revision: D23385778

fbshipit-source-id: d77f1cac3efb945d4b95750822f2f12f48c75ffe
2020-08-27 18:16:51 -07:00
Jun Wu
c2d36d03c4 changegroup: avoid using rev numbers
Summary: `len(repo)` can no longer predict the next rev number. Use nodes instead.

Reviewed By: DurhamG

Differential Revision: D23307791

fbshipit-source-id: cc20e53f039eee2a714748352e8e98aab253095a
2020-08-27 18:14:29 -07:00
Jun Wu
d8e775f423 tracing-collector: limit maximum count of spans
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.

Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or asks for
a different Span Id each time, it will not be limited.

In debugshell,

  td = %trace repo.revs('smartlog()')
  len(td.serialize())

dropped from 6MB to 0.87MB.

It's also possible to reason about:

  td = %trace len(repo.revs('ancestors(.)'))

in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).
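
The per-span limit can be sketched as a counter keyed by span id. `SpanCollector` is a hypothetical Python model, not the tracing-collector crate; only the default of 1k comes from the summary above.

```python
from collections import Counter


class SpanCollector:
    """Toy collector that records at most max_per_span entries per span id."""

    def __init__(self, max_per_span=1000):
        self.max_per_span = max_per_span
        self.counts = Counter()
        self.recorded = []

    def enter(self, span_id):
        self.counts[span_id] += 1
        if self.counts[span_id] > self.max_per_span:
            return False  # over the limit: drop instead of recording
        self.recorded.append(span_id)
        return True


collector = SpanCollector()
for _ in range(100_000):
    collector.enter("phasecache.loadphaserevs")  # same static id each time
for i in range(5):
    collector.enter("dynamic-%d" % i)            # a fresh id per call
```

The repeated static id is capped at 1000 entries, while spans that use a different id on every call are all recorded, matching the caveat above.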

Reviewed By: DurhamG

Differential Revision: D23307793

fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
2020-08-27 18:14:29 -07:00
Jun Wu
9f4dac104f dag: truncate output in <SpanSet as Debug>::fmt
Summary: Set a default limit so the output won't be too long.

Reviewed By: DurhamG

Differential Revision: D23307792

fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
2020-08-27 18:14:29 -07:00
Jun Wu
54cd73b41b profiling: do not profile debugshell command
Summary:
The debugshell command can be long running and contains uninteresting stuff.
Do not profile it.

In practice this hides the background statprof thread when using `%trace`.

Reviewed By: DurhamG

Differential Revision: D23278597

fbshipit-source-id: bad97de22e1be2be8b866bee705ea3a6755aa54b
2020-08-27 18:14:29 -07:00
Jun Wu
d92c80ebcc dispatch: enter ipdb for "NameError 'ipdb' is not defined"
Summary:
This allows entering ipdb for code like: `ipdb` or `ipdb()`. It can be handy to
debug something.

Reviewed By: DurhamG

Differential Revision: D23278599

fbshipit-source-id: 4355dd1944617aeb795450935789f01f66f094eb
2020-08-27 18:14:28 -07:00
Jun Wu
28fa0e1cfe debugshell: add %trace and %hg magics
Summary: This makes it possible to get tracing results, or run hg commands directly.

Reviewed By: DurhamG

Differential Revision: D23278601

fbshipit-source-id: e7dc92080d2881cb4155a481df5ca93f324828fc
2020-08-27 18:14:28 -07:00
Jun Wu
ed78542610 dispatch: add --trace flag
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.

It can be used with debugshell to make `%trace` more useful.

Reviewed By: sfilipco

Differential Revision: D23278600

fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
2020-08-27 18:14:28 -07:00
Jun Wu
3bbdfd3743 revset: successors(x) should only show visible commits
Summary:
I found this when I aborted a rebase of Dxxx, tried rebasing again, and it
complained about "nothing to rebase". It was caused by Dxxx resolving to
a hidden commit.

Reviewed By: sfilipco

Differential Revision: D23307794

fbshipit-source-id: f7a956b5300240089b6a4648f28cf4a152ee2433
2020-08-27 18:14:28 -07:00
Arun Kulshreshtha
0b9ca4e83b hgcommands: remove unused imports in dynamicconfig module
Summary: Remove unused imports.

Reviewed By: quark-zju

Differential Revision: D23356940

fbshipit-source-id: 31b81eac11946aa8b24ec23c98ddb14716fbea3a
2020-08-27 14:06:52 -07:00
Genevieve Helsel
3eb96cfb62 fix dictionary changed size during iteration in patch
Summary:
We shouldn't delete from a dictionary while iterating over it. Instead, we should iterate over a copy and then delete from the original.

`.items()` returns a view of the dict, while wrapping it in `list` makes a shallow copy (a snapshot list of the key/value pairs).
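
A small self-contained illustration of the failure and the fix (the dictionary contents are made up for the example):

```python
def delete_while_iterating(d):
    """Buggy: mutates the dict while iterating over its live view."""
    for key, value in d.items():
        if value is None:
            del d[key]


def delete_using_snapshot(d):
    """Fixed: iterate over a list snapshot, delete from the original."""
    for key, value in list(d.items()):
        if value is None:
            del d[key]


files = {"a.txt": None, "b.txt": "patch", "c.txt": None}
delete_using_snapshot(files)

try:
    delete_while_iterating({"a.txt": None, "b.txt": "patch"})
    error = None
except RuntimeError as exc:
    error = str(exc)
```

The snapshot is a list of `(key, value)` tuples that share the same value objects with the dict, which is all the loop needs.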

Reviewed By: DurhamG

Differential Revision: D23283668

fbshipit-source-id: a168eef1ed2a1ce02fe71b3f6e3aed090965d2a4
2020-08-27 13:14:36 -07:00
Durham Goode
fe56f44ca0 treemanifest: prevent fetching nullid
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.

Reviewed By: sfilipco

Differential Revision: D23332359

fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
2020-08-27 09:59:40 -07:00
Durham Goode
4d4e425624 configs: add fbitwhoami tiers to dynamicconfig inputs
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.

Reviewed By: quark-zju

Differential Revision: D23354056

fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e
2020-08-27 09:24:28 -07:00
Durham Goode
c190d283ec py3: don't use universal newlines for patch import
Summary:
The Python 3 email library internally stores the message as text, even
though our input and requested output is bytes. Let's make our own wrapper
around the parser to use ascii surrogateescape encoding so we can get the
actual bytes out later and not get universal newlines.

Based off the upstream 7b12a2d2eedc995405187cdf9a35736a14d60706,
which is basically a copy of the BytesParser implementation (https://github.com/python/cpython/blob/3.8/Lib/email/parser.py) with
newline=chr(10) added.
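
The approach can be sketched with the standard library alone. This is a simplified illustration modeled on `email.parser.BytesParser`, not the code from this diff; `parse_bytes_message` is a hypothetical name.

```python
import io
from email.parser import Parser


def parse_bytes_message(fp):
    """Parse a binary file object without universal-newline translation."""
    text = io.TextIOWrapper(
        fp,
        encoding="ascii",
        errors="surrogateescape",  # keep non-ASCII bytes recoverable
        newline=chr(10),           # do not translate CRLF to LF
    )
    try:
        return Parser().parse(text)
    finally:
        text.detach()  # leave the underlying binary stream open


raw = b"Subject: demo\n\ncaf\xe9 one\r\nline two\r\n"
msg = parse_bytes_message(io.BytesIO(raw))
body = msg.get_payload(decode=True)  # bytes, with the original CRLF intact
```

With the default `newline=None` the wrapper would turn `\r\n` into `\n` while reading, so patches containing CRLF lines would be silently altered.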

Reviewed By: quark-zju

Differential Revision: D23363965

fbshipit-source-id: 880f0642cce96edfdd22da5908c0b573887bed12
2020-08-27 09:21:04 -07:00
Liubov Dmitrieva
06c1d37383 move try up in the rejoin command
Summary:
`hg cloud rejoin` command is used in fbclone and it is supposed to print a
message on RegistrationError but this has been broken recently.

Reviewed By: markbt

Differential Revision: D23342773

fbshipit-source-id: 4f3318848953656dea65a2b5d4d832694f6b353c
2020-08-27 06:53:28 -07:00
Liubov Dmitrieva
bd63a78f96 add more information to hg cloud leave command
Summary:
There are users who prefer to run `hg cloud leave` if they notice they are
connected to commit cloud sync.

Providing more information and adding a prompt might help them change their mind.

For some users who left, a new fbclone will connect them back. So on the next leave they can learn more about Commit Cloud Workspaces.

Reviewed By: markbt

Differential Revision: D23346091

fbshipit-source-id: 72f170f7133cd64b772ec75ae29a85dc8809e351
2020-08-26 22:43:20 -07:00
Durham Goode
8f9c0899cc update: fix performance of updating to null commit
Summary:
When updating to the null commit, the logic that computes the update
distance was broken. The null commit is pre-resolved to -1, which when passed to
a revset raw gets resolved as the tip commit. In large repositories this can
take a long time and use a lot of memory, since it's computing the difference
between tip and null.

Let's fix it to not pass the raw rev number, and also to handle the case of a 0
distance update.
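
The class of bug can be illustrated with plain Python indexing, where -1 also means "last element". This is only an analogy under that assumption; `resolve_rev` is hypothetical and is not how revsets resolve revisions.

```python
NULLREV = -1


def resolve_rev(revs, rev):
    """Hypothetical lookup that treats the null revision explicitly."""
    if rev == NULLREV:
        return None  # the null commit has no entry in the list
    return revs[rev]


revs = ["c0", "c1", "c2"]       # c2 plays the role of tip
naive = revs[NULLREV]           # raw -1 silently resolves to tip
fixed = resolve_rev(revs, NULLREV)
```

Passing the raw number gives the tip, so any distance computed from it spans the whole history instead of being empty.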

Reviewed By: quark-zju

Differential Revision: D23358402

fbshipit-source-id: 3b0a1fe1bbcb07effba4d0ab2c092e66bdc02e67
2020-08-26 22:14:59 -07:00
Jun Wu
12d23ba64d revisionstore: fix GitHub build (#46)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/46

See https://github.com/facebookexperimental/eden/runs/1034006668:

   error: unused import: `env::set_var`
      --> src/lfs.rs:1539:15
       |
  1539 |     use std::{env::set_var, str::FromStr};
       |               ^^^^^^^^^^^^
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_imports)]` implied by `#[deny(warnings)]`

  error: unnecessary braces around method argument
      --> src/lfs.rs:2439:36
       |
  2439 |         remote.batch_upload(&objs, { move |sha256| local_lfs.blobs.get(&sha256) })?;
       |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
       |
  note: the lint level is defined here
      --> src/lib.rs:125:9
       |
  125  | #![deny(warnings)]
       |         ^^^^^^^^
       = note: `#[deny(unused_braces)]` implied by `#[deny(warnings)]`

  error: aborting due to 2 previous errors

  error: could not compile `revisionstore`.

I dropped `#![deny(warnings)]` as I don't think warnings like the above ones
should break the build. (denying specific warnings that we care about explicitly
might be a better approach)

Reviewed By: singhsrb

Differential Revision: D23362178

fbshipit-source-id: 02258f57727edfac9818cd29dda5e451c7ca80a7
2020-08-26 20:40:25 -07:00
Arun Kulshreshtha
30e2cf4413 cargo_from_buck: reenable autocargo for edenapi
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.

Reviewed By: dtolnay

Differential Revision: D23335122

fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
2020-08-26 19:16:48 -07:00
Jun Wu
d60e80796a py3: fix absorb -i
Summary: The command does not crash but `-` lines are ignored.

Reviewed By: DurhamG

Differential Revision: D23357655

fbshipit-source-id: f48568bc193f947503bc19f3e192b33346c317e1
2020-08-26 17:21:01 -07:00
Jun Wu
039419d281 configparser: fix non-fb dependencies (#45)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/45

Fix referring to 'version' without proper codegen by making 'version' compile
without codegen. This fixes configparser test when version/src/lib.rs was not
generated.

Make deps that are unneeded without the 'fb' feature optional.

This would hopefully fix the "EdenSCM Rust Libraries" GitHub workflow.

Reviewed By: DurhamG

Differential Revision: D23269864

fbshipit-source-id: f9e691fe0a75159c4530177b8a96dad47d2494a9
2020-08-26 16:31:00 -07:00
Jun Wu
0705bd3b8d pydag: use dag::delegate to simplify code
Summary: This makes the code simpler.

Reviewed By: sfilipco

Differential Revision: D23269858

fbshipit-source-id: bb9ac0bd1696f7429ca1856e6c63e04fabc2757a
2020-08-26 15:32:26 -07:00
Jun Wu
55116e223f hgcommits: use dag::delegate to simplify code
Summary: This makes the code simpler.

Reviewed By: sfilipco

Differential Revision: D23269866

fbshipit-source-id: 30c9e9d218378c0d6df8b822b2a81df2b38f5b01
2020-08-26 15:32:26 -07:00
Jun Wu
85b3cea8ee dag: define delegate macro for other main traits
Summary: Will be used to simplify code.

Reviewed By: sfilipco

Differential Revision: D23269859

fbshipit-source-id: bed0c4dca075ff60900025642af1d84bdd03452d
2020-08-26 15:32:26 -07:00
Jun Wu
6b3096c7a4 dag: avoid other 'impl<T> Trait for T' usecases
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for IdConvert and PrefixLookup.

Reviewed By: sfilipco

Differential Revision: D23269861

fbshipit-source-id: a837f3984ff4e1bd5a3983dd1642b9f064f51a36
2020-08-26 15:32:25 -07:00
Jun Wu
4a2ee4c522 dag: avoid impl<T> DagAlgorithm for T
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for DagAlgorithm.

Reviewed By: sfilipco

Differential Revision: D23269860

fbshipit-source-id: 031e75e9bf1f1eec2b9e8f36220ef8b817a143a5
2020-08-26 15:32:25 -07:00
Jun Wu
846768fb53 dag: drop LowLevelAccess
Summary: LowLevelAccess is a subset of NameDagStorage. Use the latter instead.

Reviewed By: sfilipco

Differential Revision: D23269865

fbshipit-source-id: 81ebb1e986d8b02c968a9a237ad9a97d4afd54bf
2020-08-26 15:32:25 -07:00
Jun Wu
f4021486ab dag: move beautify to default_impl
Summary: This makes `ops.rs` look simpler.

Reviewed By: sfilipco

Differential Revision: D23269863

fbshipit-source-id: ddb55ab8eb3b2d3e7c4b2ccbc2252395d62317a1
2020-08-26 15:32:25 -07:00
Jun Wu
e12b6c81de debugbenchmark: add a command to benchmark revsets
Summary:
Provide a way to benchmark revsets, optionally on different backends.

Some example benchmarks:

On the linux.git repo:

  $ git clone https://github.com/torvalds/linux --filter=tree:0 -n
  # might need to edit .git/config and set repositoryformatversion to 0
  $ hg debuginitgit --git-dir=linux/.git linux-hg
  $ hg debugbenchmarkrevsets --cwd linux-hg -x v2.6.26 -Y v5.8  -m
  # x:  bce7f793daec3e65ec5c5705d2457b81fe7b5725  (v2.6.26)
  # y:  bcf876870b95592b52519ed4aafcf9d95999bc9c  (v5.8)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |   10ms |       11ms |
  | ancestors(x)     |    0.2ms |   10ms |      264ms |
  | ancestors(y)     |    0.2ms |  175ms |      3.0 s |
  | children(x)      |    0.2ms |   12ms |      955ms |
  | children(y)      |    0.2ms |  0.3ms |       54ms |
  | descendants(x)   |     75ms |  164ms |       69ms |
  | descendants(y)   |    1.6ms |  0.6ms |      0.7ms |
  | y % x            |    0.2ms |   18ms |      863ms |
  | x::y             |     75ms |  160ms |       68ms |
  | heads(_all())    |    0.1ms |  9.8ms |      843ms |
  | roots(_all())    |    0.5ms |   15ms |      1.6 s |

On the git.git repo with lots of merges but relatively short history:

  # x:  a3eb250f996bf5e12376ec88622c4ccaabf20ea8  (v0.99)
  # y:  4d4165b80d6b91a255e2847583bd4df98b5d54e1  (v2.9.5)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.7ms |  0.6ms |      0.6ms |
  | ancestors(x)     |    0.2ms |  0.4ms |      1.7ms |
  | ancestors(y)     |    0.8ms |  4.4ms |      140ms |
  | children(x)      |    0.2ms |  1.1ms |       75ms |
  | children(y)      |    0.2ms |  0.4ms |       20ms |
  | descendants(x)   |     16ms |  8.2ms |      2.9ms |
  | descendants(y)   |    4.2ms |  1.8ms |      0.9ms |
  | y % x            |    0.8ms |  1.2ms |       42ms |
  | x::y             |     13ms |  5.8ms |      1.7ms |
  | heads(_all())    |    0.2ms |  0.6ms |       46ms |
  | roots(_all())    |    0.4ms |  1.0ms |      102ms |

On large repo 1 with lots of drafts (and heads):

  # x:  94fccdcc90d52995bf47f1d9259372c290257420  (94fccdcc90 & public())
  # y:  afa87d815d528afadbe5622278e285346d5376f4  (afa87d81 & draft())

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.2ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |   40ms |       62ms |
  | ancestors(x)     |    0.2ms |  1.2 s |      6.8 s |
  | ancestors(y)     |    0.2ms |  2.7 s |       16 s |
  | children(x)      |    0.2ms |   52ms |      5.2 s |
  | children(y)      |    0.2ms |  5.4ms |      357ms |
  | descendants(x)   |    6.0ms |  616ms |      149ms |
  | descendants(y)   |    1.0ms |  0.9ms |      1.5ms |
  | y % x            |    0.2ms |   73ms |      4.2 s |
  | x::y             |    2.3ms |  557ms |      159ms |
  | heads(_all())    |    184ms |   87ms |       10 s |
  | roots(_all())    |     22ms |  110ms |       16 s |

On large repo 2 with mostly linear history:

  # x:  a5b69b059257f732c3b06e5af4ace9fd58ba87e4  (10000)
  # y:  e1e93ca550a89f7803e5a8fe5d388342c44bdd13  (e1e93ca5)

  | revset \ backend | segments | revlog | revlog-cpy |
  |------------------|----------|--------|------------|
  | ancestor(x, x)   |    0.1ms |  0.1ms |      0.1ms |
  | ancestor(x, y)   |    0.1ms |  354ms |      541ms |
  | ancestors(x)     |    0.1ms |  1.1ms |       13ms |
  | ancestors(y)     |    0.1ms |   16 s |       59 s |
  | children(x)      |    0.1ms |  371ms |       32 s |
  | children(y)      |    0.1ms |  0.1ms |      1.3 s |
  | descendants(x)   |    0.3ms |  5.7 s |      1.3 s |
  | descendants(y)   |    0.2ms |  0.2ms |      5.5ms |
  | y % x            |    0.1ms |  583ms |       30 s |
  | x::y             |    0.3ms |  5.7 s |      1.4 s |
  | heads(_all())    |    0.1ms |  317ms |       28 s |
  | roots(_all())    |    0.1ms |  493ms |       47 s |

Notes about the segments backend:
- Optimized for (common) ancestors calculation.
- x::y, or descendants are sensitive to the number of merges.
- descendants or heads are sensitive to the number of heads.
- Not optimized for too many heads. But with narrow-heads, `descendants(x)` is re-written to `x::visible_heads()` and it could be less of an issue if heads are "narrowed".
- A more efficient IdDag implementation would improve performance by a constant factor.
  Namely, having the Index pre-checksum the byte range would make it about 2x faster.

Reviewed By: DurhamG

Differential Revision: D23106173

fbshipit-source-id: b88770e2fc9f0f626bb65e214a83da1a0b927344
2020-08-26 15:32:25 -07:00
Jun Wu
bb461d2240 dag: improve range calculation in repos with many heads
Summary:
If there are too many heads, the current `descendants` algorithm would visit
all "old" heads. For example, with this graph:

      head9999  (N9999)
     /
    Z (master)
    :
    : (many heads)
    :/
    : head2 (N2)
    :/
    C head1 (N1)
    |/
    B head0 (N0)
    |/
    A

`A::head9999` or `Z::head9999` will visit N0, N1, ..., N9999, because
`descendands_up_to` is provided with `max_id = N9999` and Z as a vertex in the
master group, is before N0 in non-master.  The current algorithm also means
`descendands_up_to` gets linearly slower as the user uses the repo more, which
is quite undesirable.

This diff changes `descendands_up_to` to take an `ancestors` set, which is
`::head9999` in this case, and iterate non-master flat segments in it. So it
will skip N0 to N9998 directly by finding the N9999 flat segment and only use
it. The number of heads will have a smaller impact on performance.

Another source of slowness is `draft::draft_heads`: if there are too many
`draft_heads`, the internal calculation of `::draft_heads` can be slow. Optimize
it by limiting `draft_heads` to `draft:`. In practice this affects the `y::` revset,
as `y::` is translated to `y::visible_heads` and `visible_heads` can be large.

`cargo bench --bench dag_ops -- '::-master'` shows a significant difference:

Before:

  range (master::draft)                              18.112 s
  range (recent_draft::drafts)                        2.594 s

After:

  range (master::draft)                              72.542 ms
  range (recent_draft::drafts)                       14.932 ms

In my fbsource checkout there were 20k+ heads. The improvement of
`master::recent_draft` (`x::y`) is pretty visible, and `y::` is also improved:

    % lhg debugbenchmarkrevsets -m -x 'p1(min(7e8c86ae % master))' -Y 'draft() & 7e8c86ae' -e 'x::y' -e 'y::' --no-default
    # x:  168f5228e570fb6b2ff7f851bd82413102748d84  (p1(min(7e8c86ae % master)))
    # y:  7e8c86aec68ebc6e0b8254afcb381315991fd21c  (draft() & 7e8c86ae)

    # before
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |     17ms |  0.1ms |      0.5ms |
    | y::              |    3.3ms |  0.7ms |      1.3ms |

    # after
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |    0.2ms |  0.1ms |      0.6ms |
    | y::              |    1.0ms |  0.7ms |      1.3ms |

Reviewed By: sfilipco

Differential Revision: D23214387

fbshipit-source-id: 4d11db84cd28f4e04e8b991cbc650c9d5781fd27
2020-08-26 15:32:25 -07:00
Jun Wu
a3cbda76bb dag: add a benchmark for x::y with lots of non-master heads
Summary:
Lots of non-master heads is not an exercised graph in the benchmarks.
Add it as it practically happens.  This will be used by the next change.

Reviewed By: sfilipco

Differential Revision: D23259879

fbshipit-source-id: 7fe290d14403e42e6d135bde56e2d5c8519ae530
2020-08-26 15:32:24 -07:00
Jun Wu
89570e223a dag: use non-master group in fuzz test
Summary:
Currently the fuzz test only uses the master group. Let it exercise the non-master
group too.

Reviewed By: DurhamG

Differential Revision: D23214388

fbshipit-source-id: 7108a1055fbdda2b012f93c5948fb83ef3b9a96f
2020-08-26 15:32:24 -07:00