Commit Graph

6842 Commits

Author SHA1 Message Date
Mateusz Kwapich
37192dc0e1 add a way to exclude a commit and its ancestor from commit history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999792

fbshipit-source-id: 56e5ec68469cb9c154a5c3045ded969253270b94
2020-08-25 03:48:49 -07:00
Mateusz Kwapich
42cc5431a4 add a way to exclude a commit and its ancestor from commit history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999793

fbshipit-source-id: 94e53adf5458e0bc1ebceffb3b548b7fc021218a
2020-08-25 03:48:49 -07:00
Durham Goode
c42a494668 dynamicconfig: don't block read operations on dynamicconfig write permission errors
Summary:
Dynamicconfig was throwing errors if hgrc.dynamic wasn't writable.
Let's eat those errors for normal read operations. We still treat it as an error
for straight hg debugdynamicconfig invocations.
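
The policy described here (ignore write-permission failures on ordinary reads, but still surface them for an explicit regeneration command) could be sketched roughly as follows. This is only an illustration: `refresh_dynamic_config`, `write_config`, and the `strict` flag are hypothetical names, not the actual dynamicconfig API.

```python
def refresh_dynamic_config(write_config, strict=False):
    """Try to rewrite the generated config; tolerate unwritable files.

    `write_config` is a hypothetical callable that rewrites hgrc.dynamic.
    `strict` stands in for an explicit `hg debugdynamicconfig` invocation.
    """
    try:
        write_config()
        return True
    except PermissionError:
        if strict:
            # Explicit invocations still report the problem.
            raise
        # Normal read operations carry on with whatever config exists.
        return False
```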

Reviewed By: quark-zju

Differential Revision: D23301100

fbshipit-source-id: ed0bd1282d2c7ee747f0909c238a5fa07b7bc9bc
2020-08-24 21:40:00 -07:00
Durham Goode
9b7b351ed9 configs: introduce test demonstrating permission denied error
Summary:
We've seen user reports of this error. Let's add a test to demonstrate
it. The next diff will fix it.

Reviewed By: sfilipco

Differential Revision: D23309612

fbshipit-source-id: 6fb9e4e65d3351fa29812fc75095d054465cfe13
2020-08-24 21:40:00 -07:00
Zeyi (Rice) Fan
50378e741a run edenfsctl redirect fixup after mount is done
Summary: Use `Subprocess` in `win/utils` to call `edenfsctl redirection fixup` after mount is done.

Reviewed By: wez

Differential Revision: D22958764

fbshipit-source-id: a485994a3816169299e8514a5c355f3d37edad99
2020-08-24 21:38:12 -07:00
Zeyi (Rice) Fan
4d21d2dd8a clean up fs/win/utils/Process.h
Summary: Some cleanup. `Process` will crash the entire process if `Pipe` is ever `nullptr`, so let's not give it a default argument of `nullptr`.

Reviewed By: xavierd

Differential Revision: D22958765

fbshipit-source-id: 0c35e805f24a0d572bbc08efc97e59a37d0cbf88
2020-08-24 21:38:12 -07:00
Zeyi (Rice) Fan
17d2c95a18 enable mkscratch on Windows
Summary: With D22956659, we can now use mkscratch on Windows with EdenFS.

Reviewed By: xavierd

Differential Revision: D22956983

fbshipit-source-id: 995073cbc89d5cb23dbb9c1a58926f8c51f0a896
2020-08-24 21:38:12 -07:00
Zeyi (Rice) Fan
7f0f310af3 handle extended-length on Windows
Summary:
On Windows, Rust's `std::fs::canonicalize` [1] generates an extended-length path that includes a `\\?\` prefix [2]. This has subsequently caused `encode` to generate a path that contains a question mark, which is an invalid path on Windows.

This diff teaches `encode` to handle extended-length paths on Windows. It essentially converts the path back so it no longer contains the prefix.

[1] http://doc.rust-lang.org/1.45.2/std/fs/fn.canonicalize.html
[2] https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#maximum-path-length-limitation
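
As a rough illustration of converting an extended-length path back to its ordinary form, here is a minimal sketch. The helper name `strip_extended_length_prefix` is hypothetical, and the handling of the `\\?\UNC\` form is an assumption; the commit message does not say whether the real `encode` covers it.

```python
# Prefixes are spelled with escaped backslashes: "\\\\?\\" is the
# four-character text backslash, backslash, question mark, backslash.
EXTENDED_PREFIX = "\\\\?\\"
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"


def strip_extended_length_prefix(path: str) -> str:
    """Return `path` without the Windows extended-length prefix, if present."""
    if path.startswith(EXTENDED_UNC_PREFIX):
        # An extended UNC path maps back to a regular UNC path
        # (assumption: the real code may or may not handle this case).
        return "\\\\" + path[len(EXTENDED_UNC_PREFIX):]
    if path.startswith(EXTENDED_PREFIX):
        return path[len(EXTENDED_PREFIX):]
    # Already an ordinary path: leave it untouched.
    return path
```

After stripping, the result no longer contains the question mark that made the encoded path invalid.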

Reviewed By: wez

Differential Revision: D22956659

fbshipit-source-id: 54691e204d7cb481bdb40f62c6520c0f70c3f648
2020-08-24 21:38:12 -07:00
Durham Goode
4a8a5290e8 curses: eat curses error for weird inputs
Summary:
In Python 3, curses sometimes throws an error when weird keys are
pressed. I'm not certain exactly what key causes the problem, but let's just
prevent all such errors from crashing the process.
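
A minimal sketch of the "eat the error" approach, assuming the failure surfaces as `curses.error` from a key-reading call. The wrapper `read_key_safely` and its `read_key` parameter are hypothetical; in a real UI `read_key` might be something like a window's `getch` or `get_wch` method.

```python
import curses


def read_key_safely(read_key):
    """Call `read_key` and return its result, or None if curses raised.

    `read_key` is a hypothetical zero-argument callable that reads one key.
    """
    try:
        return read_key()
    except curses.error:
        # Unknown or unsupported key: ignore it instead of crashing.
        return None
```

A caller would treat `None` as "no usable key" and simply continue its input loop.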

Reviewed By: quark-zju

Differential Revision: D23310301

fbshipit-source-id: a9684ce6f690d0753ff9956ef9f13c330eb0a77b
2020-08-24 20:16:41 -07:00
Xavier Deguillard
111d960ccb win: move some code between the dispatcher and the channel
Summary:
By making the EdenDispatcher less Windows-dependent, we can more easily move it
into a non-Windows specific location later.

Reviewed By: chadaustin

Differential Revision: D23298028

fbshipit-source-id: 21726677808a9b8ce3d3e211dd65d9e47caad569
2020-08-24 16:49:12 -07:00
Mateusz Kwapich
1b00df7887 add a way to exclude a commit and its ancestor from path history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999141

fbshipit-source-id: e2e4177e56db85f65930b67a9e927a5c93b652df
2020-08-24 13:03:05 -07:00
Mateusz Kwapich
a3f8760fbc add a way to exclude a commit and its ancestor from path history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999142

fbshipit-source-id: 04cea361ea6270626e7ff77255e3dc75875ece97
2020-08-24 13:03:04 -07:00
Mateusz Kwapich
e7daab0dfb change the path history options to struct
Summary:
Rust doesn't have named arguments, and with positional arguments it's hard to
keep track of all of them when there are many. I'm planning to add one more, so
let's switch to a struct.

Reviewed By: krallin

Differential Revision: D22999143

fbshipit-source-id: 54dade05f860b41d18bebb52317586015a893919
2020-08-24 13:03:04 -07:00
Jun Wu
7872c44fdf configparser: stabilize tests
Summary:
Add locking for tests reading / mutating global env vars.
Restore HG_TEST_REMOTE_CONFIG after testing.
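
The general pattern (serialize tests that touch process-wide environment variables, and restore the previous value afterwards) might look like the following sketch. The actual tests are in the configparser crate and are not shown here; `scoped_env` and `_ENV_LOCK` are hypothetical names used only for illustration.

```python
import os
import threading
from contextlib import contextmanager

# One process-wide lock so tests mutating the environment never overlap.
_ENV_LOCK = threading.Lock()


@contextmanager
def scoped_env(name, value):
    """Set an environment variable under a lock, restoring it on exit."""
    with _ENV_LOCK:
        old = os.environ.get(name)
        os.environ[name] = value
        try:
            yield
        finally:
            # Put back exactly what was there before the test ran.
            if old is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old
```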

Reviewed By: DurhamG

Differential Revision: D23269862

fbshipit-source-id: d61141b25c923a059de07c3dc8479f3bee06dce7
2020-08-24 12:36:09 -07:00
Egor Tkachenko
7fd2f22cc0 Fix bug with zero hash manifest
Summary:
If the imported commit has a manifest id of all zeros (empty commit), the Blobimport job can't find it in the blobstore and returns error D23266254.
Add an early return when the manifest_id is NULL_HASH.
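
The early return can be pictured with this small sketch. It assumes a 20-byte all-zero id and models the blobstore as a plain dict; `load_manifest_entries` is a hypothetical helper, not the blobimport code itself.

```python
NULL_HASH = bytes(20)  # 20 zero bytes, i.e. the all-zeros SHA-1


def load_manifest_entries(manifest_id: bytes, blobstore: dict) -> dict:
    """Return the manifest entries for `manifest_id`.

    `blobstore` is a stand-in dict mapping manifest ids to entries.
    """
    if manifest_id == NULL_HASH:
        # Empty commit: nothing is stored under the null id, so skip the lookup.
        return {}
    try:
        return blobstore[manifest_id]
    except KeyError:
        raise LookupError(
            f"manifest {manifest_id.hex()} not found in blobstore"
        ) from None
```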

Reviewed By: StanislavGlebik

Differential Revision: D23266254

fbshipit-source-id: b8a3c47edfdfdc9d8cc8ea032fb96e27a04ef911
2020-08-24 07:34:29 -07:00
Pavel Aslanov
69e57b232d fix panic in slice index
Summary:
Based on [user report](https://fb.workplace.com/groups/scm/permalink/3128221090560823/).
Note that slices in Rust behave differently: if an index exceeds the slice size, it always panics. My fix is based on the assumption that the behavior should be similar to Python's.
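
For reference, the Python-like behavior amounts to clamping the bounds before slicing. The sketch below shows that clamping explicitly, as one would have to do in a language whose slices panic on out-of-range bounds. `clamp_range` and `lenient_slice` are hypothetical helpers, and negative (from-the-end) indices are deliberately not modeled.

```python
def clamp_range(length, start, end):
    """Clamp [start, end) so that 0 <= start <= end <= length."""
    start = min(max(start, 0), length)
    end = min(max(end, start), length)
    return start, end


def lenient_slice(items, start, end):
    """Slice without failing when the bounds exceed the sequence."""
    lo, hi = clamp_range(len(items), start, end)
    return items[lo:hi]
```

With these helpers, a range that runs past the end yields a shorter (possibly empty) result instead of an error.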

Reviewed By: quark-zju

Differential Revision: D23263922

fbshipit-source-id: 3d2a1a1b59f14e43b1f1a2b7102982b11637c0b4
2020-08-24 05:24:58 -07:00
Katie Mancini
3827b9787d add fetch type to data fetch logging
Summary:
Having the type of data fetched can help in debugging where these fetches are
coming from. In the current logs, figuring out if a data fetch is a blob or a
tree requires some manual work. When looking at a big bunch of fetches this is
not super practical.

So this includes this info in our logging.

Reviewed By: chadaustin

Differential Revision: D23243444

fbshipit-source-id: 9abe5180c5d2afc0d02b27ba6a6b76401e86556e
2020-08-21 17:38:14 -07:00
Jun Wu
34df768136 log: add a config to simplify graphs
Summary:
This could help simplify the graph a lot for repos with lots of merges. For
example, logging tags on linux.git looks like:

  o                      fb893de3  Yesterday at 17:28  master
  ├─┬─┬─┬─┬─┬─┬─┬─┬─┬─╮
  ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  bcf87687  Aug 02 at 14:21  v5.8
  ╷ ╷ ╷ ╭─────┬─┬───┬─╯
  ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  92ed3019  Jul 26 at 14:14  v5.8-rc7
  ╷ ╷ ╷ ╭─────┬─┬─┬─╯
  ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  ba47d845  Jul 19 at 15:41  v5.8-rc6
  ╷ ╷ ╷ ╭─┬─┬─┬─┬─╯
  ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  11ba4688  Jul 12 at 16:34  v5.8-rc5
  ╷ ╷ ╷ ╭─┬─┬─┬─╯
  ╷ ╷ ╷ ╷ ╷ ╷ o  dcb7fd82  Jul 05 at 16:20  v5.8-rc4
  ╷ ╷ ╷ ╭─┬─┬─┤
  ╷ ╷ ╷ ╷ ╷ o ╷  9ebcfadb  Jun 28 at 15:00  v5.8-rc3
  ╷ ╷ ╭─┬─┬─╯ ╷
  ╷ ╷ ╷ ╷ o   ╷  48778464  Jun 21 at 15:45  v5.8-rc2
  ╷ ╷ ╷ ╭─╯   ╷
  ╷ ╷ ╷ o     ╷                      b3a9e3b9  Jun 14 at 12:45  v5.8-rc1
  ╭─┬─┬─┼─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─╮
  ╷ ╷ o ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷  3d77e6a8  May 31 at 16:49  v5.7
  ╭─┬─┴───────┬───────────┬─┬───┬─╮
  ╷ o   ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷  9cb1fd0e  May 24 at 15:32  v5.7-rc7
  ╷ ╰─────────┬─────────────┬─┬─┬─╮
  ╷     ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  b9bbe6ed  May 17 at 16:48  v5.7-rc6
  ╭───────────┬─────────────┬─┬─┬─╯
  ╷     ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  2ef96a5b  May 10 at 15:16  v5.7-rc5
  ╭───────────┬─────────────┬─┬─╯
  ╷     ╷ ╷ ╷ o ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷  0e698dfa  May 03 at 14:56  v5.7-rc4
  ╭───────────┴───────────┬─┬─╮
  o     ╷ ╷ ╷   ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷  6a8b55ed  Apr 26 at 13:51  v5.7-rc3
  ╰─────────────────┬───────┬─╮
        ╷ ╷ ╷   ╷ ╷ ╷ ╷ ╷ ╷ ╷ o  ae83d0b4  Apr 19 at 14:35  v5.7-rc2
        ╷ ╷ ╷   ╷ ╷ ╷ ╷ ╷ ╷ ╭─┤
        ╷ ╷ ╷   ╷ ╷ ╷ ╷ ╷ ╷ o ╷  8f3d9f35  Apr 12 at 12:35  v5.7-rc1
  ╭─┬─┬───────┬─────┬─┬─┬─┬─┼─╮
  ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ ╷ o ╷ ╷  7111951b  Mar 29 at 15:25  v5.6
  ╷ ╭─────────┬─────┬─┬─┬─┴───╮
  ╷ ╷ ╷ ╷ ╷ ╷ o ╷ ╷ ╷ ╷ ╷   ╷ ╷  16fbf79b  Mar 22 at 18:31  v5.6-rc7
  ╷ ╷ ╭───────┴─────┬─┬─┬─────╮
  ╷ ╷ ╷ ╷ ╷ ╷   ╷ ╷ ╷ ╷ ╷   ╷ o  fb33c651  Mar 15 at 15:01  v5.6-rc6
  ╷ ╭─┬─────────────┬─┬─┬─────╯
  ╷ ╷ ╷ ╷ ╷ ╷   ╷ ╷ ╷ ╷ o   ╷  2c523b34  Mar 08 at 17:44  v5.6-rc5
  ╷ ╭─┬─────────────┬─┬─╯   ╷
  ╷ ╷ o ╷ ╷ ╷   ╷ ╷ ╷ ╷     ╷  98d54f81  Mar 01 at 14:38  v5.6-rc4
  ╷ ╭─┴─────────────┬─╮     ╷
  ╷ ╷   ╷ ╷ ╷   ╷ ╷ ╷ o     ╷  f8788d86  Feb 23 at 16:17  v5.6-rc3
  ....

And with simplification turned on, it looks like:

  o    fb893de3  Yesterday at 17:28  master
  ├─╮
  o ╷  bcf87687  Aug 02 at 14:21  v5.8
  ╷ ╷
  o ╷  92ed3019  Jul 26 at 14:14  v5.8-rc7
  ╷ ╷
  o ╷  ba47d845  Jul 19 at 15:41  v5.8-rc6
  ╷ ╷
  o ╷  11ba4688  Jul 12 at 16:34  v5.8-rc5
  ╷ ╷
  o ╷  dcb7fd82  Jul 05 at 16:20  v5.8-rc4
  ╷ ╷
  o ╷  9ebcfadb  Jun 28 at 15:00  v5.8-rc3
  ╷ ╷
  o ╷  48778464  Jun 21 at 15:45  v5.8-rc2
  ├─╯
  o  b3a9e3b9  Jun 14 at 12:45  v5.8-rc1
  ╷
  o  3d77e6a8  May 31 at 16:49  v5.7
  ╷
  o  9cb1fd0e  May 24 at 15:32  v5.7-rc7
  ╷
  o  b9bbe6ed  May 17 at 16:48  v5.7-rc6
  ╷
  o  2ef96a5b  May 10 at 15:16  v5.7-rc5
  ╷
  o  0e698dfa  May 03 at 14:56  v5.7-rc4
  ╷
  o  6a8b55ed  Apr 26 at 13:51  v5.7-rc3
  ╷
  o  ae83d0b4  Apr 19 at 14:35  v5.7-rc2
  ╷
  o  8f3d9f35  Apr 12 at 12:35  v5.7-rc1
  ╷
  o  7111951b  Mar 29 at 15:25  v5.6
  ╷
  o  16fbf79b  Mar 22 at 18:31  v5.6-rc7
  ╷
  o  fb33c651  Mar 15 at 15:01  v5.6-rc6
  ╷
  o  2c523b34  Mar 08 at 17:44  v5.6-rc5
  ╷
  o  98d54f81  Mar 01 at 14:38  v5.6-rc4
  ╷
  o  f8788d86  Feb 23 at 16:17  v5.6-rc3
  ....

Under the hood, the difference is how `reachableroots` gets calculated.
See also D22657197 (a5c36fd0b1) and D22368827 (da42f2c17e).

Since the old behavior almost always seems confusing to humans, the new
config is turned on by default (but only takes effect if the "segments"
backend is used).

Reviewed By: sfilipco

Differential Revision: D23095468

fbshipit-source-id: f0fc631d2d9a00e3b36744e4236b43d230d10687
2020-08-21 17:10:36 -07:00
Katie Mancini
e1836f679c Add spaces to process name in data fetch logs
Summary:
Previously pieces of the command line for a process were separated by `\0`.
This makes them a bit hard to read and also makes running queries on them
harder. Converts these `\0` back to spaces to fix this.

see https://fb.workplace.com/groups/edenfs/permalink/1446711485499079/ for
more motivation.
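
The conversion itself is small. A sketch, assuming the raw command line is a NUL-separated byte string (as in `/proc/<pid>/cmdline` on Linux); `format_cmdline` is a hypothetical name:

```python
def format_cmdline(raw: bytes) -> str:
    """Turn a NUL-separated command line into a space-separated string."""
    # Drop trailing NUL terminators, then turn the separators into spaces.
    text = raw.rstrip(b"\0").replace(b"\0", b" ")
    return text.decode("utf-8", errors="replace")
```

Note that this loses the distinction between separate arguments and arguments that themselves contain spaces, which is usually acceptable for log readability.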

Reviewed By: wez

Differential Revision: D23266909

fbshipit-source-id: e4a9284e04039fcd971bed0d6e21d220e946acdb
2020-08-21 13:57:56 -07:00
Mark Thomas
b9c0772f2f commitcloud: handle missing optional fields
Summary:
The fields in commit cloud `References` structures are optional.  Handle them
not being present.
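
A minimal sketch of tolerating absent optional fields when reading such a structure. The field names `heads` and `bookmarks` and the helper `parse_references` are assumptions for illustration only; the commit message does not list the actual fields.

```python
def parse_references(data: dict) -> dict:
    """Normalize a references payload, defaulting any missing field.

    The keys used here are hypothetical examples of optional fields.
    """
    return {
        # `or` also covers a field that is present but set to None.
        "heads": list(data.get("heads") or []),
        "bookmarks": dict(data.get("bookmarks") or {}),
    }
```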

Reviewed By: quark-zju

Differential Revision: D23266786

fbshipit-source-id: ed7128bc7e6b762d3509d77b40a00b77885191b9
2020-08-21 13:52:02 -07:00
Jun Wu
e7f3167810 hgcommands: show milliseconds on RUST_LOG output
Summary: This makes it a bit easier to track down perf issues printed by RUST_LOGs.

Reviewed By: sfilipco

Differential Revision: D23095463

fbshipit-source-id: 78221a1992389f512fac6e6e633be6d19123e04a
2020-08-21 13:00:45 -07:00
Jun Wu
b4c9b6a7a1 test-git-changelog: fix the test on Windows
Summary:
Use `git config core.autocrlf false` to silence warnings like:

```
   $ git add alpha
+  warning: LF will be replaced by CRLF in alpha.
+  The file will have its original line endings in your working directory
```

Reviewed By: sfilipco

Differential Revision: D23270146

fbshipit-source-id: af3bf241edb9f615bcc285b51cc491385f208039
2020-08-21 13:00:45 -07:00
Liubov Dmitrieva
56e9cd9ed7 add undelete workspace command
Summary: The command is needed to restore a deleted workspace

Reviewed By: markbt

Differential Revision: D23250376

fbshipit-source-id: e24a7cbc0aad004291853b4c34d7474789aa9c2b
2020-08-21 13:00:45 -07:00
Jun Wu
d7cbb641ff dag: fix fuzz tests
Summary:
The fuzz tests need `TestContext::id_dag()`, which was removed by D20471712 (1fb5acf242).
Restore it so fuzz tests can run. This is mainly to check the new `range`
function.

The `range` fuzz test does find an issue caused by `>` written as `>=`
relatively quickly.

Reviewed By: sfilipco

Differential Revision: D23106176

fbshipit-source-id: e9540cc932503a9d54246d24c70bac829fcb13df
2020-08-21 13:00:45 -07:00
Jun Wu
60ebf5c2a0 changelog2: add SHA1 verification
Summary: Ensure that the commit text is verified, but do not verify git hashes.

Reviewed By: DurhamG

Differential Revision: D23095464

fbshipit-source-id: e62341f6c7258c6f18b7cc75088c25dfc7040ab1
2020-08-21 13:00:45 -07:00
Jun Wu
0dc28f689f changelog2: initial support for segmented git changelog
Summary:
The immediate goal is to run benchmarks on a commit graph provided by a git
repo without converting a whole (large) repo from git to hg. Note git repos can
be cloned in a shallow way so it only contains the commit graph. For example:

  git clone https://github.com/torvalds/linux --filter=tree:0 -n

Note: The above command writes `repositoryformatversion = 1` in `.git/config`,
which is not supported by libgit2. Manually editing it to
`repositoryformatversion = 0` would enable libgit2 to read it for this crate's
use-case.

In the longer term we might want to extend the support so refs/trees/files can
be read/written directly via the git repo based on this work. However that's
currently beyond scope.

Reviewed By: DurhamG

Differential Revision: D23095467

fbshipit-source-id: 868beb0c7de60453b47962639863eb8f7e3f5753
2020-08-21 13:00:45 -07:00
Jun Wu
749602e534 hgcommits: add gitsegments backend
Summary:
The backend translates git commit graph to segments. It's useful for
benchmarking on git commit graphs.

Reviewed By: DurhamG

Differential Revision: D23095470

fbshipit-source-id: 21a28869e91ef8f38bbf9925443eb4ac26f05e3d
2020-08-21 13:00:45 -07:00
Jun Wu
d352133d6d hgcommits: use concrete error types
Summary: Migrate to concrete types so it can be typechecked.

Reviewed By: DurhamG

Differential Revision: D23095469

fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
2020-08-21 13:00:45 -07:00
Jun Wu
e5527715b7 gitdag: crate to build segmented dag from git history
Summary:
Read git commit graph and migrate them to `dag::Dag`.

This allows using Rust dag abstractions on the git
commit graph.

Reviewed By: DurhamG

Differential Revision: D23095471

fbshipit-source-id: 2163701350ce82ce6e97074e56ad5877f3c9c158
2020-08-21 13:00:45 -07:00
Jun Wu
aa6575e377 revset: optimize revset functions using rust fast paths
Summary:
Add alternative paths that will be faster if changelog2 is used, since they are
backed by native paths.

Add a config option to disable the fast paths if they cause issues.

Reviewed By: DurhamG

Differential Revision: D23036074

fbshipit-source-id: 489b6eac64148867c209d595623d0b9c21ad1d5a
2020-08-21 13:00:45 -07:00
Durham Goode
7fbac081e2 configs: fix osx test runs
Summary:
OSX doesn't support touch -d. Let's just skip that part of the test on
that platform. This fixes the OSX build.

Reviewed By: singhsrb

Differential Revision: D23253475

fbshipit-source-id: 0eccb884cbdd4bf0a4068fbf943ba7dac9df4e04
2020-08-21 13:00:45 -07:00
Jun Wu
d6bff56df1 smartlog: migrate some revset calculation to a faster path
Summary:
Detect the "segments" backend and calculate the revset differently.

Practically, with collapse-obsolete disabled, the time of related revset
calculation drops from 0.14s to 0.03s in my fbsource repo.

The `obsolete()` set calculation is expensive (0.4-0.6s) and a bit more
expensive with the new DAG APIs, which will be addressed in upcoming
changes. EDIT: Addressed by D23036063.

Reviewed By: DurhamG

Differential Revision: D23036055

fbshipit-source-id: 71140a88599cc68bfa90d564c786da89b3ebd38b
2020-08-21 13:00:45 -07:00
Jun Wu
8c9f1f5cee test-smartlog: avoid using rev numbers
Summary: Migrated by `./fix-revnum.py`.

Reviewed By: DurhamG

Differential Revision: D23036082

fbshipit-source-id: cf456b3625e39329c817c696691494dc6725bc22
2020-08-21 13:00:45 -07:00
Jun Wu
fb38ea9152 test-smartlog: use explicit template
Summary:
The `compact` template is rarely used and is coupled with rev numbers (ex. rev
number decides what "parents" to show). Use explicit templates.  This makes the
test change easier to check.

Reviewed By: DurhamG

Differential Revision: D23036076

fbshipit-source-id: f2cc0f25191711fa7d846a8ad38aee8fb9171273
2020-08-21 13:00:45 -07:00
Jun Wu
e1ad0df320 commitcloud: optimize revset for segmented changelog backend
Summary:
The `notbackedup()` revset is used as part of `summary` that prints information
at the end of `smartlog`. It can take hundreds of milliseconds if there are
many heads. Detect segmented changelog and use a fast path for it.

Practically this reduces `summary` from 594ms to 91ms for me:

With segmented changelog (doublewrite backend) and new code path:

    91    \ summary                             status.py:23
     2      \ currentworkspace                  workspace.py:121
     3       | _get (2 times)                   workspace.py:110
     3       | read (2 times)                   config.py:195
     3       | parse (2 times)                  config.py:116
     2       | compile (14 times)               util.py:1464
     3      \ __init__                          syncstate.py:44
    82      \ revs                              localrepo.py:1203

With revlog and old code path:

   594    \ summary                             status.py:23
     2      \ currentworkspace                  workspace.py:121
     4       | _get (2 times)                   workspace.py:110
     3       | read (2 times)                   config.py:195
     3       | parse (2 times)                  config.py:116
     3       | compile (14 times)               util.py:1464
     3      \ __init__                          syncstate.py:44
    46      \ revs                              localrepo.py:1203
   539      \ _iterfilter                       smartset.py:647
   538       | <lambda> (1565 times)            commitcloud/__init__.py:371
   537       | __contains__ (1565 times)        smartset.py:1039
   533       | _consumegen (17355 times)        smartset.py:1122

Reviewed By: markbt

Differential Revision: D23036075

fbshipit-source-id: 09dcc34f34a42814c6526e558d40b4d75ba9d75f
2020-08-21 13:00:45 -07:00
Jun Wu
f26dfc7d46 pymutationstore: make getdag support selecting successors or predecessors
Summary: Expose the Rust API so `getdag` can choose to skip successors or predecessors.

Reviewed By: markbt

Differential Revision: D23036056

fbshipit-source-id: 30cd437c5420d2d10176e33ef9de98814046f4ce
2020-08-21 13:00:45 -07:00
Jun Wu
45db3bbf96 mutationstore: add a native path to calculate 'obsolete()'
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect, are
slowed by a very long chain).

The new code is about 3x faster on my repo too:

  # before
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
  Wall time: 316 ms
  Out[2]: 1127

  # after
  In [1]: list(repo.nodes('draft()'))
  In [2]: %time len(m.mutation.obsoletenodes(repo))
  CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
  Wall time: 82.3 ms
  Out[2]: 1127

Reviewed By: markbt

Differential Revision: D23036063

fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
2020-08-21 13:00:45 -07:00
Jun Wu
78477ad9c5 mutationstore: optimize get_dag
Summary:
Optimize get_dag:
- Avoid parsing mutation entries once they are parsed, by keeping an in-memory
  `parent_map`.
- Pass `heads` to `add_heads` so the segments are less fragmented and the cycle
  break helper is more efficient.

The `heads` optimization is effective. Practically this makes `get_dag` about 2x faster.

This has a subtle change on cycle handling: a full cycle without any non-cycle heads will
be ignored. Practically cycles are rare so it might be okay.

Together with improvements on the `dag` side, `get_dag` is about 4x faster.

Reviewed By: markbt

Differential Revision: D23036062

fbshipit-source-id: 3dc407b562f7ebf2543a87c5cd651ad6a2339d67
2020-08-21 13:00:45 -07:00
Jun Wu
be2d28fb95 dag: fix non-master high-level segments building
Summary:
If there are no new master segments, it's still possible to have new non-master
segments. Fix the loop condition so we don't skip building non-master segments.

Reviewed By: sfilipco

Differential Revision: D23095465

fbshipit-source-id: 46eb9d5b5f2b04241981558646e0bc090652abce
2020-08-21 13:00:45 -07:00
Jun Wu
e11f36e96b dag: test high-level segments building for non-master
Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.

Reviewed By: DurhamG, sfilipco

Differential Revision: D23095466

fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
2020-08-21 13:00:45 -07:00
Jun Wu
23074edd9b dag: add some tracing spans
Summary: This is useful to investigate internals of dag calculations.

Reviewed By: sfilipco

Differential Revision: D23095473

fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
2020-08-21 13:00:45 -07:00
Jun Wu
cd9aa9cb6c dag: improve segment building perf by using precalculated flat segments
Summary:
Follow up of the previous change by actually using the flat segments to build
segments. This significantly improved the perf. `cargo bench --bench dag_ops`
shows:

  building segments (old)                           774.109 ms
  building segments (new)                           143.879 ms

Besides, a `O(N^2)` update to `head_ids` is changed. It improves performance
when the graph has many heads (ex. the mutation graph).

Reviewed By: sfilipco

Differential Revision: D23036080

fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
2020-08-21 13:00:45 -07:00
Jun Wu
9c9ecbc82b dag: make IdMap::assign_head calculate flat segments
Summary:
While testing the `obsolete()` set, I found an in-memory segmented DAG takes
10x longer to build than a HashMap DAG.

Part of the inefficiency comes from using a translated "parent_func" that
round-trips through Id and Vertex, used by segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.

This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.

Reviewed By: sfilipco

Differential Revision: D23036060

fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
2020-08-21 13:00:45 -07:00
Jun Wu
0742dc6293 dag: make to_set API bind the dag
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.

The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid stack
overflow of rustc resolving constraints.

Reviewed By: sfilipco

Differential Revision: D23036077

fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
2020-08-21 13:00:45 -07:00
Jun Wu
adf027742e nameset: add flatten API
Summary: This will be useful for the `obsolete()` set.

Reviewed By: sfilipco

Differential Revision: D23036072

fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
2020-08-21 13:00:45 -07:00
Jun Wu
f23b1112f0 nameset: a & b should not use id-based fast path if id map is incompatible
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.

Reviewed By: sfilipco

Differential Revision: D23036068

fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
2020-08-21 13:00:45 -07:00
Jun Wu
c1e596dbd6 nameset: use real id map snapshot instead of a pointer in hints
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.

Reviewed By: sfilipco

Differential Revision: D23036058

fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
2020-08-21 13:00:45 -07:00
Jun Wu
0ac5f05097 nameset: use real dag snapshot instead of a pointer in hints
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (sets capture dag information with them,
enabling use-cases like converting a NameSet from another dag to the
current dag without requiring extra `dag` objects).

Reviewed By: sfilipco

Differential Revision: D23036067

fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
2020-08-21 13:00:45 -07:00
Jun Wu
759ceb6212 nameset: do not swap x & y if they come from different graphs
Summary:
If `x` and `y` come from the same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.

Reviewed By: sfilipco

Differential Revision: D23036081

fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
2020-08-21 13:00:45 -07:00
Jun Wu
762603455a nameset: new metaset for separate iter+contains lazy/fast paths
Summary:
Sets like `obsolete()` and `merge()` could have a fast "contains" path:
Just check the given commit without calculating a full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.

This will be used for the `obsolete()` set in upcoming changes.
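
To make the idea of two separately defined paths concrete, here is a small Python sketch. It is not the Rust `MetaSet` from the `nameset` code; the class below only mirrors the concept, and the constructor arguments `evaluate` and `contains` are hypothetical names.

```python
class MetaSet:
    """A lazy set with independent 'contains' and 'iterate' code paths."""

    def __init__(self, evaluate, contains):
        self._evaluate = evaluate  # builds the full set; may be expensive
        self._contains = contains  # cheap per-item membership test
        self._cached = None

    def __contains__(self, item):
        if self._cached is not None:
            # The full set is already known, so reuse it.
            return item in self._cached
        # Fast path: answer without calculating the full set.
        return self._contains(item)

    def __iter__(self):
        if self._cached is None:
            # Evaluate once, on first iteration (ordering is not preserved
            # in this sketch because a frozenset is used).
            self._cached = frozenset(self._evaluate())
        return iter(self._cached)
```

Membership checks on a fresh instance never trigger the expensive evaluation; only iteration does.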

Reviewed By: sfilipco

Differential Revision: D23036059

fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
2020-08-21 13:00:45 -07:00