Summary:
The files in commit cloud `References` structures are optional. Handle them
not being present.
Reviewed By: quark-zju
Differential Revision: D23266786
fbshipit-source-id: ed7128bc7e6b762d3509d77b40a00b77885191b9
Summary: This makes it a bit easier to track down perf issues printed by RUST_LOGs.
Reviewed By: sfilipco
Differential Revision: D23095463
fbshipit-source-id: 78221a1992389f512fac6e6e633be6d19123e04a
Summary:
Use `git config core.autocrlf false` to silence warnings like:
```
$ git add alpha
+ warning: LF will be replaced by CRLF in alpha.
+ The file will have its original line endings in your working directory
```
Reviewed By: sfilipco
Differential Revision: D23270146
fbshipit-source-id: af3bf241edb9f615bcc285b51cc491385f208039
Summary: The command is needed to restore a deleted workspace
Reviewed By: markbt
Differential Revision: D23250376
fbshipit-source-id: e24a7cbc0aad004291853b4c34d7474789aa9c2b
Summary:
The fuzz tests need `TestContext::id_dag()`, which was removed by D20471712 (1fb5acf242).
Restore it so fuzz tests can run. This is mainly to check the new `range`
function.
The `range` fuzz test does find an issue caused by `>` written as `>=`
relatively quickly.
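
As an illustration of why a randomized test catches this kind of comparison
slip quickly, here is a minimal sketch that checks a pruned `range`
computation against a brute-force reference on small random graphs. The
helpers (`random_dag`, `range_naive`, `range_pruned`, `fuzz`) and the graph
encoding are hypothetical; they are not the `dag` crate's API or its actual
fuzz harness.

```python
import random


def random_dag(rng, n):
    """parents[i] lists the parents of vertex i; parents always have smaller ids."""
    return [sorted(rng.sample(range(i), min(i, rng.randint(0, 2)))) for i in range(n)]


def ancestors(parents, heads):
    """All vertexes reachable from heads via parent edges, including heads."""
    seen, stack = set(), list(heads)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen


def range_naive(parents, roots, heads):
    """Reference: ancestors of heads that are also descendants of roots."""
    roots = set(roots)
    return {v for v in ancestors(parents, heads) if ancestors(parents, [v]) & roots}


def range_pruned(parents, roots, heads, buggy=False):
    """Same result, but never visits ids below the lowest root.

    buggy=True flips the boundary comparison (an illustrative off-by-one).
    """
    low, roots = min(roots), set(roots)
    keep = (lambda v: v > low) if buggy else (lambda v: v >= low)
    anc, stack = set(), [h for h in heads if keep(h)]
    while stack:
        v = stack.pop()
        if v not in anc:
            anc.add(v)
            stack.extend(p for p in parents[v] if keep(p))
    result = set()
    for v in sorted(anc):  # ascending ids: parents are classified before children
        if v in roots or any(p in result for p in parents[v]):
            result.add(v)
    return result


def fuzz(range_impl, rounds=200, seed=0):
    """Return the first counterexample found, or None if all rounds agree."""
    rng = random.Random(seed)
    for _ in range(rounds):
        n = rng.randint(1, 8)
        parents = random_dag(rng, n)
        roots = rng.sample(range(n), rng.randint(1, n))
        heads = rng.sample(range(n), rng.randint(1, n))
        if range_impl(parents, roots, heads) != range_naive(parents, roots, heads):
            return parents, roots, heads
    return None
```

Calling `fuzz(range_pruned)` exercises the correct comparison, while
`fuzz(lambda p, r, h: range_pruned(p, r, h, buggy=True))` returns a small
counterexample because the lowest root itself is dropped.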
Reviewed By: sfilipco
Differential Revision: D23106176
fbshipit-source-id: e9540cc932503a9d54246d24c70bac829fcb13df
Summary: Ensure that the commit text is verified, but do not verify git hashes.
Reviewed By: DurhamG
Differential Revision: D23095464
fbshipit-source-id: e62341f6c7258c6f18b7cc75088c25dfc7040ab1
Summary:
The immediate goal is to run benchmarks on a commit graph provided by a git
repo without converting a whole (large) repo from git to hg. Note that a git repo
can be cloned in a shallow way so that it contains only the commit graph. For example:
git clone https://github.com/torvalds/linux --filter=tree:0 -n
Note: The above command writes `repositoryformatversion = 1` in `.git/config`,
which is not supported by libgit2. Manually editing it to `repositoryformatversion = 0`
enables libgit2 to read the repo for this crate's use-case.
In the longer term we might want to extend the support so refs/trees/files can
be read/written directly via the git repo based on this work. However that's
currently beyond scope.
Reviewed By: DurhamG
Differential Revision: D23095467
fbshipit-source-id: 868beb0c7de60453b47962639863eb8f7e3f5753
Summary: Migrate to concrete types so it can be typechecked.
Reviewed By: DurhamG
Differential Revision: D23095469
fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
Summary:
Read git commit graph and migrate them to `dag::Dag`.
This allows using Rust dag abstractions on the git
commit graph.
Reviewed By: DurhamG
Differential Revision: D23095471
fbshipit-source-id: 2163701350ce82ce6e97074e56ad5877f3c9c158
Summary:
Add alternative paths that will be faster if changelog2 is used, since they are
backed by native paths.
Add a config option to disable the fast paths if they cause issues.
Reviewed By: DurhamG
Differential Revision: D23036074
fbshipit-source-id: 489b6eac64148867c209d595623d0b9c21ad1d5a
Summary:
OSX doesn't support touch -d. Let's just skip that part of the test on
that platform. This fixes the OSX build.
Reviewed By: singhsrb
Differential Revision: D23253475
fbshipit-source-id: 0eccb884cbdd4bf0a4068fbf943ba7dac9df4e04
Summary:
Detect the "segments" backend and calculate the revset differently.
Practically, with collapse-obsolete disabled, the time of related revset
calculation drops from 0.14s to 0.03s in my fbsource repo.
The `obsolete()` set calculation is expensive (0.4-0.6s) and a bit more
expensive with the new DAG APIs, which will be addressed in upcoming
changes. EDIT: Addressed by D23036063.
Reviewed By: DurhamG
Differential Revision: D23036055
fbshipit-source-id: 71140a88599cc68bfa90d564c786da89b3ebd38b
Summary:
The `compact` template is rarely used and is coupled with rev numbers (ex. rev
number decides what "parents" to show). Use explicit templates. This makes the
test change easier to check.
Reviewed By: DurhamG
Differential Revision: D23036076
fbshipit-source-id: f2cc0f25191711fa7d846a8ad38aee8fb9171273
Summary:
The `notbackedup()` revset is used as part of `summary` that prints information
at the end of `smartlog`. It can take hundreds of milliseconds if there are
many heads. Detect segmented changelog and use a fast path for it.
Practically this reduces `summary` from 594ms to 91ms for me:
With segmented changelog (doublewrite backend) and new code path:
91 \ summary status.py:23
2 \ currentworkspace workspace.py:121
3 | _get (2 times) workspace.py:110
3 | read (2 times) config.py:195
3 | parse (2 times) config.py:116
2 | compile (14 times) util.py:1464
3 \ __init__ syncstate.py:44
82 \ revs localrepo.py:1203
With revlog and old code path:
594 \ summary status.py:23
2 \ currentworkspace workspace.py:121
4 | _get (2 times) workspace.py:110
3 | read (2 times) config.py:195
3 | parse (2 times) config.py:116
3 | compile (14 times) util.py:1464
3 \ __init__ syncstate.py:44
46 \ revs localrepo.py:1203
539 \ _iterfilter smartset.py:647
538 | <lambda> (1565 times) commitcloud/__init__.py:371
537 | __contains__ (1565 times) smartset.py:1039
533 | _consumegen (17355 times) smartset.py:1122
Reviewed By: markbt
Differential Revision: D23036075
fbshipit-source-id: 09dcc34f34a42814c6526e558d40b4d75ba9d75f
Summary: Expose the Rust API so `getdag` can choose to skip successors or predecessors.
Reviewed By: markbt
Differential Revision: D23036056
fbshipit-source-id: 30cd437c5420d2d10176e33ef9de98814046f4ce
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect, were
slowed by a very long chain).
The new code is about 3x faster on my repo too:
# before
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
Wall time: 316 ms
Out[2]: 1127
# after
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
Wall time: 82.3 ms
Out[2]: 1127
Reviewed By: markbt
Differential Revision: D23036063
fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
Summary:
Optimize get_dag:
- Avoid parsing mutation entries once they are parsed, by keeping an in-memory
`parent_map`.
- Pass `heads` to `add_heads` so the segments are less fragmented and the
cycle-breaking helper is more efficient.
The `heads` optimization is effective. Practically this makes `get_dag` about 2x faster.
This subtly changes cycle handling: a full cycle without any non-cycle heads will
be ignored. Practically cycles are rare so it might be okay.
Together with improvements on the `dag` side, `get_dag` is about 4x faster.
Reviewed By: markbt
Differential Revision: D23036062
fbshipit-source-id: 3dc407b562f7ebf2543a87c5cd651ad6a2339d67
Summary:
If there are no new master segments, it's still possible to have new non-master
segments. Fix the loop condition so we don't skip building non-master segments.
Reviewed By: sfilipco
Differential Revision: D23095465
fbshipit-source-id: 46eb9d5b5f2b04241981558646e0bc090652abce
Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.
Reviewed By: DurhamG, sfilipco
Differential Revision: D23095466
fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
Summary: This is useful to investigate internals of dag calculations.
Reviewed By: sfilipco
Differential Revision: D23095473
fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
Summary:
Follow up of the previous change by actually using the flat segments to build
segments. This significantly improved the perf. `cargo bench --bench dag_ops`
shows:
building segments (old) 774.109 ms
building segments (new) 143.879 ms
Besides, the `O(N^2)` update to `head_ids` is changed, which improves performance
when the graph has many heads (ex. the mutation graph).
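
The summary does not show the new `head_ids` update. As a rough sketch of one
common way to avoid quadratic cost, heads can be tracked in a hash set while
vertexes arrive parents-first, so each insertion only touches the new vertex
and its parents. `heads_incremental` and the dict-based graph are hypothetical
and are not the crate's actual code.

```python
def heads_incremental(parents_of, order):
    """Heads of a graph whose vertexes are inserted parents-first.

    Each step removes the new vertex's parents from the head set and adds the
    vertex itself, so the cost per vertex is proportional to its parent count
    rather than to the number of existing heads.
    """
    heads = set()
    for v in order:
        heads.difference_update(parents_of[v])
        heads.add(v)
    return heads


# Example graph: 0 <- 1, 0 <- 2, (1, 2) <- 3, 0 <- 4
example_parents = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [0]}
example_heads = heads_incremental(example_parents, [0, 1, 2, 3, 4])
```

A list-based variant that scans all current heads for every inserted vertex
gives the same answer but degrades when there are many heads, which is the
situation the summary describes for the mutation graph.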
Reviewed By: sfilipco
Differential Revision: D23036080
fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
Summary:
While testing the `obsolete()` set, I found that an in-memory segmented DAG takes
10x as long to build as a HashMap DAG.
Part of the inefficiency comes from using a translated "parent_func" that round-trips
through Id and Vertex, which is used by the segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.
This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.
Reviewed By: sfilipco
Differential Revision: D23036060
fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.
The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid stack
overflow of rustc resolving constraints.
Reviewed By: sfilipco
Differential Revision: D23036077
fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
Summary: This will be useful for the `obsolete()` set.
Reviewed By: sfilipco
Differential Revision: D23036072
fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.
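
To make the problem concrete, here is a minimal sketch in which two maps assign
the same integer to different names. The `IdMap` and `IdSet` classes are
hypothetical stand-ins, not the Rust types; the point is only that Ids are
compared directly when both sets share one map, and are translated through
names otherwise.

```python
class IdMap:
    """Hypothetical name <-> integer id mapping for one graph."""

    def __init__(self, names):
        self.names = list(names)
        self.ids = {name: i for i, name in enumerate(self.names)}


class IdSet:
    """A set of integer ids that remembers which IdMap produced them."""

    def __init__(self, idmap, names):
        self.idmap = idmap
        self.ids = {idmap.ids[n] for n in names}

    def names(self):
        return {self.idmap.names[i] for i in self.ids}

    def __and__(self, other):
        if self.idmap is other.idmap:
            # Same map: ids mean the same thing, so compare them directly.
            common = self.ids & other.ids
            return IdSet(self.idmap, [self.idmap.names[i] for i in common])
        # Different maps: equal ids may name different vertexes, so go via names.
        return IdSet(self.idmap, self.names() & other.names())


map_a = IdMap(["A", "B", "C"])  # A=0, B=1, C=2
map_b = IdMap(["C", "B", "A"])  # C=0, B=1, A=2
x = IdSet(map_a, ["A"])         # ids {0}
y = IdSet(map_b, ["C"])         # ids {0} as well, but a different vertex
z = IdSet(map_b, ["A", "B"])
```

Here `x.ids & y.ids` is non-empty even though the sets share no vertex, while
`(x & y).names()` is empty and `(x & z).names()` contains only `"A"`.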
Reviewed By: sfilipco
Differential Revision: D23036068
fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.
Reviewed By: sfilipco
Differential Revision: D23036058
fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (sets capture dag information with them,
which enables use-cases like converting a NameSet from another dag to the
current dag without requiring extra `dag` objects).
Reviewed By: sfilipco
Differential Revision: D23036067
fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
Summary:
If `x` and `y` come from the same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.
Reviewed By: sfilipco
Differential Revision: D23036081
fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
Summary:
Sets like `obsolete()` and `merge()` could have a fast "contains" path:
just check the given commit without calculating a full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.
This will be used for the `obsolete()` set in upcoming changes.
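
The idea of two separate fast paths can be sketched as follows. This `MetaSet`
is a hypothetical Python stand-in, not the Rust type: it takes an `evaluate`
callable that builds the full set and an optional `contains` callable that
answers membership without building it.

```python
class MetaSet:
    """Lazy set with separate "evaluate" and "contains" code paths (sketch)."""

    def __init__(self, evaluate, contains=None):
        self._evaluate = evaluate
        self._contains = contains
        self._items = None  # filled in on first full evaluation

    def _materialize(self):
        if self._items is None:
            self._items = frozenset(self._evaluate())
        return self._items

    def __contains__(self, item):
        # Use the cheap per-item check unless the full set is already known.
        if self._items is None and self._contains is not None:
            return self._contains(item)
        return item in self._materialize()

    def __iter__(self):
        return iter(self._materialize())


# Example with made-up data: count how often the full set is computed.
OBSOLETE = {"b", "d"}
calls = {"evaluate": 0}


def evaluate_all():
    calls["evaluate"] += 1
    return OBSOLETE


obsolete = MetaSet(evaluate_all, contains=lambda node: node in OBSOLETE)
```

Membership tests such as `"b" in obsolete` leave `calls["evaluate"]` at 0, and
iterating the set evaluates it once and caches the result.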
Reviewed By: sfilipco
Differential Revision: D23036059
fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
Summary:
Before this change, `flush` with empty changes but `master` moves will cause an
error, because the `parents_func` only contains "pending changes", aka. new
vertexes. The `parents_func` does not know `master` and `master` is needed to
re-assign them from the non-master to the master group.
With the snapshot API, things become easier. We just take a snapshot before
reloading, and use the snapshot to answer parent_names.
Reviewed By: sfilipco
Differential Revision: D22970569
fbshipit-source-id: 99a25857ba98792edff69985c16df118a560ffb0
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).
Reviewed By: sfilipco
Differential Revision: D22970579
fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
Summary:
Make it possible to snapshot a Dag. This is useful for cases where another
struct wants access to the Dag without lifetimes. Namely, the LazySet
might want to keep a snapshot of the Dag.
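
A rough sketch of the pattern, assuming a simple dict-backed graph: the dag
hands out a read-only copy of its state, and a lazy set keeps that copy instead
of a reference to the mutable dag. `Dag`, `snapshot`, and `LazyAncestors` here
are hypothetical Python stand-ins, not the Rust API.

```python
from types import MappingProxyType


class Dag:
    """Mutable graph that can hand out cached, read-only snapshots (sketch)."""

    def __init__(self):
        self._parents = {}
        self._snapshot = None

    def add(self, name, parents=()):
        self._parents[name] = tuple(parents)
        self._snapshot = None  # any change invalidates the cached snapshot

    def snapshot(self):
        if self._snapshot is None:
            # Copy once, then share the same read-only view until the next change.
            self._snapshot = MappingProxyType(dict(self._parents))
        return self._snapshot


class LazyAncestors:
    """Lazy set that owns a snapshot, so later dag changes do not affect it."""

    def __init__(self, dag, heads):
        self._graph = dag.snapshot()
        self._heads = list(heads)

    def __iter__(self):
        seen, stack = set(), list(self._heads)
        while stack:
            name = stack.pop()
            if name not in seen:
                seen.add(name)
                yield name
                stack.extend(self._graph[name])


dag = Dag()
dag.add("A")
dag.add("B", ["A"])
lazy = LazyAncestors(dag, ["B"])
first_snapshot = dag.snapshot()
dag.add("C", ["B"])  # happens after the lazy set was created
```

The lazy set still iterates over the graph as it was when the snapshot was
taken; the cost is the copy made on the first `snapshot()` call after a change.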
Reviewed By: sfilipco
Differential Revision: D22970568
fbshipit-source-id: 508c38d3ffac2ffcd2e682578c3c5e5787ea3bcf
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.
Reviewed By: sfilipco
Differential Revision: D22970581
fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
Summary:
The nameset serves as a bridge for Rust NameSet sets. It differs from the
Rust IdSet in that it supports all kinds of Rust NameSet (lazy or
non-lazy).
Unlike the native Rust binding, the added nameset uses rev numbers and fits into
the Python smartset framework.
Reviewed By: sfilipco
Differential Revision: D23036066
fbshipit-source-id: 060b3927dda6cd2275af21b093729c7e0e88ee7c
Summary: The Rust "flush(masternodes)" API does not handle nullid. Filter it out from Python.
Reviewed By: sfilipco
Differential Revision: D22970578
fbshipit-source-id: 671fe950948067a0b3f97c5b65ff2b9b7ed4b631
Summary:
By default, `torevs` calls Python iteration for non-list, non-spans Python
objects. The `idset` object has the `spans` which can be used as a fast
path.
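
The shape of such a fast path can be sketched like this. `to_spans` is a
hypothetical helper, not the actual `torevs` implementation: it reuses a
`spans` attribute when the object exposes one and only falls back to
element-by-element iteration otherwise.

```python
def to_spans(obj):
    """Convert a collection of revs to a sorted list of inclusive (low, high) spans."""
    spans = getattr(obj, "spans", None)
    if spans is not None:
        # Fast path: the object already stores spans, so no per-rev iteration.
        return list(spans)
    result = []
    for rev in sorted(set(obj)):
        if result and result[-1][1] + 1 == rev:
            result[-1] = (result[-1][0], rev)  # extend the current span
        else:
            result.append((rev, rev))  # start a new span
    return result


class FakeIdSet:
    """Stand-in for an object with spans; iterating it is deliberately an error."""

    spans = [(0, 5), (9, 9)]

    def __iter__(self):
        raise AssertionError("slow path should not be taken")
```

With this sketch, `to_spans(FakeIdSet())` never touches `__iter__`, while a
plain list such as `[1, 2, 3, 7, 8]` goes through the slower merge loop.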
Reviewed By: DurhamG
Differential Revision: D22970580
fbshipit-source-id: f491404ba803c4468c17cd74daaea90f46b8b38b
Summary:
This allows certain code paths to use `dageval` with the idea that `dageval` is
going to be faster.
Reviewed By: sfilipco
Differential Revision: D22970576
fbshipit-source-id: ba4536a55691de63640e574c898320629c6d7b2f
Summary: This allows migrating between a few changelog formats we have.
Reviewed By: DurhamG
Differential Revision: D22970571
fbshipit-source-id: d6b577ae5beb72a43fff999c26c35fcdc33e8f83
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).
Reviewed By: DurhamG, sfilipco
Differential Revision: D22970582
fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
Summary: For now it just prints some details about the changelog backend.
Reviewed By: DurhamG, sfilipco
Differential Revision: D22970573
fbshipit-source-id: 719a5e5bb6f3856df3c9357e47daa9e7c8584952
Summary:
This option is needed to validate Mononoke Smartlog against the original
infinitepush Commit Cloud Smartlog. This option is advanced and can be removed
after full migration to the Mononoke backend.
Reviewed By: markbt
Differential Revision: D23241251
fbshipit-source-id: e550334b104d18bb58d39acb8540ebdc9e711c4e