Commit Graph

62 Commits

Author SHA1 Message Date
Jun Wu
3ced6804e4 dag: ensure MetaSet has Hints set
Summary:
Previously, `MetaSet` was constructed with default `Hints`. That disables fast
paths. Revise the API so MetaSet requires an explicit `Hints` to address the
issue.

Reviewed By: sfilipco

Differential Revision: D26203557

fbshipit-source-id: 9e7658af8723b06d0efdcad1ab4671c79e907326
2021-02-05 11:53:46 -08:00
Jun Wu
9b02ebc711 dag: make spanset module private
Summary:
This just renames types so `IdSet` is the recommended name and `SpanSet`
remains an implementation detail.

Reviewed By: sfilipco

Differential Revision: D26203560

fbshipit-source-id: 7ca0262f3ad6d874363c73445f40f8c5bf3dc40e
2021-02-05 11:53:45 -08:00
Jun Wu
a3f7dc77a8 dag: move id.not_found() to a trait
Summary: In upcoming changes, we're moving Id to a separate crate. This makes that easier.

Reviewed By: sfilipco

Differential Revision: D25857918

fbshipit-source-id: 6e2163f6fa171d4cd3be4fc0c4c248fd87ba739c
2021-01-26 17:11:08 -08:00
Jun Wu
d0d149d868 dag: use VerLink to track IdMap change compatibility
Summary:
Similar to the previous change. `VerLink` tracks compatibility more accurately.
- No false positives comparing to the current `map_id` approach.
- Less false negatives comparing to the previous `Arc::ptr_eq` approach.

The `map_id` is kept for debugging purpose.

Reviewed By: sfilipco

Differential Revision: D25607513

fbshipit-source-id: 7d7c7e3d49f707a584142aaaf0a98cfd3a9b5fe8
2020-12-18 16:56:43 -08:00
Jun Wu
9ba0b046c0 dag: use VerLink to track dag change compatibility
Summary:
`VerLink` tracks compatibility more accurately.
- No false positives comparing to the current `dag_id` approach.
- Less false negatives comparing to the previous `Arc::ptr_eq` approach.

The `dag_id` is kept for debugging purpose.

Note: By the current implementation, `dag.flush()` will make `dag`
incompatible from its previous state. This is somewhat expected, as
`flush` might pick up any changes on the filesystem, reassign non-master. Those
can be actually incompatible. This might be improved in the future to detect
reload changes by using some extra information.

Reviewed By: sfilipco

Differential Revision: D25607511

fbshipit-source-id: 3cfc97610504813a3e5bb32ec19a90495551fd3a
2020-12-18 16:56:43 -08:00
Jun Wu
eea00a2cb1 dag: add an API for DagAlgorithm identity
Summary:
See the previous diff for context. The new API will be used to check if two
dags are compatible.

Note: It can cause false positive on compatibility checks, which need a
more complex solution. See D25607513 in this stack.

Reviewed By: sfilipco

Differential Revision: D25598079

fbshipit-source-id: f5fc9c03d73b42fadb931038fe2e078881be955f
2020-12-18 16:56:42 -08:00
Jun Wu
aef162f7a1 dag: add an API for IdMap identity
Summary:
It turns out `Arc::ptr_eq` is becoming unreliable, which will cause fast paths
to be not used, and extreme slowness in some cases (ex. `public & nodes`
iterating everything in `public`).

This diff adds an API for an IdMap to tell us its identity. That identity is
then used to replace the unreliable `Arc::ptr_eq`.

For an in-memory map, we just assign a unique number (per process) for its
identity on initialization. For an on-disk map, we use the type + path to
represent it.

Note: strictly speaking, this could cause false positives about
"maps are compatible", because two maps initially cloned from each other
can be mutated differently and their map_id do not change. That will
be addressed in upcoming diffs introducing a more complex but precise way to
track compatibility.

Reviewed By: sfilipco

Differential Revision: D25598076

fbshipit-source-id: 98c58f367770adaa14edcad20eeeed37420fbbaa
2020-12-16 20:08:41 -08:00
Jun Wu
dcf4957619 dag: make parent function async
Summary: Make the parent function used by various graph building functions async.

Reviewed By: sfilipco

Differential Revision: D25353612

fbshipit-source-id: 31f173dc82f0cce6022cc2caae78369fdc821c8f
2020-12-10 12:37:36 -08:00
Jun Wu
ad6f25addc dag: make IdConvert async
Summary: Make IdConvert async and migrate all its users.

Reviewed By: sfilipco

Differential Revision: D25350915

fbshipit-source-id: f05c89a43418f1180bf0ffa573ae2cdb87162c76
2020-12-10 12:37:35 -08:00
Jun Wu
53bdae78d9 dag: make ToIdSet async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345232

fbshipit-source-id: b8967ea51a6141a95070006a289dd724522f8e18
2020-12-10 12:37:34 -08:00
Jun Wu
f854d2e03e dag: make DagAlgorithm async
Summary:
Update DagAlgorithm and all its users to async. This makes it easier to make
IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345236

fbshipit-source-id: d6cf76723356bd0eb81822843b2e581de1e3290a
2020-12-10 12:37:34 -08:00
Jun Wu
a03e8f4c55 dag: make DagPersistent and DagAddHeads async
Summary: This makes it easier to make DagAlgorithm async.

Reviewed By: sfilipco

Differential Revision: D25345234

fbshipit-source-id: 5ca4bac38f5aac4c6611146a87f423a244f1f5a2
2020-12-10 12:37:33 -08:00
Jun Wu
2496281d78 dag: make PrefixLookup async
Summary: Use async function for the PrefixLookup trait.

Reviewed By: sfilipco

Differential Revision: D24840820

fbshipit-source-id: d22cac9f11b06e3127fa956e3f116cf232214125
2020-12-10 12:37:32 -08:00
Jun Wu
32176eca42 dag: add async interface for NameSet
Summary:
Change the main API of NameSet to async. Use the `nonblocking` crate to bridge
the sync and async world for compatibility. Future changes will migrate
Iterator to async Stream.

Reviewed By: sfilipco

Differential Revision: D24806696

fbshipit-source-id: f72571407a5747a4eabe096dada288656c9d426e
2020-12-10 12:37:31 -08:00
Jun Wu
465ac6c5df dag: drop IdMapEq trait
Summary:
The trait requires an `IdMap` snapshot to be locally ready. That's not easy for
all possible implementations. Drop it to simplify things.

Reviewed By: sfilipco

Differential Revision: D24399501

fbshipit-source-id: 4d85f77c99208cda30b2a543a0bb5b295f49a65c
2020-10-20 15:19:30 -07:00
Durham Goode
10248e54b3 phases: make public phase calculation more efficient
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.

To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.

Reviewed By: quark-zju

Differential Revision: D24096539

fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
2020-10-05 14:40:53 -07:00
David Tolnay
e83e05ff25 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591028

fbshipit-source-id: f458503fc2b9c25023fa1643eca5e166882a4811
2020-09-09 07:52:34 -07:00
David Tolnay
e62b176170 Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: zertosh

Differential Revision: D23568779

fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
2020-09-07 20:47:59 -07:00
Jun Wu
682365f14d nameset: make Id{Static,Lazy}Set require Dag on construction
Summary:
Both IdSet and IdLazy set require both Dag and IdMap to construct.
This is step 1 torwards making Dag and IdMap immutable in hints.

A misspeall of "lhs" vs "hints" in the union set is discovered by the change
and fixed.

Reviewed By: sfilipco

Differential Revision: D23182520

fbshipit-source-id: 3d052de4b8681d3672ebc45d953d1e784f64b2a4
2020-08-26 15:32:22 -07:00
Jun Wu
bd7769b34a dag: rename snapshot_dag to dag_snapshot
Summary: This is more consistent with `id_map_snapshot`.

Reviewed By: sfilipco

Differential Revision: D23182519

fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
2020-08-26 15:32:22 -07:00
Jun Wu
0ac5f05097 nameset: use real dag snapshot instead of a pointer in hints
Summary:
This trades a bit performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).

Reviewed By: sfilipco

Differential Revision: D23036067

fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
2020-08-21 13:00:45 -07:00
Jun Wu
f666cb1cf0 dag: add DagAlgorithm::snapshot_dag
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).

Reviewed By: sfilipco

Differential Revision: D22970579

fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
2020-08-21 13:00:45 -07:00
Jun Wu
2db783bed8 revlogindex: make parent_revs fallible
Summary: If parent_revs gets an out-of-bound rev, it should fail.

Reviewed By: sfilipco

Differential Revision: D23036071

fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
2020-08-14 22:00:26 -07:00
Jun Wu
0f838d7abf revlogindex: fix \0 header handling
Summary:
If the text starts with `\0`, the `\0` should be considered as part of the
uncompressed text instead of a separated header.

Reviewed By: sfilipco

Differential Revision: D22970575

fbshipit-source-id: 49e8a1a1ea42a3c4cf153b70f59fd0558dcfcede
2020-08-14 22:00:26 -07:00
Jun Wu
54a1a620d0 revlogindex: fix parent handling
Summary:
The parent handling is unsound when there are revs that are skipped. Fix it by
reasoning about commit hashes for parents.

Reviewed By: sfilipco

Differential Revision: D23036078

fbshipit-source-id: 8f710171471025cd48b3bd8f6ea57c68330eb8b8
2020-08-14 22:00:26 -07:00
Jun Wu
4b5833968a revlogindex: be Ctrl+C/SIGKILL safe
Summary:
This provides Ctrl+C/SIGKILL safety. It's needed because we no longer use the
Python transaction framework. If the write is incomplete, the revlog index
logical length will ensure new processes won't see incomplete data.

The length of revlog data is not tracked, as some "unused" in it does not
really matter. Reading the revlog should be still fine.

Reviewed By: sfilipco

Differential Revision: D22914423

fbshipit-source-id: f2f446cde79c7270cbd1ef165f8707368a0a2990
2020-08-06 12:31:57 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error type not interested
by `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
ff9c979b07 revlogindex: use concrete error types
Summary:
All dependencies of revlogindex have migrated to concreted error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.

Reviewed By: sfilipco

Differential Revision: D22857554

fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
2020-08-06 12:31:57 -07:00
Jun Wu
af375c51b0 radixbuf: use concrete error types
Summary: The `radixbuf` crate already has its own concrete error type. Use it.

Reviewed By: sfilipco

Differential Revision: D22855450

fbshipit-source-id: 307a46ddd79b28a18ee779867ee1e604b531828a
2020-08-06 12:31:57 -07:00
Jun Wu
235a9306e1 revlogindex: support delta-ed content
Summary:
Although new changelog revlogs do not use deltas since years ago, early
revisions in our production changelog still use mpatch delta format
because they are stream-cloned.

Teach revlogindex to support them.

Reviewed By: sfilipco

Differential Revision: D22657204

fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
2020-07-30 20:32:38 -07:00
Jun Wu
a36f77673e revlogindex: implement reachable_roots fast path
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.

Reviewed By: sfilipco

Differential Revision: D22657193

fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
2020-07-30 20:32:37 -07:00
Jun Wu
fcc78319a0 revlogindex: use dedicated error type for missing commits
Summary:
This replaces RustError that might happen during `addcommits`, and allow us to
handle it without having a stacktrace.

Reviewed By: DurhamG

Differential Revision: D22539564

fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
2020-07-30 20:32:33 -07:00
Jun Wu
c68d389d95 revlogindex: update DAG hints
Summary:
Follow up of D22638454.

This makes revlogindex marks its compatible DAG so "all()" fast paths can be used properly.

Reviewed By: sfilipco

Differential Revision: D22638459

fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
2020-07-30 20:32:32 -07:00
Jun Wu
d5d429a5c7 revlogindex: optimize with ANCESTORS hint
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculates `heads_ancestors(all())` automatically and gets
the speed-up.

Reviewed By: sfilipco

Differential Revision: D22638464

fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
2020-07-30 20:32:31 -07:00
Jun Wu
f5fb9fb09d revlogindex: optimize heads_ancestors
Summary:
Optimize it to not covert revs to `Vec<u32>`, and have a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster and match
the C revlog `headrevs` performance when calculating `headsancestors(all())`:

```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop

In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```

Reviewed By: sfilipco

Differential Revision: D22638461

fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
2020-07-30 20:32:30 -07:00
Jun Wu
2d4bb1d7e3 revlogindex: fast path for parents
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headsrevs` implementation).

Reviewed By: sfilipco

Differential Revision: D22638453

fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
2020-07-30 20:32:30 -07:00
Jun Wu
3d9f195721 changelog: use Rust RevlogIndex for 'ancestor' revset function
Summary: This avoids depending on the C index if the Rust DAG is available.

Reviewed By: DurhamG

Differential Revision: D22519587

fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
2020-07-30 20:00:41 -07:00
Jun Wu
e440d3ce2b revlogindex: update nodemap even if it's non-symlink and mmaped on Windows
Summary: On Windows a mmap file cannot be replaced. Detect that and delete manually.

Reviewed By: farnz

Differential Revision: D22428731

fbshipit-source-id: 4d308a07aae02dcaf2aedb7b0267a535c2e09c92
2020-07-08 11:31:21 -07:00
Jun Wu
b1ae5b2874 revlogindex: skip flushing duplicated entries
Summary:
If the revlog on disk was changed to include new commits, read them and avoid
writing duplicated commits (which breaks nodemap building).

Reviewed By: sfilipco

Differential Revision: D22323187

fbshipit-source-id: cdd65f31e65865d9f3868e43416633297896c0f9
2020-07-06 15:51:00 -07:00
Jun Wu
a0c5b1b3a5 revlogindex: is_ancestor(x, x) should return true
Summary: This is discovered by using it in Python world.

Reviewed By: sfilipco

Differential Revision: D22323186

fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
2020-07-06 15:50:59 -07:00
Jun Wu
52668752d8 revlogindex: de-duplicate insertions
Summary: Adding a same commit multiple times is a no-op.

Reviewed By: sfilipco

Differential Revision: D22323190

fbshipit-source-id: 61a06335581a9cad32dc7e929b841ec69b551a9c
2020-07-06 15:50:59 -07:00
Jun Wu
b86f3bd6e2 revlogindex: use tests from the dag crate
Summary: This adds some test coverage for the revlog DagAlgorithm implementation.

Reviewed By: sfilipco

Differential Revision: D22249157

fbshipit-source-id: a1d347b4d90d0e7f8fb229c317cc75c2b8e16242
2020-07-06 15:50:59 -07:00
Jun Wu
fcbe821dd1 revlogindex: impl DagAddHeads for RevlogIndex
Summary:
This makes RevlogIndex compatible with the generic DAG testing API from the dag
crate.

Reviewed By: sfilipco

Differential Revision: D22249156

fbshipit-source-id: 54a3c458e85804968964174eab674e494a6fa8a2
2020-07-06 15:50:59 -07:00
Jun Wu
5d9baa2f07 revlogindex: implement fast path for only
Summary:
For revlog, calculating `only` can have some fast paths that do not scan the
entire changelog.

Reviewed By: sfilipco

Differential Revision: D21944136

fbshipit-source-id: 58391636350f8f19643d59c46d663f55861d6de3
2020-07-06 15:50:58 -07:00
Jun Wu
7d75f6046f revlogindex: implement fast path for only_both
Summary:
This will be used to maintain narrow-heads phase calculation and sunsetting the
revlog-specific changelog.index2.

Reviewed By: sfilipco

Differential Revision: D21944131

fbshipit-source-id: a8bbd1fd24546f4891ffa677476bff750c3faf5f
2020-07-06 15:50:58 -07:00
Jun Wu
40cd0f8f06 revlogindex: fix an offset-by-one error
Summary: The values of `pending_nodes_index` should start from 0 instead of 1.

Reviewed By: sfilipco

Differential Revision: D21944133

fbshipit-source-id: a2a332868f16b398037289c81bf8076d1400c0a7
2020-07-06 15:50:58 -07:00
Jun Wu
bd0a35f2a0 revlogindex: do not raise errors on ambiguous prefix
Summary: This matches the interface of segmented changelog.

Reviewed By: sfilipco

Differential Revision: D21944134

fbshipit-source-id: 75f68b2838de4abe95f13cb3c62dc68af132fff7
2020-07-06 15:50:58 -07:00
Jun Wu
4670200c21 revlogindex: maintain revlog.d file handler transparently
Summary:
This drops the `file` parameter from the `raw_data` API, making
RevlogIndex easier to use.

Reviewed By: sfilipco

Differential Revision: D21854228

fbshipit-source-id: 259726524d1cc6a1f9d00783e22f9502c7decdeb
2020-07-06 15:50:58 -07:00
Jun Wu
137fa3cd34 revlogindex: implement writing to revlog data
Summary: Extend RevlogIndex to support writing to revlog data.

Reviewed By: sfilipco

Differential Revision: D21854227

fbshipit-source-id: 11b6bf3b706b316f23c33ab07144530c9db92d58
2020-07-06 15:50:58 -07:00
Jun Wu
f005d92f07 revlogindex: implement reading from revlog data
Summary: Extend RevlogIndex to support reading from revlog data.

Reviewed By: sfilipco

Differential Revision: D21854229

fbshipit-source-id: 4cbc08762fd236a97370d5d55c59a222f935b262
2020-07-06 15:50:58 -07:00