429 Commits

Author SHA1 Message Date
Jun Wu
48d7297887 dag: remove unnecessary snapshots
Summary:
Some code paths take an (expensive) snapshot only to satisfy the `Arc::ptr_eq`
compatibility check. With `VerLink` it's more efficient to use `VerLink`
directly. This is potentially more efficient for `VerLink` too, because the
`Arc` won't be cloned unnecessarily and `VerLink::bump()` is more likely to
use its optimized path.

Reviewed By: sfilipco

Differential Revision: D25608200

fbshipit-source-id: 1b3ecc5d7ec5d495bdda22d66025bb812f3d68a0
2020-12-18 16:56:44 -08:00
Jun Wu
d0d149d868 dag: use VerLink to track IdMap change compatibility
Summary:
Similar to the previous change. `VerLink` tracks compatibility more accurately.
- No false positives compared to the current `map_id` approach.
- Fewer false negatives compared to the previous `Arc::ptr_eq` approach.

The `map_id` is kept for debugging purposes.

Reviewed By: sfilipco

Differential Revision: D25607513

fbshipit-source-id: 7d7c7e3d49f707a584142aaaf0a98cfd3a9b5fe8
2020-12-18 16:56:43 -08:00
Jun Wu
21a1f61285 dag: invalidate snapshot is no longer required for correctness
Summary:
Previously, snapshots needed to be invalidated manually. That was error-prone.
For example, `import_clone_data` forgot to call `invalidate_snapshot`.

With `VerLink`, it's easy to check whether a snapshot is up-to-date. So let's
just use that and remove the need to invalidate manually.

`invalidate_snapshot` is still useful to drop `version` in `snapshot` so
`VerLink::bump` might be more efficient. Forgetting about it no longer affects
correctness.

Reviewed By: sfilipco

Differential Revision: D25607514

fbshipit-source-id: 5efb489cda1d4875bcd274c5a197948f67101dc1
2020-12-18 16:56:43 -08:00
Jun Wu
9ba0b046c0 dag: use VerLink to track dag change compatibility
Summary:
`VerLink` tracks compatibility more accurately.
- No false positives compared to the current `dag_id` approach.
- Fewer false negatives compared to the previous `Arc::ptr_eq` approach.

The `dag_id` is kept for debugging purposes.

Note: With the current implementation, `dag.flush()` will make `dag`
incompatible with its previous state. This is somewhat expected, as
`flush` might pick up any changes on the filesystem and reassign non-master.
Those changes can actually be incompatible. This might be improved in the
future to detect reload changes by using some extra information.

Reviewed By: sfilipco

Differential Revision: D25607511

fbshipit-source-id: 3cfc97610504813a3e5bb32ec19a90495551fd3a
2020-12-18 16:56:43 -08:00
Jun Wu
edaed5d4b4 dag: add VerLink to track change compatibility more precisely
Summary:
There are 2 kinds of changes:
- Append-only changes. These are backwards-compatible.
- Non-append-only changes. These are not backwards-compatible.

Previously,
- `Arc::ptr_eq` on snapshot is too fragile. It treats append-only compatible
  changes as incompatible.
  - Even worse, because of wrapper types (e.g. `Arc::new(Arc::new(dag))` is
    different from `dag`), even the same underlying struct can be treated as
    incompatible.
- `(map|dag)_id` is too coarse. It treats incompatible non-append-only changes
  as compatible.

Add `VerLink` to track those 2 different kinds of changes. It basically keeps a
(cheap) tree so backwards compatible changes will be detected precisely.
`VerLink` will replace IdMap and Dag compatibility checks.
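
As a rough illustration of the idea (not the actual `VerLink` implementation),
the Rust sketch below models versions as nodes in a parent-linked tree. The
names `Version`, `Node` and `is_compatible_with` are hypothetical. An
append-only change adds a child node, a non-append-only change starts a new
root, and a newer version is treated as backwards-compatible with an older one
when the older node is one of its ancestors.

```rust
use std::sync::Arc;

// One node per version. `parent` links to the version this one was
// appended to; a root (no parent) marks an incompatible fresh start.
struct Node {
    parent: Option<Arc<Node>>,
}

// Hypothetical stand-in for a version handle; cloning it is cheap and
// keeps pointing at the same node.
#[derive(Clone)]
pub struct Version {
    node: Arc<Node>,
}

impl Version {
    // Non-append-only change: start a new, unrelated root.
    pub fn new() -> Self {
        Version { node: Arc::new(Node { parent: None }) }
    }

    // Append-only change: the new version is a child of the old one.
    pub fn bump(&mut self) {
        let parent = Arc::clone(&self.node);
        self.node = Arc::new(Node { parent: Some(parent) });
    }

    // True if `older` is this version or one of its ancestors, i.e. only
    // append-only changes happened between `older` and `self`.
    pub fn is_compatible_with(&self, older: &Version) -> bool {
        let mut current = Some(&self.node);
        while let Some(node) = current {
            if Arc::ptr_eq(node, &older.node) {
                return true;
            }
            current = node.parent.as_ref();
        }
        false
    }
}
```

In this sketch, cloning or wrapping a `Version` does not change the result,
and two histories that diverge from the same base are reported as
incompatible with each other. The sketch allocates on every `bump` and does
not attempt any optimized path.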

Reviewed By: sfilipco

Differential Revision: D25607512

fbshipit-source-id: 478f81deee4d2494b56491ec4a851154ab7ae52d
2020-12-18 16:56:43 -08:00
Jun Wu
f626f09bfd dag: add some debug logs about set operations
Summary:
This makes it easier to check if set operations are using fast paths or not by
setting `RUST_LOG=dag=debug`.

Reviewed By: sfilipco

Differential Revision: D25598075

fbshipit-source-id: 1503a195268c0989d5166596f2c8a66e15201372
2020-12-18 16:56:43 -08:00
Jun Wu
eea00a2cb1 dag: add an API for DagAlgorithm identity
Summary:
See the previous diff for context. The new API will be used to check if two
dags are compatible.

Note: It can cause false positives in compatibility checks, which need a
more complex solution. See D25607513 in this stack.

Reviewed By: sfilipco

Differential Revision: D25598079

fbshipit-source-id: f5fc9c03d73b42fadb931038fe2e078881be955f
2020-12-18 16:56:42 -08:00
Jun Wu
aef162f7a1 dag: add an API for IdMap identity
Summary:
It turns out `Arc::ptr_eq` is becoming unreliable, which causes fast paths
to be skipped and extreme slowness in some cases (e.g. `public & nodes`
iterating everything in `public`).

This diff adds an API for an IdMap to tell us its identity. That identity is
then used to replace the unreliable `Arc::ptr_eq`.

For an in-memory map, we just assign a unique number (per process) for its
identity on initialization. For an on-disk map, we use the type + path to
represent it.
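
A minimal sketch of the two identity schemes described above, assuming the
identity is represented as a `String`. The function names `mem_map_id` and
`disk_map_id`, the string layout, and the `kind` parameter are hypothetical
and only illustrate the idea.

```rust
use std::path::Path;
use std::sync::atomic::{AtomicU64, Ordering};

// Process-wide counter used to hand out unique ids to in-memory maps.
static NEXT_MEM_ID: AtomicU64 = AtomicU64::new(0);

// Identity for an in-memory map: a number that is unique per process.
pub fn mem_map_id() -> String {
    format!("mem:{}", NEXT_MEM_ID.fetch_add(1, Ordering::Relaxed))
}

// Identity for an on-disk map: the type name plus the path.
pub fn disk_map_id(kind: &str, path: &Path) -> String {
    format!("{}:{}", kind, path.display())
}
```

Two handles opened on the same path with the same type get equal ids, while
every new in-memory map gets a distinct one.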

Note: strictly speaking, this could cause false positives about
"maps are compatible", because two maps initially cloned from each other
can be mutated differently while their `map_id` does not change. That will
be addressed in upcoming diffs introducing a more complex but precise way to
track compatibility.

Reviewed By: sfilipco

Differential Revision: D25598076

fbshipit-source-id: 98c58f367770adaa14edcad20eeeed37420fbbaa
2020-12-16 20:08:41 -08:00
Jun Wu
ec7c0659c9 dag: relax trait bounds for AbstractNameDag
Summary: This makes it more flexible.

Reviewed By: kulshrax

Differential Revision: D24467604

fbshipit-source-id: 63023cf0dde2fb7eac592ac79008e4b7a62340c1
2020-12-11 17:03:37 -08:00
Jun Wu
dcf4957619 dag: make parent function async
Summary: Make the parent function used by various graph building functions async.

Reviewed By: sfilipco

Differential Revision: D25353612

fbshipit-source-id: 31f173dc82f0cce6022cc2caae78369fdc821c8f
2020-12-10 12:37:36 -08:00
Jun Wu
bffffb2415 dag: remove IdMapBuildParents
Summary:
It is no longer needed for building segments (replaced by "prepared flat
segments"). Remove it.

Reviewed By: sfilipco

Differential Revision: D25353613

fbshipit-source-id: aede9e33c3217a61b5b14aae5b128d8953bc578e
2020-12-10 12:37:36 -08:00
Jun Wu
ad6f25addc dag: make IdConvert async
Summary: Make IdConvert async and migrate all its users.

Reviewed By: sfilipco

Differential Revision: D25350915

fbshipit-source-id: f05c89a43418f1180bf0ffa573ae2cdb87162c76
2020-12-10 12:37:35 -08:00
Jun Wu
24badee0d0 dag: make IdMapAssignHead async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25350912

fbshipit-source-id: fbaf638b16a9cf468b7530b19d699b7996ddc4f1
2020-12-10 12:37:35 -08:00
Jun Wu
f30934ab4f dag: require Send and Sync on Persist::Lock
Summary: This will make async migrating easier.

Reviewed By: sfilipco

Differential Revision: D25350913

fbshipit-source-id: f33bdc0023ae0cc49601504b811991ea6813ff9e
2020-12-10 12:37:35 -08:00
Jun Wu
abc97bb6fe dag: make write_sparse_idmap async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25350914

fbshipit-source-id: 9f2957731f13a28fdfab834de19763b8afcf8ffa
2020-12-10 12:37:35 -08:00
Jun Wu
461fa77fd7 dag: make Set::flatten async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345239

fbshipit-source-id: 684a0843ae32270aa9b537ef9a2b17a28c027e51
2020-12-10 12:37:34 -08:00
Jun Wu
53bdae78d9 dag: make ToIdSet async
Summary: This will make it easier to make IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345232

fbshipit-source-id: b8967ea51a6141a95070006a289dd724522f8e18
2020-12-10 12:37:34 -08:00
Jun Wu
f854d2e03e dag: make DagAlgorithm async
Summary:
Update DagAlgorithm and all its users to async. This makes it easier to make
IdConvert async.

Reviewed By: sfilipco

Differential Revision: D25345236

fbshipit-source-id: d6cf76723356bd0eb81822843b2e581de1e3290a
2020-12-10 12:37:34 -08:00
Jun Wu
6c02a90386 dag: make MetaSet accept async evaluate and contains
Summary:
Make it possible to use async functions in MetaSet functions.
It will be used when DagAlgorithm becomes async.

Reviewed By: sfilipco

Differential Revision: D25345229

fbshipit-source-id: 0469d572b56df21fbdbdfae4178377e572adbcda
2020-12-10 12:37:34 -08:00
Jun Wu
a03e8f4c55 dag: make DagPersistent and DagAddHeads async
Summary: This makes it easier to make DagAlgorithm async.

Reviewed By: sfilipco

Differential Revision: D25345234

fbshipit-source-id: 5ca4bac38f5aac4c6611146a87f423a244f1f5a2
2020-12-10 12:37:33 -08:00
Jun Wu
e2542490e8 dag: change &impl Trait to &dyn Trait
Summary: `impl Trait` does not work with `async_trait`.

Reviewed By: sfilipco

Differential Revision: D25345238

fbshipit-source-id: e7890dbaeb162d44e072ea4428d045004608719b
2020-12-10 12:37:33 -08:00
Jun Wu
da6b03768a dag: require Send + Sync on AbstractNameDag
Summary: This makes it easier to migrate to async.

Reviewed By: sfilipco

Differential Revision: D25345228

fbshipit-source-id: e819f0de5f805377a6977325216ef11b14d68c1d
2020-12-10 12:37:33 -08:00
Jun Wu
053cd9d8a3 dag: mark a few things as Sync or Send
Summary:
Marking IdConvert Sync makes it possible to be used as a trait object with async-trait.
See https://docs.rs/async-trait/0.1.41/async_trait/#dyn-traits

`dag` uses `dyn DagAlgorithm` a lot. In the future when async is used more, the
trait object will be required to be Send or Sync. Just require it on the trait
to make our life easier.

Marking `IdDagStore` as Send + Sync makes async migration easier.

Reviewed By: sfilipco

Differential Revision: D25345231

fbshipit-source-id: 45b96057907cbe2a1d38fd424e7d4c963dd1b245
2020-12-10 12:37:32 -08:00
Jun Wu
2496281d78 dag: make PrefixLookup async
Summary: Use async function for the PrefixLookup trait.

Reviewed By: sfilipco

Differential Revision: D24840820

fbshipit-source-id: d22cac9f11b06e3127fa956e3f116cf232214125
2020-12-10 12:37:32 -08:00
Jun Wu
c0bc7800d3 dag: add some default impls
Summary: This makes the trait objects slightly easier to use.

Reviewed By: sfilipco

Differential Revision: D24840821

fbshipit-source-id: 22fcdf13b62420302b562c309874e08360d02372
2020-12-10 12:37:32 -08:00
Jun Wu
4d791fd823 dag: require PrefixLookup for IdConvert
Summary: This makes `dyn IdConvert` include `PrefixLookup`.

Reviewed By: sfilipco

Differential Revision: D24840819

fbshipit-source-id: 8d4e25c534f6e4397ec6f643eb3aa116bff12a2c
2020-12-10 12:37:32 -08:00
Jun Wu
d9128caac1 dag: delegate IdMapSnapshot impls
Summary:
In the future, when async APIs are used, Python bindings will have lifetime
issues. Make it possible to clone the IdMap so the Python bindings can be made
to work.

Reviewed By: sfilipco

Differential Revision: D24840822

fbshipit-source-id: 6aa4e369c877c428ed39d2cbea79e6943836afa8
2020-12-10 12:37:32 -08:00
Jun Wu
c6f14f02f4 dag: use Stream interface for NameSet iteration
Summary: This makes NameSet more friendly for async use-cases, interface-wise.

Reviewed By: sfilipco

Differential Revision: D24806695

fbshipit-source-id: 6e640ba2666872a9128d6460e8b53d6a0e595e56
2020-12-10 12:37:31 -08:00
Jun Wu
32176eca42 dag: add async interface for NameSet
Summary:
Change the main API of NameSet to async. Use the `nonblocking` crate to bridge
the sync and async world for compatibility. Future changes will migrate
Iterator to async Stream.

Reviewed By: sfilipco

Differential Revision: D24806696

fbshipit-source-id: f72571407a5747a4eabe096dada288656c9d426e
2020-12-10 12:37:31 -08:00
Stefan Filip
8327c54db1 dag: add DagImportCloneData::import_clone_data
Summary:
This method reconstructs a dag from clone data.

At the moment we only have a clone data construction method in Mononoke. It's
the Dag's job to construct and import the clone_data. We'll consolidate that at
a later time.

Reviewed By: quark-zju

Differential Revision: D24954823

fbshipit-source-id: fe92179ec80f71234fc8f1cf7709f5104aabb4fb
2020-12-01 09:59:27 -08:00
Stefan Filip
4f70ffdff8 dag: update IdDag::universal_id visibility to public
Summary:
This function is useful in Mononoke to compute the universal commit idmap
that is required for clone.

Reviewed By: quark-zju

Differential Revision: D24808327

fbshipit-source-id: 0cccd59bd7982dd0bc024d5fc85fb5aa5eafb831
2020-11-10 16:47:23 -08:00
Stefan Filip
d00281f8dc dag: add IdDag::flat_segments
Summary:
`flat_segments` are going to be used to generate CloneData. These segments will
be sent to a client repository and are going to bootstrap the iddag.

Reviewed By: quark-zju

Differential Revision: D24808331

fbshipit-source-id: 00bf9723a43bb159cd98304c2c4c6583988d75aa
2020-11-10 16:47:23 -08:00
Stefan Filip
bff5a9ba29 dag: add CloneData
Summary: This is the object that will be used to bootstrap a Dag after a clone.

Reviewed By: quark-zju

Differential Revision: D24808328

fbshipit-source-id: 2c7e97c027c84a11e8716f2e288500474990169b
2020-11-10 16:47:23 -08:00
Stefan Filip
40333a545f dag: rename AssignHeadOutcome to PreparedFlatSegments
Summary:
The goal is to reuse the functionality provided by AssignHeadOutcome for clone
purposes.

Reviewed By: quark-zju

Differential Revision: D24717924

fbshipit-source-id: e88f21ee0d8210e805e9d6896bc8992009bd7975
2020-11-10 16:47:23 -08:00
Stefan Filip
07200876bb segmented_changelog: account for iddag lag in incremental build
Summary:
I initially saw the incremental build as something that would be run in places
that had IdMap and IdDag stored side by side in process. I am now considering
using the incremental build in the tailing process to keep Segmented Changelog
artifacts up to date.

Since we update the IdMap before we update the IdDag, it is likely that we
will have runs that only update the IdMap and fail to update IdDags. This diff
adds a mechanism for the IdDag to catch up.

Reviewed By: krallin

Differential Revision: D24516440

fbshipit-source-id: 3a99248451d806ae20a0ba96199a34a8a35edaa4
2020-10-29 17:40:19 -07:00
Stefan Filip
2cce5b532b dag: update InProcessStore serialization
Summary: Removing redundant indexes from serialization.

Reviewed By: quark-zju

Differential Revision: D24580272

fbshipit-source-id: 49b1d6ae00e2f079dd0ed9d710afcd04b9744442
2020-10-28 14:55:59 -07:00
Stefan Filip
dc1edebf9e dag: benchmark for inprocess iddag serialization
Summary:
I am wondering whether we should customize the serialization format for the
InProcessStore. I want to have a basis for the comparison before I proceed.

Reviewed By: quark-zju

Differential Revision: D24580273

fbshipit-source-id: d3ddfdc029dbdd84f60acace06fddc80b4d005f4
2020-10-28 14:55:59 -07:00
Jun Wu
91ab519edb streams: add API to prefetch commit text in a streaming fashion
Summary:
This will be used to avoid 1-by-1 fetching for the changelog backend with
commit text stored remotely.

Reviewed By: sfilipco

Differential Revision: D24321293

fbshipit-source-id: 9695c72166cadc0b167e2ce7fde822cdf6b1cea8
2020-10-20 18:40:58 -07:00
Jun Wu
a3ac43a23a rollout: use doublewrite changelog backend for hg-dev
Summary:
Turn on rust changelog (changelog2) for all hosts (except hgsql).

Turn on doublewrite backend for hg-dev hosts, triggered by pull.
Tests are mostly working, and I have been using it for weeks.

Reviewed By: singhsrb

Differential Revision: D24259759

fbshipit-source-id: b89a27f98a6d3d1e4ea187bf7b29f875d0e96e2e
2020-10-20 15:53:57 -07:00
Jun Wu
75ae217850 dag: remove NameDagStorage
Summary: It's no longer useful as the new abstract interface does not need it.

Reviewed By: sfilipco

Differential Revision: D24399516

fbshipit-source-id: 2b6735d2a26706c6a3e6b592d2f3ecfc874c94cb
2020-10-20 15:19:31 -07:00
Jun Wu
24edf32eac dag: impl MemNameDag using AbstractNameDag
Summary:
This verifies the abstraction and simplifies the code.

The new code will use non-master segments for add_heads. Therefore the test
changes.

Reviewed By: sfilipco

Differential Revision: D24399496

fbshipit-source-id: 39067ad88ade79b4f7758bcdaafc03e5f34ced91
2020-10-20 15:19:31 -07:00
Jun Wu
72c4a10e7e dag: move indexedlog NameDag to a module
Summary: This makes the main namedag.rs cleaner. The next step is to move MemNameDag.

Reviewed By: sfilipco

Differential Revision: D24399495

fbshipit-source-id: c1e79a60edd8597fe7264f04548e5312414241a7
2020-10-20 15:19:31 -07:00
Jun Wu
ddf2468f00 dag: impl Debug for AbstractNameDag
Summary: This is the last non-abstract interface of NameDag.

Reviewed By: sfilipco

Differential Revision: D24399514

fbshipit-source-id: f39bb84a1851a4fe4d1f29e6b0961e6a153c943d
2020-10-20 15:19:31 -07:00
Jun Wu
f4e4fb9342 dag: impl DagPersistent for AbstractNameDag
Summary: Implement DagPersistent using the abstracted interface.

Reviewed By: sfilipco

Differential Revision: D24399499

fbshipit-source-id: 3e2c3c776d4ff8d84238a4675d443e36ce212819
2020-10-20 15:19:31 -07:00
Jun Wu
c611f18d28 dag: add 'path' to AbstractNameDag
Summary:
There is a need to open AbstractNameDag cleanly from a path.
Abstract that.

Reviewed By: sfilipco

Differential Revision: D24399498

fbshipit-source-id: ca242cd929e8f5580120c01eeaa928f630c21ed7
2020-10-20 15:19:31 -07:00
Jun Wu
c41904b765 dag: impl DagAlgorithm for AbstractNameDag
Summary:
I copied the code since it's hard to implement using the macros.
In the future I plan to merge MemNameDag into AbstractNameDag
and remove the macros.

Reviewed By: sfilipco

Differential Revision: D24399517

fbshipit-source-id: 326e76cd06a6e1ad26b39bcb51ba0ff24106c984
2020-10-20 15:19:31 -07:00
Jun Wu
1b27600778 dag: impl IdConvert and PrefixLookup for AbstractNameDag
Summary: The `delegate!` is updated to support complex `impl`s.

Reviewed By: sfilipco

Differential Revision: D24399518

fbshipit-source-id: b9ba31174472cce4248e9644611cfc207abc3c1d
2020-10-20 15:19:30 -07:00
Jun Wu
0695b17375 dag: abstract some implementations
Reviewed By: sfilipco

Differential Revision: D24399504

fbshipit-source-id: 388b6788fbe7bbb30a34dcb91b4ca488d49ac8af
2020-10-20 15:19:30 -07:00
Jun Wu
558ed0466f dag: Persist::lock takes &mut self
Summary: It'll satisfy a future change.

Reviewed By: sfilipco

Differential Revision: D24399505

fbshipit-source-id: 6ee2cd0d2b4bd20003082e2733423647cb99619b
2020-10-20 15:19:30 -07:00
Jun Wu
f8e3d67631 dag: add TryClone trait
Summary: Will be used as bounds for abstraction.

Reviewed By: sfilipco

Differential Revision: D24399497

fbshipit-source-id: 343be12237d4850fbde9ebbe4034469527bd77fc
2020-10-20 15:19:30 -07:00
Jun Wu
f04e0fa60e namedag: define an abstract NameDag struct
Summary: It's not yet abstract. But one step closer.

Reviewed By: sfilipco

Differential Revision: D24399510

fbshipit-source-id: 32969733babd41b221170ee440f5d7ced1f7490a
2020-10-20 15:19:30 -07:00
Jun Wu
c49caf64b4 dag: drop snapshot_map from NameDag
Summary: The `snapshot` field can be used instead.

Reviewed By: sfilipco

Differential Revision: D24399507

fbshipit-source-id: 67de20d897b8b763f724f3ccbd46618dec7911b9
2020-10-20 15:19:30 -07:00
Jun Wu
465ac6c5df dag: drop IdMapEq trait
Summary:
The trait requires an `IdMap` snapshot to be locally ready. That's not easy for
all possible implementations. Drop it to simplify things.

Reviewed By: sfilipco

Differential Revision: D24399501

fbshipit-source-id: 4d85f77c99208cda30b2a543a0bb5b295f49a65c
2020-10-20 15:19:30 -07:00
Jun Wu
5f00c7984b dag: unify prepare_filesystem_sync
Summary: There were 2 prepare_filesystem_sync. Unify them into one implementation.

Reviewed By: sfilipco

Differential Revision: D24399513

fbshipit-source-id: 80d009c33b7f23dc2c4225da6fd0fb09589ba061
2020-10-20 15:19:30 -07:00
Jun Wu
4a7ffc4bd7 dag: use Locked for SyncableIdMap
Summary: Simplifies some code.

Reviewed By: sfilipco

Differential Revision: D24399500

fbshipit-source-id: a1317149da066617c4060b7efdae5234e5bd7262
2020-10-20 15:19:30 -07:00
Jun Wu
282c6800cb dag: use Locked for SyncableIdDag
Reviewed By: sfilipco

Differential Revision: D24399506

fbshipit-source-id: 91cfa176b8cfeca3f96dfeb211bf9d46a3d95bd5
2020-10-20 15:19:30 -07:00
Jun Wu
5c7b169e0e dag: add Locked type
Summary: A more general-purpose type for Syncable{IdDag,IdMap}.

Reviewed By: sfilipco

Differential Revision: D24399502

fbshipit-source-id: 0599db6dd07fe3d430458f86a33a9144d850fca1
2020-10-20 15:19:29 -07:00
Jun Wu
77790e9f49 dag: move non-master write APIs to IdMapWrite trait
Summary: This makes it more generic.

Reviewed By: sfilipco

Differential Revision: D24399493

fbshipit-source-id: 8a1d0a13dd29989b17fe3ef1497b10b6fa0629d6
2020-10-20 15:19:29 -07:00
Jun Wu
b249950984 dag: impl Persist on IdMaps
Summary: IdMap fits the Persist trait.

Reviewed By: sfilipco

Differential Revision: D24399494

fbshipit-source-id: 97b84d155f4b9bb3006bfad116defa4fca6330d6
2020-10-20 15:19:29 -07:00
Jun Wu
625b8ab4d5 dag: move IdMap impls to separate files
Summary: Similar to IdDag change, move impls to separate files.

Reviewed By: sfilipco

Differential Revision: D24399508

fbshipit-source-id: 575b6e7194677b67b6755b0a30ae7d014d498b10
2020-10-20 15:19:29 -07:00
Jun Wu
34aa41f24f dag: iddagstore::GetLock -> ops::Persist
Summary:
The lock, reload, mutate, persist pattern is general. It can be used for IdMap
too.
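
The lock, reload, mutate, persist pattern can be sketched roughly as below.
This is an illustration under assumptions, not the real `ops::Persist` trait:
the method signatures, the `String` error type, and the `ToyStore`, `ToyLock`
and `update` names are hypothetical. Passing `&Self::Lock` to `reload` and
`persist` acts as lightweight evidence that the caller holds a lock.

```rust
// Hypothetical shape of a lock / reload / persist abstraction.
pub trait Persist {
    type Lock;
    fn lock(&mut self) -> Result<Self::Lock, String>;
    // Replace in-memory state with the persisted state.
    fn reload(&mut self, lock: &Self::Lock) -> Result<(), String>;
    // Write in-memory state out to the persisted state.
    fn persist(&mut self, lock: &Self::Lock) -> Result<(), String>;
}

// Token type; holding a value of it stands for "the lock is held".
pub struct ToyLock;

// Toy store where a second vector plays the role of the filesystem.
pub struct ToyStore {
    pub disk: Vec<u32>,
    pub memory: Vec<u32>,
}

impl Persist for ToyStore {
    type Lock = ToyLock;

    fn lock(&mut self) -> Result<ToyLock, String> {
        Ok(ToyLock)
    }

    fn reload(&mut self, _lock: &ToyLock) -> Result<(), String> {
        self.memory = self.disk.clone();
        Ok(())
    }

    fn persist(&mut self, _lock: &ToyLock) -> Result<(), String> {
        self.disk = self.memory.clone();
        Ok(())
    }
}

// The general pattern: lock, reload, mutate, persist.
pub fn update<S: Persist>(
    store: &mut S,
    mutate: impl FnOnce(&mut S),
) -> Result<(), String> {
    let lock = store.lock()?;
    store.reload(&lock)?;
    mutate(store);
    store.persist(&lock)
}
```

Because `update` is generic over `Persist`, the same helper would work for any
store type that implements the trait, which is the point of generalizing it.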

Reviewed By: sfilipco

Differential Revision: D24399512

fbshipit-source-id: d25e51ba735061ca101101d75aff95deb88b1d36
2020-10-20 15:19:29 -07:00
Jun Wu
f01dca1995 dag: drop build_segments_persistent APIs
Summary:
Now `build_segments_persistent` and `build_segments_volatile` are the same.
Just keep one of them.

Reviewed By: sfilipco

Differential Revision: D24399511

fbshipit-source-id: a9f1ac920cdf5b448bd99bf9b6d4ca4160ba0304
2020-10-20 15:19:29 -07:00
Jun Wu
5fad63b010 dag: drop last high level segment by default
Summary:
Previously, we kept the last high level segment per level in memory, and
dropped it on disk. When we crossed the memory / disk boundary, we had to
maintain such properties carefully. That was needed because some DAG
algorithms relied on complete high level segments.

Now that no DAG algorithms depend on such properties, let's just drop
the logic adding the last segment back to simplify the code.

This removes the need to build segments after open() and sync().

Reviewed By: sfilipco

Differential Revision: D24399515

fbshipit-source-id: 4c640d9aa03c050fcd97f70ee386e32d3a8ee26d
2020-10-20 15:19:29 -07:00
Jun Wu
496724e45e dag: make children work with missing high-level segments
Summary:
This makes the algorithm a bit more robust. Now none of the DAG algorithms
depend on high-level segments being complete and covering all low-level
segments.

This also removes constraints. For example, SyncableIdDag can now just
deref() to the normal IdDag for queries without worrying about correctness.

Reviewed By: sfilipco

Differential Revision: D24399503

fbshipit-source-id: e6a91010cff82264cf423e2f24dee1d372822ef6
2020-10-20 15:19:29 -07:00
Jun Wu
f290afe421 dag: remove {range,descendants}_old algorithms
Summary:
They depend on high-level segments covering low-level segments, which
adds extra complexities. Remove them to simplify logic.

Reviewed By: sfilipco

Differential Revision: D24399509

fbshipit-source-id: 56a8e06c263107d1da4d6754b884ce51e18e30bf
2020-10-20 15:19:29 -07:00
Jun Wu
9ed54f1b94 dag: replace 2 panics with non-panic errors
Summary: The panics can happen when the input sets are out of range.

Reviewed By: kulshrax

Differential Revision: D24191789

fbshipit-source-id: efbcbd7f6f69bd262aa979afa4f44acf9681d11e
2020-10-08 13:22:10 -07:00
Stefan Filip
6e2ec8b1ca dag: add serde derives to IdDag and InProcessStore
Summary:
Some sort of serialization for the Dag is useful for saving the IdDag produced
by offline jobs and loading it when a Mononoke server starts.

Reviewed By: quark-zju

Differential Revision: D24096964

fbshipit-source-id: 5fac40f9c10a5815fbf5dc5e2d9855cd7ec88973
2020-10-08 09:43:46 -07:00
David Tolnay
e83e05ff25 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591028

fbshipit-source-id: f458503fc2b9c25023fa1643eca5e166882a4811
2020-09-09 07:52:34 -07:00
David Tolnay
e62b176170 Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: zertosh

Differential Revision: D23568779

fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
2020-09-07 20:47:59 -07:00
Jun Wu
e74133f0fa dag: limit max segment level to 4
Summary:
Based on fbsource data, building level 5 proves to be not useful.

This would save 300ms in the write path.

Reviewed By: sfilipco

Differential Revision: D23494505

fbshipit-source-id: ca795b4900af40dbfdaa463d36f3169413bf6a62
2020-09-04 12:20:54 -07:00
Jun Wu
b4adf0602f dag: remove non-master "Name -> Id" index on request
Summary:
Previously the IdMap's "Name -> Id" index simply ignored the "reassign
non-master" request. It turns out stale entries in that index can cause
issues as demonstrated by the previous diff.

Update IdMap to actually remove both indexes of non-master group on
remove_non_master so it cannot have stale entries.

To optimize the index, the format of IdMap is changed from:

  [ 8 bytes Id (Big Endian) ] [ Name ]

to:

  [ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]

So the index can use reference to the slice, instead of embedding the bytes, to
reduce index size.
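
The new entry layout can be illustrated with the following sketch. The helper
names `encode_entry` and `decode_entry` are hypothetical and are not the
actual IdMap code; only the byte layout (8-byte big-endian Id, 1-byte Group,
then Name) is taken from the description above.

```rust
// Build one entry: [ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ].
pub fn encode_entry(id: u64, group: u8, name: &[u8]) -> Vec<u8> {
    let mut entry = Vec::with_capacity(8 + 1 + name.len());
    entry.extend_from_slice(&id.to_be_bytes());
    entry.push(group);
    entry.extend_from_slice(name);
    entry
}

// Split an entry back into (id, group, name). The name is returned as a
// slice into the entry, so nothing is copied. Returns None if the entry
// is too short to hold the fixed-size header.
pub fn decode_entry(entry: &[u8]) -> Option<(u64, u8, &[u8])> {
    if entry.len() < 9 {
        return None;
    }
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(&entry[..8]);
    Some((u64::from_be_bytes(id_bytes), entry[8], &entry[9..]))
}
```

With the group stored next to the name, a contiguous slice of the entry
(`entry[8..]` in this sketch) covers both fields, which is presumably what
lets an index key refer to the entry bytes instead of embedding a copy.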

The filesystem directory name for IdMap used by NameDag is bumped to `idmap2`
so it won't read the incompatible old `idmap` data.

Reviewed By: sfilipco

Differential Revision: D23494508

fbshipit-source-id: 3cb7782577750ba5bd13515b370f787519ed3894
2020-09-04 12:20:53 -07:00
Jun Wu
c5d6c9d0f2 dag: add a test showing non-master rebuild issues
Summary: Some vertexes can disappear from the graph!

Reviewed By: sfilipco

Differential Revision: D23494506

fbshipit-source-id: ecbf2a4169e5fc82596e89a4bfe4c442a82e9cd2
2020-09-04 12:20:53 -07:00
Jun Wu
4aea3657e1 dag: move some test utilities to a TestDag struct
Summary: The TestDag struct will be used to do some more complicated tests.

Reviewed By: sfilipco

Differential Revision: D23494507

fbshipit-source-id: 11350f9e448725ae49f50a7b6f19efc57ad84448
2020-09-04 12:20:53 -07:00
Jun Wu
cea2bf8728 dag: limit segment level at open time
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.

This turns out to save 300ms in profiling result.

Reviewed By: sfilipco

Differential Revision: D23494509

fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
2020-09-03 13:48:43 -07:00
Jun Wu
99511f8743 dag: benchmark dag_ops on different IdDagStores
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:

  benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
  building segments (old)                           856.803 ms
  building segments (new)                           127.831 ms
  ancestors                                          54.288 ms
  children (spans)                                  619.966 ms
  children (1 id)                                    12.596 ms
  common_ancestors (spans)                            3.050 s
  descendants (small subset)                         35.652 ms
  gca_one (2 ids)                                   164.296 ms
  gca_one (spans)                                     3.132 s
  gca_all (2 ids)                                   270.542 ms
  gca_all (spans)                                     2.817 s
  heads                                             247.504 ms
  heads_ancestors                                    40.106 ms
  is_ancestor                                       108.719 ms
  parents                                           243.317 ms
  parent_ids                                         10.752 ms
  range (2 ids)                                       7.370 ms
  range (spans)                                      23.933 ms
  roots                                             620.150 ms

  benchmarking dag::iddagstore::in_process_store::InProcessStore
  building segments (old)                           790.429 ms
  building segments (new)                            55.007 ms
  ancestors                                           8.618 ms
  children (spans)                                  196.562 ms
  children (1 id)                                     2.488 ms
  common_ancestors (spans)                          545.344 ms
  descendants (small subset)                          8.093 ms
  gca_one (2 ids)                                    24.569 ms
  gca_one (spans)                                   529.080 ms
  gca_all (2 ids)                                    38.462 ms
  gca_all (spans)                                   540.486 ms
  heads                                             103.930 ms
  heads_ancestors                                     6.763 ms
  is_ancestor                                        16.208 ms
  parents                                           103.889 ms
  parent_ids                                          0.822 ms
  range (2 ids)                                       1.748 ms
  range (spans)                                       6.157 ms
  roots                                             197.924 ms

  benchmarking dag::iddagstore::bytes_store::BytesStore
  building segments (old)                           724.467 ms
  building segments (new)                            90.207 ms
  ancestors                                          23.812 ms
  children (spans)                                  348.237 ms
  children (1 id)                                     4.609 ms
  common_ancestors (spans)                            1.315 s
  descendants (small subset)                         20.819 ms
  gca_one (2 ids)                                    72.423 ms
  gca_one (spans)                                     1.346 s
  gca_all (2 ids)                                   116.025 ms
  gca_all (spans)                                     1.470 s
  heads                                             155.667 ms
  heads_ancestors                                    19.486 ms
  is_ancestor                                        51.529 ms
  parents                                           157.285 ms
  parent_ids                                          5.427 ms
  range (2 ids)                                       4.448 ms
  range (spans)                                      13.874 ms
  roots                                             365.568 ms

Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).
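
A simplified sketch of a `Vec<BTreeMap<Id, StoreId>>` level-head index follows.
The `LevelHeadIndex` type, its methods, and the `Id` / `StoreId` aliases are
hypothetical stand-ins, not the real `InProcessStore` code. The outer `Vec`
gives constant-time access by level, and the inner `BTreeMap` gives ordered
lookup by segment head.

```rust
use std::collections::BTreeMap;

type Id = u64;
type StoreId = usize;

// levels[level] maps a segment's head Id to its position in the store.
#[derive(Default)]
pub struct LevelHeadIndex {
    levels: Vec<BTreeMap<Id, StoreId>>,
}

impl LevelHeadIndex {
    // Record a segment, growing the per-level vector if needed.
    pub fn insert(&mut self, level: usize, head: Id, store_id: StoreId) {
        if self.levels.len() <= level {
            self.levels.resize_with(level + 1, BTreeMap::new);
        }
        self.levels[level].insert(head, store_id);
    }

    // Find the first segment at `level` whose head is >= `id`. A caller
    // would still need to check that the segment's low end is <= `id`.
    pub fn first_with_head_at_least(&self, level: usize, id: Id) -> Option<StoreId> {
        self.levels
            .get(level)?
            .range(id..)
            .next()
            .map(|(_head, &store_id)| store_id)
    }
}
```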

Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.

Reviewed By: sfilipco

Differential Revision: D23438174

fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
2020-09-02 18:54:12 -07:00
Jun Wu
84ad7a5351 dag: implement GetLock for all IdDagStores
Summary: This allows them to use the SyncableIdDag APIs.

Reviewed By: sfilipco

Differential Revision: D23438170

fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
2020-09-02 18:54:12 -07:00
Jun Wu
cfff0e9144 dag: make IdDag::prepare_filesystem_sync generic
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.

Reviewed By: sfilipco

Differential Revision: D23438180

fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
2020-09-02 18:54:12 -07:00
Jun Wu
8874e07f9b dag: IdDagStore::reload -> GetLock::reload
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock.  So
let's move the API there, and require a lock.

Reviewed By: sfilipco

Differential Revision: D23438169

fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
2020-09-02 18:54:12 -07:00
Jun Wu
d633576880 dag: remove NameDag::reload
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.

Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.

This also solves an issue where `NameDag` could not recover properly if its
`flush` failed. With this change, the old `NameDag` state is not lost on failure.

After removing `NameDag::reload`, `IdMap::reload` is no longer used publicly
and was made private.

Reviewed By: sfilipco

Differential Revision: D23438179

fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
2020-09-02 18:54:11 -07:00
Jun Wu
8e16e4260f dag: IdDagStore::sync -> GetLock::persist
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from lock. Rename to `persist`
to clarify it's memory -> disk. Besides, it requires a reference to a lock
object as lightweight proof that some lock is held.
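
The lock-as-proof idea can be sketched as follows. This is a minimal
illustration, not the actual `GetLock` trait: the names `Lock`, `Persist` and
`MemStore` are hypothetical, and the "disk" is just another vector.

```rust
/// Hypothetical lock guard. Holding a value of this type means a lock is held.
pub struct Lock {
    _private: (),
}

/// Hypothetical trait: `persist` can only be called with a `&Lock` in hand.
pub trait Persist {
    fn lock(&self) -> Lock;
    /// Write in-memory state out (memory -> disk). The `_lock` argument is
    /// not used for I/O; it only shows at compile time that a lock exists.
    /// Returns the number of bytes written.
    fn persist(&mut self, _lock: &Lock) -> usize;
}

/// Toy store: `pending` is the in-memory state, `disk` stands in for a file.
#[derive(Default)]
pub struct MemStore {
    pub pending: Vec<u8>,
    pub disk: Vec<u8>,
}

impl Persist for MemStore {
    fn lock(&self) -> Lock {
        Lock { _private: () }
    }

    fn persist(&mut self, _lock: &Lock) -> usize {
        let written = self.pending.len();
        // Move everything from memory to "disk"; `pending` becomes empty.
        self.disk.append(&mut self.pending);
        written
    }
}
```

A caller first obtains the lock with `store.lock()` and then passes a
reference to `store.persist(&lock)`; code that never took a lock has no
`&Lock` to pass.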

Reviewed By: sfilipco

Differential Revision: D23438175

fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
2020-09-02 18:54:11 -07:00
Jun Wu
3ad58ff945 dag: make SyncableIdMap use &mut IdMap instead of IdMap
Summary:
This removes the need of cloning `IdMap`.

SyncableIdMap is a bit tricky. I added some comments to clarify things.

Reviewed By: sfilipco

Differential Revision: D23438176

fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
2020-09-02 18:54:11 -07:00
Jun Wu
23f9bec22b dag: move IdDagStore impls to separate files
Summary: This makes `iddagstore.rs` cleaner.

Reviewed By: sfilipco

Differential Revision: D23438177

fbshipit-source-id: 465cec2231a084a36b20da8e413cb9272f64a00a
2020-09-02 18:54:10 -07:00
Jun Wu
4e9200db44 dag: test IndexedLogIdDagStore
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.

Reviewed By: sfilipco

Differential Revision: D23438173

fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
2020-09-02 18:54:10 -07:00
Jun Wu
a0223bc7e7 dag: make iddagstore test generic
Summary: Make it possible to test other IdDagStores.

Reviewed By: sfilipco

Differential Revision: D23438178

fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
2020-09-01 23:58:04 -07:00
Jun Wu
211739f00c dag: remove SpanSetAsc
Summary:
Now that SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.

Reviewed By: sfilipco

Differential Revision: D23385246

fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
2020-09-01 21:02:08 -07:00
Jun Wu
64bdf70811 dag: add SpanSet::intersection_span_min
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.

Reviewed By: sfilipco

Differential Revision: D23385250

fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
2020-09-01 21:02:08 -07:00
Jun Wu
16eaceafe9 dag: use VecDeque for SpanSet
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.

Reviewed By: sfilipco

Differential Revision: D23385249

fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
2020-09-01 20:53:32 -07:00
Jun Wu
71f101054a dag: implement binary_search_by for VecDeque
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front and deprecates SpanSetAsc (which uses Id in a bit hacky
way - they are not real Ids).
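
A binary search over a `VecDeque` can be sketched with plain index access.
This is an illustrative free function under that assumption, not the code from
this change. Newer Rust standard libraries also provide
`VecDeque::binary_search_by` directly.

```rust
use std::cmp::Ordering;
use std::collections::VecDeque;

/// Binary search over a sorted `VecDeque`, mirroring `slice::binary_search_by`.
/// Returns `Ok(index)` on a match, or `Err(insert_position)` otherwise.
pub fn binary_search_by<T, F>(deque: &VecDeque<T>, mut f: F) -> Result<usize, usize>
where
    F: FnMut(&T) -> Ordering,
{
    let mut low = 0;
    let mut high = deque.len();
    while low < high {
        let mid = low + (high - low) / 2;
        match f(&deque[mid]) {
            // The element at `mid` is too small: search the upper half.
            Ordering::Less => low = mid + 1,
            // The element at `mid` is too large: search the lower half.
            Ordering::Greater => high = mid,
            Ordering::Equal => return Ok(mid),
        }
    }
    Err(low)
}
```

Indexing a `VecDeque` is O(1), so the search stays O(log n) even after
`push_front` has moved the logical start of the ring buffer.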

Reviewed By: sfilipco

Differential Revision: D23385245

fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
2020-09-01 20:38:29 -07:00
Jun Wu
2d02d3b0f7 dag: validate SpanSet order and no mergeable adjacent spans
Summary:
Previously, the `is_valid()` function only checked ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
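
The three checks can be sketched as below. This is a simplified illustration:
the `Span` struct and the `validate` name are hypothetical, and spans are
assumed to be sorted in ascending order (the real `SpanSet` may order them
differently).

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span {
    pub low: u64,
    pub high: u64,
}

/// Panic with a descriptive message if `spans` breaks an invariant.
/// Asserting directly (instead of returning `bool`) reports which spans failed.
pub fn validate(spans: &[Span]) {
    for (i, span) in spans.iter().enumerate() {
        assert!(span.low <= span.high, "span {:?} has low > high", span);
        if i > 0 {
            let prev = spans[i - 1];
            assert!(
                prev.high < span.low,
                "{:?} and {:?} are out of order or overlap",
                prev,
                span
            );
            assert!(
                prev.high + 1 < span.low,
                "{:?} and {:?} are adjacent and should be merged",
                prev,
                span
            );
        }
    }
}
```

With this sketch, `[1..=2, 4..=4]` passes, while `[1..=2, 3..=4]` panics
because the two spans could be merged into `1..=4`.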

Reviewed By: sfilipco

Differential Revision: D23385247

fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
2020-09-01 20:27:03 -07:00
Jun Wu
4bf5817dad dag: always merge adjacent spans in SpanSet
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too.  This simplifies
the `contains` logic and could make some Sets more efficient.
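
Merging while building from sorted input can be sketched like this. The `Span`
struct and the ascending input order are assumptions made for the example; it
is not the actual `SpanSet::from_sorted_spans` code.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span {
    pub low: u64,
    pub high: u64,
}

/// Build a normalized span list from spans sorted by `low` (ascending).
/// Adjacent or overlapping spans are merged into one.
pub fn from_sorted_spans(spans: &[Span]) -> Vec<Span> {
    let mut result: Vec<Span> = Vec::with_capacity(spans.len());
    for &span in spans {
        if let Some(last) = result.last_mut() {
            // Adjacent (`last.high + 1 == span.low`) or overlapping: extend.
            if span.low <= last.high.saturating_add(1) {
                last.high = last.high.max(span.high);
                continue;
            }
        }
        result.push(span);
    }
    result
}
```

Because no two stored spans can touch, a `contains` check only needs to find
the single span that could hold the id.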

Reviewed By: sfilipco

Differential Revision: D23385248

fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
2020-09-01 20:04:12 -07:00
Jun Wu
9f4dac104f dag: truncate output in <SpanSet as Debug>::fmt
Summary: Set a default limit so the output won't be too long.

Reviewed By: DurhamG

Differential Revision: D23307792

fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
2020-08-27 18:14:29 -07:00
Jun Wu
85b3cea8ee dag: define delegate macro for other main traits
Summary: Will be used to simplify code.

Reviewed By: sfilipco

Differential Revision: D23269859

fbshipit-source-id: bed0c4dca075ff60900025642af1d84bdd03452d
2020-08-26 15:32:26 -07:00
Jun Wu
6b3096c7a4 dag: avoid other 'impl<T> Trait for T' usecases
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for IdConvert and PrefixLookup.
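
One way to avoid a blanket impl is to implement the trait per concrete type
and let a macro generate the forwarding code. The sketch below is
illustrative only: the trait is reduced to one method, and `MapStore`,
`Wrapper` and `delegate_id_convert!` are hypothetical names.

```rust
pub trait IdConvert {
    fn vertex_id(&self, name: &str) -> Option<u64>;
}

// A blanket `impl<T: Storage> IdConvert for T` would conflict with a second
// blanket impl such as `impl<Q: Other> IdConvert for Q`, because the compiler
// cannot prove that no type implements both `Storage` and `Other`.
// Generating one impl per concrete type avoids the conflict.
macro_rules! delegate_id_convert {
    ($type:ty => self.$field:ident) => {
        impl IdConvert for $type {
            fn vertex_id(&self, name: &str) -> Option<u64> {
                self.$field.vertex_id(name)
            }
        }
    };
}

/// Toy backend: the id of a name is its position in the list.
pub struct MapStore {
    pub names: Vec<String>,
}

impl IdConvert for MapStore {
    fn vertex_id(&self, name: &str) -> Option<u64> {
        self.names
            .iter()
            .position(|n| n.as_str() == name)
            .map(|i| i as u64)
    }
}

/// Toy wrapper that forwards `IdConvert` to its `inner` field.
pub struct Wrapper {
    pub inner: MapStore,
}

delegate_id_convert!(Wrapper => self.inner);
```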

Reviewed By: sfilipco

Differential Revision: D23269861

fbshipit-source-id: a837f3984ff4e1bd5a3983dd1642b9f064f51a36
2020-08-26 15:32:25 -07:00
Jun Wu
4a2ee4c522 dag: avoid impl<T> DagAlgorithm for T
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for DagAlgorithm.

Reviewed By: sfilipco

Differential Revision: D23269860

fbshipit-source-id: 031e75e9bf1f1eec2b9e8f36220ef8b817a143a5
2020-08-26 15:32:25 -07:00
Jun Wu
846768fb53 dag: drop LowLevelAccess
Summary: LowLevelAccess is a subset of NameDagStorage. Use the latter instead.

Reviewed By: sfilipco

Differential Revision: D23269865

fbshipit-source-id: 81ebb1e986d8b02c968a9a237ad9a97d4afd54bf
2020-08-26 15:32:25 -07:00
Jun Wu
f4021486ab dag: move beautify to default_impl
Summary: This makes `ops.rs` look simpler.

Reviewed By: sfilipco

Differential Revision: D23269863

fbshipit-source-id: ddb55ab8eb3b2d3e7c4b2ccbc2252395d62317a1
2020-08-26 15:32:25 -07:00
Jun Wu
bb461d2240 dag: improve range calculation in repos with many heads
Summary:
If there are too many heads, the current `descendants` algorithm would visit
all "old" heads. For example, with this graph:

      head9999  (N9999)
     /
    Z (master)
    :
    : (many heads)
    :/
    : head2 (N2)
    :/
    C head1 (N1)
    |/
    B head0 (N0)
    |/
    A

`A::head9999` or `Z::head9999` will visit N0, N1, ..., N9999, because
`descendands_up_to` is provided with `max_id = N9999` and Z as a vertex in the
master group, is before N0 in non-master.  The current algorithm also means
`descendands_up_to` gets linearly slower as the user uses the repo more, which
is quite undesirable.

This diff changes `descendands_up_to` to take an `ancestors` set, which is
`::head9999` in this case, and iterate non-master flat segments in it. So it
will skip N0 to N9998 directly by finding the N9999 flat segment and only use
it. The number of heads will have a smaller impact on performance.

Another source of slowness is `draft::draft_heads`: if there are too many
`draft_heads`, the internal calculation of `::draft_heads` can be slow.
Optimize it by limiting `draft_heads` to `draft:`. In practice this affects the
`y::` revset, because `y::` is translated to `y::visible_heads` and
`visible_heads` can be large.

`cargo bench --bench dag_ops -- '::-master'` shows significant difference:

Before:

  range (master::draft)                              18.112 s
  range (recent_draft::drafts)                        2.594 s

After:

  range (master::draft)                              72.542 ms
  range (recent_draft::drafts)                       14.932 ms

In my fbsource checkout there were 20k+ heads. The improvement of
`master::recent_draft` (`x::y`) is pretty visible, and `y::` is also improved:

    % lhg debugbenchmarkrevsets -m -x 'p1(min(7e8c86ae % master))' -Y 'draft() & 7e8c86ae' -e 'x::y' -e 'y::' --no-default
    # x:  168f5228e570fb6b2ff7f851bd82413102748d84  (p1(min(7e8c86ae % master)))
    # y:  7e8c86aec68ebc6e0b8254afcb381315991fd21c  (draft() & 7e8c86ae)

    # before
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |     17ms |  0.1ms |      0.5ms |
    | y::              |    3.3ms |  0.7ms |      1.3ms |

    # after
    | revset \ backend | segments | revlog | revlog-cpy |
    |------------------|----------|--------|------------|
    | x::y             |    0.2ms |  0.1ms |      0.6ms |
    | y::              |    1.0ms |  0.7ms |      1.3ms |

Reviewed By: sfilipco

Differential Revision: D23214387

fbshipit-source-id: 4d11db84cd28f4e04e8b991cbc650c9d5781fd27
2020-08-26 15:32:25 -07:00
Jun Wu
a3cbda76bb dag: add a benchmark for x::y with lots of non-master heads
Summary:
The benchmarks do not exercise a graph with lots of non-master heads.
Add one, since such graphs occur in practice.  This will be used by the next change.

Reviewed By: sfilipco

Differential Revision: D23259879

fbshipit-source-id: 7fe290d14403e42e6d135bde56e2d5c8519ae530
2020-08-26 15:32:24 -07:00
Jun Wu
89570e223a dag: use non-master group in fuzz test
Summary:
Currently the fuzz test only uses the master group. Let it exercise non-master
group too.

Reviewed By: DurhamG

Differential Revision: D23214388

fbshipit-source-id: 7108a1055fbdda2b012f93c5948fb83ef3b9a96f
2020-08-26 15:32:24 -07:00
Jun Wu
9666dab916 dag: implement Debug for NameDag
Summary:
Provide a way to print out all segments with resolved names. This will be used
in a debug command.

Reviewed By: sfilipco

Differential Revision: D23196410

fbshipit-source-id: 1712bfda0271aa548699fe4a6b8603c5ec07af7f
2020-08-26 15:32:23 -07:00
Jun Wu
5829fc4e20 dag: children(small set) has a fast path
Summary:
Use the parent-child index to answer children query quickly.

`cargo bench --bench dag_ops -- children`:

Before:

  children (spans)                                  606.076 ms
  children (1 id)                                   124.105 ms

After:

  children (spans)                                  602.999 ms
  children (1 id)                                    10.777 ms
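
The difference between the two paths can be sketched with a toy DAG. This is
an assumption-laden illustration (`ToyDag` and its fields are hypothetical),
not the segment-based implementation:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy DAG keeping both directions of the parent relation.
#[derive(Default)]
pub struct ToyDag {
    /// child id -> parent ids
    pub parents: BTreeMap<u64, Vec<u64>>,
    /// parent id -> child ids (the index used by the fast path)
    pub children_index: BTreeMap<u64, BTreeSet<u64>>,
}

impl ToyDag {
    pub fn add(&mut self, id: u64, parents: &[u64]) {
        for &p in parents {
            self.children_index.entry(p).or_default().insert(id);
        }
        self.parents.insert(id, parents.to_vec());
    }

    /// Slow path: scan every vertex and test its parents.
    pub fn children_scan(&self, set: &BTreeSet<u64>) -> BTreeSet<u64> {
        self.parents
            .iter()
            .filter(|(_, ps)| ps.iter().any(|p| set.contains(p)))
            .map(|(&id, _)| id)
            .collect()
    }

    /// Fast path: one index lookup per input id.
    pub fn children_indexed(&self, set: &BTreeSet<u64>) -> BTreeSet<u64> {
        set.iter()
            .filter_map(|id| self.children_index.get(id))
            .flatten()
            .copied()
            .collect()
    }
}
```

For a one-element input the indexed version touches only that vertex's
children, which matches the shape of the `children (1 id)` improvement above.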

Reviewed By: sfilipco

Differential Revision: D23196411

fbshipit-source-id: 37195d5ccaa582d35314e0000352ef477287d38c
2020-08-26 15:32:23 -07:00
Jun Wu
a5a396027d dag: expose API to lookup children by parent
Summary: This will be used to optimize "children(single vertex)" query.

Reviewed By: sfilipco

Differential Revision: D23196409

fbshipit-source-id: 050c0859faf83b909e3174bb7c7bd6e7725165c0
2020-08-26 15:32:23 -07:00
Jun Wu
bad2ae41ef dag: maintain non-master parent-child indexes
Summary:
Update the parent index to store non-master group too. To make
"remove_non_master" work, the index contains a "child group" prefix that
allows efficient range invalidation.

This will allow answering "children(single vertex)" query more efficiently.

This diff does not expose an API to query the index yet.
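
Why a group prefix makes invalidation cheap can be sketched with an ordered
set of keys. The key layout `(child_group, parent, child)` and all names here
are assumptions for illustration; the on-disk index format is not shown in
this message.

```rust
use std::collections::BTreeSet;

pub const MASTER: u8 = 0;
pub const NON_MASTER: u8 = 1;

/// Index entries are `(child_group, parent, child)`. Because the group comes
/// first, all entries of one group form one contiguous key range.
#[derive(Default)]
pub struct ParentIndex {
    pub entries: BTreeSet<(u8, u64, u64)>,
}

impl ParentIndex {
    pub fn insert(&mut self, child_group: u8, parent: u64, child: u64) {
        self.entries.insert((child_group, parent, child));
    }

    /// Children of `parent`, looked up once per group.
    pub fn children(&self, parent: u64) -> Vec<u64> {
        let mut result = Vec::new();
        for group in [MASTER, NON_MASTER] {
            let range = (group, parent, u64::MIN)..=(group, parent, u64::MAX);
            result.extend(self.entries.range(range).map(|&(_, _, child)| child));
        }
        result
    }

    /// Drop every entry whose child is in the non-master group.
    pub fn remove_non_master(&mut self) {
        // `split_off` keeps keys smaller than `(NON_MASTER, 0, 0)` in
        // `self.entries` and returns the rest, which is discarded here.
        let _removed = self.entries.split_off(&(NON_MASTER, 0, 0));
    }
}
```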

Reviewed By: sfilipco

Differential Revision: D23196406

fbshipit-source-id: 9137da5ffa8306bdafbcabc06b6f0d23f38dcf57
2020-08-26 15:32:23 -07:00
Jun Wu
6c468b7ac0 dag: add benchmark about children(1 id)
Summary:
Practically, the input of `children` is often one vertex instead of a large set.
Add a benchmark for it.

It looks like:

  children (spans)                                  606.076 ms
  children (1 id)                                   124.105 ms

Reviewed By: sfilipco

Differential Revision: D23196407

fbshipit-source-id: 0645b59ac846836fd061386384f6386a57661741
2020-08-26 15:32:23 -07:00
Jun Wu
6f3616a2b8 nameset: make dag and idmap immutable in hints
Summary: They can be figured out at Hints initialization time. So they don't need to be mutable.

Reviewed By: sfilipco

Differential Revision: D23182518

fbshipit-source-id: 133375fdf27a2546a50b63fb130534acdadc5938
2020-08-26 15:32:22 -07:00
Jun Wu
682365f14d nameset: make Id{Static,Lazy}Set require Dag on construction
Summary:
Both IdStaticSet and IdLazySet require both Dag and IdMap to construct.
This is step 1 towards making Dag and IdMap immutable in hints.

A misspelling of "lhs" vs "hints" in the union set was discovered by the change
and fixed.

Reviewed By: sfilipco

Differential Revision: D23182520

fbshipit-source-id: 3d052de4b8681d3672ebc45d953d1e784f64b2a4
2020-08-26 15:32:22 -07:00
Jun Wu
3ba655abf3 dag: add DummyDag for testing
Summary:
It will be used in places (ex. tests) where a Dag is required but constructing
a real Dag is troublesome.

Reviewed By: sfilipco

Differential Revision: D23182517

fbshipit-source-id: 736911365778e5071c1e0b9615090a4e960392a0
2020-08-26 15:32:22 -07:00
Jun Wu
bd7769b34a dag: rename snapshot_dag to dag_snapshot
Summary: This is more consistent with `id_map_snapshot`.

Reviewed By: sfilipco

Differential Revision: D23182519

fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
2020-08-26 15:32:22 -07:00
Jun Wu
4d798c39d9 dag: add new range algorithm
Summary:
Similar to descendants, the new range algorithm avoids potentially expensive
checks about whether high-level segments can be used or not. Practically this
is overall an improvement.

`cargo bench --bench dag_ops -- range`:

Before:

  range (2 ids)                                     115.380 ms
  range (spans)                                     243.666 ms

After:

  range (2 ids)                                     123.274 ms
  range (spans)                                      23.101 ms

It is 100x faster with the range x::y benchmark added later on `git.git`.

Reviewed By: sfilipco

Differential Revision: D23106175

fbshipit-source-id: 691e0418ba2b7ad9f52ac15b5cd6088ec28d5f48
2020-08-26 15:32:22 -07:00
Jun Wu
c2e03b9129 dag: add new descendants algorithm
Summary:
The old algorithm tries to make use of high-level segments.
However, the code that tests whether a high-level segment can be used is
often too expensive for the benefit. In practice, high-level segments cannot
be used most of the time, so the cost is similar to O(flat segments).

This diff adds a simpler algorithm that just iterates through the flat
segments. It's faster in most practical cases.

`cargo bench --bench dag_ops -- descendants` shows improvements too:

Before:

  descendants (small subset)                        436.515 ms

After:

  descendants (small subset)                         33.460 ms
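
The "just iterate flat segments" idea can be sketched as follows. This is a
simplified model, not the code in this change: `FlatSegment` is a hypothetical
struct, results are plain `(low, high)` pairs, and membership is a linear scan
where a real implementation would use a span set.

```rust
use std::collections::BTreeSet;

/// A flat segment: the linear chain `low..=high`.
/// `parents` are the parents of `low` and always have smaller ids.
pub struct FlatSegment {
    pub low: u64,
    pub high: u64,
    pub parents: Vec<u64>,
}

fn span_contains(spans: &[(u64, u64)], id: u64) -> bool {
    spans.iter().any(|&(low, high)| low <= id && id <= high)
}

/// Descendants of `roots` (roots included) as `(low, high)` spans.
/// `segments` must be sorted by `low` in ascending order.
pub fn descendants(segments: &[FlatSegment], roots: &BTreeSet<u64>) -> Vec<(u64, u64)> {
    let mut result: Vec<(u64, u64)> = Vec::new();
    for seg in segments {
        let start = if seg.parents.iter().any(|&p| span_contains(&result, p)) {
            // A parent is already a descendant: the whole chain is included.
            Some(seg.low)
        } else {
            // Otherwise the chain is included from its first root onwards.
            roots.range(seg.low..=seg.high).next().copied()
        };
        if let Some(start) = start {
            result.push((start, seg.high));
        }
    }
    result
}
```

Each flat segment is visited once and no high-level segment check is needed,
which is the trade-off described above.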

Reviewed By: sfilipco

Differential Revision: D23106174

fbshipit-source-id: e6101483d8539b2b1c881be2ccfd0071f122352f
2020-08-26 15:32:22 -07:00
Jun Wu
e22b816a12 dag: add iddag.iter_segments_ascending API
Summary: This will be used by upcoming changes.

Reviewed By: sfilipco

Differential Revision: D23106177

fbshipit-source-id: 9bf183f7464c06b801be64fd938db0babd544756
2020-08-26 15:32:21 -07:00
Jun Wu
0dcf08e509 dag: add SpanSetAsc struct
Summary: This internal struct will be used by upcoming changes.

Reviewed By: sfilipco

Differential Revision: D23106172

fbshipit-source-id: 6d5b9bc1c810984814d0912100acca38a2565a63
2020-08-26 15:32:21 -07:00
Jun Wu
d7cbb641ff dag: fix fuzz tests
Summary:
The fuzz tests need `TestContext::id_dag()`, which was removed by D20471712 (1fb5acf242).
Restore it so fuzz tests can run. This is mainly to check the new `range`
function.

The `range` fuzz test does find an issue caused by `>` written as `>=`
relatively quickly.

Reviewed By: sfilipco

Differential Revision: D23106176

fbshipit-source-id: e9540cc932503a9d54246d24c70bac829fcb13df
2020-08-21 13:00:45 -07:00
Jun Wu
e5527715b7 gitdag: crate to build segmented dag from git history
Summary:
Read the git commit graph and migrate it to `dag::Dag`.

This allows using Rust dag abstractions on the git
commit graph.

Reviewed By: DurhamG

Differential Revision: D23095471

fbshipit-source-id: 2163701350ce82ce6e97074e56ad5877f3c9c158
2020-08-21 13:00:45 -07:00
Jun Wu
be2d28fb95 dag: fix non-master high-level segments building
Summary:
If there are no new master segments, it's still possible to have new non-master
segments. Fix the loop condition so we don't skip building non-master segments.

Reviewed By: sfilipco

Differential Revision: D23095465

fbshipit-source-id: 46eb9d5b5f2b04241981558646e0bc090652abce
2020-08-21 13:00:45 -07:00
Jun Wu
e11f36e96b dag: test high-level segments building for non-master
Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.

Reviewed By: DurhamG, sfilipco

Differential Revision: D23095466

fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
2020-08-21 13:00:45 -07:00
Jun Wu
23074edd9b dag: add some tracing spans
Summary: This is useful to investigate internals of dag calculations.

Reviewed By: sfilipco

Differential Revision: D23095473

fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
2020-08-21 13:00:45 -07:00
Jun Wu
cd9aa9cb6c dag: improve segment building perf by using precalculated flat segments
Summary:
Follow up of the previous change by actually using the flat segments to build
segments. This significantly improved the perf. `cargo bench --bench dag_ops`
shows:

  building segments (old)                           774.109 ms
  building segments (new)                           143.879 ms

Besides, a `O(N^2)` update to `head_ids` is changed. It improves performance
when the graph has many heads (ex. the mutation graph).

Reviewed By: sfilipco

Differential Revision: D23036080

fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
2020-08-21 13:00:45 -07:00
Jun Wu
9c9ecbc82b dag: make IdMap::assign_head calculate flat segments
Summary:
While testing the `obsolete()` set, I found an in-memory segmented DAG takes
10x longer to build than a HashMap DAG.

Part of the inefficiency is to use a translated "parent_func" that round-trips
through Id and Vertex, used by segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.

This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.

Reviewed By: sfilipco

Differential Revision: D23036060

fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
2020-08-21 13:00:45 -07:00
Jun Wu
0742dc6293 dag: make to_set API bind the dag
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.

The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid stack
overflow of rustc resolving constraints.

Reviewed By: sfilipco

Differential Revision: D23036077

fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
2020-08-21 13:00:45 -07:00
Jun Wu
adf027742e nameset: add flatten API
Summary: This will be useful for the `obsolete()` set.

Reviewed By: sfilipco

Differential Revision: D23036072

fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
2020-08-21 13:00:45 -07:00
Jun Wu
f23b1112f0 nameset: a & b should not use id-based fast path if id map is incompatible
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.

Reviewed By: sfilipco

Differential Revision: D23036068

fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
2020-08-21 13:00:45 -07:00
Jun Wu
c1e596dbd6 nameset: use real id map snapshot instead of a pointer in hints
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.

Reviewed By: sfilipco

Differential Revision: D23036058

fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
2020-08-21 13:00:45 -07:00
Jun Wu
0ac5f05097 nameset: use real dag snapshot instead of a pointer in hints
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).

Reviewed By: sfilipco

Differential Revision: D23036067

fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
2020-08-21 13:00:45 -07:00
Jun Wu
759ceb6212 nameset: do not swap x & y if they come from different graphs
Summary:
If `x` and `y` come from the same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.

Reviewed By: sfilipco

Differential Revision: D23036081

fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
2020-08-21 13:00:45 -07:00
Jun Wu
762603455a nameset: new metaset for separate iter+contains lazy/fast paths
Summary:
For sets like `obsolete()`, `merge()`, they could have a fast "contains" path:
Just check the given commit without calculating a full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.

This will be used for the `obsolete()` set in upcoming changes.
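
The two separate paths can be sketched like this. It is a toy model using
integer ids and boxed closures; the field and method names are hypothetical
and the real `MetaSet` API may look different.

```rust
use std::cell::RefCell;

/// Toy set with two independent code paths.
pub struct MetaSet {
    /// Cheap membership test that does not build the full set.
    contains_fn: Box<dyn Fn(u64) -> bool>,
    /// Computes the full set; only run when iteration is requested.
    evaluate_fn: Box<dyn Fn() -> Vec<u64>>,
    /// Cached result of `evaluate_fn`.
    evaluated: RefCell<Option<Vec<u64>>>,
}

impl MetaSet {
    pub fn new(
        contains_fn: impl Fn(u64) -> bool + 'static,
        evaluate_fn: impl Fn() -> Vec<u64> + 'static,
    ) -> Self {
        MetaSet {
            contains_fn: Box::new(contains_fn),
            evaluate_fn: Box::new(evaluate_fn),
            evaluated: RefCell::new(None),
        }
    }

    /// Fast "contains" path: never triggers a full evaluation.
    pub fn contains(&self, id: u64) -> bool {
        (self.contains_fn)(id)
    }

    /// Iteration path: evaluates once, then serves the cached result.
    pub fn iter(&self) -> Vec<u64> {
        let mut cache = self.evaluated.borrow_mut();
        let ids = cache.get_or_insert_with(|| (self.evaluate_fn)()).clone();
        ids
    }

    pub fn is_evaluated(&self) -> bool {
        self.evaluated.borrow().is_some()
    }
}
```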

Reviewed By: sfilipco

Differential Revision: D23036059

fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
2020-08-21 13:00:45 -07:00
Jun Wu
7d8f4ef92f dag: fix re-assigning master flush
Summary:
Before this change, `flush` with empty changes but `master` moves will cause an
error, because the `parents_func` only contains "pending changes", aka. new
vertexes. The `parents_func` does not know `master` and `master` is needed to
re-assign them from the non-master to the master group.

With the snapshot API, things become easier. We just take a snapshot before
reloading, and use the snapshot to answer parent_names.

Reviewed By: sfilipco

Differential Revision: D22970569

fbshipit-source-id: 99a25857ba98792edff69985c16df118a560ffb0
2020-08-21 13:00:45 -07:00
Jun Wu
f666cb1cf0 dag: add DagAlgorithm::snapshot_dag
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).

Reviewed By: sfilipco

Differential Revision: D22970579

fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
2020-08-21 13:00:45 -07:00
Jun Wu
b8e7828edd dag: add NameDag::snapshot_dag
Summary:
Make it possible to snapshot a Dag. This is useful for cases where another
struct wants access to the Dag without lifetimes. Namely, the LazySet
might want to keep a snapshot of Dag.

Reviewed By: sfilipco

Differential Revision: D22970568

fbshipit-source-id: 508c38d3ffac2ffcd2e682578c3c5e5787ea3bcf
2020-08-21 13:00:45 -07:00
Jun Wu
741d050f10 dag: drop inverse DAG
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.

Reviewed By: sfilipco

Differential Revision: D22970581

fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
2020-08-21 13:00:45 -07:00
Jun Wu
6b64f9a2bf dag: add import_and_flush API
Summary:
This allows importing from other DAGs. It will be used to import revlog DAG to
the new segmented format.

Reviewed By: sfilipco

Differential Revision: D22970572

fbshipit-source-id: 0a183e7b64831574cc9c60d4639124d02d19cf43
2020-08-21 13:00:45 -07:00
Jun Wu
c448e0f575 renderdag: move to dag
Summary:
This allows dag to use renderdag in tests to verify graph result. Previously
it was hard because dag <-> renderdag would form circular dependency.

It also makes it possible to implement more efficient and integrated fast paths
for graph rendering.

Reviewed By: sfilipco

Differential Revision: D22970570

fbshipit-source-id: 526497339bd7aa8898d1af4aa9cf6d2a6797aae0
2020-08-21 13:00:45 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that `dag` itself
is not interested in. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
ff9c979b07 revlogindex: use concrete error types
Summary:
All dependencies of revlogindex have migrated to concrete error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.

Reviewed By: sfilipco

Differential Revision: D22857554

fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
2020-08-06 12:31:57 -07:00
generatedunixname89002005287564
070b9abf48 Daily arc lint --take RUSTFMT
Reviewed By: zertosh

Differential Revision: D22862880

fbshipit-source-id: cc2a30bb5345ffae1a117bb6220d6c2f4d9f73ba
2020-07-31 04:28:59 -07:00
Jun Wu
5f3f7e49d6 dag: add reachable_roots API
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.

Reviewed By: sfilipco

Differential Revision: D22657201

fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
2020-07-30 20:32:37 -07:00
Jun Wu
a2b44103bd dag: add fast path for IdLazySet::contains
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
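
One possible shape of such a fast path, assuming the hint says the lazy
iterator yields ids in descending order (the actual hints used are not listed
in this message, and `LazyDescSet` is a hypothetical name):

```rust
use std::collections::BTreeSet;

/// Lazy set whose iterator is known to yield ids in descending order.
pub struct LazyDescSet<I: Iterator<Item = u64>> {
    iter: I,
    /// Ids pulled from `iter` so far.
    pub visited: BTreeSet<u64>,
    /// Smallest id pulled so far.
    smallest: Option<u64>,
}

impl<I: Iterator<Item = u64>> LazyDescSet<I> {
    pub fn new(iter: I) -> Self {
        LazyDescSet {
            iter,
            visited: BTreeSet::new(),
            smallest: None,
        }
    }

    /// Pull from the iterator only until it has reached or passed `id`.
    /// Anything the iterator yields later is smaller and cannot equal `id`.
    pub fn contains(&mut self, id: u64) -> bool {
        while self.smallest.map_or(true, |s| s > id) {
            match self.iter.next() {
                Some(next) => {
                    self.visited.insert(next);
                    self.smallest = Some(next);
                }
                None => break,
            }
        }
        self.visited.contains(&id)
    }
}
```

With a large lazy set and a `small` set of recent ids, `small & lazy` then
only consumes a short prefix of the lazy iterator.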

Reviewed By: sfilipco

Differential Revision: D22638462

fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
2020-07-30 20:32:32 -07:00
Jun Wu
e3059699ee dag: cross-DAG set operations should use FULL and ANCESTORS hint carefully
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:

  localrepodag.all() & remotedag.all()

should not simply return `localrepodag.all()` or `remotedag.all()`.

Fix it by checking DAG pointers.

A StaticSet might be created without using a DAG, add an optimization
to change `all & static` to `static & all`. So StaticSet without DAG
wouldn't require full DAG scans when intersecting with other sets.
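
The pointer check guarding the FULL fast path can be sketched as below. The
`Dag` and `Set` types are toy stand-ins invented for the example; only the
idea of comparing DAG pointers comes from this change.

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

pub struct Dag {
    pub all_ids: BTreeSet<u64>,
}

#[derive(Clone)]
pub struct Set {
    pub ids: BTreeSet<u64>,
    /// FULL hint: this set contains every vertex of `dag`.
    pub full: bool,
    pub dag: Arc<Dag>,
}

impl Set {
    pub fn all(dag: &Arc<Dag>) -> Set {
        Set {
            ids: dag.all_ids.clone(),
            full: true,
            dag: dag.clone(),
        }
    }

    pub fn intersection(&self, other: &Set) -> Set {
        // Fast path: `x & all()` is `x`, but only when both sets refer to
        // the same DAG. FULL for another DAG says nothing about `x`.
        if other.full && Arc::ptr_eq(&self.dag, &other.dag) {
            return self.clone();
        }
        // Slow path: compute the intersection element by element.
        Set {
            ids: self.ids.intersection(&other.ids).copied().collect(),
            full: false,
            dag: self.dag.clone(),
        }
    }
}
```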

Reviewed By: sfilipco

Differential Revision: D22638454

fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
2020-07-30 20:32:32 -07:00
Jun Wu
34de6956f6 dag: improve fmt::Debug on sets
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
  large set to show, `3` defines hex commit hash length. `#` specifies the
  alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.

Python bindings use `fmt::Debug` as `__repr__` and will be affected.
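
How formatter flags can carry the limit and the hash length is sketched
below. The `HashList` type and the exact output text are made up for the
example; only the meaning of width, precision and `#` follows the list above.

```rust
use std::fmt;

/// Toy set of hex commit hashes.
pub struct HashList(pub Vec<String>);

impl fmt::Debug for HashList {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `{:5.3?}`: width 5 is the item limit, precision 3 is the hex length.
        let limit = f.width().unwrap_or(3); // default limit is an assumption
        let hex_len = f.precision().unwrap_or(8);
        // `#` (alternate form) puts each item on its own line.
        let sep = if f.alternate() { "\n  " } else { " " };
        write!(f, "<static")?;
        for hash in self.0.iter().take(limit) {
            let end = hex_len.min(hash.len());
            write!(f, "{}{}", sep, &hash[..end])?;
        }
        if self.0.len() > limit {
            write!(f, "{}+ {} more", sep, self.0.len() - limit)?;
        }
        if f.alternate() {
            write!(f, "\n>")
        } else {
            write!(f, ">")
        }
    }
}
```

For example, `format!("{:2.4?}", set)` shows at most two hashes, each cut to
four hex digits, followed by a count of the remaining items.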

Reviewed By: sfilipco

Differential Revision: D22638455

fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
2020-07-30 20:32:31 -07:00
Jun Wu
7c2dffb955 revlogindex: optimize set intersection with hints
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.

```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>

In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs

In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```

Reviewed By: sfilipco

Differential Revision: D22638458

fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
2020-07-30 20:32:31 -07:00
Jun Wu
a02c93864f dag: add ANCESTORS hint
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.

This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
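
The effect of the hint on `ancestors` can be sketched with a toy graph. The
`HintedSet` struct and its boolean field are hypothetical simplifications of
the hint flags.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Clone, Debug, PartialEq)]
pub struct HintedSet {
    pub ids: BTreeSet<u64>,
    /// ANCESTORS hint: the set already equals `ancestors(set)`.
    pub is_ancestors: bool,
}

/// `parents` maps a vertex id to its parent ids.
pub fn ancestors(parents: &BTreeMap<u64, Vec<u64>>, set: &HintedSet) -> HintedSet {
    // Fast path: nothing to compute, return the input as-is.
    if set.is_ancestors {
        return set.clone();
    }
    // Slow path: walk parents with an explicit stack.
    let mut seen = BTreeSet::new();
    let mut stack: Vec<u64> = set.ids.iter().copied().collect();
    while let Some(id) = stack.pop() {
        if seen.insert(id) {
            if let Some(ps) = parents.get(&id) {
                stack.extend(ps.iter().copied());
            }
        }
    }
    // The result is closed under "parent of", so the hint can be set.
    HintedSet {
        ids: seen,
        is_ancestors: true,
    }
}
```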

Reviewed By: sfilipco

Differential Revision: D22638463

fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
2020-07-30 20:32:30 -07:00
Jun Wu
a0c5b1b3a5 revlogindex: is_ancestor(x, x) should return true
Summary: This was discovered by using it from Python.

Reviewed By: sfilipco

Differential Revision: D22323186

fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
2020-07-06 15:50:59 -07:00
Jun Wu
cf1bc37007 dag: avoid using > 2 parents in generic DAG tests
Summary: Some DAG implementations do not support it.

Reviewed By: sfilipco

Differential Revision: D22249158

fbshipit-source-id: ebcdf164677ee647ef44aa1ee3cfd318bac658b0
2020-07-06 15:50:59 -07:00
Jun Wu
9a17be7ce0 dag: do not test the order of vertexes in generic tests
Summary:
Different implementations might return different orders. They should be
considered correct.

Reviewed By: sfilipco

Differential Revision: D22249159

fbshipit-source-id: 36e4cadf814366f7ee2ed8a778948ff810760550
2020-07-06 15:50:58 -07:00
Jun Wu
f24dc621cb dag: make part of the tests generic
Summary: This makes it possible to run tests for other DAGs, like the revlog.

Reviewed By: sfilipco

Differential Revision: D22249155

fbshipit-source-id: 205579eeaccd42a21297d965973957168bb8726e
2020-07-06 15:50:58 -07:00
Jun Wu
2bc4dd01ca dag: add a trait to convert IdSet to Set
Summary:
The reverse `to_id_set` exists.
It turns out that the Python land wants this in many places.

Reviewed By: sfilipco

Differential Revision: D22240175

fbshipit-source-id: b6a3a3a3869dc0c521a21b1d86394421b816632b
2020-07-06 15:50:58 -07:00
Jun Wu
07b3d60f80 dag: add "only(x, y)" to DagAlgorithm
Summary:
This provides a way for implementations to optimize the operation.

For segmented changelog, the default implementation is good enough.

For revlog, `only` can have a fast path that does not iterate through the
entire changelog.

A related API `only_both` is added. For revlog it has multiple use-cases,
including narrow-heads phase calculation and revlog.findcommonmissing used by
discovery.
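
A straightforward default for `only(x, y)`, understood here as "ancestors of
`x` that are not ancestors of `y`", can be sketched on a toy parent map. The
function names and the map-based graph are assumptions for the example, not
the trait's real signatures.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps a vertex id to its parent ids.
pub type Parents = BTreeMap<u64, Vec<u64>>;

/// All ancestors of `heads`, including the heads themselves.
pub fn ancestors(parents: &Parents, heads: &[u64]) -> BTreeSet<u64> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<u64> = heads.to_vec();
    while let Some(id) = stack.pop() {
        if seen.insert(id) {
            if let Some(ps) = parents.get(&id) {
                stack.extend(ps.iter().copied());
            }
        }
    }
    seen
}

/// Default `only(x, y)`: ancestors of `x` minus ancestors of `y`.
/// A backend with a cheaper strategy can provide its own version instead.
pub fn only(parents: &Parents, x: &[u64], y: &[u64]) -> BTreeSet<u64> {
    let reachable = ancestors(parents, x);
    let unreachable = ancestors(parents, y);
    reachable.difference(&unreachable).copied().collect()
}
```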

Reviewed By: markbt

Differential Revision: D21944132

fbshipit-source-id: d11660dae85ea6158977eb00d1ceaceddf1d8234
2020-07-06 15:50:57 -07:00
Jun Wu
d745424bf9 dag: add a utility to help break cycles
Summary:
This makes it easier to remove cycles in other places.

There are probably fancier and more efficient algorithms for this.
For now I just wrote one that is easy to verify correctness.

Reviewed By: markbt

Differential Revision: D22174975

fbshipit-source-id: 8a2dc755e4bc0b066eda5f42a51208c92409f2f9
2020-07-02 13:22:34 -07:00
Jun Wu
234147239a dag: add ToIdSet trait
Summary: The trait converts NameSet to IdSet. It'll be used by the revlog index.

Reviewed By: sfilipco

Differential Revision: D21795869

fbshipit-source-id: 55f7a238158442db9d8bdfe84e64438be504f618
2020-06-03 13:26:25 -07:00
Jun Wu
45d6b00593 dag: add InverseDag
Summary: Add a way to invert the DAG (swap parent / children relations).
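
A minimal sketch of the idea, assuming the graph is given as a plain parent
map: swapping the direction of every edge turns parents into children. The
`inverse` function below is hypothetical and is not the `InverseDag` type
itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: swaps the direction of every edge, so the result maps
/// each vertex to the vertexes that listed it as a parent.
fn inverse(parents: &BTreeMap<u32, Vec<u32>>) -> BTreeMap<u32, Vec<u32>> {
    let mut children: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for (&child, ps) in parents {
        // Keep vertexes that end up with no children in the output.
        children.entry(child).or_default();
        for &p in ps {
            children.entry(p).or_default().push(child);
        }
    }
    children
}
```

Applying `inverse` twice to a map whose parent lists are sorted gives back the
original map.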

Reviewed By: sfilipco

Differential Revision: D21795870

fbshipit-source-id: 2d076f4ae491141aa758faa5f5f303c97f7e56dc
2020-06-03 13:26:25 -07:00
Jun Wu
a3b663735e dag: add IdLazySet
Summary:
Similar to LazySet, but the iterator is using Ids. This will be useful for
lazy calculations that are cheaper with Ids.

Reviewed By: sfilipco

Differential Revision: D21626208

fbshipit-source-id: 9a34fbf18f0039caeb4f6e698294c4d335354093
2020-06-03 13:26:24 -07:00
Jun Wu
223faebe5f dag: rename DagSet to IdStaticSet
Summary:
The NameSet is not really about Dag. It is about using Id and is static.
Rename it to clarify. In an upcoming change we'll have IdLazySet.

Reviewed By: sfilipco

Differential Revision: D21626204

fbshipit-source-id: 84f25008f7032f6e26a26fc656ccbcd2a5880ecf
2020-06-03 13:26:24 -07:00
Jun Wu
bf90003c24 dag: implement NameIter automatically
Summary:
This makes it possible to use NameIter without manually spelling out iterator
types, which might be quite long.

Reviewed By: sfilipco

Differential Revision: D21626202

fbshipit-source-id: 67b338765c09629645794cf73a9b496271524f9d
2020-06-03 13:26:24 -07:00
Jun Wu
6292253ef8 dag: add fast paths using hints
Summary: Take advantage of Hints and add fast paths.

Reviewed By: sfilipco

Differential Revision: D21626216

fbshipit-source-id: 6d43666bd6cdec7ff4b93032c1064cafd8de85cf
2020-06-03 13:26:23 -07:00
Jun Wu
d3878732f8 dag: set hints with existing hints
Summary: Update hints if they are easy to obtain or calculate.

Reviewed By: sfilipco

Differential Revision: D21626206

fbshipit-source-id: 453b7db2444406ce51d574c688fe536316fb9b0f
2020-06-03 13:26:23 -07:00
Jun Wu
fb56b1962d dag: move optimization hints to a dedicated structure
Summary:
Previously, the NameSet had properties like "is_all", "is_topo_sorted", etc.
To make lazy sets efficient, it's important to have hints about min / max Ids
and maybe some other information.

Add a dedicated Hints structure for that.
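
To illustrate why min / max Id hints help, here is a sketch of a hints
structure with one possible fast path: if the Id ranges of two sets do not
overlap, their intersection is known to be empty without iterating either set.
The field layout and `surely_disjoint` are assumptions for illustration, not
the actual `Hints` definition.

```rust
/// Hypothetical hints attached to a set. `None` means "unknown".
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Hints {
    is_all: bool,
    is_topo_sorted: bool,
    min_id: Option<u64>,
    max_id: Option<u64>,
}

impl Hints {
    /// True only when the Id ranges prove the two sets share no element.
    /// Returns false whenever a bound is unknown, so it never gives a wrong
    /// "disjoint" answer.
    fn surely_disjoint(&self, other: &Hints) -> bool {
        match (self.min_id, self.max_id, other.min_id, other.max_id) {
            (Some(a_min), Some(a_max), Some(b_min), Some(b_max)) => {
                a_max < b_min || b_max < a_min
            }
            _ => false,
        }
    }
}
```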

Reviewed By: sfilipco

Differential Revision: D21626219

fbshipit-source-id: 845e88d3333f0f48f60f2739adae3dccc4a2dfc4
2020-06-02 14:00:36 -07:00
Jun Wu
13503a1490 dag: add some default impls for DagAlgorithm
Summary:
Implement a small subset of DagAlgorithm by default. This makes
other implementations of DagAlgorithm slightly easier.

Reviewed By: sfilipco

Differential Revision: D21626199

fbshipit-source-id: ac6dfb5c22bf1da44f521fc9e76d59bfb95063c7
2020-06-02 14:00:36 -07:00
Jun Wu
c920549e09 dag: fix DagSet::contains
Summary:
D21479023 broke it. It should convert to Id, and check Id against the SpanSet,
instead of just checking the IdMap ignoring the SpanSet.
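
The shape of the fix can be sketched as follows, under the assumption that the
set holds a name-to-Id map plus a list of inclusive Id spans. `ToyIdSet` and
its fields are hypothetical stand-ins, not the real types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a set backed by an Id map and Id spans.
struct ToyIdSet {
    id_map: HashMap<String, u64>,
    /// Inclusive (low, high) Id ranges that make up the set.
    spans: Vec<(u64, u64)>,
}

impl ToyIdSet {
    fn contains(&self, name: &str) -> bool {
        match self.id_map.get(name) {
            // Knowing the name is not enough: its Id must fall in a span.
            Some(&id) => self
                .spans
                .iter()
                .any(|&(low, high)| low <= id && id <= high),
            None => false,
        }
    }
}
```

A name that is present in the map but whose Id lies outside every span is the
case the broken version got wrong.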

Reviewed By: sfilipco

Differential Revision: D21626193

fbshipit-source-id: 6daf86f292a7acfd3688893a55e2a794cfe068fe
2020-06-02 14:00:36 -07:00
Jun Wu
62719f10eb dag: make to_span_set take reference
Summary: This makes the next change easier to implement.

Reviewed By: sfilipco

Differential Revision: D21626198

fbshipit-source-id: 57ab69cba7f43350767e5d0d52ebfe66764895ca
2020-06-02 14:00:35 -07:00
Jun Wu
14b3c2e0f0 dag: move from_ascii to traits
Summary:
This adds flexibility. Now every type that implements DagAddHeads, including
NameDag, can import ASCII graphs.

Reviewed By: sfilipco

Differential Revision: D21626213

fbshipit-source-id: e258d88f97cbcc9aaf98d353a929803325185df7
2020-05-27 12:16:48 -07:00
Jun Wu
bd6c6fe18b dag: implement IdConvert on Dag structs
Reviewed By: sfilipco

Differential Revision: D21626214

fbshipit-source-id: 90d5a587e42340ac2b0f0b3f35f3bc084e969d40
2020-05-27 12:16:48 -07:00
Jun Wu
be5e3a20b4 dag: IdMapLike -> IdConvert
Summary: The trait was about converting between Id and VertexName. Rename to clarify.

Reviewed By: sfilipco

Differential Revision: D21626195

fbshipit-source-id: 874ca4ca3a1467084a08c6d9aa321201974e1978
2020-05-27 12:16:47 -07:00
Jun Wu
64dc05ab9d dag: move add_heads, flush, add_heads_and_flush to traits
Summary: This allows other kinds of DAG to implement the operations.

Reviewed By: sfilipco

Differential Revision: D21626220

fbshipit-source-id: 896c5ccebb1672324d346dfca6bcac9b4d3b4929
2020-05-27 12:16:47 -07:00
Jun Wu
4934987796 dag: implement PrefixLookup for Dag, MemDag and MemIdMap
Summary: This makes things a bit more flexible.

Reviewed By: sfilipco

Differential Revision: D21626194

fbshipit-source-id: f3ad486bcd5a6478d9e00f674d48f99504cded8c
2020-05-27 12:16:46 -07:00
Jun Wu
26217dcdb5 dag: move hex prefix lookup to a trait
Summary: This makes it possible for other types to implement the hex prefix lookup.

Reviewed By: sfilipco

Differential Revision: D21626218

fbshipit-source-id: 96e8b8c37e5aae2bd60658a238333b61902936d1
2020-05-27 12:16:46 -07:00
Jun Wu
577c9442bb dag: add VertexName::from_hex
Summary: It will be used in the next change.

Reviewed By: sfilipco

Differential Revision: D21626207

fbshipit-source-id: bbef70ef9d4f9aaa2039a6bc15d296e88db7f8dc
2020-05-27 12:16:46 -07:00
Jun Wu
38cc83e1bf dag: add short aliases for main public types
Summary:
Types like IdDag are not really used. The use of the word "name" is sometimes
confusing in other contexts. Therefore export shorter names like Dag, MemDag,
Vertex, avoiding "name" in NameDag, MemNameDag and NameSet. This makes external
code shorter and less ambiguous.

Reviewed By: sfilipco

Differential Revision: D21626212

fbshipit-source-id: 5bcf3cecfd38277149b41bf3ba9e6d4ef2a07b2b
2020-05-27 12:16:45 -07:00
Jun Wu
e0d11803f2 dag: move DagAlgorithm to an independent trait
Summary:
This decouples DagAlgorithm from the IdMap + IdDag backend, making it possible
to support other kinds of backends of DagAlgorithm (ex. a revlog backend).

Reviewed By: sfilipco

Differential Revision: D21626200

fbshipit-source-id: f53cc271a200062e9c02f739b6453e1d7de84e6d
2020-05-27 12:16:45 -07:00
Jun Wu
aeac1551d2 dag: implement beautify
Summary:
This function reorders commits so the graph looks better.
It will be used to optimize graph rendering for cloud smartlog (and perhaps
smartlog in the future).

Reviewed By: markbt

Differential Revision: D21554675

fbshipit-source-id: d3f0f27c7935c49581cfa6e87d7c32eb5a075f75
2020-05-14 12:03:43 -07:00
Jun Wu
cde3140e8f dag: implement BitAnd, BitOr, Sub for NameSet
Summary: This makes it easier to do `a & b`, `a | b`, `a - b`.
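
For readers unfamiliar with operator overloading in Rust, the sketch below
shows the general pattern on a hypothetical `ToySet` wrapper around
`BTreeSet<u32>`. The real NameSet is more involved; only the `std::ops` traits
are taken as given here.

```rust
use std::collections::BTreeSet;
use std::ops::{BitAnd, BitOr, Sub};

/// Hypothetical set type used only to demonstrate the operator traits.
#[derive(Clone, Debug, PartialEq)]
struct ToySet(BTreeSet<u32>);

impl ToySet {
    fn from_ids(ids: &[u32]) -> Self {
        ToySet(ids.iter().copied().collect())
    }
}

/// `a & b`: intersection.
impl BitAnd for ToySet {
    type Output = ToySet;
    fn bitand(self, rhs: ToySet) -> ToySet {
        ToySet(self.0.intersection(&rhs.0).copied().collect())
    }
}

/// `a | b`: union.
impl BitOr for ToySet {
    type Output = ToySet;
    fn bitor(self, rhs: ToySet) -> ToySet {
        ToySet(self.0.union(&rhs.0).copied().collect())
    }
}

/// `a - b`: difference.
impl Sub for ToySet {
    type Output = ToySet;
    fn sub(self, rhs: ToySet) -> ToySet {
        ToySet(self.0.difference(&rhs.0).copied().collect())
    }
}
```

Because the operators take their operands by value, callers clone a set when
they need to keep using it afterwards.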

Reviewed By: markbt

Differential Revision: D21554677

fbshipit-source-id: e1e2571a3dc83f80a1ec7a056f2c8f71ab292d9e
2020-05-14 12:03:43 -07:00
Jun Wu
60684eb2c5 dag: make ASCII -> MemNameDag a public API
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.

Reviewed By: sfilipco

Differential Revision: D21486525

fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
2020-05-11 09:49:59 -07:00
Jun Wu
a6b7e965f3 dag: remove a TODO comment
Summary: It was done as NameSet.

Reviewed By: sfilipco

Differential Revision: D21479022

fbshipit-source-id: 1c32cabb27d72a6438409ede226104a9ebac6a1d
2020-05-11 09:49:59 -07:00
Jun Wu
4eb9251172 dag: move sort and parent_names to NameDagAlgorithm
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.

Reviewed By: sfilipco

Differential Revision: D21479017

fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
2020-05-11 09:49:59 -07:00
Jun Wu
282e034d30 dag: add MemNameDag
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.

Reviewed By: sfilipco

Differential Revision: D21479021

fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
2020-05-11 09:49:58 -07:00
Jun Wu
5cbb99f4eb dag: add MemIdMap
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.

Reviewed By: sfilipco

Differential Revision: D21479018

fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
2020-05-11 09:49:58 -07:00
Jun Wu
682e8e96a7 dag: use IdMap traits in NameDag and NameSet
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.

Reviewed By: sfilipco

Differential Revision: D21479023

fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
2020-05-11 09:49:57 -07:00
Jun Wu
759f8b35c5 dag: move some IdMap operations to traits
Summary: This will allow different IdMap implementations.

Reviewed By: sfilipco

Differential Revision: D21479016

fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
2020-05-11 09:49:57 -07:00
Jun Wu
30163eeb58 dag: update snapshot_map on change
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.

Reviewed By: sfilipco

Differential Revision: D21479019

fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
2020-05-11 09:49:57 -07:00
Jun Wu
f014f86b7a dag: move NameDag algorithms to a trait
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.

Reviewed By: sfilipco

Differential Revision: D21479020

fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7
2020-05-11 09:49:56 -07:00
Stefan Filip
ea89b541e1 segmented_changelog: add Dag struct and location_to_name functionality
Summary:
The IdDag provides graph algorithms using Segments.
The IdMap allows converting from the SegmentedChangelogId domain to the
ChangesetId domain.
The Dag struct wraps IdDag and IdMap in order to provide graph algorithms using
the common application level identifiers for commits (ChangesetId).

The construction of the Dag is currently mocked with something that can only be
used in a test environment (unit tests but also integration tests).

This diff also implements a location_to_name function. This is the most
important new functionality that segmented changelog clients require. It
recovers the hash of a commit for which the client only has a segmented
changelog Id. The current assumption is that clients have identifiers for all
merge commit parents so the path to a known commit always follows a set
of first parents.
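
Under that assumption, the lookup can be pictured as walking first parents
from a known commit a given number of steps. The sketch below is a toy model
only: the function signature, the string commit names and the first-parent map
are all hypothetical and do not reflect the real `location_to_name` API.

```rust
use std::collections::HashMap;

/// Hypothetical model: resolve "the `distance`-th first-parent ancestor of
/// `known`". Returns None if the walk runs past a root.
fn location_to_name(
    first_parent: &HashMap<&'static str, &'static str>,
    known: &'static str,
    distance: u64,
) -> Option<&'static str> {
    let mut current = known;
    for _ in 0..distance {
        // Step to the first parent; stop with None if there is none.
        current = *first_parent.get(current)?;
    }
    Some(current)
}
```

The real implementation works on segments, so it would not need to visit each
commit one at a time as this toy loop does.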

The IdMap queries will have to be changed to async in the future, but we
expect IdDag queries to stay sync.

Reviewed By: quark-zju

Differential Revision: D20635577

fbshipit-source-id: 4f9bd8dd4a5bd9b0de55f51086f3434ff507963c
2020-03-27 13:48:52 -07:00
Stefan Filip
7502ce31ca dag: add in process stored IdMap constructor
Summary: The interesting observation is that InProcessStore is not public.

Reviewed By: quark-zju

Differential Revision: D20635578

fbshipit-source-id: a0149929c8059ff77f047fd385bf3b26dc738dfd
2020-03-27 13:48:51 -07:00
Stefan Filip
c400809eba dag: rename child index iteration to iter_master_flat_segments_with_parent
Summary:
`iter_segments_with_parent` has a few more conditions attached to it than the
name would imply. We are renaming it to give a better sense of its true
behavior.

Reviewed By: quark-zju

Differential Revision: D20547631

fbshipit-source-id: 406f46b9de5efc9e8e6a8c4bc22ab18fa5bc54bb
2020-03-24 13:58:07 -07:00
Stefan Filip
59ff2a8571 dag: remove_non_master implementation for
Summary: Also adding better tests for non master entries.

Reviewed By: quark-zju

Differential Revision: D20504483

fbshipit-source-id: 60d4a20aecb00f7750db2fff5d3832aac99d00e2
2020-03-24 13:58:06 -07:00
Stefan Filip
03c1e1cac5 dag: iterator implementations for InProcessStore
Summary:
The main question I had while writing the tests was whether we expect a
specific order for Segments for `iter_segments_with_parent`. `InProcessStore`
will return the segments in the order that they were inserted.

Reviewed By: quark-zju

Differential Revision: D20501401

fbshipit-source-id: 48ceb78f3191c7425c1488a3392cf3167f7e7268
2020-03-24 13:58:06 -07:00
Stefan Filip
5f4e706f81 dag: Add InProcessStore as iddagstore
Summary:
First 6 methods implemented from the IdDagStore trait for the InProcessStore.

Any suggestions welcome.

Reviewed By: quark-zju

Differential Revision: D20499228

fbshipit-source-id: cb536a3a0136077ada78934d82a25d079a5bc809
2020-03-24 13:58:06 -07:00
Stefan Filip
3dcb56535e dag: add descriptions to IdDagStore methods
Summary: Documentation.

Reviewed By: quark-zju

Differential Revision: D20499926

fbshipit-source-id: ebbb7a1249109bd56ff459a659e0c628c2974179
2020-03-24 13:58:05 -07:00
Jun Wu
8cc30ac302 dag: add Segment::new API
Summary:
Now that Segment has no lifetime parameter, we can create it directly and return ownership.

Performance of "building segments" does not seem to change:

  # before
  building segments                                 750.129 ms

  # after
  building segments                                 712.177 ms

Reviewed By: sfilipco

Differential Revision: D20505200

fbshipit-source-id: 2448814751ad1a754b90267e43262da072bf4a16
2020-03-18 15:05:58 -07:00
Jun Wu
1bd54a5971 dag: drop lifetime on Segment<'a>
Summary:
This allows structures like BTreeMap to own and store Segment.

It was not possible until D19818714, which adds minibytes::Bytes interface for
indexedlog.

In theory this hurts performance a little bit. But the perf difference does not
seem visible in `cargo bench --bench dag_ops`:

  # before
  building segments                                 714.420 ms
  ancestors                                          54.045 ms
  children                                          490.386 ms
  common_ancestors (spans)                            2.579 s
  descendants (small subset)                        406.374 ms
  gca_one (2 ids)                                   161.260 ms
  gca_one (spans)                                     2.731 s
  gca_all (2 ids)                                   287.857 ms
  gca_all (spans)                                     2.799 s
  heads                                             234.130 ms
  heads_ancestors                                    39.383 ms
  is_ancestor                                       113.847 ms
  parents                                           251.604 ms
  parent_ids                                         11.412 ms
  range (2 ids)                                     117.037 ms
  range (spans)                                     241.156 ms
  roots                                             507.328 ms

  # after
  building segments                                 750.129 ms
  ancestors                                          53.341 ms
  children                                          515.607 ms
  common_ancestors (spans)                            2.664 s
  descendants (small subset)                        411.556 ms
  gca_one (2 ids)                                   164.466 ms
  gca_one (spans)                                     2.701 s
  gca_all (2 ids)                                   290.516 ms
  gca_all (spans)                                     2.801 s
  heads                                             240.548 ms
  heads_ancestors                                    39.625 ms
  is_ancestor                                       115.735 ms
  parents                                           239.353 ms
  parent_ids                                         11.172 ms
  range (2 ids)                                     115.483 ms
  range (spans)                                     235.694 ms
  roots                                             506.861 ms

Reviewed By: sfilipco

Differential Revision: D20505201

fbshipit-source-id: c34d48f0216fc5b20a1d348a75ace89ace7c080b
2020-03-18 15:05:57 -07:00
Stefan Filip
1fb5acf242 dag: use IdDagStore in IdDag with type parameter
Summary: Make IdDag storage generic by depending on IdDagStore.
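
The general pattern, storage behind a trait and the graph type generic over
it, might look like the sketch below. `SegmentStore`, `InMemoryStore` and
`ToyIdDag` are hypothetical; the real IdDagStore trait has different methods,
and the sketch assumes segments never overlap.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage trait for (low, high) Id segments.
trait SegmentStore {
    fn insert(&mut self, high: u64, low: u64);
    /// Returns the (low, high) segment containing `id`, if any.
    fn find_containing(&self, id: u64) -> Option<(u64, u64)>;
}

/// One possible backend: everything kept in memory.
#[derive(Default)]
struct InMemoryStore {
    /// Keyed by the segment's highest Id; the value is its lowest Id.
    by_high: BTreeMap<u64, u64>,
}

impl SegmentStore for InMemoryStore {
    fn insert(&mut self, high: u64, low: u64) {
        self.by_high.insert(high, low);
    }

    fn find_containing(&self, id: u64) -> Option<(u64, u64)> {
        // First segment ending at or after `id`; it matches only if it also
        // starts at or before `id`.
        let (&high, &low) = self.by_high.range(id..).next()?;
        if low <= id {
            Some((low, high))
        } else {
            None
        }
    }
}

/// The graph type only talks to the trait, so backends can be swapped.
struct ToyIdDag<S: SegmentStore> {
    store: S,
}

impl<S: SegmentStore> ToyIdDag<S> {
    fn contains(&self, id: u64) -> bool {
        self.store.find_containing(id).is_some()
    }
}
```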

Reviewed By: quark-zju

Differential Revision: D20471712

fbshipit-source-id: 3a2668f301758a3c880db35c9f0db6887ef1dd38
2020-03-16 14:41:41 -07:00
Stefan Filip
236292c0fd dag: add the GetLock trait
Summary: Used to generalize `get_lock` functionality.

Reviewed By: quark-zju

Differential Revision: D20471710

fbshipit-source-id: e44d5b22ecacdb653170ef83914354f521f82dfc
2020-03-16 14:41:40 -07:00
Stefan Filip
66436b4a3c dag: add the IdDagStore trait
Summary: Abstract the storage functionality required by IdDag.

Reviewed By: quark-zju

Differential Revision: D20449122

fbshipit-source-id: fc3c7d7b88d74f7a93670d310be2e680f35e8ce7
2020-03-16 14:41:40 -07:00
Stefan Filip
1239628ef8 dag: move IdDag storage details to the iddagstore module
Summary:
Right now the module has one implementation, IndexedLogStore. The name could
be more specific in the context of the crate.

The goal will be to add a trait for the storage requirements of IdDag and
make IndexedLogStore one implementation of that trait.

Reviewed By: quark-zju

Differential Revision: D20446042

fbshipit-source-id: 7576e1cc4ad757c1a2c00322936cc884838ff710
2020-03-16 14:41:40 -07:00
Jun Wu
1f64b4ec50 nameset: fix LazySet iteration
Summary:
The `next` method forgot to increase the iteration index, causing infinite
iteration.
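
The class of bug is easy to reproduce with any index-based iterator. The
sketch below uses a hypothetical `ToyIter` over a slice, not the real LazySet
iterator, and marks the line whose absence causes the infinite loop.

```rust
/// Hypothetical index-based iterator over a buffer of already-seen items.
struct ToyIter<'a> {
    items: &'a [u32],
    index: usize,
}

impl<'a> Iterator for ToyIter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let item = self.items.get(self.index).copied();
        if item.is_some() {
            // Without this increment, every call returns the first item and
            // the iteration never ends.
            self.index += 1;
        }
        item
    }
}
```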

Reviewed By: ikostia

Differential Revision: D20473206

fbshipit-source-id: 82a95de1b1c12ac4e9e4d328a0adba7145d7b24c
2020-03-16 13:00:35 -07:00
Jun Wu
194b38385a nameset: add a way to convert between NameSet and SpanSet
Summary:
This will be used in the Python world for legacy reasons. It shouldn't be used
in new Rust code.

To use it, the name `LegacyCodeNeedIdAccess` has to be used so we can do a code
search to find all users of it.

Reviewed By: sfilipco

Differential Revision: D20367834

fbshipit-source-id: 9b93a29f1461ce24bba6f31a2bbb1f327e216c6d
2020-03-11 20:37:30 -07:00
Jun Wu
eef56d9c5b namedag: add a sort API
Summary: This will be useful to actually sort commits.

Reviewed By: sfilipco

Differential Revision: D20367835

fbshipit-source-id: 43bc7835277af3a14ef323ce34247e0c03878dc8
2020-03-11 20:37:29 -07:00
Jun Wu
2ecc0bb757 namedag: move "all" concept to DagSet
Summary:
The old "AllSet" implementation is not very practical - it does not support
iteration. Practically, the "all()" set comes from the DAG. Change the "all"
concept to a hint similar to "is_topo_sorted", and update the fast path
(intersection) accordingly.

Reviewed By: sfilipco

Differential Revision: D20367837

fbshipit-source-id: fdbf370897c93058bfcab0571c1f6fa4b99b0f6b
2020-03-11 20:37:29 -07:00
Jun Wu
ef1696b4db namedag: rename arc_map to snapshot_map
Summary: The word "snapshot" more accurately describes its purpose.

Reviewed By: sfilipco

Differential Revision: D20367836

fbshipit-source-id: c91a0bd402fa1718b5d805beedc0e062824c53d3
2020-03-11 20:37:29 -07:00
Jun Wu
4960709aa3 dag: do not depend on types
Summary:
The dag crate is designed to work with any kind of binary commit hashes (ex. bonsai,
git or hg). The only use of `types` is to convert from binary to hex. Since dag
already has its own `to_hex` logic in `VertexName`, let's use that instead.
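
Binary-to-hex conversion needs nothing beyond the standard library, which is
why the extra dependency is avoidable. The free functions below are a
hypothetical sketch of such logic, including the reverse direction; they are
not the actual `VertexName` methods.

```rust
/// Hypothetical: lower-case hex encoding of arbitrary bytes.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Hypothetical: the reverse conversion. Returns None for odd-length input or
/// any character that is not a hex digit.
fn from_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}
```

Since nothing here depends on the hash length, the same code works for hashes
of any size.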

Reviewed By: sfilipco

Differential Revision: D20378447

fbshipit-source-id: 00ecb551ea927fdb60dd91e5e645064f23139bcd
2020-03-11 10:49:31 -07:00
Stefan Filip
d8b4ddcecf dag: split lock file acquisition to own function
Summary:
Splitting lock file acquisition from `IdDag::prepare_filesystem_sync` to its own
function.
Useful when looking ahead to split IdDag from IndexedLog.

Reviewed By: quark-zju

Differential Revision: D20316443

fbshipit-source-id: a0fd43439730376920706bb4349ce497f6624335
2020-03-09 10:18:07 -07:00
Stefan Filip
620cdd96f2 dag: add IdDag::iter_segments_with_parent
Summary:
This removes an inline use of the indexedlog indexes.
This is going to be useful when we try to separate IndexedLog specifics from
IdDag functionality.

Reviewed By: quark-zju

Differential Revision: D20316058

fbshipit-source-id: 942a0a71660bb327376c81fd3ac435d002ecca6e
2020-03-09 10:18:07 -07:00