Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.
Reviewed By: DurhamG, sfilipco
Differential Revision: D23095466
fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
Summary: This is useful to investigate internals of dag calculations.
Reviewed By: sfilipco
Differential Revision: D23095473
fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
Summary:
Follow-up to the previous change: actually use the flat segments to build
segments. This significantly improves performance. `cargo bench --bench dag_ops`
shows:
building segments (old) 774.109 ms
building segments (new) 143.879 ms
In addition, the `O(N^2)` update to `head_ids` is replaced, which improves
performance when the graph has many heads (e.g. the mutation graph).
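To illustrate why the head update matters, here is a minimal sketch (not the actual `dag` crate code). It keeps heads in a `HashSet`, so each insertion costs time proportional to the number of parents rather than a scan over all current heads. The `Heads` type and its methods are hypothetical, and the sketch assumes parents are inserted before their children.

```rust
use std::collections::HashSet;

/// Hypothetical head tracker; not the real `head_ids` implementation.
#[derive(Default)]
struct Heads {
    head_ids: HashSet<u64>,
}

impl Heads {
    /// Insert a vertex. Its parents stop being heads, the vertex becomes one.
    /// Assumes parents were inserted before their children.
    fn insert(&mut self, id: u64, parents: &[u64]) {
        for parent in parents {
            // O(1) expected per parent, instead of scanning every head.
            self.head_ids.remove(parent);
        }
        self.head_ids.insert(id);
    }

    /// Heads in ascending order, for easy inspection.
    fn sorted(&self) -> Vec<u64> {
        let mut ids: Vec<u64> = self.head_ids.iter().copied().collect();
        ids.sort();
        ids
    }
}
```

With many heads, a per-insert scan of a list of heads grows with the head count, while the set-based update does not.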
Reviewed By: sfilipco
Differential Revision: D23036080
fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
Summary:
While testing the `obsolete()` set, I found that an in-memory segmented DAG takes
10x longer to build than a HashMap DAG.
Part of the inefficiency comes from using a translated "parent_func" that round-trips
through Id and Vertex, used by the segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.
This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.
Reviewed By: sfilipco
Differential Revision: D23036060
fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.
The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid a stack
overflow in rustc when resolving constraints.
Reviewed By: sfilipco
Differential Revision: D23036077
fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
Summary: This will be useful for the `obsolete()` set.
Reviewed By: sfilipco
Differential Revision: D23036072
fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.
Reviewed By: sfilipco
Differential Revision: D23036068
fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.
Reviewed By: sfilipco
Differential Revision: D23036058
fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
Summary:
This trades a bit of performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (sets carry dag information with them,
which enables use-cases like converting a NameSet from another dag to the
current dag without requiring extra `dag` objects).
Reviewed By: sfilipco
Differential Revision: D23036067
fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
Summary:
If `x` and `y` come from the same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.
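The rule above can be sketched as follows. This is a simplified illustration, not the real NameSet code: the `Set` struct, its `graph` identifier, and the `should_swap` helper are hypothetical, and `full` stands in for the `FULL` hint.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified set. `graph` identifies the source graph and
/// `full` plays the role of the `FULL` hint (the set covers its whole graph).
struct Set {
    graph: u32,
    full: bool,
    ids: BTreeSet<u64>,
}

/// Swap operands only when the hint is trustworthy: both sets come from
/// the same graph and the left side is known to be the larger one.
fn should_swap(lhs: &Set, rhs: &Set) -> bool {
    lhs.graph == rhs.graph && lhs.full && !rhs.full
}

/// Intersection that iterates one side and probes the other, so iterating
/// the smaller side is cheaper.
fn intersect(lhs: &Set, rhs: &Set) -> BTreeSet<u64> {
    let (iterated, probed) = if should_swap(lhs, rhs) {
        (rhs, lhs)
    } else {
        (lhs, rhs)
    };
    iterated
        .ids
        .iter()
        .copied()
        .filter(|id| probed.ids.contains(id))
        .collect()
}
```

When the graphs differ, `should_swap` returns false and the operands are used in the order given.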
Reviewed By: sfilipco
Differential Revision: D23036081
fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
Summary:
Sets like `obsolete()` and `merge()` could have a fast "contains" path:
just check the given commit without calculating the full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.
This will be used for the `obsolete()` set in upcoming changes.
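A rough sketch of the idea, assuming plain `u64` ids and closures for the two paths. The field and method names here are hypothetical and much simpler than the real `MetaSet`:

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for the `MetaSet` idea: the
/// "contains" fast path and the "evaluate" path are defined separately.
struct MetaSet<C, E> {
    contains_fn: C,
    evaluate_fn: E,
}

impl<C, E> MetaSet<C, E>
where
    C: Fn(u64) -> bool,
    E: Fn() -> BTreeSet<u64>,
{
    fn new(contains_fn: C, evaluate_fn: E) -> Self {
        MetaSet { contains_fn, evaluate_fn }
    }

    /// Fast path: answer membership without materializing the set.
    fn contains(&self, id: u64) -> bool {
        (self.contains_fn)(id)
    }

    /// Slow path: materialize the full set only when it is really needed.
    fn evaluate(&self) -> BTreeSet<u64> {
        (self.evaluate_fn)()
    }
}
```

A membership check on such a set never triggers `evaluate`, which is the point of keeping the two paths apart.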
Reviewed By: sfilipco
Differential Revision: D23036059
fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
Summary:
Before this change, `flush` with no pending changes but with `master` moved
would cause an error, because the `parents_func` only contains "pending
changes", i.e. new vertexes. The `parents_func` does not know `master`, and
`master` is needed to re-assign vertexes from the non-master group to the
master group.
With the snapshot API, things become easier. We just take a snapshot before
reloading, and use the snapshot to answer parent_names.
Reviewed By: sfilipco
Differential Revision: D22970569
fbshipit-source-id: 99a25857ba98792edff69985c16df118a560ffb0
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).
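One possible shape for such a snapshot, sketched with copy-on-write via `Arc`. This is an assumption for illustration only: the `Dag`, `Parents`, and `LazySet` definitions below are hypothetical and are not the real types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

type Parents = HashMap<u64, Vec<u64>>;

/// Hypothetical dag that shares its state through an `Arc`.
#[derive(Default)]
struct Dag {
    parents: Arc<Parents>,
}

impl Dag {
    fn add(&mut self, id: u64, parents: Vec<u64>) {
        // Copy-on-write: the map is cloned only if a snapshot still shares it.
        Arc::make_mut(&mut self.parents).insert(id, parents);
    }

    /// The snapshot owns its data, so holders need no lifetime parameter.
    fn snapshot(&self) -> Arc<Parents> {
        Arc::clone(&self.parents)
    }
}

/// Hypothetical set that keeps a snapshot instead of borrowing the dag.
struct LazySet {
    dag: Arc<Parents>,
}

impl LazySet {
    fn contains(&self, id: u64) -> bool {
        self.dag.contains_key(&id)
    }
}
```

Later writes to the dag do not change what an existing snapshot sees.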
Reviewed By: sfilipco
Differential Revision: D22970579
fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
Summary:
Make it possible to snapshot a Dag. This is useful for cases where another
struct wants access to the Dag without lifetimes. Namely, the LazySet
might want to keep a snapshot of the Dag.
Reviewed By: sfilipco
Differential Revision: D22970568
fbshipit-source-id: 508c38d3ffac2ffcd2e682578c3c5e5787ea3bcf
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.
Reviewed By: sfilipco
Differential Revision: D22970581
fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).
Reviewed By: DurhamG, sfilipco
Differential Revision: D22970582
fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
Summary:
Dynamicconfigs compares the timestamp of config files with the current
timestamp to determine when to regenerate. If the timestamp of the config file
is newer than the current timestamp, Rust throws an exception. Let's handle that
case and treat it as if the file was just created instead of crashing.
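A minimal sketch of the handling, assuming the age is computed with `SystemTime::duration_since`, which returns `Err` when the supposedly earlier time is in fact later. The function names and the `max_age` parameter are hypothetical:

```rust
use std::time::{Duration, SystemTime};

/// Age of a config file given its modification time and the current time.
/// If `mtime` is in the future, `duration_since` returns `Err`; treat that
/// as "just created" (age zero) instead of failing.
fn config_age(mtime: SystemTime, now: SystemTime) -> Duration {
    now.duration_since(mtime)
        .unwrap_or_else(|_| Duration::from_secs(0))
}

/// Hypothetical regeneration check: regenerate once the file is too old.
fn needs_regen(mtime: SystemTime, now: SystemTime, max_age: Duration) -> bool {
    config_age(mtime, now) > max_age
}
```

A file with a future timestamp therefore counts as fresh and does not trigger regeneration.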
Reviewed By: quark-zju
Differential Revision: D23230216
fbshipit-source-id: ca185de7dfca46953e04ec08c84668eda6d749bd
Summary: This fixes the Windows build.
Reviewed By: farnz
Differential Revision: D23212195
fbshipit-source-id: 159f3ddebf6a97f52f9b6c80ef19315c8f4b0c85
Summary:
This allows importing from other DAGs. It will be used to import revlog DAG to
the new segmented format.
Reviewed By: sfilipco
Differential Revision: D22970572
fbshipit-source-id: 0a183e7b64831574cc9c60d4639124d02d19cf43
Summary:
This allows dag to use renderdag in tests to verify graph result. Previously
it was hard because dag <-> renderdag would form a circular dependency.
It also makes it possible to implement more efficient and integrated fast paths
for graph rendering.
Reviewed By: sfilipco
Differential Revision: D22970570
fbshipit-source-id: 526497339bd7aa8898d1af4aa9cf6d2a6797aae0
Summary: This will be used to describe what the commit graph backend is.
Reviewed By: sfilipco
Differential Revision: D22970577
fbshipit-source-id: 753efdbdd4466730ece758d9f4789fbd21e2801b
Summary:
This allows us to try segmented changelog while maintaining revlog
compatibility.
Reviewed By: sfilipco
Differential Revision: D22970583
fbshipit-source-id: 7c43cdadd76300e76e89f38aac5ed3ecc0cff728
Summary:
We missed a Windows http client breakage because our LFS server integration
wasn't run on Windows. Let's enable the fb feature for all our cargo test runs.
Reviewed By: singhsrb
Differential Revision: D23140315
fbshipit-source-id: 46cc533c1e543ffc32d472b49a8f6daeee3b5009
Summary:
Aux data wire protocol part 1: field annotations & basic compatibility model.
Annotates fields in `file`, `tree`, and `complete_tree` wire structs with `#[serde(rename = "N", default, skip_serializing_if = "is_default")]`. I've avoided using `#[serde(default)]` on the container structs themselves because this can cause some confusion / incorrect behavior if not used carefully. Consider a wire struct `FooRequest` with a field of type `Option<Bar>`. `Option<Bar>` defaults to `None`. If `FooRequest`'s `Default` implementation sets the field's default to `Some(bar)`, a `FooRequest` explicitly constructed with `None` for the field will be serialized with the field omitted (because it passes `is_default`) and will be deserialized on the server as `Some(bar)`, causing incorrect behavior. To address this, we'd need to change the `is_default` function used with `skip_serializing_if` to check against the field's default value as set by the container, which isn't trivially possible without some sort of reflection (please correct me if you know a good way to achieve this). This is unfortunate, as it'd be very desirable for the container to be able to set defaults different from the individual field type defaults, for cases where one boolean, for instance, should default to true. As-is, we'd need to address this with wrapper types instead, where we can fully control the `Default` implementation.
We can, of course, address this by providing an alternate `skip_serializing_if` function to fields with default that doesn't match that set by the container. This will need to be done carefully, though, to avoid the issue I described above.
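The pitfall can be reproduced without serde. The sketch below is a pure-std simulation, not the real wire code: `FooRequest` holds a plain `u32` in place of `Bar`, and the `serialize` / `deserialize` functions are hypothetical stand-ins that model "field omitted on the wire" with an outer `Option`.

```rust
/// Hypothetical stand-in for a wire struct; not the real edenapi types.
#[derive(Debug, PartialEq)]
struct FooRequest {
    bar: Option<u32>,
}

/// Container-level default that differs from the field type's default (`None`).
impl Default for FooRequest {
    fn default() -> Self {
        FooRequest { bar: Some(7) }
    }
}

/// Mirrors an `is_default` helper that compares against the *field type's* default.
fn is_default<T: Default + PartialEq>(value: &T) -> bool {
    *value == T::default()
}

/// "Serialize": the outer `Option` models presence on the wire; `None` = omitted.
fn serialize(req: &FooRequest) -> Option<Option<u32>> {
    if is_default(&req.bar) {
        None
    } else {
        Some(req.bar)
    }
}

/// "Deserialize" with a container-level default: omitted fields are filled
/// from `FooRequest::default()`.
fn deserialize(wire: Option<Option<u32>>) -> FooRequest {
    let mut req = FooRequest::default();
    if let Some(bar) = wire {
        req.bar = bar;
    }
    req
}
```

Round-tripping `FooRequest { bar: None }` through these two functions yields `FooRequest { bar: Some(7) }`, which is the incorrect behavior described above.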
Currently the JSON module manually serializes and deserializes all the top-level request objects, so the rename annotation doesn't impact it. We can add `#[serde(alias = "rustfieldname")]` if we'd like the server and client to be able to accept manually-crafted requests and responses with explicit field names. This could also be useful to replace the manual parsing in the JSON module, but can't replace the manual serialization in a clean way. We'd need to introduce a second copy of the wire types, without the serde `rename` attribute, to allow serializing with the actual rust field name.
I've only modified the `tree`, `file`, and `complete_tree` modules. I intend to eventually update the rest of the edenapi protocol later on, when the implementation of `file` and `tree` are complete / stable. This will give us a chance to fix any mistakes before copying the design to more places.
Note: I do not intend to keep to proper wire protocol compatibility at this stage in the implementation. Expect field numbers to be re-used by non-compatible changes.
Reviewed By: kulshrax
Differential Revision: D23172756
fbshipit-source-id: 39976ed4bede892bd6981f9c3f23557a91f9028b
Summary:
As noted in the documentation for it, this can be removed once get and prefetch
return a continuation. This is now done, and thus we can remove it entirely.
Mis-use of it caused data to be fetched twice: once by memcache, and the second
one by getpackv2.
Reviewed By: singhsrb
Differential Revision: D23123344
fbshipit-source-id: 9ac0594faaba94ead04a8bb9035e14809a706641
Summary: The python code stripped new lines but the Rust code did not.
Reviewed By: singhsrb
Differential Revision: D23167515
fbshipit-source-id: add33ec6e4cfd9169e6fef8208490e0aeede38bd
Summary:
This new disallowlist lets us specify `section.key` configs that
should not be accepted from old rc files. This lets us incrementally disable
loading of those configs from the old files, which in turn lets us delete them
from the old rc files and eventually delete the old rc files entirely.
This diff also removes hgrc.local and hgrc.od from the list of configs we
verify, since those are not on the list of configs that need to be removed in
this initiative.
Reviewed By: quark-zju
Differential Revision: D23065595
fbshipit-source-id: 5cd742d099efd651174cab5e87bb7cdc4bae8054
Summary:
Previously the backing store was loading configs manually. Now that
system, dynamic, user, and repo config loading are unified, let's go through
that approved path.
Reviewed By: kulshrax
Differential Revision: D22736338
fbshipit-source-id: 232023e660107a096691e9d99bf89c04c218dfbd
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:
1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to be applied to all configs during reloading.
Reviewed By: quark-zju
Differential Revision: D22712623
fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
Summary:
In a future diff we'll enable dynamic and repo config loading purely
from Rust. To do so we need load functions for both cases. A future diff will
call these.
The dynamicconfig loading is based on the Python equivalent in uiconfig.py.
Reviewed By: quark-zju
Differential Revision: D22712624
fbshipit-source-id: ff46f6315fb80d4cd9e31d875ac60264563b12f2
Summary:
Previously load_system would skip loading if HGRCPATH was present and
then load_user would actually load the HGRCPATH. In an upcoming diff I add
load_dynamic, which happens after system but before user. The tests for
dynamicconfig depend on HGRCPATH being loaded when load_dynamic runs, so let's
move HGRCPATH loading up to load_system.
Reviewed By: quark-zju
Differential Revision: D22712627
fbshipit-source-id: 91175d9d7f85b9392ffea4af815a4facebbfe7c1
Summary:
In a future diff we'll allow an outside caller to pass an Options down
to configparsers::hg::load() so that filters can be applied during loading. Inside
hg::load() we need to use the options multiple times with different values, so
let's make Options clonable.
Reviewed By: quark-zju
Differential Revision: D22712626
fbshipit-source-id: 975145f38d35afe7d4a6c8e87071b0fb0ae74797
Summary:
A future diff will move all dynamic and repo config loading to be in
configparser. As part of this, let's simplify the repo.rs API to not pass
configs around everywhere.
Reviewed By: quark-zju
Differential Revision: D22712628
fbshipit-source-id: 79f23991aa826ce8b4f7430b45d7702efdc6b982
Summary:
Similar to the Python runbgcommand (extutil.py), this is a Rust utility that runs a
detached background process in a cross platform way.
This will be used in a later diff to run dynamicconfig generation in the
background.
Reviewed By: quark-zju
Differential Revision: D22712629
fbshipit-source-id: a317465bf03c96d977a203678e2bef13ce57cc12
Summary:
As part of moving all hg config loading and generation logic into Rust,
let's move the config generation logic from hgcommands and pyconfigparser to
configparser, unifying them at the same time.
Future diffs will move config loading in as well.
Reviewed By: quark-zju
Differential Revision: D22590208
fbshipit-source-id: d1760c404a6a5c57347df30713c20de55cfdb9a4
Summary:
A future diff will unify all config loading into configparser::hg, but
to do so we need dynamicconfig to live in configparser, so it can load
dynamicconfigs. Let's move everything in.
Reviewed By: quark-zju
Differential Revision: D22587237
fbshipit-source-id: 5613094175b6e1597aa113ee3e6d92ce7ec79f6d
Summary:
We had two spots that loaded system and user configs, one in the
pyconfigparser layer, and one in the pure rust config layer. In an upcoming diff
I'd like to move dynamicconfig loading down into the pure rust layer, so let's
unify these.
Reviewed By: quark-zju
Differential Revision: D22585554
fbshipit-source-id: 0cea7801ae1d5a3a3c12b80ee23b37f9e690e2bc
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089539
fbshipit-source-id: ebfc3beaf3c0fe5b01b87d97c19455b0a24afa72
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089541
fbshipit-source-id: 5010e417a83a2611283322f1dbb7023f4286f503
Summary:
from_path is an awkward constructor because it doesn't pass any other
information, like a config object. It also requires that the constructor be very
generic across all the stores. Right now it's only needed for pack files, so
let's move it to its own trait that is limited to pack files.
This will allow us to make the indexedlog store constructors more versatile in a
later diff. Once we get rid of pack files we can delete the StoreFromPath trait
entirely.
Reviewed By: xavierd
Differential Revision: D23089542
fbshipit-source-id: ea6c50853e5d5390a029002ef5d15c74fe41fe69
Summary: If parent_revs gets an out-of-bound rev, it should fail.
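A minimal sketch of the intended behavior, assuming parents are stored in a table indexed by rev. The signature and the `String` error type are hypothetical, not the real `parent_revs` API:

```rust
/// Hypothetical lookup: `parents[rev]` lists the parent revs of `rev`.
/// An out-of-bound rev is reported as an error instead of being ignored.
fn parent_revs(parents: &[Vec<u32>], rev: u32) -> Result<&[u32], String> {
    parents
        .get(rev as usize)
        .map(|p| p.as_slice())
        .ok_or_else(|| format!("rev {} out of bound (len {})", rev, parents.len()))
}
```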
Reviewed By: sfilipco
Differential Revision: D23036071
fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
Summary:
If the text starts with `\0`, the `\0` should be considered as part of the
uncompressed text instead of a separated header.
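A sketch of the header handling. Only the `\0` case is stated above; the `u` (uncompressed, one-byte header) and `x` (zlib) tags are assumptions about the conventional revlog chunk encoding, and the `Chunk` enum and `classify_chunk` function are hypothetical names. Decompression itself is left to the caller.

```rust
/// Hypothetical classification of a stored chunk.
#[derive(Debug, PartialEq)]
enum Chunk<'a> {
    /// Uncompressed text, ready to use.
    Plain(&'a [u8]),
    /// zlib-compressed data (assumed `x` tag); the caller decompresses it.
    Zlib(&'a [u8]),
    /// Unrecognized leading byte.
    Unknown(u8),
}

fn classify_chunk(data: &[u8]) -> Chunk<'_> {
    match data.first().copied() {
        None => Chunk::Plain(data),
        // The leading `\0` is part of the text, not a separate header.
        Some(b'\0') => Chunk::Plain(data),
        // Assumed `u` tag: a one-byte header that is stripped.
        Some(b'u') => Chunk::Plain(&data[1..]),
        Some(b'x') => Chunk::Zlib(data),
        Some(other) => Chunk::Unknown(other),
    }
}
```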
Reviewed By: sfilipco
Differential Revision: D22970575
fbshipit-source-id: 49e8a1a1ea42a3c4cf153b70f59fd0558dcfcede
Summary:
The parent handling is unsound when there are revs that are skipped. Fix it by
reasoning about commit hashes for parents.
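To illustrate the idea: when some revs are skipped, rev numbers in the source and destination no longer line up, so parents are resolved through their hashes. The function below is a hypothetical sketch that uses string hashes for brevity; it is not the actual fix.

```rust
use std::collections::HashMap;

/// Resolve parents by commit hash rather than by source rev number.
/// A parent whose hash is unknown (e.g. a skipped commit) is an error
/// instead of silently pointing at the wrong rev.
fn remap_parents<'a>(
    parent_hashes: &[&'a str],
    hash_to_new_rev: &HashMap<&'a str, u32>,
) -> Result<Vec<u32>, String> {
    parent_hashes
        .iter()
        .map(|hash| {
            hash_to_new_rev
                .get(hash)
                .copied()
                .ok_or_else(|| format!("unknown parent {}", hash))
        })
        .collect()
}
```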
Reviewed By: sfilipco
Differential Revision: D23036078
fbshipit-source-id: 8f710171471025cd48b3bd8f6ea57c68330eb8b8
Summary:
Windows defaults to checking a revocation server for ssl certs. Inside
our datacenter it can't reach the server and fails. We don't have this on for
any other platforms, so let's disable it.
Reviewed By: sfilipco
Differential Revision: D23121739
fbshipit-source-id: 4d44d2a065bf340a8f74332553deb09a9c61be9b
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.
Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
* `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
* `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
* `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.
Reviewed By: kulshrax
Differential Revision: D22958373
fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have a constant number
of files.
The fsync logic is put in a separate crate to avoid slow compiles.
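The core operation can be sketched with the standard library alone. The `fsync_files` helper is hypothetical and does not reflect the actual crate's API; it opens each file with write access and calls `File::sync_all`, which flushes both data and metadata to disk.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper: fsync each listed file and return how many were synced.
/// Stops at the first error (for example, a missing file).
fn fsync_files<P: AsRef<Path>>(paths: &[P]) -> io::Result<usize> {
    let mut synced = 0;
    for path in paths {
        // Write access is requested because some platforms need it to flush.
        let file = OpenOptions::new().write(true).open(path)?;
        file.sync_all()?;
        synced += 1;
    }
    Ok(synced)
}
```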
Reviewed By: DurhamG
Differential Revision: D22992103
fbshipit-source-id: b5503e498d5216d4ba19701ecd5582387e4f45f5
Summary: This allows callsites to get access to the storage.
Reviewed By: DurhamG
Differential Revision: D22992104
fbshipit-source-id: c72fa313be1468170c9728d3856f822bb6385dc8
Summary:
This makes the main command table cleaner.
I dropped the `indexedlogrepair` command as it cannot rebuild indexes. `hg
doctor` is a better replacement. Some debug commands are renamed so they
no longer have `-` in the command name.
Reviewed By: DurhamG
Differential Revision: D22992107
fbshipit-source-id: f65d74e36fb971e592ad0cc8be9a94e245c39662