Summary:
This replaces the RustError that might happen during `addcommits`, and allows us to
handle it without a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary: It's the same as `__add__`. It's consistent with the revset language.
Reviewed By: sfilipco
Differential Revision: D22638456
fbshipit-source-id: 928177d553220461192650f4792ac39cadd57dc2
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
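A rough illustration of how such a hint can short-circuit `ancestors`. This is only a sketch: `HintedSet`, the `ancestors_closed` flag and the `parents` mapping are hypothetical names for illustration, not the actual dag API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HintedSet:
    """A commit set plus a hint saying it already equals its ancestors."""
    ids: frozenset
    ancestors_closed: bool = False


def ancestors(s, parents):
    """Return ancestors(s); `parents` maps each id to its parent ids."""
    if s.ancestors_closed:
        # Fast path: the hint says ancestors(s) == s, so return it as-is.
        return s
    seen = set()
    stack = list(s.ids)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    # The result is closed under ancestors by construction.
    return HintedSet(frozenset(seen), ancestors_closed=True)


parents = {2: [1], 1: [0], 0: []}
everything = HintedSet(frozenset(parents), ancestors_closed=True)
assert ancestors(everything, parents) is everything
```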
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary:
Replace the Python spanset with the Rust-backed idset.
The idset can represent multiple ranges and works better with Rust code.
The `idset` fast paths do not preserve order for the `or` operation, as
demonstrated in the test changes.
Reviewed By: DurhamG, kulshrax
Differential Revision: D22519584
fbshipit-source-id: 5d976a937e372a87e7f087d862e4b56d673f81d6
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary: Make `store` the first argument for all of the EdenAPI Python methods. I've found this arrangement to be more ergonomic when working with the client later in the stack.
Reviewed By: quark-zju
Differential Revision: D22703915
fbshipit-source-id: b0ca900d969ec86ee91e8c62d281c2102860e9ef
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None); the shared LFS store would not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which may not always be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key, we can effectively solve
the problem described above: the local store would translate the file node key
into a content key, and the shared store would read the blob properly.
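The key-translation idea can be sketched roughly as follows. This is a simplified Python model, not the Rust implementation: `HgIdKey`, `ContentKey`, `LocalLfs`, `SharedLfs` and `union_get` are hypothetical names, and the `("found", ...)` / `("not_found", ...)` tuples merely stand in for whatever the real stores return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HgIdKey:
    """File-node based key."""
    node: str


@dataclass(frozen=True)
class ContentKey:
    """Content-hash based key."""
    sha256: str


class LocalLfs:
    """Holds the pointer (node -> content hash) but not the blob."""

    def __init__(self, pointers):
        self.pointers = pointers

    def get(self, key):
        if isinstance(key, HgIdKey) and key.node in self.pointers:
            # Progress was made: hand back a translated key, not data.
            return ("not_found", ContentKey(self.pointers[key.node]))
        return ("not_found", key)


class SharedLfs:
    """Holds blobs keyed by content hash, but no pointers."""

    def __init__(self, blobs):
        self.blobs = blobs

    def get(self, key):
        if isinstance(key, ContentKey) and key.sha256 in self.blobs:
            return ("found", self.blobs[key.sha256])
        return ("not_found", key)


def union_get(stores, key):
    """Try each store in order, threading the possibly-translated key."""
    for store in stores:
        status, value = store.get(key)
        if status == "found":
            return value
        key = value  # carry forward whatever key the store returned
    return None
```

With the pointer only in the local store and the blob only in the shared one, `union_get([local, shared], HgIdKey(...))` finds the blob, whereas a lookup that never translates the key would not.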
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.
Reviewed By: DurhamG
Differential Revision: D22646428
fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
Summary:
The spanset has the assumption that `0..len(repo)` are valid revs.
That's not true with segmented changelog. So reduce the dependency on the
assumption.
Reviewed By: kulshrax
Differential Revision: D22519586
fbshipit-source-id: a493d26d6d69a36966f4a037f87a03593b697cbd
Summary:
It turns out the Python world needs the integer range API in many places.
Deprecating them is non-trivial. Therefore expose the API.
Reviewed By: DurhamG
Differential Revision: D22402201
fbshipit-source-id: de31d15c18e5f4e0f8826f71315b98ad58b1764e
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos. I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer so that those tests can still run.
Reviewed By: DurhamG
Differential Revision: D22402195
fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a
Summary:
I dropped the special case of wdir handling, with the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).
`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.
Reviewed By: DurhamG
Differential Revision: D22368836
fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
Summary: Make the Python EdenAPI client's `health()` method return a dict of server metadata.
Reviewed By: DurhamG
Differential Revision: D22604932
fbshipit-source-id: 51ca60cc95a8dbd15635520b2a9bd72603643cb6
Summary:
Implements a basic Rust-Python binding layer for error metadata propagation.
We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
    fault = e.args[0].fault()
    typename = e.args[0].typename()
    message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.
Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below:
```
abort: intentional error for debugging with message 'intentional_error'
error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.
If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.
Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.
Introduced a simple integration test which exercises `debugthrowrustexception`.
Added a basic handler for RustError to scmutil.py
Reviewed By: DurhamG
Differential Revision: D22517796
fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.
Reviewed By: xavierd
Differential Revision: D22564210
fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
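For illustration only, percent-encoding an arbitrary UTF-8 repo name for use in a URL path could look like this in Python. The `repo_url` helper and the URL layout are assumptions made for this sketch, not the client's actual code.

```python
from urllib.parse import quote, unquote


def repo_url(base, repo_name, path):
    """Build a URL with `repo_name` percent-encoded as one path segment."""
    # safe="" also encodes "/", so the name cannot add extra path segments.
    return f"{base}/{quote(repo_name, safe='')}/{path}"


encoded = quote("my repo/é", safe="")
# Decoding gives back the original name unchanged.
assert unquote(encoded) == "my repo/é"
```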
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: Adds support for sharding based on user name.
Reviewed By: quark-zju
Differential Revision: D22537540
fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host. Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.
Reviewed By: quark-zju
Differential Revision: D22535509
fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
Summary: Add an optional `edenapi` argument to metadatastore that allows using EdenAPI in place of the SSH remote store.
Reviewed By: quark-zju
Differential Revision: D22492535
fbshipit-source-id: eba034c9ba86c79c9a9dee6bab3ff615d0575b6f
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoids having to check what kind of store this is at runtime when fetching data.
Reviewed By: quark-zju
Differential Revision: D22492160
fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.
Reviewed By: quark-zju
Differential Revision: D22401865
fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
Summary:
We're seeing cases where cloning can take tens of GB of memory because
we pend all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.
10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.
This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case repack could
produce as many pack files as it was given, or more. If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
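A minimal sketch of the flush-every-N-adds idea, with a small threshold so it is easy to follow. `HistoryPackWriter` and its methods are hypothetical names; the in-memory lists stand in for pack files on disk.

```python
class HistoryPackWriter:
    """Buffers entries and writes a pack every `flush_threshold` adds."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.pending = []
        self.packs = []  # stands in for pack files written to disk

    def add(self, entry):
        self.pending.append(entry)
        if len(self.pending) >= self.flush_threshold:
            # Bound memory usage: write out what we have so far.
            self._write_pack()

    def _write_pack(self):
        if self.pending:
            self.packs.append(list(self.pending))
            self.pending.clear()

    def flush(self):
        """Write any remainder and return every pack produced so far."""
        self._write_pack()
        produced, self.packs = self.packs, []
        return produced
```

Note that `flush` returns a list of packs rather than a single one, which is the property callers such as repack have to be aware of.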
Reviewed By: xavierd
Differential Revision: D22438569
fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
Reviewed By: quark-zju
Differential Revision: D22464158
fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
Summary: Add an `edenapistore` class that wraps an `EdenApiHgIdRemoteStore`. This class is purely used as a means to set up the stores from Python code, and is only used as a way to get an `Arc<EdenApiHgIdRemoteStore>` to the Rust content store. It has no functionality of its own.
Reviewed By: quark-zju
Differential Revision: D22449702
fbshipit-source-id: ad2094c79da523071b6ed8344c8dde706e448c95
Summary: This is effectively a complete rewrite of the EdenAPI Python bindings to use the new client.
Reviewed By: quark-zju
Differential Revision: D22442903
fbshipit-source-id: c3cf2b2b8291e24d6d4d3a3546ccc69472510567
Summary:
A common pattern in Mercurial's data storage layer Python bindings is to have a Python object that wraps a Rust object. These Python objects are often passed across the FFI boundary to Rust code, which then may need to access the underlying Rust value.
Previously, the objects that used this pattern did so in an ad-hoc manner, typically by providing an `into_inner` or `to_inner` inherent method. This diff introduces a new `ExtractInner` trait that standardizes this pattern into a single interface, which in turn allows this pattern to be used with generics.
Reviewed By: quark-zju
Differential Revision: D22429347
fbshipit-source-id: cab4c24b8b98c6ef8307f72a9b4726aabdc829cc
Summary: Update the EdenAPI Python bindings to use the new client. This is mostly just a stopgap measure to allow us to delete the old client code; nothing in production actually uses these bindings anymore, and the new client will primarily be used from Rust.
Reviewed By: quark-zju
Differential Revision: D22379476
fbshipit-source-id: 953e0ffc2ce682869ee234d672a154046b373c1e
Summary:
We've seen a handful of users complaining about clone failing and not being
able to recover from it. From looking at the various reports and the
stacktraces, I believe this is caused by a flaky connection on the user end
that causes the Python code to retry the getpack calls. Before retrying, the
code will figure out what still needs fetching and this is done via the
getmissing API. When LFS pointers were fetched, the LFS blobs aren't yet
present on disk, and thus the underlying ContentStore::get_missing will return a set
of keys that contain some StoreKey::Content keys. The code would previously
fail at this point, but since the key also contains the original key, we can
simply return this; the pointers might be refetched but these are fairly small.
Taking a step back from this bug, the issue really is that the retry logic is
done in code that cannot understand content-keys, and moving it to a part of
the code that understands this would also resolve the issue.
I went with the simple approach for now, but since other remote stores
(EdenAPI, the LFS one, etc) would also benefit from the retry logic, we may
want to move the logic into Rust and remove the getmissing API from the Python
exposed ContentStore.
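Roughly, the simple approach amounts to the following. This is a Python model under assumptions: the tuple shapes are made up for illustration, and skipping content keys that carry no original key is a choice made for this sketch, not necessarily what the bindings do.

```python
def keys_to_refetch(missing):
    """Map missing store keys back to keys the Python retry logic understands.

    Each item is either ("hgid", key) or ("content", content_hash, original_key).
    """
    result = []
    for item in missing:
        if item[0] == "hgid":
            result.append(item[1])
        elif item[2] is not None:
            # Content key: fall back to the original key carried with it.
            # The pointer may be refetched, but pointers are small.
            result.append(item[2])
    return result
```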
Reviewed By: DurhamG
Differential Revision: D22425600
fbshipit-source-id: 69c2898cc302d2170cd0f206c89189c341db5278
Summary:
Mercurial's concept of the `null` revision (hardcoded as 20 zeros) is a
headache to special case. See https://www.mercurial-scm.org/wiki/RevsetVirtualRevisionPlan.
The Rust DAG layer cannot handle it. Make pydag drop the nullid or nullrev when
crossing the Python -> Rust boundary.
A cleaner way to handle `null` might be:
- Create a new vertex in the DAG in memory that has empty content.
Calculate its commit hash normally. The commit is isolated from other parts
of the commit graph. It has no parents and no children.
The vertex has an assigned Id, which is not zero if the repo is not empty.
- Assign the `null` special name (like how we do for `tip`) to the commit.
- Remove all hard-coded special cases of the 20-zero `nullid`.
That would allow things like `hg up null`, `hg diff -r null -r X` to continue
to work without special casing it in the commit graph layer.
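The boundary filtering that this diff does (dropping `nullid` / `nullrev` before data reaches Rust) can be pictured like this. The helper names are hypothetical; `NULL_REV = -1` is Mercurial's conventional null revision number.

```python
NULL_ID = b"\0" * 20  # the 20-zero-byte null node mentioned above
NULL_REV = -1         # Mercurial's null revision number


def nodes_for_rust(nodes):
    """Drop the null node before handing nodes to the Rust DAG."""
    return [n for n in nodes if n != NULL_ID]


def revs_for_rust(revs):
    """Drop the null revision before handing revs to the Rust DAG."""
    return [r for r in revs if r != NULL_REV]
```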
Reviewed By: sfilipco
Differential Revision: D22240188
fbshipit-source-id: 707af47cbf36a7df60097a17d69094aae89d3250
Summary:
Change pydag from using concrete `namedag` and `memnamedag` to trait objects:
- `commits`: High-level read-write commits storage, supports Rust `HgCommits`
(segmented changelog), `MemHgCommits`, and `RevlogCommits`.
- `dagalgo`: maps to the `DagAlgorithm` Rust trait.
- `idmap`: maps to the `IdConvert + PrefixLookup` Rust traits.
The idea is that we move the revlog / segmented changelog difference from Python
to behind Rust trait objects so the Python code looks overall cleaner, the Rust
revset alternative gets exercised early, and switching from revlog to segmented
changelog becomes easier.
Reviewed By: sfilipco
Differential Revision: D21796242
fbshipit-source-id: 3a4a3ff3d9e7e46059d1ed3461a55003c352e82d
Summary:
Similar to D7121487 (af8ecd5f80) but works for mutation store. This makes sure at the Rust
layer, mutation entries won't get lost after rebasing or metaediting a set of
commits where a subset of the commits being edited has mutation relations.
Unlike the Python layer, the Rust layer works for mutation chains. Therefore
some of the tests change.
Reviewed By: markbt
Differential Revision: D22174991
fbshipit-source-id: d62f7c1071fc71f939ec8771ac5968b992aa253c
Summary: Move old EdenAPI crate to `scm/lib/edenapi/old` to make room for the new crate. This old code will eventually be deleted once all references to it are removed from the codebase.
Reviewed By: quark-zju
Differential Revision: D22305173
fbshipit-source-id: 45d211340900192d0488543ba13d9bf84909ce53
Summary:
D21626209 (38d6c6a819) changed revlogindex to read `00changelog.i` on its own instead of
taking the data from Python. That turns out to be racy. The `00changelog.i`
might be changed between the Rust and Python reads and that caused issues.
This diff makes Python re-use the indexdata read by Rust so they are guaranteed
to be the same.
Reviewed By: DurhamG
Differential Revision: D22303305
fbshipit-source-id: 823bf3aefc970a4a6ce8ab58bccf972a78f6de70
Summary:
This will be used by the next change.
The reason we use a `buffer` or `memoryview` instead of Python `bytes` is to expose
the buffer in a zero-copy way. That is important for startup performance.
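To see the zero-copy difference from the Python side, here is a small standalone demonstration. A `bytearray` stands in for the buffer that would really be owned by Rust.

```python
data = bytearray(b"0123456789")

view = memoryview(data)[2:6]  # no bytes are copied; the view shares memory
copied = bytes(data[2:6])     # slicing plus bytes() makes an independent copy

data[2] = ord("X")

assert view.tobytes() == b"X345"  # the view observes the change
assert copied == b"2345"          # the copy does not
```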
Reviewed By: DurhamG
Differential Revision: D22303306
fbshipit-source-id: 3f7c8dff3575b998e025cd5940faa0c183b11626
Summary:
Fetches configs from a remote endpoint and caches them locally. If the
remote endpoint fails to respond, we use the cached version.
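The fetch-then-fall-back behaviour can be sketched as below. Everything here is an assumption for illustration: the `fetch` callable, the JSON cache format, the `OSError` failure mode and the temp-file-plus-rename write are not taken from the actual implementation.

```python
import json
import os
import tempfile


def load_config(fetch, cache_path):
    """`fetch` is a hypothetical callable returning a dict or raising OSError."""
    try:
        config = fetch()
    except OSError:
        # Remote endpoint failed: fall back to the cached copy, if any.
        try:
            with open(cache_path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}
    # Write via a temp file and rename so readers never see a partial cache.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(config, f)
    os.replace(tmp_path, cache_path)
    return config


def failing_fetch():
    raise OSError("endpoint unreachable")


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "remote_config.json")
    empty = load_config(failing_fetch, cache)                # no cache yet
    fresh = load_config(lambda: {"ui.color": "auto"}, cache)  # fetch + cache
    fallback = load_config(failing_fetch, cache)             # served from cache
```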
Reviewed By: quark-zju
Differential Revision: D22010684
fbshipit-source-id: bd6d4349d185d7450a3d18f9db2709967edc2971
Summary:
Whenever data is redacted on the server, a specific tombstone is returned when
fetching it. Make sure that whenever we update the file on disk, we write a
nice message to the user instead of the tombstone itself.
While this code could have been moved into the Rust store code itself, I prefer
to leave it to its users to decide what to do with redacted data. EdenFS for
instance may want to prevent access to it instead of showing the redacted
message.
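As a sketch of the caller-side substitution: both constants below are placeholders invented for this example, and the real tombstone value and message text are not shown here.

```python
# Hypothetical placeholder values, for illustration only.
REDACTED_TOMBSTONE = b"<hypothetical tombstone marker>"
REDACTED_MESSAGE = b"This version of the file is redacted.\n"


def content_for_disk(data):
    """Return what should be written to the working copy for `data`."""
    if data == REDACTED_TOMBSTONE:
        # Show a friendly message instead of the raw tombstone.
        return REDACTED_MESSAGE
    return data
```

Keeping this in the caller rather than in the store is what lets a different consumer make a different choice for the same tombstone.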
Reviewed By: kulshrax
Differential Revision: D21999345
fbshipit-source-id: 39a83cdf5ea4567628a13fbd59520b9677aba749
Summary: The tracing APIs and error context APIs can achieve similar effects.
Reviewed By: xavierd
Differential Revision: D22129585
fbshipit-source-id: 0626e3f4c1a552c69c046ff06ac36f5e98a6c3d8
Summary:
In the case where an expensive network operation is involved, holding the GIL
would mean that Mercurial cannot display progress bars, or simply cannot be
interrupted. This is a less than ideal user experience.
To fix this, let's release the GIL whenever we enter into the revisionstore
code.
Reviewed By: DurhamG
Differential Revision: D22140399
fbshipit-source-id: 131c8cf81e39128810e0f20d1922b5681a33d95a
Summary: Replace usages of whitelist/blacklist with include/exclude/filter/allow. These terms are more descriptive and less likely to contribute to racial stereotyping. More context: https://fb.workplace.com/groups/sourcecontrolteam/permalink/2926049127516414/
Reviewed By: kulshrax
Differential Revision: D22039298
fbshipit-source-id: 255c7389ee5ce5e54bbccdfb05ffa4cafc6958e5
Summary:
The anyhow error contains the context for the error which will lead to better
error messages for the user (and us). Previously, the code simply
used the Debug trait to print the error, and thus was missing context.
Reviewed By: DurhamG
Differential Revision: D21985745
fbshipit-source-id: 31c603d7f42e79a360541f39e4aaf0fcfbb9a14f
Summary:
With the internal streampager, progress bars must be sent on a separate stream so that
streampager can render them correctly.
Reviewed By: quark-zju
Differential Revision: D21906173
fbshipit-source-id: eb41b0bf22807d9cae518b3f676996ab1c642c6e
Summary:
Mostly copy-paste from code added in D19503373 and D19511574. Adjusted to match
the revlog index interface.
Reviewed By: sfilipco
Differential Revision: D21626201
fbshipit-source-id: 05d160e4c03d7e2482b6a4f2d68c3688ad78f568
Summary:
The NameSet is not really about Dag. It is about using Id and is static.
Rename it to clarify. In an upcoming change we'll have IdLazySet.
Reviewed By: sfilipco
Differential Revision: D21626204
fbshipit-source-id: 84f25008f7032f6e26a26fc656ccbcd2a5880ecf
Summary:
Previously, the NameSet has properties like "is_all", "is_topo_sorted", etc.
To make lazy sets efficient, it's important to have hints about min / max Ids
and maybe some other information.
Add a dedicated Hints structure for that.
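A rough Python model of what such a structure might carry and how min / max Ids help lazy sets. The field names mirror the properties mentioned above, but the layout and the `may_contain` helper are assumptions for this sketch, not the Rust definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hints:
    is_all: bool = False
    is_topo_sorted: bool = False
    min_id: Optional[int] = None  # None means "unknown"
    max_id: Optional[int] = None


def may_contain(hints, id_):
    """Cheap pre-check; False means `id_` is definitely not in the set."""
    if hints.min_id is not None and id_ < hints.min_id:
        return False
    if hints.max_id is not None and id_ > hints.max_id:
        return False
    return True
```

A lazy set can consult `may_contain` first and skip evaluating its contents entirely when the answer is `False`.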
Reviewed By: sfilipco
Differential Revision: D21626219
fbshipit-source-id: 845e88d3333f0f48f60f2739adae3dccc4a2dfc4
Summary:
Implements part of the dag IdMap related traits.
It does not get used yet, but eventually I'd like `pydag` to be able to work
with an abstracted dag including RevlogIndex.
Reviewed By: sfilipco
Differential Revision: D21626210
fbshipit-source-id: 53f19622f03fd71b76073dccf8dcc9b4778b40ca