Summary: Update `edenapi::Builder` to use the CA certificate bundle specified in the `[auth]` section of the user's config.
Reviewed By: quark-zju
Differential Revision: D22591034
fbshipit-source-id: 3a417adbf50ef7d2c538f4a032e54a038cbd282e
Summary: Allow specifying a CA certificate bundle in the `auth` section of an `hgrc`. This is useful for testing with locally-built servers using self-signed certificates.
Reviewed By: quark-zju
Differential Revision: D22591045
fbshipit-source-id: 023fe006267b0b781a1af16a7505e188c008a8c0
Summary:
Implements a basic Rust-Python binding layer for error metadata propagation.
We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
    fault = e.args[0].fault()
    typename = e.args[0].typename()
    message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.
Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below:
```
abort: intentional error for debugging with message 'intentional_error'
error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.
If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.
Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.
Introduced a simple integration test which exercises `debugthrowrustexception`.
Added a basic handler for `RustError` to `scmutil.py`.
Reviewed By: DurhamG
Differential Revision: D22517796
fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.
Reviewed By: xavierd
Differential Revision: D22564210
fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
Summary:
Add a metadata field to `read_res` containing a `revisionstore::Metadata` struct (which contains the object size and flags). The main purpose of this is to support LFS, which is indicated via a metadata flag.
Although this change affects the `DataEntry` struct which is serialized over the wire, version skew between the client and server should not break things since the field will automatically be populated with a default value if it is missing in the serialized response, and ignored if the client was built with an earlier version of the code without this field.
In practice, version skew isn't really a concern since this isn't used in production yet.
Reviewed By: quark-zju
Differential Revision: D22544195
fbshipit-source-id: 0af5c0565c17bdd61be5d346df008c92c5854e08
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
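As a rough illustration of the encoding step, here is a minimal standard-library sketch. It assumes that every byte outside the RFC 3986 "unreserved" set is escaped; the function names and the URL layout are hypothetical, and the actual client may rely on a dedicated crate with a different character set.

```rust
/// Percent-encode every byte of `name` that is not an RFC 3986 "unreserved"
/// character (ASCII letters, digits, '-', '.', '_', '~').
/// Hypothetical helper: the real client may use a different escape set.
fn percent_encode_repo_name(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    for &byte in name.as_bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            // Everything else, including each byte of a multi-byte UTF-8
            // sequence, becomes "%XX".
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build a request URL from a base URL, a repo name, and an endpoint.
/// The path layout here is an assumption made for the example.
fn repo_url(base: &str, repo: &str, endpoint: &str) -> String {
    format!(
        "{}/{}/{}",
        base.trim_end_matches('/'),
        percent_encode_repo_name(repo),
        endpoint
    )
}
```

With this sketch, a repo name containing a space or a slash cannot alter the path structure of the resulting URL.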
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: Makes the hostname available for dynamicconfig conditions.
Reviewed By: quark-zju
Differential Revision: D22537946
fbshipit-source-id: 630ee833bb3ec00253d718b3d03bbb8b3d49afca
Summary: Adds support for sharding based on user name.
Reviewed By: quark-zju
Differential Revision: D22537540
fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
Summary: We have several repos whose names contain characters other than alphanumerics, underscores, and hyphens, so we need to be more permissive about accepting repo names.
Reviewed By: quark-zju
Differential Revision: D22554846
fbshipit-source-id: e7bb030e0b8fb6aa275c119ba0aa540405b29186
Summary:
Previously, the EdenAPI stores would not report errors returned from the remote store. The intention behind this pattern in other stores is to prevent `KeyError`s from aborting the operation since the local store might still have the key.
However, in the case of the EdenAPI store, EdenAPI will simply omit missing keys in its response rather than returning an error. Instead, any error returned by the EdenAPI store indicates a more fundamental problem (e.g., unable to reach the server, connection reset, etc.) which should cause an abort and return the error.
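The distinction between "key missing" and "request failed" can be sketched as follows. This is a simplified model, not the store's actual code: `FetchError`, `fetch`, and `prefetch` are hypothetical names, and an absent server stands in for any transport failure.

```rust
use std::collections::HashMap;

/// Hypothetical transport-level failure (server unreachable, reset, ...).
#[derive(Debug, PartialEq)]
enum FetchError {
    Unreachable,
}

/// Stand-in for the remote call: missing keys are simply omitted from the
/// response, while `None` for the server models a transport failure.
fn fetch(
    server: Option<&HashMap<String, String>>,
    keys: &[&str],
) -> Result<HashMap<String, String>, FetchError> {
    let server = server.ok_or(FetchError::Unreachable)?;
    Ok(keys
        .iter()
        .filter_map(|k| server.get(*k).map(|v| (k.to_string(), v.clone())))
        .collect())
}

/// Returns the keys that were not found remotely. A transport error is
/// propagated with `?` instead of being swallowed.
fn prefetch(
    server: Option<&HashMap<String, String>>,
    keys: &[&str],
) -> Result<Vec<String>, FetchError> {
    let found = fetch(server, keys)?;
    Ok(keys
        .iter()
        .filter(|k| !found.contains_key(**k))
        .map(|k| k.to_string())
        .collect())
}
```

A missing key yields `Ok` with that key in the returned list, so a caller can still consult the local store; only a genuine failure yields `Err`.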
Reviewed By: quark-zju
Differential Revision: D22544031
fbshipit-source-id: e01e8d88b75e46dcebd2eef5203e3a0edde69fc7
Summary: When working with large CBOR responses, it is sometimes useful to limit processing to the first N entries to prevent the operation from taking a long time. This diff adds an option to the `read_res` tool to only look at the first N entries in a data or history response.
Reviewed By: quark-zju
Differential Revision: D22544451
fbshipit-source-id: 5e8e2c7212aa3b315a25bd4cf9273009a5e43f72
Summary: Some repos do not have `remotefilelog.reponame` set, so this shouldn't be a required config item.
Reviewed By: fanzeyi
Differential Revision: D22553141
fbshipit-source-id: a0fe9c289a1a32650572a4c123cda60af90e79ec
Summary:
Introduce new rust library, taggederror, which contains utilities for attaching metadata to errors. The library provides two main methods for attaching metadata to an error, the TaggedError wrapper type, and the AnyhowExt trait methods. Provides a struct, CommonMetadata, which contains all the metadata types introduced by taggederror (fault, transience, category, and typename), which can also be attached individually (and the same pattern can be used to attach other metadata).
Introduce a new native rust command, debugthrowrustexception, which causes the command to return an error, with some attached metadata.
Modify hg rust native command dispatch error handling to use debug formatter to print anyhow::Error errors. This will print out the source chain, contexts, and backtrace if available, which will cause the metadata we attach as a wrapper error or context to be printed.
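To make the wrapper idea concrete, here is a standard-library-only sketch of attaching metadata to an error and recovering it from the source chain. It does not use anyhow and is not the taggederror API: the `Fault` variants, the field layout, and `find_metadata` are assumptions made for the example, and only two of the four metadata kinds are modelled.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical fault classification; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Fault {
    Request,
    Internal,
    Dependency,
}

/// Simplified stand-in for CommonMetadata (fault and typename only).
#[derive(Debug, Default, Clone, PartialEq)]
struct CommonMetadata {
    fault: Option<Fault>,
    typename: Option<&'static str>,
}

/// Wrapper that carries metadata alongside the original error.
#[derive(Debug)]
struct TaggedError {
    source: Box<dyn Error + Send + Sync>,
    metadata: CommonMetadata,
}

impl fmt::Display for TaggedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{}\nerror has type name {:?} and fault {:?}",
            self.source, self.metadata.typename, self.metadata.fault
        )
    }
}

impl Error for TaggedError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Walk the source chain and return the first metadata found, if any.
fn find_metadata<'a>(mut err: &'a (dyn Error + 'static)) -> Option<&'a CommonMetadata> {
    loop {
        if let Some(tagged) = err.downcast_ref::<TaggedError>() {
            return Some(&tagged.metadata);
        }
        err = err.source()?;
    }
}
```

Because the metadata lives in a wrapper on the source chain, any formatter that prints the chain will also surface it.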
Reviewed By: DurhamG
Differential Revision: D22420941
fbshipit-source-id: d38c5a10b686d86b69a2c0a19f5bcbf4ca24dff6
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host. Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.
Reviewed By: quark-zju
Differential Revision: D22535509
fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
Summary:
Previously we had no way of specifying an optional string flag. This
adds support.
I considered making the implementation more generic, so it'd support
Option<i64> and potentially Option<bool> but it introduced some complexity and
didn't seem worth the effort for now.
Reviewed By: quark-zju
Differential Revision: D22535511
fbshipit-source-id: 04d7b5419ca7ae44a9aeff1a5cea2c3043d80042
Summary:
Follow up on this diff: D22432330 (b7817ffbd8)
Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.
Reviewed By: farnz
Differential Revision: D22511368
fbshipit-source-id: e4be20e3112a8e8829298d5748657e9bdbde8588
Summary: Add an EdenAPI-backed history store. Notably, thanks to the strongly-typed remote store design from the previous diff, it is not possible to construct an `EdenApiHistoryStore` for trees, even when the underlying remote store is behind a trait object. (This is because EdenAPI does not support fetching history for trees.)
Reviewed By: quark-zju
Differential Revision: D22492162
fbshipit-source-id: 23f1393919c4e8ac0918d2009a16f482d90df15c
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoids having to check what kind of store this is at runtime when fetching data.
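The marker-type pattern can be sketched like this. Only the idea of a type parameter `T` comes from the summary; `File`, `Tree`, `StoreKind`, `RemoteStore`, and the endpoint strings are hypothetical names chosen for the example.

```rust
use std::marker::PhantomData;

/// Marker types: they carry no data and exist only at the type level.
struct File;
struct Tree;

/// Hypothetical trait tying each marker to the kind of data it fetches.
trait StoreKind {
    const ENDPOINT: &'static str;
}

impl StoreKind for File {
    const ENDPOINT: &'static str = "files";
}

impl StoreKind for Tree {
    const ENDPOINT: &'static str = "trees";
}

/// A store parameterized by what it fetches; `T` is never stored.
struct RemoteStore<T> {
    repo: String,
    _kind: PhantomData<T>,
}

impl<T: StoreKind> RemoteStore<T> {
    fn new(repo: &str) -> Self {
        RemoteStore {
            repo: repo.to_string(),
            _kind: PhantomData,
        }
    }

    /// Resolved at compile time; no runtime check of the store kind.
    fn data_endpoint(&self) -> String {
        format!("{}/{}", self.repo, T::ENDPOINT)
    }
}

/// Operations that only make sense for files live in a file-only impl,
/// so calling them on a tree store is a compile error.
impl RemoteStore<File> {
    fn history_endpoint(&self) -> String {
        format!("{}/history", self.repo)
    }
}
```

In this sketch, `RemoteStore::<Tree>::new("repo").history_endpoint()` does not compile, which is the same kind of guarantee that a files-only history store relies on.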
Reviewed By: quark-zju
Differential Revision: D22492160
fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
Summary: Move `src/edenapi.rs` to `src/edenapi/mod.rs` in anticipation of adding more files to this module.
Reviewed By: quark-zju
Differential Revision: D22492161
fbshipit-source-id: f6252ea9a9e32d94029b8e6e76be5d9d1754f63d
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.
Reviewed By: quark-zju
Differential Revision: D22401865
fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
Summary:
We're seeing cases where cloning can take tens of GB of memory because
we pend all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.
10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.
This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case repack could
produce as many pack files as it was given (or more). If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
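A toy model of the bounded flush, under the assumption that a "pack" is just a batch of entries held in a `Vec`. `BoundedPackWriter` and its methods are hypothetical names; the real code writes pack files to disk.

```rust
/// Accumulates entries and cuts a new "pack" whenever `max_pending`
/// entries are buffered, so memory use stays bounded.
struct BoundedPackWriter {
    max_pending: usize,
    pending: Vec<String>,
    packs: Vec<Vec<String>>,
}

impl BoundedPackWriter {
    fn new(max_pending: usize) -> Self {
        assert!(max_pending > 0, "threshold must be positive");
        BoundedPackWriter {
            max_pending,
            pending: Vec::new(),
            packs: Vec::new(),
        }
    }

    fn add(&mut self, entry: &str) {
        self.pending.push(entry.to_string());
        if self.pending.len() >= self.max_pending {
            self.write_pack();
        }
    }

    /// Move the buffered entries into a finished pack, if there are any.
    fn write_pack(&mut self) {
        if !self.pending.is_empty() {
            let pack = std::mem::take(&mut self.pending);
            self.packs.push(pack);
        }
    }

    /// Final flush. Callers (such as repack) must be prepared to receive
    /// more than one pack.
    fn flush(mut self) -> Vec<Vec<String>> {
        self.write_pack();
        self.packs
    }
}
```

Adding five entries with a threshold of two yields three packs, which shows why a caller can no longer assume one pack per flush.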
Reviewed By: xavierd
Differential Revision: D22438569
fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
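The relationship between the two traits might look roughly like the sketch below. The method signatures are simplified assumptions (the real bindings presumably involve cpython-rs types, which are left out here), and `Store`, `PyStoreWrapper`, and `describe` are hypothetical.

```rust
use std::sync::Arc;

/// Borrowing accessor: the only trait most wrappers implement by hand.
trait ExtractInnerRef {
    type Inner;
    fn extract_inner_ref(&self) -> &Self::Inner;
}

/// Owning accessor, layered on top of the borrowing one.
trait ExtractInner: ExtractInnerRef {
    fn extract_inner(&self) -> Self::Inner;
}

/// Blanket implementation for any wrapper whose inner value is
/// `Clone + 'static`: clone the borrowed value to get an owned one.
impl<T> ExtractInner for T
where
    T: ExtractInnerRef,
    T::Inner: Clone + 'static,
{
    fn extract_inner(&self) -> T::Inner {
        self.extract_inner_ref().clone()
    }
}

/// Hypothetical Rust value and the wrapper object that holds it.
struct Store {
    name: String,
}

struct PyStoreWrapper {
    inner: Arc<Store>,
}

impl ExtractInnerRef for PyStoreWrapper {
    type Inner = Arc<Store>;
    fn extract_inner_ref(&self) -> &Arc<Store> {
        &self.inner
    }
}

/// Generic code can now accept any wrapper around an `Arc<Store>`.
fn describe<T>(wrapper: &T) -> String
where
    T: ExtractInner<Inner = Arc<Store>>,
{
    wrapper.extract_inner().name.clone()
}
```

Since the inner value here is an `Arc`, the owned extraction is a cheap reference-count bump rather than a deep copy.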
Reviewed By: quark-zju
Differential Revision: D22464158
fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
Summary: Ensure that all of the components of an EdenAPI response use the same error type.
Reviewed By: quark-zju
Differential Revision: D22443029
fbshipit-source-id: 3e00a8b83677beb5ef2d90630fe9b85760874186
Summary: Add an `add_entry` convenience method to `HgIdMutableDeltaStore`, similar to the one present in `HgIdMutableHistoryStore`.
Reviewed By: quark-zju
Differential Revision: D22443031
fbshipit-source-id: 84fdaae9fbd51e6f2df466b0441ec5f7ce6715f7
Summary:
A common pattern in Mercurial's data storage layer Python bindings is to have a Python object that wraps a Rust object. These Python objects are often passed across the FFI boundary to Rust code, which then may need to access the underlying Rust value.
Previously, the objects that used this pattern did so in an ad-hoc manner, typically by providing an `into_inner` or `to_inner` inherent method. This diff introduces a new `ExtractInner` trait that standardizes this pattern into a single interface, which in turn allows this pattern to be used with generics.
Reviewed By: quark-zju
Differential Revision: D22429347
fbshipit-source-id: cab4c24b8b98c6ef8307f72a9b4726aabdc829cc
Summary:
D22396026 made it so that `HttpClient::send_async` no longer consumes `self`. This means that instead of creating a new HTTP client for each request, we can reuse the same one.
This has the benefit of allowing for connection reuse (which was the point of D22396026), resulting in lower latency for serial fetches.
Reviewed By: quark-zju
Differential Revision: D22397768
fbshipit-source-id: 9d066c1ec64a6aa1b36ec674ef294030c1f90b41
Summary: Allow passing multiple JSON requests to the EdenAPI CLI. The requests will be performed serially, which allows for testing the performance of serial EdenAPI calls.
Reviewed By: quark-zju
Differential Revision: D22397769
fbshipit-source-id: c59e5abf53eee9c2014010672183e202b6f180fc
Summary:
Add a pool of `Multi` handles that the client can reuse across requests.
Previously, `HttpClient`'s async functions had to consume the client in order to have a `'static` lifetime (since `Future`s generally cannot hold references to things outside of themselves). This meant that each async operation would use its own `Multi` handle, preventing connection reuse across operations since the `Multi` handle maintains a connection cache internally.
With this change, the client can reuse the `Multi` session after an async operation, thereby benefitting from libcurl's caching. Note that the same `Multi` handle still cannot be used by concurrently running `Future`s (as this [would not be thread safe](https://curl.haxx.se/libcurl/c/threadsafe.html)), but once a `Future` has completed its `Multi` handle will return to the pool for use by subsequent requests.
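The pooling scheme can be sketched with a plain `Mutex<Vec<_>>` and a guard that returns its handle on drop. `Handle` is a stand-in for an expensive object with an internal connection cache (such as a `Multi` handle); `Pool`, `Pooled`, and the method names are hypothetical, not the actual `http_client` types.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a handle that caches connections internally.
struct Handle {
    id: usize,
    uses: usize,
}

#[derive(Default)]
struct PoolState {
    idle: Vec<Handle>,
    created: usize,
}

/// Cheaply cloneable pool; all clones share the same idle list.
#[derive(Clone, Default)]
struct Pool {
    state: Arc<Mutex<PoolState>>,
}

/// Exclusive ownership of one handle for the duration of an operation.
struct Pooled {
    handle: Option<Handle>,
    pool: Pool,
}

impl Pool {
    /// Reuse an idle handle if one exists, otherwise create a new one.
    fn acquire(&self) -> Pooled {
        let mut state = self.state.lock().unwrap();
        let handle = match state.idle.pop() {
            Some(handle) => handle,
            None => {
                state.created += 1;
                Handle { id: state.created, uses: 0 }
            }
        };
        Pooled { handle: Some(handle), pool: self.clone() }
    }

    fn created(&self) -> usize {
        self.state.lock().unwrap().created
    }
}

impl Pooled {
    /// Pretend to perform a request; returns the id of the handle used.
    fn use_handle(&mut self) -> usize {
        let handle = self.handle.as_mut().expect("handle present until drop");
        handle.uses += 1;
        handle.id
    }
}

impl Drop for Pooled {
    /// Return the handle to the pool so a later request can reuse it.
    fn drop(&mut self) {
        if let Some(handle) = self.handle.take() {
            self.pool.state.lock().unwrap().idle.push(handle);
        }
    }
}
```

Two sequential operations end up on the same handle, while two overlapping operations each get their own, so no handle is ever shared by concurrent users.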
---
(Somewhat tangential)
As is noted in the code comments, `libcurl`'s C API provides a way to share caches across multiple multi sessions: [the "share" interface](https://curl.haxx.se/libcurl/c/libcurl-share.html).
While using this would seem preferable to an ad-hoc solution like this diff, it turns out that the `curl` crate does not provide safe bindings to the share interface. This means that in order to use the share interface, we'd need to directly use the unsafe bindings from `curl-sys`.
In addition to the difficulty of working with unsafe FFI code, the API expects the application to handle synchronization by passing it function pointers to handle locking/unlocking shared resources.
Ultimately, I came to the conclusion that managing lifetimes and synchronization in unsafe code across an FFI boundary would be nontrivial, and ensuring correctness would require a lot of effort that could be avoided by implementing an ad-hoc solution on top of the safe API instead. However, it might make sense to change this to use the share interface in the future.
Reviewed By: quark-zju
Differential Revision: D22396026
fbshipit-source-id: 06eea2ffacdc791527eac9ce4becc457af5c0480
Summary: Update the `revisionstore` and `backingstore` crates to use the new EdenAPI crate.
Reviewed By: quark-zju
Differential Revision: D22378330
fbshipit-source-id: 989f34827b744ff4b4ac0aa10d004f03dbe9058f
Summary: Add a new `EdenApiBlocking` trait that exposes blocking versions of the `EdenApi` trait's methods, for use in non-async code.
Reviewed By: quark-zju
Differential Revision: D22305396
fbshipit-source-id: d0f3a73cad1a23a4f0892a17f18267374e63108e
Summary:
This diff adds an EdenAPI CLI program that allows manually sending requests to the server.
Requests are read from stdin in a JSON format (the same format used by the `make_req` tool and the EdenAPI server integration tests). This makes it easy to create and edit requests during debugging.
Responses are re-serialized as CBOR and written to stdout. (The program will refuse to write output if stdout is a TTY.) These responses can then be analyzed using the `read_res` tool (also used by the EdenAPI server integration tests).
The program prints real-time download statistics during data fetching, allowing the user to debug performance in addition to correctness.
The program uses standard `hgrc` files to configure the EdenAPI client, which means that one can simulate production settings by specifying a production `hgrc`. By default, it will read from `~/.hgrc.edenapi` rather than `~/.hgrc` since the user will most likely want to configure this program independently of Mercurial.
Reviewed By: quark-zju
Differential Revision: D22370163
fbshipit-source-id: 5d9974bc05fa960d26cd2c87810f4646e2bc55b4
Summary:
Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.
When using the repo_import tool (https://fburl.com/diffusion/8p6fhjt2) we have a libgit2-sys dependency for importing git repos. However, when we derive blame data types, we need to use xdiff functionality (from_no_parents: https://fburl.com/diffusion/pitukmyo -> diff_hunks: https://fburl.com/diffusion/9f8caan9 -> xdl_diff: https://fburl.com/diffusion/260x66hf). Both libgit2 and eden/scm have vendored versions of the xdiff library. Therefore, libgit2-sys and eden/scm define functions with the same signatures but different behaviours, and when we tried to derive blame, it used libgit2-sys's xdl_diff instead of eden's. This resulted in segfaults (https://fburl.com/paste/04gwalpo).
Note: repo_import is the first tool that has tried to import both and the first to run into this issue.
Reviewed By: StanislavGlebik
Differential Revision: D22432330
fbshipit-source-id: f2b965f3926a2dc45de1bf20e41dad70ca09cdfd
Summary:
This diff is a complete, ground-up rewrite of the EdenAPI client. Rather than attempting to use `libcurl` directly, it relies on the new `http_client` crate, which makes the code considerably simpler and allows for a proper async interface.
The most notable change is that `EdenApi` is now an async trait. A blocking API is added later in the stack for use in non-async contexts.
Reviewed By: quark-zju
Differential Revision: D22305397
fbshipit-source-id: 4c1e5d3091d6dd04cf13291e7b7a4217dfdd249f
Summary:
As was pointed out in the review for D22280745 (d73c63d862), `CborStream` is inefficient in situations where the underlying stream produces chunks that are much smaller than the size of the serialized items. To avoid pathological behavior, make `CborStream` buffer the incoming data, and only attempt deserialization if enough data has accumulated.
For now, the buffer size is fixed (with a default of 1MB, chosen arbitrarily). In the future, it might make sense to have the stream adjust the buffer size based on the average size of observed deserialized values.
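The buffering strategy can be illustrated without a CBOR dependency. The sketch below swaps CBOR for a trivial framing (one length byte followed by the payload) purely so that it runs on the standard library alone; `BufferedDecoder` and its methods are hypothetical and do not reflect `CborStream`'s actual interface.

```rust
/// Buffers incoming chunks and only attempts to decode once at least
/// `threshold` bytes have accumulated (or when the stream ends).
struct BufferedDecoder {
    threshold: usize,
    buf: Vec<u8>,
    decode_attempts: usize,
}

impl BufferedDecoder {
    fn new(threshold: usize) -> Self {
        BufferedDecoder { threshold, buf: Vec::new(), decode_attempts: 0 }
    }

    /// Feed one chunk; returns any items that were decoded.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.buf.extend_from_slice(chunk);
        if self.buf.len() < self.threshold {
            // Not enough data yet: skip the (potentially expensive) attempt.
            return Vec::new();
        }
        self.drain_items()
    }

    /// Call at end of stream to decode whatever is left in the buffer.
    fn finish(&mut self) -> Vec<Vec<u8>> {
        self.drain_items()
    }

    /// Decode as many complete frames as possible, keeping any trailing
    /// partial frame in the buffer for the next attempt.
    fn drain_items(&mut self) -> Vec<Vec<u8>> {
        self.decode_attempts += 1;
        let mut items = Vec::new();
        let mut pos = 0;
        while pos < self.buf.len() {
            let len = self.buf[pos] as usize;
            let end = pos + 1 + len;
            if end > self.buf.len() {
                break;
            }
            items.push(self.buf[pos + 1..end].to_vec());
            pos = end;
        }
        self.buf.drain(..pos);
        items
    }
}
```

With small chunks and a larger threshold, the number of decode attempts tracks the amount of data received rather than the number of chunks.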
Reviewed By: quark-zju
Differential Revision: D22370164
fbshipit-source-id: ed940c56ca2cbbfc07f01d47becf6f1d71872872
Summary: On Windows a mmap file cannot be replaced. Detect that and delete manually.
Reviewed By: farnz
Differential Revision: D22428731
fbshipit-source-id: 4d308a07aae02dcaf2aedb7b0267a535c2e09c92
Summary: The 0.3 version (currently being used only in one crate, eden/scm/lib/commitcloudsubscriber) uses an old openssl crate which doesn't work with the OpenSSL library installed on most machines (both in FB and on GitHub Actions).
Reviewed By: mitrandir77
Differential Revision: D22430649
fbshipit-source-id: b8fa930841dbcdd4c085d8c9488d768b3526e1c4
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.
Reviewed By: dtolnay
Differential Revision: D22403809
fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
Summary:
The key is "hit_count" not "hits". This typo caused the trace to always claim
that no data was fetched from memcache, which is obviously not true as the
getpack trace that follows listed significantly fewer requested keys.
Reviewed By: kulshrax
Differential Revision: D22401592
fbshipit-source-id: ab2ea3e7f8ff3a9c7322678afc8a174e09d6dc09
Summary:
If the revlog on disk was changed to include new commits, read them and avoid
writing duplicated commits (which breaks nodemap building).
Reviewed By: sfilipco
Differential Revision: D22323187
fbshipit-source-id: cdd65f31e65865d9f3868e43416633297896c0f9
Summary: This is used by the next diff.
Reviewed By: sfilipco
Differential Revision: D21944139
fbshipit-source-id: 184c4e97aaeca36c3608665defd1473c9300fb5b
Summary: This will satisfy some use-cases.
Reviewed By: sfilipco
Differential Revision: D21854225
fbshipit-source-id: 76758716b35cfd31dc3843c118917c0fb7609027
Summary: This will help move more Python logic to Rust.
Reviewed By: sfilipco
Differential Revision: D21854224
fbshipit-source-id: b03cbacedc11d77e8c56262437a8d10bd9a89e59
Summary: This was discovered by using it from the Python side.
Reviewed By: sfilipco
Differential Revision: D22323186
fbshipit-source-id: 295811e0950b94ad2ad73ad242228b6a3f9765d0
Summary: Adding the same commit multiple times is a no-op.
Reviewed By: sfilipco
Differential Revision: D22323190
fbshipit-source-id: 61a06335581a9cad32dc7e929b841ec69b551a9c
Summary: This adds some test coverage for the revlog DagAlgorithm implementation.
Reviewed By: sfilipco
Differential Revision: D22249157
fbshipit-source-id: a1d347b4d90d0e7f8fb229c317cc75c2b8e16242