Summary:
Introduce separate wire types to allow protocol evolution and client API changes to happen independently.
* Duplicate `*Request`, `*Entry`, `Key`, `Parents`, `RepoPathBuf`, `HgId`, and `revisionstore_types::Metadata` types into the `wire` module. The versions in the `wire` module are required to have proper `serde` annotations, `Serialize` / `Deserialize` implementations, etc. These have been removed from the original structs.
* Introduce infallible conversions from "API types" to "wire types" with the `ToWire` trait and fallible conversions from "wire types" to "API types" with the `ToApi`. API -> wire conversions should never fail in a binary that builds succesfully, but wire -> API conversions can fail in the case that the server and client are using different versions of the library. This will cause, for instance, a newly-introduced enum variant used by the client to be deserialized into the catch-all `Unknown` variant on the server, which won't generally have a corresponding representation in the API type.
* Cleanup: remove `*Response` types, which are no longer used anywhere.
* Introduce a `map` method on `Fetch` struct which allows a fallible conversion function to be used to convert a `Fetch<T>` to a `Fetch<U>`. This function is used in the edenapi client implementation to convert from wire types to API types.
* Modify `edenapi_server` to convert from API types to wire types.
* Modify `edenapi_cli` to convert back to wire types before serializing responses to disk.
* Modify `make_req` to use `ToWire` for converting API structs from the `json` module to wire structs.
* Modify `read_res` to use `ToApi` to convert deserialized wire types to API types with the necessary methods for investigating the contents (`.data()`, primarily). It will print an error message to stderr if it encounters a wire type which cannot be converted into the corresponding API type.
* Add some documentation about protocol conventions to the root of the `wire` module.
Reviewed By: kulshrax
Differential Revision: D23224705
fbshipit-source-id: 88f8addc403f3a8da3cde2aeee765899a826446d
Summary: Add log messages for debugging using the `tracing` crate, which allows them to be enabled via `env_logger`.
Reviewed By: quark-zju
Differential Revision: D23858076
fbshipit-source-id: a8ef1afac6c9ecbfb5d6d78232aa0d03a2fe2054
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.
Reviewed By: kulshrax
Differential Revision: D23577867
fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it is different from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all of projects in the Eden SCM repo, for now, `http_client` should be made consistent with the adjacent crates.
Reviewed By: sfilipco
Differential Revision: D23585721
fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
Summary:
Replacing places where the tokio runtime is instantiated inside the edenapi
client crate.
Reviewed By: quark-zju
Differential Revision: D23468596
fbshipit-source-id: ef68718c7d5b89b6477a2946daaa51618b53d06a
Summary:
There aren't too many thigs that we can do with the responses that we get back
from the server. Thigs are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.
Reviewed By: kulshrax
Differential Revision: D23456220
fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
Summary:
To reduce the size over the wire on cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.
Reviewed By: kulshrax
Differential Revision: D23456216
fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.
Reviewed By: kulshrax
Differential Revision: D23456221
fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.
Reviewed By: dtolnay
Differential Revision: D23335122
fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
Summary:
Aux data wire protocol part 1: field annotations & basic compatibility model.
Annotates fields in `file`, `tree`, and `complete_tree` wire structs with `#[serde(rename = "N", default, skip_serializing_if = "is_default")]`. I've avoided using `#[serde(default)]` on the container structs themselves because this can cause some confusion / incorrect behavior if not used carefully. Consider a wire struct `FooRequest` with a field of type `Option<Bar>`. `Option<Bar>` defaults to `None`. If `FooRequest`'s `Default` implenentation sets the field's default to `Some(bar)`, a `FooRequest` explicitly constructed with `None` for the field will be serialized with the field omitted (because it passes `is_default`) and will be deserialized on the server as `Some(bar)`, causing incorrect behavior. To address this, we'd need to change the `is_default` function used with `skip_serializing_if` to check against the field's default value as set by the container, which isn't trivially possible without some sort of reflection (please correct me if you know a good way to achieve this). This is unfortunate, as it'd be very desirable for the container to be able to set defaults different from the individual field type defaults, for cases where one boolean, for instance, should default to true. As-is, we'd need to address this with wrapper types instead, where we can fully control the `Default` implenentation.
We can, of course, address this by providing an alternate `skip_serializing_if` function to fields with default that doesn't match that set by the container. This will need to be done carefully, though, to avoid the issue I described above.
Currently the JSON module manually serializes and deserializes all the top-level request objects, so the rename annotation doesn't impact it. We can add `#[serde(alias = "rustfieldname")]` if we'd like the server and client to be able to accept manually-crafted requests and responses with explicit field names. This could also be useful to replace the manual parsing in the JSON module, but can't replace the manual serialization in a clean way. We'd need to introduce a second copy of the wire types, without the serde `rename` attribute, to allow serializing with the actual rust field name.
I've only modified the `tree`, `file`, and `complete_tree` modules. I intend to eventually update the rest of the edenapi protocol later on, when the implementation of `file` and `tree` are complete / stable. This will give us a chance to fix any mistakes before copying the design to more places.
Note: I do not intend to keep to proper wire protocol compatibility at this stage in the implementation. Expect field numbers to be re-used by non-compatible changes.
Reviewed By: kulshrax
Differential Revision: D23172756
fbshipit-source-id: 39976ed4bede892bd6981f9c3f23557a91f9028b
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.
Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to as a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
* `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
* `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
* `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.
Reviewed By: kulshrax
Differential Revision: D22958373
fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
Summary: Client portion for the commit/revlog_data endpoint that was added to the server.
Reviewed By: kulshrax
Differential Revision: D23065989
fbshipit-source-id: 3115ad2b426daca22472e2106fcd293f3ccd70f3
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repostories
need to have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.
This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and reponse should have.
Reviewed By: kulshrax
Differential Revision: D22702016
fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
Summary: Allow the user to specify extra HTTP headers that should be sent with each EdenAPI request in the `edenapi.headers` config option. The field is expected to be a JSON object whose key-value pairs are used as header keys and values.
Reviewed By: quark-zju
Differential Revision: D22591870
fbshipit-source-id: ac1bb669270d667895554dcc5f7176d18736375c
Summary: Include the name of bad config fields in the error message so the user can more easily fix the problem.
Reviewed By: quark-zju
Differential Revision: D22591871
fbshipit-source-id: e23e2c71e49e0458e7ea5c13e7feac3a990ead0c
Summary: Update `edenapi::Builder` to use the CA certificate bundle specified in the `[auth]` section of the user's config.
Reviewed By: quark-zju
Differential Revision: D22591034
fbshipit-source-id: 3a417adbf50ef7d2c538f4a032e54a038cbd282e
Summary:
Add a metadata field to `read_res` containing a `revisionstore::Metadata` struct (which contains the object size and flags). The main purpose of this is to support LFS, which is indicated via a metadata flag.
Although this change affects the `DataEntry` struct which is serialized over the wire, version skew between the client and server should not break things since the field will automatically be populated with a default value if it is missing in the serialized response, and ignored if the client was built with an earlier version of the code without this field.
In practice, version skew isn't really a concern since this isn't used in production yet.
Reviewed By: quark-zju
Differential Revision: D22544195
fbshipit-source-id: 0af5c0565c17bdd61be5d346df008c92c5854e08
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: We have several repos whose names contain various non-alphanumeric/underscore/hyphen characters, so we need to be more permissive about accepting repo names.
Reviewed By: quark-zju
Differential Revision: D22554846
fbshipit-source-id: e7bb030e0b8fb6aa275c119ba0aa540405b29186
Summary: When working with large CBOR responses, it is sometimes useful to limit processing to the first N entries to prevent the operation from taking a long time. This diff adds an option to the `read_res` tool to only look at the first N entries in a data or history response.
Reviewed By: quark-zju
Differential Revision: D22544451
fbshipit-source-id: 5e8e2c7212aa3b315a25bd4cf9273009a5e43f72
Summary: Ensure that all of the components of an EdenAPI response use the same error type.
Reviewed By: quark-zju
Differential Revision: D22443029
fbshipit-source-id: 3e00a8b83677beb5ef2d90630fe9b85760874186
Summary:
D22396026 made it so that `HttpClient::send_async` no longer consumes `self`. This means that instead of creating a new HTTP client for each request, we can reuse the same one.
This has the benefit of allowing for connection reuse (which was the point of D22396026), resulting in lower latency for serial fetches.
Reviewed By: quark-zju
Differential Revision: D22397768
fbshipit-source-id: 9d066c1ec64a6aa1b36ec674ef294030c1f90b41
Summary: Allow passing multiple JSON requests to the EdenAPI CLI. The requests will be performed serially, which allows for testing the performance of serial EdenAPI calls.
Reviewed By: quark-zju
Differential Revision: D22397769
fbshipit-source-id: c59e5abf53eee9c2014010672183e202b6f180fc
Summary: Update the `revisionstore` and `backingstore` crates to use the new EdenAPI crate.
Reviewed By: quark-zju
Differential Revision: D22378330
fbshipit-source-id: 989f34827b744ff4b4ac0aa10d004f03dbe9058f
Summary: Add a new `EdenApiBlocking` trait that exposes blocking versions of the `EdenApi` trait's methods, for use in non-async code.
Reviewed By: quark-zju
Differential Revision: D22305396
fbshipit-source-id: d0f3a73cad1a23a4f0892a17f18267374e63108e
Summary:
This diff adds an EdenAPI CLI program that allows manually sending requests to the server.
Requests are read from stdin in a JSON format (the same format used by the `make_req` tool and the EdenAPI server integration tests). This makes it easy to create and edit requests during debugging.
Responses are re-serialized as CBOR and written to stdout. (The program will refuse to write output if stdout is a TTY.) These responses can then be analyzed using the `read_res` tool (also used by the EdenAPI server integration tests).
The program prints real-time download statistics during data fetching, allow the user to debug performance in addition to correctness.
The program uses standard `hgrc` files to configure the EdenAPI client, which means that one can simulate production settings by specifying a production `hgrc`. By default, it will read from `~/.hgrc.edenapi` rather than `~/.hgrc` since the user will most likely want to configure this program independently of Mercurial.
Reviewed By: quark-zju
Differential Revision: D22370163
fbshipit-source-id: 5d9974bc05fa960d26cd2c87810f4646e2bc55b4
Summary:
This diff is a complete, ground-up rewrite of the EdenAPI client. Rather than attempting to use `libcurl` directly, it relies on the new `http_client` crate, which makes the code considerably simpler and allows for a proper async interface.
The most notable change is that `EdenApi` is now an async trait. A blocking API is added later in the stack for use in non-async contexts.
Reviewed By: quark-zju
Differential Revision: D22305397
fbshipit-source-id: 4c1e5d3091d6dd04cf13291e7b7a4217dfdd249f
Summary: Use `thiserror` to provide a more ergonomic API for `DataEntry` hash checking. The `.data()` method now simply returns a `Result` rather than a tuple with an ad-hoc enum.
Reviewed By: quark-zju
Differential Revision: D22376164
fbshipit-source-id: fc39cb212ec1ee5830292db4aa5eca18f2c16a2b
Summary: Tidy up some comments in this file.
Reviewed By: ikostia
Differential Revision: D22376165
fbshipit-source-id: ce4760776048aa8e72809b4f828d0ea426fcf878
Summary: Add a `ToJson` trait as a counterpart to the `FromJson` trait introduced in the last diff. The primary use case for this will be to allow recording and replaying the data fetches that occur during Mercurial operations. A JSON representation was chosen so that the format will be directly compatible with tools like `make_req` used in EdenAPI integration tests.
Reviewed By: markbt
Differential Revision: D22344599
fbshipit-source-id: 52c888bde93a8e86b6dd76cb862337f716b007eb
Summary:
Add a `FromJson` trait to `edenapi_types` and use it instead of the parsing functions when parsing requests from JSON.
Some design commentary:
I've created a custom trait rather than using `TryFrom<serde_json::Value>` for two reasons:
- From a design standpoint, I'd like users to have to explicitly opt-in to this functionality by importing this `FromJson` trait, since this is different from the usual way of deserializing with serde, and it might cause confusion.
- From an implementation standpoint, it turns out that using `TryFrom` in a trait bound causes difficulties with type inference in some situations (in particular, around the associated `Error` type), which necessitates type annotations, making the API less ergonomic to use.
Why not just use `serde::Deserialize` directly?
- The representation here doesn't actually directly match the structure of the underlying type. Instead, the JSON representation has been designed to be easy for humans to edit when writing tests. (For example, keys are replaced with simply arrays, and hashes are represented as hex rather than as byte arrays.)
- In this case, we'd like to work with `serde_json::Value` rather than going directly between bytes and structs, since this allows for a level of dynamism that will be useful later in the stack.
Reviewed By: markbt
Differential Revision: D22320576
fbshipit-source-id: 64b6bed42e1ec599a0da61ae5d55feb7c90101a4
Summary: Add `Arbitrary` implementations for `DataRequest`, `HistoryRequest`, and `CompleteTreeRequest` so that these types can be used in quickcheck tests.
Reviewed By: quark-zju
Differential Revision: D22344600
fbshipit-source-id: c7fcbcd4648ab45f8dde00cc4bb3c1c4d203c4e1
Summary: The `#[quickcheck]` attribute is a more modern way of defining quickcheck predicates (as opposed to the older `quickcheck!` macro). Update all existing usages in the crate.
Reviewed By: quark-zju
Differential Revision: D22344601
fbshipit-source-id: b7ee4d317a64bed5fcd8b35f90f2544f6b024410
Summary: Replace use of `.ok_or_else(|| anyhow!(...))?` with `.context(...)?`, which is a cleaner way to express the same thing.
Reviewed By: quark-zju
Differential Revision: D22323298
fbshipit-source-id: 9fc1a8183a54ee0c4f30f7497b98005a18a06468
Summary:
Move JSON request parsing code out of `make_req` into `edenapi_types` so it can be reused in the EdenAPI test CLI (added later in the stack).
Note to reviewer: This diff look bigger than it actually is; it is mostly just cut-and-paste (recorded as a copy to preserve history).
Reviewed By: quark-zju
Differential Revision: D22305693
fbshipit-source-id: 13248fb29b2fbb705203f889f3d2fb3c1c760bed
Summary:
EdenAPI's `make_req` tools allows developers to create ad-hoc CBOR request payloads for debugging purposes (e.g., for use with `curl`). The tool generates requests from human-created JSON, which are particularly useful in Mercurial and Mononoke's integration tests.
Later in this stack, the use of this JSON format will be extended beyond just this one tool. As such, it is important that the representation be sufficiently extensible so accommodate future changes to the request structs. In the case of the JSON representation of `DataRequest`, this means changing from an array to a single-attribute object, so that additional fields can potentially be added in the future.
Reviewed By: quark-zju
Differential Revision: D22319314
fbshipit-source-id: 5931bc7ab01ca48ceab5ffd1c9177dd3035b643c
Summary: Use autocargo for all EdenAPI code.
Reviewed By: quark-zju
Differential Revision: D22344400
fbshipit-source-id: 522e82fcc76792ed01ca2cdbfa169c23be5bf38f
Summary: Move old EdenAPI crate to `scm/lib/edenapi/old` to make room for the new crate. This old code will eventually been deleted once all references to it are removed from the codebase.
Reviewed By: quark-zju
Differential Revision: D22305173
fbshipit-source-id: 45d211340900192d0488543ba13d9bf84909ce53
Summary: Replace the term 'blacklisted' with 'redacted'.
Reviewed By: quark-zju
Differential Revision: D22142842
fbshipit-source-id: 461faa09a5d3958a4575c48f03e2286eec917681
Summary: Add a crate-level doc comment clarifying the purpose of this crate. In particular, the motivation behind moving some of EdenAPI's types into this crate was to make it easier to share types between the server and client. As such, types that are not shared (which may pull in more complex dependencies and therefore cause build issues for either the server or client) should not be put in this crate.
Reviewed By: quark-zju
Differential Revision: D22142180
fbshipit-source-id: 7258ef33b73a87acf72d4f6bcbe8b27cbc361735
Summary:
Remove unused dependencies for Rust targets.
This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.
Manual edits (named_deps) and misc output in P133451794
Reviewed By: dtolnay
Differential Revision: D22083498
fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
Summary:
Rename the `subtree` endpoint on the EdenAPI server to `complete_trees` to better express what it does (namely, fetching complete trees, in contrast to the lighter weight `/trees` endpoint that serves individual tree nodes). This endpoint is not used by anything yet, so there isn't much risk in renaming it at this stage.
In addition to renaming the endpoint, the relevant request struct has been renamed to `CompleteTreeRequest` to better evoke its purpose, and the relevant client and test code has been updated accordingly. Notably, now that the API server is gone, we can remove the usage of this type from Mononoke's `hgproto` crate, thereby cleaning up our dependency graph a bit.
Reviewed By: krallin
Differential Revision: D22033356
fbshipit-source-id: 87bf6afbeb5e0054896a39577bf701f67a3edfec
Summary: Make most of the handlers on the EdenAPI server return streaming responses.
Reviewed By: krallin
Differential Revision: D22014652
fbshipit-source-id: 177e540e1372e7dfcba73c594614f0225da3a10f
Summary:
Note: Although this diff looks big, it's all just cutting and pasting code.
Restructure the crate so that the types relevant to each EdenAPI endpoint are grouped in their own module. This means that authors of new endpoints can just add a new module and put their types there, rather than having to make edits throughout the crate.
Reviewed By: quark-zju
Differential Revision: D21983470
fbshipit-source-id: 4901662dc6b5f0a1feca1ab8d914f394114ef233