Summary: Will be used as bounds for abstraction.
Reviewed By: sfilipco
Differential Revision: D24399497
fbshipit-source-id: 343be12237d4850fbde9ebbe4034469527bd77fc
Summary: The `snapshot` field can be used instead.
Reviewed By: sfilipco
Differential Revision: D24399507
fbshipit-source-id: 67de20d897b8b763f724f3ccbd46618dec7911b9
Summary:
The trait requires an `IdMap` snapshot to be locally ready. That's not easy for
all possible implementations. Drop it to simplify things.
Reviewed By: sfilipco
Differential Revision: D24399501
fbshipit-source-id: 4d85f77c99208cda30b2a543a0bb5b295f49a65c
Summary: There were 2 prepare_filesystem_sync. Unify them into one implementation.
Reviewed By: sfilipco
Differential Revision: D24399513
fbshipit-source-id: 80d009c33b7f23dc2c4225da6fd0fb09589ba061
Summary: More general purposed type for Syncable{IdDag,IdMap}.
Reviewed By: sfilipco
Differential Revision: D24399502
fbshipit-source-id: 0599db6dd07fe3d430458f86a33a9144d850fca1
Summary: This makes it more generic.
Reviewed By: sfilipco
Differential Revision: D24399493
fbshipit-source-id: 8a1d0a13dd29989b17fe3ef1497b10b6fa0629d6
Summary: Similar to IdDag change, move impls to separate files.
Reviewed By: sfilipco
Differential Revision: D24399508
fbshipit-source-id: 575b6e7194677b67b6755b0a30ae7d014d498b10
Summary:
The lock, reload, mutate, persist pattern is general. It can be used for IdMap
too.
Reviewed By: sfilipco
Differential Revision: D24399512
fbshipit-source-id: d25e51ba735061ca101101d75aff95deb88b1d36
Summary:
Now `build_segments_persistent` and `build_segments_volatile` are the same.
Just keep one of them.
Reviewed By: sfilipco
Differential Revision: D24399511
fbshipit-source-id: a9f1ac920cdf5b448bd99bf9b6d4ca4160ba0304
Summary:
Previously, we keep the last high level segment per level in memory, and
drop it on disk. When we cross the memory / disk boundary, we had to
maintain such properties carefully. That was needed because some DAG
algorithms rely on complete high level segments.
Now that no DAG algorithms depend on such properties, let's just drop
the logic adding the last segment back to simplify the code.
This removes the need of building segments after open() and sync().
Reviewed By: sfilipco
Differential Revision: D24399515
fbshipit-source-id: 4c640d9aa03c050fcd97f70ee386e32d3a8ee26d
Summary:
This makes the algorithm a bit more robust. Now none of the DAG algorithms
depend on high-level segments are complete and cover all low-level segments.
This also removes constraints. For example, SyncableIdDag can now just
deref() to the normal IdDag for queries without worrying about correctness.
Reviewed By: sfilipco
Differential Revision: D24399503
fbshipit-source-id: e6a91010cff82264cf423e2f24dee1d372822ef6
Summary:
They depend on high-level segments covering low-level segments, which
adds extra complexities. Remove them to simplify logic.
Reviewed By: sfilipco
Differential Revision: D24399509
fbshipit-source-id: 56a8e06c263107d1da4d6754b884ce51e18e30bf
Summary: This preserves the `--noninteractive` flag used by some tools.
Reviewed By: DurhamG
Differential Revision: D24040789
fbshipit-source-id: 8d50f3f3ce6b2015f0ef6c3bd1b4fbb874d0ea7d
Summary:
This restores the compatibility of setting up merge tools using the `ui.merge`
config while still limiting the default `editmerge` tool to interactive
sessions.
Reviewed By: sfilipco
Differential Revision: D24377259
fbshipit-source-id: 3d2befba412b824fc985ddffa131e339644178c2
Summary: Make it testable by allowing specifying paths to load as user hgrc.
Reviewed By: sfilipco
Differential Revision: D24377258
fbshipit-source-id: 969028df64d55ad1f1304e35675d84595ed6a2bf
Summary:
Include a `User-Agent` header in EdenAPI requests from Mercurial. This will allow us to see the version in Scuba, and in the future, will allow us to distinguish between requests send by Mercurial and those sent directly by EdenFS.
Keeping with the current output of `hg version`, the application is specified as "EdenSCM" rather than "Mercurial".
Reviewed By: singhsrb
Differential Revision: D24347021
fbshipit-source-id: e323cfc945c9d95d8b2a0490e22c2b2505a620dc
Summary:
Rust tests run in multiple threads. Setting environment variables affects other
tests running in other threads and causes random test failures.
Protect env vars using a lock.
Reviewed By: DurhamG
Differential Revision: D24296639
fbshipit-source-id: db0bee85625a7b63e07b95ea76d96029487881d4
Summary:
The shell-script cargo tests seem very flaky. Use a dedicated Python script to
run the tests, with a more concise output that only includes failures, and run
tests in parallel.
Reviewed By: DurhamG
Differential Revision: D24296433
fbshipit-source-id: 1d63146c6c84f1035dded24fcd3d79f116c2e740
Summary:
When the server returns a 429, the intention is that the client will wait for a
little bit then try again later (there is no harm in that, as we haven't really
used many server resources for this). However, it turned out that right now we
just abort. Let's fix it!
Note that this changes the behavior a bit for the error cases, in the sense
that we no longer return `Ok(None)` but instead return an `Err`. Xavier noted
this should make sense here.
I've also had the client send its retry attempt via a header, because who
knows, that might be useful.
Reviewed By: kulshrax
Differential Revision: D24308127
fbshipit-source-id: 35639956f36342dfb0056b0d348dc4ad56bd576c
Summary: Introduces fetching of child entry IDs, and child file metadata for a specified tree manifest ID. The aux data lookup will only be performed if `with_file_metadata` is set, which is actually kind of wrong. Instead `with_children` from the wire type should be exposed in the API request type, and `with_*_metadata` should be hidden or used for data other than the child entry `Key`s.
Reviewed By: kulshrax
Differential Revision: D23886678
fbshipit-source-id: 0cba72cea7be47ae3348a406d407a19b60976c0c
Summary:
Include the client correlator string from the `clienttelemetry` extension in each EdenAPI HTTP request via the `X-Client-Correlator` header.
The `ClientIdentityMiddleware` in `gotham_ext` already understands this header (as it is already used by the LFS server), and `gotham_ext`'s `ScubaMiddleware` will automatically include the provided correlator in the server's Scuba samples.
Reviewed By: farnz
Differential Revision: D24282244
fbshipit-source-id: 13d04e706eda38893cff6e740bd1d7bf104e43dd
Summary:
This change introduces two new metadata types, Category and Transience, and a mechanism for Category to provide a default Fault and Transience, which can be overriden by the user.
Also introduces a mechanism for attempting to log exceptions which occur during exception logging, falling back to the previous behavior of just swallowing the exception on failure.
Reviewed By: DurhamG
Differential Revision: D22677565
fbshipit-source-id: 1cf75ca1e2a65964a0ede1f072439378a46bd391
Summary:
It only has benchmark code that led to the use of mincode. Now hgcommits is the
main crate for commit storage. `commitstore` without `hg` in its name was
initially planned to support other kinds of commits including git and bonsai.
However we don't have immediate goal for that at present. So let's just remove
the commitstore directory.
Reviewed By: singhsrb
Differential Revision: D24263618
fbshipit-source-id: 84b4861ae490817377e69d8c2006c63331e3db1f
Summary: Need to add new quickcheck tests, verify that remove `Serialize` from `TreeEntry` is okay.
Reviewed By: kulshrax
Differential Revision: D23457777
fbshipit-source-id: aa94ed7aa81b41924eba4a8bd1bdc2c737365b77
Summary:
Print a message for each EdenAPI method call to stderr if the user has `edenapi.debug` set.
These messages are already logged to `tracing`, but also printing them out when `edenapi.debug` is set makes the debug output more useful, since it provides context for the download stats. This is especially useful when reading through EdenFS logs.
Reviewed By: quark-zju
Differential Revision: D24204381
fbshipit-source-id: 37b47eed8b89438cdf510443e917a5c8660eb43b
Summary: Use a `HashMap` to store user-specified additional HTTP headers. This allows headers to be set in multiple places (whereas previously, setting new headers would replace all previously set headers).
Reviewed By: quark-zju
Differential Revision: D24200833
fbshipit-source-id: 93147cf334a849c4d2fc4f29849018a4c7565143
Summary: The panics can happen when the input sets are out of range.
Reviewed By: kulshrax
Differential Revision: D24191789
fbshipit-source-id: efbcbd7f6f69bd262aa979afa4f44acf9681d11e
Summary:
Some sort of serialization for the Dag is useful for saving the IdDag produced
by offline jobs load that when a mononoke server starts.
Reviewed By: quark-zju
Differential Revision: D24096964
fbshipit-source-id: 5fac40f9c10a5815fbf5dc5e2d9855cd7ec88973
Summary: Add `--debug` flag to `read_res cat` command for debug printing entire entry rather than just the data blob.
Reviewed By: kulshrax
Differential Revision: D23999804
fbshipit-source-id: 6955854edab2643cffbe5fae484a398716b48055
Summary:
The hybrid backend is similar to the doublewrite backend, except that it does
not use revlog to read commit data, but uses EdenAPI instead.
Note:
- The non-stream API will not fetch commit data from EdenAPI.
- The commit hashes are not lazy yet.
Reviewed By: sfilipco
Differential Revision: D23924147
fbshipit-source-id: eb2cf8d3a7e1704b4efb13ad3ad86f8b6a1b31d0
Summary:
This makes it convertible to `PyObject` via `cpython_ext::convert::Serde`
without additional code or dependencies.
Reviewed By: sfilipco
Differential Revision: D23966993
fbshipit-source-id: 74d83524a7c0701cde7aa6d61bb930ff4a1c90f5
Summary:
This API allows us to stream the data. If callsites only use this API, we'll
be more confident that there are no 1-by-1 fetches.
Reviewed By: sfilipco
Differential Revision: D23911865
fbshipit-source-id: 4c7dd8c2b5be33be5a55822845d55345797bacdf
Summary:
The API is basically to resolve `input_stream` to `output_stream`, with a
stateful "resolver" that can resolve locally and remotely.
Reviewed By: sfilipco
Differential Revision: D23915775
fbshipit-source-id: 14a3a37fc897c8229514acac5c91c7e46b270896
Summary:
Introduce `FileMetadata` and `DirectoryMetadata` to `Treeentry`, along with corresponding request API.
Move `metadata.flags` to `file_metadata.revisionstore_flags`, as it is never populated for trees. Do not use `metadata.size` on the wire, as it is never currently populated.
Leaving `DirectoryMetadata` commented out temporarily because serde round trips fail for unit struct. Re-introduced with fields in the next change in this stack.
Reviewed By: DurhamG
Differential Revision: D23455274
fbshipit-source-id: 57f440d5167f0b09eef2ea925484c84f739781e2
Summary:
EdenAPI always checks the integrity of filenode hashes before returning file data to the application. In the case of LFS files, this resulted in errors because the filenode hash is computed using the full file content, but the blob from the server only contains an LFS pointer.
Fix the bug by exempting LFS blobs from filenode integrity checks. (If integrity checks for LFS blobs are desired, the LFS code should be able to do this on its own since LFS blobs are content-addressed.)
Reviewed By: quark-zju
Differential Revision: D24145027
fbshipit-source-id: d7d86e2b912f267eba4120d1f5186908c3f4e9e3
Summary:
This can be used to automate Python/Rust conversions for complex structures
like `CommitRevlogData`.
Reviewed By: kulshrax
Differential Revision: D23966988
fbshipit-source-id: 17a19d38270e6ef0952c13a1cd778487e84a94ff
Summary:
The goal is to implement `FromPyObject` and `ToPyObject` more easily.
Today crates have to dependent on `cpython` to implement `From/ToPyObject`,
which is somewhat unwanted for pure Rust crates.
The `ser` module used to ignore the `variant` field for non-unit enum variants.
They have been fixed so the serialized value can be deserialized correctly.
For example, `enum E { A, B(T) }` will be serialized to `"A"` for `E::A`, and
`{"B": T}` for `E::B`.
Reviewed By: kulshrax
Differential Revision: D23966994
fbshipit-source-id: c50d57bf313caeec65a604ed9b05a5729f3b3635
Summary:
Switch from the default tuple deserialization which only understands the tuple
format, to "bytes" deserialization, which understands not only the existing
"tuple" format (therefore compatible with old data), but also "bytes" and "hex"
formats (for CBOR).
This will unblock us from switching to bytes serialization in the future.
Note: This is a breaking change for mincode serialization. Mincode + HgId users
(zsotre, metalog) have switched to explicit tuple serialization so they don't use
the default deserializaiton and remain unaffected.
Reviewed By: kulshrax
Differential Revision: D23966995
fbshipit-source-id: 83dd53f57bd4e6098de054f46a1d47f8b48133d0
Summary: This will unblock us from switching HgId to bytes serialization by default.
Reviewed By: kulshrax
Differential Revision: D24009039
fbshipit-source-id: a277869ec24652af428cda581faffa62c25d32c4
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Key differently.
Reviewed By: DurhamG
Differential Revision: D24009041
fbshipit-source-id: 2ecf1610b989a04083196d180bc62307b5162c2f
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Sha256 differently.
Reviewed By: DurhamG
Differential Revision: D24009040
fbshipit-source-id: b77f6732802f95507e1540f0bbde4d5a92d13cac
Summary: Instead of returning an error upon receiving an empty request, just return a `Fetch` object that does nothing. This prevents Mercurial from crashing in situations where an empty request somehow makes it to the EdenAPI remote store.
Reviewed By: quark-zju
Differential Revision: D24119632
fbshipit-source-id: cf4ec707b4097656c76d7084a55b2d0b3150b679
Summary:
Previously, EdenAPI was using `remotefilelog.debug` to determine whether to print things like download stats. Let's give EdenAPI its own `debug` option that can be configured independently of remotefilelog.
One notable benefit of this change is that download stats will always be printed immediately after the HTTP request completes. This can help rule out network or server issues in situations where Mercurial appears to be hanging during data fetching. (e.g, if hg had downloaded all of the data but was taking a while to process it, the debug output would show this.)
Reviewed By: DurhamG
Differential Revision: D24097942
fbshipit-source-id: bf9b065e7b97fc7ffe50ab74b1b13e2fe364755c
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.
To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.
Reviewed By: quark-zju
Differential Revision: D24096539
fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
Summary: D24070707: `[Thrift] Provide sorted fields to read_field_begin` made a change to the generated rust thrift files, so the eden/scm thrift files have to be regenerated to fix the build.
Reviewed By: farnz
Differential Revision: D24109655
fbshipit-source-id: e8575a76642673a11514fdce8e30f13ca28151f0
Summary: This can be used by dag::Vertex and minibytes::Bytes.
Reviewed By: kulshrax
Differential Revision: D23966985
fbshipit-source-id: 3b4b29648e038ef49f26ce2b500119e148544d9e
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.
In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implements serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.
Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.
The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such usecases can be short-cut
using pure Rust code.
Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.
Reviewed By: sfilipco
Differential Revision: D23966984
fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
Summary:
Per the feedback on D23920367 (318f5683a5), let's make the human-readable download stats shorter. Example:
```
Downloaded 10.59 MiB in 12.35s over 5 requests (7.19 Mb/s, latency: 123ms)
```
The amount downloaded is now reported in binary-prefixed bytes (so that it can be directly compared to file sizes) whereas the transfer rate is reported in decimal-prefixed bits per second (so that it can be directly compared to a user's measured network speed).
Additionally, we now use the default formatting available from `std::time::Duration`, which will automatically choose the appropriate display units.
Reviewed By: quark-zju
Differential Revision: D24096525
fbshipit-source-id: 39c49f1b08135bbae7a7544b1ffe2bdbfe1533a1
Summary: We now get progress bar output when fetching from memcache!
Reviewed By: kulshrax
Differential Revision: D24060663
fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
Summary: These aren't included anywhere, we can remove them.
Reviewed By: DurhamG
Differential Revision: D24062627
fbshipit-source-id: 9ff101eb44965ac3502ada3265ffcc8acc09d2e5
Summary: These are unused, no need to keep the code around.
Reviewed By: DurhamG
Differential Revision: D24055085
fbshipit-source-id: 6246d746983a575c051ddcb51ae02582a764a814
Summary: This is unused, no need to keep it around.
Reviewed By: DurhamG
Differential Revision: D24054164
fbshipit-source-id: 161b294eb952c6b4584aa0d49d8ff46cd63ee30f
Summary: Disable CLANGTIDY checks for several places at the code.
Reviewed By: zertosh, benoitsteiner
Differential Revision: D24018176
fbshipit-source-id: b2d294f9efd64b2e2c72b11b18d8033f9928e826
Summary:
This would have been easier if we can upgrade tokio (D24011447).
For now, let's just solve it by using a channel so the mutex is not held for long.
The implementation has some side effects, though:
- panic message is not preserved.
- 'static lifetime is required on Future.
The `'static` lifetime is incompatible with some existing code. The old function
is preserved as `block_on_exclusive` and is used in places where a future does
not have `'static` lifetime.
Reviewed By: sfilipco
Differential Revision: D24033134
fbshipit-source-id: 7b35d1ff636d2a289db9b04e60419c31bdea9453
Summary:
Add a null progress bar implementation that just keeps track of state, similar to the `progress.nullbar` in hg's Python code.
A benefit of this is that code that optionally shows progress can unconditionally update the progress bar rather than wrapping it in an `Option` and checking for presence each time.
Reviewed By: markbt
Differential Revision: D23982318
fbshipit-source-id: ffd762b59cc0c9bd2ad0c67c3ca785350db4850f
Summary:
This diff introduces a new `progress` crate that provides an abstract interface for progress bars in Rust code:
- The `ProgressFactory` trait can be used to create new progress bars.
- The `ProgressBar` trait allows Rust code to interact with the progress bar.
- The `ProgressSpinner` trait is similar, but for spinner-type progress indicators.
These traits are intended to be used as trait objects, allowing pure Rust code to accept an opaque `ProgressFactory` and use it to report progress. This kind of abstraction, while not common in idiomatic Rust code, allows the progress implementation to be completed decoupled from the pure Rust code, which is important given that Mercurial's progress bars are currently implemented in Python.
Part of the goal of this crate is to allow a smooth transition to pure Rust progress bars (once we eventually implement them). As long as the Rust progress bars implement the above traits, the can be used as drop-in replacements for Python progress bars everywhere.
Reviewed By: markbt
Differential Revision: D23982319
fbshipit-source-id: 9ccf167f18d9518bb0ed66e1606a5b8188d98428
Summary:
As EdenFS depends on a few bits of Mercurial code, these needs to be able to
compile with Buck.
Reviewed By: chadaustin
Differential Revision: D24000881
fbshipit-source-id: 078a2a958039a63db1b716785f872b4bbde3bab6
Summary: Make `parents`, `data`, and `metadata` optional, and introduce `WireTreeAttributesRequest` for selecting which attributes to request on the wire.
Reviewed By: kulshrax
Differential Revision: D23406763
fbshipit-source-id: 5edd674d9ba5d37c23b12ab4d7b54bbf6c9ff990
Summary:
Adds a `WireTreeQuery` enum for query method, with a single `ByKeys(WireTreeKeyQuery)` available currently, to request a specific set of keys.
Leave the API struct alone for now.
Reviewed By: kulshrax
Differential Revision: D23402366
fbshipit-source-id: 19cd8066afd9f14c7e5f718f7583d1e2b9ffac02
Summary: The size can change with zstd upgrades. Do not test them.
Reviewed By: sfilipco
Differential Revision: D23976933
fbshipit-source-id: d560061b6e4fefc3bb89513bdb12c770ea0bd881
Summary:
This makes it so commit hashes are serialized to bytes instead of tuples in Python:
In [1]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))
In [2]: list(s)
Out[3]: [{'hgid': '...', ...}]
Some `Vec<HgId>`s cannot be changed using this way. It'd be nice if we can change
the default `HgId` serialization to bytes.
Reviewed By: kulshrax
Differential Revision: D23966989
fbshipit-source-id: 4d013525419741d3c5c23621be16e70441bab3c4
Summary:
`HgId` currently serializes into a tuple of 20 items. This is suboptimal in
CBOR, because the items are untyped. A byte might be serialized into one or two
bytes:
In [2]: cbor.dumps([1,1,1,1])
Out[2]: b'\x84\x01\x01\x01\x01'
In [3]: cbor.dumps([255,255,255,255])
Out[3]: b'\x84\x18\xff\x18\xff\x18\xff\x18\xff'
CBOR supports "bytes" type to efficiently encode a `[u8]`:
In [5]: cbor.dumps(b"\x01\x01\x01\x01")
Out[5]: b'D\x01\x01\x01\x01'
In [6]: cbor.dumps(b"\xff\xff\xff\xff")
Out[6]: b'D\xff\xff\xff\xff'
Add `serde_with` with 3 flavors: `bytes`, `tuple`, `hex` to satisfy different
needs. Check the added docstring for details.
Reviewed By: kulshrax
Differential Revision: D23966992
fbshipit-source-id: 704132648f9e50b952ffde0e96ee2106f2f2fbcf
Summary:
Dynamicconfig can generate configs two ways, 1) via `hg
debugdynamicconfig` and 2) synchronously in-process in an hg command when it
detects that the dynamicconfig is completely missing or has the wrong version
number.
In the first case, dynamicconfig gets the repo name from the standard config
object loaded by the hg dispatch. In the second case, the standard config
object isn't even loaded yet, so dynamicconfig does a mini-load of the user and
repo hgrcs so it can get the repo name and user name (needed for dynamic
conditions).
Unfortunately the second code path computed the wrong path (it had two .hg/'s)
which meant the reponame and user name were always none. This meant that the
dynamicconfig on disk could randomly be either computed with or without a
reponame.
Let's fix the path computation, and add a test. We may want to make
dynamicconfig fail if no repo name is passed, but I'm not sure if we'll want to
support no-repo configuration at some point.
This didn't cause a problem for most people, since it would only happen during a
hg version number change, and 15 minutes later the background 'hg
debugdynamiconfig' process would fix it up. It did affect sandcastle though,
since it often creates new repositories and acts on them immediately.
Reviewed By: quark-zju
Differential Revision: D23955628
fbshipit-source-id: c922f4b523d19df9223aa28c97700b7011fc03eb
Summary:
The old code tried to express 4GB by using ^ to do an exponent. That
operator is actually the bitwise xor, so this was producing a limit closer to 4
bytes. It doesn't seem to have mattered much since a later diff overrode the
default via dynamicconfig. But let's fix this anyway.
Reviewed By: krallin
Differential Revision: D23955629
fbshipit-source-id: 6abebcb7e84b7a47f70ac501fa11b0dc60dfda7b
Summary: Now that the `async_runtime` crate exists, use Mercurial's global `tokio::Runtime` instead of creating one for each EdenAPI store.
Reviewed By: quark-zju
Differential Revision: D23945569
fbshipit-source-id: 7d7ef6efbb554ca80131daeeb2467e57bbda6e72
Summary: Now that the EdenAPI server is using the `LoadMiddleware` from `gotham_ext`, each response will contain an `X-Load` header that contains the number of active requests that the server is currently handling.
Reviewed By: quark-zju
Differential Revision: D23922809
fbshipit-source-id: 973143de5ddccf074d28aa3ef38d73f9fc1501b6
Summary:
Network speeds are usually reported in megabits per second (Mb/s), whereas file sizes are usually reported in [mebibytes](https://en.wikipedia.org/wiki/Binary_prefix) per second (MiB/s). Previously, the HTTP client reported neither of those and instead reported megabytes per second (MB/s).
This diff changes the latter to the former so that the numbers are more immediately useful. As a bonus, the speeds are now directly comparable to those reported by `hg debugnetwork`.
Reviewed By: quark-zju
Differential Revision: D23920367
fbshipit-source-id: 46500a42681ab83fc7c4ead82980e8ed620a4d5a
Summary: Now that stats are logged to `tracing` by the `HttpClient` directly, we no longer need to log them here. This commit backs out D23858077 (613fbc858f) which added the logging.
Reviewed By: quark-zju
Differential Revision: D23919308
fbshipit-source-id: 23d3a12c5307bc4b84dd9ffd25bd376718e3cc91
Summary:
Improve the log output of the HTTP client to avoid spewing redundant debug messages.
As part of this change, logging now uses the `tracing` crate instead of the `log` crate for better integration with the rest of Mercurial's logging infrastructure. Right now, `tracing` is just being used as a drop-in replacement for `log`, but now that it's in use we can start using its full capabilities (such as defining tracing spans) in later diffs.
Reviewed By: quark-zju
Differential Revision: D23919310
fbshipit-source-id: 95555ad083ead805ceece39c6e30aaf879bdf2bc
Summary:
We were using the timeout parameter on `Multi::wait` (equivalent to `curl_multi_wait` in C) incorrectly. Previously, we were passing in the timeout provided by `curl_multi_timeout`.
This is incorrect usage because the value returned by `curl_multi_timeout` is the current value of libcurl's internal timeout (based on the state of the transfers), which will always be respected. The actual intention of the timeout parameter is to allow the caller to specify a hard cap on curl's internal timeout, so we should just pass some reasonable default value here. ([See explanation here.](https://github.com/curl/curl/issues/2996))
The purpose of `curl_multi_timeout` is to allow libcurl to tell the application what its desired timeout is in situations where the application itself is waiting for socket activity (using something like `epoll`), which is not the case when using `curl_multi_wait`.
Reviewed By: DurhamG
Differential Revision: D23914093
fbshipit-source-id: 76a25d7c59a4b08437c8d7be3d24708fb37b9172
Summary: Use the functionality from D23910534 (721f5af278) to set a timeout for EdenAPI requests, configured via the `edenapi.timeout` option.
Reviewed By: DurhamG
Differential Revision: D23911552
fbshipit-source-id: 4a6e3de1094d0faa1daaf6fe4b9b7aafb37a25a8
Summary: Add the ability to set a timeout on HTTP requests. Equivalent to [`CURLOPT_TIMEOUT_MS`](https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT_MS.html).
Reviewed By: DurhamG
Differential Revision: D23910534
fbshipit-source-id: a7aec792ec3c122a01aa44fcfe2e2df6e3a111fc
Summary:
There are several places in the HTTP client where we log and discard errors. (Typically, these are "this should never happen" type situations.)
Previously, these were logged at the `trace` log level, meaning that in practice no one would ever know if we did hit these errors.
Let's upgrade them to `error` so that they'll be printed out. (In theory, users should never see these error messages unless something has gone horribly wrong.)
Reviewed By: DurhamG
Differential Revision: D23888268
fbshipit-source-id: 9007205f946ebb0127238c76812cf62524878047
Summary:
Treemanifest needs to be able to write to the shared stores from paths
other than just prefetch (like when it receives certain trees via a standard
pull). To make this possible we need to expose the Rust shared mutable stores.
This will also make just general integration with Python cleaner.
In the future we can get rid of the non-prefetch download paths and remove this.
Reviewed By: quark-zju
Differential Revision: D23772385
fbshipit-source-id: c1e67e3d21b354b85895dba8d82a7a9f0ffc5d73
Summary:
Introduce separate wire types to allow protocol evolution and client API changes to happen independently.
* Duplicate `*Request`, `*Entry`, `Key`, `Parents`, `RepoPathBuf`, `HgId`, and `revisionstore_types::Metadata` types into the `wire` module. The versions in the `wire` module are required to have proper `serde` annotations, `Serialize` / `Deserialize` implementations, etc. These have been removed from the original structs.
* Introduce infallible conversions from "API types" to "wire types" with the `ToWire` trait and fallible conversions from "wire types" to "API types" with the `ToApi`. API -> wire conversions should never fail in a binary that builds succesfully, but wire -> API conversions can fail in the case that the server and client are using different versions of the library. This will cause, for instance, a newly-introduced enum variant used by the client to be deserialized into the catch-all `Unknown` variant on the server, which won't generally have a corresponding representation in the API type.
* Cleanup: remove `*Response` types, which are no longer used anywhere.
* Introduce a `map` method on `Fetch` struct which allows a fallible conversion function to be used to convert a `Fetch<T>` to a `Fetch<U>`. This function is used in the edenapi client implementation to convert from wire types to API types.
* Modify `edenapi_server` to convert from API types to wire types.
* Modify `edenapi_cli` to convert back to wire types before serializing responses to disk.
* Modify `make_req` to use `ToWire` for converting API structs from the `json` module to wire structs.
* Modify `read_res` to use `ToApi` to convert deserialized wire types to API types with the necessary methods for investigating the contents (`.data()`, primarily). It will print an error message to stderr if it encounters a wire type which cannot be converted into the corresponding API type.
* Add some documentation about protocol conventions to the root of the `wire` module.
Reviewed By: kulshrax
Differential Revision: D23224705
fbshipit-source-id: 88f8addc403f3a8da3cde2aeee765899a826446d
Summary: Add log messages for debugging using the `tracing` crate, which allows them to be enabled via `env_logger`.
Reviewed By: quark-zju
Differential Revision: D23858076
fbshipit-source-id: a8ef1afac6c9ecbfb5d6d78232aa0d03a2fe2054
Summary: Log HTTP stats to stderr to assist with ad-hoc debugging. Will not be printed unless `RUST_LOG` is set appropriately.
Reviewed By: quark-zju
Differential Revision: D23858077
fbshipit-source-id: 39acf3de3fd0ca4403a986eb5373a6a79f1d004a
Summary:
Add a `PyFuture<F>` type that can be used as return type in binding function.
It converts Rust Future to a Python object with an `await` method so Python
can access the value stored in the future.
Unlike `TStream`, it's currently only designed to support Rust->Python one
way conversion so it looks simpler.
Reviewed By: kulshrax
Differential Revision: D23799644
fbshipit-source-id: da4a322527ad9bb4c2dbaa1c302147b784d1ee41
Summary:
The exposed type can be used as a Python iterator:
for value in stream:
...
The Python type can be used as input and output parameters in binding functions:
# Rust
type S = TStream<anyhow::Result<X>>;
def f1() -> PyResult<S> { ... }
def f2(x: S) -> PyResult<S> { Ok(x.stream().map_ok(...).into()) }
# Python
stream1 = f1()
stream2 = f2(stream1)
This crate is similar to `cpython-ext`: it does not define actual business
logic exposed by `bindings` module. So it's put in `lib`, not
`bindings/modules`.
Reviewed By: markbt
Differential Revision: D23799641
fbshipit-source-id: c13b0c788a6465679b562976728f0002fd872bee
Summary:
Move bunch of code into a separate file (scm daemon related options). Move them
out of cloud sync.
Also introduce additional check that the `hg cloud sync` command scm daemon
runs is intended for the current connected workspace
In theory when we switch a subscription, the SCM daemon gets notified but races possible and it is better to have this additional check, so SCM daemon triggers cloud sync where it is supposed to.
Reviewed By: markbt
Differential Revision: D23783616
fbshipit-source-id: b91a8b79189b7810538c15f8e61080b41abde386
Summary:
The Rust contentstore has no way to flush the shared stores, except
when the object is destructed. In treemanifest, the lifetime of the shared store
seems to be different from with files and we're not seeing them flushes
appropriately during certain commands. Let's make the flush api also flush the
shared stores.
Reviewed By: quark-zju
Differential Revision: D23662976
fbshipit-source-id: a542c3e45d5b489fcb5faf2726854cb49df16f4c
Summary: The old logic would just double pack some bits. Let's prevent that.
Reviewed By: xavierd
Differential Revision: D23661933
fbshipit-source-id: 155291fa08ec2c060619329bd1cb6040769feb63
Summary:
The rust pack stores currently have logic to refresh their list of
packs if there's a key miss and if it's been a while since we last loaded the
list of packs. In some cases we want to manually trigger this refresh, like if
we're in the middle of a histedit and it invokes an external command that
produces pack files that the histedit should later consume (like an external
amend, that histedit then needs to work on top of).
Python pack stores solve this by allowing callers to mark the store for a
refresh. Let's add the same logic for rust stores. Once pack files are gone we
can delete this.
This will be useful for the upcoming migration of treemanifest to Rust
contentstore. Filelog usage of the Rust contentstore avoided this issue by
recreating the entire contentstore object in certain situations, but refresh
seems useful and less expensive.
Reviewed By: quark-zju
Differential Revision: D23657036
fbshipit-source-id: 7c6438024c3d642bd22256a8e58961a6ee4bc867
Summary:
Instants do not represent actual time and can only be compared against
each other. When we subtracted arbitrary Durations from them, we run the risk of
overflowing the underlying storage, since the Instant may be represented by a
low number (such as the age of the process).
This caused crashes in test_refresh (in the next diff) on Windows.
Let's instead represent the "must rescan" state as a None last_scanned time, and avoid any arbitrary subtraction. It's generally much cleaner too.
Reviewed By: quark-zju
Differential Revision: D23752511
fbshipit-source-id: db89b14a701f238e1c549e497a5d751447115fb2
Summary:
Previously the MetadataStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
Reviewed By: quark-zju
Differential Revision: D23219961
fbshipit-source-id: a47f3d94f70adac1f2ee763f3170ed582ef01a14
Summary:
Previously the ContentStore would always construct a mutable pack, even
if the operation was readonly. This meant all read commands required write
access. It also means that random .tmp files get scattered all over the place
when the rust structures are not properly destructed (like if python doesn't
bother doing the final gc to call destructors for the Rust types).
Let's just only create mutable packs when we actually need them.
Reviewed By: quark-zju
Differential Revision: D23219962
fbshipit-source-id: 573844f81966d36ad324df03eecec3711c14eafe
Summary:
As it says in the title, this adds support for receiving compressed responses
in the revisionstore LFS client. This is controlled by a flag, which I'll
roll out through dynamicconfig.
The hope is that this should greatly improve our throughput to corp, where
our bandwidth is fairly scarce.
Reviewed By: StanislavGlebik
Differential Revision: D23652306
fbshipit-source-id: 53bf86d194657564bc3bd532e1a62208d39666df
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).
In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.
Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.
The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.
Reviewed By: StanislavGlebik
Differential Revision: D23652335
fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
Summary:
We've often had cases where we need to nuke peoples caches for various
reasons. It's a hug pain since we haven't a way to communicate with all hg
clients. Now that we have configerator dynamicconfigs, we can use that to reach
all clients.
This diff adds support for configs like:
```
[hgcache-purge]
foo=2020-08-20
```
The key, 'foo' in this case, is an identifier used to only run this purge once.
The value is a date after which this purge will no longer run. This is useful
for bounding the damager from forgetting about a purge and having it delete caches
over and over in the future for new repos or repos where the run once marker
file is deleted for some reason.
Reviewed By: quark-zju
Differential Revision: D23044205
fbshipit-source-id: 8394fcf9ba6df09f391b5317bad134f369e9b416
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside with a flag that signify to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.
When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored in both a packfile as an
LFS pointers, and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).
The solution take here is to simply ignore all the lfs pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This brings however another complication for the user created blobs:
these are stored in packfiles and would thus become unreadable, the solution is
to simply perform a one-time full repack of the local store to make sure that
all the pointers are moved from the packfiles to to LFSStore.
In the code, the Python bindings are using ExtStoredPolicy::Ignore directly as
these are only used in the treemanifest code where no LFS pointers should be
present, the repack code uses ExtStoredPolicy::Use to be able to read the
pointers, it wouldn't be able to otherwise.
Reviewed By: DurhamG
Differential Revision: D22951598
fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.
Reviewed By: kulshrax
Differential Revision: D23577867
fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
Summary:
Add `with_stats_reporting` to HttpClient. It takes a closure that will be
called with all `Stats` objects generated. We then use this function in
the hg-http crate to integrate with the metrics backend used in Mercurial.
Reviewed By: kulshrax
Differential Revision: D23577869
fbshipit-source-id: 5ac23f00183f3c3d956627a869393cd4b27610d4
Summary:
We start off simple here. Python only really has counters so we only implement
counters. There are a lot of options on how to improve this and things get
slightly complicated when we look at the how ecosystem and fb303. Anyway,
simple start.
Reviewed By: quark-zju
Differential Revision: D23577874
fbshipit-source-id: d50f5b2ba302d900b254200308bff7446121ae1d
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it is different from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all of projects in the Eden SCM repo, for now, `http_client` should be made consistent with the adjacent crates.
Reviewed By: sfilipco
Differential Revision: D23585721
fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
Summary: No more tokio-core! More `async/await`.
Reviewed By: kulshrax
Differential Revision: D23586509
fbshipit-source-id: b2e766ddb7575bc96963432f0c8582b4370b19aa
Summary:
This diff adds a `SocketTransport` implementation that no longer uses legacy `tokio-core` based futures but `tokio-tower` and `tower-service` for processing Thrift requests.
The old implementation is renamed to `SocketTransportLegacy` for better transitioning.
Reviewed By: dtolnay
Differential Revision: D20019196
fbshipit-source-id: 3bee684e9254bf1a81669ef0d2c2262a55e75daa
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.
This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.
Reviewed By: quark-zju
Differential Revision: D23486922
fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
Summary:
In a future diff we'll add logic to delete old pack files. We'll want
to use this pack iteration code, so let's move it to a function.
Reviewed By: quark-zju
Differential Revision: D23486920
fbshipit-source-id: 5f872e946ffe816289c925dd2e03c292e29da5af
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.
Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.
Rotatelog already have this behavior.
Reviewed By: quark-zju
Differential Revision: D23478780
fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
Summary:
See D23538897 for context. This adds a killswitch so we can rollout client
certs gradually through dynamicconfig.
Reviewed By: StanislavGlebik
Differential Revision: D23563905
fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFs extension). This has a few upsides:
- It lets us understand who is connecting, which makes debugging easier;
- It lets us enforce ACLs.
- It lets us apply different rate limits to different use cases.
Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to a killswitch for this nonetheless. I'll reach out to Durham to see if I
can use dynamic config for that
Also, while I was in there, I cleaned up few functions that were taking
ownership of things but didn't need it.
Reviewed By: DurhamG
Differential Revision: D23538897
fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: zertosh
Differential Revision: D23568779
fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.
Reviewed By: StanislavGlebik
Differential Revision: D23462572
fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
Summary:
We want to be able to record when fetches to certain paths happen.
Let's add recording infrastructure to the new ReportingRemoteDataStore.
A future diff will make the seen accessible from Python for scuba logging.
Reviewed By: xavierd
Differential Revision: D23462574
fbshipit-source-id: 5d749f2429e26e8e7fe4fb5adc29140b4309eac9
Summary:
We want to monitor what paths are fetched from our remote servers.
Since all of our remote stores are hidden behind the RemoteDataStore interface,
let's create a wrapper around that. A future diff will insert the actual
monitoring and reporting.
Reviewed By: quark-zju
Differential Revision: D23462571
fbshipit-source-id: e6031f19db23f7d1b09767efb9613d7528fb457d
Summary:
This is based on fbsource data, building level 5 proves to be not useful.
This would save 300ms in the write path.
Reviewed By: sfilipco
Differential Revision: D23494505
fbshipit-source-id: ca795b4900af40dbfdaa463d36f3169413bf6a62
Summary:
Previously the IdMap's "Name -> Id" index simply ignores the "reassign
non-master" request. It turns out stale entries in that index can cause
issues as demonstrated by the previous diff.
Update IdMap to actually remove both indexes of non-master group on
remove_non_master so it cannot have stale entries.
To optimize the index, the format of IdMap is changed from:
[ 8 bytes Id (Big Endian) ] [ Name ]
to:
[ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]
So the index can use reference to the slice, instead of embedding the bytes, to
reduce index size.
The filesystem directory name for IdMap used by NameDag is bumped to `idmap2`
so it won't read the incompatible old `idmap` data.
Reviewed By: sfilipco
Differential Revision: D23494508
fbshipit-source-id: 3cb7782577750ba5bd13515b370f787519ed3894
Summary: Some vertexes can disappear from the graph!
Reviewed By: sfilipco
Differential Revision: D23494506
fbshipit-source-id: ecbf2a4169e5fc82596e89a4bfe4c442a82e9cd2
Summary: The TestDag struct will be used to do some more complicated tests.
Reviewed By: sfilipco
Differential Revision: D23494507
fbshipit-source-id: 11350f9e448725ae49f50a7b6f19efc57ad84448
Summary:
Replacing places where the tokio runtime is instantiated inside the edenapi
client crate.
Reviewed By: quark-zju
Differential Revision: D23468596
fbshipit-source-id: ef68718c7d5b89b6477a2946daaa51618b53d06a
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.
This turns out to save 300ms in profiling result.
Reviewed By: sfilipco
Differential Revision: D23494509
fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
Summary:
With MultiLog, per-log meta was previously entirely ignored. However, they can
be useful for updated indexes. For example, application defines a new index,
and opens a Log via MultiLog. The application would expect the new index is
built only once. Without MultiLog, per-log meta is updated at open time in
place. With MultiLog, the updated index meta is not written back to the
multimeta so the new index would be rebuilt multiple times undesirably.
Update MultiLog to reuse the per-log meta if it's compatible so it can pick up
new indexes.
Reviewed By: sfilipco
Differential Revision: D23488212
fbshipit-source-id: c8b3e6b5589dbda2e76a143d15085862a93dae22
Summary:
The poisoned meta makes investigation harder. ex. `debugdumpindexlog` won't
work on those logs.
Reviewed By: sfilipco
Differential Revision: D23488213
fbshipit-source-id: b33894d8c605694b6adf5afdaed45707fbd7357e
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:
benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
building segments (old) 856.803 ms
building segments (new) 127.831 ms
ancestors 54.288 ms
children (spans) 619.966 ms
children (1 id) 12.596 ms
common_ancestors (spans) 3.050 s
descendants (small subset) 35.652 ms
gca_one (2 ids) 164.296 ms
gca_one (spans) 3.132 s
gca_all (2 ids) 270.542 ms
gca_all (spans) 2.817 s
heads 247.504 ms
heads_ancestors 40.106 ms
is_ancestor 108.719 ms
parents 243.317 ms
parent_ids 10.752 ms
range (2 ids) 7.370 ms
range (spans) 23.933 ms
roots 620.150 ms
benchmarking dag::iddagstore::in_process_store::InProcessStore
building segments (old) 790.429 ms
building segments (new) 55.007 ms
ancestors 8.618 ms
children (spans) 196.562 ms
children (1 id) 2.488 ms
common_ancestors (spans) 545.344 ms
descendants (small subset) 8.093 ms
gca_one (2 ids) 24.569 ms
gca_one (spans) 529.080 ms
gca_all (2 ids) 38.462 ms
gca_all (spans) 540.486 ms
heads 103.930 ms
heads_ancestors 6.763 ms
is_ancestor 16.208 ms
parents 103.889 ms
parent_ids 0.822 ms
range (2 ids) 1.748 ms
range (spans) 6.157 ms
roots 197.924 ms
benchmarking dag::iddagstore::bytes_store::BytesStore
building segments (old) 724.467 ms
building segments (new) 90.207 ms
ancestors 23.812 ms
children (spans) 348.237 ms
children (1 id) 4.609 ms
common_ancestors (spans) 1.315 s
descendants (small subset) 20.819 ms
gca_one (2 ids) 72.423 ms
gca_one (spans) 1.346 s
gca_all (2 ids) 116.025 ms
gca_all (spans) 1.470 s
heads 155.667 ms
heads_ancestors 19.486 ms
is_ancestor 51.529 ms
parents 157.285 ms
parent_ids 5.427 ms
range (2 ids) 4.448 ms
range (spans) 13.874 ms
roots 365.568 ms
Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).
Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.
Reviewed By: sfilipco
Differential Revision: D23438174
fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
Summary: This allows them to use the SyncableIdDag APIs.
Reviewed By: sfilipco
Differential Revision: D23438170
fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.
Reviewed By: sfilipco
Differential Revision: D23438180
fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock. So
let's move the API there, and requires a lock.
Reviewed By: sfilipco
Differential Revision: D23438169
fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.
Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.
This also solves an issue where `NameDag` cannot recover properly if its
`flush` fails, because the old `NameDag` state is not lost.
After removing `NameDag::reload`, `idMap::reload` is no longer used publicly
and was made private.
Reviewed By: sfilipco
Differential Revision: D23438179
fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from lock. Rename to `persist`
to clarify it's memory -> disk. Besides, requires a reference to a lock object
as a lightweight prove that some lock is held.
Reviewed By: sfilipco
Differential Revision: D23438175
fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
Summary:
This removes the need of cloning `IdMap`.
SyncableIdMap is a bit tricky. I added some comments to clarify things.
Reviewed By: sfilipco
Differential Revision: D23438176
fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.
Reviewed By: sfilipco
Differential Revision: D23438173
fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
Summary:
There aren't too many thigs that we can do with the responses that we get back
from the server. Thigs are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.
Reviewed By: kulshrax
Differential Revision: D23456220
fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
Summary:
To reduce the size over the wire on cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.
Reviewed By: kulshrax
Differential Revision: D23456216
fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.
Reviewed By: kulshrax
Differential Revision: D23456221
fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
Summary:
Replacing uses of the custom Runtime in lfs with the global runtime in the
`async-runtime` crate.
Reviewed By: xavierd
Differential Revision: D23468347
fbshipit-source-id: 61d2858634a37eb2d7d807104702d24889ec047a
Summary: Make it possible to test other IdDagStores.
Reviewed By: sfilipco
Differential Revision: D23438178
fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
Summary:
Now SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.
Reviewed By: sfilipco
Differential Revision: D23385246
fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.
Reviewed By: sfilipco
Differential Revision: D23385250
fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.
Reviewed By: sfilipco
Differential Revision: D23385249
fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front and deprecates SpanSetAsc (which uses Id in a bit hacky
way - they are not real Ids).
Reviewed By: sfilipco
Differential Revision: D23385245
fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
Summary:
Previously the `is_valid()` function only checks about ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
Reviewed By: sfilipco
Differential Revision: D23385247
fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too. This simplifies
the `contains` logic and could make some Sets more efficient.
Reviewed By: sfilipco
Differential Revision: D23385248
fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].
However, that does not play well in our case:
- On Windows:
- stdin/stdout/stderr are not created by Rust, and inheritable by
default (other process like `cargo`, or `dotslash` might leak them too).
- a few other handles like "Null", "Afd" are inheritable. It's
unclear how they get created, though.
- Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
be not inheritable and do not require special handling.
- On Linux:
- Files opened by Python or C are likely lack of F_CLOEXEC and need special
handling.
Implement logic to close file handlers (or set F_CLOEXEC) explicitly.
[1]: https://github.com/rust-lang/rust/issues/12148
Reviewed By: DurhamG
Differential Revision: D23124167
fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
Summary: This would be used to avoid excessive memory usage during pull.
Reviewed By: DurhamG
Differential Revision: D23408833
fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.
Reviewed By: quark-zju
Differential Revision: D23375469
fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
Summary: Similar to D18528858 so module names do not need to be spelled twice.
Reviewed By: markbt
Differential Revision: D23091380
fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.
Reviewed By: quark-zju
Differential Revision: D23386786
fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.
The fsync logic is put in a separate crate to avoid slow compiles.
Reviewed By: DurhamG
Differential Revision: D23124169
fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.
Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or ask for
different Span Ids each time, they will not be limited.
In debugshell,
td = %trace repo.revs('smartlog()')
len(td.serialize())
dropped from 6MB to 0.87MB.
It's also possible to reason about:
td = %trace len(repo.revs('ancestors(.)'))
in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).
Reviewed By: DurhamG
Differential Revision: D23307793
fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
Summary: Set a default limit so the output won't be too long.
Reviewed By: DurhamG
Differential Revision: D23307792
fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.
It can be used with debugshell to make `%trace` more useful.
Reviewed By: sfilipco
Differential Revision: D23278600
fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.
Reviewed By: quark-zju
Differential Revision: D23354056
fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/46
See https://github.com/facebookexperimental/eden/runs/1034006668:
error: unused import: `env::set_var`
--> src/lfs.rs:1539:15
|
1539 | use std::{env::set_var, str::FromStr};
| ^^^^^^^^^^^^
|
note: the lint level is defined here
--> src/lib.rs:125:9
|
125 | #![deny(warnings)]
| ^^^^^^^^
= note: `#[deny(unused_imports)]` implied by `#[deny(warnings)]`
error: unnecessary braces around method argument
--> src/lfs.rs:2439:36
|
2439 | remote.batch_upload(&objs, { move |sha256| local_lfs.blobs.get(&sha256) })?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
|
note: the lint level is defined here
--> src/lib.rs:125:9
|
125 | #![deny(warnings)]
| ^^^^^^^^
= note: `#[deny(unused_braces)]` implied by `#[deny(warnings)]`
error: aborting due to 2 previous errors
error: could not compile `revisionstore`.
I dropped `#![deny(warnings)]` as I don't think warnings like the above ones
should break the build. (denying specific warnings that we care about explicitly
might be a better approach)
Reviewed By: singhsrb
Differential Revision: D23362178
fbshipit-source-id: 02258f57727edfac9818cd29dda5e451c7ca80a7
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.
Reviewed By: dtolnay
Differential Revision: D23335122
fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/45
Fix referring to 'version' without proper codegen by making 'version' compile
without codegen. This fixes configparser test when version/src/lib.rs was not
generated.
Make unneeded deps without 'fb' feature optional.
This would hopefully fix the "EdenSCM Rust Libraries" GitHub workflow.
Reviewed By: DurhamG
Differential Revision: D23269864
fbshipit-source-id: f9e691fe0a75159c4530177b8a96dad47d2494a9
Summary: This makes the code simpler.
Reviewed By: sfilipco
Differential Revision: D23269866
fbshipit-source-id: 30c9e9d218378c0d6df8b822b2a81df2b38f5b01
Summary: Will be used to simplify code.
Reviewed By: sfilipco
Differential Revision: D23269859
fbshipit-source-id: bed0c4dca075ff60900025642af1d84bdd03452d
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for IdConvert and PrefixLookup.
Reviewed By: sfilipco
Differential Revision: D23269861
fbshipit-source-id: a837f3984ff4e1bd5a3983dd1642b9f064f51a36
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for DagAlgorithm.
Reviewed By: sfilipco
Differential Revision: D23269860
fbshipit-source-id: 031e75e9bf1f1eec2b9e8f36220ef8b817a143a5
Summary: LowLevelAccess is a subset of NameDagStorage. Use the latter instead.
Reviewed By: sfilipco
Differential Revision: D23269865
fbshipit-source-id: 81ebb1e986d8b02c968a9a237ad9a97d4afd54bf
Summary:
If there are too many heads, the current `descendants` algorithm would visit
all "old" heads. For example, with this graph:
head9999 (N9999)
/
Z (master)
:
: (many heads)
:/
: head2 (N2)
:/
C head1 (N1)
|/
B head0 (N0)
|/
A
`A::head9999` or `Z::head9999` will visit N0, N1, ..., N9999, because
`descendands_up_to` is provided with `max_id = N9999` and Z as a vertex in the
master group, is before N0 in non-master. The current algorithm also means
`descendands_up_to` gets linearly slower as the user uses the repo more, which
is quite undesirable.
This diff changes `descendands_up_to` to take an `ancestors` set, which is
`::head9999` in this case, and iterate non-master flat segments in it. So it
will skip N0 to N9998 directly by finding the N9999 flat segment and only use
it. The number of heads will have a smaller impact on performance.
Another slowness is `draft::draft_heads`, if there are too many `draft_heads`,
the internal calculation of `::draft_heads` can be slow. Optimize it by
limiting `draft_heads` to `draft:`. Practically this affects `y::` revset as
`y::` is translated to `y::visible_heads` and `visible_heads` can be large.
`cargo bench --bench dag_ops -- '::-master'` shows significant difference:
Before:
range (master::draft) 18.112 s
range (recent_draft::drafts) 2.594 s
After:
range (master::draft) 72.542 ms
range (recent_draft::drafts) 14.932 ms
In my fbsource checkout there were 20k+ heads. The improvement of
`master::recent_draft` (`x::y`) is pretty visible, and `y::` is also improved:
% lhg debugbenchmarkrevsets -m -x 'p1(min(7e8c86ae % master))' -Y 'draft() & 7e8c86ae' -e 'x::y' -e 'y::' --no-default
# x: 168f5228e570fb6b2ff7f851bd82413102748d84 (p1(min(7e8c86ae % master)))
# y: 7e8c86aec68ebc6e0b8254afcb381315991fd21c (draft() & 7e8c86ae)
# before
| revset \ backend | segments | revlog | revlog-cpy |
|------------------|----------|--------|------------|
| x::y | 17ms | 0.1ms | 0.5ms |
| y:: | 3.3ms | 0.7ms | 1.3ms |
# after
| revset \ backend | segments | revlog | revlog-cpy |
|------------------|----------|--------|------------|
| x::y | 0.2ms | 0.1ms | 0.6ms |
| y:: | 1.0ms | 0.7ms | 1.3ms |
Reviewed By: sfilipco
Differential Revision: D23214387
fbshipit-source-id: 4d11db84cd28f4e04e8b991cbc650c9d5781fd27
Summary:
Lots of non-master heads is not an exercised graph in the benchmarks.
Add it as it practically happens. This will be used by the next change.
Reviewed By: sfilipco
Differential Revision: D23259879
fbshipit-source-id: 7fe290d14403e42e6d135bde56e2d5c8519ae530
Summary:
Currently the fuzz test only uses the master group. Let it exercise non-master
group too.
Reviewed By: DurhamG
Differential Revision: D23214388
fbshipit-source-id: 7108a1055fbdda2b012f93c5948fb83ef3b9a96f
Summary: Provide a way to see segments.
Reviewed By: sfilipco
Differential Revision: D23196408
fbshipit-source-id: b1418f945a5a3364ac73b0f97466d973dd4b6300
Summary:
Provide a way to print out all segments with resolved names. This will be used
in a debug command.
Reviewed By: sfilipco
Differential Revision: D23196410
fbshipit-source-id: 1712bfda0271aa548699fe4a6b8603c5ec07af7f
Summary:
Use the parent-child index to answer children query quickly.
`cargo bench --bench dag_ops -- children`:
Before:
children (spans) 606.076 ms
children (1 id) 124.105 ms
After:
children (spans) 602.999 ms
children (1 id) 10.777 ms
Reviewed By: sfilipco
Differential Revision: D23196411
fbshipit-source-id: 37195d5ccaa582d35314e0000352ef477287d38c
Summary: This will be used to optimize "children(single vertex)" query.
Reviewed By: sfilipco
Differential Revision: D23196409
fbshipit-source-id: 050c0859faf83b909e3174bb7c7bd6e7725165c0
Summary:
Update the parent index to store non-master group too. To make
"remove_non_master" work, the index contains a "child group" prefix that
allows efficient range invalidation.
This will allow answering "children(single vertex)" query more efficiently.
This diff does not expose an API to query the index yet.
Reviewed By: sfilipco
Differential Revision: D23196406
fbshipit-source-id: 9137da5ffa8306bdafbcabc06b6f0d23f38dcf57
Summary:
Practically, the input of `children` is often one vertex instead of a large set.
Add a benchmark for it.
It looks like:
children (spans) 606.076 ms
children (1 id) 124.105 ms
Reviewed By: sfilipco
Differential Revision: D23196407
fbshipit-source-id: 0645b59ac846836fd061386384f6386a57661741
Summary: They can be figured out at Hints initialization time. So they don't need to be mutable.
Reviewed By: sfilipco
Differential Revision: D23182518
fbshipit-source-id: 133375fdf27a2546a50b63fb130534acdadc5938
Summary:
Both IdSet and IdLazy set require both Dag and IdMap to construct.
This is step 1 torwards making Dag and IdMap immutable in hints.
A misspeall of "lhs" vs "hints" in the union set is discovered by the change
and fixed.
Reviewed By: sfilipco
Differential Revision: D23182520
fbshipit-source-id: 3d052de4b8681d3672ebc45d953d1e784f64b2a4
Summary:
It will be used in places (ex. tests) where a Dag is required but constructing
a real Dag is troublesome.
Reviewed By: sfilipco
Differential Revision: D23182517
fbshipit-source-id: 736911365778e5071c1e0b9615090a4e960392a0
Summary: This is more consistent with `id_map_snapshot`.
Reviewed By: sfilipco
Differential Revision: D23182519
fbshipit-source-id: 62b7fc8bfdc9d6b3a4639a6518ea084c7f3807dd
Summary:
Similar to descendants, the new range algorithm avoids potentially expensive
checks about whether high-level segments can be used or not. Practically this
is overall an improvement.
`cargo bench --bench dag_ops -- range`:
Before:
range (2 ids) 115.380 ms
range (spans) 243.666 ms
After:
range (2 ids) 123.274 ms
range (spans) 23.101 ms
It is 100x faster with the range x::y benchmark added later on `git.git`.
Reviewed By: sfilipco
Differential Revision: D23106175
fbshipit-source-id: 691e0418ba2b7ad9f52ac15b5cd6088ec28d5f48
Summary:
The old algorithm tries to make use high-level segments.
However, the code to test whether a high-level segment can be used is
often too expensive for the benefit. Often, high-level segments cannot
be used most of the time and it's similar to O(flat segments).
This diff adds a simpler algorithm that just iterates through the flat
segments. It's faster in most practical cases.
`cargo bench --bench dag_ops -- descendants` shows improvements too:
Before:
descendants (small subset) 436.515 ms
After:
descendants (small subset) 33.460 ms
Reviewed By: sfilipco
Differential Revision: D23106174
fbshipit-source-id: e6101483d8539b2b1c881be2ccfd0071f122352f
Summary: This will be used by upcoming changes.
Reviewed By: sfilipco
Differential Revision: D23106177
fbshipit-source-id: 9bf183f7464c06b801be64fd938db0babd544756
Summary: This internal struct will be used by upcoming changes.
Reviewed By: sfilipco
Differential Revision: D23106172
fbshipit-source-id: 6d5b9bc1c810984814d0912100acca38a2565a63
Summary:
Our internal build infra creates a workspace and workspaces don't like
it when two crates have the same name. Eden scm had third-party rust crates that
were simple redirects to the internal location, but had the same name. This
caused breakages once these crates became part of the edenfs open source build.
Let's rename them to avoid this issue.
Reviewed By: kulshrax
Differential Revision: D23252539
fbshipit-source-id: 9ff2fa160a19c6bc54e015c71f9da7044ce659a7
Summary:
We have a thread blocking application. We have async libraries. This crate
provides common utilities for communicating between the blocking world and the
async world. It is intended to be a guide so that not all developers have to
get in depth understanding of Tokio in order to use async functions.
Reviewed By: quark-zju, xavierd
Differential Revision: D23222876
fbshipit-source-id: b9a61795bc917bfc664c9d6da95c9e5e2d506c79
Summary: The default method implementations on this trait were causing unused variable warnings. Prefix them with underscores to silence the warning and add `#![deny(warnings)]` to `revisionstore` to prevent future warnings from slipping through the cracks.
Reviewed By: singhsrb
Differential Revision: D23334309
fbshipit-source-id: d17b27ca0dd462e1613eac918fb595faa8637741
Summary:
We want to eventually get rid of system and repo configs, but for now
they should take precedence over the dynamicconfigs. Previously we relied on
validation to remove any entries from dynamicconfig that interferes with a
system rc config, but in some code paths we didn't run that validation (like if
we loaded configs purely from Rust).
Let's just make dynamicconfig load before system configs. If validation doesn't
run, we might miss the case where dynamicconfig sets a value and the system rc
doesn't. But that's probably fine.
Reviewed By: quark-zju
Differential Revision: D23305711
fbshipit-source-id: 77b5f49d348cfa116694a641ed17e6d1184a81ab
Summary:
Dynamicconfigs are now part of our critical path. Let's remove the
option to not load them. This also let's us get rid of a circularl dependency
where loading dynamicconfigs required having already loaded some configs. This
will let us move dynammicconfig loading to be before system rc loading in a
later diff.
Reviewed By: sfilipco
Differential Revision: D23309090
fbshipit-source-id: 5138059a8ed944c3616007e7c1289b6a57be0e65
Summary:
Dynamicconfig was throwing errors if hgrc.dynamic wasn't writable.
Let's eat those errors for normal read operations. We still treat it as an error
for straight hg debugdynamicconfig invocations.
Reviewed By: quark-zju
Differential Revision: D23301100
fbshipit-source-id: ed0bd1282d2c7ee747f0909c238a5fa07b7bc9bc
Summary: This makes it a bit easier to track down perf issues printed by RUST_LOGs.
Reviewed By: sfilipco
Differential Revision: D23095463
fbshipit-source-id: 78221a1992389f512fac6e6e633be6d19123e04a
Summary:
The fuzz tests need `TestContext::id_dag()`, which was removed by D20471712 (1fb5acf242).
Restore it so fuzz tests can run. This is mainly to check the new `range`
function.
The `range` fuzz test does find an issue caused by `>` written as `>=`
relatively quickly.
Reviewed By: sfilipco
Differential Revision: D23106176
fbshipit-source-id: e9540cc932503a9d54246d24c70bac829fcb13df
Summary: Migrate to concrete types so it can be typechecked.
Reviewed By: DurhamG
Differential Revision: D23095469
fbshipit-source-id: 27c6da30ca8a1329df544cd2ded7d9734593e48a
Summary:
Read git commit graph and migrate them to `dag::Dag`.
This allows using Rust dag abstractions on the git
commit graph.
Reviewed By: DurhamG
Differential Revision: D23095471
fbshipit-source-id: 2163701350ce82ce6e97074e56ad5877f3c9c158
Summary:
The new path does not calculate the complicated `successorssets`, and is
known to make wez's repo operations significantly faster (which, I suspect is
slowed by a very long chain).
The new code is about 3x faster on my repo too:
# before
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 246 ms, sys: 42.2 ms, total: 288 ms
Wall time: 316 ms
Out[2]: 1127
# after
In [1]: list(repo.nodes('draft()'))
In [2]: %time len(m.mutation.obsoletenodes(repo))
CPU times: user 74.3 ms, sys: 7.92 ms, total: 82.3 ms
Wall time: 82.3 ms
Out[2]: 1127
Reviewed By: markbt
Differential Revision: D23036063
fbshipit-source-id: afd6ac122bb5d8d513b5cdc033e04d2c377286eb
Summary:
Optimize get_dag:
- Avoid parsing mutation entries once they are parsed, by keeping an in-memory
`parent_map`.
- Pass `heads` to `add_heads` so the segments are less fragmented, cycle break
helper is more efficient.
The `heads` optimization is effective. Practically this makes `get_dag` about 2x faster.
This has a subtle change on cycle handling - full cycle without any non-cycle heads will
be ignored. Practically cycles are rare so it might be okay.
Together with improvements on the `dag` side, `get_dag` is about 4x faster.
Reviewed By: markbt
Differential Revision: D23036062
fbshipit-source-id: 3dc407b562f7ebf2543a87c5cd651ad6a2339d67
Summary:
If there is no new master segments, it's still possible to have new non-master
segments. Fix the loop condition so we don't skip building non-master segments.
Reviewed By: sfilipco
Differential Revision: D23095465
fbshipit-source-id: 46eb9d5b5f2b04241981558646e0bc090652abce
Summary:
I noticed that high-level segments are somehow not built for non-master vertexes.
Add a test to demonstrate the issue.
Reviewed By: DurhamG, sfilipco
Differential Revision: D23095466
fbshipit-source-id: c5a6da14bdfabcf7c432f6c6dfe096c71cc10ee9
Summary: This is useful to investigate internals of dag calculations.
Reviewed By: sfilipco
Differential Revision: D23095473
fbshipit-source-id: 4750c1b4ffad32b1317051d17db9659aaaed59c4
Summary:
Follow up of the previous change by actually using the flat segments to build
segments. This significantly improved the perf. `cargo bench --bench dag_ops`
shows:
building segments (old) 774.109 ms
building segments (new) 143.879 ms
Besides, a `O(N^2)` update to `head_ids` is changed. It improves performance
when the graph has many heads (ex. the mutation graph).
Reviewed By: sfilipco
Differential Revision: D23036080
fbshipit-source-id: 033565700f253c6f20e30a00adb6b579921d6679
Summary:
While testing the `obsolete()` set, I found an in-memory segmented DAG takes
10x time to build than a HashMap DAG.
Part of the inefficiency is to use a translated "parent_func" that round-trips
through Id and Vertex, used by segment building logic. This diff makes
`IdMap::assign_head` return flat segments, so we don't need a translated
"parent_func" to build flat segments.
This diff only adds checks to make sure the parent_func (Id version) matches
the segments. The next diff switches the segment building to not use the
translated parent_func.
Reviewed By: sfilipco
Differential Revision: D23036060
fbshipit-source-id: 99137f4b5be455cdf43218ba23eb3954b6d9e05a
Summary:
This affects the `tonodes` API in the Python world. Practically this will bind
the main commit graph to sets like draft, public.
The `ToSet` requirement on `DagAlgorithm` has to be removed to avoid stack
overflow of rustc resolving constraints.
Reviewed By: sfilipco
Differential Revision: D23036077
fbshipit-source-id: 912b924e29611680ab6b2ee4dbcd7ab39824409a
Summary: This will be useful for the `obsolete()` set.
Reviewed By: sfilipco
Differential Revision: D23036072
fbshipit-source-id: 2f944ef31cf19f902622d90545fa02b7dda89221
Summary:
If two sets have different IdMap, their Ids cannot be compared directly
for correctness.
Reviewed By: sfilipco
Differential Revision: D23036068
fbshipit-source-id: e800e8273b95c1f8174236e0f30445db7fd44556
Summary: This is similar to the previous change. This allows "binding" IdMaps to sets.
Reviewed By: sfilipco
Differential Revision: D23036058
fbshipit-source-id: ec1b1ec73e949ad4865aecf17bfcc5c1ca723e0d
Summary:
This trades a bit performance (calculating the snapshot) for correctness (no
pointer reuse issues) and convenience (set captures dag information with them
and enables use-cases like converting NameSet from another dag to the
current dag without requiring extra `dag` objects).
Reviewed By: sfilipco
Differential Revision: D23036067
fbshipit-source-id: 2e691f09ad401ba79dbc635e908d79e54dadca5e
Summary:
If `x` and `y` come from a same graph, `x & y` is more efficient than
`y & x` if `y` is larger. However, if `x` and `y` are from different
graphs, the `FULL` hint can no longer accurately predict which one
is larger. Therefore the swap should be avoided.
Reviewed By: sfilipco
Differential Revision: D23036081
fbshipit-source-id: fe3970fc38c853b36689bfd0ee1dec20643ace78
Summary:
For sets like `obsolete()`, `merge()`, they could have a fast "contains" path:
Just check the given commit without calculating a full set. It's also possible
to have a relatively efficient code path to return StaticSet (for obsolete()),
or IdStaticSet (for merge(), by checking flat segments). This diff adds a
`MetaSet` that allows defining two fast paths separately.
This will be used for the `obsolete()` set in upcoming changes.
Reviewed By: sfilipco
Differential Revision: D23036059
fbshipit-source-id: 06e6f90e7e9511626a12cfa729c306ff539256d2
Summary:
Before this change, `flush` with empty changes but `master` moves will cause an
error, because the `parents_func` only contains "pending changes", aka. new
vertexes. The `parents_func` does not know `master` and `master` is needed to
re-assign them from the non-master to the master group.
With the snapshot API, things become easier. We just take a snapshot before
reloading, and use the snapshot to answer parent_names.
Reviewed By: sfilipco
Differential Revision: D22970569
fbshipit-source-id: 99a25857ba98792edff69985c16df118a560ffb0
Summary:
This API allows the underlying Dag to provide a snapshot. The snapshot can then
be used in places that do not want a lifetime (ex. NameSet).
Reviewed By: sfilipco
Differential Revision: D22970579
fbshipit-source-id: ededff82009fd5b4583f871eef084ec907b45d33
Summary:
Make it possible to snapshot a Dag. This is useful for cases where another
struct wants access to the Dag without lifetimes. Namely, the LazySet can
might want to keep a snapshot of Dag.
Reviewed By: sfilipco
Differential Revision: D22970568
fbshipit-source-id: 508c38d3ffac2ffcd2e682578c3c5e5787ea3bcf
Summary:
The only intended use of the inverse DAG is to implement the Python dag
interface in `dagutil.py`. D22519589 (2d4d44cf3d) stack changed it so the Python dag
interface becomes optional. Therefore there is no need to keep the inverse DAG
interface, which is a bit tricky on sorting.
Reviewed By: sfilipco
Differential Revision: D22970581
fbshipit-source-id: 58a126b41d992e75beaf76ece25cb578ee84760b
Summary:
This will be used for migrating revlog DAG to segmented changelog. It does not
migrate commit text data (which can take 10+ minutes).
Reviewed By: DurhamG, sfilipco
Differential Revision: D22970582
fbshipit-source-id: 125a8726d48e15ceb06edb139d6d5b2fc132a32c
Summary:
Dynamicconfigs compares the timestamp of config files with the current
timestamp to determine when to regenerate. If the timestamp of the config file
is newer than the current timestamp, Rust throws an exception. Let's handle that
case and treat it as if the file was just created instead of crashing.
Reviewed By: quark-zju
Differential Revision: D23230216
fbshipit-source-id: ca185de7dfca46953e04ec08c84668eda6d749bd
Summary: This fixes the Windows build.
Reviewed By: farnz
Differential Revision: D23212195
fbshipit-source-id: 159f3ddebf6a97f52f9b6c80ef19315c8f4b0c85
Summary:
This allows importing from other DAGs. It will be used to import revlog DAG to
the new segmented format.
Reviewed By: sfilipco
Differential Revision: D22970572
fbshipit-source-id: 0a183e7b64831574cc9c60d4639124d02d19cf43
Summary:
This allows dag to use renderdag in tests to verify graph result. Previously
it was hard because dag <-> renderdag would form circular dependency.
It also make it possible to implement more efficient and integrated fast paths
for graph rendering.
Reviewed By: sfilipco
Differential Revision: D22970570
fbshipit-source-id: 526497339bd7aa8898d1af4aa9cf6d2a6797aae0
Summary: This will be used to describe what the commit graph backend is.
Reviewed By: sfilipco
Differential Revision: D22970577
fbshipit-source-id: 753efdbdd4466730ece758d9f4789fbd21e2801b
Summary:
This allows us to try segmented changelog while maintaining revlog
compatibility.
Reviewed By: sfilipco
Differential Revision: D22970583
fbshipit-source-id: 7c43cdadd76300e76e89f38aac5ed3ecc0cff728
Summary:
We missed a Windows http client breakage because our LFS server integration
wasn't run on Windows. Let's enable the fb feature for all our cargo test runs.
Reviewed By: singhsrb
Differential Revision: D23140315
fbshipit-source-id: 46cc533c1e543ffc32d472b49a8f6daeee3b5009
Summary:
Aux data wire protocol part 1: field annotations & basic compatibility model.
Annotates fields in `file`, `tree`, and `complete_tree` wire structs with `#[serde(rename = "N", default, skip_serializing_if = "is_default")]`. I've avoided using `#[serde(default)]` on the container structs themselves because this can cause some confusion / incorrect behavior if not used carefully. Consider a wire struct `FooRequest` with a field of type `Option<Bar>`. `Option<Bar>` defaults to `None`. If `FooRequest`'s `Default` implenentation sets the field's default to `Some(bar)`, a `FooRequest` explicitly constructed with `None` for the field will be serialized with the field omitted (because it passes `is_default`) and will be deserialized on the server as `Some(bar)`, causing incorrect behavior. To address this, we'd need to change the `is_default` function used with `skip_serializing_if` to check against the field's default value as set by the container, which isn't trivially possible without some sort of reflection (please correct me if you know a good way to achieve this). This is unfortunate, as it'd be very desirable for the container to be able to set defaults different from the individual field type defaults, for cases where one boolean, for instance, should default to true. As-is, we'd need to address this with wrapper types instead, where we can fully control the `Default` implenentation.
We can, of course, address this by providing an alternate `skip_serializing_if` function to fields with default that doesn't match that set by the container. This will need to be done carefully, though, to avoid the issue I described above.
Currently the JSON module manually serializes and deserializes all the top-level request objects, so the rename annotation doesn't impact it. We can add `#[serde(alias = "rustfieldname")]` if we'd like the server and client to be able to accept manually-crafted requests and responses with explicit field names. This could also be useful to replace the manual parsing in the JSON module, but can't replace the manual serialization in a clean way. We'd need to introduce a second copy of the wire types, without the serde `rename` attribute, to allow serializing with the actual rust field name.
I've only modified the `tree`, `file`, and `complete_tree` modules. I intend to eventually update the rest of the edenapi protocol later on, when the implementation of `file` and `tree` are complete / stable. This will give us a chance to fix any mistakes before copying the design to more places.
Note: I do not intend to keep to proper wire protocol compatibility at this stage in the implementation. Expect field numbers to be re-used by non-compatible changes.
Reviewed By: kulshrax
Differential Revision: D23172756
fbshipit-source-id: 39976ed4bede892bd6981f9c3f23557a91f9028b
Summary:
As noted in the documentation for it, this can be removed once get and prefetch
return a continuation. This is now done, and thus we can remove it entirely.
Mis-use of it caused data to be fetched twice: once by memcache, and the second
one by getpackv2.
Reviewed By: singhsrb
Differential Revision: D23123344
fbshipit-source-id: 9ac0594faaba94ead04a8bb9035e14809a706641
Summary: The python code stripped new lines but the Rust code did not.
Reviewed By: singhsrb
Differential Revision: D23167515
fbshipit-source-id: add33ec6e4cfd9169e6fef8208490e0aeede38bd
Summary:
This new disallowlist will let us specify config section.key's which
should not be accepted from old rc files. This will let us incrementally disable
loading of those configs from the old files, which will then let us delete them
from the old rc's and eventually delete the old rc's entirely.
This diff also removes hgrc.local and hgrc.od from the list of configs we
verify, since those are not on the list of configs that need to be removed in
this initiative.
Reviewed By: quark-zju
Differential Revision: D23065595
fbshipit-source-id: 5cd742d099efd651174cab5e87bb7cdc4bae8054
Summary:
Previously the backing store was loading configs manually. Now that
system, dynamic, user, and repo config loading are unified, let's go through
that approved path.
Reviewed By: kulshrax
Differential Revision: D22736338
fbshipit-source-id: 232023e660107a096691e9d99bf89c04c218dfbd
Summary:
This threads the calls to load_dynamic and load_repo through the Rust
layer up to the Python bindings. This diff does 2 notable things:
1. It adds a reload API for reloading configs in place, versus creating a new
one. This will be used in localrepo.__init__ to construct a new config for the
repo while still maintaining the old pinned values from the copied ui.
2. It threads a repo path and readonly config list from Python down to the Rust
code. This allows load_dynamic and load_repo to operate on the repo path, and
allows the readonly filter to applied to all configs during reloading.
Reviewed By: quark-zju
Differential Revision: D22712623
fbshipit-source-id: a0f372f4971c5feac2f20e89a0fb3fe6d4a65d6f
Summary:
In a future diff we'll enable dynamic and repo config loading purely
from Rust. To do so we need load functions for both cases. A future diff will
call these.
The dynamicconfig loading is based off the Python equivalent in uiconfig.py
Reviewed By: quark-zju
Differential Revision: D22712624
fbshipit-source-id: ff46f6315fb80d4cd9e31d875ac60264563b12f2
Summary:
Previously load_system would skip loading if HGRCPATH was present and
then load_user would actually load the HGRCPATH. In an upcoming diff I add
load_dynamic, which happens after system but before user. The tests for
dynamicconfig depend on HGRCPATH being loaded when load_dynamic runs, so let's
move HGRCPATH loading up to load_system.
Reviewed By: quark-zju
Differential Revision: D22712627
fbshipit-source-id: 91175d9d7f85b9392ffea4af815a4facebbfe7c1
Summary:
In a future diff we'll allow an outside caller to pass an Options down
to configparsers::hg::load() so that filters can be applied during loading. Inside
hg::load() we need to use the options multiple times with different values, so
let's make Options clonable.
Reviewed By: quark-zju
Differential Revision: D22712626
fbshipit-source-id: 975145f38d35afe7d4a6c8e87071b0fb0ae74797
Summary:
A future diff will move all dynamic and repo config loading to be in
configparser. As part of this, let's simplify the repo.rs API to not pass
configs around everywhere.
Reviewed By: quark-zju
Differential Revision: D22712628
fbshipit-source-id: 79f23991aa826ce8b4f7430b45d7702efdc6b982
Summary:
Similar to the Python runbgcommand (extutil.py), this is a Rust utility that runs a
detached background process in a cross platform way.
This will be used in a later diff to run dynamicconfig generation in the
background.
Reviewed By: quark-zju
Differential Revision: D22712629
fbshipit-source-id: a317465bf03c96d977a203678e2bef13ce57cc12
Summary:
As part of moving all hg config loading and generation logic into Rust,
let's move the config generation logic from hgcommands and pyconfigparser to
configparser, unifying them at the same time.
Future diffs will move config loading in as well.
Reviewed By: quark-zju
Differential Revision: D22590208
fbshipit-source-id: d1760c404a6a5c57347df30713c20de55cfdb9a4
Summary:
A future diff will unify all config loading into configparser::hg, but
to do so we need dynamicconfig to live in configparser, so it can load
dynamicconfigs. Let's move everything in.
Reviewed By: quark-zju
Differential Revision: D22587237
fbshipit-source-id: 5613094175b6e1597aa113ee3e6d92ce7ec79f6d
Summary:
We had two spots that loaded system and user configs, one in the
pyconfigparser layer, and one in the pure rust config layer. In an upcoming diff
I'd like to move dynamicconfig loading down into the pure rust layer, so let's
unify these.
Reviewed By: quark-zju
Differential Revision: D22585554
fbshipit-source-id: 0cea7801ae1d5a3a3c12b80ee23b37f9e690e2bc
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089539
fbshipit-source-id: ebfc3beaf3c0fe5b01b87d97c19455b0a24afa72
Summary:
In a future diff we'll increase the size of the rotatelog temporarily
during clones. To do so we need it to be configurable.
Reviewed By: quark-zju
Differential Revision: D23089541
fbshipit-source-id: 5010e417a83a2611283322f1dbb7023f4286f503
Summary:
from_path is an awkward constructor because it doesn't pass any other
information, like a config object. It also requires that the constructor be very
generic across all the stores. Right now it's only needed for pack files, so
let's move it to it's own trait that is limited to pack files.
This will allow us to make the indexedlog store constructors more versatile in a
later diff. Once we get rid of pack files we can delete the StoreFromPath trait
entirely.
Reviewed By: xavierd
Differential Revision: D23089542
fbshipit-source-id: ea6c50853e5d5390a029002ef5d15c74fe41fe69
Summary: If parent_revs gets an out-of-bound rev, it should fail.
Reviewed By: sfilipco
Differential Revision: D23036071
fbshipit-source-id: 7fae0fd5adf07ac3c933a29d7d06289d8d740c60
Summary:
If the text starts with `\0`, the `\0` should be considered as part of the
uncompressed text instead of a separated header.
Reviewed By: sfilipco
Differential Revision: D22970575
fbshipit-source-id: 49e8a1a1ea42a3c4cf153b70f59fd0558dcfcede
Summary:
The parent handling is unsound when there are revs that are skipped. Fix it by
reasoning about commit hashes for parents.
Reviewed By: sfilipco
Differential Revision: D23036078
fbshipit-source-id: 8f710171471025cd48b3bd8f6ea57c68330eb8b8
Summary:
Windows defaults to checking a revocation server for ssl certs. Inside
our datacenter it can't reach the server and fails. We don't have this on for
any other platforms, so let's disable it.
Reviewed By: sfilipco
Differential Revision: D23121739
fbshipit-source-id: 4d44d2a065bf340a8f74332553deb09a9c61be9b
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.
Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to as a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
* `eden/scm/lib/revisionstore/src/edenapi`
* `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
* `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
* `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.
Reviewed By: kulshrax
Differential Revision: D22958373
fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.
The fsync logic is put in a separate crate to avoid slow compiles.
Reviewed By: DurhamG
Differential Revision: D22992103
fbshipit-source-id: b5503e498d5216d4ba19701ecd5582387e4f45f5
Summary: This allows callsites to get access to the storage.
Reviewed By: DurhamG
Differential Revision: D22992104
fbshipit-source-id: c72fa313be1468170c9728d3856f822bb6385dc8
Summary:
This makes the main command table cleaner.
I dropped the `indexedlogrepair` command as it cannot rebuild indexes. `hg
doctor` is a better replacement. Some debug commands are renamed so they
no longer have `-` in the command name.
Reviewed By: DurhamG
Differential Revision: D22992107
fbshipit-source-id: f65d74e36fb971e592ad0cc8be9a94e245c39662
Summary:
If every command lives in their module, then we can define the "module" interface:
- run(...): run the command
- doc(): the help text
- name(): command name, with aliases
Then the macro would make command registration look simpler.
This diff changes `status` to use the pattern as an example.
Reviewed By: DurhamG
Differential Revision: D22992109
fbshipit-source-id: eaf589863092ec2eb1f8c24c1c7e425492fe1e3a
Summary:
As the number of commands grows, it starts making sense to move them to
individual files. Let's create a directory for them.
Reviewed By: DurhamG
Differential Revision: D22992108
fbshipit-source-id: a0556be602b832579a8e027342d5b86d9d84d257
Summary:
EdenFS on Windows is a bit weird as ProjectedFS is implemented as a filter
driver that adds reparse point to all the files/directories to get notified of
filesystem operations on them. It then hides these reparse points from the
outside which means that the dwAttributes of a file in EdenFS will not claim
that a reparse point is attached to it. On top of this, newly created
files/directories won't have any reparse points attached to them, until they
start being tracked by EdenFS.
While the first issue can be solved by always querying the reparse tags, I'm
not entirely sure how to solve the second one. That second issue causes
Mercurial to always try to create hardlink in the .hg directory, while it shouldn't.
Reviewed By: DurhamG
Differential Revision: D22937788
fbshipit-source-id: 5d90cd37d40858ed60103ff2d17c2cef16472b38
Summary: Client portion for the commit/revlog_data endpoint that was added to the server.
Reviewed By: kulshrax
Differential Revision: D23065989
fbshipit-source-id: 3115ad2b426daca22472e2106fcd293f3ccd70f3
Summary:
When doing large clones or checkouts the amount of data we add to an
indexedlog can be many GB. On a laptop we don't have much memory, so let's set a
max memory threshold for the file data/history indexedlogs.
Reviewed By: xavierd
Differential Revision: D23046489
fbshipit-source-id: 43b7686b11fe05e4c074bcb02c475ebf8cf14ab1
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repostories
need to have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt`, which provides a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.
Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.
Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.
Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).
Reviewed By: DurhamG
Differential Revision: D22927203
fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
Summary:
Add a `buffered()` method to `AsyncResponse` allowing the user to specify the desired chunk size for the body stream.
(This was already used internally by `CborStream`; this just exposes it in the public interface.)
Reviewed By: quark-zju
Differential Revision: D22935891
fbshipit-source-id: e110e85bf9cb4c7923a8977ea4631ca1cc4cf4cb
Summary: Rename the `cbor` module to `stream` to better indicate that it contains various stream combinators (not all of which are related to CBOR).
Reviewed By: quark-zju
Differential Revision: D22935892
fbshipit-source-id: 3f73aa707ab59c31717c1cf35995ad79946a15c9
Summary:
This provides Ctrl+C/SIGKILL safety. It's needed because we no longer use the
Python transaction framework. If the write is incomplete, the revlog index
logical length will ensure new processes won't see incomplete data.
The length of revlog data is not tracked, as some "unused" in it does not
really matter. Reading the revlog should be still fine.
Reviewed By: sfilipco
Differential Revision: D22914423
fbshipit-source-id: f2f446cde79c7270cbd1ef165f8707368a0a2990
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error type not interested
by `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.
Reviewed By: sfilipco
Differential Revision: D22883865
fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.
Reviewed By: sfilipco
Differential Revision: D22883864
fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
Summary:
All dependencies of revlogindex have migrated to concreted error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.
Reviewed By: sfilipco
Differential Revision: D22857554
fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
Summary: The `radixbuf` crate already has its own concrete error type. Use it.
Reviewed By: sfilipco
Differential Revision: D22855450
fbshipit-source-id: 307a46ddd79b28a18ee779867ee1e604b531828a
Summary:
`util` as a low-level library should use concrete error types so callsite can
type check their type conversions.
Reviewed By: sfilipco
Differential Revision: D22855448
fbshipit-source-id: 37b3fce36f1ae82a9604ef8ac0dc22c02280ceb2
Summary:
Change the `lz4-pyframe` library to use a concrete error type instead of trait
object. This would allow callsites to type check the error type.
Reviewed By: sfilipco
Differential Revision: D22855449
fbshipit-source-id: 3497b3e0bfb814302fee2f7297b35de8b8a916ed
Summary: This is unused, no need to add to the build time.
Reviewed By: fanzeyi
Differential Revision: D22967814
fbshipit-source-id: 91a5ed9f03128947af9cb69bca62ed75b75e7e66
Summary:
We were experiencing hangs during lfs http fetches. We've seen similar
issues before when using Hyper, which Reqwest is based off of. Let's switch to
the new Curl-based http_client instead.
Note, this does get rid of the ability to obey hg config http proxy settings.
That said, they shouldn't be used much, and the http proxy environment variables
are respected by libcurl.
Reviewed By: xavierd
Differential Revision: D22935348
fbshipit-source-id: 1c61c04bbb4043e3bde592251f12bf846ab3afd4
Summary:
A future diff does LFS fetches via http_client. Curl has some default
behavior of adding the "Expect: 100-continue" header which causes the server to
send a 100 status code response after the headers have been received but before
the the payload has been received. Since the http_client model only expects a
single response, this breaks the model and we're unable to read the second
response. Let's disable this behavior by manually setting the header to empty
string, which appears to be the official way to handle this.
Add it early so callers can overwrite it.
Reviewed By: quark-zju
Differential Revision: D22935349
fbshipit-source-id: 3009a5eb72f40584b846510f34f121e0e821a2bc
Summary:
One repack code path would return an error if the pack was already
deleted before the repack started. This is a fine situation, so let's just eat
the error and repack what can be repacked.
Reviewed By: xavierd
Differential Revision: D22873219
fbshipit-source-id: c716a5f0cd6106fd3464702753fb79df0bc7d13f
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.
This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and reponse should have.
Reviewed By: kulshrax
Differential Revision: D22702016
fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
Summary:
During large prefetches, (say a clone), it is possible that 2 different
filenode actually refer to the same file content, which thus share the same LFS
blob. The code would wrongly prefetch this blob twice which would then fail due
to the `obj_set` only containing one instance of this object.
Instead of using a Vec for the objects to prefetch, we can simply use a
`HashSet` which will take care of de-duplicating the objects.
Reviewed By: DurhamG
Differential Revision: D22903606
fbshipit-source-id: 4983555d2b16639051acbbb591ebb752d55acc2d
Summary:
There was a small but easy to miss mistake when prefetch was changed to return
the keys that couldn't be prefetched. For LFS pointers, the code would wrongly
return that the blob was fetched, which is misleading as the LFS blob isn't
actually downloaded. For LFS pointers, we need to translate them to their LFS
blob content hashes.
Reviewed By: DurhamG
Differential Revision: D22903607
fbshipit-source-id: e86592cd986498d9f4a574585eb92da695de2e27
Summary:
An earlier diff, D21772132 (713fbeec24), add an option to default hgcache data store
writes to indexedlog but it only did it for data, not history. Let's also do it
for history.
Reviewed By: quark-zju
Differential Revision: D22870952
fbshipit-source-id: 649361b2d946359b9fbdd038867e1058077bd101
Summary: This makes it a little bit easier to use.
Reviewed By: sfilipco
Differential Revision: D22853717
fbshipit-source-id: aa3c1ed2a9a2d1020a48a4493a644093d8b07e67
Summary:
We're seeing users report lfs fetching hanging for 24+ hours. Stack
traces seem to show it hanging on the lfs fetch. Let's read bytes off the wire
in smaller chunks and add a timeout to each read (default timeout is 10s).
Reviewed By: xavierd
Differential Revision: D22853074
fbshipit-source-id: 3cd9152c472acb1f643ba8c65473268e67d59505
Summary:
Although new changelog revlogs do not use deltas since years ago, early
revisions in our production changelog still use mpatch delta format
because they are stream-cloned.
Teach revlogindex to support them.
Reviewed By: sfilipco
Differential Revision: D22657204
fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.
Reviewed By: sfilipco
Differential Revision: D22657193
fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.
Reviewed By: sfilipco
Differential Revision: D22657201
fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
Summary:
This replaces RustError that might happen during `addcommits`, and allow us to
handle it without having a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary:
Follow up of D22638454.
This makes revlogindex marks its compatible DAG so "all()" fast paths can be used properly.
Reviewed By: sfilipco
Differential Revision: D22638459
fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
Reviewed By: sfilipco
Differential Revision: D22638462
fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:
localrepodag.all() & remotedag.all()
should not simply return `localrepodag.all()` or `remotedag.all()`.
Fix it by checking DAG pointers.
A StaticSet might be created without using a DAG, add an optimization
to change `all & static` to `static & all`. So StaticSet without DAG
wouldn't require full DAG scans when intersecting with other sets.
Reviewed By: sfilipco
Differential Revision: D22638454
fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
large set to show, `3` defines hex commit hash length. `#` specifies the
alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.
Python bindings uses `fmt::Debug` as `__repr__` and will be affected.
Reviewed By: sfilipco
Differential Revision: D22638455
fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.
```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>
In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs
In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```
Reviewed By: sfilipco
Differential Revision: D22638458
fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculates `heads_ancestors(all())` automatically and gets
the speed-up.
Reviewed By: sfilipco
Differential Revision: D22638464
fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
Summary:
Optimize it to not covert revs to `Vec<u32>`, and have a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster and match
the C revlog `headrevs` performance when calculating `headsancestors(all())`:
```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop
In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```
Reviewed By: sfilipco
Differential Revision: D22638461
fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headsrevs` implementation).
Reviewed By: sfilipco
Differential Revision: D22638453
fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary: This avoids depending on the C index if the Rust DAG is available.
Reviewed By: DurhamG
Differential Revision: D22519587
fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
Summary:
Better Engineering: remove dead code about secret tool.
Secret tool is a FB specific tool (keychain like) and has been used to transfer OAuth token between
different devservers without user's involvement. We have migrated to certs on devservers, so it is not needed anymore.
Also, it is FB specific and doesn't make sense for open source either.
Reviewed By: mitrandir77
Differential Revision: D22827264
fbshipit-source-id: cd89168ad75ca041d2a0f18d63474dd1eaad483d
Summary:
If the LFS server is down, we are going to retry fetching filenode from the
Mercurial server directly, who is expected to not return a pointer.
Previously, this was achieved by adding a hack into `get_missing`, but since
the function is no longer called during prefetch calls, we cannot rely on it
anymore. Instead, we can wrap the regular remote store and translate all the
StoreKey::Content onto their corresponding hgid keys.
Reviewed By: DurhamG
Differential Revision: D22565604
fbshipit-source-id: 2532a1fc3dfd9ba5600957ed5cf905255cb5b3fd
Summary:
The ContentStore code has a strong assumption that all the data fetched ends up
in the hgcache. Unfortunately, this assumption breaks somewhat if an LFS
pointer is in the local store but the blob isn't alonside it. This can happen
for instance when ubundling a bundle that contains a pointer, the pointer will
be written to the local store, but the blob would be fetched in the shared
store.
We can break this assumption a bit in the LFS store code by writing the fetched
blob alongside the pointer, this allows the `get` operation to find both the
pointer and the blob in the same store.
Reviewed By: DurhamG
Differential Revision: D22714708
fbshipit-source-id: 01aedf04d692c787b7cddb0f7a76828ea37dcf29
Summary:
The pointer for a blob might very well be in the local store, so let's search
in it.
Reviewed By: DurhamG
Differential Revision: D22565608
fbshipit-source-id: 925dd5718fc19e11a1ccaa0887bf5c477e85b2e5
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary: It was once used by EdenFS, but is now dead code, no need to keep it around.
Reviewed By: singhsrb
Differential Revision: D22784582
fbshipit-source-id: d01cf5a99a010530166cabb0de55a5ea3c51c9c7
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None), the shared LFS store woud not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which in some case may not be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
Summary:
I thought I had removed that code a while back, it turns out I didn't, so let's
do it.
Reviewed By: singhsrb
Differential Revision: D22583556
fbshipit-source-id: b92644195994e0a83bdbcd8019253ea217474486
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.
Reviewed By: DurhamG
Differential Revision: D22646428
fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos. I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer to be compatible to run the tests.
Reviewed By: DurhamG
Differential Revision: D22402195
fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a
Summary:
I dropped the special case of wdir handling. With the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).
`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.
Reviewed By: DurhamG
Differential Revision: D22368836
fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
Summary: Allow the user to specify extra HTTP headers that should be sent with each EdenAPI request in the `edenapi.headers` config option. The field is expected to be a JSON object whose key-value pairs are used as header keys and values.
Reviewed By: quark-zju
Differential Revision: D22591870
fbshipit-source-id: ac1bb669270d667895554dcc5f7176d18736375c
Summary: Include the name of bad config fields in the error message so the user can more easily fix the problem.
Reviewed By: quark-zju
Differential Revision: D22591871
fbshipit-source-id: e23e2c71e49e0458e7ea5c13e7feac3a990ead0c
Summary: Update `edenapi::Builder` to use the CA certificate bundle specified in the `[auth]` section of the user's config.
Reviewed By: quark-zju
Differential Revision: D22591034
fbshipit-source-id: 3a417adbf50ef7d2c538f4a032e54a038cbd282e
Summary: Allow specifying a CA certificate bundle in the `auth` section of an `hgrc`. This is useful for testing with locally-built servers using self-signed certificates.
Reviewed By: quark-zju
Differential Revision: D22591045
fbshipit-source-id: 023fe006267b0b781a1af16a7505e188c008a8c0
Summary:
Implements based Rust-Python binding layer for error metadata propagation.
We introduce a new type, `TaggedExceptionData`, which carries CommonMetadata and the original (without metadata) error message for a Rust Anyhow error. This class is passed to RustError and can be accessed in Python (somewhat awkwardly) via indexing:
```
except error.RustError as e:
fault = e.args[0].fault()
typename = e.args[0].typename()
message = e.args[0].message()
```
As far as I can tell, due to limitations in cpython-rs, this can't be made more ergonomic without introducing a Python shim around the Rust binding layer, which could adapt the cpython-rs classes to use whatever API we'd like.
Currently, anyhow errors that are not otherwise special-cased will be converted into RustError, with both the original error message and any attached metadata printed as shown below
```
abort: intentional error for debugging with message 'intentional_error'
error has type name taggederror::IntentionalError and fault None
```
We can of course re-raise the error if desired to maintain the previous behavior for handling a RustError.
If we'd like other, specialized Rust Python Exception types to carry metadata (such as `IndexedLogError`), we'll need to modify them to accept a `TaggedExceptionData` like `RustError`.
Renamed the "cause an error in pure rust command" function to `debugcauserusterror`, and instead used the name `debugthrowrustexception` for a command which causes an error in rust which is converted to a Python exception across the binding layer.
Introduced a simple integration test which exercises `debugthrowrustexception`.
Added a basic handler for RustError to scmutil.py
Reviewed By: DurhamG
Differential Revision: D22517796
fbshipit-source-id: 0409489243fe739a26958aad48f608890eb93aa0
Summary: Move the `tokio::Runtime` into `EdenApiRemoteStore` so that if initialization fails, we can propagate the error instead of panicking.
Reviewed By: xavierd
Differential Revision: D22564210
fbshipit-source-id: 9db1be99f2f77c6bb0f6e9dc445d624dc5990afe
Summary:
Add a metadata field to `read_res` containing a `revisionstore::Metadata` struct (which contains the object size and flags). The main purpose of this is to support LFS, which is indicated via a metadata flag.
Although this change affects the `DataEntry` struct which is serialized over the wire, version skew between the client and server should not break things since the field will automatically be populated with a default value if it is missing in the serialized response, and ignored if the client was built with an earlier version of the code without this field.
In practice, version skew isn't really a concern since this isn't used in production yet.
Reviewed By: quark-zju
Differential Revision: D22544195
fbshipit-source-id: 0af5c0565c17bdd61be5d346df008c92c5854e08
Summary: Instead of restricting the allowed characters in a repo name, allow any UTF-8 string. The string will be percent-encoded before being used in URLs.
Reviewed By: quark-zju
Differential Revision: D22559830
fbshipit-source-id: f9caa51d263e06d424531e0947766f4fd37b035f
Summary: Makes the hostname available for dynamicconfig conditions.
Reviewed By: quark-zju
Differential Revision: D22537946
fbshipit-source-id: 630ee833bb3ec00253d718b3d03bbb8b3d49afca
Summary: Adds support for sharding based on user name.
Reviewed By: quark-zju
Differential Revision: D22537540
fbshipit-source-id: 962f9582c8947335dc9d9d29c500d8c09df69878
Summary: We have several repos whose names contain various non-alphanumeric/underscore/hyphen characters, so we need to be more permissive about accepting repo names.
Reviewed By: quark-zju
Differential Revision: D22554846
fbshipit-source-id: e7bb030e0b8fb6aa275c119ba0aa540405b29186
Summary:
Previously, the EdenAPI stores would not report errors returned from the remote store. The intention behind this pattern in other stores is to prevent `KeyError`s from aborting the operation since the local store might still have the key.
However, in the case of the EdenAPI store, EdenAPI will simply omit missing keys in its response rather than returning an error. Instead, any error returned by the EdenAPI store indicates a more fundamental problem (e.g., unable to reach the server, connection reset, etc) which should cause an abort and return the error.
Reviewed By: quark-zju
Differential Revision: D22544031
fbshipit-source-id: e01e8d88b75e46dcebd2eef5203e3a0edde69fc7
Summary: When working with large CBOR responses, it is sometimes useful to limit processing to the first N entries to prevent the operation from taking a long time. This diff adds an option to the `read_res` tool to only look at the first N entries in a data or history response.
Reviewed By: quark-zju
Differential Revision: D22544451
fbshipit-source-id: 5e8e2c7212aa3b315a25bd4cf9273009a5e43f72
Summary: Some repos do not have `remotefilelog.reponame` set, so this shouldn't be a required config item.
Reviewed By: fanzeyi
Differential Revision: D22553141
fbshipit-source-id: a0fe9c289a1a32650572a4c123cda60af90e79ec
Summary:
Introduce new rust library, taggederror, which contains utilities for attaching metadata to errors. The library provides two main methods for attaching metadata to an error, the TaggedError wrapper type, and the AnyhowExt trait methods. Provides a struct, CommonMetadata, which contains all the metadata types introduced by taggederror (fault, transience, category, and typename), which can also be attached individually (and the same pattern can be used to attach other metadata).
Introduce a new native rust command, debugthrowrustexception, which causes the command to return an error, with some attached metadata.
Modify hg rust native command dispatch error handling to use debug formatter to print anyhow::Error errors. This will print out the source chain, contexts, and backtrace if available, which will cause the metadata we attach as a wrapper error or context to be printed.
Reviewed By: DurhamG
Differential Revision: D22420941
fbshipit-source-id: d38c5a10b686d86b69a2c0a19f5bcbf4ca24dff6
Summary:
Previously you could only canary locally on a devserver by setting an
environment variable. Let's add a --canary flag to debugdynamicconfig that
accepts a host. Hg will ssh to that host and run the configerator cli to grab
the canaried config from that host.
Reviewed By: quark-zju
Differential Revision: D22535509
fbshipit-source-id: af1c21d8402c4e729769e50388d913bf52b66b89
Summary:
Previously we had no way of specifying an optional string flag. This
adds support.
I considered making the implementation more generic, so it'd support
Option<i64> and potentially Option<bool> but it introduced some complexity and
didn't seem worth the effort for now.
Reviewed By: quark-zju
Differential Revision: D22535511
fbshipit-source-id: 04d7b5419ca7ae44a9aeff1a5cea2c3043d80042
Summary:
Follow up on this diff: D22432330 (b7817ffbd8)
Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.
Reviewed By: farnz
Differential Revision: D22511368
fbshipit-source-id: e4be20e3112a8e8829298d5748657e9bdbde8588
Summary: Add an EdenAPI-backed history store. Notably, thanks to the strongly-typed remote store design from the previous diff, it is not possible to construct an `EdenApiHistoryStore` for trees, even when the underlying remote store is behind a trait object. (This is because EdenAPI does not support fetching history for trees.)
Reviewed By: quark-zju
Differential Revision: D22492162
fbshipit-source-id: 23f1393919c4e8ac0918d2009a16f482d90df15c
Summary: Reimplement `EdenApiHgIdRemoteStore` as `EdenApiRemoteStore<T>`, where `T` is a marker type indicating whether this store fetches files or trees. This allows working with the stores in a more strongly-typed way, and avoid having to check what kind of store this is at runtime when fetching data.
Reviewed By: quark-zju
Differential Revision: D22492160
fbshipit-source-id: e17556093fa9b81d2301f281da36d75a03e33c5e
Summary: Move `src/edenapi.rs` to `src/edenapi/mod.rs` in anticipation of adding more files to this module.
Reviewed By: quark-zju
Differential Revision: D22492161
fbshipit-source-id: f6252ea9a9e32d94029b8e6e76be5d9d1754f63d
Summary:
Previously we would audit all configs and report them if the
dynamicconfig did not match the rc-file config. Now that dynamicconfigs are
widely deployed, let's switch this around to auditing only configs we know have
had issues. This will let us start adding new configs via dynamicconfigs instead
of via the legacy staticfiles and chef, before we've finished migrating all the
legacy configs over.
Reviewed By: quark-zju
Differential Revision: D22401865
fbshipit-source-id: 5c41c674d39c8113b2a40da61e020e8a33c39312
Summary:
We're seeing cases were cloning can take 10's of GB of memory because
we pend all the history information in memory. Let's flush the history info
every 10 million adds to bound the memory usage.
10 million was chosen somewhat arbitrarily, but it results in pack files that
are 800MB, which corresponds roughly with 8GB of memory usage.
This requires updating repack to be aware that a single flush could produce
multiple packs. Note, since repack writes via this same path, it may result in
repack producing multiple pack files. In the degenerate case repack could
produce the same number (or more) of pack files than was inputted. If we set the
threshold high enough I think we'll be fine though. 800MB is probably
sufficient.
Reviewed By: xavierd
Differential Revision: D22438569
fbshipit-source-id: 425d5d3b7999b81e44d1dbe1f2a4ea453ab6ca4f
Summary: Per comments on D22429347, add a new `ExtractInnerRef` trait that is similar to `ExtractInner`, but returns a reference to the underlying value. A default implementation is provided for types whose inner value is `Clone + 'static`, so in practice most types will only need to implement `ExtractInnerRef`, whereas the callsite may choose whether it needs a reference or an owned value.
Reviewed By: quark-zju
Differential Revision: D22464158
fbshipit-source-id: 7b97329aedcddb0e51fd242b519e79eba2eed350
Summary: Ensure that all of the components of an EdenAPI response use the same error type.
Reviewed By: quark-zju
Differential Revision: D22443029
fbshipit-source-id: 3e00a8b83677beb5ef2d90630fe9b85760874186
Summary: Add an `add_entry` convenience method to `HgIdMutableDeltaStore`, similar to the one present in `HgIdMutableHistoryStore`.
Reviewed By: quark-zju
Differential Revision: D22443031
fbshipit-source-id: 84fdaae9fbd51e6f2df466b0441ec5f7ce6715f7