Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repostories
need to have all commits locally.
Reviewed By: krallin
Differential Revision: D22979458
fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
Summary:
Introduce taggederror-util, which provides a new trait `AnyhowEdenExt`, which provides a method `eden_metadata` for anyhow errors and results. This method works much like `AnyhowExt::common_metadata`, but additionally supports extracting default error metadata from known `Tagged` types which are listed explicitly in the method implementation.
Extend `FilteredAnyhow` to support a configuration "metadata function", which allows swapping out `eden_metadata` for the standard `common_metadata`.
Modify Rust dispatch and Python bindings to use `AnyhowEdenExt` for metadata extraction and printing.
Modify `intentional_error` to rely on `AnyhowEdenExt` for tagging (removes `.tagged` call, no tags will be visible if `AnyhowEdenExt` is not used).
Reviewed By: DurhamG
Differential Revision: D22927203
fbshipit-source-id: 04b36fdfaa24af591118acb9e418d1ed7ae33f91
Summary:
Add a `buffered()` method to `AsyncResponse` allowing the user to specify the desired chunk size for the body stream.
(This was already used internally by `CborStream`; this just exposes it in the public interface.)
Reviewed By: quark-zju
Differential Revision: D22935891
fbshipit-source-id: e110e85bf9cb4c7923a8977ea4631ca1cc4cf4cb
Summary: Rename the `cbor` module to `stream` to better indicate that it contains various stream combinators (not all of which are related to CBOR).
Reviewed By: quark-zju
Differential Revision: D22935892
fbshipit-source-id: 3f73aa707ab59c31717c1cf35995ad79946a15c9
Summary:
This provides Ctrl+C/SIGKILL safety. It's needed because we no longer use the
Python transaction framework. If the write is incomplete, the revlog index
logical length will ensure new processes won't see incomplete data.
The length of revlog data is not tracked, as some "unused" in it does not
really matter. Reading the revlog should be still fine.
Reviewed By: sfilipco
Differential Revision: D22914423
fbshipit-source-id: f2f446cde79c7270cbd1ef165f8707368a0a2990
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error type not interested
by `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.
Reviewed By: sfilipco
Differential Revision: D22883865
fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.
Reviewed By: sfilipco
Differential Revision: D22883864
fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
Summary:
All dependencies of revlogindex have migrated to concreted error types.
Let's migrate revlogindex itself. This allows compile-time type checks
and makes the error returned by revlogindex APIs more predictable.
Reviewed By: sfilipco
Differential Revision: D22857554
fbshipit-source-id: 7d32599508ad682c6e9c827d4599e6ed0769899c
Summary: The `radixbuf` crate already has its own concrete error type. Use it.
Reviewed By: sfilipco
Differential Revision: D22855450
fbshipit-source-id: 307a46ddd79b28a18ee779867ee1e604b531828a
Summary:
`util` as a low-level library should use concrete error types so callsite can
type check their type conversions.
Reviewed By: sfilipco
Differential Revision: D22855448
fbshipit-source-id: 37b3fce36f1ae82a9604ef8ac0dc22c02280ceb2
Summary:
Change the `lz4-pyframe` library to use a concrete error type instead of trait
object. This would allow callsites to type check the error type.
Reviewed By: sfilipco
Differential Revision: D22855449
fbshipit-source-id: 3497b3e0bfb814302fee2f7297b35de8b8a916ed
Summary: This is unused, no need to add to the build time.
Reviewed By: fanzeyi
Differential Revision: D22967814
fbshipit-source-id: 91a5ed9f03128947af9cb69bca62ed75b75e7e66
Summary:
We were experiencing hangs during lfs http fetches. We've seen similar
issues before when using Hyper, which Reqwest is based off of. Let's switch to
the new Curl-based http_client instead.
Note, this does get rid of the ability to obey hg config http proxy settings.
That said, they shouldn't be used much, and the http proxy environment variables
are respected by libcurl.
Reviewed By: xavierd
Differential Revision: D22935348
fbshipit-source-id: 1c61c04bbb4043e3bde592251f12bf846ab3afd4
Summary:
A future diff does LFS fetches via http_client. Curl has some default
behavior of adding the "Expect: 100-continue" header which causes the server to
send a 100 status code response after the headers have been received but before
the the payload has been received. Since the http_client model only expects a
single response, this breaks the model and we're unable to read the second
response. Let's disable this behavior by manually setting the header to empty
string, which appears to be the official way to handle this.
Add it early so callers can overwrite it.
Reviewed By: quark-zju
Differential Revision: D22935349
fbshipit-source-id: 3009a5eb72f40584b846510f34f121e0e821a2bc
Summary:
One repack code path would return an error if the pack was already
deleted before the repack started. This is a fine situation, so let's just eat
the error and repack what can be repacked.
Reviewed By: xavierd
Differential Revision: D22873219
fbshipit-source-id: c716a5f0cd6106fd3464702753fb79df0bc7d13f
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.
This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and reponse should have.
Reviewed By: kulshrax
Differential Revision: D22702016
fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
Summary:
During large prefetches, (say a clone), it is possible that 2 different
filenode actually refer to the same file content, which thus share the same LFS
blob. The code would wrongly prefetch this blob twice which would then fail due
to the `obj_set` only containing one instance of this object.
Instead of using a Vec for the objects to prefetch, we can simply use a
`HashSet` which will take care of de-duplicating the objects.
Reviewed By: DurhamG
Differential Revision: D22903606
fbshipit-source-id: 4983555d2b16639051acbbb591ebb752d55acc2d
Summary:
There was a small but easy to miss mistake when prefetch was changed to return
the keys that couldn't be prefetched. For LFS pointers, the code would wrongly
return that the blob was fetched, which is misleading as the LFS blob isn't
actually downloaded. For LFS pointers, we need to translate them to their LFS
blob content hashes.
Reviewed By: DurhamG
Differential Revision: D22903607
fbshipit-source-id: e86592cd986498d9f4a574585eb92da695de2e27
Summary:
An earlier diff, D21772132 (713fbeec24), add an option to default hgcache data store
writes to indexedlog but it only did it for data, not history. Let's also do it
for history.
Reviewed By: quark-zju
Differential Revision: D22870952
fbshipit-source-id: 649361b2d946359b9fbdd038867e1058077bd101
Summary: This makes it a little bit easier to use.
Reviewed By: sfilipco
Differential Revision: D22853717
fbshipit-source-id: aa3c1ed2a9a2d1020a48a4493a644093d8b07e67
Summary:
We're seeing users report lfs fetching hanging for 24+ hours. Stack
traces seem to show it hanging on the lfs fetch. Let's read bytes off the wire
in smaller chunks and add a timeout to each read (default timeout is 10s).
Reviewed By: xavierd
Differential Revision: D22853074
fbshipit-source-id: 3cd9152c472acb1f643ba8c65473268e67d59505
Summary:
Although new changelog revlogs do not use deltas since years ago, early
revisions in our production changelog still use mpatch delta format
because they are stream-cloned.
Teach revlogindex to support them.
Reviewed By: sfilipco
Differential Revision: D22657204
fbshipit-source-id: 7aa3b76a9a6b184294432962d36e6a862c4fe371
Summary:
The default reachable_roots implementation is good enough for segmented
changelog, but not efficient for revlogindex use-case.
Reviewed By: sfilipco
Differential Revision: D22657193
fbshipit-source-id: a81bc255d42d46c50e61fe954f027f1160dacb6c
Summary:
I thought it was just `roots & (::heads)`. It is actually more complex than
that.
Reviewed By: sfilipco
Differential Revision: D22657201
fbshipit-source-id: bd0b49fc4cdd2c516384cf70c1c5f79af4da1342
Summary:
This replaces RustError that might happen during `addcommits`, and allow us to
handle it without having a stacktrace.
Reviewed By: DurhamG
Differential Revision: D22539564
fbshipit-source-id: 356814b9baf0b31528dfc92d62b0dcf352bc1e24
Summary:
Follow up of D22638454.
This makes revlogindex marks its compatible DAG so "all()" fast paths can be used properly.
Reviewed By: sfilipco
Differential Revision: D22638459
fbshipit-source-id: 074e95b9fccbc486b69a947fec5172662e7dd3b7
Summary:
No need to exhaust the entire IdLazySet if there are hints.
This is important to make `small & lazy` fast.
Reviewed By: sfilipco
Differential Revision: D22638462
fbshipit-source-id: 63a71986e6e254769c42eb6250c042ea6aa5808b
Summary:
When multiple DAGs (ex. a local DAG and a commit-cloud DAG) are involved,
certain fast paths become unsound. Namely, the fast paths of the FULL hint
should check DAG compatibility. For example:
localrepodag.all() & remotedag.all()
should not simply return `localrepodag.all()` or `remotedag.all()`.
Fix it by checking DAG pointers.
A StaticSet might be created without using a DAG, add an optimization
to change `all & static` to `static & all`. So StaticSet without DAG
wouldn't require full DAG scans when intersecting with other sets.
Reviewed By: sfilipco
Differential Revision: D22638454
fbshipit-source-id: 72396417e9c1238d5411829da8f16f2c6d4c2f3a
Summary:
Improve `fmt::Debug` so it fits better in the Rust and Python eco-system:
- Support Rust formatter flags. For example `{:#5.3?}`. `5` defines limit of a
large set to show, `3` defines hex commit hash length. `#` specifies the
alternate form.
- Show commit hashes together with integer Ids for IdStaticSet.
- Use HG rev range syntax (`a:b`) to represent ranges for IdStaticSet.
- Limit spans to show for IdStaticSet, similar to StaticSet.
- Show only 8 chars of a long hex commit hash by default.
- Minor renames like `dag` -> `spans`, `difference` -> `diff`.
Python bindings uses `fmt::Debug` as `__repr__` and will be affected.
Reviewed By: sfilipco
Differential Revision: D22638455
fbshipit-source-id: 957784fec9c99c8fc5600b040d964ce5918e1bb4
Summary:
This makes intersection set stop early. It's useful to stop iteration on some
lazy sets. For example, the below `ancestors(tip) & span` or
`descendants(1) & span` sets can take seconds to calculate without this
optimization.
```
In [1]: cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl)))
Out[1]: <and <lazy-id> <dag [...]>>
In [3]: %time len(cl.dag.ancestors([cl.tip()]) & cl.tonodes(bindings.dag.spans.unsaferange(len(cl)-10,len(cl))))
CPU times: user 364 µs, sys: 0 ns, total: 364 µs
Wall time: 362 µs
In [7]: %time len(cl.dag.descendants([repo[1].node()]) & cl.tonodes(bindings.dag.spans.unsaferange(0,100)))
CPU times: user 0 ns, sys: 574 µs, total: 574 µs
Wall time: 583 µs
```
Reviewed By: sfilipco
Differential Revision: D22638458
fbshipit-source-id: b9064ce2ff1aecc2d7d00025928dfcb3c0d78e0c
Summary:
Similar to the segmented changelog version using `ANCESTORS`. This makes
`heads(all())` calculates `heads_ancestors(all())` automatically and gets
the speed-up.
Reviewed By: sfilipco
Differential Revision: D22638464
fbshipit-source-id: 014412f1c226925e50387f18c1282b3cb96d434b
Summary:
Optimize it to not covert revs to `Vec<u32>`, and have a fast path to
initialize `states` with `Unspecified`. This makes it about 2x faster and match
the C revlog `headrevs` performance when calculating `headsancestors(all())`:
```
In [2]: %timeit cl.index.clearcaches(); len(cl.index.headrevs())
10 loops, best of 3: 66.9 ms per loop
In [3]: %timeit len(cl.dageval(lambda: headsancestors(all())))
10 loops, best of 3: 64.9 ms per loop
```
Reviewed By: sfilipco
Differential Revision: D22638461
fbshipit-source-id: 965eb16e3a78ae02a65a8a44559f3a64c16f6884
Summary:
Change `parents` from using the default implementation that returns `StaticSet`
of commit hashes, to a customized implementation that returns `IdStaticSet`.
This avoids unnecessary commit hash lookups, and makes `heads(all())` 30x
faster, matching `headsancestors(all())` (but is still 2x slower than the C
revlog index `headsrevs` implementation).
Reviewed By: sfilipco
Differential Revision: D22638453
fbshipit-source-id: 4fef78080b990046b91fee110c48e36301d83b4f
Summary:
The hint indicates a set `X` is equivalent to `ancestors(X)`.
This allows us to make `heads` use `heads_ancestors` (which is faster in
segmented changelog) automatically without affecting correctness. It also
makes special queries like `ancestors(all())` super cheap because it'll just
return `all()` as-is.
Reviewed By: sfilipco
Differential Revision: D22638463
fbshipit-source-id: 44d9bbcbb0d7e2975a0c8322181c88daa1ba4e37
Summary: This avoids depending on the C index if the Rust DAG is available.
Reviewed By: DurhamG
Differential Revision: D22519587
fbshipit-source-id: a89d91184feaeef6641d2b04353601297bf5d4d5
Summary:
Better Engineering: remove dead code about secret tool.
Secret tool is a FB specific tool (keychain like) and has been used to transfer OAuth token between
different devservers without user's involvement. We have migrated to certs on devservers, so it is not needed anymore.
Also, it is FB specific and doesn't make sense for open source either.
Reviewed By: mitrandir77
Differential Revision: D22827264
fbshipit-source-id: cd89168ad75ca041d2a0f18d63474dd1eaad483d
Summary:
If the LFS server is down, we are going to retry fetching filenode from the
Mercurial server directly, who is expected to not return a pointer.
Previously, this was achieved by adding a hack into `get_missing`, but since
the function is no longer called during prefetch calls, we cannot rely on it
anymore. Instead, we can wrap the regular remote store and translate all the
StoreKey::Content onto their corresponding hgid keys.
Reviewed By: DurhamG
Differential Revision: D22565604
fbshipit-source-id: 2532a1fc3dfd9ba5600957ed5cf905255cb5b3fd
Summary:
The ContentStore code has a strong assumption that all the data fetched ends up
in the hgcache. Unfortunately, this assumption breaks somewhat if an LFS
pointer is in the local store but the blob isn't alonside it. This can happen
for instance when ubundling a bundle that contains a pointer, the pointer will
be written to the local store, but the blob would be fetched in the shared
store.
We can break this assumption a bit in the LFS store code by writing the fetched
blob alongside the pointer, this allows the `get` operation to find both the
pointer and the blob in the same store.
Reviewed By: DurhamG
Differential Revision: D22714708
fbshipit-source-id: 01aedf04d692c787b7cddb0f7a76828ea37dcf29
Summary:
The pointer for a blob might very well be in the local store, so let's search
in it.
Reviewed By: DurhamG
Differential Revision: D22565608
fbshipit-source-id: 925dd5718fc19e11a1ccaa0887bf5c477e85b2e5
Summary: Similarly to the changes made for `get`, the same can be applied to prefetch.
Reviewed By: DurhamG
Differential Revision: D22565609
fbshipit-source-id: 0fbc1a0086fa44593a6aaffb746ed36b3261040c
Summary: It was once used by EdenFS, but is now dead code, no need to keep it around.
Reviewed By: singhsrb
Differential Revision: D22784582
fbshipit-source-id: d01cf5a99a010530166cabb0de55a5ea3c51c9c7
Summary:
When using LFS, it's possible that a pointer may be present in the local
LfsStore, but the blob would only be in the shared one. Such scenario can
happen after an upload, when the blob is moved to the shared store for
instance. In this case, during a `get` call, the local LFS store won't be able
to find the blob and thus would return Ok(None), the shared LFS store woud not
be able to find the pointer itself and would thus return Ok(None) too. If the
server is not aware of the file node itself, the `ContentStore::get` would also
return Ok(None), even though all the information is present locally.
The main reason why this is happening is due to the `get` call operating
primarily on file node based keys, and for content-based stores (like LFS),
this means that the translation layer needs to be present in the same store,
which in some case may not be the case. By allowing stores to return a
`StoreKey` when progress was made in finding the key we can effectively solve
the problem described above, the local store would translate the file node key
onto a content key, and the shared store would read the blob properly.
Reviewed By: DurhamG
Differential Revision: D22565607
fbshipit-source-id: 94dd74a462526778f7a7e232a97b21211f95239f
Summary:
I thought I had removed that code a while back, it turns out I didn't, so let's
do it.
Reviewed By: singhsrb
Differential Revision: D22583556
fbshipit-source-id: b92644195994e0a83bdbcd8019253ea217474486
Summary: This change introduces a bail macro that allows tagging errors using the syntax `bail!(fault=Fault::Request, "my normal {}", bail_args)` or `bail!(Fault::Request, "my normal {}", bail_args)`.
Reviewed By: DurhamG
Differential Revision: D22646428
fbshipit-source-id: a6ec2940001b26db8ddc3a6d3620a1e17406c867
Summary:
About 64 tests depend on the revlog `strip` behavior. `strip` is not used in
production client-repos. I tried to migrate them off `strip` but that seems
too much work for now. Instead let's just implement `strip` in the HgCommits
layer to be compatible to run the tests.
Reviewed By: DurhamG
Differential Revision: D22402195
fbshipit-source-id: f68d005e04690d8765d5268c698b6c96b981eb0a