Summary:
Revsets should use ChangesetId instead of NodeHash. This diff cleans up the
ancestors revset.
Reviewed By: farnz
Differential Revision: D10032436
fbshipit-source-id: 2c7d170738826154e3b606e9e29a739a34b1840e
Summary:
Fetching many changeset entries from a non-master region can be slow. Bumping the
buffer size helped a lot. It shouldn't have a negative impact on memory usage
because changeset objects are fairly small.
Reviewed By: farnz
Differential Revision: D10026544
fbshipit-source-id: fa015660a0f7839e73b5aff7d143e85f274dd1b0
Summary:
This diff adds a real implementation for CachingChangesetFetcher. Now it
fetches the data for the cache from the blobstore.
The rest is explained in the comments.
Reviewed By: farnz
Differential Revision: D9908320
fbshipit-source-id: 5427f3ed312cb7753434161423cb27b48744347f
Summary:
Initial implementation of a ChangesetsFetcher that will use the cache in a
smarter way.
At the moment it doesn't do anything special, but in the next diffs it will
pre-warm the cache when it sees a lot of cache misses (that's why it has to have
a reference to the cachelib CachePool).
Reviewed By: farnz
Differential Revision: D9908319
fbshipit-source-id: 6377a947696bae6b060de5a441722c28309b341c
Summary:
This diff enables access to file content via contains_string() and len() (same as in file hooks) from inside changeset hooks.
This is necessary as some changeset hooks need access to file content and length, e.g. to compute total changeset size.
Reviewed By: StanislavGlebik
Differential Revision: D9788596
fbshipit-source-id: da7bafe6f6fa17a1f25b42550d0bb1a5d871579e
Summary:
High-level goal: we want to make certain big getbundle requests faster. To do
that we'd store blobs of commits that are close to each other in the blobstore
and fetch them only if we had too many cache misses. All this logic will be
hidden in ChangesetFetcher trait implementation. ChangesetFetcher will be
created per request (hence the factory).
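The shape described above can be sketched roughly like this (all names here are hypothetical, not Mononoke's real API): a fetcher trait plus a factory that builds a fresh fetcher per request, so any per-request cache state stays private to that request.

```rust
// Hypothetical sketch of a per-request fetcher behind a factory.
trait ChangesetFetcher {
    fn fetch(&self, changeset_id: &str) -> Option<String>;
}

struct SimpleFetcher;

impl ChangesetFetcher for SimpleFetcher {
    fn fetch(&self, changeset_id: &str) -> Option<String> {
        // Stand-in for a blobstore/database lookup.
        Some(format!("entry-for-{}", changeset_id))
    }
}

struct ChangesetFetcherFactory;

impl ChangesetFetcherFactory {
    // One fetcher per request (hence the factory).
    fn fetcher_for_request(&self) -> Box<dyn ChangesetFetcher> {
        Box::new(SimpleFetcher)
    }
}
```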
Reviewed By: farnz
Differential Revision: D9869659
fbshipit-source-id: 9e3ace3188b3c13f83ef1bd61b668d4f22103f74
Summary:
Support the POST request to mononoke_api/objects/batch sent by the hg client,
according to the git-lfs protocol:
https://github.com/git-lfs/git-lfs/tree/master/docs/api
https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
To get URLs for uploading/downloading files, the hg client sends a POST request
to mononoke_api/objects/batch in the format described there. This diff
implements support for that request. As the answer it returns JSON in the
format required by the git-lfs protocol (see the links for more info).
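For reference, a batch response in the shape the git-lfs batch API specifies looks roughly like this (the oid, size, and href values here are made up):

```json
{
  "transfer": "basic",
  "objects": [
    {
      "oid": "31d7a2b8f9c0",
      "size": 12345,
      "actions": {
        "download": {
          "href": "https://lfs.example.com/objects/31d7a2b8f9c0",
          "expires_at": "2018-10-01T00:00:00Z"
        }
      }
    }
  ]
}
```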
Reviewed By: StanislavGlebik
Differential Revision: D9966691
fbshipit-source-id: 53bcbb4b455e61d9d344bfd9b5b6fb00bc201084
Summary:
This is mainly generated by
```
fbgs mononoke_test_2 -l -s | xargs sed -i 's/mononoke_test_2/mononoke_production/g'
```
The mononoke config repo is still a pending TODO, but it's OK to do this in
several runs, as right now the two names point to the same shard.
Reviewed By: StanislavGlebik
Differential Revision: D9939694
fbshipit-source-id: ded772037844a220b18d99b207c976b88dafdaa5
Summary: Sometimes we may not want it.
Reviewed By: StanislavGlebik
Differential Revision: D9840364
fbshipit-source-id: b71a4f1275733aead94b17d21a9bbf4ddc3a8ff2
Summary:
WIP: Mononoke API download for LFS.
Supports GET requests of the form:
curl http://127.0.0.1:8000/{repo_name}/lfs/download/{sha256}
Reviewed By: StanislavGlebik
Differential Revision: D9850413
fbshipit-source-id: 4d756679716893b2b9c8ee877433cd443df52285
Summary:
Streaming clones are a neat hack; we get to send files down to the
Mercurial client, which it then writes out as-is. Usefully, we can send no
files, and the protocol still works.
Set up the capabilities etc needed so that we send streaming clones to
Mercurial clients, even if they're rather useless so far.
Reviewed By: StanislavGlebik
Differential Revision: D9926967
fbshipit-source-id: b543e802adac38c8bc318081eec364aed87b0231
Summary:
It was broken because of D9849883.
The problem was the following: repo-pull was created using
`hginit_treemanifest`, which creates a treemanifest server repo. This repo was
backfilling flat manifests, and this backfilling sends a getbundle request with
empty common heads.
The problem was fixed by using `hgclone_treemanifest`.
Reviewed By: farnz
Differential Revision: D9940386
fbshipit-source-id: 837be6fd27c8e5ee81634d223aa1a88101926961
Summary:
We've upgraded crates.io, and somehow broke Actix so that HTTP/2 is no
longer supported. For now, update the test to run with HTTP/1.1
Reviewed By: StanislavGlebik
Differential Revision: D9934777
fbshipit-source-id: c41aa5ad376dc7b07700f1d1d1b30ff9ff694f68
Summary:
Let's check that new case conflicts are not added by a commit.
This diff also fixes check_case_conflict_in_manifest: it needs to take into
account that if one of the conflicting files was removed, then there is no case
conflict.
There should be a way to disable this check because we sometimes need to allow
broken commits, for example during blobimport.
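A minimal sketch of the fixed behavior (hypothetical helper, not the real check_case_conflict_in_manifest): paths removed by the commit are skipped, so a conflict that involves a removed file is not flagged.

```rust
use std::collections::HashMap;

// entries are (path, is_removed_by_this_commit) pairs.
fn has_new_case_conflict(entries: &[(&str, bool)]) -> bool {
    let mut seen: HashMap<String, String> = HashMap::new();
    for &(path, removed) in entries {
        if removed {
            // A removed file cannot participate in a case conflict.
            continue;
        }
        if let Some(prev) = seen.insert(path.to_lowercase(), path.to_string()) {
            if prev != path {
                return true; // same path ignoring case, different spelling
            }
        }
    }
    false
}
```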
Reviewed By: aslpavel
Differential Revision: D9789809
fbshipit-source-id: ca09ee2d3e5340876a8dbf57d13e5135344d1d36
Summary:
In the next diff I'm going to add separate functions to look in the cache and
to insert into the cache, so rename this one to avoid confusion.
Reviewed By: Imxset21
Differential Revision: D9869648
fbshipit-source-id: f2bd806b14d78660518d841d90a903970028eb37
Summary: Should reduce memory usage and make getbundle a bit faster
Reviewed By: farnz
Differential Revision: D9861578
fbshipit-source-id: 57bca3700e3a38aeb70f267e6dc90d6b8a9d2955
Summary: Use `ChangesetId` in `DifferenceOfUnionsOfAncestorsNodeStream` instead of `HgNodeHash`. This avoids several bonsai lookups of parent nodes.
Reviewed By: StanislavGlebik
Differential Revision: D9631341
fbshipit-source-id: 1d1be7857bf4e84f9bf5ded70c28ede9fd3a2663
Summary:
We have getbundle requests that don't have common heads specified at all. In
that case it's cheaper to just run `hg clone`. Let's fail requests like that
earlier.
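The early rejection could look something like this (error type simplified, names hypothetical):

```rust
// Reject a getbundle request with no common heads up front: serving it is
// more expensive than a fresh `hg clone`.
fn check_getbundle_common(common_heads: &[&str]) -> Result<(), String> {
    if common_heads.is_empty() {
        return Err("getbundle request has no common heads; `hg clone` is cheaper".to_string());
    }
    Ok(())
}
```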
Reviewed By: aslpavel
Differential Revision: D9849883
fbshipit-source-id: b5f2d2302697f176576867b7077db71aa09676ad
Summary:
Add support for the storerequirements feature of Mercurial repositories, which
requires the reader to additionally check the store/requires file for store
requirements.
Reviewed By: StanislavGlebik
Differential Revision: D9850335
fbshipit-source-id: 557ea0f90f3d138d1df56edd94ee23760b9fd849
Summary:
Add an additional two-step reference for blobs. For each file, add an extra
blob with:
key = aliases.sha256.sha256(raw_file_contents)
value = blob_key
Note that the sha256 hash is taken from the raw file contents, not from the
blob content.
The additional blob is sent together with the file content blob.
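The alias-key layout above can be sketched like this (helper name hypothetical; the caller supplies the sha256 of the raw file contents, and the stored value is the content blob's own key):

```rust
// Build the alias blobstore key from the sha256 of the raw file contents.
// The blob stored under this key holds the key of the content blob itself,
// giving the two-step reference described above.
fn alias_blob_key(sha256_of_raw_contents: &str) -> String {
    format!("aliases.sha256.{}", sha256_of_raw_contents)
}
```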
Reviewed By: lukaspiatkowski, StanislavGlebik
Differential Revision: D9775509
fbshipit-source-id: 4cc997ca5903d0a991fa0310363d6af929f8bbe7
Summary:
We have plans to add a cache of many changeset entries and store it in the
blobstore. The main reason is to speed up Mononoke's revsets and in turn speed
up getbundle wireproto request.
To cache changeset entries we first need to serialize them. Let's use thrift
serialization for that.
Reviewed By: lukaspiatkowski
Differential Revision: D9738637
fbshipit-source-id: ba771545de9a955956acb6d169ee7bc424ef271b
Summary: It will be useful outside of the pushrebase library as well
Reviewed By: farnz
Differential Revision: D9789811
fbshipit-source-id: c851df8a8cce8b1c26daa09b7fe2ffa40f290160
Summary:
There were quite a lot of pushes that use pushvars.
This diff adds parsing for them.
After I added pushvars parsing, the push still failed because it seems to be a
flat manifest push. But having pushvars parsing probably won't hurt.
Reviewed By: farnz
Differential Revision: D9751962
fbshipit-source-id: 49796e91edfad76fb022a2e0fc049a79859de1b7
Summary:
In `fetch_file_contents()`, `blobstore_bytes.into()` converted the bytes to
`Blob<Id>`. This code calls `MononokeId::from_data()`, which does blake2
hashing. It turns out this causes big problems for the many large files that
getfiles can return.
Since this hash is not used at all, let's avoid generating it.
Reviewed By: jsgf
Differential Revision: D9786549
fbshipit-source-id: 65de6f82c1671ed64bdd74b3a2a3b239f27c9f17
Summary: Use failure rather than error-chain for errors.
Reviewed By: StanislavGlebik
Differential Revision: D9780341
fbshipit-source-id: 4d41855093cf812e83b6c348a7499e85d9472daf
Summary:
Split the decoding logic out into its own function. That function can handle
partial results, but the Decoder trait API cannot, so make sure the Decoder
still only returns complete results.
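The split can be sketched like this (framing and types hypothetical): the inner parser may report a partial frame, while the outer decode step only ever hands back complete items, which is what a Decoder-style API requires.

```rust
enum Parsed {
    Complete(Vec<u8>, usize), // payload and number of bytes consumed
    Partial,                  // need more input
}

// Hypothetical framing: one length byte followed by that many payload bytes.
fn parse_frame(buf: &[u8]) -> Parsed {
    match buf.split_first() {
        Some((&len, rest)) if rest.len() >= len as usize => {
            Parsed::Complete(rest[..len as usize].to_vec(), 1 + len as usize)
        }
        _ => Parsed::Partial,
    }
}

// Decoder-style wrapper: returns None on a partial frame, keeping the bytes
// in the buffer until more input arrives.
fn decode(buf: &mut Vec<u8>) -> Option<Vec<u8>> {
    match parse_frame(buf) {
        Parsed::Complete(item, used) => {
            buf.drain(..used);
            Some(item)
        }
        Parsed::Partial => None,
    }
}
```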
Reviewed By: farnz
Differential Revision: D9780342
fbshipit-source-id: b2439cba95b1e42444adbf2ee4b6e3792703a188
Summary:
Profiling showed that since we are inserting objects into the blobstore
sequentially, it takes a lot of time for long stacks of commits. Let's do it in
parallel.
Note that we are still inserting sequentially into the changesets table.
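As a std-only analogue of the change (the real code uses futures, and the blobstore put here is a stand-in), the difference is simply fanning the inserts out instead of awaiting each one in turn:

```rust
use std::thread;

// Upload blobs in parallel rather than one after another; results come back
// in the original order because we join the handles in order.
fn store_blobs_in_parallel(blobs: Vec<Vec<u8>>) -> Vec<usize> {
    let handles: Vec<_> = blobs
        .into_iter()
        .map(|blob| {
            thread::spawn(move || {
                // Stand-in for a blobstore put; returns bytes written.
                blob.len()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```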
Reviewed By: farnz
Differential Revision: D9683037
fbshipit-source-id: 8f9496b97eaf265d9991b94243f0f14133f463da
Summary:
The "path" in manifold blobrepo is used for logging, but it has been quite
confusing to see "fbsource" and "fbsource-pushrebase" logged in an identical
way (both as "fbsource", because of the "path" config). Let's not use the
"path" for logging; instead use the "reponame" from the metaconfig repo.
In case we ever want to have two repos that are named the same (please don't),
or to log under a different name than the "reponame" from config, we can add a
proper optional "name" parameter, but for now we don't need this confusing
feature.
Reviewed By: StanislavGlebik
Differential Revision: D9769514
fbshipit-source-id: 89f2291df90a7396749e127d8985dc12e61f4af4
Summary:
Let's log how long it takes to do pushrebase, how many retry attempts
pushrebase has done, and how long it takes to generate
the response to the client.
Reviewed By: farnz
Differential Revision: D9683036
fbshipit-source-id: 3ad57c2925bdceb3839cae1ff4215c3dd8cd0cc2
Summary:
We had a lot of requests that took > 15 mins on Mononoke, while taking a few
seconds on Mercurial. It turned out that hgcli doesn't play well with big
chunks. It looks like AsyncRead tries to allocate memory very inefficiently,
and that causes huge slowness (T33775046 for more details).
As a short-term fix let's chunk the data on the server. Note that now we have
to make the getfiles response streamable and manually insert the size of the
request.
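The server-side chunking can be sketched as follows (chunk size and framing hypothetical): instead of one huge buffer, emit bounded chunks, each paired with its length so the stream can be framed for the wire.

```rust
// Split a large response into bounded chunks, each tagged with its size.
fn chunk_response(data: &[u8], max_chunk: usize) -> Vec<(usize, Vec<u8>)> {
    data.chunks(max_chunk)
        .map(|c| (c.len(), c.to_vec()))
        .collect()
}
```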
Reviewed By: lukaspiatkowski
Differential Revision: D9738591
fbshipit-source-id: f504cf540bc7d90e2cbebba9808455b6e89c92c6
Summary:
We were using an incorrect buffer size. It's *very* surprising that our servers
weren't continuously crashing. However, see the test plan - it really looks
like `LZ4_compressBound()` is the correct option here.
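For context, lz4.h documents the worst-case compressed size via the LZ4_COMPRESSBOUND macro; a transcription of that formula (omitting the max-input-size guard) shows why the destination buffer must be larger than the input, not merely equal to it:

```rust
// Worst-case LZ4 output size for a given input size, following the
// LZ4_COMPRESSBOUND formula from lz4.h: isize + isize/255 + 16.
fn lz4_compress_bound(input_size: usize) -> usize {
    input_size + input_size / 255 + 16
}
```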
Reviewed By: farnz
Differential Revision: D9738590
fbshipit-source-id: d531f32e79ab900f40d46b7cb6dac01dff8e9cdc
Summary:
See the comment near "DecompressionType::OverreadingZstd" for what it does.
Why does OverreadingZstd work for Mononoke's use case? Because we use it in
bundle2 parsing, which is already chunked by the outer reader. This means that
when we have a stream of bytes:
```
uncompressed -> compressed bundle2 -> uncompressed
```
thanks to chunking we extract the compressed part:
```
do_stuff(uncompressed)
ZstdDecoder(compressed bundle2)
do_stuff(uncompressed)
```
rather than
```
do_stuff(uncompressed)
ZstdDecoder(compressed bundle2 -> uncompressed)
```
So overreading doesn't hurt us here.
Reviewed By: StanislavGlebik
Differential Revision: D9700778
fbshipit-source-id: 70dd6f405ffa00fb981791aff25c60f60831ea6b
Summary:
Use .chain_err() where appropriate to give context to errors coming up from
below. This requires the outer errors to be proper Fail-implementing errors (or
failure::Error), so leave the string wrappers as Context.
Reviewed By: lukaspiatkowski
Differential Revision: D9439058
fbshipit-source-id: 58e08e6b046268332079905cb456ab3e43f5bfcd
Summary: Cleans things up a bit, esp when matching Context/Chain.
Reviewed By: lukaspiatkowski
Differential Revision: D9439062
fbshipit-source-id: cde8727437f58b288bed9dfacb864bdcd7dea45c
Summary:
Use the err_downcast macros instead of manual downcasting. Doesn't make
a huge code-size difference in this case, but a little neater?
Reviewed By: kulshrax, fanzeyi
Differential Revision: D9405014
fbshipit-source-id: 170665f3ec3e78819c5c8a78d458636de253bb6f
Summary:
Add a type to explicitly model a causal chain of errors, akin to
error_chain. This looks a lot like Context, but is intended to show the entire
stack of errors rather than deciding that only the top-level one is
interesting.
This adds a `ChainExt` trait, which adds a `.chain_err(OuterError)` method to
add another step to the causal chain. This is implemented for:
- `F` where `F: Fail`
- `Error`
- `Result<_, F>` where `F: Fail`
- `Result<_, Error>`
- `Future`/`Stream<Error=F>` where `F: Fail`
- `Future`/`Stream<Error=Error>`
- `Chain`
Using it is simple:
```
let res = something_faily().chain_err(LocalError::new("Something amiss"))?;
```
where `something_faily()` returns any of the above types.
(This is done by adding an extra dummy marker type parameter to the `ChainExt`
trait so that it can avoid problems with the coherence rules - thanks for the
idea, kulshrax!)
Reviewed By: lukaspiatkowski
Differential Revision: D9394192
fbshipit-source-id: 0817844d283b3900d2555f526c2683231ca7fe12
Summary:
Add a pair of macros to make downcasting errors less tedious:
```
let res = err_downcast! {
    err, // failure::Error
    foo: FooError => { println!("err is a FooError! {:?}", foo) },
    bar: BarError => { println!("err is a BarError! {:?}", bar) },
};
```
`err_downcast` takes a `failure::Error` and deconstructs it into one of the
desired types, returning `Ok(match action)`, or returns it as `Err(Error)` if
nothing matches.
`err_downcast_ref` takes `&failure::Error` and gives a reference type. It
returns `Some(match action)` or `None` if nothing matches.
The error types are required to implement `failure::Fail`.
`err_downcast_ref` also matches each error type `E` as `Context<E>`.
Reviewed By: lukaspiatkowski
Differential Revision: D9394193
fbshipit-source-id: c56d91362d5bed8ab3e254bc44bb6f8a0eb376a2
Summary:
Pushrebase should send back the newly created commits. This diff adds this
functionality.
Note that it fetches both the pushrebased commit and the current "onto"
bookmark. Normally they should be the same; however, they may be different if
the bookmark suddenly moved before the current pushrebase finished.
Reviewed By: lukaspiatkowski
Differential Revision: D9635433
fbshipit-source-id: 12a076cc95f55b1af49690d236cee567429aef93
Summary: We are going to use it in pushrebase as well
Reviewed By: lukaspiatkowski
Differential Revision: D9635432
fbshipit-source-id: 5cbe0879d002d9b6c21431b0938562357347a67f
Summary:
`asynchronize` does two conceptually separate things:
1. Given a closure that can do blocking I/O or is CPU heavy, create a future
that runs that closure inside a Tokio task.
2. Given a future, run it on a new Tokio task and shuffle the result back to
the caller via a channel.
Split these two things out into their own functions - one to make the future,
one to spawn it and recover the result. For now, this is no net change - but
`spawn_future` is likely to come in useful once we need more parallelism than
we get from I/O alone, and `closure_to_blocking_future` at least signals intent
when we allow a long-running function to take over a Tokio task.
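A std-only analogue of the two pieces (the real code builds Tokio futures, and these names are hypothetical): one piece runs the blocking closure off the caller's thread, and the other shuffles the result back to the caller via a channel.

```rust
use std::sync::mpsc;
use std::thread;

// Run a blocking or CPU-heavy closure on its own thread and hand the result
// back over a channel, mirroring the spawn-then-recover split described above.
fn spawn_blocking<T, F>(f: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error: the caller may have dropped the receiver.
        let _ = tx.send(f());
    });
    rx
}
```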
Reviewed By: jsgf
Differential Revision: D9635812
fbshipit-source-id: e15aeeb305c8499219b89a542962cb7c4b740354
Summary:
`asynchronize` currently does not warn the event loop that it's
running blocking code, so we can end up starving the thread pool of threads.
We can't use `blocking` directly, because it won't spawn a synchronous task
onto a fresh Tokio task, so your "parallel" futures end up running in series.
Instead, use it inside `asynchronize` so that we can pick up extra threads in
the thread pool as and when we need them due to heavy load.
While in here, fix up `asynchronize` to only work on synchronous tasks and
push the boxing out one layer. Filenodes needs a specific change that's
worth extra eyes.
Reviewed By: jsgf
Differential Revision: D9631141
fbshipit-source-id: 06f79c4cb697288d3fadc96448a9173e38df425f
Summary:
We have suspect timings in Mononoke where `asynchronize` is used to
turn a blocking function into a future. Add a test case to ensure that
`asynchronize` itself cannot be causing accidental serialization.
Reviewed By: jsgf
Differential Revision: D9561367
fbshipit-source-id: 14f03e3f003f258450bb897498001050dee0b40d
Summary: In case `max_depth=1` we should only return the topmost entry, which in this case is always the root entry. This fixes it so that we always return fast when `max_depth=1`.
Reviewed By: StanislavGlebik
Differential Revision: D9614259
fbshipit-source-id: a6b82bd5aac74d004f61a07bc24f5d26e5c56412