Summary:
Our post-request processing was still implemented using old futures. Let's get
rid of that and move it all to new futures. This will make it easier to maintain,
and means we can get rid of the tokio-old dependency.
Reviewed By: mitrandir77
Differential Revision: D20343188
fbshipit-source-id: 513e124805e2ca588a1e312d8f5ca697ed6030c8
Summary:
This updates the lfs server and eden api server to use a newer version of
Gotham, which comes along with an updated version of Bytes and Hyper.
A few things had to change for this:
- New bytes don't support concatenation, so we need to fold them ourselves,
except...
- ... new Hyper bodies don't tell you how big they are (either in requests or
responses), so we need to inspect headers to find the size instead (I added
this in `gotham_ext::body_ext::BodyExt`, although it arguably belongs more in
a `hyper_ext` crate, but creating a new crate for just this seems overkill).
- New Hyper requires its data stream to be `Sync` for reasons that have more to
do with developer experience than runtime
(https://github.com/hyperium/hyper/pull/1857). Unfortunately, our Filestore
streams aren't `Sync`, because our `BoxFuture` contains a `dyn Future` that
isn't explicitly `Sync` (which is how we pull things out of blobstores). Even
if `BoxFuture` contained a `Sync` future, that still wouldn't be enough
anyway, because `compat()` explicitly implements `!Sync` on the stream it
returns. I'll ask upstream in Hyper if this can possibly change in the
future, but for now we can work around it by wrapping the stream in a channel
(see the sketch after this list). I'll keep an eye out for performance here.
- When I updated our "pre state data" tweaks on top of Gotham, I renamed those
to "socket data", since that's a better name for what they are (hence the
changes here).
- I updated the lfs_protocol to stop depending on Hyper and instead depend on
http, since that's all we need here.
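To illustrate the channel workaround mentioned above, here is a minimal sketch; the helper name, the buffer size, and the choice of `futures::channel::mpsc` plus `tokio::spawn` are illustrative, not the exact code in this diff:
```
use futures::channel::mpsc;
use futures::stream::{Stream, StreamExt};

// Forward a stream that isn't `Sync` through a channel; the receiving half can
// then be handed to Hyper as the body stream.
fn make_sync_body<S, T>(stream: S) -> mpsc::Receiver<T>
where
    S: Stream<Item = T> + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel(1);
    tokio::spawn(async move {
        // Ignore send errors: they just mean the receiver went away.
        let _ = stream.map(Ok).forward(tx).await;
    });
    rx
}
```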
As you review this, please pay close attention to the updated implementation of
`SignalStream`. Since this is a custom `Stream` in new futures, it requires a
bit of `unsafe { ... }`.
Note that, unfortunately, the diff includes both the Gotham update and the
server updates, since they have to happen together.
Reviewed By: kulshrax, dtolnay
Differential Revision: D20342689
fbshipit-source-id: a490db96ca7c4da8ff761cb80c1e7e3c836bad87
Summary:
This adds the ability to specify a priority in hgcli, and to pass it on to
Mononoke. This will be used to replay commit cloud traffic at a lower priority.
Reviewed By: farnz
Differential Revision: D20038573
fbshipit-source-id: 4055d28ee295e2b15c15945bd3741f6d739ead3a
Summary:
This adds a blobstore that can reach into a CoreContext in order to identify
the allowed level of concurrency for blobstore requests initiated by this
CoreContext. This will let us replay infinitepush bundles with limits on a
per-request basis.
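A rough sketch of the shape of this; the names (`RequestContext`, `ThrottledBlobstore`) and the use of `tokio::sync::Semaphore` are illustrative, not the actual Mononoke API:
```
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Semaphore;

// The request's context carries an optional concurrency limit; the wrapping
// blobstore takes a permit before touching the underlying store.
struct RequestContext {
    blobstore_limit: Option<Arc<Semaphore>>,
}

struct ThrottledBlobstore {
    inner: HashMap<String, Vec<u8>>, // stand-in for the real underlying blobstore
}

impl ThrottledBlobstore {
    async fn get(&self, ctx: &RequestContext, key: &str) -> Option<Vec<u8>> {
        // The permit is held for the duration of the call and released on drop.
        let _permit = match &ctx.blobstore_limit {
            Some(sem) => Some(sem.acquire().await.expect("semaphore closed")),
            None => None,
        };
        self.inner.get(key).cloned()
    }
}
```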
Reviewed By: farnz
Differential Revision: D20038575
fbshipit-source-id: 07299701879b7ae65ad9b7ff6e991ceddf062b24
Summary:
Time filters for file history require fetching the changeset (or changeset info) to decide whether to include the commit in the response. To speed that up, instead of mapping sequentially I buffer the map stream in batches of 100.
However, it is unfortunate to fetch an extra 100 history changesets when there are no time filters and only a few commits were requested, especially given that most requests don't care about time.
To avoid that and speed up history generation,
I changed the changeset_path API so it now just returns a stream of changeset contexts. The commit_path history API collects everything into a vector, applying the time filters first if there are any. That vector is then converted into the response using FuturesOrdered.
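A minimal sketch of that last step; the names and types here are illustrative, while the real code operates on changeset contexts and path history entries:
```
use futures::stream::{FuturesOrdered, StreamExt};

// Convert an already-filtered vector of entries into an ordered response,
// running the per-entry work concurrently while preserving order.
async fn render_history(entries: Vec<u32>) -> Vec<String> {
    let mut work = FuturesOrdered::new();
    for entry in entries {
        work.push_back(async move { format!("commit #{}", entry) });
    }
    work.collect::<Vec<_>>().await
}
```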
Reviewed By: StanislavGlebik
Differential Revision: D20287497
fbshipit-source-id: 0c6b1315eccddb48f67bf5fa732bdf7c9a54a489
Summary:
A lot of callsites want to know the repo name. Currently they need to pass it from
the place where the repo was initialized, which is quite awkward, and in some
places even impossible (e.g. in derived data, where I want to log the reponame).
This diff adds reponame to BlobRepo.
Reviewed By: krallin
Differential Revision: D20363065
fbshipit-source-id: 5e2eb611fb9d58f8f78638574fdcb32234e5ca0d
Summary:
We track this in Scuba right now (and alarm on it), but tracking it in ODS will
make it easier to incorporate in our canary and post-release health check
workflow.
Reviewed By: StanislavGlebik
Differential Revision: D20361803
fbshipit-source-id: 99fb514d41f9cda42c3c9a82f3b8d6681285430a
Summary:
Before, we assumed that if the rows_affected length doesn't match the number of
entries we were trying to insert, we have a conflict. Let's verify whether we really
have a conflict or we're just trying to insert the same entry twice.
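Conceptually the check is the following (a minimal sketch with hypothetical types; the real code compares against the rows already stored in the table):
```
#[derive(PartialEq)]
struct Entry {
    key: String,
    value: String,
}

// A short insert count is only a real conflict if an existing row for the same
// key holds a different value; re-inserting an identical entry is harmless.
fn is_real_conflict(existing: Option<&Entry>, attempted: &Entry) -> bool {
    match existing {
        Some(row) => row != attempted,
        None => false,
    }
}
```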
Reviewed By: krallin
Differential Revision: D20343219
fbshipit-source-id: 19e032439fdd65f5fe1afe1a10b401bc2fe33462
Summary: Running blobimport twice on the same commit seems to cause problems.
Reviewed By: krallin
Differential Revision: D20343218
fbshipit-source-id: 4d572630e7c15c219bee8db15cc879b2cb8602fe
Summary: Add the ability to walk all published bookmarks, as there may be multiple important bookmarks.
Reviewed By: krallin
Differential Revision: D20249806
fbshipit-source-id: aff2ee1ec7d51a9e4fb6e1e803612abd207fd6cb
Summary:
The goal of the whole stack is quite simple (add a reponame field to BlobRepo), but
this stack also tries to make it easier to initialize BlobRepo.
To do that, BlobrepoBuilder was added. It accepts a RepoConfig instead of 6
different fields from RepoConfig - that makes it easier to pass a field from
the config into BlobRepo. It also allows customizing BlobRepo. Currently it's used
just to add a redaction override, but later we can extend it for other use cases
as well, with the hope that we'll be able to remove a bunch of repo-creation
functions from cmdlib.
Because of BlobrepoBuilder we no longer need the open_blobrepo function. Later we
might consider removing open_blobrepo_given_datasources as well.
Note that this diff *adds* a few new clones. I don't consider that a big
problem, though I'm curious to hear your thoughts, folks.
Note that another option for the implementation would be to take references to objects
instead of taking them by value. I briefly looked into how they are used, and a lot of them are passed to
objects that actually take ownership of what's inside these config fields. E.g. Blobstore essentially takes ownership
of BlobstoreOptions, because it needs to store the manifold bucket name.
The same goes for scuba_censored_table, filestore_params, bookmarks_cache_ttl etc. So unless I'm missing anything, we can
either pass them by reference and then copy them anyway, or we can
just pass a value from BlobrepoBuilder directly.
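For illustration, the builder shape is roughly this; all names and fields here are stand-ins, not the real Mononoke types:
```
// Illustrative only: the builder takes the whole config plus optional
// overrides, instead of callers threading six separate fields through.
struct RepoConfig {
    name: String,
    redaction: bool,
    // ... many more fields in the real config
}

struct BlobRepoSketch {
    name: String,
    redaction: bool,
}

struct BlobrepoBuilderSketch {
    config: RepoConfig,
    redaction_override: Option<bool>,
}

impl BlobrepoBuilderSketch {
    fn new(config: RepoConfig) -> Self {
        Self { config, redaction_override: None }
    }

    fn with_redaction_override(mut self, redaction: bool) -> Self {
        self.redaction_override = Some(redaction);
        self
    }

    fn build(self) -> BlobRepoSketch {
        BlobRepoSketch {
            name: self.config.name,
            redaction: self.redaction_override.unwrap_or(self.config.redaction),
        }
    }
}
```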
Reviewed By: krallin
Differential Revision: D20312567
fbshipit-source-id: 14634f5e14f103b110482557254f084da1c725e1
Summary:
We've recently added a new scuba table for derived data
(https://fburl.com/scuba/mononoke_derived_data/e4sekisf), and it looks like our
previous logging format is not very useful. It's better to have separate
fields for changeset id and derived data type, since that makes aggregation
easier.
Reviewed By: krallin
Differential Revision: D20309093
fbshipit-source-id: 48f5f04e0412002ef04028e34b12bf267a9b6834
Summary:
The mock crates contain a standard set of mock commit ids with all nybbles
set to a single hex value.
For the mutation tests I want to be able to generate them from a number and
have more than 15 changeset IDs. Add a new function that generates an hg
changeset ID from a number.
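Something along these lines (a sketch; the real helper presumably returns an HgChangesetId rather than a string):
```
// Build a deterministic 40-nybble hg changeset id from a number by
// zero-padding its hex representation.
fn hg_id_from_number(n: u64) -> String {
    format!("{:040x}", n)
}

// hg_id_from_number(1)  -> "0000000000000000000000000000000000000001"
// hg_id_from_number(26) -> "000000000000000000000000000000000000001a"
```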
Reviewed By: krallin
Differential Revision: D20287382
fbshipit-source-id: 1f57de89f19e2e2eea8dbfea969a4d54510e23d8
Summary:
Note that compared to many other asyncifying efforts, this one actually adds
one more clone instead of removing them. This is the clone of a logger field.
That shouldn't matter much because it can be cleaned up later and because this
function will be called once per repo.
Reviewed By: krallin
Differential Revision: D20311122
fbshipit-source-id: ace2a108790b1423f8525d08bdea9dc3a2e3c37c
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.
This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.
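For illustration, the before/after difference looks roughly like this; the method and error names are hypothetical, and the real error types are generated per Thrift method:
```
use thiserror::Error;

// Before: every method returned anyhow::Error and callers downcast to a shared
// ErrorKind enum. After: each method gets its own error type, so callers match
// on it directly.
#[derive(Debug, Error)]
enum ListBookmarksError {
    #[error("repo not found: {0}")]
    RepoNotFound(String),
    #[error("request failed: {0}")]
    Request(String),
}

fn list_bookmarks(repo: &str) -> Result<Vec<String>, ListBookmarksError> {
    if repo.is_empty() {
        return Err(ListBookmarksError::RepoNotFound(repo.to_string()));
    }
    Ok(vec!["master".to_string()])
}
```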
(Note: this ignores all push blocking failures!)
Reviewed By: dtolnay
Differential Revision: D20260398
fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
Summary:
Previously "Started ..." could be logged before "Starting ...".
This diff fixes that.
Reviewed By: krallin
Differential Revision: D20277406
fbshipit-source-id: 3c2f3fa1723c2e0852c6b114592ab7ad90be17ff
Summary:
This updates the store_bytes method to chunk incoming data instead of uploading
it as-is. This is unfortunately a bit hacky (but so was the previous
implementation), since it means we have to hash the data before it has gone
through the Filestore's preparation.
That said, one of the invariants of the filestore is that chunk size shouldn't
affect the Content ID (and there is fairly extensive test coverage for this),
so, notionally, this does work.
Performance-wise, it does mean we are hashing the object twice. That actually
was the case before as well (since obtaining the ContentId for FileContents
would clone them and then hash them).
The upshot of this change is that large files uploaded through unbundle will
actually be chunked (whereas before, they wouldn't be).
Long-term, we should try and delete this method, as it is quite unsavory to
begin with. But, for now, we don't really have a choice since our content
upload path does rely on its existence.
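The chunking itself is conceptually simple (a sketch; the real chunk size, hashing, and Filestore preparation are not shown here):
```
use bytes::Bytes;

// Split a buffer into fixed-size chunks. The Filestore invariant is that the
// chunk size must not affect the resulting Content ID.
fn chunk_bytes(data: Bytes, chunk_size: usize) -> Vec<Bytes> {
    assert!(chunk_size > 0);
    let mut chunks = Vec::new();
    let mut rest = data;
    while rest.len() > chunk_size {
        chunks.push(rest.split_to(chunk_size));
    }
    chunks.push(rest);
    chunks
}
```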
Reviewed By: StanislavGlebik
Differential Revision: D20281937
fbshipit-source-id: 78d584b2f9eea6996dd1d4acbbadc10c9049a408
Summary:
This is a very small struct (2 u64s) that really doesn't need to be passed by
reference. Might as well just pass it by value.
Differential Revision: D20281936
fbshipit-source-id: 2cc64c8ab6e99ee50b2e493eff61ea34d6eb54c1
Summary: separate out the Facebook-specific pieces of the sql_ext crate
Reviewed By: ahornby
Differential Revision: D20218219
fbshipit-source-id: e933c7402b31fcd5c4af78d5e70adafd67e91ecd
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
This codemod replaces all dependencies on `//common/rust/renamed:tokio-preview` with `fbsource//third-party/rust:tokio-preview` and their uses in Rust code from `tokio_preview::` to `tokio::`.
This does not introduce any collisions with `tokio::` meaning 0.1 tokio because D20235404 previously renamed all of those to `tokio_old::` in crates that depend on both 0.1 and 0.2 tokio.
This is the tokio version of what D20213432 did for futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:tokio-preview, 1))" \
| xargs sed -i 's,\btokio_preview::,tokio::,'
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:tokio-preview,fbsource//third-party/rust:tokio-preview,'
```
Reviewed By: k21
Differential Revision: D20236557
fbshipit-source-id: 15068b93a0a944d6249a1d9f63840a4c61c9c1ba
Summary:
This updates microwave to also support changesets, in addition to filenodes.
Those create a non-trivial amount of SQL load when we warm up the cache (due to
sequential reads), which we can eliminate by loading them through microwave.
They're also a bottleneck when manifests are loaded already.
Note: as part of this, I've updated the Microwave wrapper methods to panic if
we try to access a method that isn't instrumented. Since we'd be running
the Microwave builder in the background, this feels OK (because then we'd find
out if we call them during cache warmup unexpectedly).
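Roughly, the wrapper behaviour described above looks like this (illustrative only; the real wrappers sit in front of the filenodes and changesets stores):
```
use std::cell::RefCell;

// Instrumented calls are recorded for the snapshot; anything we haven't
// instrumented fails loudly so we notice if warmup starts relying on it.
struct RecordingFilenodes {
    recorded: RefCell<Vec<(String, String)>>,
}

impl RecordingFilenodes {
    fn get_filenode(&self, path: &str) -> String {
        let value = format!("filenode-for-{}", path); // stand-in for a real lookup
        self.recorded
            .borrow_mut()
            .push((path.to_string(), value.clone()));
        value
    }

    fn get_all_filenodes(&self, _path: &str) -> Vec<String> {
        // Not instrumented on purpose: panic rather than silently miss data.
        unimplemented!("not supported by the microwave builder")
    }
}
```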
Reviewed By: farnz
Differential Revision: D20221463
fbshipit-source-id: 317023677af4180007001fcaccc203681b7c95b7
Summary:
This incorporates microwave into the cache warmup process. See earlier in this
stack for a description of what this does, how it works, and why it's useful.
Reviewed By: ahornby
Differential Revision: D20219904
fbshipit-source-id: 52db74dc83635c5673ffe97cd5ff3e06faba7621
Summary: Improvements aim to minimize the number of db queries.
Differential Revision: D20280711
fbshipit-source-id: 6cc06f1ac4ed8db9978e0eee956550fcd16bbe8a
Summary:
Implementation of derivation logic for the changeset info.
BonsaiDerived is implemented for the ChangesetInfo. `derive_from_parents` just derives an info and BonsaiDerivedMapping then puts it into the blobstore.
```
ChangesetInfo::derive(..) -> ChangesetInfo
```
Reviewed By: krallin
Differential Revision: D20185954
fbshipit-source-id: afe609d1b2711aed7f2740714df6b9417c6fe716
Summary:
Introducing data structures for the derived Bonsai changeset info, which is supposed to store all commit metadata except the file changes.
A Bonsai changeset consists of the commit metadata and the set of all file changes associated with the commit.
Some changesets, usually merge commits, include thousands of file changes. That is not a problem by itself; however, in cases where we need some information about a commit other than its hash, we have to fetch the whole changeset, and that can take up to 15-20 seconds.
Changeset info as a separate data structure is needed to speed up changeset fetching when we need the commit metadata but not the file changes.
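The rough shape of the data (field names here are illustrative, not the exact definition in this diff):
```
// Everything callers usually want to know about a commit, minus the
// potentially huge list of file changes.
struct ChangesetInfoSketch {
    changeset_id: String,
    parents: Vec<String>,
    author: String,
    author_date: i64,
    committer: Option<String>,
    committer_date: Option<i64>,
    message: String,
    extra: Vec<(String, Vec<u8>)>,
}
```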
Reviewed By: markbt
Differential Revision: D20139434
fbshipit-source-id: 4faab267304d987b44d56994af9e36b6efabe02a
Summary:
The new API is required for migrating Commit Cloud off hg servers and the infinitepush database.
This can also fix phases issues with `hg cloud sl`.
Reviewed By: markbt
Differential Revision: D20221913
fbshipit-source-id: 67ddceb273b8c6156c67ce5bc7e71d679e8999b6
Summary:
Fix the tail interval delay; it wasn't triggering.
Took the opportunity to structure the code as a loop as well, which simplified it a bit.
Reviewed By: markbt
Differential Revision: D20247077
fbshipit-source-id: 1786ef1528a4b0493f5e454d28450d7198af8ad4
Summary: The `rust-crypto` crate has not been maintained; replacing it with the `sha-1` crate, since SHA-1 is the only algorithm used in this library.
Reviewed By: dtolnay
Differential Revision: D20236029
fbshipit-source-id: 9c4ff25f393b099ec9570a7badbe4b378fbd98af
Summary:
Previously the warm bookmark cache tried to derive all bookmarks on startup. That slows down startup and in some cases might prevent the scs server from starting up at all.
Let's change how the warm bookmark cache initializes the bookmarks - instead of trying to derive all of them, let's move underived bookmarks back in history.
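The idea, sketched with placeholder types (the real code walks bonsai changesets and checks whether derived data exists):
```
// Walk a bookmark's head back through history until we find an ancestor whose
// derived data already exists, and warm the cache with that instead of
// deriving the tip on startup.
fn find_warm_ancestor(
    mut head: u64,
    parent_of: impl Fn(u64) -> Option<u64>,
    is_derived: impl Fn(u64) -> bool,
) -> Option<u64> {
    loop {
        if is_derived(head) {
            return Some(head);
        }
        head = parent_of(head)?;
    }
}
```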
Reviewed By: krallin
Differential Revision: D20195211
fbshipit-source-id: 5cb5d8599d3035973175d3063186a7c01536889a
Summary:
We didn't use DelayBlob at all; however, we use DelayedBlobstore in the benchmark
lib. DelayedBlobstore seems to have more useful options, so let's remove
DelayBlob and use DelayedBlobstore instead.
Reviewed By: farnz
Differential Revision: D20245865
fbshipit-source-id: bd694a0e178367014adc2776185450693f87475d
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
In targets that depend on both 0.1 and 0.2 tokio, this codemod renames the 0.1 dependency to be exposed as tokio_old::. This is in preparation for flipping the 0.2 dependencies from tokio_preview:: to plain tokio::.
This is the tokio version of what D20168958 did for futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs,
rdeps(%Ss, fbsource//third-party/rust:tokio-old, 1)
intersect
rdeps(%Ss, //common/rust/renamed:tokio-preview, 1)
)" \
| xargs sed -i 's,\btokio::,tokio_old::,'
```
Reviewed By: k21
Differential Revision: D20235404
fbshipit-source-id: cfb2689a584ad0d73f16d98d8587fb9c44661465
Summary: clippy was failing; this diff should hopefully fix it.
Reviewed By: krallin
Differential Revision: D20250585
fbshipit-source-id: 6a9becdb84ec293659433fa9078e456d40210b6c
Summary:
This removes the Extend implementation for FileBytes, which was incorrect (it
discarded existing data!). I had introduced this as a backwards compatibility
shim when doing the Bytes 0.4 to Bytes 0.5 migration :/
We don't really need this shim, considering:
- The only place that really matters that uses this is the remotefilelog crate,
where we have a content id, and where we should use `filestore::fetch_concat`
instead.
- The other places are tests (or close to abandonware...), which can do their
own folding.
Longer term, I'd like to remove the whole `Content` stream in hg entries, so
those callsites can use the filestore methods, which a) have test coverage
(unlike ad-hoc folds, which often don't), and b) are more efficient since
they know how large the destination buffer needs to be ahead of time, and don't
need to re-allocate.
To make sure this fixes the bug, I also introduced tests for the remotefilelog
crate. As expected, the chunked variant fails without this fix.
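For contrast, a correct fold over content chunks looks something like this (a sketch; `filestore::fetch_concat` is what callers should actually use):
```
use bytes::{Bytes, BytesMut};
use futures::stream::{Stream, TryStreamExt};

// Concatenate a stream of chunks into a buffer sized up front, appending each
// chunk rather than replacing (let alone discarding) what is already there.
async fn concat_chunks<S, E>(expected_size: usize, chunks: S) -> Result<Bytes, E>
where
    S: Stream<Item = Result<Bytes, E>>,
{
    chunks
        .try_fold(BytesMut::with_capacity(expected_size), |mut buf, chunk| async move {
            buf.extend_from_slice(&chunk);
            Ok(buf)
        })
        .await
        .map(BytesMut::freeze)
}
```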
Reviewed By: mitrandir77
Differential Revision: D20248978
fbshipit-source-id: 1b554d3e595eb867b6b6cf4204d31f27dd90a111
Summary:
Not sampling errors will make it easier to use Fastreplay as an early alarm
system for errors.
Reviewed By: ahornby
Differential Revision: D20249202
fbshipit-source-id: 92da53d5703b58bcef49cfcdc251f008ae6f25bc
Summary:
The git mappings are normally populated during blobimport of the repo but we
need something for the repos we've already imported.
Reviewed By: markbt
Differential Revision: D20160768
fbshipit-source-id: 9e37c7d0f12682e73ca9990e56e4d827e9861a9f
Summary:
We don't use it, and this tries to write to Manifold from tests, which is
undesirable. Let's remove it.
Reviewed By: farnz
Differential Revision: D20219902
fbshipit-source-id: 2e983bee54cadad257648cc9633695be825a1ef3
Summary:
This introduces a new binary and library (microwave: it makes warmup
faster..!) that can be used to accelerate cache warmup. The idea is that the
microwave binary will run cache warmup and capture things that are loaded
during cache warmup, and commit those to a file.
We can then use that file when starting up a host to get a head start on cache
warmup by injecting all those entries into our local cache before actually
starting cache warmup.
Currently, this only supports filenodes, but that's already a pretty good
improvement. Changesets should be easy to add as well. Blobs might require a
bit more work.
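Very roughly, the snapshot mechanics are as follows (a sketch; the real format, storage location, and cache integration differ, and serde_json is just a stand-in):
```
use std::collections::HashMap;
use std::fs;

// Write out what warmup loaded...
fn save_snapshot(path: &str, entries: &HashMap<String, Vec<u8>>) -> std::io::Result<()> {
    let bytes = serde_json::to_vec(entries)?;
    fs::write(path, bytes)
}

// ...and replay it into the local cache before the next warmup starts.
fn prime_cache(path: &str, cache: &mut HashMap<String, Vec<u8>>) -> std::io::Result<()> {
    let bytes = fs::read(path)?;
    let entries: HashMap<String, Vec<u8>> = serde_json::from_slice(&bytes)?;
    cache.extend(entries);
    Ok(())
}
```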
Reviewed By: StanislavGlebik
Differential Revision: D20219905
fbshipit-source-id: 82bb13ca487f82ca53b4a68a90ac5893895a96e9
Summary:
The walker has been hitting the filenodes-enforced 5 second SQL timeout when
querying filenodes from MySQL.
It's not clear why that is, but looking at previous run history shows that we
occasionally have queries that take > 30 seconds to complete (none of those
show up in MySQL slow queries, though, and there's no particular load on the
hosts around that time, so it's not clear whether this is happening in MySQL or
on our end).
Anyhow, those queries would have worked in the old implementation (after a long
time), but they fail in the new one, since it enforces a 5-second timeout.
We should investigate why this is happening (and Alex has landed diffs to add
more reporting in the walker to that end), but in the meantime, there's no
reason to break the walker.
Reviewed By: farnz
Differential Revision: D20227842
fbshipit-source-id: 5ee5c8225b6474b66c1f48a10b4a2d671ebc79c6
Summary: When it fails, it's better to know which repo failed.
Reviewed By: farnz
Differential Revision: D20245375
fbshipit-source-id: 9794911308dbdd67b20673857ac8b7b54f06a217
Summary: Makes it easier to understand which repo is failing
Reviewed By: krallin
Differential Revision: D20244630
fbshipit-source-id: ca32f7831c5ed4e701103020e9878c459ba6d573
Summary: Make these functions generic so that callers don't need to construct a trait object whenever they want to call them. Passing in a trait object should still work so existing callsites should not be affected.
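For example, the change is roughly this shape (hypothetical trait and functions):
```
trait Store {
    fn put(&self, key: String, value: Vec<u8>);
}

// Before: callers had to hand us a trait object.
fn put_old(store: &dyn Store, key: String, value: Vec<u8>) {
    store.put(key, value);
}

// After: generic over any Store. `&dyn Store` still satisfies the bound
// (note the `?Sized`), so existing call sites keep working unchanged.
fn put_new<S: Store + ?Sized>(store: &S, key: String, value: Vec<u8>) {
    store.put(key, value);
}
```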
Reviewed By: krallin
Differential Revision: D20225830
fbshipit-source-id: df0389b0f19aa44aaa89682198f43cb9f1d84b25
Summary: Add a method to `HgFileContext` to stream the history of the file. Will be used to support EdenAPI history requests.
Reviewed By: krallin
Differential Revision: D20211779
fbshipit-source-id: 49e8c235468d18b23976e64a9205cbcc86a7a1b4