Summary:
This modifies our object popularity mechanism to account for the size of the
objects being downloaded. Indeed, considering our bottleneck is bandwidth, we
forcing similar routing for 10 downloads of a 10MB object and 10 downloads of a
1GB object doesn't make that much sense.
This diffs updates our counting so that we now record the object size instead
of a count. I'll set up routing so that we disallow consistent routing when a
single object exceeds 250MiB/s of throughput ( = 1/4th of a task).
It's worth noting that this will be equivalent to what we have right now for
our most problematic objects (GraphQL schemas in Fbsource, 35M each), given
that we "unroute" at 150 requests / 20 seconds (`150 * 35 / 20 = 262`).
The key difference here is that this will work for all objects.
This does mean LFS needs to cache and know about content metadata. That's not
actually a big deal. Indeed, over a week, we serve 25K distinct objects
(https://fburl.com/scuba/mononoke_lfs/a2d26s1a), so considering content
metadata is a hundred bytes (and so is the key), we are looking at a few MBs of
cache space here.
As part of this, I've also refactored our config handling to stop duplicating
structures in Configerator and LFS by using the Thrift objects directly (we
still have a few dedicated structs when post-processing is necessary, but we
no longer have anything that deserializes straight from JSON).
Note that one further refinement here would be to consistently route but to
more tasks (i.e. return one of 2 routing tokens for an object that is being downloaded
at 500MiB/s). We'll see if we need that.
Reviewed By: HarveyHunt
Differential Revision: D24361314
fbshipit-source-id: 49e1f86cf49357f60f1eac298a753e0c1fcbdbe5
Summary:
Previously needed to pass in cachelib settings once to MononokeAppBuilder and once to parse_and_init_cachelib.
This change adds a MononokeClapApp and MononokeMatches that preserve the settings, thus preventing the need to pass them in twice (and thus avoiding possible inconsistency)
MononokeMatches uses MaybeOwned to hold the inner ArgMatches, which allows us to hold both the usual reference case from get_matches and an owned case for get_matches_from which is used in test cases.
Reviewed By: krallin
Differential Revision: D24788450
fbshipit-source-id: aad5fff2edda305177dcefa4b3a98ab99bc2d811
Summary:
Show cachelib cache_size default in --help usage so its clear what you'll get if no command line args passed
Because we need to convert from bytes to GiB, the lifetime of the help string isn't long enough for clap's reference recieving default_value, so use OnceCell to be able to pass a static reference.
Reviewed By: krallin
Differential Revision: D24761026
fbshipit-source-id: 81b5e7ceb832d5cb55cc9faef59c5e6432f7c4b0
Summary:
It has a build() method and later in stack it will build a mononoke
specific type rather than the clap::App
Differential Revision: D25216827
fbshipit-source-id: 24a531856405a702e7fecf54d60be1ea3d2aa6e7
Summary:
This diff prepares the Mononoke codebase for composition-based extendability of
`ScubaSampleBuilder`. Specifically, in the near future I will add:
- new methods for verbose scuba logging
- new data field (`ObservabilityContext`) to check if verbose logging should
be enabled or disabled
The higher-level goal here is to be able to enable/disable verbose Scuba
logging (either overall or for certain slices of logs, like for a certain
session id) in real time, without restarting Mononoke. To do so, I plan to
expose the aforementioned verbose logging methods, which will run a check
against the stored `ObservabilityContext` and make a decision of whether the
logging is enabled or not. `ObservabilityContext` will of course hide
implementation details from the renamed `ScubaSampleBuilderExt`, and just provide a yes/no
answer based on the current config and sample fields.
At the moment this should be a completely harmless change.
Reviewed By: krallin
Differential Revision: D25211089
fbshipit-source-id: ea03dda82fadb7fc91a2433e12e220582ede5fb8
Summary: Remove 'static requirement for async methods of Blobstore, propagate this change and fixup low hanging fruits where the code can become 'static free easily.
Reviewed By: ahornby, farnz
Differential Revision: D24839054
fbshipit-source-id: 5d5daa04c23c4c9ae902b669b0a71fe41ee6dee6
Summary:
This updates the external facing API of the filestore to use 0.3 streams.
Internally, there is still a bit of 0.3 streams, but as of this change, it's
all 0.3 outside.
This required a few changes here and there in places where it was simpler to
just update them to use 0.3 futures instead of `compat()`-ing everything.
Reviewed By: ikostia
Differential Revision: D24731298
fbshipit-source-id: 18a1dc58b27d129970a6aa2d0d23994d5c5de6aa
Summary:
Like it says in the title. This required quite a lot of changes at callsites,
as you'd expect.
Reviewed By: StanislavGlebik
Differential Revision: D24731299
fbshipit-source-id: e58447e88dcc3ba1ab3c951f87f7042e2b03eb2c
Summary: Like it says in the title. This updates `store()` and its (many) callsites.
Reviewed By: ahornby
Differential Revision: D24728658
fbshipit-source-id: 5fccf76d25e58eaf069f3f0cf5a31d2c397687ea
Summary: SQLBlob GC (next diff in stack) will need a ConfigStore in SQLBlob. Make one available to blobstore creation
Reviewed By: krallin
Differential Revision: D24460586
fbshipit-source-id: ea2d5149e0c548844f1fd2a0d241ed0647e137ae
Summary: In D24447404, I provided some utility functions that allowed me to avoid constructing and/or passing around a ConfigStore. Remove those functions and fix up the code to run.
Reviewed By: krallin
Differential Revision: D24502692
fbshipit-source-id: 742dbc54fbcf735895d6829745b9317af14dfa0b
Summary: It's designed as a singleton store for normal use - rather than have lots of ways to get different config stores, let's use one global store
Reviewed By: krallin
Differential Revision: D24447404
fbshipit-source-id: 6ecc46351b14794471f654cec98527a11a93d3ef
Summary:
I updated the client to send this earlier this week. Let's start collecting it
server side. The bottom line is this will let us identify when we reject a
client that is done retrying.
Reviewed By: HarveyHunt
Differential Revision: D24361884
fbshipit-source-id: b31e53c7dc989f1698e28b2a0bf14dc3fdbb507a
Summary: Make `StreamBody` accept a `Stream` of `Bytes` instead of a `TryStream` of `Bytes`. This means that applications returning streaming responses will be forced to deal with errors prior to returning the response.
Reviewed By: krallin
Differential Revision: D23780216
fbshipit-source-id: dbad61947ef23bbfc4edf3d286ad0218c1859d87
Summary:
Using the `EndOnErr` combinator introduced in the previous diff, log any errors that occur during a streaming response to stderr.
Note that **the intent of this diff is to implement the most basic possible example of doing something with these errors**, with the goal of allowing us to modify `StreamBody` to only accept infallible `Stream`s.
I'd imagine that in all likelihood we'd want to do something smarter with the errors than just print them, but I figure that can be added in later diffs since it seems like doing something else (like logging the error to Scuba or adding to the RequestContext) might require additional changes that are beyond the scope of this diff.
At the very least, this seems like an improvement from before, where these errors would just get passed straight through to Hyper.
Reviewed By: krallin
Differential Revision: D23780217
fbshipit-source-id: 2f885f9fdc6af3dd28d95be1daa1d82c732453fa
Summary:
The `gotham_ext::response` module was getting a bit large, so this diff moves `ContentMeta`, `ContentStream`, and `CompressedContentStream` into a new submodule, alongside the contents of the old `content_encoding` module. This way, the `response` module remains entirely centered around the `TryIntoResponse` trait (and the various body structs that implement that trait).
Later diffs in this stack will be adding an additional layer between the content streams and the body structs, at which point it probably doesn't make sense to have these right next to each other. Splitting them out now will allow for better code organization going forward.
Reviewed By: krallin
Differential Revision: D23777492
fbshipit-source-id: 86e598dcb37578d3b22217a2a65f1bde84d72215
Summary:
This adds support for compressing responses in the LFS Server, based on what
the client sent in `Accept-Encoding`. The compression changes are fairly
simple. Most of the codes changes are around the fact that when we compress,
we don't send a Content-Length (because we don't know how long the content will
be).
Note that this is largely implemented in StreamBody. This means it can be used
for free by the EdenAPI server as well. The reason it's in there is because we
need to avoid setting the Content-Length when compression is going to be used
(`StreamBody` is what takes charge for doing this). This also exposes a
callback to get access to the stream post-compression, which also needs to be
exposed in `StreamBody`, since that's where compression happens.
Reviewed By: aslpavel
Differential Revision: D23652334
fbshipit-source-id: 8f462d69139991c3e1d37f392d448904206ec0d2
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: StanislavGlebik
Differential Revision: D23568780
fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
Summary: Import middleware directly from `gotham_ext` rather than relying on reexports in the `middleware` module.
Reviewed By: farnz
Differential Revision: D23547320
fbshipit-source-id: e64a8acff55445a646b0a1b3b1e71cf6606c3d02
Summary:
Move `ScubaMiddleware` out of the LFS server and into `gotham_ext`.
This change required splitting up the `ScubaKey` enum to separate generally useful column names (e.g., HTTP columns that would be applicable to any HTTP service) from LFS-specific columns. `ScubaMiddlwareState` has been modified to accept any type that implements `Into<String>` as a key, and the `ScubaKey` enum has been split up into `HttpScubaKey` (in `gotham_ext`) and `LfsScubaKey` (in `lfs_server`).
The middleware now takes a type parameter to specify a "handler" (implementing the new `ScubaHandler` trait) which allows the application to add application-specific Scuba columns in addition to the default columns. The application-specific columns will be added immediately prior to the sample being logged.
Reviewed By: krallin
Differential Revision: D23458748
fbshipit-source-id: 3e99f3e0b5d3475a4f5ac9eaefade2eeff12c2fa
Summary:
Before redacting something it would be good to check that this file is not
accessed by anything. Having log-only mode would help with that.
Reviewed By: ikostia
Differential Revision: D23503666
fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4
Summary:
In the next diff I'm going to add log-only mode to redaction, and it would be
good to have a way of testing it (i.e. testing that it actually logs accesses
to bad keys).
In this diff let's use a config option that allows logging censored scuba
accesses to file, and let's update redaction integration test to use it
Reviewed By: ikostia
Differential Revision: D23537797
fbshipit-source-id: 69af2f05b86bdc0ff6145979f211ddd4f43142d2
Summary: Now that post-request callbacks are available in `gotham_ext`, we can make `StreamBody` use them directly instead of using an LFS-specific wrapper (previously required to access the LFS server's `RequestContext`). This also means that the EdenAPI server will get this behavior for free.
Reviewed By: krallin
Differential Revision: D23402969
fbshipit-source-id: 56ab710473f13e8983b136664af364af6884bd3f
Summary: Now that `LogMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.
Reviewed By: DurhamG
Differential Revision: D23298412
fbshipit-source-id: d5288decba98c3dd4605b9a44e41eba0f47fee37
Summary: Now that `LoadMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.
Reviewed By: DurhamG
Differential Revision: D23298416
fbshipit-source-id: 5d29da492e39beb5621daf0570d9b3e657cbfc04
Summary: This diff removes the post-request callback functionality from the LFS server's `RequestContext` and replaces it with the new `PostRequestMiddleware`. The middleware is directly based on `RequestContext`, so the underlying behavior is essentially the same as before.
Reviewed By: krallin
Differential Revision: D23298413
fbshipit-source-id: 1e58a40f6ce6d526456dbd9ae3a8efc85768bf04
Summary:
If we exceed a rate limit, we probably don't want to just drop 100% of traffic.
This would create a sawtooth pattern where we allow a bunch of traffic, update
our counters, drop a bunch of traffic, update our counters again, allow a bunch
of traffic, etc.
To fix this, let's make limits probabilistic. This lets us say "beyond X GB/s,
drop Y% of traffic", which is closer to a sane rate limit.
It might also make sense to eventually change this to use ratelim. Initially,
we didn't do this because we needed our rate limiting decisions to be local to
a single host (because different hosts served different traffic), but now that
we spread the load for popular blobs across the whole tier, we should be able
to just delegate to ratelim.
For now, however, let's finish this bit of a functionality so we can turn it
on.
The corresponding Configerator change is here: D23472683
Reviewed By: aslpavel
Differential Revision: D23472945
fbshipit-source-id: f7d985fded3cdbbcea3bc8cef405224ff5426a25
Summary:
`PerfCounters` was the only application-specific type exposed as a parameter to the post-request callbacks, and it was only being used in one place. To facilitate making the post-request callback functionality more general, this diff makes the callback in question capture the `CoreContext` in its environment, thereby giving it access to the `PerfCounters` without requiring it to be passed as an argument.
This should not change the behavior since regardless of how the callback obtains a reference, it will still refer to the same underlying `PerfCounters` from the request's `CoreContext`.
Reviewed By: DurhamG
Differential Revision: D23298417
fbshipit-source-id: 898f14e5b35b827e98eaf1731db436261baa43bb
Summary: Now that `TimerMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.
Reviewed By: farnz
Differential Revision: D23298414
fbshipit-source-id: 058cb67c9294b28ec7aec03a45da9588e97facc5
Summary: Previously, the LFS server's `TimerMiddleware` needed to be used in conjunction with `RequestContext`, as its purpose was to simply call a method on the `RequestContext` to record the elapsed time. This diff moves tracking of the elapsed time into `TimerMiddleware` itself (via Gotham's `State`), allowing the middleware to be used on its own.
Reviewed By: farnz
Differential Revision: D23298418
fbshipit-source-id: 8077d40edec0936d95317ac11d86bbcd33a3bf04
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.
Reviewed By: krallin
Differential Revision: D22835610
fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
Summary:
Now that we can configure ACL checking on a per-repo basis, use the
`enforce_acl_check` config option as a killswitch to quickly disable ACL
enforcement, if required.
Further, remove the `acl_check` config flag that was always set to True.
As part of this change I've refactored the integration test a little and
replaced the phrase "ACL check" with "ACL enforcement", as we always check the
ACL inside of the LFS server.
Reviewed By: krallin
Differential Revision: D22764510
fbshipit-source-id: 8e09c743a9cd78d54b1423fd2a5cfc9bf7383d7a
Summary:
Update the LFS server to use the `enforce_lfs_acl_check` to enforce
ACL checks for specific repos and also reject clients with missing idents.
In the next diff, I will use the existing LFS server config's
`enforce_acl_check` flag as a killswitch.
Reviewed By: krallin
Differential Revision: D22762451
fbshipit-source-id: 61d26944127711f3503e04154e8c079ae75dc815
Summary:
ODS counters are helpful to know if the feature is turned on or off without
requiring a traffic spike, so let's log them. Also, let's add timeouts in here,
so we know if things aren't working as expected (I did check in the Mononoke
LFS dataset — 10ms is a very conservative number, that's way beyond the p99 of
batch requests, which include potentially many counter checks).
To make this easier to iterate on, let's also add tests.
Reviewed By: StanislavGlebik
Differential Revision: D22545853
fbshipit-source-id: 02ea4484a4e4ba0dfd4a71030c129eb5c6bb1ec9
Summary:
If a particular is blob is too popular, we can saturate a LFS host through
consistent routing, and possibly OOM the host as well.
Historically, we haven't had enough traffic to LFS to make this a problem, but
we're getting there now.
This diffs adds support for reporting the popularity of a blob through SCS (not
Mononoke SCS — the couting one), and for using this popularity to identify when
we should stop consistently-routing a given blob.
The idea is that if e.g. something was requested 300 times in the last 20
seconds, it'll take a second for all the hosts to have it in cache, so we might
as well distribute this load.
There are plenty of things we could do slightly better here, such as making the
interval configurable, or having something in-between "consistently route to a
single host" and "don't consistently route at all". That said, I don't think
those are necessary right now, so let's start simple and find out.
Reviewed By: HarveyHunt
Differential Revision: D22503748
fbshipit-source-id: 48827bcfb7658ad22c88a8433359e29b0d56ad5a
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.
Reviewed By: dtolnay
Differential Revision: D22403809
fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch the remaining lobstore traits to new-style futures.
This just pushes the `.compat()` out to old-style futures, but it makes the move to non-'static lifetimes easier, as all the compile errors will relate to lifetime issues.
Reviewed By: krallin
Differential Revision: D22183228
fbshipit-source-id: 3fe3977f4469626f55cbf5636d17fff905039827
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.
This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.
Reviewed By: krallin
Differential Revision: D22164315
fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
Summary:
If we don't read the body for a response, then Hyper cannot return the
connection to the pool. So, let's do it automatically upon dropping. This will
typically happen when we send a request to upstream then don't read the
response.
I seem to remember this used to work fine at some point, but looking at the
code I think it's actually broken now and we don't reuse upstream connections
if we skip waiting for upstream in a batch request. So, let's fix it once and
for all with a more robust abstraction.
Reviewed By: HarveyHunt
Differential Revision: D22206742
fbshipit-source-id: 2da1c008556e1d964c1cc337d58f06f8d691a916
Summary:
This was old Tokio 0.1 code that needed channels for spawns, but in 0.2 that
actually is built-in to tokio::spawn, so let's use this.
Reviewed By: HarveyHunt
Differential Revision: D22206738
fbshipit-source-id: 8f89ca4f7afc8dd888fe289e8d597148976cc54c
Summary:
This fixes a bit of a tech debt item in the LFS Server. We've had this
discard_stream functon for a while, which was necessary because if you just
drop the data stream, you get an error on the sending end.
This makes the code more complex than it needs to be, since you need to always
explicitly discard data streams you don't want instead of just dropping them.
This fixes that by letting us support a sender that tolerates the receiver
being closed, and just ignores those errors.
Reviewed By: HarveyHunt
Differential Revision: D22206739
fbshipit-source-id: d209679b20a3724bcd2e082ebd0d2ce10e9ac481
Summary:
We have a lot of integration tests for LFS, but a handful of unit tests don't
hurt for some simpler changes. Let's make it easier to write those.
Reviewed By: HarveyHunt
Differential Revision: D22206741
fbshipit-source-id: abcb73b35c01f28dd54cc543cd0a746327d3787b
Summary:
This diff is probably going to sound weird ... but xavierd and I both think
this is the best approach for where we are right now. Here is why this is
necessary.
Consider the following scenario
- A client creates a LFS object. They upload it to Mononoke LFS, but not
upstream.
- The client shares this (e.g. with Sandcastle), and includes a LFS pointer.
- The client tries to push this commit
When this happens, the client might not actually have the object locally.
Indeed, the only pieces of data the client is guaranteed to have is
locally-authored data.
Even if the client does have the blob, that's going to be in the hgcache, and
uploading from the hgcache is a bit sketchy (because, well, it's a cache, so
it's not like it's normally guaranteed to just hold data there for us to push
it to the server).
The problem boils down to a mismatch of assumptions between client and server:
- The client assumes that if the data wasn't locally authored, then the server
must have it, and will never request this piece of data again.
- The server assumes that if the client offers a blob for upload, it can
request this blob from the client (and the client will send it).
Those assumptions are obviously not compatible, since we can serve
not-locally-authored data from LFS and yet want the client to upload it, either
because it is missing in upstream or locally.
This leaves us with a few options:
- Upload from the hg cache. As noted above, this isn't desirable, because the
data might not be there to begin with! Populating the cache on demand (from
the server) just to push data back to the server would be quite messy.
- Skip the upload entirely, either by having the server not request the upload
if the data is missing, by having the server report that the upload is
optional, or by having the client not offer LFS blobs it doens't have to the
server, or finally by having the client simply disobey the server if it
doesn't have the data the server is asking for.
So, why can we not just skip the upload? The answer is: for the same reason we
upload to upstream to begin with. Consider the following scenario:
- Misconfigured client produces a commit, and upload it to upstream.
- Misconfigured client shares the commit with Sandcastle, and includes a LFS
pointer.
- Sandcastle wants to push to master, so it goes to check if the blob is
present in LFS. It isn't (Mononoke LFS checks both upstream and internal, and
only finds the blob in upstream, so it requests that the client submit the
blob), but it's also not not locally authored, so we skip the push.
- The client tries to push to Mononoke
This push will fail, because it'll reference LFS data that is not present in
Mononoke (it's only in upstream).
As for how we fix this: the key guarantee made by our proxying mechanism is
that if you write to either LFS server, your data is readable in both (the way
we do this is that if you write to Mononoke LFS, we write it to upstream too,
and if you write to upstream, we can read it from Mononoke LFS too).
What does not matter there is where the data came from. So, when the client
uploads, we simply let it submit a zero-length blob, and if so, we take that to
mean that the client doesn't think it authored data (and thinks we have it), so
we try to figure out where the blob is on the server side.
Reviewed By: xavierd
Differential Revision: D22192005
fbshipit-source-id: bf67e33e2b7114dfa26d356f373b407f2d00dc70
Summary: DangerousOverride is moved into a separate crate. Not only it is usually not needed but it was introducing dependencies on mercurial crate.
Reviewed By: StanislavGlebik
Differential Revision: D22115015
fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
Summary:
Remove unused dependencies for Rust targets.
This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.
Manual edits (named_deps) and misc output in P133451794
Reviewed By: dtolnay
Differential Revision: D22083498
fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
Summary:
Move the LFS server's `StreamBody` into `gotham_ext`, along with some changes to decouple it from the LFS server.
In particular, the `Content-Length` header and post-request `Sender` have been made optional fields that can be set via a builder-style interface. The LFS server's `StreamBody` has been renamed to `LfsStreamBody` and is now a thin wrapper around `gotham_ext`'s `StreamBody` that preserves the old behavior.
Reviewed By: krallin
Differential Revision: D21988855
fbshipit-source-id: a9bf9c04bb791388d761fc705ebc38472a713b65
Summary:
There are people that are hurt by usage of these terms, this should be more
then enough reason to replace these. Newly chosen terms are more
self-explanatory as well.
This doesn't yet touch the actualy config files, as that requires a bit more
effort than 1 diff and will require more coordination.
Reviewed By: krallin
Differential Revision: D21924440
fbshipit-source-id: e24fc638dc8c9d6d20b6f3fa5f0d0bbc91bbf77b