Commit Graph

91 Commits

Author SHA1 Message Date
Stefan Filip
2193b84b43 autocargo: regen
Summary: Regen autocargo

Reviewed By: quark-zju

Differential Revision: D25409963

fbshipit-source-id: 7dbbe420aeb30248bf43d3a96a9aa6a47bb2b0be
2020-12-08 18:30:24 -08:00
Thomas Orozco
d5097a2d8c mononoke/lfs_server: account for size in object popularity
Summary:
This modifies our object popularity mechanism to account for the size of the
objects being downloaded. Indeed, considering our bottleneck is bandwidth, we
forcing similar routing for 10 downloads of a 10MB object and 10 downloads of a
1GB object doesn't make that much sense.

This diffs updates our counting so that we now record the object size instead
of a count. I'll set up routing so that we disallow consistent routing when a
single object exceeds 250MiB/s of throughput ( = 1/4th of a task).

It's worth noting that this will be equivalent to what we have right now for
our most problematic objects (GraphQL schemas in Fbsource, 35M each), given
that we "unroute" at 150 requests / 20 seconds (`150 * 35 / 20 = 262`).

The key difference here is that this will work for all objects.

This does mean LFS needs to cache and know about content metadata. That's not
actually a big deal. Indeed, over a week, we serve 25K distinct objects
(https://fburl.com/scuba/mononoke_lfs/a2d26s1a), so considering content
metadata is a hundred bytes (and so is the key), we are looking at a few MBs of
cache space here.

As part of this, I've also refactored our config handling to stop duplicating
structures in Configerator and LFS by using the Thrift objects directly (we
still have a few dedicated structs when post-processing is necessary, but we
no longer have anything that deserializes straight from JSON).

Note that one further refinement here would be to consistently route but to
more tasks (i.e. return one of 2 routing tokens for an object that is being downloaded
at 500MiB/s). We'll see if we need that.

Reviewed By: HarveyHunt

Differential Revision: D24361314

fbshipit-source-id: 49e1f86cf49357f60f1eac298a753e0c1fcbdbe5
2020-12-03 06:17:06 -08:00
Alex Hornby
935a7ddfc8 mononoke: remove the need to pass in cachelib settings twice
Summary:
Previously needed to pass in cachelib settings once to MononokeAppBuilder and once to parse_and_init_cachelib.

This change adds a MononokeClapApp and MononokeMatches that preserve the settings, thus preventing the need to pass them in twice (and thus avoiding possible inconsistency)

MononokeMatches uses MaybeOwned to hold the inner ArgMatches, which allows us to hold both the usual reference case from get_matches and an owned case for get_matches_from which is used in test cases.

Reviewed By: krallin

Differential Revision: D24788450

fbshipit-source-id: aad5fff2edda305177dcefa4b3a98ab99bc2d811
2020-12-02 07:27:23 -08:00
Alex Hornby
caa684450f mononoke: show cachelib cache_size in --help usage
Summary:
Show cachelib cache_size default in --help usage so its clear what you'll get if no command line args passed

Because we need to convert from bytes to GiB, the lifetime of the help string isn't long enough for clap's reference recieving default_value, so use OnceCell to be able to pass a static reference.

Reviewed By: krallin

Differential Revision: D24761026

fbshipit-source-id: 81b5e7ceb832d5cb55cc9faef59c5e6432f7c4b0
2020-12-02 07:27:23 -08:00
Alex Hornby
a2247dc41c mononoke: rename MononokeApp to MononokeAppBuilder
Summary:
It has a build() method and later in stack it will build a mononoke
specific type rather than the clap::App

Differential Revision: D25216827

fbshipit-source-id: 24a531856405a702e7fecf54d60be1ea3d2aa6e7
2020-12-01 11:45:00 -08:00
Kostia Balytskyi
e4dab84619 scuba: turn ScubaSampleBuilderExt into a wrapper struct
Summary:
This diff prepares the Mononoke codebase for composition-based extendability of
`ScubaSampleBuilder`. Specifically, in the near future I will add:
- new methods for verbose scuba logging
- new data field (`ObservabilityContext`) to check if verbose logging should
  be enabled or disabled

The higher-level goal here is to be able to enable/disable verbose Scuba
logging (either overall or for certain slices of logs, like for a certain
session id) in real time, without restarting Mononoke. To do so, I plan to
expose the aforementioned verbose logging methods, which will run a check
against the stored `ObservabilityContext` and make a decision of whether the
logging is enabled or not. `ObservabilityContext` will of course hide
implementation details from the renamed `ScubaSampleBuilderExt`, and just provide a yes/no
answer based on the current config and sample fields.

At the moment this should be a completely harmless change.

Reviewed By: krallin

Differential Revision: D25211089

fbshipit-source-id: ea03dda82fadb7fc91a2433e12e220582ede5fb8
2020-11-30 21:26:24 -08:00
Lukas Piatkowski
fa1a195fd0 mononoke/blobstore: pass CoreContext via borrowed instead of owned value
Summary: Follow up after removing 'static from blobstore.

Reviewed By: StanislavGlebik

Differential Revision: D25182106

fbshipit-source-id: e13a7a31d71b4674425123268e655ae66127f1b7
2020-11-27 03:31:07 -08:00
Lukas Piatkowski
0f54cc3d63 mononoke/blobstore: make Blobstore generic over lifetime
Summary: Remove 'static requirement for async methods of Blobstore, propagate this change and fixup low hanging fruits where the code can become 'static free easily.

Reviewed By: ahornby, farnz

Differential Revision: D24839054

fbshipit-source-id: 5d5daa04c23c4c9ae902b669b0a71fe41ee6dee6
2020-11-20 05:51:52 -08:00
Thomas Orozco
26e06ef1a0 mononoke/filestore: update fetch methods to return 0.3 stream
Summary:
This updates the external facing API of the filestore to use 0.3 streams.
Internally, there is still a bit of 0.3 streams, but as of this change, it's
all 0.3 outside.

This required a few changes here and there in places where it was simpler to
just update them to use 0.3 futures instead of `compat()`-ing everything.

Reviewed By: ikostia

Differential Revision: D24731298

fbshipit-source-id: 18a1dc58b27d129970a6aa2d0d23994d5c5de6aa
2020-11-06 07:26:04 -08:00
Thomas Orozco
184310158b mononoke/filestore: update fetch external API to 0.3 futures
Summary:
Like it says in the title. This required quite a lot of changes at callsites,
as you'd expect.

Reviewed By: StanislavGlebik

Differential Revision: D24731299

fbshipit-source-id: e58447e88dcc3ba1ab3c951f87f7042e2b03eb2c
2020-11-06 07:26:03 -08:00
Thomas Orozco
b6949dbc26 mononoke/filestore: update store to futures 0.3
Summary: Like it says in the title. This updates `store()` and its (many) callsites.

Reviewed By: ahornby

Differential Revision: D24728658

fbshipit-source-id: 5fccf76d25e58eaf069f3f0cf5a31d2c397687ea
2020-11-06 07:26:03 -08:00
Simon Farnsworth
4e59e26775 Thread ConfigStore into blobstore creation
Summary: SQLBlob GC (next diff in stack) will need a ConfigStore in SQLBlob. Make one available to blobstore creation

Reviewed By: krallin

Differential Revision: D24460586

fbshipit-source-id: ea2d5149e0c548844f1fd2a0d241ed0647e137ae
2020-10-27 04:14:24 -07:00
Simon Farnsworth
7e06175e61 Make config store always explicit
Summary: In D24447404, I provided some utility functions that allowed me to avoid constructing and/or passing around a ConfigStore. Remove those functions and fix up the code to run.

Reviewed By: krallin

Differential Revision: D24502692

fbshipit-source-id: 742dbc54fbcf735895d6829745b9317af14dfa0b
2020-10-24 06:23:49 -07:00
Simon Farnsworth
00871310a7 Ensure we have only one ConfigStore for the application
Summary: It's designed as a singleton store for normal use - rather than have lots of ways to get different config stores, let's use one global store

Reviewed By: krallin

Differential Revision: D24447404

fbshipit-source-id: 6ecc46351b14794471f654cec98527a11a93d3ef
2020-10-24 06:23:49 -07:00
Thomas Orozco
fde21ba026 mononoke/lfs_server: log client attempt
Summary:
I updated the client to send this earlier this week. Let's start collecting it
server side. The bottom line is this will let us identify when we reject a
client that is done retrying.

Reviewed By: HarveyHunt

Differential Revision: D24361884

fbshipit-source-id: b31e53c7dc989f1698e28b2a0bf14dc3fdbb507a
2020-10-19 05:01:57 -07:00
Arun Kulshreshtha
3078f346f8 gotham_ext: make StreamBody take an infallible byte stream
Summary: Make `StreamBody` accept a `Stream` of `Bytes` instead of a `TryStream` of `Bytes`. This means that applications returning streaming responses will be forced to deal with errors prior to returning the response.

Reviewed By: krallin

Differential Revision: D23780216

fbshipit-source-id: dbad61947ef23bbfc4edf3d286ad0218c1859d87
2020-09-18 22:49:55 -07:00
Arun Kulshreshtha
980406cea8 lfs_server: log errors from streaming responses
Summary:
Using the `EndOnErr` combinator introduced in the previous diff, log any errors that occur during a streaming response to stderr.

Note that **the intent of this diff is to implement the most basic possible example of doing something with these errors**, with the goal of allowing us to modify `StreamBody` to only accept infallible `Stream`s.

I'd imagine that in all likelihood we'd want to do something smarter with the errors than just print them, but I figure that can be added in later diffs since it seems like doing something else (like logging the error to Scuba or adding to the RequestContext) might require additional changes that are beyond the scope of this diff.

At the very least, this seems like an improvement from before, where these errors would just get passed straight through to Hyper.

Reviewed By: krallin

Differential Revision: D23780217

fbshipit-source-id: 2f885f9fdc6af3dd28d95be1daa1d82c732453fa
2020-09-18 22:49:55 -07:00
Arun Kulshreshtha
6e5f95067a gotham_ext: move content streams into separate module
Summary:
The `gotham_ext::response` module was getting a bit large, so this diff moves `ContentMeta`, `ContentStream`, and `CompressedContentStream` into a new submodule, alongside the contents of the old `content_encoding` module. This way, the `response` module remains entirely centered around the `TryIntoResponse` trait (and the various body structs that implement that trait).

Later diffs in this stack will be adding an additional layer between the content streams and the body structs, at which point it probably doesn't make sense to have these right next to each other. Splitting them out now will allow for better code organization going forward.

Reviewed By: krallin

Differential Revision: D23777492

fbshipit-source-id: 86e598dcb37578d3b22217a2a65f1bde84d72215
2020-09-18 01:34:04 -07:00
Thomas Orozco
acc0d428cc mononoke/lfs_server: add support for response compression
Summary:
This adds support for compressing responses in the LFS Server, based on what
the client sent in `Accept-Encoding`. The compression changes are fairly
simple. Most of the codes changes are around the fact that when we compress,
we don't send a Content-Length (because we don't know how long the content will
be).

Note that this is largely implemented in StreamBody. This means it can be used
for free by the EdenAPI server as well. The reason it's in there is because we
need to avoid setting the Content-Length when compression is going to be used
(`StreamBody` is what takes charge for doing this). This also exposes a
callback to get access to the stream post-compression, which also needs to be
exposed in `StreamBody`, since that's where compression happens.

Reviewed By: aslpavel

Differential Revision: D23652334

fbshipit-source-id: 8f462d69139991c3e1d37f392d448904206ec0d2
2020-09-15 07:59:53 -07:00
Arun Kulshreshtha
a43444d481 lfs_server: remove mutable borrow from ODS middleware
Summary: `RequestContext` doesn't need to be mutable anymore.

Reviewed By: krallin

Differential Revision: D23631630

fbshipit-source-id: 0080aa3d5c25bafd784251adc34e98f9d04908de
2020-09-10 20:57:12 -07:00
David Tolnay
be0786f14b Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: StanislavGlebik

Differential Revision: D23568780

fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
2020-09-08 07:33:16 -07:00
Arun Kulshreshtha
a7a96e55eb lfs_server: tidy up middleware imports
Summary: Import middleware directly from `gotham_ext` rather than relying on reexports in the `middleware` module.

Reviewed By: farnz

Differential Revision: D23547320

fbshipit-source-id: e64a8acff55445a646b0a1b3b1e71cf6606c3d02
2020-09-07 17:24:45 -07:00
Arun Kulshreshtha
83c54b48f8 gotham_ext: move ScubaMiddleware into gotham_ext
Summary:
Move `ScubaMiddleware` out of the LFS server and into `gotham_ext`.

This change required splitting up the `ScubaKey` enum to separate generally useful column names (e.g., HTTP columns that would be applicable to any HTTP service) from LFS-specific columns. `ScubaMiddlwareState` has been modified to accept any type that implements `Into<String>` as a key, and the `ScubaKey` enum has been split up into `HttpScubaKey` (in `gotham_ext`) and `LfsScubaKey` (in `lfs_server`).

The middleware now takes a type parameter to specify a "handler" (implementing the new `ScubaHandler`  trait) which allows the application to add application-specific Scuba columns in addition to the default columns. The application-specific columns will be added immediately prior to the sample being logged.

Reviewed By: krallin

Differential Revision: D23458748

fbshipit-source-id: 3e99f3e0b5d3475a4f5ac9eaefade2eeff12c2fa
2020-09-07 17:24:45 -07:00
Stanislau Hlebik
7b323a4fd9 mononoke: add log-only mode in redaction
Summary:
Before redacting something it would be good to check that this file is not
accessed by anything. Having log-only mode would help with that.

Reviewed By: ikostia

Differential Revision: D23503666

fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4
2020-09-04 07:37:15 -07:00
Stanislau Hlebik
0740f99f13 mononoke: allow logging censored scuba accesses to file
Summary:
In the next diff I'm going to add log-only mode to redaction, and it would be
good to have a way of testing it (i.e. testing that it actually logs accesses
to bad keys).

In this diff let's use a config option that allows logging censored scuba
accesses to file, and let's update redaction integration test to use it

Reviewed By: ikostia

Differential Revision: D23537797

fbshipit-source-id: 69af2f05b86bdc0ff6145979f211ddd4f43142d2
2020-09-04 07:37:14 -07:00
Arun Kulshreshtha
858a080502 gotham_ext: make StreamBody automatically delay post-request callbacks
Summary: Now that post-request callbacks are available in `gotham_ext`, we can make `StreamBody` use them directly instead of using an LFS-specific wrapper (previously required to access the LFS server's `RequestContext`). This also means that the EdenAPI server will get this behavior for free.

Reviewed By: krallin

Differential Revision: D23402969

fbshipit-source-id: 56ab710473f13e8983b136664af364af6884bd3f
2020-09-03 11:59:32 -07:00
Arun Kulshreshtha
7144363d2c gotham_ext: move LogMiddleware to gotham_ext
Summary: Now that `LogMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: DurhamG

Differential Revision: D23298412

fbshipit-source-id: d5288decba98c3dd4605b9a44e41eba0f47fee37
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
35d292e513 gotham_ext: move LoadMiddleware to gotham_ext
Summary: Now that `LoadMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: DurhamG

Differential Revision: D23298416

fbshipit-source-id: 5d29da492e39beb5621daf0570d9b3e657cbfc04
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
82c451fb9f lfs_server: use PostRequestMiddleware
Summary: This diff removes the post-request callback functionality from the LFS server's `RequestContext` and replaces it with the new `PostRequestMiddleware`. The middleware is directly based on `RequestContext`, so the underlying behavior is essentially the same as before.

Reviewed By: krallin

Differential Revision: D23298413

fbshipit-source-id: 1e58a40f6ce6d526456dbd9ae3a8efc85768bf04
2020-09-03 11:59:31 -07:00
Thomas Orozco
b8e197fdb4 mononoke/lfs_server: allow enabling rate limits probabilistically
Summary:
If we exceed a rate limit, we probably don't want to just drop 100% of traffic.
This would create a sawtooth pattern where we allow a bunch of traffic, update
our counters, drop a bunch of traffic, update our counters again, allow a bunch
of traffic, etc.

To fix this, let's make limits probabilistic. This lets us say "beyond X GB/s,
drop Y% of traffic", which is closer to a sane rate limit.

It might also make sense to eventually change this to use ratelim. Initially,
we didn't do this because we needed our rate limiting decisions to be local to
a single host (because different hosts served different traffic), but now that
we spread the load for popular blobs across the whole tier, we should be able
to just delegate to ratelim.

For now, however, let's finish this bit of a functionality so we can turn it
on.

The corresponding Configerator change is here: D23472683

Reviewed By: aslpavel

Differential Revision: D23472945

fbshipit-source-id: f7d985fded3cdbbcea3bc8cef405224ff5426a25
2020-09-02 11:02:18 -07:00
Arun Kulshreshtha
767570d298 lfs_server: remove PerfCounters from post-request callback signature
Summary:
`PerfCounters` was the only application-specific type exposed as a parameter to the post-request callbacks, and it was only being used in one place. To facilitate making the post-request callback functionality more general, this diff makes the callback in question capture the `CoreContext` in its environment, thereby giving it access to the `PerfCounters` without requiring it to be passed as an argument.

This should not change the behavior since regardless of how the callback obtains a reference, it will still refer to the same underlying `PerfCounters` from the request's `CoreContext`.

Reviewed By: DurhamG

Differential Revision: D23298417

fbshipit-source-id: 898f14e5b35b827e98eaf1731db436261baa43bb
2020-08-27 14:15:25 -07:00
Arun Kulshreshtha
9f68c673f3 gotham_ext: move TimerMiddleware into gotham_ext
Summary: Now that `TimerMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: farnz

Differential Revision: D23298414

fbshipit-source-id: 058cb67c9294b28ec7aec03a45da9588e97facc5
2020-08-26 16:04:31 -07:00
Arun Kulshreshtha
825016043f lfs_server: decouple TimerMiddleware from RequestContext
Summary: Previously, the LFS server's `TimerMiddleware` needed to be used in conjunction with `RequestContext`, as its purpose was to simply call a method on the `RequestContext` to record the elapsed time. This diff moves tracking of the elapsed time into `TimerMiddleware` itself (via Gotham's `State`), allowing the middleware to be used on its own.

Reviewed By: farnz

Differential Revision: D23298418

fbshipit-source-id: 8077d40edec0936d95317ac11d86bbcd33a3bf04
2020-08-26 16:04:31 -07:00
Arun Kulshreshtha
439dd2d495 gotham_ext: move client hostname lookup into gotham_ext
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.

Reviewed By: krallin

Differential Revision: D22835610

fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
2020-07-30 10:27:35 -07:00
Harvey Hunt
e5b249cefc mononoke: lfs_server: Use enforce_acl_check as a killswitch for ACL enforcement
Summary:
Now that we can configure ACL checking on a per-repo basis, use the
`enforce_acl_check` config option as a killswitch to quickly disable ACL
enforcement, if required.

Further, remove the `acl_check` config flag that was always set to True.

As part of this change I've refactored the integration test a little and
replaced the phrase "ACL check" with "ACL enforcement", as we always check the
ACL inside of the LFS server.

Reviewed By: krallin

Differential Revision: D22764510

fbshipit-source-id: 8e09c743a9cd78d54b1423fd2a5cfc9bf7383d7a
2020-07-28 04:57:01 -07:00
Harvey Hunt
cce86abf14 mononoke: lfs_server: Enforce ACL checks on a per repo basis
Summary:
Update the LFS server to use the `enforce_lfs_acl_check` to enforce
ACL checks for specific repos and also reject clients with missing idents.

In the next diff, I will use the existing LFS server config's
`enforce_acl_check` flag as a killswitch.

Reviewed By: krallin

Differential Revision: D22762451

fbshipit-source-id: 61d26944127711f3503e04154e8c079ae75dc815
2020-07-27 11:04:59 -07:00
Thomas Orozco
ef6d7b48b4 mononoke/lfs_server: popularity: add ODS counters, timeouts and tests
Summary:
ODS counters are helpful to know if the feature is turned on or off without
requiring a traffic spike, so let's log them. Also, let's add timeouts in here,
so we know if things aren't working as expected (I did check in the Mononoke
LFS dataset — 10ms is a very conservative number, that's way beyond the p99 of
batch requests, which include potentially many counter checks).

To make this easier to iterate on, let's also add tests.

Reviewed By: StanislavGlebik

Differential Revision: D22545853

fbshipit-source-id: 02ea4484a4e4ba0dfd4a71030c129eb5c6bb1ec9
2020-07-15 03:39:23 -07:00
Thomas Orozco
2e99b4b7cd mononoke/lfs_server: blob popularity: skip consistent routing for hot blobs
Summary:
If a particular is blob is too popular, we can saturate a LFS host through
consistent routing, and possibly OOM the host as well.

Historically, we haven't had enough traffic to LFS to make this a problem, but
we're getting there now.

This diffs adds support for reporting the popularity of a blob through SCS (not
Mononoke SCS — the couting one), and for using this popularity to identify when
we should stop consistently-routing a given blob.

The idea is that if e.g. something was requested 300 times in the last 20
seconds, it'll take a second for all the hosts to have it in cache, so we might
as well distribute this load.

There are plenty of things we could do slightly better here, such as making the
interval configurable, or having something in-between "consistently route to a
single host" and "don't consistently route at all". That said, I don't think
those are necessary right now, so let's start simple and find out.

Reviewed By: HarveyHunt

Differential Revision: D22503748

fbshipit-source-id: 48827bcfb7658ad22c88a8433359e29b0d56ad5a
2020-07-13 13:00:36 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Simon Farnsworth
b1c85aaf4b Switch Blobstore to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch the remaining lobstore traits to new-style futures.

This just pushes the `.compat()` out to old-style futures, but it makes the move to non-'static lifetimes easier, as all the compile errors will relate to lifetime issues.

Reviewed By: krallin

Differential Revision: D22183228

fbshipit-source-id: 3fe3977f4469626f55cbf5636d17fff905039827
2020-06-26 03:54:42 -07:00
Simon Farnsworth
454de31134 Switch Loadable and Storable interfaces to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.

This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.

Reviewed By: krallin

Differential Revision: D22164315

fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
2020-06-25 08:45:37 -07:00
Thomas Orozco
76606260c2 mononoke/lfs_server: automatically consume HTTP response bodies when dropped
Summary:
If we don't read the body for a response, then Hyper cannot return the
connection to the pool. So, let's do it automatically upon dropping. This will
typically happen when we send a request to upstream then don't read the
response.

I seem to remember this used to work fine at some point, but looking at the
code I think it's actually broken now and we don't reuse upstream connections
if we skip waiting for upstream in a batch request. So, let's fix it once and
for all with a more robust abstraction.

Reviewed By: HarveyHunt

Differential Revision: D22206742

fbshipit-source-id: 2da1c008556e1d964c1cc337d58f06f8d691a916
2020-06-24 10:02:02 -07:00
Thomas Orozco
b60ff4403f mononoke/lfs_server: clean up a bit of spawning code
Summary:
This was old Tokio 0.1 code that needed channels for spawns, but in 0.2 that
actually is built-in to tokio::spawn, so let's use this.

Reviewed By: HarveyHunt

Differential Revision: D22206738

fbshipit-source-id: 8f89ca4f7afc8dd888fe289e8d597148976cc54c
2020-06-24 10:02:01 -07:00
Thomas Orozco
e6d8747347 mononoke/lfs_server: don't require reading data streams to drop them
Summary:
This fixes a bit of a tech debt item in the LFS Server. We've had this
discard_stream functon for a while, which was necessary because if you just
drop the data stream, you get an error on the sending end.

This makes the code more complex than it needs to be, since you need to always
explicitly discard data streams you don't want instead of just dropping them.

This fixes that by letting us support a sender that tolerates the receiver
being closed, and just ignores those errors.

Reviewed By: HarveyHunt

Differential Revision: D22206739

fbshipit-source-id: d209679b20a3724bcd2e082ebd0d2ce10e9ac481
2020-06-24 10:02:01 -07:00
Thomas Orozco
7f48790fb4 mononoke/lfs_server: refactor upload to make it easier to unit test
Summary:
We have a lot of integration tests for LFS, but a handful of unit tests don't
hurt for some simpler changes. Let's make it easier to write those.

Reviewed By: HarveyHunt

Differential Revision: D22206741

fbshipit-source-id: abcb73b35c01f28dd54cc543cd0a746327d3787b
2020-06-24 10:02:01 -07:00
Thomas Orozco
ce7f53422f mononoke/lfs_server: support the client not having the data it wants to send us
Summary:
This diff is probably going to sound weird ... but xavierd and I both think
this is the best approach for where we are right now. Here is why this is
necessary.

Consider the following scenario

- A client creates a LFS object. They upload it to Mononoke LFS, but not
  upstream.
- The client shares this (e.g. with Sandcastle), and includes a LFS pointer.
- The client tries to push this commit

When this happens, the client might not actually have the object locally.
Indeed, the only pieces of data the client is guaranteed to have is
locally-authored data.

Even if the client does have the blob, that's going to be in the hgcache, and
uploading from the hgcache is a bit sketchy (because, well, it's a cache, so
it's not like it's normally guaranteed to just hold data there for us to push
it to the server).

The problem boils down to a mismatch of assumptions between client and server:

- The client assumes that if the data wasn't locally authored, then the server
  must have it, and will never request this piece of data again.
- The server assumes that if the client offers a blob for upload, it can
  request this blob from the client (and the client will send it).

Those assumptions are obviously not compatible, since we can serve
not-locally-authored data from LFS and yet want the client to upload it, either
because it is missing in upstream or locally.

This leaves us with a few options:

- Upload from the hg cache. As noted above, this isn't desirable, because the
  data might not be there to begin with! Populating the cache on demand (from
  the server) just to push data back to the server would be quite messy.
- Skip the upload entirely, either by having the server not request the upload
  if the data is missing, by having the server report that the upload is
  optional, or by having the client not offer LFS blobs it doens't have to the
  server, or finally by having the client simply disobey the server if it
  doesn't have the data the server is asking for.

So, why can we not just skip the upload? The answer is: for the same reason we
upload to upstream to begin with. Consider the following scenario:

- Misconfigured client produces a commit, and upload it to upstream.
- Misconfigured client shares the commit with Sandcastle, and includes a LFS
  pointer.
- Sandcastle wants to push to master, so it goes to check if the blob is
  present in LFS. It isn't (Mononoke LFS checks both upstream and internal, and
  only finds the blob in upstream, so it requests that the client submit the
  blob), but it's also not not locally authored, so we skip the push.
- The client tries to push to Mononoke

This push will fail, because it'll reference LFS data that is not present in
Mononoke (it's only in upstream).

As for how we fix this: the key guarantee made by our proxying mechanism is
that if you write to either LFS server, your data is readable in both (the way
we do this is that if you write to Mononoke LFS, we write it to upstream too,
and if you write to upstream, we can read it from Mononoke LFS too).

What does not matter there is where the data came from. So, when the client
uploads, we simply let it submit a zero-length blob, and if so, we take that to
mean that the client doesn't think it authored data (and thinks we have it), so
we try to figure out where the blob is on the server side.

Reviewed By: xavierd

Differential Revision: D22192005

fbshipit-source-id: bf67e33e2b7114dfa26d356f373b407f2d00dc70
2020-06-24 10:02:01 -07:00
Pavel Aslanov
d13768d768 move DangerousOverride into a separate crate blobrepo_override
Summary: DangerousOverride is moved into a separate crate. Not only it is usually not needed but it was introducing dependencies on mercurial crate.

Reviewed By: StanislavGlebik

Differential Revision: D22115015

fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
2020-06-22 07:29:19 -07:00
Jeremy Fitzhardinge
1b4edb5567 eden: remove unused Rust dependencies
Summary:
Remove unused dependencies for Rust targets.

This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.

Manual edits (named_deps) and misc output in P133451794

Reviewed By: dtolnay

Differential Revision: D22083498

fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
2020-06-17 17:55:03 -07:00
Arun Kulshreshtha
5be68125a7 gotham_ext: Add StreamBody
Summary:
Move the LFS server's `StreamBody` into `gotham_ext`, along with some changes to decouple it from the LFS server.

In particular, the `Content-Length` header and post-request `Sender` have been made optional fields that can be set via a builder-style interface. The LFS server's `StreamBody` has been renamed to `LfsStreamBody` and is now a thin wrapper around `gotham_ext`'s `StreamBody` that preserves the old behavior.

Reviewed By: krallin

Differential Revision: D21988855

fbshipit-source-id: a9bf9c04bb791388d761fc705ebc38472a713b65
2020-06-11 13:03:24 -07:00
Johan Schuijt-Li
1d4c5cbfc4 mononoke: replace instances of whitelist/blacklist/blackhole
Summary:
There are people that are hurt by usage of these terms, this should be more
then enough reason to replace these. Newly chosen terms are more
self-explanatory as well.

This doesn't yet touch the actualy config files, as that requires a bit more
effort than 1 diff and will require more coordination.

Reviewed By: krallin

Differential Revision: D21924440

fbshipit-source-id: e24fc638dc8c9d6d20b6f3fa5f0d0bbc91bbf77b
2020-06-10 19:29:30 -07:00