Commit Graph

158 Commits

Author SHA1 Message Date
Alex Hornby
31b900bc08 mononoke: make CountedBlobstoreOps put behaviour aware
Summary:
Add CountedBlobstoreOps so that blobstore layers that need BlobstoreOps can still use counting

This unblocks adding sqlblob to blobstore-test in next diff in stack

Reviewed By: farnz

Differential Revision: D24079256

fbshipit-source-id: 6a6505aff8c8405353a1f10d79f6e6e08911228a
2020-10-12 07:12:10 -07:00
Alex Hornby
60a588aa87 mononoke: make prefixblob put behaviour aware
Summary: Add BlobstorePutOps so that blobstore layers that need BlobstorePutOps can still use PrefixBlob as a wrapper.

Reviewed By: farnz

Differential Revision: D24109298

fbshipit-source-id: 710571e6c30fa8a432d463eedfab5fcc0389baa3
2020-10-12 07:12:10 -07:00
Alex Hornby
48900ae545 mononoke: predicate based PutBehaviour logic to manifoldblob
Summary:
Add predicate based PutBehaviour logic to manifoldblob.

This will prevent overwrites of keys when in IfAbsent mode, and will generate useful logging in OverwriteAndLog and IsAbsent mode.

This change factors our part of the put logic to put_check_conflict, so that it can use re-used from each of the PutBehaviour cases.

Reviewed By: StanislavGlebik

Differential Revision: D24021170

fbshipit-source-id: d2e71afadada3d5e661634449108e6c9f8dc5907
2020-10-12 07:12:10 -07:00
Alex Hornby
9756def014 mononoke: implement BlobstorePutOps for sqlblob
Summary: Implement BlobstorePutOps for sqlblob

Differential Revision: D24021172

fbshipit-source-id: be24bc0d58263e190fdca546a3adf9b5815b3c4b
2020-10-08 04:59:11 -07:00
Alex Hornby
4e772d07d5 mononoke: implement BlobstorePutOps for S3Blob
Summary:
Implement BlobstorePutOps for S3Blob.  This uses is_present to check the various put behaviours

While implementing this I noticed get_sharded_key could be updated to take a reference, so I did that as well.

Differential Revision: D24079253

fbshipit-source-id: 16e194076dbdb4da8a7a9b779e0bd5fb60f550a6
2020-10-07 12:11:11 -07:00
Alex Hornby
cac5350f5f mononoke: add test for blobstore OverwriteStatus
Summary: Now that fileblob and memblob support put behaviour logic, update the overwrite test to check the overwrite result.

Differential Revision: D24021167

fbshipit-source-id: d9578630205cf5d79999a459cc29481968d5717d
2020-10-07 12:11:10 -07:00
Alex Hornby
cad15511f8 mononoke: update memblob to be PutBehaviour aware
Summary: Update memblob to be PutBehaviour aware by changing implementation from Blobstore to BlobstoreOps

Differential Revision: D24021166

fbshipit-source-id: 04dd25c5535769ea507120c1886592b808a7bbc6
2020-10-07 12:11:10 -07:00
Alex Hornby
fb1d4515df mononoke: update Memblob::new callsites to ::default()
Summary: Update Memblob::new callsites to ::default() in preparation for adding arguments to ::new() to specify the put behaviour desired

Differential Revision: D24021173

fbshipit-source-id: 07bf4e6c576ba85c9fa0374d5aac57a533132448
2020-10-07 12:11:10 -07:00
Alex Hornby
9c9401f691 mononoke: add put behaviour handling to fileblob
Summary: Add put behaviour handling to fileblob so that it can prevent overwrites if requested.

Differential Revision: D23933228

fbshipit-source-id: 8e74ac96b232be841174f6ad2bd2fccf92aaa90d
2020-10-07 12:11:10 -07:00
Alex Hornby
2abe862535 mononoke: add put behaviour to BlobstoreOptions
Summary:
Add put behaviour to BlobstoreOptions in preparation for passing in the put behaviour through blobstore_factory.

Later in the stack a command line option is added to set this non-None so that we can turn on overwrite logging for particular jobs.

Reviewed By: StanislavGlebik

Differential Revision: D24021169

fbshipit-source-id: 5692e2d3912ebde07b0d7bcce54b79df188a9f16
2020-10-07 12:11:10 -07:00
Alex Hornby
4f0b9c3e42 mononoke: introduce BlobstorePutOps for blobstore implementations
Summary:
This is the first part of allowing us to update mononoke blobstore put behaviour to optionally a) log when it is overwriting keys, and b) not overwrite existing keys.

Introduce BlobstorePutOps for blobstore implementations so we can track overwrite status of a put, and force an explicit PutBehaviour if required.  Its intended that only blobstore implementation code and special admin tooling will need to access BlobstorePutOps methods.

Reviewed By: farnz

Differential Revision: D24021168

fbshipit-source-id: 56ae34f9995a93cf1e47fbcfa2565f236c28ae12
2020-10-06 13:05:40 -07:00
Alex Hornby
42f5c54104 mononoke: remove unnecessary clone in packblob
Summary: Remove unnecessary clone in packblob along with the Clone constraint on the inner blobstore.

Reviewed By: krallin

Differential Revision: D24109293

fbshipit-source-id: b47e68e63b6ffda95d28d974ed6883e4ae31b3a1
2020-10-06 03:34:36 -07:00
Simon Farnsworth
83801357d4 Make SQLBlob overwrite on put rather than ignoring
Summary:
We want to end up with two `put` behaviours - overwrite and do not overwrite.

Currently, SQLBlob only implements the latter, but some users assume that `put` always overwrites. Change to match Manifold

Reviewed By: aslpavel

Differential Revision: D24079501

fbshipit-source-id: f75cac81acf874337c38f82597aae645c41a319b
2020-10-02 10:41:12 -07:00
Alex Hornby
409a9da79d mononoke: remove assert_present from Blobstore trait
Summary:
Remove assert_present from Blobstore trait as it had only one callsite other than the various blobstore layers/impls.

Replaced that one last call in repo_commit.rs/assert_in_blobstore() with an equivalent call to is_present.

Reviewed By: farnz

Differential Revision: D24016927

fbshipit-source-id: 764fddbebeb4b1192d196078b8824cf8a08e9691
2020-10-01 01:23:52 -07:00
Aida Getoeva
40b8353d21 mononoke: integrate mysql client
Summary:
This diff introduces Mysql client for Rust to Mononoke as a one more backend in the same row with raw xdb connections and myrouter. So now Mononoke can use new Mysql client connections instead of Myrouter.

To run Mononoke with the new backend, pass `--use-mysql-client` options (conflicts with `--myrouter-port`).

I also added a new target for integration tests, which runs mysql tests using mysql client.
Now to run mysql tests using raw xdb connections, you can use `mononoke/tests/integration:integration-mysql-raw-xdb` and using mysql client `mononoke/tests/integration:integration-mysql`

Reviewed By: ahornby

Differential Revision: D23213228

fbshipit-source-id: c124ccb15747edb17ed94cdad2c6f7703d3bf1a2
2020-09-29 03:09:05 -07:00
Lukas Piatkowski
eea2b564a8 mononoke/s3blob: remove it from OSS (#62)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/62

This diff fixes OSS Mononoke build.

Reviewed By: HarveyHunt

Differential Revision: D23852016

fbshipit-source-id: 90371149a3566efdd5653b4ba5098dad81357ef2
2020-09-23 06:38:03 -07:00
Alex Hornby
d107b28d52 mononoke: SomeFailedOthersNone should not consider write mostly blobstores None if all other stores Error
Summary: SomeFailedOthersNone should not consider write mostly blobstores None if all other stores Error

Reviewed By: farnz

Differential Revision: D23840334

fbshipit-source-id: 9838bead6fec0d5f920e4a788387025d0dacf80b
2020-09-22 09:35:38 -07:00
Alex Hornby
d3a94e0a70 mononoke: Add a test for SomeFailedOthersNone when write mostly blobstore is None
Summary: Add a test for SomeFailedOthersNone when write mostly blobstore is None

Reviewed By: farnz

Differential Revision: D23840685

fbshipit-source-id: 81834663169b3a522b9c08e0a36f0b91354916c7
2020-09-22 09:35:38 -07:00
Egor Tkachenko
4d0ae8ae41 Added S3 blobstore
Summary:
Implemented S3 blobstore
Isilon implements S3 as 1:1 mapping into filesystem, and it limits the maximum number of blobs in the single directory. To overcome it lets shard the keys using base64 encoding and making 2 level dir structure with 2 chars dir names.

Reviewed By: krallin

Differential Revision: D23562541

fbshipit-source-id: c87aca2410381a07babb191cbd8cf28233556e03
2020-09-22 04:15:34 -07:00
Stanislau Hlebik
9fc2a01f0b mononoke: bump memcache key for blobstore
Summary:
I've re-backfilled some of blame values for configerator. But old values might
still be in memcache. To make sure that's not the case let's bump the memcache
key.

Reviewed By: krallin

Differential Revision: D23810971

fbshipit-source-id: c333a51ffb2babf7da808b276f9cfa31baaa105c
2020-09-21 01:47:01 -07:00
Alex Hornby
4203aca84b mononoke: include the causes in the log to mononoke_blobstore_trace
Summary:
We are currently logging only the outermost underlying error or context, not any of the lower level causes. This makes mononoke_blobstore_trace less useful!

This changes to use anyhow's alternate format that includes causes

Reviewed By: krallin

Differential Revision: D23708577

fbshipit-source-id: fa2e71734841e2b75d824c456dccf61c1fb13fd2
2020-09-16 02:02:56 -07:00
Johan Schuijt-Li
deb57a25ed mononoke: deprecate preamble in favor of metadata
Summary:
In preparation of moving away from SSH as an intermediate entry point for
Mononoke, let Mononoke work with newly introduced Metadata. This removes any
assumptions we now make about how certain data is presented to us, making the
current "ssh preamble" no longer central.

Metadata is primarily based around identities and provides some
backwards-compatible entry points to make sure we can satisfy downstream
consumers of commits like hooks and logs.

Simarly we now do our own reverse DNS resolving instead of relying on what's
been provided by the client. This is done in an async matter and we don't rely
on the result, so Mononoke can keep functioning in case DNS is offline.

Reviewed By: farnz

Differential Revision: D23596262

fbshipit-source-id: 3a4e97a429b13bae76ae1cdf428de0246e684a27
2020-09-15 10:28:38 -07:00
Thomas Orozco
21290702e1 third-party/rust: import async-compression + update zstd
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).

In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.

Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.

The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.

Reviewed By: StanislavGlebik

Differential Revision: D23652335

fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
2020-09-15 07:59:53 -07:00
Aida Getoeva
b92c64af7d shed/sql: make queries! macros work with new Rust mysql client
Summary:
shed/sql library used mainly to communicate with Mysql db and to have a nice abstraction layer around mysql (which is used in production) and sqlite (integration tests). The library provided an interface, that was backed up from Mysql side my raw connections and by MyRouter.
This diff introduces a new backend - new Mysql client for Rust.

New backend is exposed as a third variant for the current model: sqlite, mysql (raw conn and myrouter) and mysql2 (new client). The main reason for that is the fact that the current shed/sql interface for Mysql
(1) heavily depends on mysql_async crate, (2) introduces much more complexity than needed for the new client and (3) it seems like this will be refactored and cleaned up later, old things will be deprecated.
So to not overcomplicate things by trying to implement the given interface for the new Mysql client, I tried to simplify things by adding it as a third backend option.

Reviewed By: farnz

Differential Revision: D22458189

fbshipit-source-id: 4a484b5201a38cc017023c4086e9f57544de68b8
2020-09-11 06:33:37 -07:00
Simon Farnsworth
89e30973ff Report write errors when scrubbing
Summary: When we're scrubbing blobstores, it's not actually a success state if a scrub fails to write. Report this back to the caller - no-one will usually be scrubbing unless they expect repair writes to succeed, and a failure is a sign that we need to investigate further

Reviewed By: mitrandir77

Differential Revision: D23601541

fbshipit-source-id: d328935af9999c944719a6b863d0c86b28c54f59
2020-09-10 02:29:47 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
Simon Farnsworth
aa2df38491 Improve errors on scrub failure
Summary:
With three blobstores in play, we have issues working out exactly what's wrong during a manual scrub. Make the error handling better:

1. Manual scrub adds the key as context for the failure.
2. Scrub error groups blobstores by content, so that you can see which blobstore is most likely to be wrong.

Reviewed By: ahornby, krallin

Differential Revision: D23565906

fbshipit-source-id: a199e9f08c41b8e967d418bc4bc09cb586bbb94b
2020-09-09 07:25:13 -07:00
Stanislau Hlebik
66fbdf72c7 mononoke: add sampling for redacted accesses
Summary:
Previously we were not logging a redacted access if previous access was logged
less < MIN_REPORT_TIME_DIFFERENCE_NS ago. That doesn't work well with our
tests.

Let's instead add a sampling tunable.

Reviewed By: krallin

Differential Revision: D23595067

fbshipit-source-id: 47f6152945d9fdc2796fd1e74804e8bcf7f34940
2020-09-09 02:51:41 -07:00
David Tolnay
be0786f14b Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: StanislavGlebik

Differential Revision: D23568780

fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
2020-09-08 07:33:16 -07:00
Stanislau Hlebik
7b323a4fd9 mononoke: add log-only mode in redaction
Summary:
Before redacting something it would be good to check that this file is not
accessed by anything. Having log-only mode would help with that.

Reviewed By: ikostia

Differential Revision: D23503666

fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4
2020-09-04 07:37:15 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Alex Hornby
213edc10ce mononoke: limit queue peeking from scrub blobstore when one store has missing value
Summary: When running manual scrub for a large repo with one empty store, we are doing one peek per key.  For keys that have existed for some time this is unnecessary as we know the key should exist and slows down the scrub.

Reviewed By: farnz

Differential Revision: D23054582

fbshipit-source-id: d2222350157ca37aa31b7792214af4446129c692
2020-08-14 02:37:45 -07:00
Alex Hornby
d59dd787c5 mononoke: make blobstore ctime a bit easier to use
Summary: Ctime is an Option<i64>, so rather than as_ctime()/into_ctime() use the fact that it's fairly small and Copy to just .ctime()

Reviewed By: krallin

Differential Revision: D23081739

fbshipit-source-id: be62912eca02e5c29d7473d6f386d98df11000dd
2020-08-14 02:09:46 -07:00
Alex Hornby
e0c6e249fe mononoke: add a non-thrift header to packblob so we can vary thrift protocol in future
Summary: Add a non-thrift header to packblob so we can vary thrift protocol in future.

Reviewed By: farnz

Differential Revision: D22953758

fbshipit-source-id: a114a350105e75cbe57f6c824295d863c723f32f
2020-08-07 03:43:56 -07:00
Simon Farnsworth
0c3fe9b20f Fully asyncify blobstore sync queue
Summary: Move it from `'static` BoxFutures to async_trait and lifetimes

Reviewed By: markbt

Differential Revision: D22927171

fbshipit-source-id: 637a983fa6fa91d4cd1e73d822340cb08647c57d
2020-08-05 15:41:15 -07:00
Simon Farnsworth
aa94fb9581 Add a multiplex mode that doesn't update the sync queue
Summary:
Backfillers and other housekeeping processes can run so far ahead of the blobstore sync queue that we can't empty it from the healer task as fast as the backfillers can fill it.

Work around this by providing a new mode that background tasks can use to avoid filling the queue if all the blobstores are writing successfully. This has a side-effect of slowing background tasks to the speed of the slowest blobstore, instead of allowing them to run ahead at the speed of the fastest blobstore and relying on the healer ensuring that all blobs are present.

Future diffs will add this mode to appropriate tasks

Reviewed By: ikostia

Differential Revision: D22866818

fbshipit-source-id: a8762528bb3f6f11c0ec63e4a3c8dac08d0b4d8e
2020-08-05 06:35:46 -07:00
Simon Farnsworth
33c2a0c846 Update auto_impl to 0.4
Summary: We were using a git snapshot of auto_impl from somewhere between 0.3 and 0.4; 0.4 fixes a bug around Self: 'lifetime constraints on methods that blocks work I'm doing in Mononoke, so update.

Reviewed By: dtolnay

Differential Revision: D22922790

fbshipit-source-id: 7bb68589a1d187393e7de52635096acaf6e48b7e
2020-08-04 18:12:45 -07:00
Santiago Alfonso Muñoz Rodriguez
007dc93916 Enumeration API for BlobStore keys
Summary:
- Enumerate API now provided via trait BlobstoreKeySource
- Implementation for Fileblob and ManifoldBlob
- Modified populate_healer to use new api
- Modified fixrepocontents to use new api

Reviewed By: ahornby

Differential Revision: D22763274

fbshipit-source-id: 8ee4503912bf40d4ac525114289a75d409ef3790
2020-08-04 06:54:18 -07:00
Simon Farnsworth
f7e8931a56 Add a minimum successful writes count for MultiplexedBlobstore
Summary:
There are two reasons to want a write quorum:

1. One or more blobstores in the multiplex are experimental, and we don't want to accept a write unless the write is in a stable blobstore.
2. To reduce the risk of data loss if one blobstore loses data at a bad time.

Make it possible

Reviewed By: krallin

Differential Revision: D22850261

fbshipit-source-id: ed87d71c909053867ea8b1e3a5467f3224663f6a
2020-08-04 02:45:38 -07:00
Simon Farnsworth
a5e9b79d7d Return all errors in the event of a multiplexed put failure
Summary:
With upcoming write quorum work, it'll be interesting to know all the failures that prevent a put from succeeding, not just the most recent, as the most recent may be from a blobstore whose reliability is not yet established.

Store and return all errors, so that we can see exactly why a put failed

Reviewed By: ahornby

Differential Revision: D22896745

fbshipit-source-id: a3627a04a46052357066d64135f9bf806b27b974
2020-08-03 09:30:05 -07:00
Simon Farnsworth
a9b8793d2d Add a write-mostly blobstore mode for populating blobstores
Summary:
We're going to add an SQL blobstore to our existing multiplex, which won't have all the blobs initially.

In order to populate it safely, we want to have normal operations filling it with the latest data, and then backfill from Manifold; once we're confident all the data is in here, we can switch to normal mode, and never have an excessive number of reads of blobs that we know aren't in the new blobstore.

Reviewed By: krallin

Differential Revision: D22820501

fbshipit-source-id: 5f1c78ad94136b97ae3ac273a83792ab9ac591a9
2020-08-03 04:36:19 -07:00
Alex Hornby
ecb58ff8d7 mononoke: add cmdlib argument to control cachelib zstd compression
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempted compression when putting to the cache if the object is larger than the cachelib max size.

To make the cache behaviour more testable, this change also adds an option to do an eager put to cache without the spawn. The default remains to do a lazy fire and forget put into the cache with tokio::spawn.

The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).

Reviewed By: StanislavGlebik

Differential Revision: D22797872

fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
2020-07-31 01:12:02 -07:00
Simon Farnsworth
a40a8f36b7 Asyncify MultiplexedBlobstore
Summary:
We have two deficiencies to correct in here; modernise the code without changing behaviour first to make it easier to later fix them.

Deficiency 1 is that we always call the `on_put` handler; we need a mode that doesn't do that unless a blobstore returns an error, for jobs not waiting on a human.

Deficiency 2 is that we accept a write after one blobstore accepts it; there's a risk of that being the only copy if a certain set of race conditions are met

Reviewed By: StanislavGlebik

Differential Revision: D22701961

fbshipit-source-id: 0990d3229153cec403717fcd4383abcdf7a52e58
2020-07-27 06:09:47 -07:00
Lukasz Piatkowski
0dd3c4e4bb add Mononoke integration tests CI (#26)
Summary:
This diff adds a minimal workflow for running integrations tests for Mononoke. Currently only one test is run and it fails.

This also splits the regular Mononoke CI into separate files for Linux and Mac to match the current style in Eden repo.
There are the "scopeguard::defer" fixes here that somehow escaped the CI tests.
Some tweaks have been made to "integration_runner_real.py" to make it runnable outside FB context.
Lastly the change from using "[[ -v ... ]" to "[[ -n "${...:-}" ]]; in "library.sh" was made because the former is not supported by the default Bash version preinstalled on modern MacOS.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/26

Reviewed By: krallin

Differential Revision: D22541344

Pulled By: lukaspiatkowski

fbshipit-source-id: 5023d147823166a8754be852c29b1e7b0e6d9f5f
2020-07-16 12:16:10 -07:00
Simon Farnsworth
78847ff88c Make BlobstoreSyncQueue use new futures
Summary: Stage 1 of a migration - next step is to make all users of this trait use new futures, and then I can come back, add lifetimes and references, and leave it modernised

Reviewed By: StanislavGlebik

Differential Revision: D22460164

fbshipit-source-id: 94591183912c0b006b7bcd7388a3d7c296e60577
2020-07-10 06:43:13 -07:00
Thomas Orozco
ae917ba227 mononoke/virtually_sharded_blobstore: make sampling rate tunable
Summary: As it says in the title.

Reviewed By: farnz

Differential Revision: D22432526

fbshipit-source-id: 42726584689cbc2f5c9138b42b7bf77939921bdd
2020-07-08 09:07:19 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Thomas Orozco
ce0af2d591 mononoke/virtually_sharded_blobstore: deduplicate puts based on data being put
Summary:
This updates the virtually_sharded_blobstore to deduplicate puts only if the
data being put is actually the data we have put in the past. This is done by
keeping track of the hash of things we've put in the presence cache.

This has 2 benefits:

- This is safer. We only dedupe puts we 100% know succeeded (because this
  particular instance was the one to attempt the put).
- This is creates less surprises, notably it lets us overwrite data in the
  backing store (if we are writing something different).

Reviewed By: StanislavGlebik

Differential Revision: D22392809

fbshipit-source-id: d76a49baa9a5749b0fb4865ee1fc1aa5016791bc
2020-07-06 12:10:46 -07:00
Thomas Orozco
19b31ead9d mononoke/virtually_sharded_blobstore: make race tests a little more forgiving
Summary:
Running those on my devserver, I noticed they can be a bit flaky. They're are
racy on the purpose, but let's relax them a bit.

We have a lot of margin here — our blobstore is rate limited at once request
every 10ms, and we need to do 100 requests (the goal is to show that they don't
all wait), so 100ms is fine to prove that they're not rate limited when sharing
the same data.

Reviewed By: StanislavGlebik

Differential Revision: D22392810

fbshipit-source-id: 2e3c9cdf19b0e4ab979dfc000fbfa8da864c4fd6
2020-07-06 12:10:46 -07:00
Thomas Orozco
07907b2b26 mononoke/virtually_sharded_blobstore: merge in the context_concurrency_blobstore
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:

- You don't want the rate limiting to be above caching (in the blobstore stack,
  that is), because you shouldn't rate limits cache hits (this is where we are
  today).
- You don't want the rate limiting to below deduplication, because then you get
  priority inversion where a low-priority rate-limited request might hold the
  semaphore while a higher-priority, non rate limited request wants to do the
  same fetch (we could have moved rate limiting here prior to introducing
  deduplication, but I didn't do it earlier because I wanted to eventually
  introduce deduplication).

So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!.

Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).

The design here when accessing the blobstore is as follows:

- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
  data.
- Otherwise, check if we are rater limited.

Then, if we are rate limited:

- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
  waiting).
    - If the data is there, then return our rate limit token.
    - If the data isn't there, then proceed to query the blobstore.

If we aren't rate limited, then we just proceed to query the blobstore.

There are a couple subtle aspects of this:

- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
  we'll have waited but we won't need to query the blobstore.
    - This is important when a large number of requests from the same key
      arrive at the same time and get rate limited. If we don't do this second
      cache check or if we don't return the token, then we'll consume a rate
      limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
  regard to semaphores (i.e. release early), but like a miss with regard to
  rate limits (i.e. wait).

Both of those are addressed captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or wait
for it on a cache miss.

(all of this logic is captured in unit tests, we can remove any of the blocks
there in `Shards::acquire` and a test will fail)

Reviewed By: farnz

Differential Revision: D22374606

fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
2020-07-06 04:38:31 -07:00