Commit Graph

137 Commits

Author SHA1 Message Date
Pavel Aslanov
13ee62332d prefix old future with Old
Summary: Prefix old futures with `Old` so it would be possible to start conversion of BlobRepo to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25187882

fbshipit-source-id: d66debd2981564b289d50292888f3f6bd3343e94
2020-11-27 06:49:06 -08:00
Lukas Piatkowski
fa1a195fd0 mononoke/blobstore: pass CoreContext via borrowed instead of owned value
Summary: Follow up after removing 'static from blobstore.

Reviewed By: StanislavGlebik

Differential Revision: D25182106

fbshipit-source-id: e13a7a31d71b4674425123268e655ae66127f1b7
2020-11-27 03:31:07 -08:00
Thomas Orozco
acf1453e12 mononoke: rename BlobManifest to HgBlobManifest
Summary:
We have `HgBlobChangeset`, `HgFileEnvelope`, `HgManifestEnvelope` ... but we
also have `BlobManifest`. Let's be a little consistent.

Reviewed By: markbt

Differential Revision: D25122288

fbshipit-source-id: 9ae0be49986dbcc31cee9a46bd30093b07795c62
2020-11-26 11:26:50 -08:00
Thomas Orozco
65c6869384 mononoke/blobrepo/test: use fbinit::compat_test consistently
Summary:
We have 4 different ways of awaiting futures in there: sometimes we create a
runtime, sometimes we use async-unit, sometimes we use `fbinit::test` and
sometimes we use `fbinit::compat_test`. Let's be consistent.

While in there, let's also get rid of `should_panic`, since that's not a very
good way of running tests.

Reviewed By: HarveyHunt

Differential Revision: D25186195

fbshipit-source-id: b64bb61935fb2132d2e5d8dff66fd4efdae1bf64
2020-11-26 08:22:05 -08:00
Thomas Orozco
015331583d mononoke: remove HgBlobEntry
Summary:
HgBlobEntry is kind of a problematic type right now:

- It has no typing to indicate if it's a file or a manifest
- It always has a file type, but we only sometimes care about it

This results in us using `HgBlobEntry` to sometimes represent
`Entry<HgManifestId, (FileType, HgFileNodeId)>`, and some other times to
represent `Entry<HgManifestId, HgFileNodeId>`.

This makes code a) confusing, b) harder to refactor because you might be having
to change unrelated code that cares for the use case you don't care about (i.e.
with or without the FileType), and c) results in us sometimes just throwing in
a `FileType::Normal` because we don't care about the type, which is prone to
breaking stuff.

So, this diff just removes it, and replaces it with the appropriate types.
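
A minimal sketch of the typing idea, using simplified stand-ins rather than the real Mononoke types: `Entry`, `HgManifestId`, `HgFileNodeId` and `FileType` are named in the summary, but their shapes here (and the `FileType` variant names) are assumptions made for illustration. The helpers `describe` and `strip_file_type` are hypothetical.

```rust
/// Stand-in for a manifest entry that is either a subtree or a leaf.
/// Simplified; the real `Entry` type may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry<T, L> {
    Tree(T),
    Leaf(L),
}

// Hypothetical placeholder id types, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HgManifestId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HgFileNodeId(pub u64);

// Variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FileType {
    Regular,
    Executable,
    Symlink,
}

/// Code that cares about the file type says so in its signature.
pub fn describe(entry: &Entry<HgManifestId, (FileType, HgFileNodeId)>) -> String {
    match entry {
        Entry::Tree(id) => format!("tree {}", id.0),
        Entry::Leaf((ty, id)) => format!("{:?} file {}", ty, id.0),
    }
}

/// Code that does not care drops the file type explicitly,
/// instead of inventing a default one somewhere else.
pub fn strip_file_type(
    entry: Entry<HgManifestId, (FileType, HgFileNodeId)>,
) -> Entry<HgManifestId, HgFileNodeId> {
    match entry {
        Entry::Tree(id) => Entry::Tree(id),
        Entry::Leaf((_ty, id)) => Entry::Leaf(id),
    }
}
```

With the two use cases spelled out as distinct types, the compiler tells you which one a given function expects.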

Reviewed By: farnz

Differential Revision: D25122291

fbshipit-source-id: e9c060c509357321a8059d95daf22399177140f1
2020-11-26 08:22:05 -08:00
Lukas Piatkowski
15f0f924e6 mononoke/blobstore: use async_trait instead of BoxFuture
Reviewed By: farnz

Differential Revision: D25124793

fbshipit-source-id: 1ebe72d1db8043fabf9f20538f3e95c755e049e0
2020-11-23 07:58:34 -08:00
Lukas Piatkowski
0f54cc3d63 mononoke/blobstore: make Blobstore generic over lifetime
Summary: Remove the 'static requirement for async methods of Blobstore, propagate this change, and fix up the low-hanging fruit where the code can easily become 'static-free.

Reviewed By: ahornby, farnz

Differential Revision: D24839054

fbshipit-source-id: 5d5daa04c23c4c9ae902b669b0a71fe41ee6dee6
2020-11-20 05:51:52 -08:00
Thomas Orozco
d3235ff615 mononoke: remove name from HgBlobEntry
Summary:
This isn't actually being consulted anywhere save for a single test, so let's
just remove it (it's not like the test checks anything important; that field
might as well not exist given we never read it).

Reviewed By: farnz

Differential Revision: D25093494

fbshipit-source-id: 5f4a53f8666fc0e8a89ceade44baa96e71fb813f
2020-11-19 12:55:28 -08:00
David Tolnay
448edf7461 Format fbsource with rustfmt-2.0.0-rc.2
Reviewed By: zertosh

Differential Revision: D25075717

fbshipit-source-id: 244182839311f96b69f381c07983276a04ecc5d3
2020-11-18 19:46:38 -08:00
Mark Juggurnauth-Thomas
2a76d65847 derived_data: rename BonsaiDerived::derive03 to derive
Summary: Now that `derive03` is the only version available, rename it to `derive`.

Reviewed By: krallin

Differential Revision: D24900106

fbshipit-source-id: c7fbf9a00baca7d52da64f2b5c17e3fe1ddc179e
2020-11-13 01:48:03 -08:00
Mark Juggurnauth-Thomas
26aaa544b0 blobrepo: switch to BonsaiDerived::derive03
Reviewed By: krallin

Differential Revision: D24900110

fbshipit-source-id: 7f73563547e06ee89ece15a614b6efa8c300cab4
2020-11-13 01:48:02 -08:00
Stefan Filip
3ffb223968 config: add SegmentedChangelog that downloads dag for functionality
Summary:
Under this configuration SegmentedChangelog Dags (IdDag + IdMap) are always
downloaded from saves. There is no real state kept in memory.

It's a simple configuration and somewhat flexible with tweaks to blobstore
caching.

Reviewed By: krallin

Differential Revision: D24808330

fbshipit-source-id: 450011657c4d384b5b42e881af8a1bd008d2e005
2020-11-11 22:53:38 -08:00
Mark Juggurnauth-Thomas
bfc7614037 skeleton_manifest: implement skeleton manifest derivation
Summary: Implement derivation of skeleton manifests.

Differential Revision: D24787534

fbshipit-source-id: e55d053a717fe052fc4da69bd9034784b356b7cc
2020-11-11 13:23:48 -08:00
Mark Juggurnauth-Thomas
89957422b8 tests_utils: allow all MPaths in created commits
Summary:
Allow users of `tests_utils` to create paths that are not `String`, by supporting any type
that can be converted into `MPath`.
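
A rough sketch of what such an API could look like. `MPath` here is a simplified stand-in, `CommitBuilder` and `add_file` are hypothetical names, and the use of `Into<MPath>` (rather than a fallible conversion) is an assumption.

```rust
/// Hypothetical, simplified stand-in for Mononoke's `MPath`.
#[derive(Debug, Clone, PartialEq)]
pub struct MPath(Vec<String>);

impl From<&str> for MPath {
    fn from(path: &str) -> Self {
        MPath(path.split('/').map(str::to_string).collect())
    }
}

impl From<String> for MPath {
    fn from(path: String) -> Self {
        MPath::from(path.as_str())
    }
}

/// Hypothetical commit builder in the spirit of `tests_utils`.
#[derive(Default)]
pub struct CommitBuilder {
    files: Vec<(MPath, String)>,
}

impl CommitBuilder {
    /// Accepts anything convertible into `MPath`, not only `String`.
    pub fn add_file(mut self, path: impl Into<MPath>, content: &str) -> Self {
        self.files.push((path.into(), content.to_string()));
        self
    }

    /// Paths added so far, in insertion order.
    pub fn paths(&self) -> Vec<&MPath> {
        self.files.iter().map(|(path, _)| path).collect()
    }
}
```

A caller can then mix `&str`, `String` and ready-made `MPath` values in the same chain of `add_file` calls.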

Reviewed By: StanislavGlebik

Differential Revision: D24887002

fbshipit-source-id: 47ad567507185863c1cfa3c6738f30aa9266901a
2020-11-11 13:23:48 -08:00
Egor Tkachenko
99643e0409 Added verification of generated bonsai changeset between backup and prod repos during blobimport
Summary: It is possible that hash of newly created bonsai_changeset will be different from what is in prod repo. In this case let's fetch bonsai from the prod, to make backup repo consistent with prod.

Reviewed By: StanislavGlebik

Differential Revision: D24593003

fbshipit-source-id: 70496c59927dae190a8508d67f0e3d5bf8d32e5c
2020-11-10 08:46:16 -08:00
Thomas Orozco
26e06ef1a0 mononoke/filestore: update fetch methods to return 0.3 stream
Summary:
This updates the external facing API of the filestore to use 0.3 streams.
Internally, there is still a bit of 0.1 streams, but as of this change, it's
all 0.3 outside.

This required a few changes here and there in places where it was simpler to
just update them to use 0.3 futures instead of `compat()`-ing everything.

Reviewed By: ikostia

Differential Revision: D24731298

fbshipit-source-id: 18a1dc58b27d129970a6aa2d0d23994d5c5de6aa
2020-11-06 07:26:04 -08:00
Thomas Orozco
8cad2ed3f2 mononoke/filestore: update exists() to futures 0.3
Summary: Like it says in the title.

Reviewed By: StanislavGlebik

Differential Revision: D24731300

fbshipit-source-id: b9c44fc1e4bd4cfe8655e1024a0547e40fb99424
2020-11-06 07:26:03 -08:00
Thomas Orozco
184310158b mononoke/filestore: update fetch external API to 0.3 futures
Summary:
Like it says in the title. This required quite a lot of changes at callsites,
as you'd expect.

Reviewed By: StanislavGlebik

Differential Revision: D24731299

fbshipit-source-id: e58447e88dcc3ba1ab3c951f87f7042e2b03eb2c
2020-11-06 07:26:03 -08:00
Thomas Orozco
b6949dbc26 mononoke/filestore: update store to futures 0.3
Summary: Like it says in the title. This updates `store()` and its (many) callsites.

Reviewed By: ahornby

Differential Revision: D24728658

fbshipit-source-id: 5fccf76d25e58eaf069f3f0cf5a31d2c397687ea
2020-11-06 07:26:03 -08:00
Thomas Orozco
40e8cab560 mononoke/filestore: update metadata to futures 0.3
Summary:
This updates the metadata APIs in the filestore to futures 0.3 & async / await.
This changes the external API of the filestore, so there's quite a bit of churn
outside of that module.

Reviewed By: markbt

Differential Revision: D24727255

fbshipit-source-id: 59833f185abd6ab9c609c6bcc22ca88ada6f1b42
2020-11-06 07:26:03 -08:00
Lukas Piatkowski
3c3de9e954 rust-shed/futures_01_ext: rename futures_ext to futures_01_ext
Summary: As part of the effort to deprecate futures 0.1 in favor of 0.3 I want to create a new futures_ext crate that will contain some of the extensions that are applicable from the futures_01_ext. But first I need to reclaim this crate name by renaming the old futures_ext crate. This will also make it easier to track which parts of the codebase still use the old futures.

Reviewed By: farnz

Differential Revision: D24725776

fbshipit-source-id: 3574d2a0790f8212f6fad4106655cd41836ff74d
2020-11-05 06:07:16 -08:00
Egor Tkachenko
a4d5c2c172 Remove old future from bonsai_generation
Summary: Into the bright new future

Reviewed By: farnz

Differential Revision: D24715795

fbshipit-source-id: 8e0b9df136373c99de77809db31f3e6847507704
2020-11-04 05:20:38 -08:00
Simon Farnsworth
93c92dae38 Remove old futures from test fixtures
Summary: We're getting rid of old futures - remove them as a dep here

Reviewed By: StanislavGlebik

Differential Revision: D24705787

fbshipit-source-id: 83ae938be0c9f7f485c74d3e26d041e844e94a43
2020-11-04 02:05:52 -08:00
Alex Hornby
89ace3790f mononoke: extend MononokeApp so admin apps can have a special default put behaviour
Summary:
Extend MononokeApp so admin apps can have a special default put behaviour (typically
 overwrite) vs the soon to be new default of IfAbsent

Use it from the admin tools.

Reviewed By: farnz

Differential Revision: D24623094

fbshipit-source-id: 5709c68429f8e1de0535eec132998d20411fc0e6
2020-10-29 16:07:22 -07:00
Simon Farnsworth
4e59e26775 Thread ConfigStore into blobstore creation
Summary: SQLBlob GC (next diff in stack) will need a ConfigStore in SQLBlob. Make one available to blobstore creation

Reviewed By: krallin

Differential Revision: D24460586

fbshipit-source-id: ea2d5149e0c548844f1fd2a0d241ed0647e137ae
2020-10-27 04:14:24 -07:00
Pavel Aslanov
23fc168668 convert ManifestOps to new style futures
Summary:
- convert ManifestOps to new style futures
- at this point `//eden/manifest:manifest` crate is completely free from old style futures

Reviewed By: krallin

Differential Revision: D24502214

fbshipit-source-id: f1cdb11bd8234f22af5c905243f71e1e9fca11f1
2020-10-23 06:42:35 -07:00
Thomas Orozco
0b083a74b1 mononoke/blobrepo_hg: optimize case conflict check performance
Summary:
Our case conflict checking is very inefficient on large changesets. The root
cause is that we traverse the parent manifest for every single file we are
modifying in the new changeset.

This results in very poor performance on large changes since we end up
reparsing manifests and doing case comparisons a lot more than we should. In
some pathological cases, it results in us taking several *minutes* to do a case
conflict check, with all of that time being spent on CPU lower-casing strings
and deserializing manifests.

This is actually a step we do after having uploaded all the data for a commit,
so this is pure overhead that is being added to the push process (but note it's
not part of the pushrebase critical section).

I ended up looking at this issue because it is contributing to the high
latencies we are seeing in commit cloud right now. Some of the bundles I
checked had 300+ seconds of on-CPU time being spent to check for case
conflicts. The hope is that with this change, we'll get fewer pathological
cases, and might be able to root cause remaining instances of latency (or have
that finally fixed).

This is pretty easy to repro.

I added a binary that runs case conflict checks on an arbitrary commit, and
tested it on `38c845c90d59ba65e7954be001c1eda1eb76a87d` (a commit that I noted
was slow to ingest in commit cloud, despite all its data being present already,
meaning it was basically a no-op). The old code takes ~3 minutes. The new one
takes a second.

I also backtested this by rigging up the hook tailer to do case conflict checks
instead (P145550763). It is about the same speed for most commits (perhaps
marginally slower on some, but we're talking microseconds here), but for some
pathological commits, it is indeed much faster.

This notably revealed one interesting case:

473b6e21e910fcdf7338df66ee0cbeb4b8d311989385745151fa7ac38d1b46ef (~8K files)
took 118329us in the new code (~0.1s), and 86676677us in the old (~87 seconds).

There are also commits with more files in recent history, but they're
deletions, so they are just as fast in both (< 0.1 s).
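
To illustrate why avoiding a per-file traversal helps, here is a minimal sketch that lowercases every known path once and then checks each added path against that index. This is an assumption about the general technique, not the code in this diff: `find_case_conflict` is a hypothetical function, it works on flat path strings instead of manifests, and it only compares whole paths (it does not detect conflicts between directory names).

```rust
use std::collections::HashMap;

/// Returns the first pair `(added_path, existing_path)` that differs only by case.
/// `existing` stands for paths already in the parent, `added` for paths the
/// new changeset introduces.
pub fn find_case_conflict<'a>(
    existing: &[&'a str],
    added: &[&'a str],
) -> Option<(&'a str, &'a str)> {
    // Lowercase every known path once, instead of rescanning for each added file.
    let mut seen: HashMap<String, &'a str> = HashMap::new();
    for &path in existing {
        seen.entry(path.to_lowercase()).or_insert(path);
    }
    for &path in added {
        let key = path.to_lowercase();
        let previous = seen.get(&key).copied();
        match previous {
            // Same lowercase form but different spelling: a case conflict.
            Some(other) if other != path => return Some((path, other)),
            // Exactly the same path (for example a modification) is fine.
            Some(_) => {}
            // First time we see this path: remember it for later additions.
            None => {
                seen.insert(key, path);
            }
        }
    }
    None
}
```

Each path is lowercased once and looked up in a hash map, so the work grows roughly linearly with the number of paths rather than with the product of added files and manifest size.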

Reviewed By: StanislavGlebik

Differential Revision: D24305563

fbshipit-source-id: eb548b54be14a846554fdf4c3194da8b8a466afe
2020-10-15 09:49:39 -07:00
Thomas Orozco
1dc25648bf mononoke/types: indicate what path conflicted in a case conflict
Summary:
I'm reworking some of our case conflict handling, and as part of this, I'm
going to be using check_case_conflicts for all our checking of case conflicts,
and notably for the case where we introduce a new commit and check it against
its parent (which, right now, does not check for case conflicts).

To do this and provide a good user experience (i.e. indicate which files
conflicted and with what), I need `check_case_conflicts` to report what files
the change conflicts with. This is what this diff does.

This does mean a few more allocations, so I "paid those off" by updating our
case lowering to allocate one fewer Vec and one fewer String per MPathElement
being lowercased.

Reviewed By: StanislavGlebik

Differential Revision: D24305562

fbshipit-source-id: 8ac14466ba3e84a3ee3d9216a84c2d9125a51b86
2020-10-15 09:49:39 -07:00
Thomas Orozco
c7478113a3 mononoke/mercurial_types: get rid of HgManifest & HgEntry
Summary:
This trait is no longer used all that much outside of a handful of tests, the
walker, and an admin subcommand, as it has been replaced by the `Manifest`
trait, which works over all kinds of Manifests, and has stronger typing (its
sub-entries always have a path, and they are wrapped in an enum that knows if
they're leaves or trees).

This left a bunch of old legacy code here or there, which is worth removing
to make sure we don't introduce any new callsites to this. Another motivation
is that this legacy code is often not very compatible with new code, and has
historically made it a bit tricky (everything owns a blobstore in this code,
which is pretty awkward and not at all how we do things nowadays).

There is, I think, a bit more potential here since we could also perhaps try to
remove the `HgBlobEntry` struct, but that still has callsites, so I'm not
doing this here.

Reviewed By: StanislavGlebik

Differential Revision: D24306946

fbshipit-source-id: 8a73dbbf40a904ce19ac65d791b732091c206263
2020-10-15 04:56:13 -07:00
Alex Hornby
fb1d4515df mononoke: update Memblob::new callsites to ::default()
Summary: Update Memblob::new callsites to ::default() in preparation for adding arguments to ::new() to specify the put behaviour desired
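
A sketch of where this could be heading, under stated assumptions: `Memblob` below is a hypothetical in-memory stand-in, the `PutBehaviour` variants are taken from the wording of other commits in this log, and which behaviour `default()` selects is a guess.

```rust
use std::collections::HashMap;

/// Put behaviours; the exact real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PutBehaviour {
    Overwrite,
    IfAbsent,
}

/// Hypothetical in-memory blobstore standing in for `Memblob`.
pub struct Memblob {
    put_behaviour: PutBehaviour,
    blobs: HashMap<String, Vec<u8>>,
}

impl Memblob {
    /// `new` takes the desired put behaviour explicitly.
    pub fn new(put_behaviour: PutBehaviour) -> Self {
        Memblob {
            put_behaviour,
            blobs: HashMap::new(),
        }
    }

    pub fn put(&mut self, key: &str, value: Vec<u8>) {
        match self.put_behaviour {
            // Always replace whatever was stored under the key.
            PutBehaviour::Overwrite => {
                self.blobs.insert(key.to_string(), value);
            }
            // Keep the existing value if the key is already present.
            PutBehaviour::IfAbsent => {
                self.blobs.entry(key.to_string()).or_insert(value);
            }
        }
    }

    pub fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.blobs.get(key)
    }
}

impl Default for Memblob {
    /// Callsites that do not care use `Memblob::default()`.
    /// Choosing `Overwrite` here is an assumption.
    fn default() -> Self {
        Memblob::new(PutBehaviour::Overwrite)
    }
}
```

Moving callsites to `::default()` first means a later signature change to `::new()` only touches the places that actually want a specific behaviour.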

Differential Revision: D24021173

fbshipit-source-id: 07bf4e6c576ba85c9fa0374d5aac57a533132448
2020-10-07 12:11:10 -07:00
Alex Hornby
409a9da79d mononoke: remove assert_present from Blobstore trait
Summary:
Remove assert_present from Blobstore trait as it had only one callsite other than the various blobstore layers/impls.

Replaced that one last call in repo_commit.rs/assert_in_blobstore() with an equivalent call to is_present.
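
The replacement can be sketched roughly as follows. This is a synchronous simplification (the real `Blobstore` trait is async), `MemoryStore` is a hypothetical test double, and the error type and message are assumptions.

```rust
use std::collections::HashSet;

/// Simplified, synchronous stand-in for the `Blobstore` trait.
pub trait Blobstore {
    fn is_present(&self, key: &str) -> bool;
}

/// Hypothetical in-memory store used only for this illustration.
pub struct MemoryStore {
    keys: HashSet<String>,
}

impl MemoryStore {
    pub fn with_keys(keys: &[&str]) -> Self {
        MemoryStore {
            keys: keys.iter().map(|key| key.to_string()).collect(),
        }
    }
}

impl Blobstore for MemoryStore {
    fn is_present(&self, key: &str) -> bool {
        self.keys.contains(key)
    }
}

/// What `assert_present` offered, rebuilt on top of `is_present` at the callsite.
pub fn assert_in_blobstore(store: &impl Blobstore, key: &str) -> Result<(), String> {
    if store.is_present(key) {
        Ok(())
    } else {
        Err(format!("key {} is missing from the blobstore", key))
    }
}
```

The trait gets one method smaller, and every blobstore layer has one fewer method to forward.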

Reviewed By: farnz

Differential Revision: D24016927

fbshipit-source-id: 764fddbebeb4b1192d196078b8824cf8a08e9691
2020-10-01 01:23:52 -07:00
Pavel Aslanov
463acc581d use derived data infra to derive mercurial changesets
Summary:
This completely converts mercurial changeset to be an instance of derived data:
 - Custom lease logic is removed
 - Custom changeset traversal logic is removed

The naming scheme of keys for leases has been changed to conform with other derived data types. This might cause a temporary spike in CPU usage during rollout.

Reviewed By: farnz

Differential Revision: D23575777

fbshipit-source-id: 8eb878b2b0a57312c69f865f4c5395d98df7141c
2020-09-11 07:23:11 -07:00
Pavel Aslanov
f87db3eecf move existing changeset derivation logic to mercurial_derived_data
Summary:
This change moves the logic associated with mercurial changeset derivation to the `mercurial_derived_data` crate.

NOTE: it is not converted to derived data infrastructure at this point, it is a preparation step to actually do this

Reviewed By: farnz

Differential Revision: D23573610

fbshipit-source-id: 6e8cbf7d53ab5dbd39d5bf5e06c3f0fc5a8305c8
2020-09-09 07:56:32 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
Pavel Aslanov
32e162c197 move function used by mercurial_derived_data into a separate crate
Summary: Moving some of the functionality (which is required for mercurial changeset derivation) into a separate crate. This is required to convert mercurial changeset to derived data to avoid circular dependency it would create otherwise.

Reviewed By: StanislavGlebik

Differential Revision: D23566293

fbshipit-source-id: 9d30b4b3b7d8a922f72551aa5118c43104ef382c
2020-09-09 02:48:09 -07:00
David Tolnay
be0786f14b Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: StanislavGlebik

Differential Revision: D23568780

fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
2020-09-08 07:33:16 -07:00
Stanislau Hlebik
7b323a4fd9 mononoke: add log-only mode in redaction
Summary:
Before redacting something it would be good to check that this file is not
accessed by anything. Having log-only mode would help with that.

Reviewed By: ikostia

Differential Revision: D23503666

fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4
2020-09-04 07:37:15 -07:00
Stanislau Hlebik
0740f99f13 mononoke: allow logging censored scuba accesses to file
Summary:
In the next diff I'm going to add log-only mode to redaction, and it would be
good to have a way of testing it (i.e. testing that it actually logs accesses
to bad keys).

In this diff let's use a config option that allows logging censored scuba
accesses to file, and let's update redaction integration test to use it

Reviewed By: ikostia

Differential Revision: D23537797

fbshipit-source-id: 69af2f05b86bdc0ff6145979f211ddd4f43142d2
2020-09-04 07:37:14 -07:00
Stefan Filip
310b3616a6 blobrepo: instantiate segmented changelog as an attribute
Summary:
Segmented Changelog is a component that has multiple components of its own
that can each be configured in different ways. It seems that it already is
more complicated than other components in how it is set up and it will probably
evolve to have more knobs (caching comes to mind).

Right now we have 3 ways of instantiating SegmentedChangelog:
- Disabled, all requests return errors
- ReadOnly, requests to unprocessed commits return errors
- OnDemandUpdate, requests trigger commit processing when required
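
The three modes listed above could be modeled roughly like this. It is a toy model under loud assumptions: the enum and struct shapes are hypothetical, commits are plain `u64` ids, and "processing" a commit just means remembering its id.

```rust
use std::collections::HashSet;

/// Hypothetical model of the three ways to instantiate SegmentedChangelog.
pub enum SegmentedChangelogMode {
    Disabled,
    ReadOnly,
    OnDemandUpdate,
}

/// Toy stand-in; the real component holds an IdDag and an IdMap.
pub struct SegmentedChangelog {
    mode: SegmentedChangelogMode,
    processed: HashSet<u64>,
}

impl SegmentedChangelog {
    pub fn new(mode: SegmentedChangelogMode, processed: &[u64]) -> Self {
        SegmentedChangelog {
            mode,
            processed: processed.iter().copied().collect(),
        }
    }

    /// Returns `Ok(())` if a request about `commit` can be served in this mode.
    pub fn lookup(&mut self, commit: u64) -> Result<(), String> {
        match self.mode {
            // Disabled: all requests return errors.
            SegmentedChangelogMode::Disabled => {
                Err("segmented changelog is disabled".to_string())
            }
            // ReadOnly: requests to unprocessed commits return errors.
            SegmentedChangelogMode::ReadOnly => {
                if self.processed.contains(&commit) {
                    Ok(())
                } else {
                    Err(format!("commit {} has not been processed", commit))
                }
            }
            // OnDemandUpdate: process the commit when required, then serve it.
            SegmentedChangelogMode::OnDemandUpdate => {
                self.processed.insert(commit);
                Ok(())
            }
        }
    }

    pub fn is_processed(&self, commit: u64) -> bool {
        self.processed.contains(&commit)
    }
}
```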

Reviewed By: aslpavel

Differential Revision: D23456217

fbshipit-source-id: a6016f05197abbc3722764fa8e9056190a767b36
2020-09-02 17:20:42 -07:00
Stefan Filip
e57b1f9265 segmented_changelog: add on-demand updating dag implementation
Summary:
The Segmented Changelog must be built somewhere. One of the simplest deployments
involves the on-demand update of the graph. When a commit that wasn't yet
processed is encountered, we send it to processing along with all of its
ancestors.

At this time not much attention was paid to the distinction of master commit
versus non-master commit. For now the expectation is that only commits from
master will exercise this code path. The current expectation is that clients
will only call location-to-hash using commits from master.
Let me know if there is an easy way to check if a commit is part of master.
Later changes will invest more in handling non-master commits.

Reviewed By: aslpavel

Differential Revision: D23456218

fbshipit-source-id: 28c70f589cdd13d08b83928c1968372b758c81ad
2020-09-02 17:20:42 -07:00
Stefan Filip
10b233f180 blobrepo: move ChangesetFetcher to attributes
Summary:
I am planning to add Segmented Changelog to attributes.

I am writing an integration test for an EdenApi endpoint that depends on
Segmented Changelog and I would like to set it up to update on demand. When a
request comes in for a commit that we haven't parsed for Segmented Changelog we
want to update the structure on demand. This means that we probably need to
fetch commits. This means that we want to pass the ChangesetFetcher to Segmented
Changelog when it is built. Since Segmented Changelog fits well as an attribute
we want the ChangesetFetcher as an attribute.

I wonder how much thought has been given to attributes behaving as a dependency
injector in the `guice` sense.

Reviewed By: aslpavel

Differential Revision: D23428201

fbshipit-source-id: 7003c018ba806fd657dd8f071e0e83d35058b10f
2020-09-02 17:20:41 -07:00
Egor Tkachenko
7fd2f22cc0 Fix bug with zero hash manifest
Summary:
If the imported commit has a manifest id with all zeros (empty commit), the Blobimport job can't find it in the blobstore and returns an error (D23266254).
Add an early return when the manifest_id is NULL_HASH.

Reviewed By: StanislavGlebik

Differential Revision: D23266254

fbshipit-source-id: b8a3c47edfdfdc9d8cc8ea032fb96e27a04ef911
2020-08-24 07:34:29 -07:00
Stanislau Hlebik
e308419b58 RFC mononoke: limit number of filenodes get_all_filenodes_maybe_stale
Summary:
In a repository with files with large histories we run into a lot of SqlTimeout
errors while fetching file history to serve getpack calls. However fetching the
whole file history is not really necessary - client knows how to work with
partial history i.e. if client misses some portion of history then it would
just fetch it on demand.

This diff adds a way to limit how many entries are going to be fetched; if more
entries are fetched, we return FilenodeRangeResult::TooBig. The downside of
this diff is that we'd have to do more sequential database queries.
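
A minimal sketch of the limit check, assuming an in-memory slice in place of the database. `FilenodeRangeResult::TooBig` is named in the summary; the `Present` variant, the function signature, and the trick of fetching one extra entry to detect overflow are assumptions made for illustration.

```rust
/// Simplified stand-in for the result type named in the summary.
#[derive(Debug, PartialEq)]
pub enum FilenodeRangeResult<T> {
    Present(T),
    TooBig,
}

/// Fetches the history, giving up once it exceeds `limit` entries.
/// `history` stands in for what a database query would return.
pub fn get_all_filenodes_maybe_stale<T: Clone>(
    history: &[T],
    limit: Option<usize>,
) -> FilenodeRangeResult<Vec<T>> {
    let limit = match limit {
        Some(limit) => limit,
        // No limit configured: return everything.
        None => return FilenodeRangeResult::Present(history.to_vec()),
    };
    // Fetch one entry more than allowed so that overflow can be detected
    // without reading the whole history.
    let fetched: Vec<T> = history
        .iter()
        .take(limit.saturating_add(1))
        .cloned()
        .collect();
    if fetched.len() > limit {
        FilenodeRangeResult::TooBig
    } else {
        FilenodeRangeResult::Present(fetched)
    }
}
```

On `TooBig`, the caller can serve partial history and let the client fetch the remainder on demand, as described above.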

Reviewed By: krallin

Differential Revision: D23025249

fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
2020-08-12 14:33:43 -07:00
Stanislau Hlebik
43ac2a1c62 mononoke: use WarmBookmarkCache in repo_client
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception in point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.

Reviewed By: krallin

Differential Revision: D22820879

fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
2020-07-31 03:09:24 -07:00
Alex Hornby
ecb58ff8d7 mononoke: add cmdlib argument to control cachelib zstd compression
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempt compression when putting to the cache if the object is larger than the cachelib max size.

To make the cache behaviour more testable, this change also adds an option to do an eager put to cache without the spawn. The default remains to do a lazy fire and forget put into the cache with tokio::spawn.

The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).

Reviewed By: StanislavGlebik

Differential Revision: D22797872

fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
2020-07-31 01:12:02 -07:00
Mark Thomas
fb5fdb9c15 bookmarks: remove repo_id from Bookmarks methods
Summary:
Remove the `repo_id` parameter from the `Bookmarks` trait methods.

The `repo_id` parameter was intended to allow a single `Bookmarks` implementation
to serve multiple repos.  In practise, however, each repo has its own config, which
results in a separate `Bookmarks` instance for each repo.  The `repo_id` parameter
complicates the API and provides no benefit.

To make this work, we switch to the `Builder` pattern for `SqlBookmarks`, which
allows us to inject the `repo_id` at construction time.  In fact nothing here
prevents us from adding back-end sharing later on, as these `SqlBookmarks` objects
are free to share data in their implementation.
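
A rough sketch of the shape this gives the API. Everything here is a simplified assumption: the trait has only two toy methods, storage is an in-memory map rather than SQL, and the builder method name `with_repo_id` is illustrative.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RepositoryId(pub i32);

/// Trait methods no longer take a `repo_id`.
pub trait Bookmarks {
    fn set(&mut self, name: &str, changeset: &str);
    fn get(&self, name: &str) -> Option<String>;
}

/// Hypothetical builder: configured first, bound to one repo last.
pub struct SqlBookmarksBuilder {
    table: String,
}

impl SqlBookmarksBuilder {
    pub fn new(table: &str) -> Self {
        SqlBookmarksBuilder {
            table: table.to_string(),
        }
    }

    /// Injects the `repo_id` at construction time.
    pub fn with_repo_id(self, repo_id: RepositoryId) -> SqlBookmarks {
        SqlBookmarks {
            table: self.table,
            repo_id,
            rows: HashMap::new(),
        }
    }
}

/// One instance per repo; the repo is part of the object, not of each call.
pub struct SqlBookmarks {
    table: String,
    repo_id: RepositoryId,
    rows: HashMap<String, String>,
}

impl SqlBookmarks {
    pub fn repo_id(&self) -> RepositoryId {
        self.repo_id
    }

    pub fn table(&self) -> &str {
        &self.table
    }
}

impl Bookmarks for SqlBookmarks {
    fn set(&mut self, name: &str, changeset: &str) {
        self.rows.insert(name.to_string(), changeset.to_string());
    }

    fn get(&self, name: &str) -> Option<String> {
        self.rows.get(name).cloned()
    }
}
```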

Reviewed By: StanislavGlebik

Differential Revision: D22437089

fbshipit-source-id: d20e08ce6313108b74912683c620d25d6bf7ca01
2020-07-10 04:50:25 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Thomas Orozco
07907b2b26 mononoke/virtually_sharded_blobstore: merge in the context_concurrency_blobstore
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:

- You don't want the rate limiting to be above caching (in the blobstore stack,
  that is), because you shouldn't rate limit cache hits (this is where we are
  today).
- You don't want the rate limiting to be below deduplication, because then you get
  priority inversion where a low-priority rate-limited request might hold the
  semaphore while a higher-priority, non rate limited request wants to do the
  same fetch (we could have moved rate limiting here prior to introducing
  deduplication, but I didn't do it earlier because I wanted to eventually
  introduce deduplication).

So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!

Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).

The design here when accessing the blobstore is as follows:

- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
  data.
- Otherwise, check if we are rate limited.

Then, if we are rate limited:

- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
  waiting).
    - If the data is there, then return our rate limit token.
    - If the data isn't there, then proceed to query the blobstore.

If we aren't rate limited, then we just proceed to query the blobstore.

There are a couple subtle aspects of this:

- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
  we'll have waited but we won't need to query the blobstore.
    - This is important when a large number of requests for the same key
      arrive at the same time and get rate limited. If we don't do this second
      cache check or if we don't return the token, then we'll consume a rate
      limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
  regard to semaphores (i.e. release early), but like a miss with regard to
  rate limits (i.e. wait).

Both of those are captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or wait
for it on a cache miss.

(all of this logic is captured in unit tests, we can remove any of the blocks
there in `Shards::acquire` and a test will fail)
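
The access design described above can be traced with a small single-threaded model. This is only a sketch of the decision flow, not the code in `Shards::acquire`: there is no real semaphore, no async, and no `Ticket` type. The semaphore is a boolean flag, the rate limit token is a counter, and the hypothetical `while_waiting` closure stands for whatever other requests do to the cache while this one waits for its turn.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Data came from the cache; the semaphore has been released.
    Hit(Vec<u8>),
    /// The caller must query the blobstore; the semaphore is still held.
    Miss,
}

pub struct Shard {
    cache: HashMap<String, Vec<u8>>,
    rate_limited: bool,
    semaphore_held: bool,
    tokens_consumed: u32,
}

impl Shard {
    pub fn new(rate_limited: bool) -> Self {
        Shard {
            cache: HashMap::new(),
            rate_limited,
            semaphore_held: false,
            tokens_consumed: 0,
        }
    }

    pub fn acquire(
        &mut self,
        key: &str,
        while_waiting: impl FnOnce(&mut HashMap<String, Vec<u8>>),
    ) -> Outcome {
        // Get the semaphore, then check the cache.
        self.semaphore_held = true;
        if let Some(data) = self.cache.get(key).cloned() {
            self.semaphore_held = false;
            return Outcome::Hit(data);
        }
        if self.rate_limited {
            // Release the semaphore and wait for our turn (takes a token).
            self.semaphore_held = false;
            self.tokens_consumed += 1;
            while_waiting(&mut self.cache);
            // Acquire the semaphore again and re-check the cache.
            self.semaphore_held = true;
            if let Some(data) = self.cache.get(key).cloned() {
                // Late hit: hand the rate limit token back.
                self.tokens_consumed -= 1;
                self.semaphore_held = false;
                return Outcome::Hit(data);
            }
        }
        Outcome::Miss
    }

    pub fn tokens_consumed(&self) -> u32 {
        self.tokens_consumed
    }

    pub fn semaphore_held(&self) -> bool {
        self.semaphore_held
    }
}
```

In this model a late hit leaves `tokens_consumed` unchanged, which mirrors the point about not spending one token per request when many requests for the same key arrive together.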

Reviewed By: farnz

Differential Revision: D22374606

fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
2020-07-06 04:38:31 -07:00
Stanislau Hlebik
2cfc23770c mononoke: use override_blame_filesize_limit option
Summary: This diff actually starts to use the option

Reviewed By: krallin

Differential Revision: D22373943

fbshipit-source-id: fe23da9c3daa1f9f91a5ee5e368b33e0091aa9c1
2020-07-03 09:58:46 -07:00
Thomas Orozco
be1bac6c06 mononoke/virtually_sharded_blobstore: expose this in cmdlib
Summary:
Eventually, I plan to make this the default, but for now I'd like to make it
something we can choose to turn on or off as a cmd argument (so we can start
with the experimental tier and Fastreplay).

Note that this mixes volatile vs. non-volatile pools when accessing the pools
for cacheblob. In practice, those pools are actually volatile, it's just that
things don't break if you access them as non-volatile.

Reviewed By: farnz

Differential Revision: D22356537

fbshipit-source-id: 53071b6b21ca5727d422e10f685061c709114ae7
2020-07-03 05:53:11 -07:00