Commit Graph

231 Commits

Author SHA1 Message Date
Mark Juggurnauth-Thomas
33ec4db653 bounded_traversal: require futures to be boxed
Summary:
Bounded traversal's internal book-keeping moves the futures returned from fold and unfold callbacks around while they are being queued to be scheduled.  If these futures are large, then this can result in a significant portion of bounded traversal's CPU time being spent on `memcpy`ing these futures around.

This can be prevented by always boxing the futures that are returned to bounded traversal.  Make this a requirement by changing the type from `impl Future<...>` to `BoxFuture<...>`.

Reviewed By: mitrandir77

Differential Revision: D26997706

fbshipit-source-id: 23a3583adc23c4e7d3607a78e82fc9d1056691c3
2021-03-12 08:12:57 -08:00
Mark Juggurnauth-Thomas
3da74d14bd deleted_files_manifest: use SortedVectorMap
Summary:
Deleted files manifests are immutable, and their maps are stored in sorted
order, so we can use `SortedVectorMap` to load them more efficiently.

During derivation, maps are constructed out of order, so BTreeMap continues
to be used there.

Reviewed By: mitrandir77

Differential Revision: D26275362

fbshipit-source-id: 0d159f4f53ff3ec345ac9bd34a70018e151d4e37
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
91358f3716 mononoke_types: use SortedVectorMap for BonsaiChangeset
Summary:
BonsaiChangesets are rarely mutated, and their maps are stored in sorted order,
so we can use `SortedVectorMap` to load them more efficiently.

In the cases where mutable maps of filechanges are needed, we can use `BTreeMap`
during the mutation and then convert them to `SortedVectorMap` to store them.

Reviewed By: mitrandir77

Differential Revision: D25615279

fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
fc9b91729c unodes: use SortedVectorMap
Summary:
Unodes are immutable, and their maps are stored in sorted order, so we can
use `SortedVectorMap` to load them more efficiently.

Reviewed By: mitrandir77

Differential Revision: D25615276

fbshipit-source-id: ff8c93e8cde50aa7d3565550c0eade38b67017ca
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
392aafe6b7 changeset_info: use SortedVectorMap
Summary:
ChangesetInfo is immutable and its maps are stored in sorted order, so we
can use `SortedVectorMap` to load them more efficiently.

Reviewed By: mitrandir77

Differential Revision: D25615278

fbshipit-source-id: 2078adeddc8bd049985fb4efafe3090858a1bc0a
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
79c2c8996b mononoke_types: use SortedVectorMap for thrift maps
Summary:
Thrift maps are stored in sorted order, so we can use `SortedVectorMap` to
load them more efficiently.

Reviewed By: mitrandir77

Differential Revision: D25615277

fbshipit-source-id: 509a9d63df6a6a793a4b0b2935e9ed86e7c6eaaf
2021-03-11 04:28:42 -08:00
Mark Juggurnauth-Thomas
eb4d31cc82 benchmark: rename to benchmarks/simulated_repo
Summary: Rename this benchmark to a specific name so that we can add new benchmarks.

Differential Revision: D26947362

fbshipit-source-id: a1d060ee79781aa0ead51f284517471431418659
2021-03-11 04:28:42 -08:00
Stanislau Hlebik
acf26beb79 mononoke: add an option to record committer fields in hg extras
Summary:
When generating hg changesets from git changesets we can run into a situation
where two git commits are almost completely identical with the exception of
committer field. Currently it results in us generating identical mercurial
changesets for different bonsai changesets, and this is something we do not
allow.

Let's do the same thing as hggit does i.e. record committer field in extras if
committer is different from author.

In case of ambiguities (i.e. committer extra is already set, or committer date is set but committer name is not) I opted for returning an error instead of trying to guess what the correct field should be.

Reviewed By: farnz

Differential Revision: D26867740

fbshipit-source-id: 6271e2f7ad421bec296c60ff211723c2162878c6
2021-03-10 23:41:28 -08:00
Andrey Chursin
0be8e8ce29 vfs: introduce AsyncVfs
Summary:
AsyncVfs provides async vfs interface.
It will be used in the native checkout instead of current use case that spawns blocking tokio tasks for VFS action

Reviewed By: quark-zju

Differential Revision: D26801250

fbshipit-source-id: bb26c4fc8acac82f4b55bb3f2f3964a6d0b64014
2021-03-05 21:47:51 -08:00
Thomas Orozco
2a803fc10d third-party/rust: update futures
Summary:
Those newer versions of Futures have compatibility improvements with Tokio,
notably:

- https://github.com/rust-lang/futures-rs/pull/2333
- https://github.com/rust-lang/futures-rs/pull/2358

Reviewed By: farnz

Differential Revision: D26778794

fbshipit-source-id: 5a9dc002083e5edfa5c614d8d2242e586a93fcf6
2021-03-04 06:42:55 -08:00
Thomas Orozco
ef7045e818 common/rust: use fbinit-tokio
Summary:
This diffs add a layer of indirection between fbinit and tokio, thus allowing
us to use fbinit with tokio 0.2 or tokio 1.x.

The way this works is that you specify the Tokio you want by adding it as an
extra dependency alongside `fbinit` in your `TARGETS` (before this, you had to
always include `tokio-02`).

If you use `fbinit-tokio`, then `#[fbinit::main]` and `#[fbinit::test]` get you
a Tokio 1.x runtime, whereas if you use `fbinit-tokio-02`, you get a Tokio 0.2
runtime.

This diff is big, because it needs to change all the TARGETS that reference
this in the same diff that introduces the mechanism. I also didn't produce it
by hand.

Instead, I scripted the transformation using this script: P242773846

I then ran it using:

```
{ hg grep -l "fbinit::test"; hg grep -l "fbinit::main"  } | \
  sort | \
  uniq | \
  xargs ~/codemod/codemod.py \
&&  yes | arc lint \
&& common/rust/cargo_from_buck/bin/autocargo
```

Finally, I grabbed the files returned by `hg grep`, then fed them to:

```
arc lint-rust --paths-from ~/files2 --apply-patches --take RUSTFIXDEPS
```

(I had to modify the file list a bit: notably I removed stuff from scripts/ because
some of that causes Buck to crash when running lint-rust, and I also had to add
fbcode/ as a prefix everywhere).

Reviewed By: mitrandir77

Differential Revision: D26754757

fbshipit-source-id: 326b1c4efc9a57ea89db9b1d390677bcd2ab985e
2021-03-03 04:09:15 -08:00
Stanislau Hlebik
c8108e6f1e mononoke: log derivation error to derived data scuba table
Summary:
Previously we were just logging whether derivation was successful or not.
Let's start logging the error as well.

Reviewed By: krallin

Differential Revision: D26750165

fbshipit-source-id: 03983cfa7bfbd5aeeed27ffac5a7df96c05e0967
2021-03-02 05:38:54 -08:00
Simon Farnsworth
06009cf378 Add blobstore name information to logging
Summary:
With the upcoming rollout of C++ Manifold client, it's useful to know which blobstores are which in our Scuba log, `mononoke_blobstore_trace`.

Pass a debug name down to help

Reviewed By: krallin

Differential Revision: D26673162

fbshipit-source-id: e7c16ad217d8daf21565939a45ac82204459055a
2021-03-02 03:40:54 -08:00
Lukas Piatkowski
f317302b0f autocargo v1: reformating of oss-dependencies, workspace and patch sections and thrift files to match v2
Summary:
For dependencies V2 puts "version" as the first attribute of dependency or just after "package" if present.
Workspace section is after patch section in V2 and since V2 autoformats patch section then the third-party/rust/Cargo.toml manual entries had to be formatted manually since V1 takes it as it is.
The thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that have been incorrectly left when the corresponding Cargo.toml was removed.

Reviewed By: ikostia

Differential Revision: D26618363

fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92
2021-02-25 15:10:56 -08:00
Lukas Piatkowski
279eb2b538 autocargo v1: changes to how thrift-related generation is done to match v2
Summary: Done some reordering of fields in Cargo.toml, added test and doctest = false, name of the target that generated the Cargo.toml file and sorted the cratemap.

Reviewed By: ahornby

Differential Revision: D26581275

fbshipit-source-id: 4c363369438c72d43d8ccf4799f103ff092457cc
2021-02-23 11:38:45 -08:00
Stefan Filip
84017abe21 segmented_changelog: update OnDemandUpdateDag to have smaller critical sections
Summary:
The on demand update code we have is the most basic logic that we could have.
The main problem is that it has long and redundant write locks. This change
reduces the write lock strictly to the section that has to update the in memory
IdDag.

Updating the Dag has 3 phases:
* loading the data that is required for the update;
* updating the IdMap;
* updating the IdDag;

The Dag can function well for serving requests as long as the commits involved
have been built so we want to have easy read access to both the IdMap and the
IdDag. The IdMap is a very simple structure and because it's described as an
Arc<dyn IdMap> we push the update locking logic to the storage.  The IdDag is a
complicated structure that we ask to update itself. Those functions take
mutable references. Updating the storage of the iddag to hide the complexities
of locking is more difficult. We deal with the IdDag directly by wrapping it in
a RwLock. The RwLock allows for easy read access which we expect to be the
predominant access pattern.

Updates to the dag are not completely stable so racing updates can have
conflicting results. In case of conflics one of the update processes would have
to restart. It's easier to reason about the process if we just allow one
"thread" to start an update process. The update process is locked by a sync
mutex. The "threads" that fail the race to update are asked to wait until the
ongoing update is complete. The waiters will poll on a shared future that
tracks the ongoing dag update. After the update is complete the waiters will go
back to checking if the data they have is available in the dag. It is possible
that the dag is updated in between determining that the an update is needed and
acquiring the ongoing_update lock. This is fine because the update building
process checks the state of dag before the dag and updates only what is
necessary if necessary.

Reviewed By: krallin

Differential Revision: D26508430

fbshipit-source-id: cd3bceed7e0ffb00aee64433816b5a23c0508d3c
2021-02-22 18:17:21 -08:00
Thomas Orozco
097e4ad00c mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively)
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.

We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!

Note that we still have a few transitive dependencies on Tokio 0.1:

- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests

This isn't the end of the world though, we can live with that :)

Reviewed By: ahornby

Differential Revision: D26544410

fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
2021-02-22 09:22:42 -08:00
Thomas Orozco
0734a61cb1 common/rust: remove tracing
Summary:
This was a thing that was only ever used in Mononoke, and we don't think it's
usable and haven't been using it. Let's get rid of it. As-is, it won't even work
for most people due to its (indirect) dependency on Tokio 0.1.

Reviewed By: StanislavGlebik

Differential Revision: D26512243

fbshipit-source-id: faa16683f2adb20dfba43c4768486b982bc02ff9
2021-02-22 09:22:41 -08:00
Lukas Piatkowski
cd0b6d50e2 autocargo v1: changes to match autocargo v2 generation results.
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously only ignore rust_bindgen_library, so that binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.

Reviewed By: zertosh

Differential Revision: D26544150

fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
2021-02-19 11:03:55 -08:00
Lukas Piatkowski
87ddbe2f74 autocargo v1: update autocargo field format to allow transition to autocargo v2
Summary:
Autocargo V2 will use a more structured format for autocargo field
with the help of `cargo_toml` crate it will be easy to deserialize and handle
it.

Also the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might be often wrong, especially if someone tries to publish a package from fbcode, then the private facebook folders might be shipped. Lets just not set it and in the new system one will be able to set it explicitly via autocargo parameter on a rule.

Reviewed By: ahornby

Differential Revision: D26339606

fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
2021-02-12 23:28:25 -08:00
David Tolnay
1ae53c794f Update symlinks for Rust 1.50.0
Summary: Release notes: https://blog.rust-lang.org/2021/02/11/Rust-1.50.0.html

Reviewed By: jsgf, Imxset21

Differential Revision: D26363768

fbshipit-source-id: 25188531cf0a5647128cbeb469225d8dd756d0af
2021-02-12 13:54:50 -08:00
Thomas Orozco
2a21e4fb17 third-party/rust: update Tokio to 0.2.25 + add a patch to disable coop scheduling
Summary:
See the patch & motivation here:

818f943db3

Reviewed By: StanislavGlebik

Differential Revision: D26399890

fbshipit-source-id: e184a3f6c1dd03cb4cdb7ea18073c3392d7ce355
2021-02-12 04:56:23 -08:00
Mark Juggurnauth-Thomas
f27d761fa8 derived_data: limit derivation times to 2 decimal places
Summary: Limit durations in derived data logging to two decimal places.

Reviewed By: krallin

Differential Revision: D26311139

fbshipit-source-id: 0f317d9c15c8425ad0591cbe7f07063c74e081b7
2021-02-08 05:38:33 -08:00
Stanislau Hlebik
da87622777 mononoke: make bg session class tunable per-repo
Reviewed By: ahornby

Differential Revision: D26164309

fbshipit-source-id: bf489bbc75fdd9bcaeb07eb9d9f27249577a64df
2021-02-01 04:34:22 -08:00
Mark Juggurnauth-Thomas
f0eb35b86f derived_data: support gaps in derivation
Summary:
For fsnodes and skeleton manifests it should be possible to allow gaps in the
commits that are backfilled.  Any access to a commit in one of these gaps can
be quickly derived from the nearest ancestor that is derived.  Since each
commit's derived data is independent, there is no loss from omitting them.

Add the `--gap-size` option to `backfill_derived_data`.  This allows `fsnodes`,
`skeleton_manifests` and other derived data types that implement it in the
future to skip some of the commits during backfill.  This will speed up
backfilling and reduced the amount of data that is stored for infrequently
accessed historical commits.

Reviewed By: StanislavGlebik

Differential Revision: D25997854

fbshipit-source-id: bf4df3f5c16a913c13f732f6837fc4c817664e29
2021-01-29 06:36:20 -08:00
Daniel Xu
5715e58fce Add version specificiation to internal dependencies
Summary:
Lots of generated code in this diff. Only code change was in
`common/rust/cargo_from_buck/lib/cargo_generator.py`.

Path/git-only dependencies (ie `mydep = { path = "../foo/bar" }`) are not
publishable to crates.io. However, we are allowed to specify both a path/git
_and_ a version. When building locally, the path/git is chosen. When publishing,
the version on crates.io is chosen.

See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations .

Note that I understand that not all autocargo projects are published on crates.io (yet).
The point of this diff is to allow projects to slowly start getting uploaded.
The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed
without further modification.

Reviewed By: lukaspiatkowski

Differential Revision: D26028982

fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c
2021-01-25 22:10:24 -08:00
Thomas Orozco
4dd3461824 third-party/rust: update Tokio 0.2.x to 0.2.24 & futures 1.x to 1.30
Summary:
When we tried to update to Tokio 0.2.14, we hit lots of hangs. Those were due
to incompatibilities between Tokio 0.2.14 and Futures 1.29. We fixed some of
the bugs (and others had been fixed and were pending a release), and Futures
1.30 have now been released, which unblocks our update.

This diff updates Tokio accordingly (the previous diff in the stack fixes an
incompatibility).

The underlying motivation here is to ease the transition to Tokio 1.0.
Ultimately we'll be pulling in those changes one or way or another, so let's
get started on this incremental first step.

Reviewed By: farnz

Differential Revision: D25952428

fbshipit-source-id: b753195a1ffb404e0b0975eb7002d6d67ba100c2
2021-01-25 08:06:55 -08:00
Radu Szasz
5fb5d23ec8 Make tokio-0.2 include test-util feature
Summary:
This feature is useful for testing time-dependent stuff (e.g. it
allows you to stop/forward time). It's already included in the buck build.

Reviewed By: SkyterX

Differential Revision: D25946732

fbshipit-source-id: 5e7b69967a45e6deaddaac34ba78b42d2f2ad90e
2021-01-18 10:38:08 -08:00
Stanislau Hlebik
f4ba9ed498 mononoke: maybe override SessionClass for batch_derive()
Summary:
override_ctx() sets different SessionClass while data is derived. This should
reduce the number of entries we write to blobstore sync queue. See previous
diff for more motivation.

This diff uses override_ctx() for batch_derive() method

Reviewed By: krallin

Differential Revision: D25910465

fbshipit-source-id: b8a3e729c059cad5716b1b09bd2f1cc618273627
2021-01-14 13:02:43 -08:00
Stanislau Hlebik
fca761e153 mononoke: maybe override SessionClass while deriving data
Summary:
We had issues with mononoke writing too much blobstore sync queue entries while
deriving data for large commits. We've contemplated a few solutions, and
decided to give this one a go.

This approach forces derive data to use Background SessionClass which has an
effect of not writing data to blobstore sync queue if the write to blobstore
was successful (it still writes data to the queue otherwise). This should
reduce the number of entries we write to the blobstore sync queue
significantly. The downside is that writes might get a bit slower - our
assumption is that this slowdown is acceptable. If that's not the case we can
always disable this option.

This diff overrides SessionClass for normal ::derive() method. However there's
also batch_derive() - this one will be addressed in the next diff.

One thing to note - we still write derived data mapping to blobstore sync queue. That should be find as we have a constant number of writes per commits.

Reviewed By: krallin

Differential Revision: D25910464

fbshipit-source-id: 4113d00bc0efe560fd14a5d4319b743d0a100dfa
2021-01-14 13:02:42 -08:00
Evgenii Kazakov
539362bae5 mononoke/derived_data: Replace error("{0}") with error(transparent)
Summary:
In this diff I've replaced non-transparent error definition error("{0}")  with error(transparent).

The reason is non-transparent errors print the same thing as the original errors:

```
Error: failed to complete task
Caused by: other error description       <-- this duplicate shouldn't be here
Caused by: other error description
```

Reviewed By: krallin

Differential Revision: D25899411

fbshipit-source-id: e586af86b635a7e2fbf8952297171c546b859300
2021-01-13 08:01:38 -08:00
Mark Juggurnauth-Thomas
4da7dfb91f add ref-cast to thrift dependencies
Summary:
Depending on the thrift defition, `thrift_library` targets may also depend on `ref-cast`.
Add this to the `Cargo.toml`.

Reviewed By: lukaspiatkowski

Differential Revision: D25636872

fbshipit-source-id: 8263395db2bb31127528f5c66c4cc5dd9180d89f
2021-01-11 02:17:50 -08:00
Lukas Piatkowski
67b05b6c24 mononoke/integration: make integration tests work under @mode/dev-rust-oss
Summary: Last step in covering Mononoke with mode/dev-rust-oss buck build mode.

Reviewed By: markbt

Differential Revision: D25461223

fbshipit-source-id: 3fa0fa05e8c96476e069990e8f5cc6d56acf38c0
2020-12-18 06:13:32 -08:00
Thomas Orozco
db4c8fa924 mononoke/bonsai_hg_mapping: get rid of futures 0.1
Summary:
Like it says in the title. This is nice to do because we had old futures
wrapping new futures here, so this lets us get rid of a lot of cruft.

Reviewed By: ahornby

Differential Revision: D25502648

fbshipit-source-id: a34973b32880d859b25dcb6dc455c42eec4c2f94
2020-12-17 14:30:57 -08:00
Pavel Aslanov
0fc5c3aca7 convert BlobRepoHg to new type futures
Summary: Convert all BlobRepoHg methods to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25471540

fbshipit-source-id: c8e99509d39d0e081d082097cbd9dbfca431637e
2020-12-17 07:45:26 -08:00
Mark Juggurnauth-Thomas
590b84f62d changeset_info: implement batch_derive
Summary:
ChangesetInfo derivation doesn't depend on the parent ChangesetInfo being
available, so we can derive them in batches very easily.

Reviewed By: krallin

Differential Revision: D25470721

fbshipit-source-id: cc8ce305990eb6c9846158f0e9e3917cf35e169d
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
18653985b5 derived_data: add some docs describing how the traits fit together
Summary:
Add documentation comments to the derived data crate to describe how they fit
together.

Reviewed By: krallin

Differential Revision: D25432449

fbshipit-source-id: b62440bcecae900ad75d74245ce175bd9e07a894
2020-12-16 10:38:47 -08:00
Mark Juggurnauth-Thomas
f8190ccadd bookmarks_movement: allow commits to land if there are pre-existing case conflicts
Summary:
The old case-conflict checks were more lenient, and only triggered if a commit
introduced a case conflict compared to its first parent.

This means that commits could still be landed to bookmarks that already had
pre-existing case conflicts.

Relax the new case-conflict checks to allow this same scenario.

Note that we're still a bit more strict: the previous checks ignored other
parents, and would not reject a commit if the act of merging introduces a case
conflict.  The new case conflict checks only permit case conflicts in the case
where all conflicting files were present in one of the parents.

Reviewed By: StanislavGlebik

Differential Revision: D25508845

fbshipit-source-id: 95f4db1300ee73b8e6495ba8b5c1c2ce5a957d1a
2020-12-15 01:53:39 -08:00
Alex Hornby
a39070bbcd mononoke: small simpliciation for filenode derivation
Summary: Spotted this in passing.  Was able to remove a call to fetch_root_manifest_id.

Reviewed By: StanislavGlebik

Differential Revision: D25472678

fbshipit-source-id: d450cb97630464be13d22fb37c3356611dc2e1b6
2020-12-15 00:48:03 -08:00
Mark Juggurnauth-Thomas
5e549d348f derived_data: don't use default mapping in batch derivation
Summary:
Now that the mapping is separated from BonsaiDerivable, it becomes clear where
batch derivation is incorrectly using the default mapping, rather than the
mapping that has been provided for batch-derivation.

This could mean, for example, if we are backfilling a v2 of these derived
data types, we could accidentally base them on a v1 parent that we obtained
from the default mapping.

Instead, ensure that we don't use `BonsaiDerived` and thus can't use the
default mapping.

Reviewed By: krallin

Differential Revision: D25371963

fbshipit-source-id: fb71e1f1c4bd7a112d3099e0e5c5c7111d457cd2
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
e2cd66da84 derived_data_utils: fix typo
Reviewed By: quark-zju

Differential Revision: D25371969

fbshipit-source-id: bbcb3cece6b94096f328d453277b57c7f09dd398
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
db28ca7f59 derived_data_utils: ratelimit backfilling
Summary:
The backfiller may read or write to the blobstore too quickly.  Apply QPS
limits to the backfill batch context to keep the read or write rate acceptable.

Reviewed By: ahornby

Differential Revision: D25371966

fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
90dc6adb1c derived_data: buffer fetching of mapping batches
Summary:
When fetching many derived data mappings, the use of `FuturesUnordered` means
we may fetch many blobs concurrently, which may overload the blobstore.

Switch to using `buffered` to reduce the number of concurrent blob fetches.

Reviewed By: ahornby

Differential Revision: D25371965

fbshipit-source-id: 30417e86bc33defbb821f214a5520ab1b8a8c18c
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
96d8ce4ad2 derived_data: add option for parallel batch derivation
Summary:
Re-introduce parallel backfilling of changesets in a batch using `batch_derive`,
however keep it under the control of a flag, so we can enable or disable it as
necessary.

Reviewed By: ahornby

Differential Revision: D25401207

fbshipit-source-id: f9aeef3415be48fc03220c18fa547e05538ed479
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
73cdac45e3 derived_data: use new derived data configuration format
Summary:
Change derived data config to have "enabled" config and "backfilling" config.

The `Mapping` object has the responsibility of encapsulating the configuration options
for the derived data type.  Since it is only possible to obtain a `Mapping` from
appropriate configuration, ownership of a `Mapping` means derivation is permitted,
and so the `DeriveMode` enum is removed.

Most callers will use `BonsaiDerived::derive`, or a default `derived_data_utils` implementation
that requires the derived data to be enabled and configured on the repo.

Backfillers can additionally use `derived_data_utils_for_backfill` which will use the
`backfilling` configuration in preference to the default configuration.

Reviewed By: ahornby

Differential Revision: D25246317

fbshipit-source-id: 352fe6509572409bc3338dd43d157f34c73b9eac
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
05225bcf90 derived_data: make derivation options a parameter to derive_from_parents
Summary:
Currently, data derivation for types that have options (currrently unode
version and blame filesize limit) take the value of the option from the
repository configuration.

This is a side-effect, and means it's not possible to have data derivation
types with different configs active in the same repository (e.g. to
server unodes v1 while backfilling unodes v2).  To have data derivation
with different options, e.g. in tests, we must use `repo.dangerous_override`.

The first step to resolve this is to make the data derivation options a parameter.
Depending on the type of derived data, these options are passed into
`derive_from_parents` so that the right kind of derivation can happen.

The mapping is responsible for storing the options and providing it at the time
of derivation.  In this diff it just gets it from the repository config, the same
as was done previously.  In a future diff we will change this so that there
can be multiple configurations.

Reviewed By: krallin

Differential Revision: D25371967

fbshipit-source-id: 1cf4c06a4598fccbfa93367fc1f1c2fa00fd8235
2020-12-14 09:24:57 -08:00
Mark Juggurnauth-Thomas
64be2e8d87 derived_data_utils: take parameters by reference
Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.

Reviewed By: krallin

Differential Revision: D25371970

fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
2020-12-14 09:24:57 -08:00
Mark Juggurnauth-Thomas
9e1b1448e6 derived_data: split BonsaiDerived trait
Summary:
The `BonsaiDerived` trait is split in two:

* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
  a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
  mapping and config.

This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.

Reviewed By: krallin

Differential Revision: D25371964

fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
2020-12-14 09:24:57 -08:00
Stanislau Hlebik
40b67f9d09 mononoke: do not use batched in derived_data_tailer
Reviewed By: ikostia

Differential Revision: D25366938

fbshipit-source-id: 9b10853b1fdf5ee281187067aa768fa52e6d3074
2020-12-07 02:51:45 -08:00
Stefan Filip
1c310c516b derived_data: remove unused import
Summary: Fix the build.

Reviewed By: ikostia

Differential Revision: D25315064

fbshipit-source-id: 1f4da8b47eda7f9177ef0c2c6eddbdb374640472
2020-12-03 14:17:52 -08:00