Commit Graph

45 Commits

Author SHA1 Message Date
Thomas Orozco
097e4ad00c mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively)
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.

We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!

Note that we still have a few transitive dependencies on Tokio 0.1:

- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests

This isn't the end of the world though, we can live with that :)

Reviewed By: ahornby

Differential Revision: D26544410

fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
2021-02-22 09:22:42 -08:00
Stanislau Hlebik
da87622777 mononoke: make bg session class tunable per-repo
Reviewed By: ahornby

Differential Revision: D26164309

fbshipit-source-id: bf489bbc75fdd9bcaeb07eb9d9f27249577a64df
2021-02-01 04:34:22 -08:00
Mark Juggurnauth-Thomas
f0eb35b86f derived_data: support gaps in derivation
Summary:
For fsnodes and skeleton manifests it should be possible to allow gaps in the
commits that are backfilled.  Any access to a commit in one of these gaps can
be quickly derived from the nearest ancestor that is derived.  Since each
commit's derived data is independent, there is no loss from omitting them.

Add the `--gap-size` option to `backfill_derived_data`.  This allows `fsnodes`,
`skeleton_manifests` and other derived data types that implement it in the
future to skip some of the commits during backfill.  This will speed up
backfilling and reduced the amount of data that is stored for infrequently
accessed historical commits.

Reviewed By: StanislavGlebik

Differential Revision: D25997854

fbshipit-source-id: bf4df3f5c16a913c13f732f6837fc4c817664e29
2021-01-29 06:36:20 -08:00
Stanislau Hlebik
f4ba9ed498 mononoke: maybe override SessionClass for batch_derive()
Summary:
override_ctx() sets different SessionClass while data is derived. This should
reduce the number of entries we write to blobstore sync queue. See previous
diff for more motivation.

This diff uses override_ctx() for batch_derive() method

Reviewed By: krallin

Differential Revision: D25910465

fbshipit-source-id: b8a3e729c059cad5716b1b09bd2f1cc618273627
2021-01-14 13:02:43 -08:00
Mark Juggurnauth-Thomas
5e549d348f derived_data: don't use default mapping in batch derivation
Summary:
Now that the mapping is separated from BonsaiDerivable, it becomes clear where
batch derivation is incorrectly using the default mapping, rather than the
mapping that has been provided for batch-derivation.

This could mean, for example, if we are backfilling a v2 of these derived
data types, we could accidentally base them on a v1 parent that we obtained
from the default mapping.

Instead, ensure that we don't use `BonsaiDerived` and thus can't use the
default mapping.

Reviewed By: krallin

Differential Revision: D25371963

fbshipit-source-id: fb71e1f1c4bd7a112d3099e0e5c5c7111d457cd2
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
e2cd66da84 derived_data_utils: fix typo
Reviewed By: quark-zju

Differential Revision: D25371969

fbshipit-source-id: bbcb3cece6b94096f328d453277b57c7f09dd398
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
db28ca7f59 derived_data_utils: ratelimit backfilling
Summary:
The backfiller may read or write to the blobstore too quickly.  Apply QPS
limits to the backfill batch context to keep the read or write rate acceptable.

Reviewed By: ahornby

Differential Revision: D25371966

fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
96d8ce4ad2 derived_data: add option for parallel batch derivation
Summary:
Re-introduce parallel backfilling of changesets in a batch using `batch_derive`,
however keep it under the control of a flag, so we can enable or disable it as
necessary.

Reviewed By: ahornby

Differential Revision: D25401207

fbshipit-source-id: f9aeef3415be48fc03220c18fa547e05538ed479
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
73cdac45e3 derived_data: use new derived data configuration format
Summary:
Change derived data config to have "enabled" config and "backfilling" config.

The `Mapping` object has the responsibility of encapsulating the configuration options
for the derived data type.  Since it is only possible to obtain a `Mapping` from
appropriate configuration, ownership of a `Mapping` means derivation is permitted,
and so the `DeriveMode` enum is removed.

Most callers will use `BonsaiDerived::derive`, or a default `derived_data_utils` implementation
that requires the derived data to be enabled and configured on the repo.

Backfillers can additionally use `derived_data_utils_for_backfill` which will use the
`backfilling` configuration in preference to the default configuration.

Reviewed By: ahornby

Differential Revision: D25246317

fbshipit-source-id: 352fe6509572409bc3338dd43d157f34c73b9eac
2020-12-14 09:24:58 -08:00
Mark Juggurnauth-Thomas
05225bcf90 derived_data: make derivation options a parameter to derive_from_parents
Summary:
Currently, data derivation for types that have options (currrently unode
version and blame filesize limit) take the value of the option from the
repository configuration.

This is a side-effect, and means it's not possible to have data derivation
types with different configs active in the same repository (e.g. to
server unodes v1 while backfilling unodes v2).  To have data derivation
with different options, e.g. in tests, we must use `repo.dangerous_override`.

The first step to resolve this is to make the data derivation options a parameter.
Depending on the type of derived data, these options are passed into
`derive_from_parents` so that the right kind of derivation can happen.

The mapping is responsible for storing the options and providing it at the time
of derivation.  In this diff it just gets it from the repository config, the same
as was done previously.  In a future diff we will change this so that there
can be multiple configurations.

Reviewed By: krallin

Differential Revision: D25371967

fbshipit-source-id: 1cf4c06a4598fccbfa93367fc1f1c2fa00fd8235
2020-12-14 09:24:57 -08:00
Mark Juggurnauth-Thomas
64be2e8d87 derived_data_utils: take parameters by reference
Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.

Reviewed By: krallin

Differential Revision: D25371970

fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
2020-12-14 09:24:57 -08:00
Mark Juggurnauth-Thomas
9e1b1448e6 derived_data: split BonsaiDerived trait
Summary:
The `BonsaiDerived` trait is split in two:

* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
  a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
  mapping and config.

This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.

Reviewed By: krallin

Differential Revision: D25371964

fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
2020-12-14 09:24:57 -08:00
Stanislau Hlebik
40b67f9d09 mononoke: do not use batched in derived_data_tailer
Reviewed By: ikostia

Differential Revision: D25366938

fbshipit-source-id: 9b10853b1fdf5ee281187067aa768fa52e6d3074
2020-12-07 02:51:45 -08:00
Pavel Aslanov
337bab2744 convert to new type futures
Summary: Convert `ChangesetFetcher` to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25244213

fbshipit-source-id: 4207386d81397a930a566db008019bb8f31bf602
2020-12-02 15:40:12 -08:00
Mark Juggurnauth-Thomas
0f13c1283e derived_data: rename Mode to DeriveMode
Reviewed By: StanislavGlebik

Differential Revision: D25218193

fbshipit-source-id: 7a4ff3cb3e2b0e0c0f849b0de69a75626aeddf4c
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
6b0203d10f derived_data: use batch_derive for backfilling data
Summary:
Batch derivation was disabled as it caused problems for other derived data types.

This was because the default batch implementation was wrong: it attempted to derive
potentially linear stacks concurrently, which would cause O(n^2) derivations.

Fix the default implementation to be a naive sequential iteration, and re-enable
batch derivation for fsnodes and skeleton manifests.

Reviewed By: StanislavGlebik

Differential Revision: D25218195

fbshipit-source-id: 730555829f092cc36e1c81cf02c2b4962a3904da
2020-12-02 07:33:41 -08:00
Pavel Aslanov
6f08815dc3 convert BlobRepo::get_bonsai_bookmark to new type futures
Summary: convert `BlobRepo::get_bonsai_bookmark` to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25188577

fbshipit-source-id: fb6f2b592b9e9f76736bc1af5fa5a08d12744b5f
2020-11-27 11:11:18 -08:00
Lukas Piatkowski
fa1a195fd0 mononoke/blobstore: pass CoreContext via borrowed instead of owned value
Summary: Follow up after removing 'static from blobstore.

Reviewed By: StanislavGlebik

Differential Revision: D25182106

fbshipit-source-id: e13a7a31d71b4674425123268e655ae66127f1b7
2020-11-27 03:31:07 -08:00
Lukas Piatkowski
0f54cc3d63 mononoke/blobstore: make Blobstore generic over lifetime
Summary: Remove 'static requirement for async methods of Blobstore, propagate this change and fixup low hanging fruits where the code can become 'static free easily.

Reviewed By: ahornby, farnz

Differential Revision: D24839054

fbshipit-source-id: 5d5daa04c23c4c9ae902b669b0a71fe41ee6dee6
2020-11-20 05:51:52 -08:00
Mark Juggurnauth-Thomas
bfc7614037 skeleton_manifest: implement skeleton manifest derivation
Summary: Implement derivation of skeleton manifests.

Differential Revision: D24787534

fbshipit-source-id: e55d053a717fe052fc4da69bd9034784b356b7cc
2020-11-11 13:23:48 -08:00
Mark Juggurnauth-Thomas
6b8a832433 derived_data: change BonsaiDerivedMapping to use new futures
Summary:
This changes the methods from ones that return old `BoxFuture`s to an async method
using `async_trait`.

Reviewed By: krallin

Differential Revision: D24689506

fbshipit-source-id: 7b13010924369f81681e6590898af703c5423385
2020-11-03 09:15:18 -08:00
Alex Hornby
dde412f273 mononoke: pass references to derive_impl::derive_impl
Summary: Preparation for exposing an async BonsaiDerived::derive variant

Reviewed By: StanislavGlebik

Differential Revision: D24389762

fbshipit-source-id: c24b23c3858a91e96b93348a7e42266bee5809a0
2020-10-27 02:37:41 -07:00
Alex Hornby
453b6f74f2 mononoke: async derive_impl::derive_impl
Summary: Async derive_impl::derive_impl in preparation for exposing an async BonsaiDerived::derive method

Reviewed By: StanislavGlebik

Differential Revision: D24389766

fbshipit-source-id: 086cfef5c6bb6465dac5a18a3d809d822409c6bf
2020-10-27 02:37:41 -07:00
Pavel Aslanov
d752cfa651 batched tailing mode
Summary:
This diff add new mode of tailing based on derived data graph, it uses same functionality as backfill.
- `tail_batch_iteration` uses `bounded_traversal_dag` directly instead of leveraging `DeriveGraph::derive` so we could warm-up dependencies for each node before calling `DerivedUitls::backfill_batch_dangerous`

Reviewed By: StanislavGlebik

Differential Revision: D24306156

fbshipit-source-id: 006cb6d4df9424cd6501cb4a381b95f096e70551
2020-10-19 07:30:03 -07:00
Pavel Aslanov
4190463c33 convert derived data utils to new futures
Summary: Convert derived data utils to use new style futures

Reviewed By: StanislavGlebik

Differential Revision: D24331068

fbshipit-source-id: ad658b278802afa1e4ecd44c5a24164135748790
2020-10-16 07:46:07 -07:00
Pavel Aslanov
7bbc5f6a23 add support for deriving all types at once
Summary: Add support for deriving all types of derived data for a given repository

Reviewed By: StanislavGlebik

Differential Revision: D23842909

fbshipit-source-id: 2fa5c4a9444169b26c5cf70d91a6cc707cca8022
2020-10-15 03:51:03 -07:00
Pavel Aslanov
d9e3abb105 batched derived data type generation
Summary: This change make it possible to find underived changesets for multiple derived data types at once, and then batch derive everything taking into account dependencies between derived data types.

Reviewed By: StanislavGlebik

Differential Revision: D23762152

fbshipit-source-id: 8112fc5deb86f3c55f2fa5079cf917a0506f045c
2020-10-15 03:51:03 -07:00
Pavel Aslanov
92b09459f0 support for multiple heads in BonsaiDerived::find_all_underived_ancestors
Summary: Support for multiple heads in `BonsaiDerived::find_all_underived_ancestors`. This change will be needed to remove manual step of fetching of all changesets in `backfill_derived_data` utilty.

Reviewed By: StanislavGlebik

Differential Revision: D23705295

fbshipit-source-id: 32aa97a77f0a4461cbe4bf1864477e3e121e1879
2020-09-15 09:21:12 -07:00
Stanislau Hlebik
3ba14b7ff1 Back out "mononoke: use batch_derive method in derived data utils"
Reviewed By: ahornby

Differential Revision: D22623587

fbshipit-source-id: 52e56e5b3ceb170ea41c58020fcd986023728ea2
2020-07-20 03:18:55 -07:00
Stanislau Hlebik
361f4e98a7 mononoke: use batch_derive method in derived data utils
Summary:
Previously backfill_batch_dangerous method was calling internal derive_impl() method
directly. That wasn't great (after all, we are calling a function whose name suggests it should only be called from inside derive data crate) and this diff changes it so that we call batch_derive() method instead.

This gives a few benefits:
1) We no longer call internal derive_impl function
2) It allows different types of derived data to override batching behaviour.
For example, we've already overriden it for fsnodes and next diff will override
it for blame as well.

To make it compatible with derive_impl() batch_derive() now accepts derive data mode and mapping

Reviewed By: krallin

Differential Revision: D22435044

fbshipit-source-id: a4d911606284676566583a94199195860ffe2ecf
2020-07-09 10:45:19 -07:00
Stanislau Hlebik
c43ea517b0 mononoke: move derive_data_for_csids to derived_data_utils
Summary: It can be useful in other places as well, not only in blobimport

Reviewed By: krallin

Differential Revision: D22307314

fbshipit-source-id: f7d8c91101edc2ed4f230f7ef6796e39fbea5117
2020-06-30 06:22:31 -07:00
Simon Farnsworth
454de31134 Switch Loadable and Storable interfaces to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.

This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.

Reviewed By: krallin

Differential Revision: D22164315

fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
2020-06-25 08:45:37 -07:00
Pavel Aslanov
d13768d768 move DangerousOverride into a separate crate blobrepo_override
Summary: DangerousOverride is moved into a separate crate. Not only it is usually not needed but it was introducing dependencies on mercurial crate.

Reviewed By: StanislavGlebik

Differential Revision: D22115015

fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
2020-06-22 07:29:19 -07:00
Stanislau Hlebik
ffd60c01b0 blobimport: do not use batch_derive
Summary:
batch_derive() is a dangerous function to use. I'd love to delete it but this
function is very useful for backfilling, so unfortunately I can't.

The problem arises when one tries to backfill blame and unodes simultaneously
(or just derive blame  which in turn derives unodes). While batch_derive()
tries to be careful with inserting "outer" derived data's mappings (i.e. blame
mapping), it doesn't do it for inner derived data mappings (i.e. unodes). So we
might end up in the situation where we insert unodes mapping before we inserted
all the manifests for it. If this thing fails in the middle of derivation then
we have a corruption.

Let's do not use it in blobimport. It will make derivation slower, but I'd
rather make it slower than incorrect.

Reviewed By: farnz

Differential Revision: D21905619

fbshipit-source-id: c0227df195a8cf4482b2452ca928acbc5750b3e5
2020-06-10 19:29:31 -07:00
David Tolnay
cf412e0d6b rustfmt: Use use_try_shorthand default
Summary:
I observed that for whatever reason our setting of `use_try_shorthand = true` in rustfmt.toml was causing entire functions to not get processed by rustfmt. Even files that contain neither `try` nor `?`. Remove it and reformat fbsource.

Documentation of that config:

- https://github.com/rust-lang/rustfmt/blob/master/Configurations.md#use_try_shorthand

We don't particularly care about the value anymore because nobody writes `r#try!(...)` in 2018 edition code.

Minimized:

```
fn f() {
    g(
        )
        // ...
        .h
}
```

This function gets formatted only if use_try_shorthand is not set.

The bug is fixed in the rustfmt 2.0 release candidate.

Reviewed By: jsgf

Differential Revision: D21878162

fbshipit-source-id: b028673c0eb703984d24bf0d2983453fc2a8c212
2020-06-10 19:29:15 -07:00
Stanislau Hlebik
5f8ab2526c mononoke: make sure commit is regenerated when backfill_derived_data single is
Summary:
subcommand_single calls `derived_data_utils.regenerate(vec![cs_id])` with the
intention that derived data for this commit will be regenerated. However
previously it didn't work because DerivedDataUtils::derive() was ignoring
regenerate parameter. This diff fixes it.

Reviewed By: krallin

Differential Revision: D21527344

fbshipit-source-id: 56d93135071a7f3789262b7a9d9ad84a0896c895
2020-05-13 03:27:46 -07:00
Stanislau Hlebik
50b71ac322 mononoke: log the oldest underived ancestor
Summary:
This diff logs the delay in deriving data. In particular it logs how much time
has left since an underived commit was created.

Note that this code makes an assumption about monotonic dates - for repos with pushrebase
repos that should be the case.

Reviewed By: krallin

Differential Revision: D21427265

fbshipit-source-id: bfddf594467dfd2424f711f895275fb54a4e1c60
2020-05-08 07:47:19 -07:00
Stanislau Hlebik
d3fd4f09a2 mononoke: renamed futures to futures-old
Reviewed By: krallin

Differential Revision: D21425900

fbshipit-source-id: 121bf1d9e5527bd6f035f1f7a506ad0f75de68fc
2020-05-06 07:55:04 -07:00
Aida Getoeva
84c0aa9e8c mononoke: allow to derive changeset info
Summary: This also replaces RepoBlobstore with generic Blobstore.

Reviewed By: krallin

Differential Revision: D20336107

fbshipit-source-id: 4e09430d462053d5ccb0b2f0eca7a9d0417b4b5e
2020-03-09 16:41:53 -07:00
Stanislau Hlebik
638e637ef6 RFC: mononoke: introduce unodes v2
Summary:
Our previous implementation of unodes had a problem with diamond merges -
essentially because p1 and p2 might have the same file but with different
content unode will always create a merge unode which can be unexpected.
(code comment in unodes/derive.rs has more info about it).

This diff fixes the problem by introducing unodes v2. This allows us to import
new repos with new unode implementation while keeping the old repos with unode
v1.

This implementation uses a heuristic which should be fast and should do the
correct thing most of the time. In some cases it might exclude some parts of
the history completely. For example:

     O <- merge commit, doesn't change anything
    / \
   P1  |  <- modified "file.txt" to "B"
   |   P2    <- modified "file.txt" to "B"
   \  /
    ROOT <- created "file.txt" with content "A"

In that case history of "file.txt" starting from merge commit will contain only (P1, ROOT),
but it won't contain P2.

We also considered other options:
1) Move this heuristic to fastlog batch derived data. See D19973553 for more
details about why we decided not to do it.

2) Filter out parent unodes that are ancestors of other parent unodes. This should
always be correct, but it will be hard to implement, it wil be even harder to make
sure it always have good performance.

Reviewed By: krallin

Differential Revision: D19978157

fbshipit-source-id: 445ddd5629669d987e7aa88c35fecf0b34a40da0
2020-03-02 05:27:31 -08:00
Mark Thomas
a8f06f75c0 derived_data: add DeriveError for when derivation is disabled
Summary:
Currently if derivation of a particular derived data type is disabled, but a
client makes a request that requires that derived data type, we will fail with
an internal error.

This is not ideal, as internal errors should indicate something is wrong, but
in this case Mononoke is behaving correctly as configured.

Convert these errors to a new `DeriveError` type, and plumb this back up to
the SCS server.  The SCS server converts these to a new `RequestError`
variant: `NOT_AVAILABLE`.

Reviewed By: krallin

Differential Revision: D19943548

fbshipit-source-id: 964ad0aec3ab294e4bce789e6f38de224bed54fa
2020-02-19 09:28:09 -08:00
Stanislau Hlebik
ee7c0a8d26 backfill_derived_data mononoke: fail if derived data disabled
Summary:
Now let's fail if we try to derive data that's not enabled in the config.
Note that this diff also adds `derive_unsafe()` method which should be used in backfiller.

Reviewed By: krallin

Differential Revision: D19807956

fbshipit-source-id: c39af8555164314cafa9691197629ab9ddb956f1
2020-02-16 04:56:34 -08:00
Pavel Aslanov
c902acbc3f remove the need to pass mapping to ::derive method
Summary: remove the need to pass mapping to `::derive` method

Reviewed By: StanislavGlebik

Differential Revision: D19856560

fbshipit-source-id: 219af827ea7e077a4c3e678a85c51dc0e3822d79
2020-02-12 10:22:39 -08:00
Lukasz Piatkowski
542d1f93d3 Manual synchronization of fbcode/eden and facebookexperimental/eden
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.

Reviewed By: StanislavGlebik

Differential Revision: D19722832

fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
2020-02-11 11:42:43 +01:00
Lukasz Piatkowski
e8d62b64d5 mononoke: move the codebase under eden/ directory
fbshipit-source-id: 43a0252cb3ec42aa365f20d1b6faa4d24d74c9b8
2020-02-06 13:46:04 +01:00