Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.
We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!
Note that we still have a few transitive dependencies on Tokio 0.1:
- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests
This isn't the end of the world though, we can live with that :)
Reviewed By: ahornby
Differential Revision: D26544410
fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
Summary:
For fsnodes and skeleton manifests it should be possible to allow gaps in the
commits that are backfilled. Any access to a commit in one of these gaps can
be quickly derived from the nearest ancestor that is derived. Since each
commit's derived data is independent, there is no loss from omitting them.
Add the `--gap-size` option to `backfill_derived_data`. This allows `fsnodes`,
`skeleton_manifests` and other derived data types that implement it in the
future to skip some of the commits during backfill. This will speed up
backfilling and reduced the amount of data that is stored for infrequently
accessed historical commits.
Reviewed By: StanislavGlebik
Differential Revision: D25997854
fbshipit-source-id: bf4df3f5c16a913c13f732f6837fc4c817664e29
Summary:
override_ctx() sets different SessionClass while data is derived. This should
reduce the number of entries we write to blobstore sync queue. See previous
diff for more motivation.
This diff uses override_ctx() for batch_derive() method
Reviewed By: krallin
Differential Revision: D25910465
fbshipit-source-id: b8a3e729c059cad5716b1b09bd2f1cc618273627
Summary:
Now that the mapping is separated from BonsaiDerivable, it becomes clear where
batch derivation is incorrectly using the default mapping, rather than the
mapping that has been provided for batch-derivation.
This could mean, for example, if we are backfilling a v2 of these derived
data types, we could accidentally base them on a v1 parent that we obtained
from the default mapping.
Instead, ensure that we don't use `BonsaiDerived` and thus can't use the
default mapping.
Reviewed By: krallin
Differential Revision: D25371963
fbshipit-source-id: fb71e1f1c4bd7a112d3099e0e5c5c7111d457cd2
Summary:
The backfiller may read or write to the blobstore too quickly. Apply QPS
limits to the backfill batch context to keep the read or write rate acceptable.
Reviewed By: ahornby
Differential Revision: D25371966
fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe
Summary:
Re-introduce parallel backfilling of changesets in a batch using `batch_derive`,
however keep it under the control of a flag, so we can enable or disable it as
necessary.
Reviewed By: ahornby
Differential Revision: D25401207
fbshipit-source-id: f9aeef3415be48fc03220c18fa547e05538ed479
Summary:
Change derived data config to have "enabled" config and "backfilling" config.
The `Mapping` object has the responsibility of encapsulating the configuration options
for the derived data type. Since it is only possible to obtain a `Mapping` from
appropriate configuration, ownership of a `Mapping` means derivation is permitted,
and so the `DeriveMode` enum is removed.
Most callers will use `BonsaiDerived::derive`, or a default `derived_data_utils` implementation
that requires the derived data to be enabled and configured on the repo.
Backfillers can additionally use `derived_data_utils_for_backfill` which will use the
`backfilling` configuration in preference to the default configuration.
Reviewed By: ahornby
Differential Revision: D25246317
fbshipit-source-id: 352fe6509572409bc3338dd43d157f34c73b9eac
Summary:
Currently, data derivation for types that have options (currrently unode
version and blame filesize limit) take the value of the option from the
repository configuration.
This is a side-effect, and means it's not possible to have data derivation
types with different configs active in the same repository (e.g. to
server unodes v1 while backfilling unodes v2). To have data derivation
with different options, e.g. in tests, we must use `repo.dangerous_override`.
The first step to resolve this is to make the data derivation options a parameter.
Depending on the type of derived data, these options are passed into
`derive_from_parents` so that the right kind of derivation can happen.
The mapping is responsible for storing the options and providing it at the time
of derivation. In this diff it just gets it from the repository config, the same
as was done previously. In a future diff we will change this so that there
can be multiple configurations.
Reviewed By: krallin
Differential Revision: D25371967
fbshipit-source-id: 1cf4c06a4598fccbfa93367fc1f1c2fa00fd8235
Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.
Reviewed By: krallin
Differential Revision: D25371970
fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
Summary:
The `BonsaiDerived` trait is split in two:
* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
mapping and config.
This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.
Reviewed By: krallin
Differential Revision: D25371964
fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
Summary: Convert `ChangesetFetcher` to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25244213
fbshipit-source-id: 4207386d81397a930a566db008019bb8f31bf602
Summary:
Batch derivation was disabled as it caused problems for other derived data types.
This was because the default batch implementation was wrong: it attempted to derive
potentially linear stacks concurrently, which would cause O(n^2) derivations.
Fix the default implementation to be a naive sequential iteration, and re-enable
batch derivation for fsnodes and skeleton manifests.
Reviewed By: StanislavGlebik
Differential Revision: D25218195
fbshipit-source-id: 730555829f092cc36e1c81cf02c2b4962a3904da
Summary: convert `BlobRepo::get_bonsai_bookmark` to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25188577
fbshipit-source-id: fb6f2b592b9e9f76736bc1af5fa5a08d12744b5f
Summary: Remove 'static requirement for async methods of Blobstore, propagate this change and fixup low hanging fruits where the code can become 'static free easily.
Reviewed By: ahornby, farnz
Differential Revision: D24839054
fbshipit-source-id: 5d5daa04c23c4c9ae902b669b0a71fe41ee6dee6
Summary:
This changes the methods from ones that return old `BoxFuture`s to an async method
using `async_trait`.
Reviewed By: krallin
Differential Revision: D24689506
fbshipit-source-id: 7b13010924369f81681e6590898af703c5423385
Summary:
This diff add new mode of tailing based on derived data graph, it uses same functionality as backfill.
- `tail_batch_iteration` uses `bounded_traversal_dag` directly instead of leveraging `DeriveGraph::derive` so we could warm-up dependencies for each node before calling `DerivedUitls::backfill_batch_dangerous`
Reviewed By: StanislavGlebik
Differential Revision: D24306156
fbshipit-source-id: 006cb6d4df9424cd6501cb4a381b95f096e70551
Summary: Convert derived data utils to use new style futures
Reviewed By: StanislavGlebik
Differential Revision: D24331068
fbshipit-source-id: ad658b278802afa1e4ecd44c5a24164135748790
Summary: Add support for deriving all types of derived data for a given repository
Reviewed By: StanislavGlebik
Differential Revision: D23842909
fbshipit-source-id: 2fa5c4a9444169b26c5cf70d91a6cc707cca8022
Summary: This change make it possible to find underived changesets for multiple derived data types at once, and then batch derive everything taking into account dependencies between derived data types.
Reviewed By: StanislavGlebik
Differential Revision: D23762152
fbshipit-source-id: 8112fc5deb86f3c55f2fa5079cf917a0506f045c
Summary: Support for multiple heads in `BonsaiDerived::find_all_underived_ancestors`. This change will be needed to remove manual step of fetching of all changesets in `backfill_derived_data` utilty.
Reviewed By: StanislavGlebik
Differential Revision: D23705295
fbshipit-source-id: 32aa97a77f0a4461cbe4bf1864477e3e121e1879
Summary:
Previously backfill_batch_dangerous method was calling internal derive_impl() method
directly. That wasn't great (after all, we are calling a function whose name suggests it should only be called from inside derive data crate) and this diff changes it so that we call batch_derive() method instead.
This gives a few benefits:
1) We no longer call internal derive_impl function
2) It allows different types of derived data to override batching behaviour.
For example, we've already overriden it for fsnodes and next diff will override
it for blame as well.
To make it compatible with derive_impl() batch_derive() now accepts derive data mode and mapping
Reviewed By: krallin
Differential Revision: D22435044
fbshipit-source-id: a4d911606284676566583a94199195860ffe2ecf
Summary: It can be useful in other places as well, not only in blobimport
Reviewed By: krallin
Differential Revision: D22307314
fbshipit-source-id: f7d8c91101edc2ed4f230f7ef6796e39fbea5117
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.
This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.
Reviewed By: krallin
Differential Revision: D22164315
fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
Summary: DangerousOverride is moved into a separate crate. Not only it is usually not needed but it was introducing dependencies on mercurial crate.
Reviewed By: StanislavGlebik
Differential Revision: D22115015
fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
Summary:
batch_derive() is a dangerous function to use. I'd love to delete it but this
function is very useful for backfilling, so unfortunately I can't.
The problem arises when one tries to backfill blame and unodes simultaneously
(or just derive blame which in turn derives unodes). While batch_derive()
tries to be careful with inserting "outer" derived data's mappings (i.e. blame
mapping), it doesn't do it for inner derived data mappings (i.e. unodes). So we
might end up in the situation where we insert unodes mapping before we inserted
all the manifests for it. If this thing fails in the middle of derivation then
we have a corruption.
Let's do not use it in blobimport. It will make derivation slower, but I'd
rather make it slower than incorrect.
Reviewed By: farnz
Differential Revision: D21905619
fbshipit-source-id: c0227df195a8cf4482b2452ca928acbc5750b3e5
Summary:
I observed that for whatever reason our setting of `use_try_shorthand = true` in rustfmt.toml was causing entire functions to not get processed by rustfmt. Even files that contain neither `try` nor `?`. Remove it and reformat fbsource.
Documentation of that config:
- https://github.com/rust-lang/rustfmt/blob/master/Configurations.md#use_try_shorthand
We don't particularly care about the value anymore because nobody writes `r#try!(...)` in 2018 edition code.
Minimized:
```
fn f() {
g(
)
// ...
.h
}
```
This function gets formatted only if use_try_shorthand is not set.
The bug is fixed in the rustfmt 2.0 release candidate.
Reviewed By: jsgf
Differential Revision: D21878162
fbshipit-source-id: b028673c0eb703984d24bf0d2983453fc2a8c212
Summary:
subcommand_single calls `derived_data_utils.regenerate(vec![cs_id])` with the
intention that derived data for this commit will be regenerated. However
previously it didn't work because DerivedDataUtils::derive() was ignoring
regenerate parameter. This diff fixes it.
Reviewed By: krallin
Differential Revision: D21527344
fbshipit-source-id: 56d93135071a7f3789262b7a9d9ad84a0896c895
Summary:
This diff logs the delay in deriving data. In particular it logs how much time
has left since an underived commit was created.
Note that this code makes an assumption about monotonic dates - for repos with pushrebase
repos that should be the case.
Reviewed By: krallin
Differential Revision: D21427265
fbshipit-source-id: bfddf594467dfd2424f711f895275fb54a4e1c60
Summary:
Our previous implementation of unodes had a problem with diamond merges -
essentially because p1 and p2 might have the same file but with different
content unode will always create a merge unode which can be unexpected.
(code comment in unodes/derive.rs has more info about it).
This diff fixes the problem by introducing unodes v2. This allows us to import
new repos with new unode implementation while keeping the old repos with unode
v1.
This implementation uses a heuristic which should be fast and should do the
correct thing most of the time. In some cases it might exclude some parts of
the history completely. For example:
O <- merge commit, doesn't change anything
/ \
P1 | <- modified "file.txt" to "B"
| P2 <- modified "file.txt" to "B"
\ /
ROOT <- created "file.txt" with content "A"
In that case history of "file.txt" starting from merge commit will contain only (P1, ROOT),
but it won't contain P2.
We also considered other options:
1) Move this heuristic to fastlog batch derived data. See D19973553 for more
details about why we decided not to do it.
2) Filter out parent unodes that are ancestors of other parent unodes. This should
always be correct, but it will be hard to implement, it wil be even harder to make
sure it always have good performance.
Reviewed By: krallin
Differential Revision: D19978157
fbshipit-source-id: 445ddd5629669d987e7aa88c35fecf0b34a40da0
Summary:
Currently if derivation of a particular derived data type is disabled, but a
client makes a request that requires that derived data type, we will fail with
an internal error.
This is not ideal, as internal errors should indicate something is wrong, but
in this case Mononoke is behaving correctly as configured.
Convert these errors to a new `DeriveError` type, and plumb this back up to
the SCS server. The SCS server converts these to a new `RequestError`
variant: `NOT_AVAILABLE`.
Reviewed By: krallin
Differential Revision: D19943548
fbshipit-source-id: 964ad0aec3ab294e4bce789e6f38de224bed54fa
Summary:
Now let's fail if we try to derive data that's not enabled in the config.
Note that this diff also adds `derive_unsafe()` method which should be used in backfiller.
Reviewed By: krallin
Differential Revision: D19807956
fbshipit-source-id: c39af8555164314cafa9691197629ab9ddb956f1
Summary: remove the need to pass mapping to `::derive` method
Reviewed By: StanislavGlebik
Differential Revision: D19856560
fbshipit-source-id: 219af827ea7e077a4c3e678a85c51dc0e3822d79
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.
Reviewed By: StanislavGlebik
Differential Revision: D19722832
fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800