Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.
We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!
Note that we still have a few transitive dependencies on Tokio 0.1:
- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests
This isn't the end of the world though, we can live with that :)
Reviewed By: ahornby
Differential Revision: D26544410
fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
Summary:
For fsnodes and skeleton manifests it should be possible to allow gaps in the
commits that are backfilled. Any access to a commit in one of these gaps can
be quickly derived from the nearest ancestor that is derived. Since each
commit's derived data is independent, there is no loss from omitting them.
Add the `--gap-size` option to `backfill_derived_data`. This allows `fsnodes`,
`skeleton_manifests` and other derived data types that implement it in the
future to skip some of the commits during backfill. This will speed up
backfilling and reduced the amount of data that is stored for infrequently
accessed historical commits.
Reviewed By: StanislavGlebik
Differential Revision: D25997854
fbshipit-source-id: bf4df3f5c16a913c13f732f6837fc4c817664e29
Summary:
override_ctx() sets different SessionClass while data is derived. This should
reduce the number of entries we write to blobstore sync queue. See previous
diff for more motivation.
This diff uses override_ctx() for batch_derive() method
Reviewed By: krallin
Differential Revision: D25910465
fbshipit-source-id: b8a3e729c059cad5716b1b09bd2f1cc618273627
Summary:
We had issues with mononoke writing too much blobstore sync queue entries while
deriving data for large commits. We've contemplated a few solutions, and
decided to give this one a go.
This approach forces derive data to use Background SessionClass which has an
effect of not writing data to blobstore sync queue if the write to blobstore
was successful (it still writes data to the queue otherwise). This should
reduce the number of entries we write to the blobstore sync queue
significantly. The downside is that writes might get a bit slower - our
assumption is that this slowdown is acceptable. If that's not the case we can
always disable this option.
This diff overrides SessionClass for normal ::derive() method. However there's
also batch_derive() - this one will be addressed in the next diff.
One thing to note - we still write derived data mapping to blobstore sync queue. That should be find as we have a constant number of writes per commits.
Reviewed By: krallin
Differential Revision: D25910464
fbshipit-source-id: 4113d00bc0efe560fd14a5d4319b743d0a100dfa
Summary: Convert all BlobRepoHg methods to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25471540
fbshipit-source-id: c8e99509d39d0e081d082097cbd9dbfca431637e
Summary:
Now that the mapping is separated from BonsaiDerivable, it becomes clear where
batch derivation is incorrectly using the default mapping, rather than the
mapping that has been provided for batch-derivation.
This could mean, for example, if we are backfilling a v2 of these derived
data types, we could accidentally base them on a v1 parent that we obtained
from the default mapping.
Instead, ensure that we don't use `BonsaiDerived` and thus can't use the
default mapping.
Reviewed By: krallin
Differential Revision: D25371963
fbshipit-source-id: fb71e1f1c4bd7a112d3099e0e5c5c7111d457cd2
Summary:
Change derived data config to have "enabled" config and "backfilling" config.
The `Mapping` object has the responsibility of encapsulating the configuration options
for the derived data type. Since it is only possible to obtain a `Mapping` from
appropriate configuration, ownership of a `Mapping` means derivation is permitted,
and so the `DeriveMode` enum is removed.
Most callers will use `BonsaiDerived::derive`, or a default `derived_data_utils` implementation
that requires the derived data to be enabled and configured on the repo.
Backfillers can additionally use `derived_data_utils_for_backfill` which will use the
`backfilling` configuration in preference to the default configuration.
Reviewed By: ahornby
Differential Revision: D25246317
fbshipit-source-id: 352fe6509572409bc3338dd43d157f34c73b9eac
Summary:
Currently, data derivation for types that have options (currrently unode
version and blame filesize limit) take the value of the option from the
repository configuration.
This is a side-effect, and means it's not possible to have data derivation
types with different configs active in the same repository (e.g. to
server unodes v1 while backfilling unodes v2). To have data derivation
with different options, e.g. in tests, we must use `repo.dangerous_override`.
The first step to resolve this is to make the data derivation options a parameter.
Depending on the type of derived data, these options are passed into
`derive_from_parents` so that the right kind of derivation can happen.
The mapping is responsible for storing the options and providing it at the time
of derivation. In this diff it just gets it from the repository config, the same
as was done previously. In a future diff we will change this so that there
can be multiple configurations.
Reviewed By: krallin
Differential Revision: D25371967
fbshipit-source-id: 1cf4c06a4598fccbfa93367fc1f1c2fa00fd8235
Summary:
The `BonsaiDerived` trait is split in two:
* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
mapping and config.
This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.
Reviewed By: krallin
Differential Revision: D25371964
fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
Summary:
Create `BlobstoreMapping` as a trait with the common implementations for
derived data mappings that are stored in the blobstore.
Reviewed By: StanislavGlebik
Differential Revision: D25099915
fbshipit-source-id: 8a62fbb809918045336944c8cd3584b109811012
Summary:
Batch derivation was disabled as it caused problems for other derived data types.
This was because the default batch implementation was wrong: it attempted to derive
potentially linear stacks concurrently, which would cause O(n^2) derivations.
Fix the default implementation to be a naive sequential iteration, and re-enable
batch derivation for fsnodes and skeleton manifests.
Reviewed By: StanislavGlebik
Differential Revision: D25218195
fbshipit-source-id: 730555829f092cc36e1c81cf02c2b4962a3904da
Summary:
Similar to fsnodes, allow skeleton manifests to be derived in parallel in large
batches by splitting the changesets into linear stacks with no internal
conflicts, and deriving each changeset in that batch in parallel.
Reviewed By: StanislavGlebik
Differential Revision: D25218196
fbshipit-source-id: e578de9ffd472e732abb1e2ef9cd19c073280cd4
Summary: convert `BlobRepo::get_bonsai_bookmark` to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25188577
fbshipit-source-id: fb6f2b592b9e9f76736bc1af5fa5a08d12744b5f
Summary: Now that `derive03` is the only version available, rename it to `derive`.
Reviewed By: krallin
Differential Revision: D24900106
fbshipit-source-id: c7fbf9a00baca7d52da64f2b5c17e3fe1ddc179e