Summary:
Now that the mapping is separated from BonsaiDerivable, it becomes clear where
batch derivation is incorrectly using the default mapping, rather than the
mapping that has been provided for batch-derivation.
This could mean, for example, that if we were backfilling a v2 of one of these
derived data types, we could accidentally base it on a v1 parent obtained
from the default mapping.
Instead, ensure that we don't use `BonsaiDerived` and thus can't use the
default mapping.
Reviewed By: krallin
Differential Revision: D25371963
fbshipit-source-id: fb71e1f1c4bd7a112d3099e0e5c5c7111d457cd2
Summary:
The backfiller may read from or write to the blobstore too quickly. Apply QPS
limits to the backfill batch context to keep the read and write rates acceptable.
Reviewed By: ahornby
Differential Revision: D25371966
fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe
Summary:
The common case of limiting blobstore rates using a leaky bucket rate limiter
is cumbersome to set up. Create a convenience method to do it more easily.
Reviewed By: ahornby
Differential Revision: D25438685
fbshipit-source-id: 821eda7bd0ddf71f22378c1b23e66b6d3f6454e7
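As a sketch of the idea (the real convenience method lives in Mononoke's blobstore crates; the names and signatures here are hypothetical), a leaky-bucket limiter only needs a fill level that drains at the configured QPS:

```rust
use std::time::{Duration, Instant};

/// A minimal leaky-bucket limiter: each operation pours one token into
/// the bucket, and tokens drain out at a fixed rate (the QPS limit).
/// When the bucket would overflow, the caller must wait.
struct LeakyBucket {
    capacity: f64,      // how many tokens the bucket can hold
    drain_per_sec: f64, // drain rate, i.e. the sustained QPS limit
    level: f64,         // current fill level
    last: Instant,      // last time drainage was accounted for
}

impl LeakyBucket {
    fn new(qps: f64, capacity: f64) -> Self {
        LeakyBucket { capacity, drain_per_sec: qps, level: 0.0, last: Instant::now() }
    }

    /// Admit one operation if possible; otherwise return how long the
    /// caller should wait before retrying.
    fn check(&mut self) -> Option<Duration> {
        let now = Instant::now();
        let drained = now.duration_since(self.last).as_secs_f64() * self.drain_per_sec;
        self.level = (self.level - drained).max(0.0);
        self.last = now;
        if self.level + 1.0 <= self.capacity {
            self.level += 1.0;
            None // under the limit: proceed immediately
        } else {
            let excess = self.level + 1.0 - self.capacity;
            Some(Duration::from_secs_f64(excess / self.drain_per_sec))
        }
    }
}
```

A small capacity keeps the rate close to the QPS limit at all times; a larger one tolerates short bursts.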
Summary:
When fetching many derived data mappings, the use of `FuturesUnordered` means
we may fetch many blobs concurrently, which may overload the blobstore.
Switch to using `buffered` to reduce the number of concurrent blob fetches.
Reviewed By: ahornby
Differential Revision: D25371965
fbshipit-source-id: 30417e86bc33defbb821f214a5520ab1b8a8c18c
Summary:
Large batches with parallel derivation can cause problems in large repos.
Allow control of the batch size so that it can be reduced if needed.
Reviewed By: krallin
Differential Revision: D25401205
fbshipit-source-id: 88a76a7745c34e4e34bc9b3ea9228bd5dad857f6
Summary:
Re-introduce parallel backfilling of changesets in a batch using `batch_derive`,
however keep it under the control of a flag, so we can enable or disable it as
necessary.
Reviewed By: ahornby
Differential Revision: D25401207
fbshipit-source-id: f9aeef3415be48fc03220c18fa547e05538ed479
Summary:
Change derived data config to have "enabled" config and "backfilling" config.
The `Mapping` object has the responsibility of encapsulating the configuration options
for the derived data type. Since it is only possible to obtain a `Mapping` from
appropriate configuration, ownership of a `Mapping` means derivation is permitted,
and so the `DeriveMode` enum is removed.
Most callers will use `BonsaiDerived::derive`, or a default `derived_data_utils` implementation
that requires the derived data to be enabled and configured on the repo.
Backfillers can additionally use `derived_data_utils_for_backfill` which will use the
`backfilling` configuration in preference to the default configuration.
Reviewed By: ahornby
Differential Revision: D25246317
fbshipit-source-id: 352fe6509572409bc3338dd43d157f34c73b9eac
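A hypothetical sketch (simplified names, not the actual Mononoke types) of the selection rule described above: the backfill helper prefers the `backfilling` section and falls back to `enabled`, while the default helper only ever sees `enabled`:

```rust
#[derive(Clone)]
struct DerivedDataConfig {
    unode_version: u32, // example option carried by the config
}

struct RepoConfig {
    enabled: DerivedDataConfig,
    backfilling: Option<DerivedDataConfig>,
}

// The default helper only uses the enabled configuration.
fn config_for_default(repo: &RepoConfig) -> DerivedDataConfig {
    repo.enabled.clone()
}

// The backfill helper prefers the backfilling configuration,
// falling back to the enabled one if none is set.
fn config_for_backfill(repo: &RepoConfig) -> DerivedDataConfig {
    repo.backfilling.clone().unwrap_or_else(|| repo.enabled.clone())
}
```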
Summary:
Currently, data derivation for types that have options (currently unode
version and blame filesize limit) takes the value of the options from the
repository configuration.
This is a side-effect, and means it's not possible to have data derivation
types with different configs active in the same repository (e.g. to
serve unodes v1 while backfilling unodes v2). To have data derivation
with different options, e.g. in tests, we must use `repo.dangerous_override`.
The first step to resolve this is to make the data derivation options a parameter.
Depending on the type of derived data, these options are passed into
`derive_from_parents` so that the right kind of derivation can happen.
The mapping is responsible for storing the options and providing them at the time
of derivation. In this diff it just gets them from the repository config, the same
as was done previously. In a future diff we will change this so that there
can be multiple configurations.
Reviewed By: krallin
Differential Revision: D25371967
fbshipit-source-id: 1cf4c06a4598fccbfa93367fc1f1c2fa00fd8235
Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.
Reviewed By: krallin
Differential Revision: D25371970
fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
Summary:
The `BonsaiDerived` trait is split in two:
* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
mapping and config.
This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.
Reviewed By: krallin
Differential Revision: D25371964
fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
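A hypothetical, heavily simplified sketch of the split: `BonsaiDerivable` encapsulates *how* to derive, while `BonsaiDerived` is only the entry point that supplies the default mapping (the real traits take a context and a mapping; the toy `GenNumber` type is invented for illustration):

```rust
// Simplified stand-ins for the real Mononoke types.
struct ChangesetId(u64);

trait BonsaiDerivable: Sized {
    // Derive one value from the values derived for its parents.
    fn derive_from_parents(csid: &ChangesetId, parents: Vec<Self>) -> Self;
}

trait BonsaiDerived: BonsaiDerivable {
    // Entry point using the default mapping; backfillers with a
    // non-default config call into `BonsaiDerivable` directly instead.
    fn derive(csid: &ChangesetId) -> Self {
        // Simplification: the real entry point looks parents up via the
        // default mapping; here we pretend the changeset is a root.
        Self::derive_from_parents(csid, Vec::new())
    }
}

// Toy derived data type: a generation number.
struct GenNumber(u64);

impl BonsaiDerivable for GenNumber {
    fn derive_from_parents(_csid: &ChangesetId, parents: Vec<Self>) -> Self {
        GenNumber(parents.iter().map(|p| p.0).max().unwrap_or(0) + 1)
    }
}

impl BonsaiDerived for GenNumber {}
```

Because the batch backfiller only needs `BonsaiDerivable`, it never touches the default-mapping entry point.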
Summary: Looks like the permissions are different there. Let's glob it out.
Reviewed By: singhsrb
Differential Revision: D25507359
fbshipit-source-id: 6a5c19e41879798b829d9b6e79eba3009249c20c
Summary:
At the moment "hg pull -B bookmark" always fetches from the infinitepush path
even if we do something like "hg pull -B master".
Let's fetch from infinitepush only if the bookmark matches the scratch matcher.
Reviewed By: markbt
Differential Revision: D25460577
fbshipit-source-id: 6563dcd3423c6a7a70ea1c1f7acdaf5db5e21875
Summary: Two doctor tests don't pass on macOS, so disable them for now.
Reviewed By: genevievehelsel
Differential Revision: D25509814
fbshipit-source-id: c3fa92daefd4fda67335bdc66f56e35e94ae4e6a
Summary: Make the auth crate validate the user's certificate before returning it. This way we can catch invalid certs before trying to use them.
Reviewed By: sfilipco
Differential Revision: D25454687
fbshipit-source-id: ad253fb433310570c20f33dbd0d0bf11df21e966
Summary: Add a new module that can parse X.509 certificates and detect common issues (e.g., the certificate is missing, corrupt, or expired). This should allow us to provide better UX around certificate errors.
Reviewed By: sfilipco
Differential Revision: D25440548
fbshipit-source-id: b7785fd17fa85f812fd38de09e79420f4e256065
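A hypothetical sketch of the kind of checks such a module performs (real certificate parsing would use an X.509 library; here the cert is already "parsed" down to the one field we inspect, and the names are invented):

```rust
use std::time::SystemTime;

// The detectable problems, reported instead of a generic TLS failure.
#[derive(Debug, PartialEq)]
enum CertIssue {
    Missing,
    Expired,
}

struct ParsedCert {
    not_after: SystemTime, // expiry timestamp from the certificate
}

// Validate before use so the error message can name the actual issue.
fn validate(cert: Option<&ParsedCert>, now: SystemTime) -> Result<(), CertIssue> {
    let cert = cert.ok_or(CertIssue::Missing)?;
    if now > cert.not_after {
        return Err(CertIssue::Expired);
    }
    Ok(())
}
```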
Summary: This makes it more flexible.
Reviewed By: kulshrax
Differential Revision: D24467604
fbshipit-source-id: 63023cf0dde2fb7eac592ac79008e4b7a62340c1
Summary:
RCU is a synchronization mechanism that allows for very fast reads, at the
expense of slower writes. This is achieved by having the reader sometimes
reading a stale pointer when concurrent to a write, at which point the writer
will delay reclaiming the old data to a later moment where it is known that no
reader can hold a pointer to the old data.
Doing that allows for the read operations to be significantly faster than using
a Synchronized lock. Folly's documentation claims a read lock/unlock of RCU
runs in ~4ns, while the same for Synchronized is ~26ns.
Due to the writer's cost, RCU is perfectly suited for places where reads need
to be as fast as possible, and writes are very infrequent. One typical example
is caching an application's configuration: we can expect the configuration
values to be read far more frequently than they are reloaded, and in the case
of a mismatch, a stale configuration can be tolerated by the
application.
In EdenFS, we can use RCU on Windows to make sure that unmounting a repository
will wait on all the pending callbacks.
Reviewed By: kmancini
Differential Revision: D25351536
fbshipit-source-id: 050ca0337e67ae195f4f16062dddb60f584af692
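This is not Folly's RCU, but the same deferred-reclamation idea can be sketched in Rust with `Arc` (types and names invented for illustration): readers take a cheap reference-counted snapshot, the writer publishes a replacement, and the old value is freed only when the last reader still holding it drops its snapshot.

```rust
use std::sync::{Arc, Mutex};

struct Config {
    qps_limit: u32,
}

struct ConfigCache {
    current: Mutex<Arc<Config>>,
}

impl ConfigCache {
    fn new(initial: Config) -> Self {
        ConfigCache { current: Mutex::new(Arc::new(initial)) }
    }

    // Readers grab a snapshot; after the brief lock, reads need no
    // synchronization and may observe a stale config, which is
    // tolerable for configuration data.
    fn read(&self) -> Arc<Config> {
        Arc::clone(&self.current.lock().unwrap())
    }

    // The writer swaps in a new config. Outstanding snapshots remain
    // valid; reclamation of the old config is deferred to the last
    // reader, mirroring RCU's delayed reclamation.
    fn reload(&self, next: Config) {
        *self.current.lock().unwrap() = Arc::new(next);
    }
}
```

Real RCU avoids even the reference-count traffic on the read side, which is where its ~4ns read lock/unlock comes from; the sketch only shows the reclamation-deferral shape.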
Summary:
When a blob is redacted server side, HTTP code 410 is returned.
Unfortunately, that HTTP code wasn't supported and caused Mercurial to crash.
To fix this, we just need to store a placeholder when receiving this HTTP code
and simply return it up the stack when reading it.
Reviewed By: DurhamG
Differential Revision: D25433001
fbshipit-source-id: 66ec365fa2643bcf29e38b114ae1fc92aeaf1a7b
Summary: The OD team wants to be able to track the disk usage of EdenFS. Adding this option so the output is easier to parse.
Reviewed By: kmancini
Differential Revision: D25442582
fbshipit-source-id: 235bb8ba4377894b31dad48229ecb8f241f070ff
Summary:
There was a bug with local-data indexedlog storage where it
wasn't applying the appropriate suffix, so tree data was being stored in
.hg/store/indexedlogdatastore just like file data. Let's fix that and add a
test.
Reviewed By: quark-zju
Differential Revision: D25469917
fbshipit-source-id: 731252f924f9a8014867fc077a7ef10ac9870170
Summary: Allows a binary to specify whether the repo args are required on the command line, and if so, whether OnlyOne or AtLeastOne is the requirement.
Reviewed By: farnz
Differential Revision: D25422757
fbshipit-source-id: 44d27c954bd1e0fa38b2d44c1c3b2eac3e50bd0c
Summary:
This is useful to e.g. write tests in things that use mononoke_api (such as
edenapi): the test mode isn't transitive across crates. This also requires
making Repo itself public, since callers might reasonably want to create one.
I've also updated a few of the accessor methods that were `pub(crate)`, given
that what we had seemed somewhat arbitrary: some things were
`pub(crate)`, others were just `pub`.
Reviewed By: markbt
Differential Revision: D25467624
fbshipit-source-id: 2279d4196e8dc0e7e1729239710d900b351be816
Summary: Factor out functions in preparation for a change that uses them to optionally resolve multiple repos from cmdlib.
Differential Revision: D25422754
fbshipit-source-id: e0bd33ae533b1450e7084d78bd1765148b71bc76
Summary: We could already specify "bonsai"; it's useful to also be able to pass "hg".
Reviewed By: farnz
Differential Revision: D25367322
fbshipit-source-id: aca6d22f98394af49e3d94d5fd533bc9a25a6869
Summary:
This is useful for jobs running multiple repos as it can then open the blobstore as many times as there are storage configs rather than as many times as there are repos.
Used in a diff I'm working on to group repos by storage config in a HashMap when setting up the walker to scrub multiple repos from a single process.
Reviewed By: farnz
Differential Revision: D25422758
fbshipit-source-id: 578799db63dcf0bce4a79fca9642651601f2deeb