Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.
Reviewed By: krallin
Differential Revision: D25371970
fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
Summary:
The `BonsaiDerived` trait is split in two:
* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
mapping and config.
This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.
Reviewed By: krallin
Differential Revision: D25371964
fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
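The split described above can be sketched as two traits, where the entry-point trait pins the default config. This is a hypothetical, simplified sketch: other than the two trait names, the types and method signatures below are illustrative, not the real Mononoke API.

```rust
type ChangesetId = u64;

#[derive(Default)]
struct Config {
    version: u32,
}

// Encapsulates *how* the data is derived: a single item from its parents,
// or a batch. Config is passed explicitly, so backfills can use a
// non-default one (e.g. a new version of a derived data type).
trait BonsaiDerivable: Sized {
    fn derive_from_parents(cs: ChangesetId, parents: Vec<Self>, config: &Config) -> Self;

    // Naive default: derive each changeset in order.
    fn derive_batch(stack: &[ChangesetId], config: &Config) -> Vec<Self> {
        let mut derived: Vec<Self> = Vec::new();
        for cs in stack {
            let parents = Vec::new(); // parent lookup elided in this sketch
            derived.push(Self::derive_from_parents(*cs, parents, config));
        }
        derived
    }
}

// Entry point only: always uses the default mapping/config.
trait BonsaiDerived: BonsaiDerivable {
    fn derive(cs: ChangesetId) -> Self {
        Self::derive_batch(&[cs], &Config::default()).pop().unwrap()
    }
}

// Toy derived data type to exercise the sketch.
struct VersionTag(u32);

impl BonsaiDerivable for VersionTag {
    fn derive_from_parents(_cs: ChangesetId, _parents: Vec<Self>, config: &Config) -> Self {
        VersionTag(config.version)
    }
}
impl BonsaiDerived for VersionTag {}

fn main() {
    // Default entry point vs. explicit non-default config:
    assert_eq!(VersionTag::derive(1).0, 0);
    let batch = VersionTag::derive_batch(&[1, 2], &Config { version: 2 });
    assert_eq!(batch.len(), 2);
    assert_eq!(batch[0].0, 2);
}
```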
Summary: Allows a binary to specify whether the repo args are required on the command line, and if so, whether OnlyOne or AtLeastOne is the requirement.
Reviewed By: farnz
Differential Revision: D25422757
fbshipit-source-id: 44d27c954bd1e0fa38b2d44c1c3b2eac3e50bd0c
Summary:
This is useful to e.g. write tests in things that use mononoke_api (such as
edenapi): the test mode isn't transitive across crates. This also requires
making Repo itself public, since callers might reasonably want to create one.
I've also updated a few of the accessor methods that were `pub(crate)` given
that what we had right now seemed like it was kinda random: some things were
`pub(crate)`, others were just `pub`.
Reviewed By: markbt
Differential Revision: D25467624
fbshipit-source-id: 2279d4196e8dc0e7e1729239710d900b351be816
Summary: Factor out functions in preparation for change that uses them to optionally resolve multiple repos from cmdlib
Differential Revision: D25422754
fbshipit-source-id: e0bd33ae533b1450e7084d78bd1765148b71bc76
Summary: Could already specify "bonsai"; it's useful to also be able to pass "hg".
Reviewed By: farnz
Differential Revision: D25367322
fbshipit-source-id: aca6d22f98394af49e3d94d5fd533bc9a25a6869
Summary:
This is useful for jobs running multiple repos as it can then open the blobstore as many times as there are storage configs rather than as many times as there are repos.
Used in a diff I'm working on to group repos by storage config in a HashMap when setting up the walker to scrub multiple repos from single process.
Reviewed By: farnz
Differential Revision: D25422758
fbshipit-source-id: 578799db63dcf0bce4a79fca9642651601f2deeb
Summary: also makes ERROR_MSG a constant
Reviewed By: farnz
Differential Revision: D25422756
fbshipit-source-id: e2f2b9122e2b90c7cb07b7d64156055d55c8c653
Summary:
The following features were added to gitimport, both the library routine and the command-line version.
* Define your own subset of a git repository to import by implementing the `GitimportTarget` trait.
* Added a `GitimportTarget` implementation,
`ImportMissingForCommit`, that searches for any references missing in Mononoke for the specified git commit and imports them. Note that it will not import commits unreachable from the specified commit's history.
* Added support for updating the bonsai<->git commit mapping while importing commits.
* Commit import progress is now shown, making it a bit easier to estimate how big an import job is and how long it will take.
* Added an optional git-repo name. This is useful when using gitimport as a library to import missing commits from many repositories simultaneously.
* The author's email is now included in the author field.
* Committer information is now also exported.
* Optimized the blobstore import by checking whether a blob already exists prior to importing it.
* Added brief functions to basic hash structs to get only the first 4 bytes (8 hex chars), for easier human inspection and debugging.
* Added support for suppressing the long ref->BonsaiID mapping output (on by default to match old behavior).
Reviewed By: krallin
Differential Revision: D25445974
fbshipit-source-id: 6dc7f977b61ceec1a95b5f3c38548ac8eddbea27
Summary: Makes it a bit easier to use.
Reviewed By: markbt
Differential Revision: D25439113
fbshipit-source-id: 2c5da338a7000573ac92435c8982f5adff71bf42
Summary:
I'd like to lower the bar to entry as much as possible to use caching in e.g.
mappings. One thing that's a little annoying to setup right now is stats. Let's
unify those.
This does mean the stats names for mappings that had them will change, but I
think that's probably OK here.
A few notes about the implementation:
- I made the stats objects singletons, because ultimately that *is* what stats
are under the hood.
- I made a macro to define the singleton because that was otherwise really
verbose.
Reviewed By: farnz
Differential Revision: D25461182
fbshipit-source-id: f30d0908c40d8e0010c546ec95a385a06d557757
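The singleton-plus-macro idea above can be sketched with std's `OnceLock`; the macro and stat names here are invented for illustration and are not the real stats crate API.

```rust
use std::sync::OnceLock;
use std::sync::atomic::{AtomicU64, Ordering};

// Counters for cache hits/misses, shared by all users of a cacher.
struct CacheStats {
    hits: AtomicU64,
    misses: AtomicU64,
}

// A macro to define the singleton accessor; without it, every stats
// struct would repeat the same OnceLock boilerplate verbatim.
macro_rules! stats_singleton {
    ($fn_name:ident, $ty:ty, $init:expr) => {
        fn $fn_name() -> &'static $ty {
            static INSTANCE: OnceLock<$ty> = OnceLock::new();
            INSTANCE.get_or_init(|| $init)
        }
    };
}

stats_singleton!(cache_stats, CacheStats, CacheStats {
    hits: AtomicU64::new(0),
    misses: AtomicU64::new(0),
});

fn main() {
    // Every call site sees the same underlying counters.
    cache_stats().hits.fetch_add(1, Ordering::Relaxed);
    cache_stats().hits.fetch_add(1, Ordering::Relaxed);
    assert_eq!(cache_stats().hits.load(Ordering::Relaxed), 2);
    assert_eq!(cache_stats().misses.load(Ordering::Relaxed), 0);
}
```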
Summary:
This adds support for replacing things in the small repo with the things from
the large repo. This is useful when changing the bind.
This diff slightly changes how `rsync` is called: now `--source-csid`,
`--target-csid`, `--from-dir` and `--to-dir` are all specified before the
`copy`/`remove-excessive-files` subcommands.
There's still an ability to use this within a repo, as you can pass identical
source and target repos.
Reviewed By: StanislavGlebik
Differential Revision: D25087893
fbshipit-source-id: 6e5881f80d91ef4b794a967cf9f26dd3af7f56c9
Summary:
We had a small bug in blobimport. If the Mononoke repo is empty and blobimport
tries to import a single revision whose rev number is 0, then it will
successfully import it but report that no revisions were imported,
i.e. it will print "didn't import any revision" to stderr and won't update the
Manifold latest imported revision marker.
That was because we didn't update max_rev_and_bcs_id if rev is equal to
max_rev. This diff fixes it.
Reviewed By: ahornby
Differential Revision: D25421164
fbshipit-source-id: 639ead0ac326a14051d3a4faba568ecb797857a2
Summary: See D25396203 for discussion. 100 is better than nothing.
Reviewed By: StanislavGlebik
Differential Revision: D25460398
fbshipit-source-id: 608a2dca9c381c78daf0e7d9bcbd1a32f201030a
Summary:
I had to have two of those while I was refactoring away the old style of
doing get or fill, but now that that's gone, we can have a single one and clean
up.
Reviewed By: StanislavGlebik
Differential Revision: D25396201
fbshipit-source-id: 459e9ec7e44e8b349c585212c2758d64077e56d1
Summary:
All the code that needed it is basically gone. Might as well push the compat()
calls a little further down and be done with 0.1 futures here.
Reviewed By: StanislavGlebik
Differential Revision: D25396202
fbshipit-source-id: ae85f61c03cb2c38eabbaf0d45387f9d4422b336
Summary: All the callsites are gone now.
Reviewed By: StanislavGlebik
Differential Revision: D25396205
fbshipit-source-id: 74d0595c4528dc739d254f5dc950157e087b00dd
Summary:
Like it says in the title. The upshot here as well is a lot less cloning.
While in there, I removed the "caching" module since it basically only
contained a couple of things that were all needed in the sql module anyway.
Reviewed By: StanislavGlebik
Differential Revision: D25396208
fbshipit-source-id: aa6381c78f45a94fecd04544196180d2a918f97d
Summary: We need this in Phases, so let's add it there.
Reviewed By: StanislavGlebik
Differential Revision: D25396207
fbshipit-source-id: 34174f205028b95c9aa382c343b1344265391df2
Summary:
Like it says in the title. This reduces cloning. Unlike our cached mappings,
changesets have only one way of being queried, so other than cloning, it doesn't
make a huge difference.
Reviewed By: StanislavGlebik
Differential Revision: D25396206
fbshipit-source-id: 45d3ebd403142a3f1d9e3ba7de5de2bf18317165
Summary:
Like it says in the title. I need the new futures module in there later in this
stack so this makes it cleaner.
Reviewed By: StanislavGlebik
Differential Revision: D25396200
fbshipit-source-id: 0148003c83b3dd0da5142eb468cf3a6ae2f74b7a
Summary:
Like it says in the title, this updates bonsai_hg_mapping to a futures 0.3
implementation of get_or_fill.
The upshot is that this requires less cloning (in fact, no cloning at all if
the rest of the code was 0.3 here), and in this particular instance it also
lets us completely get rid of the `from_bonsai` flag we were threading through
the whole method and checking many times.
Reviewed By: StanislavGlebik
Differential Revision: D25396199
fbshipit-source-id: f8126c96aad8d982c3deb535530484bec841929f
Summary:
Like it says in the title. Let's update this to the new get_or_fill
implementation that uses 0.3 futures.
Reviewed By: StanislavGlebik
Differential Revision: D25396204
fbshipit-source-id: 06bf449a0d15bd19625acfdcbb4578948e57cde7
Summary:
Like it says in the title, this adds a futures 0.3 variant of
GetOrFillMultipleFromCacheLayers. However, I didn't just port the code as-is,
since the code as it stands wouldn't have been very idiomatic if I did.
Instead, I refactored this to be a function and a few traits. I've also kept
the old code for now, and I'll remove it once I've converted all the callsites.
The upshot of the proposed refactor here is that it should be easier to use
this without having to heavily duplicate the instantiation of the "cacher" in
places where we have multiple variants that are cached (e.g. mappings), all the
while being able to leverage the type checker. See D25334478 (13255301b0) for a discussion
on this. This new approach also makes it much easier to work with the tests for
this (you can just mutate the store and access its fields).
Reviewed By: StanislavGlebik
Differential Revision: D25396203
fbshipit-source-id: d706729a800faa4b12fcf5e608c6dee93c5a909e
Summary: Switching to specify derived data types other than hg explicitly on the command line
Reviewed By: farnz
Differential Revision: D25367323
fbshipit-source-id: 0e0aea1aab46b43b325486ed6161ea322f7cec4b
Summary: It's useful sometimes, like in the next diff.
Reviewed By: mitrandir77
Differential Revision: D25422597
fbshipit-source-id: 0ebb5dcc349bbaacac3dddf03f19e5e092042468
Summary:
This is useful data to see how much we are over the limit. I noticed
the missing data while resolving
https://fb.workplace.com/groups/scm/permalink/3450652774984318/ as the oncall.
Reviewed By: StanislavGlebik
Differential Revision: D25414427
fbshipit-source-id: ec4bbca9c21a4bf0e675ec1cd82e4e703cd88631
Summary:
Same as development branch. Without configuration changes, nothing changes for
the production codepath.
Reviewed By: quark-zju
Differential Revision: D25405026
fbshipit-source-id: aff705aa5f96814f1f1d7552454ab1d0c13afd92
Summary:
The end goal is to have clients using a sparse IdMap. There is still some work
to get there though. In the meantime we can test repositories that don't use
any revlogs. The current expectation for those repositories is that they have
a full idmap locally.
Reviewed By: quark-zju
Differential Revision: D25075341
fbshipit-source-id: 52ab881fc9c64d0d13944e9619c087e0d4fb547c
Summary:
The client dag cannot currently be instantiated with a sparse idmap (aka
universal commit idmap). It should be usable with a full idmap. To test
repositories that use segmented changelog exclusively we add the capability of
cloning the full idmap.
I currently see StreamCloneData as an experiment. I am open to suggestions
around what structure we should have for the regular long term clone endpoint.
That said, I am leaning towards converting clone_data to return
StreamCloneData. Overall, Segmented Changelog has a few knobs that influence
how big the IdMap ends up being so the code that is more flexible will be more
useful long term. To add to that, we transform data higher in the stack using
streaming, and since this data involves similar fetching, it seems that we should have a
streaming idmap exposed by clone_data.
Reviewed By: quark-zju
Differential Revision: D24966338
fbshipit-source-id: 019b363568e3191280bd5ac09fc15062711e5523
Summary:
All the bookmarks is *a lot* of bookmarks. Don't do them all at once. Also, add
some logging output so we can tell how far along we are.
Reviewed By: HarveyHunt
Differential Revision: D25397297
fbshipit-source-id: c19b99123f88e05e99bff61e2399a62d378a6671
Summary: Also added a TryShared future to futures_ext. The problem with regular Shared is that if you want to share an anyhow::Result, the Error part of it is not cloneable. This TryShared will work nicely when returning anyhow::Result, which most of our code does.
Reviewed By: aslpavel
Differential Revision: D25223317
fbshipit-source-id: cf21141701884317a87dc726478dcd7a5a820c73
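The core of why a TryShared variant is needed can be shown without the futures machinery: an error type that is not `Clone` (like `anyhow::Error`) makes the whole `Result` unshareable, and wrapping the error in an `Arc` restores `Clone`. `OpaqueError` and `make_shareable` below are stand-ins for illustration, not the futures_ext API.

```rust
use std::fmt;
use std::sync::Arc;

// Stand-in for anyhow::Error: carries a message and is *not* Clone.
#[derive(Debug)]
struct OpaqueError(String);

impl fmt::Display for OpaqueError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// Arc makes the error cheaply cloneable, so Result<T, SharedError> is
// Clone whenever T is; a TryShared future hands out exactly this shape.
#[derive(Clone, Debug)]
struct SharedError(Arc<OpaqueError>);

impl fmt::Display for SharedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

fn make_shareable<T>(r: Result<T, OpaqueError>) -> Result<T, SharedError> {
    r.map_err(|e| SharedError(Arc::new(e)))
}

fn main() {
    let failed: Result<u32, OpaqueError> = Err(OpaqueError("boom".to_string()));
    let shared = make_shareable(failed);
    // Every consumer can now clone the same failed result.
    let a = shared.clone();
    let b = shared;
    assert!(a.is_err());
    assert_eq!(b.unwrap_err().to_string(), "boom");
}
```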
Summary:
Like it says in the title. This is helpful to measure the number of SQL queries
we make. This required actually threading in a CoreContext, which we didn't
have before.
Reviewed By: StanislavGlebik
Differential Revision: D25336069
fbshipit-source-id: 35677c55550e95b5126de29c2a824b4eda32092c
Summary: Like it says in the title.
Reviewed By: StanislavGlebik
Differential Revision: D25336068
fbshipit-source-id: 113050215c28a28c820d938348a0a3e54c14c3ee
Summary:
Like it says in the title, this adds a caching layer around Globalrevs using
our existing `GetOrFillMultipleFromCacheLayers` abstraction.
Note: I've opted to not track dedicated metrics for this (compare to the hg
mapping to see them), since I don't believe we really ever look at them.
I'd like to do a little bit of refactoring of
`GetOrFillMultipleFromCacheLayers` to a) track them without having to ad-hoc
code it, b) convert it 0.3 futures, and c) require less ceremony to call it.
However, I'll do so in another diff.
Reviewed By: StanislavGlebik
Differential Revision: D25334478
fbshipit-source-id: 1385298b8fdf1cd081b6e509c6c3e03b3abbfa5b
Summary: This lib.rs is getting too big. Split it.
Reviewed By: StanislavGlebik
Differential Revision: D25333510
fbshipit-source-id: ea15664d2de21a24ee107162e030b7762b1d413e
Summary:
I'd like to add a caching variant for this. Might as well not have to rewrite
those methods on an ad-hoc basis.
Reviewed By: StanislavGlebik
Differential Revision: D25333461
fbshipit-source-id: 632c0307189fe15a926d808c1eeca1e3f240eb19
Summary: Like it says in the title.
Reviewed By: StanislavGlebik
Differential Revision: D25333450
fbshipit-source-id: 49ad4b1964a4dfd9f3e0108fa421d451ef905256
Summary:
This makes logs go through a `Drain` which queries `ObservabilityContext` (introduced in a previous diff) for the current logging level. ATM I did not add any tests; it would be pretty easy to add a unit test checking that the drain indeed respects the level, but it's so simple that I am not 100% convinced such a test would be all that valuable.
Note that currently `ObservabilityContext` is enabled in a `Static` variation.
Reviewed By: mitrandir77
Differential Revision: D25232400
fbshipit-source-id: 7499916e0a3ddab43538343e6ed215818517eaf7
Summary:
`ObservabilityContext` is a structure that helps logging facilities within Mononoke to make logging decisions. Specifically, the upcoming `DynamicLoggingDrain` and already existing `MononokeScubaSampleBuilder` will have this structure as a component and decide whether a particular logging statement (slog or scuba) should go ahead.
Internally, `ObservabilityContext` is a wrapper around data received from a [configerator endpoint](https://www.internalfb.com/intern/configerator/edit/?path=scm%2Fmononoke%2Fobservability%2Fobservability_config.cconf).
This diff makes a few unobvious decisions about how this is organized. My goals were:
1. to have production (i.e. reading from configerator), static (i.e. emulating current prod behavior) and test variants of `ObservabilityContext`
2. to avoid having consumers know which of these variants are used
3. to avoid making all consumers of `ObservabilityContext` necessarily generic
4. to avoid using dynamic dispatch of `ObservabilityContext`'s methods
Points 3 and 4 mean that `ObservabilityContext` cannot be a trait. `enum` is a common solution in such cases. However, if `ObservabilityContext` is an `enum`, consumers will know which variant they are using (because `enum` variants are public). So the solution is to use a private enum wrapped in a struct.
Reviewed By: mitrandir77
Differential Revision: D25287759
fbshipit-source-id: da034c71570137e8a8fb7749b1e4ad43be482f66
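The private-enum-in-a-struct pattern from the last paragraph, in miniature; the names and variants below are simplified stand-ins for the real `ObservabilityContext`.

```rust
// Callers get static dispatch (no trait objects, no generics) without
// ever seeing which variant they hold, because the enum is private.
pub struct ObservabilityContext {
    inner: Inner, // private: variants are not part of the API
}

enum Inner {
    Static(Level),
    // Prod(ConfigHandle) and Test(...) would live here in the real thing.
}

#[derive(Clone, Copy, PartialEq, PartialOrd, Debug)]
pub enum Level {
    Error = 0,
    Info = 1,
    Debug = 2,
}

impl ObservabilityContext {
    pub fn new_static(level: Level) -> Self {
        Self { inner: Inner::Static(level) }
    }

    // A plain method that matches internally: no dynamic dispatch,
    // and consumers don't need to be generic over the variant.
    pub fn should_log(&self, level: Level) -> bool {
        match &self.inner {
            Inner::Static(current) => level <= *current,
        }
    }
}

fn main() {
    let ctx = ObservabilityContext::new_static(Level::Info);
    assert!(ctx.should_log(Level::Error));
    assert!(ctx.should_log(Level::Info));
    assert!(!ctx.should_log(Level::Debug));
}
```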
Summary: Can just pass on the iterator
Reviewed By: ikostia
Differential Revision: D25216892
fbshipit-source-id: 79c08737477ac7ed1f824c50105d5977ee592126
Summary: It's now the default for this binary, so might as well shorten the test command lines
Reviewed By: ikostia
Differential Revision: D25219717
fbshipit-source-id: 8074145c6f05f26ab7fa18d2ff399482ad592885
Summary: Reduces boilerplate for binaries usually run in this mode, notably the walker
Reviewed By: ikostia
Differential Revision: D25216883
fbshipit-source-id: e31d2a6aec7da3baafd8bcf208cf79cc696752c0
Summary: This is useful to prevent accidentally consuming too much. Enabled it for the walker
Reviewed By: ikostia
Differential Revision: D25216880
fbshipit-source-id: e80f490d6ece40d64cc8609e7d6b80d0ecbb1671
Summary: Reduces boilerplate on the command line for binaries like the walker that want a different default
Reviewed By: krallin
Differential Revision: D25216876
fbshipit-source-id: 0df474568d28e0726be223e9dc0a760523063d21
Summary: A Darkisilon cell consists of multiple hosts that share underlying storage, so a write to one of them is visible to all hosts. Let's spread requests between all these hosts: I'll get the list of hosts from the SMC tier and randomly connect to one on each request.
Reviewed By: krallin
Differential Revision: D25163782
fbshipit-source-id: b28085dd37b15972469b7334a47def473e10f34e
Summary: This allows us to be more flexible in choosing authentication and expands variables used in configuration.
Reviewed By: singhsrb
Differential Revision: D25304008
fbshipit-source-id: 636893a9eaec31ca5acfa02f72931d5e56b695d0
Summary: convert blobrepo:blobrepo_common to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25245426
fbshipit-source-id: d3db0e417545b2b0043e48a536737586005ac4bb
Summary:
I'd like to experiment with splitting this into its own service. To do that, I
don't want to have to include a path, since it's only used for reporting an
error that will never occur (because for that service I'll be using the
"generate" variant of the filenode id). Let's just make it optional.
Reviewed By: lukaspiatkowski
Differential Revision: D25220901
fbshipit-source-id: 6d3cf70a63b077de18a7d43f3b65766b453c425e
Summary: Like it says in the title. Let's turn this into an async fn.
Reviewed By: lukaspiatkowski
Differential Revision: D25220902
fbshipit-source-id: b5de783adaca05919eb5cd6858c8b0aaf03ddfc2
Summary:
This returns a Result of a tuple, but:
- This never errs.
- Nothing ever reads the left side of the tuple.
So let's stop doing that.
Reviewed By: StanislavGlebik
Differential Revision: D25219887
fbshipit-source-id: f33dcf6f6e68cb17b40c4638470312afae0662e6
Summary:
Like it says in the title. We've had this comment about potentially using this
for a couple of years now. It seems a bit unlikely at this point that we're
going to use this, but currently it makes the code that provides uploading hg
entries more complex than it needs to be, so let's just get rid of this.
Reviewed By: lukaspiatkowski, StanislavGlebik
Differential Revision: D25219728
fbshipit-source-id: 61d254edef16d24a4c29f96f983f894863b5232a
Summary:
This modifies our object popularity mechanism to account for the size of the
objects being downloaded. Indeed, considering our bottleneck is bandwidth, we
forcing similar routing for 10 downloads of a 10MB object and 10 downloads of a
1GB object doesn't make that much sense.
This diffs updates our counting so that we now record the object size instead
of a count. I'll set up routing so that we disallow consistent routing when a
single object exceeds 250MiB/s of throughput ( = 1/4th of a task).
It's worth noting that this will be equivalent to what we have right now for
our most problematic objects (GraphQL schemas in Fbsource, 35M each), given
that we "unroute" at 150 requests / 20 seconds (`150 * 35 / 20 = 262`).
The key difference here is that this will work for all objects.
This does mean LFS needs to cache and know about content metadata. That's not
actually a big deal. Indeed, over a week, we serve 25K distinct objects
(https://fburl.com/scuba/mononoke_lfs/a2d26s1a), so considering content
metadata is a hundred bytes (and so is the key), we are looking at a few MBs of
cache space here.
As part of this, I've also refactored our config handling to stop duplicating
structures in Configerator and LFS by using the Thrift objects directly (we
still have a few dedicated structs when post-processing is necessary, but we
no longer have anything that deserializes straight from JSON).
Note that one further refinement here would be to consistently route but to
more tasks (i.e. return one of 2 routing tokens for an object that is being downloaded
at 500MiB/s). We'll see if we need that.
Reviewed By: HarveyHunt
Differential Revision: D24361314
fbshipit-source-id: 49e1f86cf49357f60f1eac298a753e0c1fcbdbe5
Summary:
D24761026 (caa684450f) formatted the default cachelib size with the specifier
`{:3}`. This specifier pads the left side of the string with spaces if there
are fewer than 3 digits.
Unfortunately, this means that attempting to parse the string into an `f64`
fails. Here's a minimal example:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=11a1e819f9919f7d02565cb8fa561b85
Remove the format specifier and instead call `.to_string()`.
Reviewed By: ahornby
Differential Revision: D25302079
fbshipit-source-id: 461dd628a312a967f6cf5958d2e5d51b72b0ffd8
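A minimal reproduction of the described bug and its fix, runnable as-is:

```rust
fn main() {
    // "{:3}" left-pads with spaces when the value has fewer than 3 digits...
    let padded = format!("{:3}", 8);
    assert_eq!(padded, "  8");

    // ...and f64 parsing does not accept leading whitespace, so this fails.
    assert!(padded.parse::<f64>().is_err());

    // The fix: plain to_string() produces no padding and parses fine.
    let plain = 8.to_string();
    assert_eq!(plain, "8");
    assert_eq!(plain.parse::<f64>().unwrap(), 8.0);
}
```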
Summary:
Right now this is the same as our Filestore threshold so we always attempt to
compress Filestore chunks even though they were designed to fit in cache.
Whoops.
Reviewed By: ikostia
Differential Revision: D25274080
fbshipit-source-id: f17b54710fc36ca7c11c74247d038bf73777f7f9
Summary:
Cachelib wasn't initialized so a production configuration would crash because
the cache pools wouldn't be found.
Reviewed By: singhsrb
Differential Revision: D25291880
fbshipit-source-id: 382db0efda072b11da587e863566e816bd5393ca
Summary: Convert `ChangesetFetcher` to new type futures
Reviewed By: StanislavGlebik
Differential Revision: D25244213
fbshipit-source-id: 4207386d81397a930a566db008019bb8f31bf602
Summary:
If skeleton manifests are not enabled for a repository, ignore the tunables
that delay casefolding checks until land-time.
Reviewed By: StanislavGlebik
Differential Revision: D25271607
fbshipit-source-id: dcaf9291da31d0f57b3b632888ed688ecd6cecda
Summary:
Create `BlobstoreMapping` as a trait with the common implementations for
derived data mappings that are stored in the blobstore.
Reviewed By: StanislavGlebik
Differential Revision: D25099915
fbshipit-source-id: 8a62fbb809918045336944c8cd3584b109811012
Summary:
Batch derivation was disabled as it caused problems for other derived data types.
This was because the default batch implementation was wrong: it attempted to derive
potentially linear stacks concurrently, which would cause O(n^2) derivations.
Fix the default implementation to be a naive sequential iteration, and re-enable
batch derivation for fsnodes and skeleton manifests.
Reviewed By: StanislavGlebik
Differential Revision: D25218195
fbshipit-source-id: 730555829f092cc36e1c81cf02c2b4962a3904da
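A toy model of why the old default was O(n^2) on a linear stack: deriving concurrently without the earlier results available means each changeset re-derives its whole ancestry, while the naive sequential loop derives each one exactly once. The functions below count derivation steps and are purely illustrative, not the real derivation code.

```rust
use std::collections::HashSet;

type ChangesetId = usize;

// stack[i]'s parent is stack[i-1]; returns the number of derivation steps.
// Sequential iteration: each item's parent was derived on the previous
// pass, so every changeset costs exactly one step.
fn derive_stack_sequentially(stack: &[ChangesetId]) -> usize {
    let mut derived: HashSet<ChangesetId> = HashSet::new();
    let mut steps = 0;
    for (i, cs) in stack.iter().enumerate() {
        if i > 0 {
            // Parent is already available from the previous iteration.
            assert!(derived.contains(&stack[i - 1]));
        }
        derived.insert(*cs);
        steps += 1;
    }
    steps
}

// Concurrent without shared intermediate results: item k must re-derive
// all k of its ancestors, so the total is 1 + 2 + ... + n.
fn derive_stack_concurrently(stack: &[ChangesetId]) -> usize {
    (1..=stack.len()).sum()
}

fn main() {
    let stack: Vec<ChangesetId> = (0..100).collect();
    assert_eq!(derive_stack_sequentially(&stack), 100); // O(n)
    assert_eq!(derive_stack_concurrently(&stack), 5050); // O(n^2)
}
```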
Summary:
On the second and subsequent linear stack in a batch, the parent fsnode may
have been derived in the previous iteration of the loop. Since we haven't
completed this batch yet, the mappings have not been stored, and so the
attempt to derive the parent will result in the parent being derived again,
repeating the work of this batch sequentially.
Apply the optimization used in skeleton manifests and fetch the parent fsnodes
out of the result we are accumulating.
Reviewed By: StanislavGlebik
Differential Revision: D25218194
fbshipit-source-id: 5cc49204b53984f8aa73542f1a794a6251eb2b2e
Summary:
Similar to fsnodes, allow skeleton manifests to be derived in parallel in large
batches by splitting the changesets into linear stacks with no internal
conflicts, and deriving each changeset in that batch in parallel.
Reviewed By: StanislavGlebik
Differential Revision: D25218196
fbshipit-source-id: e578de9ffd472e732abb1e2ef9cd19c073280cd4
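The stack-splitting step can be sketched as follows; the real implementation also checks for path conflicts within a stack, which this simplified version reduces to a parent-linkage check. The types are stand-ins, not the Mononoke ones.

```rust
type ChangesetId = u32;

struct Changeset {
    id: ChangesetId,
    parent: Option<ChangesetId>, // sole parent; merges start a new stack
}

// Split a topologically ordered batch into linear stacks: a changeset
// extends the current stack only if its parent is the stack's tip.
// Each resulting stack can then be derived as one parallel unit.
fn split_into_linear_stacks(batch: &[Changeset]) -> Vec<Vec<ChangesetId>> {
    let mut stacks: Vec<Vec<ChangesetId>> = Vec::new();
    for cs in batch {
        let extends_last = match (stacks.last(), cs.parent) {
            (Some(stack), Some(p)) => *stack.last().unwrap() == p,
            _ => false,
        };
        if extends_last {
            stacks.last_mut().unwrap().push(cs.id);
        } else {
            stacks.push(vec![cs.id]);
        }
    }
    stacks
}

fn main() {
    let batch = vec![
        Changeset { id: 1, parent: None },
        Changeset { id: 2, parent: Some(1) },
        Changeset { id: 3, parent: Some(2) },
        Changeset { id: 5, parent: Some(4) }, // parent outside: new stack
        Changeset { id: 6, parent: Some(5) },
    ];
    assert_eq!(split_into_linear_stacks(&batch), vec![vec![1, 2, 3], vec![5, 6]]);
}
```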
Summary:
The Mononoke SCS tests can be flaky as the server starts to accept SSL
connections some time before it is fully ready to accept requests, which can
result in request timeouts.
Use a simple request to test for the server being fully up and ready to
handle requests.
Reviewed By: krallin
Differential Revision: D25250245
fbshipit-source-id: b79106ed51e5163ebe5cd1db7b0deaab0035b9bc
Summary: Update the few test cases remaining on the older option to the newer option
Reviewed By: ikostia
Differential Revision: D25219710
fbshipit-source-id: 50af0dcac7ed980ec4c7180cf81e3c00ecc18b95
Summary: Remove this now that it is the walker default. Makes command lines shorter
Reviewed By: ikostia
Differential Revision: D25219551
fbshipit-source-id: bc5ad4237cad35218a0b4c54aa81eb20edb3f3e1
Summary:
This will reduce boilerplate command line for the walker, as most of the time we want to run it with readonly storage
Because the existing --readonly-storage flag can't take a value, this introduces a new --with-readonly-storage=<true|false> option
Reviewed By: krallin
Differential Revision: D25216871
fbshipit-source-id: e1b83b428a9c3787f48c18fd396d23ac95991b77
Summary:
Previously needed to pass in cachelib settings once to MononokeAppBuilder and once to parse_and_init_cachelib.
This change adds a MononokeClapApp and MononokeMatches that preserve the settings, preventing the need to pass them in twice (and avoiding possible inconsistency)
MononokeMatches uses MaybeOwned to hold the inner ArgMatches, which allows us to hold both the usual reference case from get_matches and an owned case for get_matches_from which is used in test cases.
Reviewed By: krallin
Differential Revision: D24788450
fbshipit-source-id: aad5fff2edda305177dcefa4b3a98ab99bc2d811
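The MaybeOwned idea is essentially a `Cow` without the `ToOwned`/`Clone` requirement: one type that can hold either a reference (the usual `get_matches` case) or an owned value (the `get_matches_from` case used in tests). A minimal sketch, not the real mononoke types:

```rust
use std::ops::Deref;

// Like Cow, but works for types that are not Clone/ToOwned: we never
// promote Borrowed to Owned, we only read through it.
enum MaybeOwned<'a, T> {
    Owned(T),
    Borrowed(&'a T),
}

impl<'a, T> Deref for MaybeOwned<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        match self {
            MaybeOwned::Owned(v) => v,
            MaybeOwned::Borrowed(v) => v,
        }
    }
}

fn main() {
    // An owned value (as a test harness would produce)...
    let owned: MaybeOwned<'_, String> = MaybeOwned::Owned("matches".to_string());

    // ...and a borrowed one (as production code holding a reference would).
    let backing = "matches".to_string();
    let borrowed: MaybeOwned<'_, String> = MaybeOwned::Borrowed(&backing);

    // Both cases read identically through Deref.
    assert_eq!(owned.as_str(), borrowed.as_str());
}
```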
Summary:
Show the cachelib cache_size default in --help usage so it's clear what you'll get if no command-line args are passed.
Because we need to convert from bytes to GiB, the lifetime of the help string isn't long enough for clap's default_value, which takes a reference, so use OnceCell to be able to pass a static reference.
Reviewed By: krallin
Differential Revision: D24761026
fbshipit-source-id: 81b5e7ceb832d5cb55cc9faef59c5e6432f7c4b0
Summary:
Move expected_item_size_byte into CachelibSettings; it seems like it should be there.
To enable its use, this also exposes a parse_and_init_cachelib method for callers whose defaults differ from the default cachelib settings.
Reviewed By: krallin
Differential Revision: D24761024
fbshipit-source-id: 440082ab77b5b9f879c99b8f764e71bec68d270e
Summary:
Introduce a new API type, `TreeAttributes`, corresponding to the existing type `WireTreeAttributesRequest`, which exposes which optional attributes are available for fetching. An `Option<TreeAttributes>` parameter is added to the `trees` API, and if set to `None`, the client will make a request with TreeAttributes::default().
The Python bindings accept a dictionary for the attributes parameter, and any fields present will overwrite the default settings from TreeAttributes::default(). Unrecognized attributes will be silently ignored.
Reviewed By: kulshrax
Differential Revision: D25041255
fbshipit-source-id: 5c581c20aac06eeb0428fff42bfd93f6aecbb629
Summary: Basic observability for how the segmented changelog update process is performing.
Reviewed By: krallin
Differential Revision: D25108739
fbshipit-source-id: b1f406eb0c862464b186f933d126e0f3a08144e4
Summary:
The update of the segmented changelog is lightweight enough that we can
consider all repositories sharing a common tailer process. With all
repositories sharing a single tailer, the maintenance burden will be lower.
Things that I am particularly unsure about are: tailer configuration setup and
tailer structure. With regards to setup, I am not sure if this is more or less
than what production servers do to instantiate. With regards to structure, I
think that it makes a lot of sense to have a function that takes a single repo
name as parameter but the configuration setup has an influence on the details.
I am also unsure how important it is to parallelize the instantiation of the
blobrepos.
Finally, it is worth mentioning that the SegmentedChangelogTailer waits for
`delay` after an update finishes rather than on a period. The benefit is that
we don't have large updates taking down a process because we schedule the same
large repo update too many times. The drawback is that scheduling gets messed
up over time and multiple repo updates can end up starting at the same time.
Reviewed By: farnz
Differential Revision: D25100839
fbshipit-source-id: 5fff9f87ba4dc44a17c4a7aaa715d0698b04f5c3
Summary: This makes it easier to enable all derived data for scrubbing
Reviewed By: ikostia
Differential Revision: D25188963
fbshipit-source-id: e9c981e33273d6b2eeadcce0d0a341b33e91e42d
Summary:
Show the options for specifying node and edge types in the --help output
These changes removed the last use of lazy_static in the walker, so updated TARGETS as well.
Reviewed By: krallin
Differential Revision: D25188964
fbshipit-source-id: c5ccb4f5a0f3be1b8cb7d51cd5f99236d60d3029
Summary:
It has a build() method, and later in the stack it will build a Mononoke-specific
type rather than the clap::App
Differential Revision: D25216827
fbshipit-source-id: 24a531856405a702e7fecf54d60be1ea3d2aa6e7
Summary:
At the moment if we try to sync a bookmark entry but from_cs_id of bookmark
entry doesn't match the value of the bookmark on hg servers then the sync will
fail.
Let's add an option that in the case of this mismatch sets from_cs_id to the
current value on hg servers.
Reviewed By: krallin
Differential Revision: D25242172
fbshipit-source-id: 91180fb86f25d10c9ba2b78d7aa18ed0a52d13a5
Summary: convert mercurial_derived_data to new type futures
Reviewed By: ahornby
Differential Revision: D25220329
fbshipit-source-id: c2532a12e915b315fe6eb72f122dbc37822bbb2a
Summary:
This diff prepares the Mononoke codebase for composition-based extendability of
`ScubaSampleBuilder`. Specifically, in the near future I will add:
- new methods for verbose scuba logging
- new data field (`ObservabilityContext`) to check if verbose logging should
be enabled or disabled
The higher-level goal here is to be able to enable/disable verbose Scuba
logging (either overall or for certain slices of logs, like for a certain
session id) in real time, without restarting Mononoke. To do so, I plan to
expose the aforementioned verbose logging methods, which will run a check
against the stored `ObservabilityContext` and make a decision of whether the
logging is enabled or not. `ObservabilityContext` will of course hide
implementation details from the renamed `ScubaSampleBuilderExt`, and just provide a yes/no
answer based on the current config and sample fields.
At the moment this should be a completely harmless change.
Reviewed By: krallin
Differential Revision: D25211089
fbshipit-source-id: ea03dda82fadb7fc91a2433e12e220582ede5fb8
Summary:
cross repo bookmark validation alarm fired a few times, and looks like it fired
because of the following:
1) find_bookmark_diff compared bookmarks and found an inconsistency for bookmark
BM which points to commit A in large repo. Next step is to check bookmark history
2) While find_bookmark_diff was running a new commit B was landed in a large repo
and was backsynced to the small repo, so BM now points to commit B.
3) check_large_bookmark_history is called and it fetches latest bookmark log entries, and
it gets entries for commit A and commit B. check_large_bookmark_history checks
if any of the fetched entries points to a commit in the small repo and if yes then
it also checks if this bookmark update happened not so long ago. And the
problem is in the way it checks the "not so long ago" part. It does so by
finding the time difference between latest bookmark update log entry and any
other bookmark update log entry.
Now, if time difference between these two log entries (for commit B and for
commit A) is more than max_delay_secs (which happens only
if commit rate is low e.g. during the weekends), then the alarm would fire
because the delay between latest bookmark update log entry (the one that moved
BM to commit B) and previous log entry (the one that moved BM to commit A) is too large.
This diff fixes this race by skipping the newest entries until we find a bookmark
update log entry which points to the large commit that find_bookmark_diff
returned.
Reviewed By: ikostia
Differential Revision: D25196760
fbshipit-source-id: dfa0dca0001b1c38759ec9f4f790cfa3197ae2cf
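The fix in miniature: rather than measuring the delay from the newest log entries (which may include moves that landed while find_bookmark_diff was running), first skip forward to the entry pointing at the commit that find_bookmark_diff reported, and reason about staleness from there. The types below are simplified stand-ins.

```rust
type ChangesetId = u32;

struct LogEntry {
    to_cs: ChangesetId,
    timestamp_secs: u64,
}

// Entries are newest-first; `reported` is the commit the diff check saw.
// Entries newer than the reported one are moves that raced with the
// validation and must not count toward the observed delay.
fn entries_since_reported(log: &[LogEntry], reported: ChangesetId) -> &[LogEntry] {
    let start = log
        .iter()
        .position(|e| e.to_cs == reported)
        .unwrap_or(log.len());
    &log[start..]
}

fn main() {
    let log = vec![
        LogEntry { to_cs: 30, timestamp_secs: 1_000_000 }, // landed during validation
        LogEntry { to_cs: 20, timestamp_secs: 400_000 },   // the reported commit
        LogEntry { to_cs: 10, timestamp_secs: 399_000 },
    ];
    let relevant = entries_since_reported(&log, 20);
    // The racing newest entry no longer inflates the observed delay.
    assert_eq!(relevant.len(), 2);
    assert_eq!(relevant[0].to_cs, 20);
}
```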
Summary: remove dependency on old futures from derived data filenodes
Reviewed By: ahornby
Differential Revision: D25218521
fbshipit-source-id: 4d7eaf42c3ba15ea09276a7f3567128d5216e814
Summary: `derived_data:derived_data` had already been almost converted; I've cleaned up some tests so it would be possible to completely remove the old futures dependency
Reviewed By: StanislavGlebik
Differential Revision: D25197406
fbshipit-source-id: 064439f42a15f715befc019e5350dda0a2975f2b
Summary:
- convert save_bonsai_changesets to new type futures
- `blobrepo:blobrepo` is free from old futures deps
Reviewed By: StanislavGlebik
Differential Revision: D25197060
fbshipit-source-id: 910bd3f9674094b56e1133d7799cefea56c84123
Summary: Add support for walking fastlog entries for directories so we can scrub and inspect them
Differential Revision: D25187551
fbshipit-source-id: 812f9fd82459ef49dcd781c318fbe5c398daad21
Summary: Add FastlogFile to walker so it can be inspected and scrubbed. Directory and batch components of fastlog covered in following diffs.
Differential Revision: D25187549
fbshipit-source-id: c046cbf2561cdbbc9563497119e34d1b09d0ebef
Summary: HG<->Mononoke traffic will go through Proxygen, and testing showed that Proxygen forces us to use 'Upgrade: websocket' and complete the WebSocket handshake. Adjust accordingly.
Reviewed By: ahornby
Differential Revision: D25197395
fbshipit-source-id: ca00ac31be92817c6f1a99d7d492469b17b46286
Summary:
This diff adds bundle combining to hg sync job. See motivation for doing that in D25168877 (cebde43d3f).
The main struct here is BookmarkLogEntryBatch. It holds a vector of the BookmarkUpdateLogEntry values that were combined (used mostly for logging), plus the parameters necessary for producing the bundle, notably the from/to changeset ids and the bookmark. It has a try_append method that decides whether it's possible to combine bundles or not.
Reviewed By: mitrandir77
Differential Revision: D25186110
fbshipit-source-id: 77ce91915f460db73d8a996efe415954eeea2476
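The try_append rule described above can be sketched as: an entry can be absorbed only if it moves the same bookmark and starts exactly where the batch currently ends. The field names below are guesses at the shape described in the summary, not the real Mononoke types.

```rust
type ChangesetId = u32;

#[derive(Clone, PartialEq, Debug)]
struct BookmarkName(String);

#[derive(Clone)]
struct BookmarkUpdateLogEntry {
    bookmark: BookmarkName,
    from_cs: Option<ChangesetId>,
    to_cs: Option<ChangesetId>,
}

struct BookmarkLogEntryBatch {
    entries: Vec<BookmarkUpdateLogEntry>, // kept mostly for logging
    bookmark: BookmarkName,
    from_cs: Option<ChangesetId>,
    to_cs: Option<ChangesetId>,
}

impl BookmarkLogEntryBatch {
    fn new(entry: BookmarkUpdateLogEntry) -> Self {
        Self {
            bookmark: entry.bookmark.clone(),
            from_cs: entry.from_cs,
            to_cs: entry.to_cs,
            entries: vec![entry],
        }
    }

    // Ok(()) if absorbed; Err(entry) hands the entry back so the caller
    // can start a new batch with it.
    fn try_append(
        &mut self,
        entry: BookmarkUpdateLogEntry,
    ) -> Result<(), BookmarkUpdateLogEntry> {
        if entry.bookmark == self.bookmark && entry.from_cs == self.to_cs {
            self.to_cs = entry.to_cs;
            self.entries.push(entry);
            Ok(())
        } else {
            Err(entry)
        }
    }
}

fn main() {
    let name = BookmarkName("master".to_string());
    let e1 = BookmarkUpdateLogEntry { bookmark: name.clone(), from_cs: Some(1), to_cs: Some(2) };
    let e2 = BookmarkUpdateLogEntry { bookmark: name.clone(), from_cs: Some(2), to_cs: Some(3) };
    let e3 = BookmarkUpdateLogEntry { bookmark: name.clone(), from_cs: Some(9), to_cs: Some(10) };

    let mut batch = BookmarkLogEntryBatch::new(e1);
    assert!(batch.try_append(e2).is_ok()); // contiguous: combined
    assert!(batch.try_append(e3).is_err()); // gap: cannot combine
    assert_eq!(batch.from_cs, Some(1));
    assert_eq!(batch.to_cs, Some(3));
    assert_eq!(batch.entries.len(), 2);
}
```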