Commit Graph

2102 Commits

Author SHA1 Message Date
Mark Juggurnauth-Thomas
64be2e8d87 derived_data_utils: take parameters by reference
Summary: Take the parameters to `derived_data_utils` and `derived_data_utils_unsafe` by reference.

Reviewed By: krallin

Differential Revision: D25371970

fbshipit-source-id: d260650c2398e33667e1bc5779fbabdff04f1f98
2020-12-14 09:24:57 -08:00
Mark Juggurnauth-Thomas
9e1b1448e6 derived_data: split BonsaiDerived trait
Summary:
The `BonsaiDerived` trait is split in two:

* The new `BonsaiDerivable` trait encapsulates the process of deriving the data, either
  a single item from its parents, or a batch.
* The `BonsaiDerived` trait is used only as an entry point for deriving with the default
  mapping and config.

This split will allow us to use `BonsaiDerivable` in batch backfilling with non-default
config, for example when backfilling a new version of a derived data type.
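
To make the split concrete, here is a heavily simplified, synchronous sketch. Only the two trait names come from the summary above; the method names, the `u64` changeset id, the `Config` type, and the toy `Depth` data type are all hypothetical and do not reflect the real signatures.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a changeset id.
type ChangesetId = u64;

/// Hypothetical derivation config (e.g. a derived data version).
#[derive(Clone, Copy)]
struct Config {
    version: u32,
}

/// Encapsulates *how* to derive: one item from its parents, or a batch.
trait BonsaiDerivable: Sized + Clone {
    fn derive_from_parents(cs: ChangesetId, parents: &[Self], config: Config) -> Self;

    /// Default batch implementation: derive each changeset in order,
    /// looking parents up among the values derived so far.
    fn batch_derive(
        stack: &[(ChangesetId, Vec<ChangesetId>)],
        config: Config,
    ) -> HashMap<ChangesetId, Self> {
        let mut out: HashMap<ChangesetId, Self> = HashMap::new();
        for (cs, parent_ids) in stack {
            let parents: Vec<Self> = parent_ids
                .iter()
                .filter_map(|p| out.get(p).cloned())
                .collect();
            out.insert(*cs, Self::derive_from_parents(*cs, &parents, config));
        }
        out
    }
}

/// Entry point only: derive with the default config.
trait BonsaiDerived: BonsaiDerivable {
    const DEFAULT_CONFIG: Config;

    fn derive(stack: &[(ChangesetId, Vec<ChangesetId>)]) -> HashMap<ChangesetId, Self> {
        Self::batch_derive(stack, Self::DEFAULT_CONFIG)
    }
}

/// Toy derived data: depth of the changeset, scaled by the config version.
#[derive(Clone, Debug, PartialEq)]
struct Depth(u32);

impl BonsaiDerivable for Depth {
    fn derive_from_parents(_cs: ChangesetId, parents: &[Self], config: Config) -> Self {
        let max_parent = parents.iter().map(|d| d.0).max().unwrap_or(0);
        Depth(max_parent + config.version)
    }
}

impl BonsaiDerived for Depth {
    const DEFAULT_CONFIG: Config = Config { version: 1 };
}

fn main() {
    let stack = vec![(1, vec![]), (2, vec![1]), (3, vec![2])];
    // Normal entry point: default config.
    let default = Depth::derive(&stack);
    // Backfilling a new version: bypass the entry point, pass a config.
    let backfill = Depth::batch_derive(&stack, Config { version: 2 });
    println!("{:?} {:?}", default[&3], backfill[&3]);
}
```

The point of the sketch is that a backfiller only needs `BonsaiDerivable` plus a config of its choosing, while ordinary callers keep using the `BonsaiDerived` entry point.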

Reviewed By: krallin

Differential Revision: D25371964

fbshipit-source-id: 5874836bc06c18db306ada947a690658bf89723c
2020-12-14 09:24:57 -08:00
Jan Mazur
64de2b3abf git_types: remove unused dependency on old futures
Summary: as in the title

Reviewed By: krallin

Differential Revision: D25531446

fbshipit-source-id: afa4858f8f18182f206234e69b242c20a2af6b2a
2020-12-14 06:39:01 -08:00
Pavel Aslanov
32585287f1 convert changeset creation to new type futures
Summary: Convert changeset creation to new type futures

Reviewed By: krallin

Differential Revision: D25430405

fbshipit-source-id: 64eb6dbc324846408e60c77e273c5d5edfd59318
2020-12-11 13:55:46 -08:00
Jan Mazur
47dd7719f7 mononoke_types: remove dependency on old_futures
Summary: Just removing unused dependencies

Reviewed By: ahornby

Differential Revision: D25492832

fbshipit-source-id: 5c9ace2c9a333a4d74239a5d7d8dfb9fbe1c772e
2020-12-11 06:54:28 -08:00
Alex Hornby
fec3ef773c mononoke: introduce RepoRequirement enum to MononokeAppBuilder
Summary: Allows a binary to specify whether the repo args are required on the command line and, if so, whether the requirement is OnlyOne or AtLeastOne.
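
As a rough illustration of the idea, a requirement enum could look like the sketch below. The variant names `OnlyOne` and `AtLeastOne` are taken from the summary; the `NotRequired` variant and the `check` method are hypothetical and only show how a binary's requirement might be validated against the number of repo arguments.

```rust
/// `OnlyOne` and `AtLeastOne` come from the commit summary;
/// `NotRequired` and `check` are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RepoRequirement {
    NotRequired,
    OnlyOne,
    AtLeastOne,
}

impl RepoRequirement {
    /// Returns Ok if `repo_count` repo arguments satisfy the requirement.
    fn check(self, repo_count: usize) -> Result<(), String> {
        match (self, repo_count) {
            (RepoRequirement::NotRequired, _) => Ok(()),
            (RepoRequirement::OnlyOne, 1) => Ok(()),
            (RepoRequirement::AtLeastOne, n) if n >= 1 => Ok(()),
            (req, n) => Err(format!("{:?} not satisfied by {} repo argument(s)", req, n)),
        }
    }
}

fn main() {
    // A binary that handles exactly one repo rejects two repo arguments.
    println!("{:?}", RepoRequirement::OnlyOne.check(2));
    // A multi-repo binary accepts them.
    println!("{:?}", RepoRequirement::AtLeastOne.check(2));
}
```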

Reviewed By: farnz

Differential Revision: D25422757

fbshipit-source-id: 44d27c954bd1e0fa38b2d44c1c3b2eac3e50bd0c
2020-12-11 04:30:02 -08:00
Thomas Orozco
6cf130f41f mononoke/mononoke_api: make Repo and its new_test methods available to dependents
Summary:
This is useful to e.g. write tests in things that use mononoke_api (such as
edenapi): the test mode isn't transitive across crates. This also requires
making Repo itself public, since callers might reasonably want to create one.

I've also updated a few of the accessor methods that were `pub(crate)` given
that what we had right now seemed like it was kinda random: some things were
`pub(crate)`, others were just `pub`.

Reviewed By: markbt

Differential Revision: D25467624

fbshipit-source-id: 2279d4196e8dc0e7e1729239710d900b351be816
2020-12-11 03:47:18 -08:00
Alex Hornby
6186481782 mononoke: factor out repo name and id resolution in cmdlib
Summary: Factor out functions in preparation for a change that uses them to optionally resolve multiple repos from cmdlib

Differential Revision: D25422754

fbshipit-source-id: e0bd33ae533b1450e7084d78bd1765148b71bc76
2020-12-11 01:36:06 -08:00
Alex Hornby
f2ca14b1bf mononoke: add hg derived data shortcut to walker node types
Summary: Could already specify "bonsai"; it is useful to be able to pass "hg" as well.

Reviewed By: farnz

Differential Revision: D25367322

fbshipit-source-id: aca6d22f98394af49e3d94d5fd533bc9a25a6869
2020-12-11 01:36:06 -08:00
Alex Hornby
946472a110 mononoke: make BlobConfig hashable
Summary:
This is useful for jobs running multiple repos as it can then open the blobstore as many times as there are storage configs rather than as many times as there are repos.

Used in a diff I'm working on to group repos by storage config in a HashMap when setting up the walker to scrub multiple repos from single process.
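
The grouping this enables can be sketched as follows. The `BlobConfig` enum here is a much-reduced, hypothetical stand-in (the real type has other variants and fields), and `group_by_storage` is an illustrative helper, not a function from the codebase.

```rust
use std::collections::HashMap;

/// Hypothetical, much-reduced stand-in for a storage config.
/// Deriving `Hash` and `Eq` is what lets it be used as a map key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum BlobConfig {
    Files { path: String },
    Sqlite { path: String },
}

/// Group repo names by storage config so that each distinct config
/// needs to be opened only once.
fn group_by_storage(repos: Vec<(String, BlobConfig)>) -> HashMap<BlobConfig, Vec<String>> {
    let mut groups: HashMap<BlobConfig, Vec<String>> = HashMap::new();
    for (name, config) in repos {
        groups.entry(config).or_default().push(name);
    }
    groups
}

fn main() {
    let shared = BlobConfig::Sqlite { path: "/data/shared".to_string() };
    let repos = vec![
        ("repo_a".to_string(), shared.clone()),
        ("repo_b".to_string(), shared),
        ("repo_c".to_string(), BlobConfig::Files { path: "/data/c".to_string() }),
    ];
    let groups = group_by_storage(repos);
    println!("{} blobstore(s) to open for 3 repos", groups.len());
}
```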

Reviewed By: farnz

Differential Revision: D25422758

fbshipit-source-id: 578799db63dcf0bce4a79fca9642651601f2deeb
2020-12-11 01:36:06 -08:00
Alex Hornby
22d77348cd mononoke: remove unnecessary static lifetime in walker constants
Summary: also makes ERROR_MSG a constant

Reviewed By: farnz

Differential Revision: D25422756

fbshipit-source-id: e2f2b9122e2b90c7cb07b7d64156055d55c8c653
2020-12-11 01:36:06 -08:00
Robin Håkanson
b1c416c56c Extend gitimport functionality
Summary:
The following features were added to gitimport, in both the library routine and the command-line version.

* Define your own parts of a git-library to import by implementing the `GitimportTarget` trait.
* Added a `GitimportTarget` implementation, `ImportMissingForCommit`, that will search for any missing reference in Mononoke for the specified git-commit and import them. Note that it will not import commits unreachable by the specified commit history.
* Added support to update the bonsai<->git commit mapping while importing commits.
* Commit import progress is now shown, making it a bit easier to estimate how big an import job is and how long it will take.
* Adding optional git-repo name. This is useful when using gitimport as a library to import missing commits from many repositories simultaneously.
* Email to author is now added in the author field.
* Committer information is now also exported.
* Optimized the blob-store import by checking if a blob already exists prior to importing it.
* Added brief functions to basic hash structs, this is to get only the first 4 bytes (8 hex chars) for easier human inspection and debugging.
* Added support to suppress the long ref->BonsaiID mapping (on by default to match old behavior).
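
For the "brief" bullet above, the idea of printing only the first 4 bytes (8 hex chars) of a hash can be sketched like this. The `Sha1` wrapper and the `brief` method name are hypothetical stand-ins; the actual hash structs are not shown in this summary.

```rust
/// Hypothetical 20-byte hash wrapper.
struct Sha1([u8; 20]);

impl Sha1 {
    /// Hex encoding of the first 4 bytes only, i.e. 8 hex characters,
    /// for easier human inspection and debugging.
    fn brief(&self) -> String {
        self.0[..4].iter().map(|b| format!("{:02x}", b)).collect()
    }
}

fn main() {
    let mut bytes = [0u8; 20];
    bytes[..4].copy_from_slice(&[0x6d, 0xc7, 0xf9, 0x77]);
    println!("{}", Sha1(bytes).brief());
}
```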

Reviewed By: krallin

Differential Revision: D25445974

fbshipit-source-id: 6dc7f977b61ceec1a95b5f3c38548ac8eddbea27
2020-12-10 17:26:22 -08:00
Michael Samoylenko
9009f86713 Add Debug/Copy/Clone to PoolSizeConfig
Summary: Makes it a bit easier to use.

Reviewed By: markbt

Differential Revision: D25439113

fbshipit-source-id: 2c5da338a7000573ac92435c8982f5adff71bf42
2020-12-10 12:05:16 -08:00
Thomas Orozco
b766407054 mononoke/caching_ext: unify our cache stats reporting
Summary:
I'd like to lower the bar to entry as much as possible to use caching in e.g.
mappings. One thing that's a little annoying to setup right now is stats. Let's
unify those.

This does mean the stats names for mappings that had them will change, but I
think that's probably OK here.

A few notes about the implementation:

- I made the stats objects singletons, because ultimately that *is* what stats
  are under the hood.
- I made a macro to define the singleton because that was otherwise really
  verbose.
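
A minimal sketch of the singleton-plus-macro idea, using only the standard library. The `CacheStats` struct, its counters, and the macro name are hypothetical; the real code reports to a stats library that is not reproduced here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stats object: a pair of process-wide counters.
struct CacheStats {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheStats {
    const fn new() -> Self {
        CacheStats {
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
        }
    }

    fn record(&self, hit: bool) {
        let counter = if hit { &self.hits } else { &self.misses };
        counter.fetch_add(1, Ordering::Relaxed);
    }
}

/// Defines a function returning a `'static` singleton, which hides the
/// otherwise repetitive boilerplate at each definition site.
macro_rules! define_stats_singleton {
    ($fn_name:ident) => {
        fn $fn_name() -> &'static CacheStats {
            static STATS: CacheStats = CacheStats::new();
            &STATS
        }
    };
}

define_stats_singleton!(globalrev_mapping_stats);
define_stats_singleton!(hg_mapping_stats);

fn main() {
    globalrev_mapping_stats().record(true);
    hg_mapping_stats().record(false);
    println!(
        "globalrev hits: {}, hg misses: {}",
        globalrev_mapping_stats().hits.load(Ordering::Relaxed),
        hg_mapping_stats().misses.load(Ordering::Relaxed)
    );
}
```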

Reviewed By: farnz

Differential Revision: D25461182

fbshipit-source-id: f30d0908c40d8e0010c546ec95a385a06d557757
2020-12-10 12:01:45 -08:00
Kostia Balytskyi
a76550859d admin: make rsync capable of working across repos
Summary:
This adds support for replacing things in the small repo with the things from
the large repo. This is useful when changing the bind.

This diff slightly changes how `rsync` is called: now `--source-csid`,
`--target-csid`, `--from-dir` and `--to-dir` are all specified before the
`copy`/`remove-excessive-files` subcommands.

There's still an ability to use this within a repo, as you can pass identical
source and target repos.

Reviewed By: StanislavGlebik

Differential Revision: D25087893

fbshipit-source-id: 6e5881f80d91ef4b794a967cf9f26dd3af7f56c9
2020-12-10 11:45:08 -08:00
Stanislau Hlebik
52e7d42458 mononoke: fix blobimport importing a 0 revision only
Summary:
We had a small bug in blobimport. If the Mononoke repo is empty and blobimport
tries to import a single revision whose rev number is 0, then it will
successfully import it but will report that no revisions were imported,
i.e. it will print "didn't import any revision" to stderr and won't update the
manifold latest imported revision marker.

That was because we didn't update max_rev_and_bcs_id if rev is equal to
max_rev. This diff fixes it.
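
The failure mode can be illustrated with a small reconstruction. This is not the blobimport code: the function, its `strict` flag, and the string changeset ids are hypothetical, and only the name `max_rev_and_bcs_id` and the "rev equal to max_rev" condition come from the summary.

```rust
/// Hypothetical reconstruction: track the highest imported revision and its
/// changeset id. `strict = true` mimics a comparison that skips the update
/// when `rev` is equal to `max_rev`.
fn latest_imported<'a>(revs: &[(u64, &'a str)], strict: bool) -> Option<(u64, &'a str)> {
    let mut max_rev = 0u64;
    let mut max_rev_and_bcs_id = None;
    for &(rev, bcs_id) in revs {
        let newer = if strict { rev > max_rev } else { rev >= max_rev };
        if newer {
            max_rev = rev;
            max_rev_and_bcs_id = Some((rev, bcs_id));
        }
    }
    max_rev_and_bcs_id
}

fn main() {
    let only_rev_zero = [(0u64, "bcs_0")];
    // Strict comparison: revision 0 is imported but never recorded.
    println!("{:?}", latest_imported(&only_rev_zero, true));
    // Inclusive comparison: revision 0 is recorded as the latest import.
    println!("{:?}", latest_imported(&only_rev_zero, false));
}
```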

Reviewed By: ahornby

Differential Revision: D25421164

fbshipit-source-id: 639ead0ac326a14051d3a4faba568ecb797857a2
2020-12-10 11:29:37 -08:00
Thomas Orozco
701c4116ab mononoke/caching_ext: limit concurrency of memcache operations
Summary: See D25396203 for discussion. 100 is better than nothing.

Reviewed By: StanislavGlebik

Differential Revision: D25460398

fbshipit-source-id: 608a2dca9c381c78daf0e7d9bcbd1a32f201030a
2020-12-10 10:24:58 -08:00
Thomas Orozco
01d5bd2c4e mononoke/caching_ext: rename CacheDispositionNew
Summary:
I had to have two of those while I was refactoring away the old style of
doing get or fill, but now that that's gone, we can have a single one and clean
up.

Reviewed By: StanislavGlebik

Differential Revision: D25396201

fbshipit-source-id: 459e9ec7e44e8b349c585212c2758d64077e56d1
2020-12-10 10:24:58 -08:00
Thomas Orozco
c844c0c74b mononoke/caching_ext: remove 0.1 futures
Summary:
All the code that needed it is basically gone. Might as well push the compat()
calls a little further down and be done with 0.1 futures here.

Reviewed By: StanislavGlebik

Differential Revision: D25396202

fbshipit-source-id: ae85f61c03cb2c38eabbaf0d45387f9d4422b336
2020-12-10 10:24:58 -08:00
Thomas Orozco
6427a8e10d mononoke/caching_ext: remove 0.1 GetOrFillMultipleFromCacheLayers
Summary: All the callsites are gone now.

Reviewed By: StanislavGlebik

Differential Revision: D25396205

fbshipit-source-id: 74d0595c4528dc739d254f5dc950157e087b00dd
2020-12-10 10:24:58 -08:00
Thomas Orozco
06b6e35a85 mononoke/phases: update to futures 0.3 get_or_fill
Summary:
Like it says in the title. The upshot here as well is a lot less cloning.

While in there, I removed the "caching" module since it basically only
contained a couple of things that were all needed in the sql module anyway.

Reviewed By: StanislavGlebik

Differential Revision: D25396208

fbshipit-source-id: aa6381c78f45a94fecd04544196180d2a918f97d
2020-12-10 10:24:58 -08:00
Thomas Orozco
bf8f4048e3 mononoke/caching_ext: add fill_cache to 0.3 variant
Summary: We need this in Phases, so let's add it there.

Reviewed By: StanislavGlebik

Differential Revision: D25396207

fbshipit-source-id: 34174f205028b95c9aa382c343b1344265391df2
2020-12-10 10:24:58 -08:00
Thomas Orozco
a56d1509d1 mononoke/changesets: update to futures 0.3 get_or_fill
Summary:
Like it says in the title. This reduces cloning.  Unlike our cached mappings,
changesets have only one way of being queried, so other than cloning, it doesn't
make a huge difference.

Reviewed By: StanislavGlebik

Differential Revision: D25396206

fbshipit-source-id: 45d3ebd403142a3f1d9e3ba7de5de2bf18317165
2020-12-10 10:24:58 -08:00
Thomas Orozco
dbf26aed73 mononoke/changesets: rename futures -> futures-old
Summary:
Like it says in the title. I need the new futures module in there later in this
stack so this makes it cleaner.

Reviewed By: StanislavGlebik

Differential Revision: D25396200

fbshipit-source-id: 0148003c83b3dd0da5142eb468cf3a6ae2f74b7a
2020-12-10 10:24:58 -08:00
Thomas Orozco
eb96570de3 mononoke/bonsai_hg_mapping: update to futures 0.3 get_or_fill
Summary:
Like it says in the title, this updates bonsai_hg_mapping to a futures 0.3
implementation of get_or_fill.

The upshot is that this requires less cloning (in fact, no cloning at all if
the rest of the code was 0.3 here), and in this particular instance it also
lets us completely get rid of the `from_bonsai` flag we were threading through
the whole method and checking many times.

Reviewed By: StanislavGlebik

Differential Revision: D25396199

fbshipit-source-id: f8126c96aad8d982c3deb535530484bec841929f
2020-12-10 10:24:58 -08:00
Thomas Orozco
3ef4906113 mononoke/bonsai_globalrev_mapping: update to 0.3 futures get_or_fill
Summary:
Like it says in the title. Let's update this to the new get_or_fill
implementation that uses 0.3 futures.

Reviewed By: StanislavGlebik

Differential Revision: D25396204

fbshipit-source-id: 06bf449a0d15bd19625acfdcbb4578948e57cde7
2020-12-10 10:24:58 -08:00
Thomas Orozco
0708585c16 mononoke/caching_ext: introduce a futures 0.3 GetOrFillMultipleFromCacheLayers
Summary:
Like it says in the title, this adds a futures 0.3 variant of
GetOrFillMultipleFromCacheLayers. However, I didn't just port the code as-is,
since the code as it stands wouldn't have been very idiomatic if I did.

Instead, I refactored this to be a function and a few traits. I've also kept
the old code for now, and I'll remove it once I've converted all the callsites.

The upshot of the proposed refactor here is that it should be easier to use
this without having to heavily duplicate the instantiation of the "cacher" in
places where we have multiple variants that are cached (e.g. mappings), all the
while being able to leverage the type checker. See D25334478 (13255301b0) for a discussion
on this. This new approach also makes it much easier to work with the tests for
this (you can just mutate the store and access its fields).
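
To give a feel for the "function and a few traits" shape, here is a synchronous, single-layer sketch. It is not the caching_ext API: the trait names `CacheLayer` and `BackingStore`, their methods, and the test types are all hypothetical, and the real implementation is async and handles multiple cache layers.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Hypothetical trait: something that can be consulted and filled as a cache.
trait CacheLayer<K, V> {
    fn get(&self, key: &K) -> Option<V>;
    fn fill(&mut self, key: K, value: V);
}

/// Hypothetical trait: the authoritative store queried on a cache miss.
trait BackingStore<K, V> {
    fn fetch(&mut self, keys: &HashSet<K>) -> HashMap<K, V>;
}

/// Look keys up in the cache, fetch the misses from the store in one call,
/// and fill the cache with whatever the store returned.
fn get_or_fill<K, V, C, S>(cache: &mut C, store: &mut S, keys: HashSet<K>) -> HashMap<K, V>
where
    K: Hash + Eq + Clone,
    V: Clone,
    C: CacheLayer<K, V>,
    S: BackingStore<K, V>,
{
    let mut found = HashMap::new();
    let mut missing = HashSet::new();
    for key in keys {
        match cache.get(&key) {
            Some(value) => {
                found.insert(key, value);
            }
            None => {
                missing.insert(key);
            }
        }
    }
    if !missing.is_empty() {
        for (key, value) in store.fetch(&missing) {
            cache.fill(key.clone(), value.clone());
            found.insert(key, value);
        }
    }
    found
}

/// In-memory cache whose fields a test can inspect or mutate directly.
#[derive(Default)]
struct MemCache {
    entries: HashMap<u32, String>,
}

impl CacheLayer<u32, String> for MemCache {
    fn get(&self, key: &u32) -> Option<String> {
        self.entries.get(key).cloned()
    }
    fn fill(&mut self, key: u32, value: String) {
        self.entries.insert(key, value);
    }
}

/// Store that counts how many keys it was asked for.
struct CountingStore {
    data: HashMap<u32, String>,
    fetched: usize,
}

impl BackingStore<u32, String> for CountingStore {
    fn fetch(&mut self, keys: &HashSet<u32>) -> HashMap<u32, String> {
        self.fetched += keys.len();
        keys.iter()
            .filter_map(|k| self.data.get(k).map(|v| (*k, v.clone())))
            .collect()
    }
}

fn main() {
    let mut cache = MemCache::default();
    let data = HashMap::from([(1, "a".to_string()), (2, "b".to_string())]);
    let mut store = CountingStore { data, fetched: 0 };
    get_or_fill(&mut cache, &mut store, HashSet::from([1, 2]));
    get_or_fill(&mut cache, &mut store, HashSet::from([1, 2]));
    println!("keys fetched from the store: {}", store.fetched);
}
```

Because the store and cache are plain values behind traits, a test can mutate them and read their fields directly, which is the testing benefit mentioned above.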

Reviewed By: StanislavGlebik

Differential Revision: D25396203

fbshipit-source-id: d706729a800faa4b12fcf5e608c6dee93c5a909e
2020-12-10 10:24:57 -08:00
Alex Hornby
8d997846e3 mononoke: remove fsnode from default walker params
Summary: Switching to specify derived data types other than hg explicitly on the command line

Reviewed By: farnz

Differential Revision: D25367323

fbshipit-source-id: 0e0aea1aab46b43b325486ed6161ea322f7cec4b
2020-12-10 05:28:45 -08:00
Thomas Orozco
c9568598fb mononoke/manifest: add Entry::map_leaf and Entry::map_tree
Summary: It's useful sometimes, like in the next diff.

Reviewed By: mitrandir77

Differential Revision: D25422597

fbshipit-source-id: 0ebb5dcc349bbaacac3dddf03f19e5e092042468
2020-12-10 03:07:45 -08:00
Saurabh Singh
0f465f7dc9 hooks: show the size of the large commit being blocked
Summary:
This is useful data to see how much we are over the limit. I noticed
the missing data while resolving
https://fb.workplace.com/groups/scm/permalink/3450652774984318/ as the oncall.

Reviewed By: StanislavGlebik

Differential Revision: D25414427

fbshipit-source-id: ec4bbca9c21a4bf0e675ec1cd82e4e703cd88631
2020-12-09 16:44:41 -08:00
Stefan Filip
e7be876b6f blobrepo: update factory to set SegmentedChangelog on the production setup
Summary:
Same as development branch. Without configuration changes, nothing changes for
the production codepath.

Reviewed By: quark-zju

Differential Revision: D25405026

fbshipit-source-id: aff705aa5f96814f1f1d7552454ab1d0c13afd92
2020-12-08 18:30:25 -08:00
Stefan Filip
5f6d1a2c61 edenapi: add full_idmap_clone endpoint
Summary:
The end goal is to have clients using a sparse IdMap. There is still some work
to get there though. In the meantime we can test repositories that don't use
any revlogs. The current expectations for those repositories are that they have
a full idmap locally.

Reviewed By: quark-zju

Differential Revision: D25075341

fbshipit-source-id: 52ab881fc9c64d0d13944e9619c087e0d4fb547c
2020-12-08 18:30:24 -08:00
Stefan Filip
3afaeb858a segmented_changelog: add SegmentedChangelog::full_idmap_clone_data
Summary:
The client dag cannot currently be instantiated with a sparse idmap (aka
universal commit idmap). It should be usable with a full idmap.  To test
repositories that use segmented changelog exclusively we add the capability of
cloning the full idmap.

I currently see StreamCloneData as an experiment. I am open to suggestions
around what structure we should have for the regular long term clone endpoint.
That said, I am leaning towards converting clone_data to return
StreamCloneData.  Overall, Segmented Changelog has a few knobs that influence
how big the IdMap ends up being so the code that is more flexible will be more
useful long term.  To add to that, we transform data higher in the stack using
streaming and this data does similar fetching, so it seems that we should have a
stream idmap exposed by clone_data.

Reviewed By: quark-zju

Differential Revision: D24966338

fbshipit-source-id: 019b363568e3191280bd5ac09fc15062711e5523
2020-12-08 18:30:24 -08:00
Stefan Filip
2193b84b43 autocargo: regen
Summary: Regen autocargo

Reviewed By: quark-zju

Differential Revision: D25409963

fbshipit-source-id: 7dbbe420aeb30248bf43d3a96a9aa6a47bb2b0be
2020-12-08 18:30:24 -08:00
Thomas Orozco
6b04c4d7d2 mononoke/warm_bookmarks_cache: don't initialize ALL bookmarks at once
Summary:
All the bookmarks is *a lot* of bookmarks. Don't do them all at once. Also, add
some logging output so we can tell how far along we are.

Reviewed By: HarveyHunt

Differential Revision: D25397297

fbshipit-source-id: c19b99123f88e05e99bff61e2399a62d378a6671
2020-12-08 08:30:56 -08:00
Lukas Piatkowski
00fe313eff mononoke/unbundle: get rid of futures 0.1
Summary: Also added a TryShared future to futures_ext. The problem with regular Shared is that if you want to share anyhow::Result the Error part of it is not cloneable. This TryShared will work nicely when returning anyhow::Result, which most of our code does.

Reviewed By: aslpavel

Differential Revision: D25223317

fbshipit-source-id: cf21141701884317a87dc726478dcd7a5a820c73
2020-12-07 20:41:26 -08:00
Thomas Orozco
16bac45a07 mononoke/bonsai_globalrev_mapping: set perf counters
Summary:
Like it says in the title. This is helpful to measure the number of SQL queries
we make. This required actually threading in a CoreContext, which we didn't
have before.

Reviewed By: StanislavGlebik

Differential Revision: D25336069

fbshipit-source-id: 35677c55550e95b5126de29c2a824b4eda32092c
2020-12-07 08:23:19 -08:00
Thomas Orozco
482ab2d2a6 eden/mononoke: allow turning on the bonsai / globalrev cache
Summary: Like it says in the title.

Reviewed By: StanislavGlebik

Differential Revision: D25336068

fbshipit-source-id: 113050215c28a28c820d938348a0a3e54c14c3ee
2020-12-07 08:23:19 -08:00
Thomas Orozco
13255301b0 mononoke/bonsai_globalrev_mapping: add caching
Summary:
Like it says in the title, this adds a caching layer around Globalrevs using
our existing `GetOrFillMultipleFromCacheLayers` abstraction.

Note: I've opted to not track dedicated metrics for this (compare to the hg
mapping to see them), since I don't believe we really ever look at them.

I'd like to do a little bit of refactoring of
`GetOrFillMultipleFromCacheLayers` to a) track them without having to ad-hoc
code it, b) convert it 0.3 futures, and c) require less ceremony to call it.
However, I'll do so in another diff.

Reviewed By: StanislavGlebik

Differential Revision: D25334478

fbshipit-source-id: 1385298b8fdf1cd081b6e509c6c3e03b3abbfa5b
2020-12-07 08:23:19 -08:00
Thomas Orozco
ec0bff0b82 mononoke/bonsai_globalrev_mapping: split out sql module
Summary: This lib.rs is getting too big. Split it.

Reviewed By: StanislavGlebik

Differential Revision: D25333510

fbshipit-source-id: ea15664d2de21a24ee107162e030b7762b1d413e
2020-12-07 08:23:19 -08:00
Thomas Orozco
00c3297c49 mononoke/bonsai_globalrev_mapping: make point queries default impl methods
Summary:
I'd like to add a caching variant for this. Might as well not have to rewrite
those methods on an ad-hoc basis.

Reviewed By: StanislavGlebik

Differential Revision: D25333461

fbshipit-source-id: 632c0307189fe15a926d808c1eeca1e3f240eb19
2020-12-07 08:23:18 -08:00
Thomas Orozco
1ec5537e9e mononoke/bonsai_globalrev_mapping: update to futures 0.3
Summary: Like it says in the title.

Reviewed By: StanislavGlebik

Differential Revision: D25333450

fbshipit-source-id: 49ad4b1964a4dfd9f3e0108fa421d451ef905256
2020-12-07 08:23:18 -08:00
Stanislau Hlebik
40b67f9d09 mononoke: do not use batched in derived_data_tailer
Reviewed By: ikostia

Differential Revision: D25366938

fbshipit-source-id: 9b10853b1fdf5ee281187067aa768fa52e6d3074
2020-12-07 02:51:45 -08:00
Kostia Balytskyi
3ed778a5ea observability: add dynamic level drain
Summary:
This makes logs go through a `Drain` which queries `ObservabilityContext` (introduced in a previous diff) for current logging level.  ATM I did not add any tests, and it's pretty easy to add a unit-test checking that the drain indeed respects the level, but it's so simple that I am not 100% convinced that test would be all that valuable.

Note that currently `ObservabilityContext` is enabled in a `Static` variation.

Reviewed By: mitrandir77

Differential Revision: D25232400

fbshipit-source-id: 7499916e0a3ddab43538343e6ed215818517eaf7
2020-12-04 14:30:29 -08:00
Kostia Balytskyi
c60fcebc12 observability: introduce ObservabilityContext
Summary:
`ObservabilityContext` is a structure that helps logging facilities within Mononoke to make logging decisions. Specifically, the upcoming `DynamicLoggingDrain` and already existing `MononokeScubaSampleBuilder` will have this structure as a component and decide whether a particular logging statement (slog or scuba) should go ahead.

Internally, `ObservabilityContext` is a wrapper around data received from a [configerator endpoint](https://www.internalfb.com/intern/configerator/edit/?path=scm%2Fmononoke%2Fobservability%2Fobservability_config.cconf).

This diff makes a few unobvious decisions about how this is organized. My goals were:
1. to have production (i.e. reading from configerator), static (i.e. emulating current prod behavior) and test variants of `ObservabilityContext`
1. to avoid having consumers know which of these variants are used
1. to avoid making all consumers of `ObservabilityContext` necessarily generic
1. to avoid using dynamic dispatch of `ObservabilityContext`'s methods

Points 3 and 4 mean that `ObservabilityContext` cannot be a trait. `enum` is a common solution in such cases. However, if `ObservabilityContext` is an `enum`, consumers will know which variant they are using (because `enum` variants are public). So the solution is to use a private enum wrapped in a struct.
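
The "private enum wrapped in a struct" pattern can be sketched as below. Only the `ObservabilityContext` name and the three kinds of variant (production, static, test) come from the summary; the constructor names, the numeric levels, the `should_log` method, and the `Arc<AtomicU8>` standing in for configerator data are hypothetical.

```rust
mod observability {
    use std::sync::atomic::{AtomicU8, Ordering};
    use std::sync::Arc;

    /// Private enum: code outside this module cannot name or match on
    /// the variants, so it cannot tell which one is in use.
    enum Inner {
        /// Stand-in for the production variant; the shared value plays
        /// the role of dynamically updated config.
        Dynamic(Arc<AtomicU8>),
        Static(u8),
        Test(u8),
    }

    /// Public, non-generic wrapper. Method calls are resolved with a
    /// `match`, not with dynamic dispatch.
    pub struct ObservabilityContext {
        inner: Inner,
    }

    impl ObservabilityContext {
        pub fn new_dynamic(shared_level: Arc<AtomicU8>) -> Self {
            Self { inner: Inner::Dynamic(shared_level) }
        }

        pub fn new_static(max_level: u8) -> Self {
            Self { inner: Inner::Static(max_level) }
        }

        pub fn new_test(max_level: u8) -> Self {
            Self { inner: Inner::Test(max_level) }
        }

        /// Hypothetical decision: log when `level` is at or below the
        /// currently configured maximum.
        pub fn should_log(&self, level: u8) -> bool {
            let max_level = match &self.inner {
                Inner::Dynamic(shared) => shared.load(Ordering::Relaxed),
                Inner::Static(max) | Inner::Test(max) => *max,
            };
            level <= max_level
        }
    }
}

use observability::ObservabilityContext;

fn main() {
    let ctx = ObservabilityContext::new_static(2);
    println!("{} {}", ctx.should_log(1), ctx.should_log(3));
    let test_ctx = ObservabilityContext::new_test(0);
    println!("{}", test_ctx.should_log(1));
}
```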

Reviewed By: mitrandir77

Differential Revision: D25287759

fbshipit-source-id: da034c71570137e8a8fb7749b1e4ad43be482f66
2020-12-04 14:30:29 -08:00
Alex Hornby
28d4471f75 mononoke: no need to collect walker iterators
Summary: Can just pass on the iterator

Reviewed By: ikostia

Differential Revision: D25216892

fbshipit-source-id: 79c08737477ac7ed1f824c50105d5977ee592126
2020-12-04 03:07:05 -08:00
Alex Hornby
c1a563a1c6 mononoke: remove --cachelib-only-blobstore from walker test cmdlines
Summary: It's now the default for this binary, might as well shorten the test command lines

Reviewed By: ikostia

Differential Revision: D25219717

fbshipit-source-id: 8074145c6f05f26ab7fa18d2ff399482ad592885
2020-12-04 03:07:05 -08:00
Alex Hornby
591363e1c4 mononoke: allow binaries to specify a default for cachelib-only-blobstore
Summary: Reduces boilerplate for binaries usually run in this mode, notably the walker

Reviewed By: ikostia

Differential Revision: D25216883

fbshipit-source-id: e31d2a6aec7da3baafd8bcf208cf79cc696752c0
2020-12-04 03:07:04 -08:00
Alex Hornby
54bda6537d mononoke: allow binaries to default a blobstore read qps
Summary: This is useful to prevent accidentally consuming too much.  Enabled it for the walker

Reviewed By: ikostia

Differential Revision: D25216880

fbshipit-source-id: e80f490d6ece40d64cc8609e7d6b80d0ecbb1671
2020-12-04 03:07:04 -08:00
Alex Hornby
f814075cee mononoke: allow binaries to default blobstore-cachelib-attempt-zstd option
Summary: Reduces boilerplate on the command line for binaries like the walker that want a different default

Reviewed By: krallin

Differential Revision: D25216876

fbshipit-source-id: 0df474568d28e0726be223e9dc0a760523063d21
2020-12-04 03:07:04 -08:00
Egor Tkachenko
8c5d77f0f8 Adding connection pool for darkstorm
Summary: A Darkisilon cell consists of multiple hosts which share underlying storage, so a write to one of them is visible to all hosts. Let's spread requests between all these hosts. I'll get the list of hosts from the smc tier and will randomly connect to one on each request.

Reviewed By: krallin

Differential Revision: D25163782

fbshipit-source-id: b28085dd37b15972469b7334a47def473e10f34e
2020-12-03 17:10:42 -08:00
Stefan Filip
1c310c516b derived_data: remove unused import
Summary: Fix the build.

Reviewed By: ikostia

Differential Revision: D25315064

fbshipit-source-id: 1f4da8b47eda7f9177ef0c2c6eddbdb374640472
2020-12-03 14:17:52 -08:00
Johan Schuijt-Li
d3224db357 use authentication abstraction for mononoke
Summary: This allows us to be more flexible in choosing authentication and expands variables used in configuration.

Reviewed By: singhsrb

Differential Revision: D25304008

fbshipit-source-id: 636893a9eaec31ca5acfa02f72931d5e56b695d0
2020-12-03 13:57:48 -08:00
Pavel Aslanov
70bfc4abd0 convert to new type futures
Summary: convert blobrepo:blobrepo_common to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25245426

fbshipit-source-id: d3db0e417545b2b0043e48a536737586005ac4bb
2020-12-03 07:15:04 -08:00
Thomas Orozco
15d3670d72 mononoke: UploadHgFileEntry: allow calling it without a path
Summary:
I'd like to experiment with splitting this into its own service. To do that, I
don't want to have to include a path, since it's only used for reporting an
error that will never occur (because for that service I'll be using the
"generate" variant of the filenode id). Let's just make it optional.

Reviewed By: lukaspiatkowski

Differential Revision: D25220901

fbshipit-source-id: 6d3cf70a63b077de18a7d43f3b65766b453c425e
2020-12-03 06:19:31 -08:00
Thomas Orozco
5765030c7e mononoke: asyncify UploadHgFileEntry
Summary: Like it says in the title. Let's turn this into an async fn.

Reviewed By: lukaspiatkowski

Differential Revision: D25220902

fbshipit-source-id: b5de783adaca05919eb5cd6858c8b0aaf03ddfc2
2020-12-03 06:19:31 -08:00
Thomas Orozco
bf4dc18009 mononoke: UploadHgFileEntry::upload: just return a plain future
Summary:
This returns a Result of a tuple, but:

- This never errs.
- Nothing ever reads the left side of the tuple.

So let's stop doing that.

Reviewed By: StanislavGlebik

Differential Revision: D25219887

fbshipit-source-id: f33dcf6f6e68cb17b40c4638470312afae0662e6
2020-12-03 06:19:31 -08:00
Thomas Orozco
679735fbdf mononoke/unbundle: stop collecting ContentBlobMeta given we never use it
Summary:
Like it says in the title. We've had this comment about potentially using this
for a couple of years now. It seems a bit unlikely at this point that we're
going to use this, but currently it makes the code that provides uploading hg
entries more complex than it needs to be, so let's just get rid of this.

Reviewed By: lukaspiatkowski, StanislavGlebik

Differential Revision: D25219728

fbshipit-source-id: 61d254edef16d24a4c29f96f983f894863b5232a
2020-12-03 06:19:31 -08:00
Thomas Orozco
d5097a2d8c mononoke/lfs_server: account for size in object popularity
Summary:
This modifies our object popularity mechanism to account for the size of the
objects being downloaded. Indeed, considering our bottleneck is bandwidth,
forcing similar routing for 10 downloads of a 10MB object and 10 downloads of a
1GB object doesn't make that much sense.

This diff updates our counting so that we now record the object size instead
of a count. I'll set up routing so that we disallow consistent routing when a
single object exceeds 250MiB/s of throughput ( = 1/4th of a task).

It's worth noting that this will be equivalent to what we have right now for
our most problematic objects (GraphQL schemas in Fbsource, 35M each), given
that we "unroute" at 150 requests / 20 seconds (`150 * 35 / 20 = 262`).
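
The arithmetic above can be checked with a short sketch. The helper names are hypothetical, MB and MiB are treated as interchangeable, and applying the same 20-second window to the size-based scheme is an assumption made only for this illustration.

```rust
/// Threshold from the summary above (250MiB/s); MB and MiB are treated
/// as interchangeable in this sketch.
const UNROUTE_THRESHOLD_MB_PER_S: f64 = 250.0;

/// Hypothetical helper: average throughput of one object over a window.
fn throughput_mb_per_s(requests: u64, object_size_mb: u64, window_secs: u64) -> f64 {
    (requests * object_size_mb) as f64 / window_secs as f64
}

/// Hypothetical helper: true when consistent routing should be disallowed.
fn exceeds_threshold(requests: u64, object_size_mb: u64, window_secs: u64) -> bool {
    throughput_mb_per_s(requests, object_size_mb, window_secs) > UNROUTE_THRESHOLD_MB_PER_S
}

fn main() {
    // The GraphQL schema case: 150 requests of a 35MB object in 20 seconds.
    println!("{} MB/s", throughput_mb_per_s(150, 35, 20));
    // 10 downloads of a 10MB object vs. 10 downloads of a 1GB object.
    println!("{}", exceeds_threshold(10, 10, 20));
    println!("{}", exceeds_threshold(10, 1024, 20));
}
```

Counting bytes instead of requests is what separates the last two cases: both are 10 requests, but only the 1GB object crosses the threshold.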

The key difference here is that this will work for all objects.

This does mean LFS needs to cache and know about content metadata. That's not
actually a big deal. Indeed, over a week, we serve 25K distinct objects
(https://fburl.com/scuba/mononoke_lfs/a2d26s1a), so considering content
metadata is a hundred bytes (and so is the key), we are looking at a few MBs of
cache space here.

As part of this, I've also refactored our config handling to stop duplicating
structures in Configerator and LFS by using the Thrift objects directly (we
still have a few dedicated structs when post-processing is necessary, but we
no longer have anything that deserializes straight from JSON).

Note that one further refinement here would be to consistently route but to
more tasks (i.e. return one of 2 routing tokens for an object that is being downloaded
at 500MiB/s). We'll see if we need that.

Reviewed By: HarveyHunt

Differential Revision: D24361314

fbshipit-source-id: 49e1f86cf49357f60f1eac298a753e0c1fcbdbe5
2020-12-03 06:17:06 -08:00
Harvey Hunt
3b7978c437 mononoke: cmdlib: Fix default cachelib size crash
Summary:
D24761026 (caa684450f) formatted the default cachelib size with the specifier
`{:3}`. This specifier pads the left side of the string with spaces if there
are fewer than 3 digits.

Unfortunately, this means that attempting to parse the string into an `f64`
fails. Here's a minimal example:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=11a1e819f9919f7d02565cb8fa561b85

Remove the format specifier and instead call `.to_string()`.
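
The same failure can be reproduced locally with the standard library alone. The function names and the value 20 are arbitrary choices for this sketch, not the actual default from cmdlib.

```rust
/// Formats with a minimum width of 3; numbers are right-aligned, so a
/// two-digit value gets a leading space.
fn padded_default(size: f64) -> String {
    format!("{:3}", size)
}

/// Formats without any width specifier.
fn plain_default(size: f64) -> String {
    size.to_string()
}

fn main() {
    let padded = padded_default(20.0);
    // Parsing an `f64` does not skip leading whitespace, so this is an Err.
    println!("{:?} -> {:?}", padded, padded.parse::<f64>());

    let plain = plain_default(20.0);
    println!("{:?} -> {:?}", plain, plain.parse::<f64>());
}
```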

Reviewed By: ahornby

Differential Revision: D25302079

fbshipit-source-id: 461dd628a312a967f6cf5958d2e5d51b72b0ffd8
2020-12-03 05:48:31 -08:00
Thomas Orozco
282d93d906 mononoke: up the cache zstd threshold a little bit
Summary:
Right now this is the same as our Filestore threshold so we always attempt to
compress Filestore chunks even though they were designed to fit in cache.

Whoops.

Reviewed By: ikostia

Differential Revision: D25274080

fbshipit-source-id: f17b54710fc36ca7c11c74247d038bf73777f7f9
2020-12-03 03:37:22 -08:00
Stefan Filip
5c7ecf402f segmented_changelog_tailer: initialize caching
Summary:
Cachelib wasn't initialized so a production configuration would crash because
the cache pools wouldn't be found.

Reviewed By: singhsrb

Differential Revision: D25291880

fbshipit-source-id: 382db0efda072b11da587e863566e816bd5393ca
2020-12-02 19:18:32 -08:00
Pavel Aslanov
337bab2744 convert to new type futures
Summary: Convert `ChangesetFetcher` to new type futures

Reviewed By: StanislavGlebik

Differential Revision: D25244213

fbshipit-source-id: 4207386d81397a930a566db008019bb8f31bf602
2020-12-02 15:40:12 -08:00
Mark Juggurnauth-Thomas
aea042fc29 skeleton_manifests: check case on upload if not enabled
Summary:
If skeleton manifests are not enabled for a repository, ignore the tunables
that delay casefolding checks until land-time.

Reviewed By: StanislavGlebik

Differential Revision: D25271607

fbshipit-source-id: dcaf9291da31d0f57b3b632888ed688ecd6cecda
2020-12-02 09:08:34 -08:00
Mark Juggurnauth-Thomas
8fe2b30e9a derived_data: commonise blobstore mapping implementations
Summary:
Create `BlobstoreMapping` as a trait with the common implementations for
derived data mappings that are stored in the blobstore.

Reviewed By: StanislavGlebik

Differential Revision: D25099915

fbshipit-source-id: 8a62fbb809918045336944c8cd3584b109811012
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
0f13c1283e derived_data: rename Mode to DeriveMode
Reviewed By: StanislavGlebik

Differential Revision: D25218193

fbshipit-source-id: 7a4ff3cb3e2b0e0c0f849b0de69a75626aeddf4c
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
6b0203d10f derived_data: use batch_derive for backfilling data
Summary:
Batch derivation was disabled as it caused problems for other derived data types.

This was because the default batch implementation was wrong: it attempted to derive
potentially linear stacks concurrently, which would cause O(n^2) derivations.

Fix the default implementation to be a naive sequential iteration, and re-enable
batch derivation for fsnodes and skeleton manifests.

Reviewed By: StanislavGlebik

Differential Revision: D25218195

fbshipit-source-id: 730555829f092cc36e1c81cf02c2b4962a3904da
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
c414b36a30 fsnodes: don't rederive parents fsnodes during batch derivation
Summary:
On the second and subsequent linear stack in a batch, the parent fsnode may
have been derived in the previous iteration of the loop.  Since we haven't
completed this batch yet, the mappings have not been stored, and so the
attempt to derive the parent will result in the parent being derived again,
repeating the work of this batch sequentially.

Apply the optimization used in skeleton manifests and fetch the parent fsnodes
out of the result we are accumulating.
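
A minimal sketch of that lookup order, with hypothetical names and plain maps standing in for the accumulated batch result and the stored mapping (neither is the real Mononoke type):

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the real identifier types.
type ChangesetId = u32;
type FsnodeId = u64;

/// Looks up a parent's fsnode, preferring results accumulated in the
/// current batch over the persisted mapping.
/// Returns the id plus whether it came from the in-progress batch.
fn parent_fsnode(
    parent: ChangesetId,
    accumulated: &HashMap<ChangesetId, FsnodeId>,
    stored: &HashMap<ChangesetId, FsnodeId>,
) -> Option<(FsnodeId, bool)> {
    if let Some(id) = accumulated.get(&parent) {
        // Derived earlier in this batch; the mapping is not stored yet.
        return Some((*id, true));
    }
    // Derived by some earlier, completed batch.
    stored.get(&parent).map(|id| (*id, false))
}
```

A `None` result would mean the parent is genuinely underived, which is the only case where falling back to a full derivation is needed.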

Reviewed By: StanislavGlebik

Differential Revision: D25218194

fbshipit-source-id: 5cc49204b53984f8aa73542f1a794a6251eb2b2e
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
509dcf3ccd skeleton_manifests: enable derivation of skeleton manifests in batches
Summary:
Similar to fsnodes, allow skeleton manifests to be derived in parallel in large
batches by splitting the changesets into linear stacks with no internal
conflicts, and deriving each changeset in that batch in parallel.
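
A simplified sketch of the splitting step, under the assumption that the input is ordered parents-first. It only checks linearity (a single parent equal to the previous changeset); the check for internal conflicts mentioned above is omitted, and `split_into_linear_stacks` is a hypothetical name:

```rust
/// Hypothetical stand-in for a changeset identifier.
type ChangesetId = u32;

/// Splits a parents-first list of (changeset, parents) into linear stacks:
/// a changeset extends the current stack only if its single parent is the
/// changeset immediately before it.
fn split_into_linear_stacks(
    commits: &[(ChangesetId, Vec<ChangesetId>)],
) -> Vec<Vec<ChangesetId>> {
    let mut stacks: Vec<Vec<ChangesetId>> = Vec::new();
    for (cs, parents) in commits {
        let extends_current = match stacks.last().and_then(|s| s.last()) {
            Some(prev) => parents.len() == 1 && parents[0] == *prev,
            None => false,
        };
        if extends_current {
            stacks.last_mut().expect("a current stack exists").push(*cs);
        } else {
            // Roots, merges and branch points all start a new stack.
            stacks.push(vec![*cs]);
        }
    }
    stacks
}
```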

Reviewed By: StanislavGlebik

Differential Revision: D25218196

fbshipit-source-id: e578de9ffd472e732abb1e2ef9cd19c073280cd4
2020-12-02 07:33:41 -08:00
Mark Juggurnauth-Thomas
2d01d4236d tests: wait for source control service to fully start
Summary:
The Mononoke SCS tests can be flaky as the server starts to accept SSL
connections some time before it is fully ready to accept requests, which can
result in request timeouts.

Use a simple request to test for the server being fully up and ready to
handle requests.

Reviewed By: krallin

Differential Revision: D25250245

fbshipit-source-id: b79106ed51e5163ebe5cd1db7b0deaab0035b9bc
2020-12-02 07:33:41 -08:00
Alex Hornby
3902c04261 mononoke: update test cases to use new --with-readonly-storage option
Summary: Update the few test cases remaining on the older option to use the newer option

Reviewed By: ikostia

Differential Revision: D25219710

fbshipit-source-id: 50af0dcac7ed980ec4c7180cf81e3c00ecc18b95
2020-12-02 07:27:24 -08:00
Alex Hornby
b458ae4217 mononoke: remove --readonly-storage from walker test cmdlines
Summary: Remove this now that it is the walker default. Makes command lines shorter

Reviewed By: ikostia

Differential Revision: D25219551

fbshipit-source-id: bc5ad4237cad35218a0b4c54aa81eb20edb3f3e1
2020-12-02 07:27:24 -08:00
Alex Hornby
99fb41c5bd mononoke: allow binaries to default readonly-storage option
Summary:
This will reduce boilerplate command line for the walker, as most of the time we want to run it with readonly storage

Because the existing --readonly-storage flag can't take a value this introduces a new --with-readonly-storage=<true|false> option
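
A minimal sketch of how a value-taking option can sit alongside the legacy flag with a per-binary default. The real binaries use clap; this hand-rolled parser and the name `readonly_storage` are assumptions for illustration only:

```rust
/// Resolves the readonly-storage setting from command-line arguments.
/// `binary_default` is what the binary wants when neither option is given.
fn readonly_storage(args: &[&str], binary_default: bool) -> Result<bool, String> {
    let mut value = binary_default;
    for arg in args {
        if *arg == "--readonly-storage" {
            // Legacy flag: takes no value, so it can only switch the setting on.
            value = true;
        } else if let Some(v) = arg.strip_prefix("--with-readonly-storage=") {
            // New option: accepts an explicit true or false.
            value = v
                .parse::<bool>()
                .map_err(|_| format!("expected true or false, got {:?}", v))?;
        }
    }
    Ok(value)
}
```

With a default of `true`, passing `--with-readonly-storage=false` is the only way to turn the setting off, which the valueless flag could not express.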

Reviewed By: krallin

Differential Revision: D25216871

fbshipit-source-id: e1b83b428a9c3787f48c18fd396d23ac95991b77
2020-12-02 07:27:23 -08:00
Alex Hornby
935a7ddfc8 mononoke: remove the need to pass in cachelib settings twice
Summary:
Previously needed to pass in cachelib settings once to MononokeAppBuilder and once to parse_and_init_cachelib.

This change adds a MononokeClapApp and MononokeMatches that preserve the settings, thus preventing the need to pass them in twice (and thus avoiding possible inconsistency)

MononokeMatches uses MaybeOwned to hold the inner ArgMatches, which allows us to hold both the usual reference case from get_matches and an owned case for get_matches_from which is used in test cases.
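
A minimal sketch of the borrowed-or-owned idea. The enum below is written out by hand rather than taken from a crate, and `Matches`, `Settings` and `MatchesWithSettings` are hypothetical stand-ins for clap's `ArgMatches`, the cachelib settings and `MononokeMatches`:

```rust
use std::ops::Deref;

/// Holds either a borrowed or an owned value behind one type.
enum MaybeOwned<'a, T> {
    Borrowed(&'a T),
    Owned(T),
}

impl<'a, T> Deref for MaybeOwned<'a, T> {
    type Target = T;

    fn deref(&self) -> &T {
        match self {
            MaybeOwned::Borrowed(r) => *r,
            MaybeOwned::Owned(v) => v,
        }
    }
}

/// Hypothetical stand-in for parsed command-line matches.
struct Matches {
    args: Vec<String>,
}

/// Hypothetical stand-in for the cachelib settings.
struct Settings {
    cache_size_gib: u64,
}

/// Sketch of a wrapper that keeps the settings next to the parsed matches,
/// so callers do not have to pass the settings a second time.
struct MatchesWithSettings<'a> {
    matches: MaybeOwned<'a, Matches>,
    settings: Settings,
}
```

Because of the `Deref` impl, code reading `wrapper.matches.args` does not need to know which of the two cases it holds.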

Reviewed By: krallin

Differential Revision: D24788450

fbshipit-source-id: aad5fff2edda305177dcefa4b3a98ab99bc2d811
2020-12-02 07:27:23 -08:00
Alex Hornby
31bcf94df7 mononoke: set a default cache_size in the walker
Summary: Shorten command lines by setting a default in code.

Reviewed By: ikostia

Differential Revision: D24761025

fbshipit-source-id: 13deb1622ee1b97135ee787f6b6ffeed2f05813b
2020-12-02 07:27:23 -08:00
Alex Hornby
caa684450f mononoke: show cachelib cache_size in --help usage
Summary:
Show cachelib cache_size default in --help usage so it's clear what you'll get if no command line args are passed

Because we need to convert from bytes to GiB, the lifetime of the help string isn't long enough for clap's reference receiving default_value, so use OnceCell to be able to pass a static reference.
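
A minimal sketch of the technique, using the standard library's `std::sync::OnceLock` in place of the `OnceCell` named above. The constant and function name are hypothetical, and the 20 GiB value is only an example:

```rust
use std::sync::OnceLock;

/// Hypothetical default, expressed in bytes (20 GiB).
const DEFAULT_CACHE_SIZE_BYTES: u64 = 20 * 1024 * 1024 * 1024;

/// Returns the default cache size in GiB as a `&'static str`.
/// The string is formatted on first use and then lives for the rest of
/// the program, so it can be handed to an API that wants a long-lived `&str`.
fn default_cache_size_gib() -> &'static str {
    static HELP_DEFAULT: OnceLock<String> = OnceLock::new();
    HELP_DEFAULT
        .get_or_init(|| (DEFAULT_CACHE_SIZE_BYTES / (1024 * 1024 * 1024)).to_string())
        .as_str()
}
```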

Reviewed By: krallin

Differential Revision: D24761026

fbshipit-source-id: 81b5e7ceb832d5cb55cc9faef59c5e6432f7c4b0
2020-12-02 07:27:23 -08:00
Alex Hornby
f077f69408 mononoke: move expected_item_size_byte into CachelibSettings
Summary:
Move expected_item_size_byte into CachelibSettings, seems like it should be there.

To enable its use, this also exposes a parse_and_init_cachelib method for callers whose defaults differ from the default cachelib settings.

Reviewed By: krallin

Differential Revision: D24761024

fbshipit-source-id: 440082ab77b5b9f879c99b8f764e71bec68d270e
2020-12-02 00:47:22 -08:00
Meyer Jacobs
293053b774 edenapi: expose "attributes" parameter in tree request API
Summary:
Introduce a new API type, `TreeAttributes`, corresponding to the existing type `WireTreeAttributesRequest`, which exposes which optional attributes are available for fetching. An `Option<TreeAttributes>` parameter is added to the `trees` API, and if set to `None`, the client will make a request with TreeAttributes::default().

The Python bindings accept a dictionary for the attributes parameter, and any fields present will overwrite the default settings from TreeAttributes::default(). Unrecognized attributes will be silently ignored.
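
A minimal sketch of the override behaviour described above. The field names and default values of `TreeAttributes` here are assumptions for illustration, and `attributes_from_dict` is a hypothetical helper rather than the actual binding code:

```rust
use std::collections::HashMap;

/// Which optional tree attributes to fetch (field names are assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TreeAttributes {
    manifest_blob: bool,
    parents: bool,
    child_metadata: bool,
}

impl Default for TreeAttributes {
    fn default() -> Self {
        // Assumed defaults, chosen only for this example.
        TreeAttributes {
            manifest_blob: true,
            parents: true,
            child_metadata: false,
        }
    }
}

/// Builds the attributes from an optional dictionary of overrides.
/// `None` yields the defaults; unrecognized keys are silently ignored.
fn attributes_from_dict(dict: Option<&HashMap<&str, bool>>) -> TreeAttributes {
    let mut attrs = TreeAttributes::default();
    if let Some(dict) = dict {
        for (key, value) in dict {
            match *key {
                "manifest_blob" => attrs.manifest_blob = *value,
                "parents" => attrs.parents = *value,
                "child_metadata" => attrs.child_metadata = *value,
                _ => {} // unrecognized attribute: ignored
            }
        }
    }
    attrs
}
```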

Reviewed By: kulshrax

Differential Revision: D25041255

fbshipit-source-id: 5c581c20aac06eeb0428fff42bfd93f6aecbb629
2020-12-01 19:07:25 -08:00
Stefan Filip
4b9dc9074f segmented_changelog: measure runs/failures/duration for updates
Summary: Basic observability for how the segmented changelog update process is performing.

Reviewed By: krallin

Differential Revision: D25108739

fbshipit-source-id: b1f406eb0c862464b186f933d126e0f3a08144e4
2020-12-01 17:29:23 -08:00
Stefan Filip
b2aac949cf cmds: update segmented-changelog-tailer to run on a list of repos
Summary:
The update of the segmented changelog is light weight enough that we can
consider all repositories sharing a common tailer process. With all
repositories sharing a single tailer the maintenance burden will be lower.

Things that I am particularly unsure about are: tailer configuration setup and
tailer structure. With regards to setup, I am not sure if this is more or less
than what production servers do to instantiate. With regards to structure, I
think that it makes a lot of sense to have a function that takes a single repo
name as parameter but the configuration setup has an influence on the details.
I am also unsure how important it is to parallelize the instantiation of the
blobrepos.

Finally, it is worth mentioning that the SegmentedChangelogTailer waits for
`delay` after an update finishes rather than on a period. The benefit is that
we don't have large updates taking down a process because we schedule the same
large repo update too many times. The drawback is that scheduling gets messed
up over time and multiple repo updates can end up starting at the same time.
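
A minimal sketch of the two scheduling policies compared above, as pure functions over abstract time units. Both function names are hypothetical:

```rust
/// Start times when each run begins `delay` after the previous run finishes.
/// `durations[i]` is how long run `i` takes.
fn starts_with_delay(durations: &[u64], delay: u64) -> Vec<u64> {
    let mut starts = Vec::new();
    let mut now = 0u64;
    for d in durations {
        starts.push(now);
        // The next run waits for this one to finish, then sleeps for `delay`.
        now += d + delay;
    }
    starts
}

/// Start times when runs are launched on a fixed period, regardless of
/// whether the previous run has finished.
fn starts_with_period(runs: usize, period: u64) -> Vec<u64> {
    (0..runs as u64).map(|i| i * period).collect()
}
```

With durations of 5, 50 and 5 and a delay of 10, the delay policy starts runs at 0, 15 and 75, so a slow update never overlaps the next one, but start times drift. A fixed period of 10 would start the third run at 20 while the second is still in progress.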

Reviewed By: farnz

Differential Revision: D25100839

fbshipit-source-id: 5fff9f87ba4dc44a17c4a7aaa715d0698b04f5c3
2020-12-01 17:29:23 -08:00
Alex Hornby
dac5f36baa mononoke: add option to include all the derived types that are enabled for a repo
Summary: This makes it easier to enable all derived data for scrubbing

Reviewed By: ikostia

Differential Revision: D25188963

fbshipit-source-id: e9c981e33273d6b2eeadcce0d0a341b33e91e42d
2020-12-01 11:45:00 -08:00
Alex Hornby
14aa6d37f2 mononoke: Show possible Node and Edge type args in the scrub help
Summary:
Show the options for specifying node and edge types in the --help output

These changes removed the last use of lazy_static in the walker, so updated TARGETS as well.

Reviewed By: krallin

Differential Revision: D25188964

fbshipit-source-id: c5ccb4f5a0f3be1b8cb7d51cd5f99236d60d3029
2020-12-01 11:45:00 -08:00
Alex Hornby
a2247dc41c mononoke: rename MononokeApp to MononokeAppBuilder
Summary:
It has a build() method and later in stack it will build a mononoke
specific type rather than the clap::App

Differential Revision: D25216827

fbshipit-source-id: 24a531856405a702e7fecf54d60be1ea3d2aa6e7
2020-12-01 11:45:00 -08:00
Lukas Piatkowski
25a93e6a09 arc/bucklint/non-fbcode-build: cover Mononoke and its dependencies with oss build
Summary: This is step #5 of T78216315

Reviewed By: farnz

Differential Revision: D24506457

fbshipit-source-id: 1417693d3bbc2c7316d63149b945d522a5a8efb2
2020-12-01 09:43:12 -08:00
Stanislau Hlebik
561ff43310 mononoke: add --use-hg-server-bookmark-value-if-mismatch option to hg sync job
Summary:
At the moment if we try to sync a bookmark entry but from_cs_id of bookmark
entry doesn't match the value of the bookmark on hg servers then the sync will
fail.
Let's add an option that in the case of this mismatch sets from_cs_id to the
current value on hg servers.
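
A minimal sketch of that decision, with hypothetical names (`resolve_from_cs_id`, `ChangesetId`) and a plain integer standing in for the changeset id:

```rust
/// Hypothetical stand-in for a changeset identifier.
type ChangesetId = u32;

/// Decides which `from_cs_id` to use when syncing a bookmark entry.
/// `entry_from` is the value recorded in the log entry, `server_value`
/// is what the hg servers currently have for the bookmark.
fn resolve_from_cs_id(
    entry_from: Option<ChangesetId>,
    server_value: Option<ChangesetId>,
    use_server_value_if_mismatch: bool,
) -> Result<Option<ChangesetId>, String> {
    if entry_from == server_value {
        Ok(entry_from)
    } else if use_server_value_if_mismatch {
        // Option enabled: trust the hg servers and continue the sync.
        Ok(server_value)
    } else {
        // Default behaviour: a mismatch fails the sync.
        Err(format!(
            "bookmark mismatch: entry has {:?}, hg servers have {:?}",
            entry_from, server_value
        ))
    }
}
```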

Reviewed By: krallin

Differential Revision: D25242172

fbshipit-source-id: 91180fb86f25d10c9ba2b78d7aa18ed0a52d13a5
2020-12-01 05:58:23 -08:00
Pavel Aslanov
6e93ed73f9 convert to new type futures
Summary: convert mercurial_derived_data to new type futures

Reviewed By: ahornby

Differential Revision: D25220329

fbshipit-source-id: c2532a12e915b315fe6eb72f122dbc37822bbb2a
2020-12-01 03:03:45 -08:00
Kostia Balytskyi
e4dab84619 scuba: turn ScubaSampleBuilderExt into a wrapper struct
Summary:
This diff prepares the Mononoke codebase for composition-based extendability of
`ScubaSampleBuilder`. Specifically, in the near future I will add:
- new methods for verbose scuba logging
- new data field (`ObservabilityContext`) to check if verbose logging should
  be enabled or disabled

The higher-level goal here is to be able to enable/disable verbose Scuba
logging (either overall or for certain slices of logs, like for a certain
session id) in real time, without restarting Mononoke. To do so, I plan to
expose the aforementioned verbose logging methods, which will run a check
against the stored `ObservabilityContext` and make a decision of whether the
logging is enabled or not. `ObservabilityContext` will of course hide
implementation details from the renamed `ScubaSampleBuilderExt`, and just provide a yes/no
answer based on the current config and sample fields.

At the moment this should be a completely harmless change.

Reviewed By: krallin

Differential Revision: D25211089

fbshipit-source-id: ea03dda82fadb7fc91a2433e12e220582ede5fb8
2020-11-30 21:26:24 -08:00
Stanislau Hlebik
0d27cac271 mononoke: fix cross-repo bookmark validation alarm
Summary:
cross repo bookmark validation alarm fired a few times, and looks like it fired
because of the following:

1) find_bookmark_diff compared bookmarks and found an inconsistency for bookmark
BM which points to commit A in large repo. Next step is to check bookmark history
2) While find_bookmark_diff was running a new commit B was landed in a large repo
and was backsynced to the small repo, so BM now points to commit B.
3) check_large_bookmark_history is called and it fetches latest bookmark log entries, and
it gets entries for commit A and commit B. check_large_bookmark_history checks
if any of the fetched entries points to a commit in the small repo and if yes then
it also checks if this bookmark update happened not so long ago. And the
problem is in the way it checks the "not so long ago" part. It does so by
finding the time difference between latest bookmark update log entry and any
other bookmark update log entry.

Now, if time difference between these two log entries (for commit B and for
commit A) is more than max_delay_secs (which happens only
if commit rate is low e.g. during the weekends), then the alarm would fire
because the delay between latest bookmark update log entry (the one that moved
BM to commit B) and previous log entry (the one that moved BM to commit A) is too large.

This diff fixes this race by skipping newest entries until we find a bookmark
update log entry which points to the large commit that find_bookmark_diff
returned.
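
A minimal sketch of the skipping step, assuming the log entries are ordered newest-first. `LogEntry` and `entries_from_reported_commit` are hypothetical names, not the real types:

```rust
/// Hypothetical bookmark update log entry.
struct LogEntry {
    /// Large-repo commit the bookmark was moved to.
    to_cs: u32,
    /// When the move happened, in seconds.
    timestamp_secs: u64,
}

/// `entries` is ordered newest-first. Returns the entries starting at the
/// one that moved the bookmark to `large_cs`, skipping anything newer that
/// landed while the diff was being computed. Empty if no entry matches.
fn entries_from_reported_commit(entries: &[LogEntry], large_cs: u32) -> &[LogEntry] {
    match entries.iter().position(|e| e.to_cs == large_cs) {
        Some(idx) => &entries[idx..],
        None => &[],
    }
}
```

The "not so long ago" check can then measure time from the first returned entry (commit A in the scenario above) instead of from an unrelated newer entry for commit B.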

Reviewed By: ikostia

Differential Revision: D25196760

fbshipit-source-id: dfa0dca0001b1c38759ec9f4f790cfa3197ae2cf
2020-11-30 14:06:31 -08:00
Pavel Aslanov
a1d0ce7ef7 remove dependency on old futures
Summary: remove dependency on old futures from derived data filenodes

Reviewed By: ahornby

Differential Revision: D25218521

fbshipit-source-id: 4d7eaf42c3ba15ea09276a7f3567128d5216e814
2020-11-30 12:00:22 -08:00
Pavel Aslanov
4f72c1eadf use new type futures in tests
Summary: Old futures removed from unodes tests

Reviewed By: ahornby

Differential Revision: D25197780

fbshipit-source-id: f5b3acaf6556515994495ccf443e9cb204b4573e
2020-11-30 12:00:22 -08:00
Pavel Aslanov
13cc72bd97 remove dependency on futures-old and futures_ext
Summary: `derived_data:derived_data` had already been almost converted. I've cleaned up some tests so that it is possible to completely remove the old futures dependency

Reviewed By: StanislavGlebik

Differential Revision: D25197406

fbshipit-source-id: 064439f42a15f715befc019e5350dda0a2975f2b
2020-11-30 12:00:22 -08:00
Pavel Aslanov
48b6813a06 convert save_bonsai_changesets to new type futures
Summary:
- convert save_bonsai_changesets to new type futures
- `blobrepo:blobrepo` is free from old futures deps

Reviewed By: StanislavGlebik

Differential Revision: D25197060

fbshipit-source-id: 910bd3f9674094b56e1133d7799cefea56c84123
2020-11-30 12:00:22 -08:00
Alex Hornby
5a091f1b6a mononoke: add FastlogBatch to walker
Summary: Add it so we can scrub and inspect it

Differential Revision: D25187550

fbshipit-source-id: a241c1e492dc3ad04358db4c7023f90c601d1b1e
2020-11-30 10:36:49 -08:00
Alex Hornby
6eb072fca7 mononoke: add FastlogDir to walker
Summary: Add support for walking fastlog entries for directories so we can scrub and inspect them

Differential Revision: D25187551

fbshipit-source-id: 812f9fd82459ef49dcd781c318fbe5c398daad21
2020-11-30 10:36:49 -08:00
Alex Hornby
68ba8fd566 mononoke: add fastlog for files to walker
Summary: Add FastlogFile to walker so it can be inspected and scrubbed.  Directory and batch components of fastlog covered in following diffs.

Differential Revision: D25187549

fbshipit-source-id: c046cbf2561cdbbc9563497119e34d1b09d0ebef
2020-11-30 10:36:49 -08:00
Lukas Piatkowski
21f701f0e8 mononoke/unbundle: migrate modules rate_limit, response and push_redirectors to futures 0.3
Reviewed By: ahornby

Differential Revision: D25196441

fbshipit-source-id: 7236df68d6f38d37eaf5f789a9521b63b6876937
2020-11-30 08:30:55 -08:00
Johan Schuijt-Li
ec5fc72848 proxygen compatibility
Summary: As HG<->Mononoke will go through proxygen, testing showed that Proxygen forces us to use 'Upgrade: websocket' and conform to the Websocket handshake. Adjust accordingly to do so.

Reviewed By: ahornby

Differential Revision: D25197395

fbshipit-source-id: ca00ac31be92817c6f1a99d7d492469b17b46286
2020-11-30 08:10:42 -08:00
Stanislau Hlebik
550d331981 mononoke: remove unused code
Reviewed By: mitrandir77

Differential Revision: D25187055

fbshipit-source-id: 8faf2398e0407f56cf133feeb0da2812e27acaad
2020-11-30 07:51:08 -08:00
Stanislau Hlebik
80967e0c6a mononoke: fix clippy in hg sync job
Reviewed By: mitrandir77

Differential Revision: D25186814

fbshipit-source-id: 6258289cbf6ba349f1c1ba542e6bc81e94007fb9
2020-11-30 07:51:08 -08:00
Stanislau Hlebik
59c5f80b75 mononoke: add bundle combining to hg sync job
Summary:
This diff adds bundle combining to hg sync job. See motivation for doing that in D25168877 (cebde43d3f).

The main struct here is BookmarkLogEntryBatch. This struct holds a vector of the BookmarkUpdateLogEntry values that were combined (they are used mostly for logging), and it also contains the parameters necessary for producing the bundle, notably the from/to changeset ids and bookmarks. This struct has a try_append method that decides whether it's possible to combine bundles or not.
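
A minimal sketch of the batching shape described above. The types are simplified stand-ins, and the combining rule (same bookmark, and the new entry starts where the batch currently ends) is an assumption for illustration, not necessarily the rule the sync job applies:

```rust
/// Simplified stand-in for a bookmark update log entry.
struct Entry {
    id: u64,
    bookmark: String,
    from: Option<u32>,
    to: Option<u32>,
}

/// Simplified stand-in for the batch of combined entries.
struct Batch {
    /// Entries that were combined, kept mostly for logging.
    entries: Vec<Entry>,
    bookmark: String,
    from: Option<u32>,
    to: Option<u32>,
}

impl Batch {
    fn new(entry: Entry) -> Self {
        Batch {
            bookmark: entry.bookmark.clone(),
            from: entry.from,
            to: entry.to,
            entries: vec![entry],
        }
    }

    /// Tries to fold `entry` into this batch. On failure the entry is
    /// handed back so the caller can start a new batch with it.
    fn try_append(&mut self, entry: Entry) -> Result<(), Entry> {
        // Assumed rule: same bookmark, and the entry continues from the
        // changeset the batch currently ends at.
        if entry.bookmark != self.bookmark || entry.from != self.to {
            return Err(entry);
        }
        self.to = entry.to;
        self.entries.push(entry);
        Ok(())
    }
}
```

Returning the rejected entry in the `Err` variant avoids cloning it when the caller opens the next batch.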

Reviewed By: mitrandir77

Differential Revision: D25186110

fbshipit-source-id: 77ce91915f460db73d8a996efe415954eeea2476
2020-11-30 07:51:08 -08:00