Summary:
D23599866 (54d43b7f95) added an optimization for getbundle that reduces CPU usage when a new
commit with a low generation number is added, i.e. a case like this:
```
O
|
O
..
O <- new commit, low generation number
|
...
```
Unfortunately this optimization doesn't help with the case where a new repo is
merged into master
```
O <- also new commit, but generation number is high!
| \
.. O <- new commit, low generation number, but it's not in the "heads" parameter
|
|
O
...
```
The merge commit actually has a high generation number, but its p2 has a low
generation number, so it causes the same issue with high CPU usage.
This diff adds a second optimization ( :( ) that should address the shortcoming of the first one. See the comments for more details.
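A hedged sketch of the combined check (illustrative types and threshold, not the actual getbundle code): the low-generation handling now kicks in if any head, or any parent of a head, has a low generation number, which covers the merged-in p2 above.
```
struct ChangesetInfo {
    generation: u64,
    // Generation numbers of p1 (and p2 for merge commits).
    parent_generations: Vec<u64>,
}

fn needs_low_gen_path(heads: &[ChangesetInfo], threshold: u64) -> bool {
    heads.iter().any(|cs| {
        cs.generation < threshold
            || cs.parent_generations.iter().any(|&g| g < threshold)
    })
}

fn main() {
    // A merge head with a high generation number but a low-generation p2:
    // checking heads alone would miss it, checking parents catches it.
    let merge = ChangesetInfo {
        generation: 1_000_000,
        parent_generations: vec![999_999, 3],
    };
    assert!(needs_low_gen_path(&[merge], 1_000));
}
```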
Reviewed By: ikostia
Differential Revision: D23824204
fbshipit-source-id: 8f647f1813d2662e41325829d05def633372c140
Summary: fixes build and test errors in OSS introduced by D23596262 (deb57a25ed)
Reviewed By: ikostia
Differential Revision: D23757086
fbshipit-source-id: 7973ce36b3589cbe21590bd7e19a9828be72128f
Summary:
In preparation for moving away from SSH as an intermediate entry point for
Mononoke, let Mononoke work with the newly introduced Metadata. This removes any
assumptions we now make about how certain data is presented to us, making the
current "ssh preamble" no longer central.
Metadata is primarily based around identities and provides some
backwards-compatible entry points to make sure we can satisfy downstream
consumers of commits like hooks and logs.
Similarly, we now do our own reverse DNS resolution instead of relying on what's
been provided by the client. This is done in an async manner and we don't rely
on the result, so Mononoke can keep functioning in case DNS is offline.
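A minimal sketch of the fire-and-forget lookup, assuming tokio and the `dns-lookup` crate (not the actual Mononoke code):
```
use std::net::IpAddr;
use std::time::Duration;

async fn handle_connection(client_ip: IpAddr) {
    // Resolve the client hostname off the request path. If DNS is slow or
    // offline this task simply fails or lingers; the request is unaffected.
    tokio::spawn(async move {
        let hostname =
            tokio::task::spawn_blocking(move || dns_lookup::lookup_addr(&client_ip).ok()).await;
        if let Ok(Some(hostname)) = hostname {
            println!("client {client_ip} resolved to {hostname}");
        }
    });
    // ... serve the request here, without ever awaiting the lookup ...
}

#[tokio::main]
async fn main() {
    handle_connection("127.0.0.1".parse().unwrap()).await;
    // Give the background lookup a moment to finish before the demo exits.
    tokio::time::sleep(Duration::from_millis(100)).await;
}
```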
Reviewed By: farnz
Differential Revision: D23596262
fbshipit-source-id: 3a4e97a429b13bae76ae1cdf428de0246e684a27
Summary:
These two perf counters proved inconvenient for evaluating the volume of
undesired file fetches, so let's get rid of them. Specifically, they are
inconvenient because they accumulate values, which makes them hard to
aggregate over.
Note that I don't do the same for tree fetches, as there's no better way of
estimating those now.
Reviewed By: mitrandir77
Differential Revision: D23452913
fbshipit-source-id: 08f8dd25eece495f986dc912a302ab3109662478
Summary:
In a repository containing files with large histories we run into a lot of
SqlTimeout errors while fetching file history to serve getpack calls. However,
fetching the whole file history is not really necessary - the client knows how
to work with partial history, i.e. if the client is missing some portion of the
history it will just fetch it on demand.
This diff adds a way to limit how many entries are fetched; if more entries
would be fetched, we return FilenodeRangeResult::TooBig instead. The downside
of this diff is that we have to do more sequential database queries.
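A minimal sketch of the limit semantics (only `FilenodeRangeResult::TooBig` is from this diff; the other names are illustrative):
```
// Either the full (small enough) history, or a marker telling the client
// to fall back to fetching the remainder on demand.
pub enum FilenodeRangeResult<T> {
    Present(Vec<T>),
    TooBig,
}

pub fn check_limit<T>(rows: Vec<T>, limit: usize) -> FilenodeRangeResult<T> {
    // The query itself would use `LIMIT limit + 1`: seeing an extra row
    // means the full history is too big to serve in one response.
    if rows.len() > limit {
        FilenodeRangeResult::TooBig
    } else {
        FilenodeRangeResult::Present(rows)
    }
}
```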
Reviewed By: krallin
Differential Revision: D23025249
fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
Summary: There are no users waiting on manual scrub, so set it to use the background session mode.
Reviewed By: krallin
Differential Revision: D23054581
fbshipit-source-id: 985bcadbaf17d2a8c92fdec811ecb239cbca7b37
Summary:
This is a backout of D22912569 (34760b5164), which is breaking opt-clang-thinlto builds on platform007 (S206790).
Original commit changeset: 5ffdc48adb1f
Reviewed By: aaronabramov
Differential Revision: D22956288
fbshipit-source-id: 45940c288d6f10dfe5457d295c405b84314e6b21
Summary:
Backfillers and other housekeeping processes can run so far ahead of the blobstore sync queue that we can't empty it from the healer task as fast as the backfillers can fill it.
Work around this by providing a new mode that background tasks can use to avoid filling the queue if all the blobstores are writing successfully. This has a side-effect of slowing background tasks to the speed of the slowest blobstore, instead of allowing them to run ahead at the speed of the fastest blobstore and relying on the healer ensuring that all blobs are present.
Future diffs will add this mode to the appropriate tasks.
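A minimal sketch of the queue-skipping decision (illustrative, not the actual multiplexed blobstore code):
```
// Background mode only enqueues a heal entry when some underlying write
// failed: a fully successful multiplexed write has nothing to heal, so
// backfillers stop flooding the queue. The trade-off is that the put now
// completes at the speed of the slowest blobstore.
fn should_enqueue_heal(write_results: &[Result<(), ()>], background_mode: bool) -> bool {
    if !background_mode {
        // Foreground behaviour is unchanged: always enqueue and let the
        // healer confirm that all blobstores converge.
        return true;
    }
    write_results.iter().any(|r| r.is_err())
}
```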
Reviewed By: ikostia
Differential Revision: D22866818
fbshipit-source-id: a8762528bb3f6f11c0ec63e4a3c8dac08d0b4d8e
Summary: A couple of features stabilized, so drop their `#![feature(...)]` lines.
Reviewed By: eugeneoden, dtolnay
Differential Revision: D22912569
fbshipit-source-id: 5ffdc48adb1f57a1b845b1b611f34b8a7ceff216
Summary:
Like it says in the title. This would be helpful to understand why a particular
derivation took a given amount of time. To avoid having other work that shares
this CoreContext result in biased counters, I set this up so that we start
new perf counters for derivation.
Reviewed By: farnz
Differential Revision: D22595473
fbshipit-source-id: de85d5108aabde23cf6587662f15f25aac0cd650
Summary:
Just knowing the number of fetched undesired files doesn't give the full
picture, e.g. fetching lots of small files is better than fetching a single
multi-GB file. So knowing the size of the files is helpful.
Reviewed By: krallin
Differential Revision: D22408400
fbshipit-source-id: 7653c1cdceccf50aeda9ce8a4880ee5178d4b107
Summary: Like it says in the title. Those are useful!
Reviewed By: farnz
Differential Revision: D22332479
fbshipit-source-id: f9bddad75fcbed2593c675f9ba45965bd87f1575
Summary:
This introduces a caching blobstore that deduplicates reads and writes. The
underlying motivation is to improve performance for processes that might find
themselves inadvertently reading the same data concurrently from a bunch of
independent callsites (most of Mononoke), or writing the same bit of data over
and over again.
The latter is particularly useful for things like commit cloud backfilling in
WWW, where some logger commits include the same blob being written hundreds or
thousands of times, and cause us to overload the underlying Zippy shard in
Manifold. This is however a problem we've also encountered in the past in e.g.
the deleted files manifest and had to solve there. This blobstore is a little
different in the sense that it solves that problem for all writers.
This comes at the cost of writes being dropped if they're known to be
redundant, which prevents updates through this blobstore. This is desirable for
most of Mononoke, but not all (notably, for skiplist updates it's not great).
For now, I'm going to add this behind an opt-in flag, and later on I'm planning
to make it opt-out and turn it off there (I'm thinking to use the CoreContext
for this).
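A minimal sketch of the read-deduplication half, assuming tokio and the futures crate (not the real blobstore trait): concurrent gets for the same key share one underlying fetch through a map of in-flight `Shared` futures, which doubles as the cache.
```
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use futures::future::{BoxFuture, FutureExt, Shared};

type InflightGet = Shared<BoxFuture<'static, Option<Vec<u8>>>>;

#[derive(Default, Clone)]
struct DedupReads {
    inflight: Arc<Mutex<HashMap<String, InflightGet>>>,
}

impl DedupReads {
    fn get(&self, key: String) -> InflightGet {
        let mut inflight = self.inflight.lock().unwrap();
        inflight
            .entry(key.clone())
            .or_insert_with(|| {
                async move {
                    // Stand-in for the real underlying blobstore fetch. In
                    // this sketch completed entries stay in the map, which
                    // is what makes it a cache as well.
                    println!("underlying fetch for {key} runs once");
                    Some(key.into_bytes())
                }
                .boxed()
                .shared()
            })
            .clone()
    }
}

#[tokio::main]
async fn main() {
    let store = DedupReads::default();
    // Two concurrent callers for the same key trigger one underlying fetch.
    let (a, b) = futures::join!(store.get("mykey".into()), store.get("mykey".into()));
    assert_eq!(a, b);
}
```
Write deduplication would work analogously: remember keys already written and drop redundant puts, which is exactly why updates can't flow through this blobstore.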
Reviewed By: farnz
Differential Revision: D22285270
fbshipit-source-id: 4e3502ab2da52a3a0e0e471cd9bc4c10b84a3cc5
Summary:
At the moment we can't test logging to Scribe easily - we don't have a way to
mock it. The new Scribe abstraction is supposed to help with that.
It will let us configure all Scribe logs to go to a directory on the
filesystem, similar to the way we configure Scuba. The Scribe instance itself
will be stored in CoreContext.
Reviewed By: farnz
Differential Revision: D22237730
fbshipit-source-id: 144340bcfb1babc3577026191428df48e30a0bb6
Summary:
Remove unused dependencies for Rust targets.
This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.
Manual edits (named_deps) and misc output in P133451794
Reviewed By: dtolnay
Differential Revision: D22083498
fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
Summary:
See bottom diff in the stack for the motivation. Though you can probably guess
the motivation :)
Reviewed By: farnz
Differential Revision: D21623154
fbshipit-source-id: a0940d766a67080ddcb346c2e3313eb08699edad
Summary:
Let's add an option to log how many files and trees starting with a given
prefix were fetched in a particular repo.
Reviewed By: farnz
Differential Revision: D21617347
fbshipit-source-id: a57f74eadf32781e6c024e18da252c98af21996d
Summary:
Update the corpus walker to dump the sampled bytes to the Inflight area of the output dir as early as possible, then move them to their final location once the path is known.
When walking large files and manifests this uses a lot less memory than holding the bytes in a map!
The layout is changed to make comparison by file type easier: we get a top-level dir per extension, e.g. all .json files are under FileContent/byext/json.
This also reduces the number of bytes taken from the sampling fingerprint used to make directories: 8 was overkill, and 3 is enough to limit directory size.
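A minimal sketch of the directory fan-out (path layout per the description above; the function name and exact path shape are illustrative):
```
// 3 bytes of fingerprint give up to 2^24 (~16.7M) possible subdirectories,
// plenty to keep any single directory small; 8 bytes allowed 2^64, far
// more than needed.
fn sample_dir(extension: &str, fingerprint: &[u8]) -> String {
    let prefix: String = fingerprint
        .iter()
        .take(3)
        .map(|b| format!("{b:02x}"))
        .collect();
    format!("FileContent/byext/{extension}/{prefix}")
}

fn main() {
    // e.g. a sampled .json blob lands under FileContent/byext/json/0a1b2c
    println!("{}", sample_dir("json", &[0x0a, 0x1b, 0x2c, 0x3d]));
}
```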
Reviewed By: farnz
Differential Revision: D21168633
fbshipit-source-id: e0e108736611d552302e085d91707cca48436a01
Summary:
Limits on concurrent calls are a bit hard to reason about, and it's not super
obvious what a good limit is when all our underlying limits are expressed in QPS
(and when our data sets don't have peak concurrency - instead they have
completion time + # blob accesses).
Considering that our past experience with ThrottledBlob has been quite positive
overall, I'd like to just use the same approach in ContextConcurrencyBlobstore.
To be safe, I've also updated this to be driven by tunables, which makes it
easier to roll out and roll back.
Note that I removed `Debug` on `CoreContext` as part of this because it wasn't
used anywhere. We can bring back a meaningful implementation of `Debug` there
in the future if we want to. That triggered some warnings about unused fields,
which for now I just silenced.
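A minimal sketch of QPS-style throttling driven by a single tunable value, assuming tokio (not the actual ThrottledBlob implementation):
```
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

// Each caller claims the next slot in a fixed-rate schedule, so wrapped
// blobstore ops are spaced out to at most `qps` per second no matter how
// many tasks issue them concurrently.
#[derive(Clone)]
struct QpsThrottle {
    next_slot: Arc<Mutex<Instant>>,
    interval: Duration,
}

impl QpsThrottle {
    fn new(qps: u32) -> Self {
        QpsThrottle {
            next_slot: Arc::new(Mutex::new(Instant::now())),
            interval: Duration::from_secs(1) / qps,
        }
    }

    async fn acquire(&self) {
        let slot = {
            let mut next = self.next_slot.lock().unwrap();
            let slot = (*next).max(Instant::now());
            *next = slot + self.interval;
            slot
        };
        tokio::time::sleep_until(slot.into()).await;
    }
}

#[tokio::main]
async fn main() {
    // The QPS value is the kind of knob a tunable would drive at runtime.
    let throttle = QpsThrottle::new(100);
    for i in 0..3 {
        throttle.acquire().await;
        println!("blobstore op {i}"); // stand-in for the wrapped get/put
    }
}
```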
Reviewed By: farnz
Differential Revision: D21449405
fbshipit-source-id: 5ca843694607888653a75067a4396b36e572f070
Summary: To make it easier to navigate the codebase, the OSS-only code will from now on be stored in a separate module, similar to how the fbcode-only code is stored.
Reviewed By: markbt
Differential Revision: D21429060
fbshipit-source-id: aa7e80961de2897dae31bd0ec83488c683633b7a
Summary: Covering repo_listener and microwave, plus some final touches, gives us a buildable Mononoke binary.
Reviewed By: krallin
Differential Revision: D21379008
fbshipit-source-id: cca3fbb53b90ce6d2c3f3ced7717404d6b04dd51
Summary:
There are a few related changes included in this diff:
- backsyncer is made public
- stubs for SessionContext::is_quicksand and scuba_ext::ScribeClientImplementation
- mononoke/hgproto is made buildable
Reviewed By: krallin
Differential Revision: D21330608
fbshipit-source-id: bf0a3c6f930cbbab28508e680a8ed7a0f10031e5
Summary: Making a trait out of LoadLimiter will help with providing different implementations of load limiting for OSS and FB.
Reviewed By: farnz
Differential Revision: D21302819
fbshipit-source-id: 1b982a367aa7126ca5d7772e4a2406dabbe9e13b
Summary: The changes to server/context, gotham_ext and the code that depends on them are the only remaining places where aclchecker is used directly, and it is not easy to split this diff to convert them separately.
Reviewed By: krallin
Differential Revision: D21067809
fbshipit-source-id: a041ab141caa6fe6871e1fda6013e33f1f09bc56
Summary:
Add debug logging and perf counters for the number of mutation entries stored
during `add_entries`, and the number of mutation entries fetched during
`all_predecessors`.
Reviewed By: StanislavGlebik
Differential Revision: D21065934
fbshipit-source-id: 9b2ff9720116e6a168706f994655daffb18d0ffc
Summary:
This is needed because the tonic crate (see the diff stack) relies on tokio ^0.2.13.
We can't go to a newer version because a bug that affects Mononoke was introduced in 0.2.14 (discussion started on T65261126). The issue was reported upstream: https://github.com/tokio-rs/tokio/issues/2390
This diff simply changed the version number in `fbsource/third-party/rust/Cargo.toml` and ran `fbsource/third-party/rust/reindeer/vendor`.
Also ran `buck run //common/rust/cargo_from_buck:cargo_from_buck` to fix the tokio version in the generated Cargo files.
Reviewed By: krallin
Differential Revision: D21043344
fbshipit-source-id: e61797317a581aa87a8a54e9e2ae22655f22fb97
Summary:
We had accumulated lots of unused dependencies, and had several test_deps in deps instead. Clean this all up to reduce build times and speed up autocargo processing.
The net removal is around 500 unneeded dependency lines, which represented false dependencies; by removing them, we should get more parallelism in dev builds and less overbuilding in CI.
Reviewed By: krallin, StanislavGlebik
Differential Revision: D20999762
fbshipit-source-id: 4db3772cbc3fb2af09a16601bc075ae8ed6f0c75
Summary:
Add a sampling key to LoggingContainer so that we have a way to sample low-level blobstore actions for a given high-level blobrepo action.
This will be used to track the blobs loaded when walking a repo and associate them with paths.
The walker will assign a new sampling_key to each step, and the SamplingHandler attached to SamplingBlobstore will use it to correlate the get() with the Node or Path we want to track accesses for, e.g. to build a corpus of blobs per path for compression analysis.
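A minimal sketch of the correlation mechanism (illustrative shapes, not Mononoke's actual API):
```
use std::sync::atomic::{AtomicU64, Ordering};

// A fresh key per walker step; the sampling blobstore reads it back out of
// the logging context on every get() and hands it to the handler.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SamplingKey(u64);

static NEXT_KEY: AtomicU64 = AtomicU64::new(1);

impl SamplingKey {
    fn new() -> Self {
        SamplingKey(NEXT_KEY.fetch_add(1, Ordering::Relaxed))
    }
}

struct LoggingContainer {
    sampling_key: Option<SamplingKey>,
}

trait SamplingHandler {
    // Called on each get(); the handler maps the key back to the Node or
    // Path the walker step is visiting, e.g. to build a per-path corpus of
    // blobs for compression analysis.
    fn sample_get(&self, sampling_key: Option<SamplingKey>, blobstore_key: &str);
}

fn main() {
    let ctx = LoggingContainer { sampling_key: Some(SamplingKey::new()) };
    println!("step tagged with {:?}", ctx.sampling_key);
}
```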
Reviewed By: mitrandir77
Differential Revision: D20534844
fbshipit-source-id: 22662b66a57ad4fef044a1108648f4ad8f2dae78
Summary:
Now that Arun is about to roll this out to the team, we should get some more
logging in place server side. This updates the designated nodes handling code
to report whether it was enabled (and log prior to the request as well).
Reviewed By: HarveyHunt
Differential Revision: D20514429
fbshipit-source-id: 76ce62a296fe27310af75c884a3efebc5f210a8a
Summary:
Update the `getpack` code to calculate how many files (and their total
size) would be served over LFS.
NOTE: The columns have `Possible` in their names as we might not have LFS
enabled, in which case we aren't actually fetching this many blobs from an LFS
server.
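A minimal sketch of the counting logic (illustrative; the real code hooks into getpack serving):
```
// Files at or above the LFS threshold would be served as LFS pointers;
// count them and their total size for the new `Possible` columns.
fn possible_lfs_stats(file_sizes: &[u64], lfs_threshold: u64) -> (u64, u64) {
    file_sizes
        .iter()
        .filter(|&&size| size >= lfs_threshold)
        .fold((0, 0), |(count, total), &size| (count + 1, total + size))
}

fn main() {
    // With a 1 MiB threshold, two of these three files would go over LFS.
    let (count, bytes) = possible_lfs_stats(&[512, 2 << 20, 8 << 20], 1 << 20);
    assert_eq!((count, bytes), (2, (2 << 20) + (8 << 20)));
}
```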
Reviewed By: farnz
Differential Revision: D20444137
fbshipit-source-id: 85506d8c468cfdc470684dd216567f1848c43d08
Summary:
This adds a blobstore that can reach into a CoreContext in order to identify
the allowed level of concurrency for blobstore requests initiated by this
CoreContext. This will let us replay infinitepush bundles with limits on a
per-request basis.
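A minimal sketch, assuming tokio's `Semaphore` (illustrative names, not the actual ContextConcurrencyBlobstore):
```
use std::sync::Arc;

use tokio::sync::Semaphore;

// The per-request context carries an optional concurrency budget; the
// blobstore wrapper takes a permit around each underlying call, so e.g.
// infinitepush bundle replay can be limited per request.
struct CoreContext {
    blobstore_concurrency: Option<Arc<Semaphore>>, // None = unlimited
}

async fn with_limit<F, T>(ctx: &CoreContext, op: F) -> T
where
    F: std::future::Future<Output = T>,
{
    match &ctx.blobstore_concurrency {
        Some(sem) => {
            let _permit = sem.acquire().await.expect("semaphore closed");
            op.await
        }
        None => op.await,
    }
}

#[tokio::main]
async fn main() {
    let ctx = CoreContext {
        blobstore_concurrency: Some(Arc::new(Semaphore::new(4))),
    };
    let answer = with_limit(&ctx, async { 42 }).await;
    assert_eq!(answer, 42);
}
```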
Reviewed By: farnz
Differential Revision: D20038575
fbshipit-source-id: 07299701879b7ae65ad9b7ff6e991ceddf062b24
Summary:
In case this starts to cause problems, let's have a way to correlate those
problems with some exported metrics.
Reviewed By: StanislavGlebik
Differential Revision: D20158822
fbshipit-source-id: 6ac9e25861dbedaecdf04fd92bda835ae66535eb
Summary:
These comments end up being a source of churn as we roll out D20125635, and anyway are not particularly meaningful after the transformations performed by autocargo. For example:
```
bytes = { version = "0.4", features = ["serde"] } # todo: remove
```
^ This doesn't mean the generated Cargo.toml intends to drop its bytes dependency altogether, but just that it will be migrated to a different version that is present in the third-party/rust/Cargo.toml but not visible in the generated Cargo.toml.
Reviewed By: jsgf
Differential Revision: D20128612
fbshipit-source-id: a9e7b29ddc4b26bc47a626dd73bdaa4771ee7b18
Summary: Adds the Cargo.toml files for blobstore. This is a step towards covering mononoke-types, so only the blobstore traits are covered by this diff.
Reviewed By: aslpavel
Differential Revision: D19948739
fbshipit-source-id: c945a9ca16ccceb0e50a50d941dec65ea74fe78f
Summary: The load_limiter was extracted from server/context into its own crate and the server/context itself was refactored into multiple modules, one of which contains facebook-specific code.
Reviewed By: StanislavGlebik
Differential Revision: D19902972
fbshipit-source-id: d577492b4fe01ccfe11b3e092e0521b190516268
Summary:
This will allow us to distinguish `getbundle` for a normal `pull` from the one
for infinitepush pull.
Reviewed By: StanislavGlebik
Differential Revision: D19833206
fbshipit-source-id: 86534320fbb4d60bac04d458a0953701201cba87
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.
Reviewed By: StanislavGlebik
Differential Revision: D19722832
fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800