Commit Graph

73 Commits

Mark Juggurnauth-Thomas
7ed5a54640 repo_client: remove case conflict checks on upload
Summary:
Remove case conflict checking on upload.  Disallowing case conflicts will
always be done during bookmark movement by checking the skeleton manifests.

A side-effect of this change is that configuring a repository with case
conflict checks, but not enabling skeleton manifests, will mean that commits
can't be landed in that repository, as there are no skeleton manifests to
check.

Reviewed By: ahornby

Differential Revision: D26781269

fbshipit-source-id: b4030bea5d92fa87f182a70d31f7e9d9e74989a9
2021-03-10 03:51:42 -08:00
Stanislau Hlebik
00883f94d2 mononoke: remove unused getbundle code
Summary:
It was added for the initial rollout only so that we could fall back quickly if
needed (see D26221250 (7115cf31d2)). We can remove it now since the feature has been enabled
for a few weeks already with no big issues.

Reviewed By: krallin

Differential Revision: D26909490

fbshipit-source-id: 849fac4838c272e92a04a971869842156e88a1cf
2021-03-09 04:12:09 -08:00
Stanislau Hlebik
b5de3e4e21 mononoke: limit the number of hg blobs
Summary:
We ran into an issue while uploading too many blobs at once to the darkstorm repo.
We were able to work around this issue by spawning fewer blobstore writes at
once.

It's still a bit unclear exactly why this issue happens, but I'd like to make
the number of concurrently uploaded blobs configurable so that we can tweak it if
necessary.
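
A minimal sketch of the general technique, assuming a hypothetical `put_blob` async write (not the actual Mononoke API): `buffered(n)` caps how many uploads are polled at once, and `n` would come from the new configurable limit.

```rust
use futures::stream::{self, StreamExt, TryStreamExt};

// Hypothetical async write; stands in for a blobstore put.
async fn put_blob(_key: String, _data: Vec<u8>) -> Result<(), std::io::Error> {
    Ok(())
}

async fn upload_all(
    blobs: Vec<(String, Vec<u8>)>,
    concurrency: usize, // read from config/tunable in practice
) -> Result<(), std::io::Error> {
    stream::iter(blobs)
        .map(|(key, data)| put_blob(key, data))
        // At most `concurrency` writes are in flight at any time.
        .buffered(concurrency)
        .try_collect::<Vec<()>>()
        .await?;
    Ok(())
}
```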

Differential Revision: D26883061

fbshipit-source-id: 57c0d6fc51548b3c7404ebd55b5e07deba9e0601
2021-03-08 05:46:45 -08:00
Thomas Orozco
2a803fc10d third-party/rust: update futures
Summary:
Those newer versions of Futures have compatibility improvements with Tokio,
notably:

- https://github.com/rust-lang/futures-rs/pull/2333
- https://github.com/rust-lang/futures-rs/pull/2358

Reviewed By: farnz

Differential Revision: D26778794

fbshipit-source-id: 5a9dc002083e5edfa5c614d8d2242e586a93fcf6
2021-03-04 06:42:55 -08:00
Thomas Orozco
ef7045e818 common/rust: use fbinit-tokio
Summary:
This diff adds a layer of indirection between fbinit and tokio, thus allowing
us to use fbinit with tokio 0.2 or tokio 1.x.

The way this works is that you specify the Tokio you want by adding it as an
extra dependency alongside `fbinit` in your `TARGETS` (before this, you had to
always include `tokio-02`).

If you use `fbinit-tokio`, then `#[fbinit::main]` and `#[fbinit::test]` get you
a Tokio 1.x runtime, whereas if you use `fbinit-tokio-02`, you get a Tokio 0.2
runtime.
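
For illustration, a minimal sketch of what a call site looks like after this change (the attribute and `FacebookInit` come from `fbinit`; which runtime backs it is decided purely by the TARGETS dependency):

```rust
use fbinit::FacebookInit;

// With `fbinit-tokio` in TARGETS this runs on a Tokio 1.x runtime;
// with `fbinit-tokio-02` it runs on Tokio 0.2. The source is unchanged.
#[fbinit::main]
async fn main(fb: FacebookInit) {
    let _ = fb; // async code here uses whichever runtime was selected
}
```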

This diff is big, because it needs to change all the TARGETS that reference
this in the same diff that introduces the mechanism. I also didn't produce it
by hand.

Instead, I scripted the transformation using this script: P242773846

I then ran it using:

```
{ hg grep -l "fbinit::test"; hg grep -l "fbinit::main"; } | \
  sort | \
  uniq | \
  xargs ~/codemod/codemod.py \
&&  yes | arc lint \
&& common/rust/cargo_from_buck/bin/autocargo
```

Finally, I grabbed the files returned by `hg grep`, then fed them to:

```
arc lint-rust --paths-from ~/files2 --apply-patches --take RUSTFIXDEPS
```

(I had to modify the file list a bit: notably I removed stuff from scripts/ because
some of that causes Buck to crash when running lint-rust, and I also had to add
fbcode/ as a prefix everywhere).

Reviewed By: mitrandir77

Differential Revision: D26754757

fbshipit-source-id: 326b1c4efc9a57ea89db9b1d390677bcd2ab985e
2021-03-03 04:09:15 -08:00
Ilia Medianikov
58b9ac23ef mononoke: Don't create separate ConfigStores in tests
Summary:
At the moment tests create a separate ConfigStore with a file source for some configs, and the reference to it is dropped immediately ([see the get_config_handle function in mod.rs](https://fburl.com/diffusion/fpkj7ekv)). This is inconvenient, as we may need that reference to, for example, force-update configs in tests.

Instead of creating separate stores, we can reuse the static configerator, which already uses local files in tests.

Reviewed By: krallin

Differential Revision: D26725515

fbshipit-source-id: 24269cd93b7d35216c025807c3f3eb527688b72b
2021-03-03 03:52:41 -08:00
Ilia Medianikov
0aff1c23ec mononoke: server: add support for force updating tunables and configerator and remove sleeps in tests
Summary: We have a number of sleeps in our integration tests. The two main reasons are configs & tunables that need reloading. Currently, we have no way of force-reloading those.

Reviewed By: krallin

Differential Revision: D26615732

fbshipit-source-id: 217c4ae039abd398972b4a9764d08e18d6182493
2021-02-26 10:18:31 -08:00
Lukas Piatkowski
f317302b0f autocargo v1: reformatting of oss-dependencies, workspace and patch sections and thrift files to match v2
Summary:
For dependencies, V2 puts "version" as the first attribute of a dependency, or just after "package" if present.
The workspace section comes after the patch section in V2, and since V2 autoformats the patch section, the manual entries in third-party/rust/Cargo.toml had to be formatted by hand, as V1 takes them as they are.
Thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that had been incorrectly left behind when the corresponding Cargo.toml was removed.

Reviewed By: ikostia

Differential Revision: D26618363

fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92
2021-02-25 15:10:56 -08:00
Thomas Orozco
097e4ad00c mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively)
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.

We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!

Note that we still have a few transitive dependencies on Tokio 0.1:

- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests

This isn't the end of the world though, we can live with that :)

Reviewed By: ahornby

Differential Revision: D26544410

fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
2021-02-22 09:22:42 -08:00
Lukas Piatkowski
cd0b6d50e2 autocargo v1: changes to match autocargo v2 generation results.
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously only rust_bindgen_library was ignored, so binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.

Reviewed By: zertosh

Differential Revision: D26544150

fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
2021-02-19 11:03:55 -08:00
Lukas Piatkowski
87ddbe2f74 autocargo v1: update autocargo field format to allow transition to autocargo v2
Summary:
Autocargo V2 will use a more structured format for the autocargo field; with the
help of the `cargo_toml` crate it will be easy to deserialize and handle.

Also, the "include" field is apparently obsolete, as it is used for cargo publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know it is often wrong, especially if someone tries to publish a package from fbcode, where the private facebook folders might be shipped. Let's just not set it; in the new system one will be able to set it explicitly via an autocargo parameter on a rule.

Reviewed By: ahornby

Differential Revision: D26339606

fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
2021-02-12 23:28:25 -08:00
Stanislau Hlebik
af2ab0cf10 mononoke: store hydrated tree manifests in .hg
Reviewed By: krallin

Differential Revision: D26401093

fbshipit-source-id: e5050883b0e6f370a7cfbb5f46721aca7469dce1
2021-02-11 10:12:27 -08:00
Thomas Orozco
201a50db11 mononoke/server: expose EdenAPI
Summary:
Like it says in the title, this adds support for exposing EdenAPI in Mononoke
Server. That's it!

Differential Revision: D26131777

fbshipit-source-id: 15ed2d6d80b1ea06763adc0b7312d1cab2df5b76
2021-02-04 10:40:04 -08:00
Stanislau Hlebik
7115cf31d2 mononoke: getbundle optimization for many heads with low gen number
Reviewed By: markbt

Differential Revision: D26221250

fbshipit-source-id: dbc2dd4f181d22c30c6061f5b5de95b0be1ea19f
2021-02-03 03:55:46 -08:00
Stanislau Hlebik
da87622777 mononoke: make bg session class tunable per-repo
Reviewed By: ahornby

Differential Revision: D26164309

fbshipit-source-id: bf489bbc75fdd9bcaeb07eb9d9f27249577a64df
2021-02-01 04:34:22 -08:00
Kostia Balytskyi
ae344fe043 tunables: make per-repo getters take &str instead of &String
Summary: This is more flexible.
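
A minimal sketch of why `&str` is the more flexible parameter type (the getter name here is hypothetical, not the actual tunables API): it accepts string literals, `String` references, and slices alike.

```rust
// Hypothetical per-repo getter taking &str.
fn by_repo_enabled(repo: &str) -> bool {
    repo == "fbsource" // stand-in logic
}

fn demo(owned: String) {
    by_repo_enabled("fbsource"); // literal works directly
    by_repo_enabled(&owned);     // &String auto-derefs to &str
}
```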

Reviewed By: StanislavGlebik

Differential Revision: D26168559

fbshipit-source-id: 5946b8b06b3a577f1a8398a228467925f748acf7
2021-02-01 02:29:09 -08:00
Kostia Balytskyi
7d23e203a0 tunables: add support for per-repo strings and ints
Summary: Just as we have global strings/ints, let's have per-repo ones.

Reviewed By: StanislavGlebik

Differential Revision: D26168541

fbshipit-source-id: f31cb4d556231d8f13f7d7dd521086497d52288b
2021-02-01 02:29:08 -08:00
Kostia Balytskyi
f0f9bc10ba tunables: fix ByRepoBool to allow more than 1 tunable
Summary:
Please see the added test. Without this diff such a test does not even compile,
as `new_values_by_repo` is moved out by `self.#names.swap(Arc::new(new_values_by_repo));` after processing the first tunable (line 202).
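
A hedged sketch of the bug class (simplified; the real code is macro-generated per tunable): swapping the same owned map into each tunable moves it on the first iteration, so it has to be cloned per tunable.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use arc_swap::ArcSwap;

// Each tunable holds its own per-repo map, swapped atomically on update.
fn update_by_repo_bools(
    tunables: &[ArcSwap<HashMap<String, bool>>],
    new_values_by_repo: HashMap<String, bool>,
) {
    for t in tunables {
        // `t.swap(Arc::new(new_values_by_repo))` would move the map on the
        // first iteration, so a second tunable fails to compile. Clone instead:
        t.swap(Arc::new(new_values_by_repo.clone()));
    }
}
```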

Reviewed By: StanislavGlebik

Differential Revision: D26168371

fbshipit-source-id: 3cd9d77b72554eb97927662bc631611fa91eaecb
2021-01-31 23:31:28 -08:00
Stanislau Hlebik
da6664a9b5 mononoke: use background session class for blobstore sync queue
Summary:
Yesterday we had an alarm when the blobstore sync queue got overloaded again. This
time it was caused by a large commit cloud commit landing and writing lots of
content and alias blobs.

As we discussed before, let's add an option that allows us not to write to the
blobstore sync queue for commit cloud pushes of content and aliases.
It will slightly increase latency, but will protect the blobstore sync queue
from overloading.

Reviewed By: farnz

Differential Revision: D26129038

fbshipit-source-id: 0e96887e3aa3cf26880899c820f556bb16c437cb
2021-01-28 11:38:30 -08:00
Evgenii Kazakov
e05104c8c2 mononoke/tunables: add by_repo tunables
Summary: Added a new bool tunable type that is keyed by repo.

Reviewed By: ikostia

Differential Revision: D26025163

fbshipit-source-id: 5c2fa846b3c115bea683be249e92d495d28c038c
2021-01-27 01:50:05 -08:00
Daniel Xu
5715e58fce Add version specification to internal dependencies
Summary:
Lots of generated code in this diff. The only code change was in
`common/rust/cargo_from_buck/lib/cargo_generator.py`.

Path/git-only dependencies (i.e. `mydep = { path = "../foo/bar" }`) are not
publishable to crates.io. However, we are allowed to specify both a path/git
_and_ a version. When building locally, the path/git is chosen. When publishing,
the version on crates.io is chosen.

See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations .
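
For example (illustrative, not taken from this diff), such a dependency would read `mydep = { path = "../foo/bar", version = "1.0" }`: Cargo uses the path when building locally and the crates.io version when publishing.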

Note that I understand that not all autocargo projects are published on crates.io (yet).
The point of this diff is to allow projects to slowly start getting uploaded.
The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed
without further modification.

Reviewed By: lukaspiatkowski

Differential Revision: D26028982

fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c
2021-01-25 22:10:24 -08:00
Stanislau Hlebik
960c9943ba mononoke: tweak how we decide if a generation number is low or not
Summary:
A bit of background: some time ago I landed an optimization to getbundle which
split the commits the user wants to fetch into "high generation numbers"
(commits that are close to the main bookmark) and "low generation numbers"
(commits that are far away from the main bookmark). A user can fetch "low
generation number" commits if e.g. a commit to an old release branch was landed.
Processing them separately can yield significant speedups.

Previously we chose the threshold statically, e.g. if a commit had a generation
number lower than X then it was considered to have a low generation number.
The threshold was chosen arbitrarily and was relatively small [1], and it didn't work
that well for commits that were above this threshold but still not close
enough to the main bookmark: we still saw cpu spikes.

Instead, let's define a commit as having a low generation number if it's more than X
commits away from the commit with the highest generation number.
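
A hedged sketch of the new relative rule (names hypothetical): a head counts as low-generation when it sits more than `threshold` generations below the highest generation number in the request.

```rust
/// `g` is the head's generation number, `max_gen` the highest generation
/// number among requested heads; saturating keeps the subtraction safe.
fn is_low_generation(g: u64, max_gen: u64, threshold: u64) -> bool {
    max_gen.saturating_sub(g) > threshold
}
```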

Reviewed By: ahornby

Differential Revision: D25995936

fbshipit-source-id: 57eba4ba288114f430722266a8326183b3b1f0bd
2021-01-25 03:37:31 -08:00
Stanislau Hlebik
53360c82dc mononoke: fix backsyncing mapping change via commit extra
Summary:
Yesterday I landed a diff (D25950531 (79c34c5094)) which allows changing a mapping by
landing a commit with a special commit extra. The implementation was done
inside backsyncer.

However, this was incorrect for a few reasons, the main one being that the backsyncer
is not the only thing that rewrites commits between repos (x-repo commit lookup
is another, for example).
So we need to push this implementation down to the CommitSyncer level so that
everyone rewrites commits in the same way.

This diff fixes this issue, and moves all the logic of figuring out the correct
mapping version down to the CommitSyncer level. Doing so also allowed me to
simplify the backsyncer to use the normal sync_commit() method to backsync commits.

Another important note is about how we handle commits with no parents. We allow
syncing them only if there's an unambiguous version that we can pick. In case
of any ambiguity we don't sync them and return an error, which means that e.g.
merging a new repo and simultaneously changing the mapping is not possible.

Reviewed By: ikostia

Differential Revision: D25975842

fbshipit-source-id: a87fee545ac1305832ac905337610e7b87884477
2021-01-21 09:43:29 -08:00
Stanislau Hlebik
79c34c5094 mononoke: make it possible to change mapping version via commit extra
Summary:
At the moment the procedure to change commit sync mapping version is fairly
complicated [1]. This diff attempts to simplify one aspect of it.

At the moment we need to disable pushredirection and manually sync a commit
between two repos. Disabling pushredirection is quite error-prone and has led to
SEVs in the past. With this diff we just need to create a commit with a special
extra "change-xrepo-mapping" and let the backsyncer rewrite the commit with the
new version for us.
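
A hedged sketch of the detection step (helper and types simplified, not the actual CommitSyncer code): look up the special extra on a changeset and read the requested mapping version from its value.

```rust
use std::collections::BTreeMap;

const CHANGE_XREPO_MAPPING_EXTRA: &str = "change-xrepo-mapping";

/// Returns the mapping version requested via the commit extra, if any.
fn requested_mapping_version(extras: &BTreeMap<String, Vec<u8>>) -> Option<String> {
    extras
        .get(CHANGE_XREPO_MAPPING_EXTRA)
        .map(|v| String::from_utf8_lossy(v).into_owned())
}
```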

This means that landing a commit with this extra should not be allowed for
everyone; I'm going to change bookmark movement to check that.

Reviewed By: ikostia

Differential Revision: D25950531

fbshipit-source-id: 78b1f4950e6bb7da8b511e02685ed6084c29a680
2021-01-19 08:55:27 -08:00
Stanislau Hlebik
fca761e153 mononoke: maybe override SessionClass while deriving data
Summary:
We had issues with mononoke writing too many blobstore sync queue entries while
deriving data for large commits. We've contemplated a few solutions, and
decided to give this one a go.

This approach forces derived data to use the Background SessionClass, which has the
effect of not writing data to the blobstore sync queue if the write to the blobstore
was successful (it still writes data to the queue otherwise). This should
reduce the number of entries we write to the blobstore sync queue
significantly. The downside is that writes might get a bit slower; our
assumption is that this slowdown is acceptable. If that's not the case we can
always disable this option.
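
A hedged sketch of the idea (the enum and function here are illustrative, not the actual Mononoke API): derivation opts into the background class behind a tunable, so the change can be rolled back with a config flip.

```rust
#[derive(Debug, Clone, Copy)]
enum SessionClass {
    /// Normal writes also enqueue an entry to the blobstore sync queue.
    UserWaiting,
    /// Background writes skip the sync queue when the blobstore write succeeds.
    Background,
}

fn session_class_for_derivation(tunable_on: bool) -> SessionClass {
    if tunable_on {
        SessionClass::Background
    } else {
        SessionClass::UserWaiting
    }
}
```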

This diff overrides SessionClass for the normal ::derive() method. However, there's
also batch_derive(); that one will be addressed in the next diff.

One thing to note: we still write the derived data mapping to the blobstore sync queue. That should be fine, as we have a constant number of such writes per commit.

Reviewed By: krallin

Differential Revision: D25910464

fbshipit-source-id: 4113d00bc0efe560fd14a5d4319b743d0a100dfa
2021-01-14 13:02:42 -08:00
Stanislau Hlebik
220baef735 mononoke: add a tunable to disable all x-repo syncs
Reviewed By: krallin

Differential Revision: D25781199

fbshipit-source-id: cc0c515487447c5af8290aff4793e8f026732a50
2021-01-05 15:16:50 -08:00
Lukas Piatkowski
67b05b6c24 mononoke/integration: make integration tests work under @mode/dev-rust-oss
Summary: Last step in covering Mononoke with mode/dev-rust-oss buck build mode.

Reviewed By: markbt

Differential Revision: D25461223

fbshipit-source-id: 3fa0fa05e8c96476e069990e8f5cc6d56acf38c0
2020-12-18 06:13:32 -08:00
Stanislau Hlebik
79dda5ed04 mononoke: log public commits to scribe from scs move/create_bookmark method
Summary:
The goal of this stack is to start logging commits to scribe even if a commit was
introduced by the scs create_commit/move_bookmark API. Currently we don't do that.

Initially I had bigger plans: I wanted to log to scribe only from bookmarks_movement and remove scribe logging from unbundle/processing.rs, but that turned out to be trickier to implement. In general, the approach we use right now, where in order to log to scribe we need to put a `log_commit_to_scribe` call in every place that can possibly create commits or move bookmarks, seems wrong, but changing it is a bit hard. So for now I decided to solve the main problem we have: the fact that we don't get scribe logs from repos where bookmarks are moved via scs methods.

To fix that I added an additional option to the CreateBookmark/UpdateBookmark structs. If this option is set to true, then before moving/creating the bookmark it finds all draft commits that are going to be made public by the bookmark move, i.e. all draft ancestors of the new bookmark destination. It is unfortunate that we have to do this traversal on the critical path of the move_bookmark call, but in practice I hope it won't be too bad, since we do a similar traversal to record bonsai<->git mappings. In case my hopes are wrong, we have scuba logging, which should make it clear that this is an expensive operation, and we also have a tunable to disable this behaviour.

Also note that we don't use PushParams::commit_scribe_category. This is intentional: PushParams::commit_scribe_category doesn't seem useful at all, and I plan to deprecate it later in the stack. Even later it would make sense to deprecate PushrebaseParams::commit_scribe_category as well, and put the commit_scribe_category option in another place in the config.

Reviewed By: markbt

Differential Revision: D25558248

fbshipit-source-id: f7dedea8d6f72ad40c006693d4f1a265977f843f
2020-12-17 00:19:00 -08:00
Mark Juggurnauth-Thomas
db28ca7f59 derived_data_utils: ratelimit backfilling
Summary:
The backfiller may read or write to the blobstore too quickly.  Apply QPS
limits to the backfill batch context to keep the read or write rate acceptable.

Reviewed By: ahornby

Differential Revision: D25371966

fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe
2020-12-14 09:24:58 -08:00
Stefan Filip
2193b84b43 autocargo: regen
Summary: Regen autocargo

Reviewed By: quark-zju

Differential Revision: D25409963

fbshipit-source-id: 7dbbe420aeb30248bf43d3a96a9aa6a47bb2b0be
2020-12-08 18:30:24 -08:00
Thomas Orozco
e65b22d9c4 mononoke/repo_client: do not re-upload commits we already have for infinitepush
Summary:
This adds support for optionally not uploading commits we already have when
they arrive via infinitepush. This can happen if we're replaying bundles.

This works by filtering the commits we have. We still get some futures created
for the underlying uploads, but we never poll them because when we visit
what futures are needed for what commits, we don't select uploads that are
only reachable from commits we filtered out.
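
A hedged sketch of the filtering step (ids are a generic stand-in for changeset ids): commits the server already knows are dropped up front, so the upload futures reachable only from them are never polled.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Keep only the incoming commits the server does not already have.
fn filter_unknown<Id: Eq + Hash + Clone>(
    incoming: &[Id],
    known: &HashSet<Id>,
) -> Vec<Id> {
    incoming
        .iter()
        .filter(|id| !known.contains(*id))
        .cloned()
        .collect()
}
```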

Obviously, this isn't super efficient, since the client still has to send us
all this data, but it's better than having them send us all that data and then
having us take hours overwriting it all.

Reviewed By: mitrandir77

Differential Revision: D25120844

fbshipit-source-id: f96881d0d98ec622259bedb624fd94b600a2bf1d
2020-11-20 06:30:12 -08:00
Mark Juggurnauth-Thomas
845167c91e bookmarks_movement: check for case conflicts when landing commits
Summary:
Move the check for commits not having case conflicts from upload time to when
the commit is being landed to a public bookmark.

This allows draft commits in commit cloud to contain erroneously-introduced
case conflicts, whilst still protecting public commits from these case conflicts.

Note that the checks we are moving are the checks for whether this commit
contains case conflicts internally (two files added by this commit conflict in
case), or against the content of ancestor commits (the commit adds a file which
conflicts with an existing file). The check for whether this commit and the
commits it is being pushrebased over conflict still happens during pushrebase.
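
A hedged sketch of the internal check (simplified: real skeleton manifests cover full trees, and case folding is subtler than `to_lowercase`): two paths conflict when they differ only by case.

```rust
use std::collections::HashMap;

/// Returns the first pair of paths that differ only by case, if any.
fn find_case_conflict<'a>(paths: &[&'a str]) -> Option<(&'a str, &'a str)> {
    let mut seen: HashMap<String, &'a str> = HashMap::new();
    for &p in paths {
        if let Some(prev) = seen.insert(p.to_lowercase(), p) {
            if prev != p {
                return Some((prev, p));
            }
        }
    }
    None
}
```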

This behaviour is controlled by a pair of tunable flags:
  * `check_case_conflicts_on_bookmark_movement` enables the new check at land time
  * `skip_case_conflict_check_on_changeset_upload` disables the old check at upload-time

The `check_case_conflicts_on_bookmark_movement` should always be set if the
`skip_case_conflict_check_on_changeset_upload` is set, otherwise users will
be able to land case conflicts.  Setting `check_case_conflicts_on_bookmark_movement`
without setting `skip_case_conflict_check_on_changeset_upload` allows both
checks to be enabled if that is necessary.

To handle bookmark-only moves, the `run_hooks_on_additional_changesets` tunable
flag must also be enabled.

Reviewed By: ahornby

Differential Revision: D24990174

fbshipit-source-id: 34e40e389f2c2139ba24ecee75473c362f365864
2020-11-18 02:10:55 -08:00
Stanislau Hlebik
d59b6d61e9 mononoke: make hash_validation_percentage a tunable
Summary:
Previously it was a config knob, but config knobs are rather hard to change because
changing them requires a restart of the service. Let's make it a tunable instead.
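
A hedged sketch of percentage-based sampling (not the actual implementation): validate a hash for roughly `pct` percent of fetches, with `pct` read from the tunable.

```rust
use rand::Rng;

/// Returns true for roughly `pct` percent of calls (`pct` in 0..=100).
fn should_validate_hash(pct: u64) -> bool {
    rand::thread_rng().gen_range(0..100) < pct
}
```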

Reviewed By: farnz

Differential Revision: D24682129

fbshipit-source-id: 9832927a97b1ff9da49c69755c3fbdc1871b3c5d
2020-11-09 02:25:24 -08:00
Stanislau Hlebik
a5f6874c72 mononoke: add a tunable to log large read requests
Reviewed By: krallin

Differential Revision: D24780218

fbshipit-source-id: 9207a471e65407a023223ce5a0aeee3511a9823e
2020-11-06 04:23:40 -08:00
Kostia Balytskyi
8540105a10 cross_repo_sync: log commit syncs
Summary:
This adds some basic Scuba logging to commit sync logic in Mononoke. This can be disabled via a tunable.

Note: of course, wrapping the logging functions would have been pretty cool to do in a macro, but I don't have time at the moment to figure that out, so it's just plain code for now.

Reviewed By: StanislavGlebik

Differential Revision: D24571702

fbshipit-source-id: f8830fc675a2a62d2204671e86ab2c819372c5cc
2020-10-30 05:56:40 -07:00
Mark Thomas
c293fd3322 scs_server: add blobstore rate-limiting for requests
Summary:
Some requests can result in a large number of blob fetches.  Add rate limiting
so that these requests don't use up all available capacity.

Rate limits can be specified in tunables.

Reviewed By: ahornby

Differential Revision: D24592814

fbshipit-source-id: 9a73a92094d0ce01be5491b65b4808e3ebb05c11
2020-10-28 11:01:39 -07:00
Stanislau Hlebik
051d0d1ef4 mononoke: add a tunable to disable pushredirected hooks
Reviewed By: ikostia

Differential Revision: D24360325

fbshipit-source-id: b1b5e062c2d9b7af867bf9258d7a2b78d2f26629
2020-10-16 08:24:56 -07:00
Thomas Orozco
ae2566a807 mononoke/derived_data: refer to a tunable to know whether to log slow derivation
Summary:
We don't want to log slow derivations in tests. To do that, let's look at a
tunable to decide if we should log.

We only log if the threshold is > 0, so since the default is 0, that means we
won't log unless a config overrides it (which won't be the case in tests, but
will be the case everywhere else: D24329372).
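
A hedged sketch of the gating logic (names illustrative): with the default threshold of 0, nothing is logged unless a config sets a positive value.

```rust
use std::time::Duration;

/// Log only when the tunable threshold is positive and exceeded.
fn should_log_slow_derivation(elapsed: Duration, threshold_secs: u64) -> bool {
    threshold_secs > 0 && elapsed > Duration::from_secs(threshold_secs)
}
```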

Reviewed By: StanislavGlebik

Differential Revision: D24329377

fbshipit-source-id: e9d70236c614780e383870297d08afaf64450383
2020-10-15 04:03:38 -07:00
Thomas Orozco
ab5af4b053 mononoke: add a tunable for ratio of master fallbacks
Summary:
Let's make this configurable so we can control how many fallbacks we want to
allow if we're overloaded.

Reviewed By: farnz

Differential Revision: D24017088

fbshipit-source-id: 9bccaf831a28daff9696950ae8aac9d53e0c51c0
2020-10-01 01:06:28 -07:00
Stanislau Hlebik
609c2ac257 mononoke: add another optimization for getbundle
Summary:
D23599866 (54d43b7f95) added an optimization for getbundle that reduces cpu usage when a new
commit with a low generation number is added. I.e. the case like this

```
O
|
O
..
   O <- new commit, low generation number
   |
  ...
```

Unfortunately this optimization doesn't help with the case where a new repo is
merged into master

```
O  <- also new commit, but generation number is high!
| \
..  O <- new commit, low generation number, but it's not in "heads" parameter
    |
    |
    O
  ...
```

The merge commit actually has a high generation number, but its p2 has a low
generation number, so it causes the same issue with high cpu usage.

This diff adds a second optimization ( :( ) that should help with the shortcoming of the first one. See comments for more details.

Reviewed By: ikostia

Differential Revision: D23824204

fbshipit-source-id: 8f647f1813d2662e41325829d05def633372c140
2020-09-24 07:52:40 -07:00
Stanislau Hlebik
f6d3fc1fd7 mononoke: bump the timeouts for getpack
Summary:
There were a few instances of timed out getpack requests on ovrsource.
Example: https://fburl.com/sandcastle/yylr1w3v
Let's bump the timeout to unblock them.

Reviewed By: krallin

Differential Revision: D23900374

fbshipit-source-id: 3ee6e2d4f6b6ed12cd0c1516c686a03c87fa7cb4
2020-09-24 06:23:23 -07:00
Stanislau Hlebik
54d43b7f95 mononoke: add getbundle optimization for fetching commits with low generation
Summary:
Currently the getbundle implementation works this way: it accepts two lists of
changesets, a list of `heads` and a list of `common`. getbundle needs to return
enough changesets to the client so that the client has `heads` and all of their
ancestors. `common` is a hint to the server that tells which nodes the client
already has and doesn't need sent.

The current getbundle implementation is inefficient when any of the heads have a low
generation number but we also have a lot of excludes with high generation
numbers. The current implementation needs to find the ancestors of the excludes whose
generation is equal to or lower than that of the head with a low generation number.

This diff is a hacky attempt to improve it. It does so by splitting the heads
into heads with high generation numbers and heads with low generation numbers,
sending a separate getbundle request for each of them, and then combining the
results.

Example
```
 O <- head 1
 |
 O <- exclude 1
 |
...     O <- head 2 <- low generation number
 |      |
 O      O <- exclude 2
```

If we have a request like `(heads: [head1, head2], excludes: [exclude1, exclude2])`, this optimization splits it into two requests: `(heads: [head1], excludes: [exclude1, exclude2])` and `(heads: [head2], excludes: [exclude2])`, and then combines the results. Note that this might result in overfetching, i.e. returning many more commits to the client than it requested. That could make a request much slower, and to combat it I suggest a tunable getbundle_low_gen_optimization_max_traversal_limit that prevents overfetching.
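
A hedged sketch of the split step (types simplified; the real code works on changesets and their generation numbers): partition the requested heads around a cutoff, then run a traversal per group and combine the results.

```rust
/// Partition heads into (high-generation, low-generation) groups.
fn split_heads<H>(heads: Vec<(H, u64)>, cutoff: u64) -> (Vec<H>, Vec<H>) {
    let (high, low): (Vec<_>, Vec<_>) =
        heads.into_iter().partition(|&(_, g)| g > cutoff);
    (
        high.into_iter().map(|(h, _)| h).collect(),
        low.into_iter().map(|(h, _)| h).collect(),
    )
}
```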

Reviewed By: krallin

Differential Revision: D23599866

fbshipit-source-id: fcd773eb6a0fb4e8d2128b819f7a13595aca65fa
2020-09-11 07:32:26 -07:00
Stanislau Hlebik
66fbdf72c7 mononoke: add sampling for redacted accesses
Summary:
Previously we were not logging a redacted access if the previous access was logged
less than MIN_REPORT_TIME_DIFFERENCE_NS ago. That doesn't work well with our
tests.

Let's instead add a sampling tunable.

Reviewed By: krallin

Differential Revision: D23595067

fbshipit-source-id: 47f6152945d9fdc2796fd1e74804e8bcf7f34940
2020-09-09 02:51:41 -07:00
Mateusz Kwapich
f7be2eef14 tunable scuba sampling
Summary:
This allows us to sample the logs of the most popular method (`repo_list_hg_manifest` calls make up 90% of the samples in our scuba table) while still having full logging for other queries and errors.

The sampling can be easily disabled via a tunable. In case we get a lot of errors we can also start sampling the error requests with a simple configerator change.

Reviewed By: krallin

Differential Revision: D23507333

fbshipit-source-id: c7e34467d99410ec3de08cce2db275a55394effd
2020-09-04 06:26:35 -07:00
Kostia Balytskyi
e7ddc6cc13 undesired fetches: regex-based reporting
Summary:
We want to be able to report on more than just one prefix, so let's add regex-based reporting. To make deployment easier, let's keep both options for now and remove the prefix-based one later.

Note: this diff also changes how the situation with an absent `undesired_path_prefix_to_log` is treated. Previously, if `undesired_path_prefix_to_log` was absent but `"undesired_path_repo_name_to_log": "fbsource"` was set, it would report every path. Now it won't report any, which I think is saner behavior. If we ever do want to report every path, we can just add `.*` as a regex.
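
A hedged sketch of the new behaviour (field names taken from the summary, logic simplified): with no regex configured nothing is reported, and `.*` restores report-everything.

```rust
use regex::Regex;

/// Report a path only when a regex is configured and matches it.
fn should_report(path: &str, undesired_path_regex: Option<&Regex>) -> bool {
    undesired_path_regex.map_or(false, |re| re.is_match(path))
}
```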

Reviewed By: StanislavGlebik

Differential Revision: D23447800

fbshipit-source-id: 059109b44256f5703843625b7ab725a243a13056
2020-09-01 12:01:00 -07:00
Mark Thomas
61d45865de bookmarks_movement: prepare for running hooks on additional changesets
Summary:
When bookmarks are moved or created, work out what additional changesets
should have the hooks run on them. This may apply to plain pushes,
force pushrebases, or bookmark-only pushrebases.

At first, this will run in logging-only mode where we will count how many
changesets would have hooks run on them (up to a tunable limit).  We can
enable running of hooks with a tunable killswitch later on.

Reviewed By: StanislavGlebik

Differential Revision: D23194240

fbshipit-source-id: 8031fdc1634168308c7fe2ad3c22ae4389a04711
2020-08-25 09:14:08 -07:00
Mark Thomas
889e84f8d5 bookmarks_movement: move hook running into bookmarks_movement
Summary:
Move the running of hooks from in `repo_client` to in `bookmarks_movement`.

For pushrebase and plain push we still only run hooks on the new commits the client has sent.
Bookmark-only pushrebases, or moves where some commits were already known, do not run
the hooks on the omitted changesets.  That will be addressed next.

The push-redirector currently runs hooks in the large repo. Since hook running has now been moved
later, hooks will automatically be run on the large repo, so the push-redirector instead runs them on
the small repo, to ensure they are run on both.

There's some additional complication with translating hook rejections in the push-redirector.  Since a
bookmark-only push can result in hook rejections for commits that are not translated, we fall back to
using the large-repo commit hash in those scenarios.

Reviewed By: StanislavGlebik

Differential Revision: D23077551

fbshipit-source-id: 07f66a96eaca4df08fc534e335e6d9f6b028730d
2020-08-25 09:14:07 -07:00
Stanislau Hlebik
e308419b58 RFC mononoke: limit number of filenodes get_all_filenodes_maybe_stale
Summary:
In a repository with files with large histories we run into a lot of SqlTimeout
errors while fetching file history to serve getpack calls. However, fetching the
whole file history is not really necessary: the client knows how to work with
partial history, i.e. if the client misses some portion of history it will
just fetch it on demand.

This diff adds a way to limit how many entries will be fetched; if more entries would be fetched, we return FilenodeRangeResult::TooBig. The downside of this diff is that we'd have to do more sequential database
queries.
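
A hedged sketch of the contract (the enum name is from the summary; everything else is simplified): once a fetch would exceed the limit, the caller gets `TooBig` instead of a partial list, and the client falls back to fetching history on demand.

```rust
enum FilenodeRangeResult<T> {
    /// The whole history fits under the limit.
    Present(Vec<T>),
    /// More than `limit` entries: the client should fetch history on demand.
    TooBig,
}

fn limit_filenodes<T>(entries: Vec<T>, limit: usize) -> FilenodeRangeResult<T> {
    if entries.len() > limit {
        FilenodeRangeResult::TooBig
    } else {
        FilenodeRangeResult::Present(entries)
    }
}
```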

Reviewed By: krallin

Differential Revision: D23025249

fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
2020-08-12 14:33:43 -07:00
Stanislau Hlebik
43ac2a1c62 mononoke: use WarmBookmarkCache in repo_client
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception in point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.

Reviewed By: krallin

Differential Revision: D22820879

fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
2020-07-31 03:09:24 -07:00
Thomas Orozco
ae917ba227 mononoke/virtually_sharded_blobstore: make sampling rate tunable
Summary: As it says in the title.

Reviewed By: farnz

Differential Revision: D22432526

fbshipit-source-id: 42726584689cbc2f5c9138b42b7bf77939921bdd
2020-07-08 09:07:19 -07:00