Commit Graph

24 Commits

Author SHA1 Message Date
Stanislau Hlebik
eab97b6123 mononoke: sync changeset implementation for megarepo
Summary: First stab at implementing sync changeset functionality for megarepo.

Reviewed By: ikostia

Differential Revision: D28357210

fbshipit-source-id: 660e3f9914737929391ab1b29f891b3b5dd47638
2021-05-13 10:04:21 -07:00
Alex Hornby
da5dac311b rust: remove patch for async-compression
Summary: Upstream crate has landed my PR for zstd 1.4.9 support and made a release, so can remove this patch now.

Reviewed By: ikostia

Differential Revision: D28221163

fbshipit-source-id: b95a6bee4f0c8d11f495dc17b2737c9ac9142b36
2021-05-05 12:20:34 -07:00
Toan Mai
410f7c5c61 Imported a mysql_common patch to support FromRow for tuples up to arity 16 (#23)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/rust-shed/pull/23

Pull Request resolved: https://github.com/facebookincubator/resctl/pull/8081

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/82

Imported a mysql_common patch to support FromRow for tuples upto arity 16
Context: https://fburl.com/zfnw7r86

Followed the guide: https://www.internalfb.com/intern/wiki/Rust-at-facebook/Managing_fbsource_third-party_with_Reindeer/#maintaining-local-change

Reviewed By: marcelogomez

Differential Revision: D28094262

fbshipit-source-id: fed48e3950e8a3ba3d7a15407522167e5ae41a98
2021-05-05 10:32:48 -07:00
Thomas Orozco
9c7aa6aaf7 third-party/rust: remove patches for Tokio 0.2 & Hyper 0.2
Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.

For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode.
However, this is only used for servers and only when accepting websocket
connections, and those users are just using Hyper as a HTTP client.

Reviewed By: farnz

Differential Revision: D28091331

fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
2021-04-29 08:07:45 -07:00
Thomas Orozco
ffed22260d third-party/rust: remove Gotham 0.2
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.

Reviewed By: farnz

Differential Revision: D28091091

fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
2021-04-29 08:07:45 -07:00
Mark Juggurnauth-Thomas
139d93bedb changesets: split implementation to a separate crate
Summary:
Keeping the `Changesets` trait as well as its implementations in the same crate means that users of `Changesets` also transitively depend on everything that is needed to implement it.

Flatten the dependency graph a little by splitting it into two crates: most users of `Changesets` will only depend on the trait definition.  Only the factories need depend on the implementations.

Reviewed By: krallin

Differential Revision: D27430612

fbshipit-source-id: 6b45fe4ae6b0fa1b95439be5ab491b1675c4b177
2021-04-29 06:11:20 -07:00
Mark Juggurnauth-Thomas
d66e56c407 changesets: remember repo_id in changesets
Summary:
The changesets object is only valid to access the changesets of a single repo
(other repos may have different metadata database config), so it is pointless
for all methods to require the caller to provide the correct one.  Instead,
make the changesets object remember the repo id.

Reviewed By: krallin

Differential Revision: D27430611

fbshipit-source-id: bf2c398af2e5eb77c1c7c55a89752753020939ab
2021-04-29 06:11:20 -07:00
Mark Juggurnauth-Thomas
b935836e32 changesets: replace get_sql_changesets with enumeration methods
Summary:
The `get_sql_changesets` method on `Changesets` is an abstraction violation,
and prevents extraction of `SqlChangesets` to a separate crate as it would
introduce a circular dependency.

It is used to allow bulk queries to enumerate changesets by integer unique ID,
so promote this to a full feature of `changesets`, and remove the
`get_sql_changesets` method.

Reviewed By: krallin

Differential Revision: D27426921

fbshipit-source-id: 2839503029b262dd5e6a8be09bb35bb143b4c5ac
2021-04-29 06:11:20 -07:00
Stefan Filip
4d65afb5c2 bonsai_hg_mapping: add get_hg_in_range
Summary:
I want to add prefix lookup functionality to EdenApi.

I considered adding some sort of prefix abstraction to Hg or our wire types.
so that I could use the method that was already implemented. I found the
serialization and conversion logic complicated to reason about.  It is easier
to transform the prefix query in a range query and add a range query endpoint.
No need to invent new fancy types and deal conversions.

This diff updates the prefix query implementation to rely on the range query.
The range query is functionality that is more general than prefix lookup.

Reviewed By: quark-zju

Differential Revision: D28034531

fbshipit-source-id: 491db2354e3804c4cea6db16fe7d056a962515f7
2021-04-28 10:21:51 -07:00
Thomas Orozco
0f44a4f106 mononoke: update to tokio 1.x
Summary:
NOTE: there is one final pre-requisite here, which is that we should default all Mononoke binaries to `--use-mysql-client` because the other SQL client implementations will break once this lands. That said, this is probably the right time to start reviewing.

There's a lot going on here, but Tokio updates being what they are, it has to happen as just one diff (though I did try to minimize churn by modernizing a bunch of stuff in earlier diffs).

Here's a detailed list of what is going on:

- I had to add a number `cargo_toml_dir` for binaries in `eden/mononoke/TARGETS`, because we have to use 2 versions of Bytes concurrently at this time, and the two cannot co-exist in the same Cargo workspace.
- Lots of little Tokio changes:
  - Stream abstractions moving to `tokio-stream`
  - `tokio::time::delay_for` became `tokio::time::sleep`
  - `tokio::sync::Sender::send` became `tokio::sync::Sender::broadcast`
  - `tokio::sync::Semaphore::acquire` returns a `Result` now.
  - `tokio::runtime::Runtime::block_on` no longer takes a `&mut self` (just a `&self`).
  - `Notify` grew a few more methods with different semantics. We only use this in tests, I used what seemed logical given the use case.
- Runtime builders have changed quite a bit:
  - My `no_coop` patch is gone in Tokio 1.x, but it has a new `tokio::task::unconstrained` wrapper (also from me), which I included on  `MononokeApi::new`.
  - Tokio now detects your logical CPUs, not physical CPUs, so we no longer need to use `num_cpus::get()` to figure it out.
- Tokio 1.x now uses Bytes 1.x:
  - At the edges (i.e. streams returned to Hyper or emitted by RepoClient), we need to return Bytes 1.x. However, internally we still use Bytes 0.5 in some places (notably: Filestore).
  - In LFS, this means we make a copy. We used to do that a while ago anyway (in the other direction) and it was never a meaningful CPU cost, so I think this is fine.
  - In Mononoke Server it doesn't really matter because that still generates ... Bytes 0.1 anyway so there was a copy before from 0.1 to 0.5 and it's from 0.1 to 1.x.
  - In the very few places where we read stuff using Tokio from the outside world (historical import tools for LFS), we copy.
- tokio-tls changed a lot, they removed all the convenience methods around connecting. This resulted in updates to:
  - How we listen in Mononoke Server & LFS
  - How we connect in hgcli.
  - Note: all this stuff has test coverage.
- The child process API changed a little bit. We used to have a ChildWrapper around the hg sync job to make a Tokio 0.2.x child look more like a Tokio 1.x Child, so now we can just remove this.
- Hyper changed their Websocket upgrade mechanism (you now need the whole `Request` to upgrade, whereas before that you needed just the `Body`, so I changed up our code a little bit in Mononoke's HTTP acceptor to defer splitting up the `Request` into parts until after we know whether we plan to upgrade it.
- I removed the MySQL tests that didn't use mysql client, because we're leaving that behind and don't intend to support it on Tokio 1.x.

Reviewed By: mitrandir77

Differential Revision: D26669620

fbshipit-source-id: acb6aff92e7f70a7a43f32cf758f252f330e60c9
2021-04-28 07:36:31 -07:00
Ilia Medianikov
449fd2fd02 mononoke/pushrebase_hooks: add a hook that saves prepushrebase changeset id
Summary:
Knowing the prepushrebase changeset id is required for retroactive_review.
retroactive_review checks landed commits, but verify_integrity hook runs on a commit before landing. This way the landed commit has no straightforward connection with the original one and retroactive_review can't acknowledge if verify_integrity have seen it.

Reviewed By: krallin

Differential Revision: D27911317

fbshipit-source-id: f7bb0cfbd54fa6ad2ed27fb9d4d67b9f087879f1
2021-04-27 03:52:50 -07:00
Alex Hornby
bc85aade21 rust: update to zstd to 0.7.0+zstd.1.4.9
Summary:
Update the zstd crates.

This also patches async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 to update to zstd 1.4.9 can land.

Reviewed By: jsgf, dtolnay

Differential Revision: D27942174

fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
2021-04-22 14:34:06 -07:00
Alex Hornby
45f521ddde mononoke: enable default patch.crates-io for internal Cargo.tomls
Reviewed By: quark-zju

Differential Revision: D27915811

fbshipit-source-id: 3f830def66c1c5f0569925c42cc8335ee585e0e7
2021-04-22 10:59:42 -07:00
Thomas Orozco
2a3cec2dfa mononoke/changesets: add RendezVousConnection
Summary:
Like it says in the title, this includes rendezvous into changesets. This is
our busiest connection by far, and it is often hitting the limits of our
connection pool: https://fburl.com/ods/kuq5x1vw

Reviewed By: markbt

Differential Revision: D27794574

fbshipit-source-id: e2574ce003f12f6c9ecafd0079fe5194cc63c24b
2021-04-16 10:27:44 -07:00
Thomas Orozco
1ee93bdcfb mononoke/rendezvous: use in-flight connection count to decide when to batch
Summary:
After doing some local benchmarking (using MononokeApi instantiation as the
benchmark), one thing that's apparent is that we have quite a few parameters
here and that tuning them is likely to be a challenge.

One parameter in particular is the batch "objective", which controls how many
requests we want to see in the last batching interval before we choose to
batch (this is `rendezvous_dispatch_min_threshold`).

The problem with this is this is that there is no good, real-world, metric to
set it based on. This in contrast to the other parameters we have, which do
have some reasonable metric to compare to:

- rendezvous_dispatch_delay_ms: this is overhead we add to queries, so it
  should be small & on the order of query execution latency (i.e. a few ms).
- rendezvous_dispatch_max_threshold: this controls how big our batches get, so
  it should be on the order of what makes a SQL query too big (i.e. less than
  a hundred records).

In contrast, we want to set `rendezvous_dispatch_min_threshold` such that
batching kicks in before we start using too many concurrent connections (which
is what query batching seeks to reduce), but the problem is that those two
numbers aren't directly connected. One clear problem, for example, is that if
our DB is in-region vs. out of-region, then for a given query execution time,
and a desired concurrency level before batching kicks in, we'd need different
values of `rendezvous_dispatch_min_threshold` (it would have to kick in faster
for the out-of-region workload).

So, this diff updates rendez vou to actually track concurrent connection count
before we force batching. This is the actual metric we care about here, and it
has a pretty natural "real world" values we can look at to decide where to set
it (our connection pool — which is limited at 100 concurrent connections —, and
our open connection baseline).

Note: I set this at 5 because that's more or less what servers look like
outside of spikes for Bonsai hg mapping, and of Changesets where I'm planning to
introduce this in the future:

- bonsai: https://fburl.com/ods/6d4a9qb5
- changesets: https://fburl.com/ods/kuq5x1vw (note: to make sense of this,
  focus on just one server, otherwise the constnat spikes we get sort of hide
  the big picture).

Reviewed By: farnz

Differential Revision: D27792603

fbshipit-source-id: 1a9189f6b50d48444b3373bd1cb14dc51b85a6d2
2021-04-16 10:27:44 -07:00
Thomas Orozco
c2c904f933 mononoke: initialize loggers, config, caching, tunables & runtime in MononokeMatches
Summary:
Basically every single Mononoke binary starts with the same preamble:

- Init mononoke
- Init caching
- Init logging
- Init tunables

Some of them forget to do it, some don't, etc. This is a mess.

To make things messier, our initialization consists of a bunch of lazy statics
interacting with each other (init logging & init configerator are kinda
intertwined due to the fact that configerator wants a logger but dynamic
observability wants a logger), and methods you must only call once.

This diff attempts to clean this up by moving all this initialization into the
construction of MononokeMatches. I didn't change all the accessor methods
(though I did update those that would otherwise return things instantiated at
startup).

I'm planning to do a bit more on top of this, as my actual goal here is to make
it easier to thread arguments from MononokeMatches to RepoFactory, and to do so
I'd like to just pass my MononokeEnvironment as an input to RepoFactory.

Reviewed By: HarveyHunt

Differential Revision: D27767698

fbshipit-source-id: 00d66b07b8c69f072b92d3d3919393300dd7a392
2021-04-16 10:27:43 -07:00
Mark Juggurnauth-Thomas
a5f7cccc38 benchmark: remove dependency on blobrepo_factory
Summary:
Remove the dependency on blobrepo factory by defining a custom facet factory
for benchmark repositories.

Reviewed By: krallin

Differential Revision: D27400618

fbshipit-source-id: 626e19f09914545fb72053d91635635b2bfb6e51
2021-04-07 14:01:48 -07:00
Mark Juggurnauth-Thomas
69896e90b5 bonsai_hg_mapping: construct rendezvous connections in a blocking closure
Summary:
`RendezVousConnection::new` can block for some time doing work on the CPU,
specifically creating the stats objects.  This causes problems for other
futures during repo construction.

Instead, move rendez-vous construction to a `spawn_blocking` closure, so that
it doesn't interfere with the other futures.

Since `SqlBonsaiHgMapping::from_sql_connections` is not async, and is part of
the SqlConstruct trait, we must convert this to the builder pattern so that we
can defer rendez-vous construction to later on.

Reviewed By: farnz

Differential Revision: D27501915

fbshipit-source-id: 9c58c32411301128424985deeab127d052c43532
2021-04-01 08:27:15 -07:00
Mark Juggurnauth-Thomas
dad89a4233 test_repo_factory: use test factory for derived data tests
Summary: Use the test factory for existing derived data tests.

Reviewed By: farnz

Differential Revision: D27169433

fbshipit-source-id: b9b57a97f73aeb639162c359ab60c6fd92da14a8
2021-03-25 07:34:49 -07:00
Mark Juggurnauth-Thomas
6b16e16fa9 blobrepo: convert to facet container
Summary:
Convert `BlobRepo` to a `facet::container`.  This will allow it to be built
from an appropriate facet factory.

This only changes the definition of the structure: we still use
`blobrepo_factory` to construct it.  The main difference is in the types
of the attributes, which change from `Arc<dyn Trait>` to
`Arc<dyn Trait + Send + Sync + 'static`>, specified by the `ArcTrait` alias
generated by the `#[facet::facet]` macro.

Reviewed By: StanislavGlebik

Differential Revision: D27169437

fbshipit-source-id: 3496b6ee2f0d1e72a36c9e9eb9bd3d0bb7beba8b
2021-03-25 07:34:49 -07:00
Alex Hornby
6e75bcc077 mononoke: add hg derived data support to benchmark_large_directory
Summary: Add ability to benchmark hg derivation

Reviewed By: krallin

Differential Revision: D26983309

fbshipit-source-id: 593f16c752610db4253e44509ffcf218dd241796
2021-03-12 02:58:53 -08:00
Mark Juggurnauth-Thomas
91358f3716 mononoke_types: use SortedVectorMap for BonsaiChangeset
Summary:
BonsaiChangesets are rarely mutated, and their maps are stored in sorted order,
so we can use `SortedVectorMap` to load them more efficiently.

In the cases where mutable maps of filechanges are needed, we can use `BTreeMap`
during the mutation and then convert them to `SortedVectorMap` to store them.

Reviewed By: mitrandir77

Differential Revision: D25615279

fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
36f78eadb8 benchmarks: add benchmark_large_directory
Summary:
Add a microbenchmark for deriving data with large directories.

This benchmark creates a commit with 100k files in a single directory, and then
derives data for that commit and 10 descendant commits, each of which add,
modify and remove some files.

Reviewed By: ahornby

Differential Revision: D26947361

fbshipit-source-id: 4215f1ac806c53a112217ceb10e50cfad56f4f28
2021-03-11 04:28:42 -08:00
Mark Juggurnauth-Thomas
eb4d31cc82 benchmark: rename to benchmarks/simulated_repo
Summary: Rename this benchmark to a specific name so that we can add new benchmarks.

Differential Revision: D26947362

fbshipit-source-id: a1d060ee79781aa0ead51f284517471431418659
2021-03-11 04:28:42 -08:00