Commit Graph

203 Commits

Simon Farnsworth
24a157be39 Teach the packer to do multiple packs in a single process
Summary:
Paying the setup and teardown overhead of multiple processes seems silly, when we can pack in parallel in a single process.

Make it possible to run multiple packing runs from a single packer process
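
As a rough sketch of the idea (not the packer's actual CLI or types), several pack runs can share one process with bounded concurrency, paying setup and teardown only once:

```rust
use futures::stream::{self, StreamExt};

// Stand-in for one packing run; the real packer operates on blobstore keys.
async fn pack_one(run: usize) {
    println!("packing run {}", run);
}

#[tokio::main]
async fn main() {
    // Several packing runs share one process, with bounded parallelism.
    stream::iter(0..4)
        .for_each_concurrent(2, |run| pack_one(run))
        .await;
}
```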

Reviewed By: ahornby

Differential Revision: D28508527

fbshipit-source-id: eab07d028db46d62731f06effbde2f5bc5579000
2021-05-19 03:32:40 -07:00
Alex Hornby
6c96b7f1f4 mononoke: switch manual_scrub to buffered_unordered
Summary:
manual_scrub was using the ordered form of buffered so that checkpoint was written correctly.

This diff switches to buffered_unordered, which can give better throughput. To do so, the checkpoint logic uses a tracker to know which keys have completed, so it can save the latest done key that has no preceding keys still executing.
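
A minimal sketch of that tracker idea, using item indices instead of the real blobstore key strings (names and shapes are illustrative only):

```rust
use std::collections::BTreeSet;

// Toy tracker for out-of-order completion (illustrative; the real scrub
// tracker works on blobstore key strings rather than indices).
struct CheckpointTracker {
    next_expected: usize, // first index not yet part of the safe prefix
    done_out_of_order: BTreeSet<usize>,
}

impl CheckpointTracker {
    fn new() -> Self {
        Self { next_expected: 0, done_out_of_order: BTreeSet::new() }
    }

    // Record a completed item and return the new checkpoint: the last index
    // with no earlier item still in flight, if the safe prefix advanced.
    fn complete(&mut self, index: usize) -> Option<usize> {
        self.done_out_of_order.insert(index);
        let mut advanced = false;
        while self.done_out_of_order.remove(&self.next_expected) {
            self.next_expected += 1;
            advanced = true;
        }
        advanced.then(|| self.next_expected - 1)
    }
}

fn main() {
    let mut tracker = CheckpointTracker::new();
    // Completions can arrive out of order, as buffered_unordered allows.
    for index in [1, 0, 3, 2] {
        if let Some(checkpoint) = tracker.complete(index) {
            println!("safe to checkpoint through item {}", checkpoint);
        }
    }
}
```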

Reviewed By: farnz

Differential Revision: D28438371

fbshipit-source-id: 274aa371a0c33d37d0dc7779b04daec2b5e1bc15
2021-05-17 01:13:57 -07:00
Stanislau Hlebik
eab97b6123 mononoke: sync changeset implementation for megarepo
Summary: First stab at implementing sync changeset functionality for megarepo.

Reviewed By: ikostia

Differential Revision: D28357210

fbshipit-source-id: 660e3f9914737929391ab1b29f891b3b5dd47638
2021-05-13 10:04:21 -07:00
Kostia Balytskyi
0a58fc0c46 megarepo_api: introduce a MegarepoApi struct to rule them all
Summary:
This struct is intended to be a single entry-point to the megarepo logic. It is also intended to be owned directly by scs_server, without the `Mononoke` struct (from `mononoke_api`) as an intermediary. In effect, this means that the mononoke server won't be affected by `MegarepoApi` at all.

Apart from adding this struct, this diff also adds instantiation of prod `AsyncMethodRequestQueue` and wires it up to the scs_server to enqueue and poll requests.

Reviewed By: StanislavGlebik

Differential Revision: D28356563

fbshipit-source-id: b67ee48387d794fd333d106d3ffd40124086c97e
2021-05-12 12:00:20 -07:00
Stanislau Hlebik
4e232ea94d mononoke: add mapping for megarepo
Summary:
Adding a mapping to keep track of two things:
1) the latest source commit that was synced into a given target - this will be used by the sync_changeset() method to validate whether a parent changeset of a given changeset was already synced
2) which source commit maps to which target commit
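
A hypothetical, much-simplified sketch of those two mappings (names and types are illustrative; the real mapping is SQL-backed and keyed by Mononoke changeset ids):

```rust
use std::collections::HashMap;

type SourceName = String;
type Target = String;
type ChangesetId = u64;

#[derive(Default)]
struct MegarepoMapping {
    // 1) the latest source commit that was synced into a given target
    latest_synced: HashMap<(Target, SourceName), ChangesetId>,
    // 2) which source commit maps to which target commit
    source_to_target: HashMap<(Target, ChangesetId), ChangesetId>,
}

impl MegarepoMapping {
    fn record_sync(
        &mut self,
        target: Target,
        source: SourceName,
        source_cs: ChangesetId,
        target_cs: ChangesetId,
    ) {
        self.latest_synced.insert((target.clone(), source), source_cs);
        self.source_to_target.insert((target, source_cs), target_cs);
    }

    // sync_changeset() would consult this to check that a parent of the
    // changeset being synced was already synced into the target.
    fn latest_synced_commit(&self, target: &str, source: &str) -> Option<ChangesetId> {
        self.latest_synced
            .get(&(target.to_string(), source.to_string()))
            .copied()
    }
}

fn main() {
    let mut mapping = MegarepoMapping::default();
    mapping.record_sync("big-repo".into(), "small-repo".into(), 1, 100);
    assert_eq!(mapping.latest_synced_commit("big-repo", "small-repo"), Some(1));
    println!("parent already synced; ok to sync the child changeset");
}
```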

Reviewed By: ikostia

Differential Revision: D28319908

fbshipit-source-id: f776d294d779695e99d644bf5f0a5a331272cc14
2021-05-11 02:54:01 -07:00
Stanislau Hlebik
df340221a0 mononoke: add commit_rewriting logic to megarepo_api
Summary:
This is going to be used to rewrite (or transform) commits from source to
target. This diff does a few things:
1) adds a MultiMover type and a function that produces a mover given a config. This is similar to the Mover type we used for the fbsource<->ovrsource megarepo sync, though this time it can produce several target paths for a given source path.
2) Moves the `rewrite_commit` function from cross_repo_sync to megarepo_api, and makes it work with MultiMover.
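
The shape of a MultiMover can be sketched like this (illustrative only; the real one operates on Mononoke MPaths and is built from a SyncTargetConfig rather than a plain prefix list):

```rust
use std::path::{Path, PathBuf};

// One source path can map to several target paths.
type MultiMover = Box<dyn Fn(&Path) -> Vec<PathBuf>>;

// Build a mover that places each source path under several target prefixes.
fn prefix_multi_mover(prefixes: Vec<PathBuf>) -> MultiMover {
    Box::new(move |src| prefixes.iter().map(|prefix| prefix.join(src)).collect())
}

fn main() {
    let mover = prefix_multi_mover(vec![PathBuf::from("linked/a"), PathBuf::from("linked/b")]);
    // Unlike the single-target Mover used for fbsource<->ovrsource,
    // this yields multiple target paths for one source path.
    for target in mover(Path::new("lib/foo.rs")) {
        println!("{}", target.display());
    }
}
```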

Reviewed By: ikostia

Differential Revision: D28259214

fbshipit-source-id: 16ba106dc0c65cb606df10c1a210578621c62367
2021-05-10 11:48:23 -07:00
Kostia Balytskyi
f10ef62cba megarepo: basic version of async-requests crate
Summary:
This crate is a foundation for the async requests support in megarepo service.

The idea is to be able to store serialized parameters in the blobstore upon
request arrival, and to be able to query request results from the blobstore
while polling.

This diff manipulates the following classes of types:
- param types for async methods: self-explanatory
- response types: these contain only a resulting value of a completed successful execution
- stored result types: these contain a result value of a completed execution. It may either be successful or failed. These types exist for the purpose of preserving execution result in the blobstore.
- poll-response types: these contain an optional response. If the optional value is empty, this means that the request is not yet ready
- polling tokens: these are used by the client to ask about the processing status for a submitted request

Apart from that, some of these types have both Rust and Thrift counterparts, mainly for the purposes of us being able to implement traits for Rust types.

Relationships between these types are encoded in various traits and their associated types.

The lifecycle of an async request is therefore as follows:
1. the request is submitted by the client, and enqueued
   1. params are serialized and saved into a blobstore
   1. an entry is created in the SQL table
   1. the key from that table is used to create a polling token
1. some external system processes a request [completely absent from this diff]
   1. it notices a new entry in the queue
   1. it reads request's params from the blobstore
   1. it processes the request
   1. it preserves either a success of a failure of the request into the blobstore
   1. it updates the SQL table to mention that the request is now ready to be polled
1. the client polls the request
   1. queue struct receives a polling token
   1. out of that token it constructs DB keys
   1. it looks up the request row and checks if it is in the ready state
   1. if that is the case, it reads the result_blobstore_key value and fetches serialized result object
   1. now it has to turn this serialized result into a poll response:
       1. if the result is absent, poll response is a success with an empty payload
        1. if the result is present and successful, poll response is a success with the result's successful variant as a payload
       1. if the result is present and is a failure, the polling call throws a thrift exception with that failure

Note: Why is there yet another .thrift file introduced in this diff? I felt like these types aren't a part of the scs interface, so they don't belong in `source_control.thrift`. On the other hand, they wrap things defined in `source_control.thrift`, so I needed to include it.
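
The stored-result to poll-response mapping described in the last step can be sketched with plain Rust enums (toy types; the real ones are Thrift-generated and carry richer payloads):

```rust
// Toy sketch of turning a stored execution result into a poll response.
enum StoredResult {
    Success(String),
    Failure(String),
}

enum PollResponse {
    NotReadyYet,    // poll succeeds with an empty payload
    Ready(String),  // poll succeeds with the successful variant as payload
}

fn poll(stored: Option<StoredResult>) -> Result<PollResponse, String> {
    match stored {
        // result absent: request still running, poll succeeds with no payload
        None => Ok(PollResponse::NotReadyYet),
        // completed successfully: poll succeeds with the result as payload
        Some(StoredResult::Success(payload)) => Ok(PollResponse::Ready(payload)),
        // completed with a failure: polling surfaces it as an error/exception
        Some(StoredResult::Failure(err)) => Err(err),
    }
}

fn main() {
    assert!(matches!(poll(None), Ok(PollResponse::NotReadyYet)));
    assert!(matches!(
        poll(Some(StoredResult::Success("done".into()))),
        Ok(PollResponse::Ready(_))
    ));
    assert!(poll(Some(StoredResult::Failure("boom".into()))).is_err());
    println!("poll mapping behaves as described");
}
```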

Reviewed By: StanislavGlebik

Differential Revision: D27964822

fbshipit-source-id: fc1a33a799d01c908bbe18a5394eba034b780174
2021-05-10 06:51:37 -07:00
Kostia Balytskyi
0617e3489e move scm/service into eden/mononoke/scs
Reviewed By: ahornby

Differential Revision: D28286267

fbshipit-source-id: 349a2d94eca9cf563ee2bb4076e268917aaa4fd6
2021-05-10 05:53:38 -07:00
Simon Farnsworth
23cbacd701 Remove cachelib setup from segmented_changelog_tailer
Summary:
Because cachelib is not initialised at this point, it returns `None` unconditionally.

I'm refactoring the cachelib bindings so that this returns an error, so take it out completely for now, leaving room to add it back in if caching turns out to be useful here.

Reviewed By: sfilipco

Differential Revision: D28286986

fbshipit-source-id: cd9f43425a9ae8f0eef6fd32b8cd0615db9af5f6
2021-05-07 12:24:22 -07:00
Kostia Balytskyi
390605bf1e megarepo: initial impl of async requests table
Summary:
This is where async requests are logged to be processed, and from where they
are polled later.

It will acquire more functionality when the actual request processing business
logic is implemented.

Reviewed By: StanislavGlebik

Differential Revision: D28092910

fbshipit-source-id: 00e45229aa2db73fa0ae5a1cf99b8f2d3a162006
2021-05-06 11:33:39 -07:00
Alex Hornby
da5dac311b rust: remove patch for async-compression
Summary: Upstream crate has landed my PR for zstd 1.4.9 support and made a release, so we can remove this patch now.

Reviewed By: ikostia

Differential Revision: D28221163

fbshipit-source-id: b95a6bee4f0c8d11f495dc17b2737c9ac9142b36
2021-05-05 12:20:34 -07:00
Toan Mai
410f7c5c61 Imported a mysql_common patch to support FromRow for tuples up to arity 16 (#23)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/rust-shed/pull/23

Pull Request resolved: https://github.com/facebookincubator/resctl/pull/8081

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/82

Imported a mysql_common patch to support FromRow for tuples up to arity 16
Context: https://fburl.com/zfnw7r86

Followed the guide: https://www.internalfb.com/intern/wiki/Rust-at-facebook/Managing_fbsource_third-party_with_Reindeer/#maintaining-local-change

Reviewed By: marcelogomez

Differential Revision: D28094262

fbshipit-source-id: fed48e3950e8a3ba3d7a15407522167e5ae41a98
2021-05-05 10:32:48 -07:00
Gus Wynn
cbbb45206b slog max_level_debug -> trace
Reviewed By: Imxset21

Differential Revision: D28097080

fbshipit-source-id: 7d417f8256922926cf379d9c2fb3249f6d2544ef
2021-05-03 10:30:21 -07:00
Thomas Orozco
9c7aa6aaf7 third-party/rust: remove patches for Tokio 0.2 & Hyper 0.2
Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.

For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode,
but the bugfix only matters for servers, and only when accepting websocket
connections; those users are just using Hyper as an HTTP client.

Reviewed By: farnz

Differential Revision: D28091331

fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
2021-04-29 08:07:45 -07:00
Thomas Orozco
ffed22260d third-party/rust: remove Gotham 0.2
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.

Reviewed By: farnz

Differential Revision: D28091091

fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
2021-04-29 08:07:45 -07:00
Mark Juggurnauth-Thomas
139d93bedb changesets: split implementation to a separate crate
Summary:
Keeping the `Changesets` trait as well as its implementations in the same crate means that users of `Changesets` also transitively depend on everything that is needed to implement it.

Flatten the dependency graph a little by splitting it into two crates: most users of `Changesets` will only depend on the trait definition.  Only the factories need depend on the implementations.

Reviewed By: krallin

Differential Revision: D27430612

fbshipit-source-id: 6b45fe4ae6b0fa1b95439be5ab491b1675c4b177
2021-04-29 06:11:20 -07:00
Thomas Orozco
0f44a4f106 mononoke: update to tokio 1.x
Summary:
NOTE: there is one final pre-requisite here, which is that we should default all Mononoke binaries to `--use-mysql-client` because the other SQL client implementations will break once this lands. That said, this is probably the right time to start reviewing.

There's a lot going on here, but Tokio updates being what they are, it has to happen as just one diff (though I did try to minimize churn by modernizing a bunch of stuff in earlier diffs).

Here's a detailed list of what is going on:

- I had to add a number of `cargo_toml_dir` entries for binaries in `eden/mononoke/TARGETS`, because we have to use 2 versions of Bytes concurrently at this time, and the two cannot co-exist in the same Cargo workspace.
- Lots of little Tokio changes (a few are sketched in code after this list):
  - Stream abstractions moving to `tokio-stream`
  - `tokio::time::delay_for` became `tokio::time::sleep`
  - `tokio::sync::Sender::send` became `tokio::sync::Sender::broadcast`
  - `tokio::sync::Semaphore::acquire` returns a `Result` now.
  - `tokio::runtime::Runtime::block_on` no longer takes a `&mut self` (just a `&self`).
  - `Notify` grew a few more methods with different semantics. We only use this in tests, I used what seemed logical given the use case.
- Runtime builders have changed quite a bit:
  - My `no_coop` patch is gone in Tokio 1.x, but it has a new `tokio::task::unconstrained` wrapper (also from me), which I included on `MononokeApi::new`.
  - Tokio now detects your logical CPUs, not physical CPUs, so we no longer need to use `num_cpus::get()` to figure it out.
- Tokio 1.x now uses Bytes 1.x:
  - At the edges (i.e. streams returned to Hyper or emitted by RepoClient), we need to return Bytes 1.x. However, internally we still use Bytes 0.5 in some places (notably: Filestore).
  - In LFS, this means we make a copy. We used to do that a while ago anyway (in the other direction) and it was never a meaningful CPU cost, so I think this is fine.
  - In Mononoke Server it doesn't really matter because that still generates ... Bytes 0.1 anyway, so there was already a copy from 0.1 to 0.5, and now it's from 0.1 to 1.x.
  - In the very few places where we read stuff using Tokio from the outside world (historical import tools for LFS), we copy.
- tokio-tls changed a lot, they removed all the convenience methods around connecting. This resulted in updates to:
  - How we listen in Mononoke Server & LFS
  - How we connect in hgcli.
  - Note: all this stuff has test coverage.
- The child process API changed a little bit. We used to have a ChildWrapper around the hg sync job to make a Tokio 0.2.x child look more like a Tokio 1.x Child, so now we can just remove this.
- Hyper changed their Websocket upgrade mechanism (you now need the whole `Request` to upgrade, whereas before you needed just the `Body`), so I changed our code a little bit in Mononoke's HTTP acceptor to defer splitting up the `Request` into parts until after we know whether we plan to upgrade it.
- I removed the MySQL tests that didn't use mysql client, because we're leaving that behind and don't intend to support it on Tokio 1.x.
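
A few of those Tokio changes in code form, as a hedged before/after sketch (the 0.2 calls are shown as comments):

```rust
use tokio::sync::Semaphore;
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    // tokio 0.2: tokio::time::delay_for(Duration::from_millis(10)).await;
    sleep(Duration::from_millis(10)).await;

    // tokio 0.2: Semaphore::acquire returned a permit directly;
    // tokio 1.x: it returns a Result, because the semaphore can be closed.
    let semaphore = Semaphore::new(2);
    let _permit = semaphore.acquire().await.expect("semaphore closed");

    // tokio 0.2: Runtime::block_on took &mut self; in 1.x it takes &self,
    // so sharing a Runtime no longer requires exclusive access.
    println!("running on tokio 1.x");
}
```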

Reviewed By: mitrandir77

Differential Revision: D26669620

fbshipit-source-id: acb6aff92e7f70a7a43f32cf758f252f330e60c9
2021-04-28 07:36:31 -07:00
Stanislau Hlebik
c35a78c649 mononoke: add a tool to copy blobs from one repo to another
Reviewed By: krallin

Differential Revision: D28024373

fbshipit-source-id: 4954f6d322f924b9291326ef5a948a1c52230955
2021-04-27 08:54:34 -07:00
Ilia Medianikov
449fd2fd02 mononoke/pushrebase_hooks: add a hook that saves prepushrebase changeset id
Summary:
Knowing the prepushrebase changeset id is required for retroactive_review.
retroactive_review checks landed commits, but the verify_integrity hook runs on a commit before landing. As a result, the landed commit has no straightforward connection with the original one, and retroactive_review can't tell whether verify_integrity has seen it.

Reviewed By: krallin

Differential Revision: D27911317

fbshipit-source-id: f7bb0cfbd54fa6ad2ed27fb9d4d67b9f087879f1
2021-04-27 03:52:50 -07:00
Ilia Medianikov
93b8cf116b mononoke/pushrebase: split pushrebase crate
Summary:
Split pushrebase crate into pushrebase hook definition and pushrebase implementation.
Before this change it was impossible to store an attribute in BlobRepo that would depend on PushrebaseHook as it would create a circular dependency `pushrebase -> blobrepo -> pushrebase`.

Reviewed By: krallin

Differential Revision: D27968029

fbshipit-source-id: 030ef1b02216546cd3658de1f417bc52b8dd9843
2021-04-27 03:52:50 -07:00
Alex Hornby
bc85aade21 rust: update to zstd to 0.7.0+zstd.1.4.9
Summary:
Update the zstd crates.

This also patches async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 to update to zstd 1.4.9 can land.

Reviewed By: jsgf, dtolnay

Differential Revision: D27942174

fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
2021-04-22 14:34:06 -07:00
Kostia Balytskyi
d48c87a95e megarepo: introduce write side of MononokeMegarepoConfigs
Summary:
`MononokeMegarepoConfig` is going to be a single point of access to the
config storage system, providing both writes and reads. It is also a trait, to
allow for unit-test implementations later.

This diff introduces the trait and implements the write side of the
configerator-based implementor. The read side, OSS impl, and test impl
are left `unimplemented`; the read side and test impl will be implemented in the future.

Things I had to consider while implementing this:
- I wanted to store each version of `SyncTargetConfig` in an individual
  `.cconf` in configerator
- at the same time, I did not want all of them to live in the same dir, to
  avoid having dirs with thousands of files in them
- dir sharding uses sha1 of the target repo + target bookmark + version name,
  then separates it into a dir name and a file name, like git does
- this means that these `.cconf` files are not "human-addressable" in the
  configerator repo
- to help this, each new config creation also creates an entry in one of the
  "index" files: human-readable maps from target + version name to a
  corresponding `.cconf`
- using a single index file is also impractical, so these are separated by
  asciification of the repo_id + bookmark name

Note: this design means that there's no automatic way to fetch the list of all
targets in use. This can be bypassed by maintaining an extra index layer, which
will list all the targets. I don't think this is very important atm.
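
A minimal sketch of the git-style dir sharding described above, assuming the hash of target repo + bookmark + version is already available as a hex string (the real code derives it with SHA-1):

```rust
// Split like git objects: the first two hex chars form the directory,
// the rest the file name, so no single dir ends up with thousands of files.
fn shard_config_path(hex_digest: &str, config_root: &str) -> String {
    let (dir, file) = hex_digest.split_at(2);
    format!("{}/{}/{}.cconf", config_root, dir, file)
}

fn main() {
    let digest = "3f786850e387550fdab836ed7e6dc881de23001b";
    println!("{}", shard_config_path(digest, "megarepo/configs"));
    // -> megarepo/configs/3f/786850e387550fdab836ed7e6dc881de23001b.cconf
}
```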

Reviewed By: StanislavGlebik

Differential Revision: D27795663

fbshipit-source-id: 4d824ee4320c8be5187915b23e9c9d261c198fe1
2021-04-22 02:13:19 -07:00
Alex Hornby
4b8ed9a79a rust: remove gotham and hyper patches as referenced PRs have been released
Summary: Remove gotham and hyper patches as referenced PRs have been released

Reviewed By: krallin

Differential Revision: D27905248

fbshipit-source-id: a2b25562614b71b25536b29bb1657a3f3a5de83c
2021-04-21 05:19:37 -07:00
Kostia Balytskyi
81675b2d5b megarepo: introduce megarepo_error and make configo_client use it
Summary:
We want to distinguish user vs system errors in `configo_client` and its users
(`mononoke_config` for instance). The reason is to allow `scs_server` to
distinguish the two types of errors. Normally user errors would only ever be
instantiated fairly "shallowly" in the `scs_server` itself, but `configo_client`
is a transactional client (by this I mean that it calls user-provided transformation functions on fetched data), so we need to allow for a user error to originate from these updater functions.
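
A hypothetical sketch of such a user/system error split (illustrative only; the actual megarepo_error types may differ), assuming the `anyhow` and `thiserror` crates:

```rust
use anyhow::anyhow;
use thiserror::Error;

// Hypothetical shape of the user-vs-system split described above.
#[derive(Debug, Error)]
pub enum MegarepoError {
    #[error("user error: {0}")]
    User(anyhow::Error),
    #[error("system error: {0}")]
    System(anyhow::Error),
}

impl MegarepoError {
    // Constructors let deep layers (like configo_client's updater functions)
    // tag an error as user-caused even though they are far from scs_server.
    pub fn user(e: impl Into<anyhow::Error>) -> Self {
        MegarepoError::User(e.into())
    }
    pub fn system(e: impl Into<anyhow::Error>) -> Self {
        MegarepoError::System(e.into())
    }
}

fn main() {
    let err = MegarepoError::user(anyhow!("no such sync target"));
    println!("{}", err);
}
```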

Reviewed By: StanislavGlebik

Differential Revision: D27880928

fbshipit-source-id: e00a5580a339fdabc4b858235da9cd7e9fc7a376
2021-04-21 05:02:00 -07:00
Simon Farnsworth
ecf7c4665b Implement a baseline dumb packer
Summary:
This is a baseline packer, to let us experiment with packing.

It chooses the dictionary that gives us the smallest size, but does nothing intelligent around key handling.

Note that at this point, packing is not always a win - e.g. the alias blobs are 432 bytes in total, but the resulting pack is 1528 bytes.
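
In spirit, the choice the packer makes can be sketched like this (toy code; the real packer compresses with zstd dictionaries and works on blobstore keys):

```rust
// Pick the candidate encoding with the fewest bytes, but leave the blobs
// unpacked if even the best pack is larger than storing them raw.
// Candidates here are pre-computed byte buffers standing in for
// zstd-with-dictionary output.
fn choose_packing(raw_total: usize, candidates: &[(String, Vec<u8>)]) -> Option<&str> {
    candidates
        .iter()
        .min_by_key(|(_, bytes)| bytes.len())
        .filter(|(_, bytes)| bytes.len() < raw_total)
        .map(|(name, _)| name.as_str())
}

fn main() {
    // e.g. the alias blobs: 432 bytes raw in total, but the best pack came
    // out at 1528 bytes, so they are better left unpacked.
    let candidates = vec![
        ("dict-a".to_string(), vec![0u8; 1528]),
        ("dict-b".to_string(), vec![0u8; 1600]),
    ];
    match choose_packing(432, &candidates) {
        Some(dict) => println!("pack with {}", dict),
        None => println!("leave blobs unpacked"),
    }
}
```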

Reviewed By: ahornby

Differential Revision: D27795486

fbshipit-source-id: 05620aadbf7b680cbfcb0932778f5849eaab8a48
2021-04-20 12:45:08 -07:00
Kostia Balytskyi
815b5ad04a megarepo_api: introduce a basic configo client
Summary:
This is not used on its own, but in subsequent diffs I will add a use-case:
the megarepo configs crate.

When built in non-fbcode mode, this crate does not export anything. I chose this approach, as opposed to exporting no-op stubs, to force the clients to pay attention and implement gating on their side too. This seems reasonable for a rather generic configo client.

Reviewed By: StanislavGlebik

Differential Revision: D27790753

fbshipit-source-id: d6dcec884ed7aa88abe5796ef0e58be8525893e2
2021-04-19 08:34:12 -07:00
Alex Hornby
43f8fbab26 rust: remove smallvec fork
Summary: Now we're on rustc 1.51 the fork is no longer needed.

Reviewed By: dtolnay

Differential Revision: D27827632

fbshipit-source-id: 131841590d3987d53f5f8afb5ebc205cd36937fb
2021-04-19 01:32:20 -07:00
Thomas Orozco
db4c509b9e mononoke: use MononokeEnvironment in RepoFactory
Summary:
There is a very frustrating operation that happens often when working on the
Mononoke code base:

- You want to add a flag
- You want to consume it in the repo somewhere

Unfortunately, when we need to do this, we end up having to thread this from a
million places and parse it out in every single main() we have.

This is a mess, and it results in every single Mononoke binary starting with
heaps of useless boilerplate:

```
    let matches = app.get_matches();

    let (caching, logger, mut runtime) = matches.init_mononoke(fb)?;

    let config_store = args::init_config_store(fb, &logger, &matches)?;

    let mysql_options = args::parse_mysql_options(&matches);
    let blobstore_options = args::parse_blobstore_options(&matches)?;
    let readonly_storage = args::parse_readonly_storage(&matches);
```

So, this diff updates us to just use MononokeEnvironment directly in
RepoFactory, which means none of that has to happen: we can now add a flag,
parse it into MononokeEnvironment, and get going.

While we're at it, we can also remove blobstore options and all that jazz from
MononokeApiEnvironment since now it's there in the underlying RepoFactory.
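
As a toy illustration of the pattern this moves towards (not Mononoke's actual types), all cross-cutting options live in one environment struct that the factory owns, so adding a flag no longer touches every main():

```rust
// Toy illustration only: bundle cross-cutting options into one environment
// struct handed to a factory, instead of threading each option separately.
#[derive(Clone, Debug)]
struct Environment {
    readonly_storage: bool,
    blobstore_put_concurrency: usize,
}

struct RepoFactory {
    env: Environment,
}

impl RepoFactory {
    fn new(env: Environment) -> Self {
        Self { env }
    }

    // Adding a new option now only touches Environment and its consumer,
    // not every binary's main().
    fn build(&self, name: &str) -> String {
        format!(
            "repo {} (readonly={}, put_concurrency={})",
            name, self.env.readonly_storage, self.env.blobstore_put_concurrency
        )
    }
}

fn main() {
    let env = Environment { readonly_storage: false, blobstore_put_concurrency: 100 };
    let factory = RepoFactory::new(env);
    println!("{}", factory.build("fbsource"));
}
```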

Reviewed By: HarveyHunt

Differential Revision: D27767700

fbshipit-source-id: e1e359bf403b4d3d7b36e5f670aa1a7dd4f1d209
2021-04-16 10:27:43 -07:00
Jan Mazur
77a205db89 quiet certain connection errors when shutting down
Summary: Similar to D27155123 (1a56da1c6f).

Reviewed By: krallin

Differential Revision: D27805926

fbshipit-source-id: cf58a2e9b2ef92ca536f3b61b63fb42cfb1ec940
2021-04-16 02:26:04 -07:00
Thomas Orozco
e64012ad9e mononoke/timeseries: introduce a basic crate for tracking time series
Summary:
I'd like to be able to track time series for access within Mononoke. The
underlying use case here is that I want to be able to track the max count of
connections in our SQL connection pools over time (and possibly other things in
the future).

Now, the obvious question is: why am I rolling my own? Well, as it turns out,
there isn't really an implementation of this that I can reuse:

- You might expect to be able to track the max of a value via fb303, but you
  can't:

https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/fb303/ExportType.h?commit=0405521ec858e012c0692063209f3e13a2671043&lines=26-29

- You might go look in Folly, but you'll find that the time series there only
  supports tracking Sum & Average, but I want my timeseries to track Max (and
  in fact I'd like it to be sufficiently flexible to track anything I want):

https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/folly/stats/BucketedTimeSeries.h

It's not the first time I've run into a need for something like this. I needed
it in RendezVous to track connections over the last two N-millisecond intervals,
and we needed it in metagit for host draining as well (note that the
implementation here is somewhat inspired by the implementation there).
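
A toy sketch of the kind of thing it provides - a bucketed series that keeps the max per time bucket (the real crate is more general about what it aggregates; names here are made up):

```rust
use std::time::{Duration, Instant};

// Toy bucketed time series tracking the per-bucket max of a value - the
// aggregation fb303/folly don't offer out of the box.
struct MaxTimeSeries {
    start: Instant,
    bucket_width: Duration,
    buckets: Vec<Option<u64>>,
}

impl MaxTimeSeries {
    fn new(bucket_width: Duration, num_buckets: usize) -> Self {
        Self { start: Instant::now(), bucket_width, buckets: vec![None; num_buckets] }
    }

    // Record an observation (e.g. current SQL pool connection count) into the
    // bucket covering "now", keeping only the max seen in that bucket.
    fn record(&mut self, value: u64) {
        let idx = (self.start.elapsed().as_millis() / self.bucket_width.as_millis()) as usize;
        if let Some(slot) = self.buckets.get_mut(idx) {
            *slot = Some(slot.map_or(value, |prev| prev.max(value)));
        }
    }

    fn maxima(&self) -> &[Option<u64>] {
        &self.buckets
    }
}

fn main() {
    let mut ts = MaxTimeSeries::new(Duration::from_millis(10), 4);
    for connections in [3, 7, 5] {
        ts.record(connections);
    }
    println!("per-bucket maxima: {:?}", ts.maxima());
}
```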

Reviewed By: mzr

Differential Revision: D27678388

fbshipit-source-id: ba6d244b8bb848d4e1a12f9c6f54e3aa729f6c9c
2021-04-12 05:22:33 -07:00
Arun Kulshreshtha
e6e2e61084 third-party/rust: patch curl and curl-sys
Summary: Update the `curl` and `curl-sys` crates to use a patched version that supports `CURLOPT_SSLCERT_BLOB` and similar config options that allow the use of in-memory TLS credentials. These options were added last year in libcurl version `7.71.0`, but the Rust bindings have not yet been updated to support them. I intend to upstream this patch, but until then, this will allow us to use these options in fbcode.

Reviewed By: quark-zju

Differential Revision: D27633208

fbshipit-source-id: 911e0b8809bc0144ad8b32749e71208bd08458fd
2021-04-08 11:50:38 -07:00
Mark Juggurnauth-Thomas
53550b9f10 blobrepo_factory: remove blobrepo_factory
Summary: This has been superseded by `RepoFactory`.

Reviewed By: krallin

Differential Revision: D27400617

fbshipit-source-id: e029df8be6cd2b7f3a5917050520b83bce5630e9
2021-04-07 14:01:49 -07:00
Mark Juggurnauth-Thomas
3b9817b5d8 benchmark_storage_config: remove dependency on blobrepo_factory
Summary: Use the equivalent function from `repo_factory`.

Reviewed By: krallin

Differential Revision: D27363470

fbshipit-source-id: dce3cf843174caa2f9ef7083409e7935749be4cd
2021-04-07 14:01:47 -07:00
Mark Juggurnauth-Thomas
f902acfcd1 repo_factory: add main repo factory
Summary:
Add a factory for building development and production repositories.

This factory can be re-used to build many repositories, and they will share
metadata database factories and blobstores if their configs match.

Similarly, the factory will only load redacted blobs once per metadata
database config, rather than once per repo.
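
The sharing behaviour can be sketched as config-keyed memoization (illustrative names only; the real factory caches more components than this):

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Clone, PartialEq, Eq, Hash)]
struct BlobstoreConfig {
    path: String,
}

struct Blobstore; // stand-in for a real blobstore client

#[derive(Default)]
struct RepoFactory {
    blobstores: HashMap<BlobstoreConfig, Arc<Blobstore>>,
}

impl RepoFactory {
    // Two repos whose blobstore configs match get the same Arc'd instance.
    fn blobstore(&mut self, config: &BlobstoreConfig) -> Arc<Blobstore> {
        self.blobstores
            .entry(config.clone())
            .or_insert_with(|| Arc::new(Blobstore))
            .clone()
    }
}

fn main() {
    let mut factory = RepoFactory::default();
    let config = BlobstoreConfig { path: "/blobs".to_string() };
    let a = factory.blobstore(&config);
    let b = factory.blobstore(&config);
    // Both repos share one underlying blobstore instance.
    assert!(Arc::ptr_eq(&a, &b));
    println!("blobstore shared between repos with matching configs");
}
```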

Reviewed By: krallin

Differential Revision: D27323369

fbshipit-source-id: 951f7343af97f5e507b76fb822ad2e66f4d8f3bd
2021-04-07 14:01:46 -07:00
Alex Hornby
1e98362120 mononoke: add checkpoint for input to manual_scrub
Summary: Reduce the number of manual steps needed to restart a manual scrub by checkpointing its progress to a file.

Differential Revision: D27588450

fbshipit-source-id: cb0eda7d6ff57f3bb18a6669d38f5114ca9196b0
2021-04-07 09:30:51 -07:00
Alex Hornby
e907f53bf8 mononoke: add progress to manual_scrub
Summary:
Some manual scrub runs can take a long time. Provide progress feedback logging.

Includes a --quiet option for when progress reporting is not required.

Reviewed By: farnz

Differential Revision: D27588449

fbshipit-source-id: 00840cdf2022358bc10398f08b3bbf3eeec2b299
2021-04-06 08:45:29 -07:00
Alex Hornby
d2f915d4dd mononoke: compress manual_scrub output files if requested
Summary: Use async-compression to optionally zstd compress key and error files.
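
A minimal sketch of the compression side, assuming the async-compression crate with its `tokio` and `zstd` features enabled (the real scrub wiring makes this optional per output file):

```rust
use async_compression::tokio::write::ZstdEncoder;
use tokio::fs::File;
use tokio::io::AsyncWriteExt;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Wrap the output file in a zstd encoder; writes are compressed on the way through.
    let file = File::create("keys.txt.zst").await?;
    let mut encoder = ZstdEncoder::new(file);
    for key in ["repo0001.content.blake2.aaaa", "repo0001.content.blake2.bbbb"] {
        encoder.write_all(key.as_bytes()).await?;
        encoder.write_all(b"\n").await?;
    }
    // shutdown() finishes the zstd frame; without it the file is truncated.
    encoder.shutdown().await?;
    Ok(())
}
```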

Reviewed By: farnz

Differential Revision: D27467761

fbshipit-source-id: e1ccb7dc32e4c41eaba82a3716cf4d13f64f71ea
2021-04-01 07:02:06 -07:00
generatedunixname89002005287564
fc6c12b9c7 Daily common/rust/cargo_from_buck/bin/autocargo
Reviewed By: farnz

Differential Revision: D27458795

fbshipit-source-id: 6d4f0e8ebcebe2f71e1152a61bf6511230c31af5
2021-03-31 02:34:06 -07:00
Mark Juggurnauth-Thomas
64461bb361 test_repo_factory: use test factory for remaining tests
Summary: Use the test factory for the remaining existing tests.

Reviewed By: StanislavGlebik

Differential Revision: D27169443

fbshipit-source-id: 00d62d7794b66f5d3b053e8079f09f2532d757e7
2021-03-25 07:34:51 -07:00
Mark Juggurnauth-Thomas
cc30bf6552 test_repo_factory: add factory for test repos
Summary:
Create a factory that can be used to build repositories in tests.

The test repo factory will be kept in a separate crate from the production repo factory, so that it can depend on a smaller set of dependencies: just those needed for in-memory test repos. This should eventually help make compilation faster for tests.

A notable difference between the test repos produced by this factory and the ones produced by `blobrepo_factory` is that the new repos share the in-memory metadata database.  This is closer to what we use in production, and in a couple of places it is relied upon and existing tests must use `dangerous_override` to make it happen.

Reviewed By: ahornby

Differential Revision: D27169441

fbshipit-source-id: 82541a2ae71746f5e3b1a2a8a19c46bf29dd261c
2021-03-25 07:34:49 -07:00
Jan Mazur
24d2aa1442 bump region2region QPS
Summary:
We need to bump SCS counters expressing Mononoke's QPS. They will look something like:
`requests:mononoke:oregon:carolina` for requests coming from proxygen in prn and mononoke in frc.

CSLB expects regions' full names.

We're getting src region from proxygen as a header.

Reviewed By: krallin

Differential Revision: D27082868

fbshipit-source-id: 12accb8a9df5cf6a80c2c281d2f61ac1e68176d1
2021-03-22 10:12:43 -07:00
Mark Juggurnauth-Thomas
46e3a31d10 repo_derived_data: add new repo attribute to encapsulate derived data
Summary:
Add `repo_derived_data`.  This is a new struct that encapsulates derived data
configuration and lease operations, and will be used in the facet-based
construction of repositories.

Reviewed By: ahornby

Differential Revision: D27169431

fbshipit-source-id: dee7c032deb93db8934736c111ba7238a6aaf935
2021-03-22 07:26:48 -07:00
Mark Juggurnauth-Thomas
3d6160f216 repo_identity: add new repo attribute to encapsulate identity
Summary:
Add `repo_identity`.  This is a new struct that encapsulates repository
identity and will be used in the facet-based construction of repositories.

Reviewed By: ahornby

Differential Revision: D27169445

fbshipit-source-id: 02a435bba54a633190c6d2e4316e86726aecfdf0
2021-03-22 07:26:48 -07:00
Mark Juggurnauth-Thomas
c83baeb00d segmented_changelog: split trait to separate crate
Summary:
Resolve a circular dependency whereby `BlobRepo` needs to depend on
`Arc<dyn SegmentedChangelog>`, but the segmented changelog implementation
depends on `BlobRepo`, by moving the trait definition to its own crate.

Reviewed By: sfilipco

Differential Revision: D27169423

fbshipit-source-id: 5bf7c632607dc8baba40d7a9d65e96e265d58496
2021-03-22 07:26:47 -07:00
Thomas Orozco
a3a0347639 mononoke/rendezvous: introduce query batching
Summary:
This introduces a basic building block for query batching. I called this
rendezvous, since it's about multiple queries meeting up in the same place :)

There are a few (somewhat conflicting) goals this tries to satisfy, so let's go
over them:

1), we'd like to reduce the total number of queries made by batch jobs. For
example, group hg bonsai lookups made by the walker. Those jobs are
characterized by the fact that they have a lot of queries to make, all the
time. Here's an example: https://fburl.com/ods/zuiep7yh.

2), we'd like to reduce the overall number of connections held to MySQL by
our tasks. The main way we achieve this is by reducing the maximum number of
concurrent queries. Indeed, a high total number of queries doesn't necessarily
result in a lot of connections as long as they're not concurrent, because we
can reuse connections. On the other hand, if you dispatch 100 concurrent
queries, that _does_ use 100 connections. This is something that applies to
batch jobs due to their query volume, but also to "interactive" jobs like
Mononoke Server or SCS, just not all the time. Here's an example:
https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but
sometimes spikes substantially).

2.1) It's also worth noting that concurrent queries are often the result of
many clients wanting the same data, so deduplication is also useful here.

3), we also don't want to impact the latency of interactive jobs when they
need to do a little query here or there (i.e. it's largely fine if our jobs all
hold a few connections to MySQL and use them somewhat consistently).

4), we'd like this to make it easier to do batching right. For example, if
you have 100 Bonsais to map to hg, you should be able to just map and call
`future::try_join_all` and have that do the right thing.

5), we don't want "bad" queries to affect other queries negatively. One
example would be the occasional queries we make to Bonsai <-> Hg mapping in
`known` for thousands (if not more) of rows.

6), we want this to be easy to incorporate into the codebase.

So, how do we try to address all of this? Here's how:

- We ... do batching, and we deduplicate requests in a batch. This is the
  easier bit and should address #1, #2, #2.1 and #4.
- However, batching is conditional. We notably don't batch very large requests
  with the rest (addresses #5). We also don't batch small queries all the time:
  we only batch if we are observing a throughput of queries that suggests we
  can find some benefit in batching (this targets #3).
- Finally, we have some utilities for common cases like having to group by repo
  id (this is `MultiRendezVous`), and this is all configurable via tunables
  (and the default is to not do anything).
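
A toy sketch of the core batch-and-deduplicate idea using tokio channels (illustrative only; the real RendezVous batches conditionally, based on observed throughput and tunables):

```rust
use std::collections::HashMap;
use tokio::sync::{mpsc, oneshot};
use tokio::time::{timeout, Duration};

type Key = u64;
type Value = String;

// Stand-in for one batched backend query; in Mononoke this would be a single
// SQL statement covering every key in the batch.
async fn fetch_many(keys: &[Key]) -> HashMap<Key, Value> {
    keys.iter().map(|k| (*k, format!("value-{}", k))).collect()
}

// Collect requests that arrive close together, deduplicate their keys, issue
// one query, and fan the results back out to every waiter.
async fn dispatcher(mut rx: mpsc::Receiver<(Key, oneshot::Sender<Value>)>) {
    while let Some((key, reply)) = rx.recv().await {
        let mut waiters: HashMap<Key, Vec<oneshot::Sender<Value>>> = HashMap::new();
        waiters.entry(key).or_default().push(reply);
        // Keep the batch open for a short window; identical keys are merged.
        while let Ok(Some((key, reply))) = timeout(Duration::from_millis(5), rx.recv()).await {
            waiters.entry(key).or_default().push(reply);
        }
        let keys: Vec<Key> = waiters.keys().copied().collect();
        let results = fetch_many(&keys).await;
        for (key, senders) in waiters {
            if let Some(value) = results.get(&key) {
                for tx in senders {
                    let _ = tx.send(value.clone());
                }
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(100);
    tokio::spawn(dispatcher(rx));
    // 100 concurrent single-key lookups over 10 distinct keys collapse into
    // a handful of deduplicated batch queries.
    let lookups = (0..100u64).map(|i| {
        let tx = tx.clone();
        async move {
            let (reply_tx, reply_rx) = oneshot::channel();
            tx.send((i % 10, reply_tx)).await.unwrap();
            reply_rx.await.unwrap()
        }
    });
    let values = futures::future::join_all(lookups).await;
    println!("fetched {} values", values.len());
}
```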

Reviewed By: StanislavGlebik

Differential Revision: D27010317

fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07
2021-03-19 08:50:40 -07:00
Chengxiong Ruan
4fb5ba1152 Use released cursive_tab and cursive_buffered_backend version (#8078)
Summary:
Pull Request resolved: https://github.com/facebookincubator/resctl/pull/8078

Pull Request resolved: https://github.com/facebookexperimental/rust-shed/pull/21

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/76

Use released version to fix cursive_core version conflicts.

Reviewed By: boyuni

Differential Revision: D27032206

fbshipit-source-id: ba664b21cd55453dbc8124ff967a6f9d61fc4926
2021-03-16 10:00:37 -07:00
Stanislau Hlebik
480a0e9ef7 mononoke: start moving streaming changelog logic to rust
Summary:
Our current streaming changelog updater logic is written in Python, and it has a
few downsides:
1) It writes directly to manifold, which means it bypasses all the multiplexed
blobstore logic...
2) ...more importantly, we can't write to non-manifold blobstores at all.
3) There are no tests for the streaming changelogs

This diff moves the logic of the initial creation of the streaming changelog entry to
Rust, which should fix the issues mentioned above. I want to highlight that
this implementation only works for the initial creation case, i.e. when there are no
entries in the database. Next diffs will add incremental updates functionality.

Reviewed By: krallin

Differential Revision: D27008485

fbshipit-source-id: d9583bb1b98e5c4abea11c0a43c42bc673f8ed48
2021-03-12 14:46:30 -08:00
Mark Juggurnauth-Thomas
91358f3716 mononoke_types: use SortedVectorMap for BonsaiChangeset
Summary:
BonsaiChangesets are rarely mutated, and their maps are stored in sorted order,
so we can use `SortedVectorMap` to load them more efficiently.

In the cases where mutable maps of filechanges are needed, we can use `BTreeMap`
during the mutation and then convert them to `SortedVectorMap` to store them.
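
The pattern can be sketched with a toy frozen map (the real `sorted_vector_map` crate is what Mononoke uses; this just shows the mutate-in-BTreeMap, then freeze-to-sorted-Vec idea):

```rust
use std::collections::BTreeMap;

// Toy frozen map: a sorted Vec of pairs with binary-search lookups.
struct SortedVectorMap<K, V>(Vec<(K, V)>);

impl<K: Ord, V> SortedVectorMap<K, V> {
    fn get(&self, key: &K) -> Option<&V> {
        self.0
            .binary_search_by(|(k, _)| k.cmp(key))
            .ok()
            .map(|idx| &self.0[idx].1)
    }
}

impl<K: Ord, V> From<BTreeMap<K, V>> for SortedVectorMap<K, V> {
    fn from(map: BTreeMap<K, V>) -> Self {
        // BTreeMap iterates in key order, so the Vec is already sorted and
        // loading it later is a straight copy rather than a tree rebuild.
        SortedVectorMap(map.into_iter().collect())
    }
}

fn main() {
    // Mutate in a BTreeMap while building the changeset...
    let mut file_changes = BTreeMap::new();
    file_changes.insert("dir/a.txt".to_string(), "add".to_string());
    file_changes.insert("dir/b.txt".to_string(), "remove".to_string());
    // ...then freeze into the sorted representation for storage.
    let frozen: SortedVectorMap<_, _> = file_changes.into();
    println!("{:?}", frozen.get(&"dir/a.txt".to_string()));
}
```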

Reviewed By: mitrandir77

Differential Revision: D25615279

fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4
2021-03-11 04:28:43 -08:00
Mark Juggurnauth-Thomas
36f78eadb8 benchmarks: add benchmark_large_directory
Summary:
Add a microbenchmark for deriving data with large directories.

This benchmark creates a commit with 100k files in a single directory, and then
derives data for that commit and 10 descendant commits, each of which add,
modify and remove some files.

Reviewed By: ahornby

Differential Revision: D26947361

fbshipit-source-id: 4215f1ac806c53a112217ceb10e50cfad56f4f28
2021-03-11 04:28:42 -08:00
Mark Juggurnauth-Thomas
eb4d31cc82 benchmark: rename to benchmarks/simulated_repo
Summary: Rename this benchmark to a specific name so that we can add new benchmarks.

Differential Revision: D26947362

fbshipit-source-id: a1d060ee79781aa0ead51f284517471431418659
2021-03-11 04:28:42 -08:00