Summary:
Paying the setup and teardown overhead of multiple processes seems silly when we can pack in parallel within a single process.
Make it possible to run multiple packing runs from a single packer process.
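As a hedged sketch of the shape of this change (the real packer's types and entry points differ; `pack_one_run` is a hypothetical stand-in for a single packing run):
```
use anyhow::Result;
use futures::future::try_join_all;

// Hypothetical stand-in for a single packing run over a set of keys.
async fn pack_one_run(keys: Vec<String>) -> Result<()> {
    // ... fetch blobs for `keys`, compress them into a pack, write it out ...
    let _ = keys;
    Ok(())
}

// Drive several packing runs concurrently from a single process, paying the
// process setup and teardown costs only once.
async fn pack_all_runs(runs: Vec<Vec<String>>) -> Result<()> {
    try_join_all(runs.into_iter().map(pack_one_run)).await?;
    Ok(())
}
```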
Reviewed By: ahornby
Differential Revision: D28508527
fbshipit-source-id: eab07d028db46d62731f06effbde2f5bc5579000
Summary:
manual_scrub was using the ordered form of buffered so that the checkpoint was written correctly.
This diff switches to buffered_unordered, which can give better throughput. To keep checkpointing correct, it uses a tracker to know which keys have completed, so it can save the latest completed key that has no preceding keys still executing.
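A minimal sketch of the tracker idea, using sequence numbers to stand in for scrub keys (illustrative only, not the actual scrub code):
```
use std::collections::BTreeSet;

// Keys are issued in order; `complete` marks one as done and reports the
// newest key that is safe to checkpoint (no earlier key still executing).
struct CheckpointTracker {
    issued: u64,         // keys 0..issued have been handed out, in order
    done: BTreeSet<u64>, // completed keys at or beyond the frontier
    frontier: u64,       // every key < frontier has completed
}

impl CheckpointTracker {
    fn new() -> Self {
        Self { issued: 0, done: BTreeSet::new(), frontier: 0 }
    }

    fn issue(&mut self) -> u64 {
        let key = self.issued;
        self.issued += 1;
        key
    }

    // Mark `key` complete; if the contiguous-completion frontier advanced,
    // return the new safe checkpoint.
    fn complete(&mut self, key: u64) -> Option<u64> {
        self.done.insert(key);
        let old = self.frontier;
        while self.done.remove(&self.frontier) {
            self.frontier += 1;
        }
        (self.frontier > old).then(|| self.frontier - 1)
    }
}
```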
Reviewed By: farnz
Differential Revision: D28438371
fbshipit-source-id: 274aa371a0c33d37d0dc7779b04daec2b5e1bc15
Summary:
This struct is intended to be a single entry-point to the megarepo logic. It is also intended to be owned directly by scs_server, without the `Mononoke` struct (from `mononoke_api`) as an intermediary. In effect, this means that mononoke server won't be affected by `MegarepoApi` at all.
Apart from adding this struct, this diff also adds instantiation of prod `AsyncMethodRequestQueue` and wires it up to the scs_server to enqueue and poll requests.
Reviewed By: StanislavGlebik
Differential Revision: D28356563
fbshipit-source-id: b67ee48387d794fd333d106d3ffd40124086c97e
Summary:
Adding a mapping to keep track of two things:
1) the latest source commit that was synced into a given target - this will be used by the sync_changeset() method to validate whether a parent changeset of a given changeset was already synced
2) which source commit maps to what target commit
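A rough sketch of the mapping interface this implies (hypothetical names and types, not the actual crate's API):
```
use anyhow::Result;
use async_trait::async_trait;

// Placeholder types standing in for the real ones.
pub struct Target { pub repo_id: i64, pub bookmark: String }
#[derive(Clone, Copy)]
pub struct ChangesetId([u8; 32]);

#[async_trait]
pub trait MegarepoMapping: Send + Sync {
    // (2) record that `source_cs` was synced into `target_cs` for `target`.
    async fn insert(
        &self,
        target: &Target,
        source_cs: ChangesetId,
        target_cs: ChangesetId,
    ) -> Result<()>;

    // (1) the latest source commit synced into `target`; sync_changeset()
    // uses this to validate that a parent changeset was already synced.
    async fn get_latest_synced(&self, target: &Target) -> Result<Option<ChangesetId>>;
}
```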
Reviewed By: ikostia
Differential Revision: D28319908
fbshipit-source-id: f776d294d779695e99d644bf5f0a5a331272cc14
Summary:
This is going to be used to rewrite (or transform) commits from source to
target. This diff does a few things:
1) adds a MultiMover type and a function that produces a mover given a config. This is similar to the Mover type we used for the fbsource <-> ovrsource megarepo sync, though this time it can produce several target paths for a given source path.
2) moves the `rewrite_commit` function from cross_repo_sync to megarepo_api, and makes it work with MultiMover.
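As a sketch, the relationship between the two mover shapes (an `MPath` placeholder stands in for Mononoke's path type; signatures are illustrative):
```
use anyhow::Result;
use std::sync::Arc;

// Placeholder standing in for Mononoke's MPath.
#[derive(Clone)]
pub struct MPath(pub String);

// The existing fbsource <-> ovrsource mover: one source path maps to at
// most one target path.
pub type Mover = Arc<dyn Fn(&MPath) -> Result<Option<MPath>> + Send + Sync>;

// The new multi-mover: one source path can map to several target paths, so
// a single source file may end up in multiple places in the target.
pub type MultiMover = Arc<dyn Fn(&MPath) -> Result<Vec<MPath>> + Send + Sync>;
```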
Reviewed By: ikostia
Differential Revision: D28259214
fbshipit-source-id: 16ba106dc0c65cb606df10c1a210578621c62367
Summary:
This crate is a foundation for async request support in the megarepo service.
The idea is to be able to store serialized parameters in the blobstore upon
request arrival, and to be able to query request results from the blobstore
while polling.
This diff manipulates the following classes of types:
- param types for async methods: self-explanatory
- response types: these contain only a resulting value of a completed successful execution
- stored result types: these contain a result value of a completed execution. It may either be successful or failed. These types exist for the purpose of preserving execution result in the blobstore.
- poll-response types: these contain an optional response. If the optional value is empty, the request is not yet ready
- polling tokens: these are used by the client to ask about the processing status for a submitted request
Apart from that, some of these types have both Rust and Thrift counterparts, mainly so that we can implement traits for the Rust types.
Relationships between these types are encoded in various traits and their associated types.
The lifecycle of an async request is therefore as follows:
1. the request is submitted by the client, and enqueued:
   1. params are serialized and saved into a blobstore
   2. an entry is created in the SQL table
   3. the key from that table is used to create a polling token
2. some external system processes the request [completely absent from this diff]:
   1. it notices a new entry in the queue
   2. it reads the request's params from the blobstore
   3. it processes the request
   4. it preserves either a success or a failure of the request into the blobstore
   5. it updates the SQL table to mark that the request is now ready to be polled
3. the client polls the request:
   1. the queue struct receives a polling token
   2. out of that token it constructs DB keys
   3. it looks up the request row and checks if it is in the ready state
   4. if that is the case, it reads the result_blobstore_key value and fetches the serialized result object
   5. now it has to turn this serialized result into a poll response (see the sketch below):
      - if the result is absent, the poll response is a success with an empty payload
      - if the result is present and successful, the poll response is a success with the result's successful variant as a payload
      - if the result is present and is a failure, the polling call throws a thrift exception with that failure
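A sketch of that last branching step (hypothetical Rust stand-ins; the real types are Thrift-generated):
```
// Hypothetical stand-ins for the Thrift-generated types.
enum StoredResult<R, E> {
    Success(R),
    Failure(E),
}

struct PollResponse<R> {
    // None means "not ready yet"; Some(r) is a completed success.
    response: Option<R>,
}

fn poll_response<R, E>(stored: Option<StoredResult<R, E>>) -> Result<PollResponse<R>, E> {
    match stored {
        // Result absent: the request is still running.
        None => Ok(PollResponse { response: None }),
        // Completed successfully: the successful variant becomes the payload.
        Some(StoredResult::Success(r)) => Ok(PollResponse { response: Some(r) }),
        // Completed with a failure: surfaced as a thrift exception to the poller.
        Some(StoredResult::Failure(e)) => Err(e),
    }
}
```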
Note: Why is there yet another .thrift file introduced in this diff? I felt like these types aren't a part of the scs interface, so they don't belong in `source_control.thrift`. On the other hand, they wrap things defined in `source_control.thrift`, so I needed to include it.
Reviewed By: StanislavGlebik
Differential Revision: D27964822
fbshipit-source-id: fc1a33a799d01c908bbe18a5394eba034b780174
Summary:
Because cachelib is not initialised at this point, it returns `None` unconditionally.
I'm refactoring the cachelib bindings so that this will return an error - take it out completely for now, leaving room to add it back in if caching proves useful here.
Reviewed By: sfilipco
Differential Revision: D28286986
fbshipit-source-id: cd9f43425a9ae8f0eef6fd32b8cd0615db9af5f6
Summary:
This is where async requests are logged to be processed, and from where they
are polled later.
It will acquire more functionality when the actual request processing business
logic is implemented.
Reviewed By: StanislavGlebik
Differential Revision: D28092910
fbshipit-source-id: 00e45229aa2db73fa0ae5a1cf99b8f2d3a162006
Summary: Upstream crate has landed my PR for zstd 1.4.9 support and made a release, so we can remove this patch now.
Reviewed By: ikostia
Differential Revision: D28221163
fbshipit-source-id: b95a6bee4f0c8d11f495dc17b2737c9ac9142b36
Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.
For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode.
However, the bug only affected servers and only when accepting websocket
connections, and those remaining users are just using Hyper as an HTTP client.
Reviewed By: farnz
Differential Revision: D28091331
fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.
Reviewed By: farnz
Differential Revision: D28091091
fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
Summary:
Keeping the `Changesets` trait as well as its implementations in the same crate means that users of `Changesets` also transitively depend on everything that is needed to implement it.
Flatten the dependency graph a little by splitting it into two crates: most users of `Changesets` will only depend on the trait definition. Only the factories need depend on the implementations.
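Schematically, the split looks like this (crate and method names illustrative, not the exact definitions):
```
use anyhow::Result;
use async_trait::async_trait;

pub struct ChangesetEntry; // placeholder for the real entry type
pub struct ChangesetId;    // placeholder for the real id type

// Crate `changesets` (trait only): what most users depend on.
#[async_trait]
pub trait Changesets: Send + Sync {
    async fn add(&self, entry: ChangesetEntry) -> Result<bool>;
    async fn get(&self, cs_id: ChangesetId) -> Result<Option<ChangesetEntry>>;
}

// Crate `changesets_impl` (implementations): only the factories depend on
// this, keeping SQL and caching machinery out of most dependency graphs.
pub struct SqlChangesets { /* SQL connections, etc. */ }
```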
Reviewed By: krallin
Differential Revision: D27430612
fbshipit-source-id: 6b45fe4ae6b0fa1b95439be5ab491b1675c4b177
Summary:
NOTE: there is one final pre-requisite here, which is that we should default all Mononoke binaries to `--use-mysql-client` because the other SQL client implementations will break once this lands. That said, this is probably the right time to start reviewing.
There's a lot going on here, but Tokio updates being what they are, it has to happen as just one diff (though I did try to minimize churn by modernizing a bunch of stuff in earlier diffs).
Here's a detailed list of what is going on:
- I had to add a number of `cargo_toml_dir`s for binaries in `eden/mononoke/TARGETS`, because we have to use 2 versions of Bytes concurrently at this time, and the two cannot co-exist in the same Cargo workspace.
- Lots of little Tokio changes (see the sketch after this list):
  - Stream abstractions moving to `tokio-stream`
  - `tokio::time::delay_for` became `tokio::time::sleep`
  - `tokio::sync::watch::Sender::broadcast` became `tokio::sync::watch::Sender::send`
  - `tokio::sync::Semaphore::acquire` returns a `Result` now.
  - `tokio::runtime::Runtime::block_on` no longer takes a `&mut self` (just a `&self`).
  - `Notify` grew a few more methods with different semantics. We only use this in tests; I used what seemed logical given the use case.
- Runtime builders have changed quite a bit:
  - My `no_coop` patch is gone in Tokio 1.x, but it has a new `tokio::task::unconstrained` wrapper (also from me), which I included on `MononokeApi::new`.
  - Tokio now detects your logical CPUs, not physical CPUs, so we no longer need to use `num_cpus::get()` to figure it out.
- Tokio 1.x now uses Bytes 1.x:
  - At the edges (i.e. streams returned to Hyper or emitted by RepoClient), we need to return Bytes 1.x. However, internally we still use Bytes 0.5 in some places (notably: Filestore).
  - In LFS, this means we make a copy. We used to do that a while ago anyway (in the other direction) and it was never a meaningful CPU cost, so I think this is fine.
  - In Mononoke Server it doesn't really matter, because that still generates ... Bytes 0.1 anyway, so there was a copy before from 0.1 to 0.5, and now it's from 0.1 to 1.x.
  - In the very few places where we read stuff using Tokio from the outside world (historical import tools for LFS), we copy.
- tokio-tls changed a lot: they removed all the convenience methods around connecting. This resulted in updates to:
  - How we listen in Mononoke Server & LFS
  - How we connect in hgcli.
  - Note: all this stuff has test coverage.
- The child process API changed a little bit. We used to have a ChildWrapper around the hg sync job to make a Tokio 0.2.x child look more like a Tokio 1.x Child, so now we can just remove this.
- Hyper changed their Websocket upgrade mechanism (you now need the whole `Request` to upgrade, whereas before you needed just the `Body`), so I changed up our code a little bit in Mononoke's HTTP acceptor to defer splitting up the `Request` into parts until after we know whether we plan to upgrade it.
- I removed the MySQL tests that didn't use mysql client, because we're leaving that behind and don't intend to support it on Tokio 1.x.
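A small sketch of a few of the mechanical API changes above, with the Tokio 0.2 forms noted in comments (illustrative, not code from the diff):
```
use std::time::Duration;
use tokio::sync::Semaphore;

async fn examples() -> Result<(), Box<dyn std::error::Error>> {
    // Tokio 0.2: tokio::time::delay_for(Duration::from_secs(1)).await;
    tokio::time::sleep(Duration::from_secs(1)).await;

    // Tokio 0.2: Semaphore::acquire was infallible.
    // Tokio 1.x: it returns a Result (Err only if the semaphore is closed).
    let semaphore = Semaphore::new(10);
    let _permit = semaphore.acquire().await?;

    Ok(())
}

fn block_on_example() {
    // Tokio 1.x: Runtime::block_on takes &self rather than &mut self.
    let runtime = tokio::runtime::Runtime::new().unwrap();
    runtime.block_on(async { /* ... */ });
}
```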
Reviewed By: mitrandir77
Differential Revision: D26669620
fbshipit-source-id: acb6aff92e7f70a7a43f32cf758f252f330e60c9
Summary:
Knowing the prepushrebase changeset id is required for retroactive_review.
retroactive_review checks landed commits, but the verify_integrity hook runs on a commit before landing. Because of that, the landed commit has no straightforward connection with the original one, and retroactive_review can't tell whether verify_integrity has seen it.
Reviewed By: krallin
Differential Revision: D27911317
fbshipit-source-id: f7bb0cfbd54fa6ad2ed27fb9d4d67b9f087879f1
Summary:
Split the pushrebase crate into the pushrebase hook definition and the pushrebase implementation.
Before this change it was impossible to store an attribute in BlobRepo that would depend on PushrebaseHook, as it would create a circular dependency `pushrebase -> blobrepo -> pushrebase`.
Reviewed By: krallin
Differential Revision: D27968029
fbshipit-source-id: 030ef1b02216546cd3658de1f417bc52b8dd9843
Summary:
Update the zstd crates.
This also patches the async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 (updating to zstd 1.4.9) can land.
Reviewed By: jsgf, dtolnay
Differential Revision: D27942174
fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
Summary:
`MononokeMegarepoConfig` is going to be a single point of access to the
config storage system, providing both writes and reads. It is also a trait, to
allow for unit-test implementations later.
This diff introduces the trait and implements the write side of the
configerator-based implementor. The read side, the oss impl, and the test impl
are left `unimplemented`; the read side and the test impl will be implemented in the future.
Things I had to consider while implementing this:
- I wanted to store each version of `SyncTargetConfig` in an individual
`.cconf` in configerator
- at the same time, I did not want all of them to live in the same dir, to
avoid having dirs with thousands of files in it
- dir sharding uses sha1 of the target repo + target bookmark + version name,
then separates it into a dir name and a file name, like git does
- this means that these `.cconf` files are not "human-addressable" in the
configerator repo
- to help this, each new config creation also creates an entry in one of the
"index" files: human-readable maps from target + version name to a
corresponding `.cconf`
- using a single index file is also impractical, so these are separated by an
asciification of the repo_id + bookmark name
Note: this design means that there's no automatic way to fetch the list of all
targets in use. This can be bypassed by maintaining an extra index layer, which
will list all the targets. I don't think this is very important atm.
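A sketch of the git-style dir sharding described above (the exact hash input and split point are assumptions; the shape is two-level sharding like git's object store):
```
use sha1::{Digest, Sha1};

// Build the storage path for a config version, sharded like git's objects/
// directory: the first two hex chars become the dir, the rest the file name.
fn config_path(repo_id: i64, bookmark: &str, version: &str) -> String {
    let mut hasher = Sha1::new();
    hasher.update(format!("{}_{}_{}", repo_id, bookmark, version));
    let hex: String = hasher
        .finalize()
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect();
    format!("{}/{}.cconf", &hex[..2], &hex[2..])
}
```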
Reviewed By: StanislavGlebik
Differential Revision: D27795663
fbshipit-source-id: 4d824ee4320c8be5187915b23e9c9d261c198fe1
Summary: Remove gotham and hyper patches, as the referenced PRs have been released.
Reviewed By: krallin
Differential Revision: D27905248
fbshipit-source-id: a2b25562614b71b25536b29bb1657a3f3a5de83c
Summary:
We want to distinguish user vs system errors in `configo_client` and its users
(`mononoke_config` for instance). The reason is to allow `scs_server` to
distinguish the two types of errors. Normally user errors would only ever be
instantiated fairly "shallowly" in the `scs_server` itself, but `configo_client`
is a transactional client (by this I mean that it calls user-provided transformation functions on fetched data), so we need to allow for a user error to originate from these updater functions.
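A minimal sketch of the distinction (a hypothetical error type, not the actual one):
```
use thiserror::Error;

// Tag every error with who is at fault, so scs_server can map user errors
// and system errors to different response/exception types.
#[derive(Debug, Error)]
enum MegarepoError {
    #[error("user error: {0}")]
    User(anyhow::Error),
    #[error("system error: {0}")]
    System(anyhow::Error),
}
```
An updater function handed to `configo_client` can then return the `User` variant from deep inside the transactional flow, and it propagates out to `scs_server` unchanged.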
Reviewed By: StanislavGlebik
Differential Revision: D27880928
fbshipit-source-id: e00a5580a339fdabc4b858235da9cd7e9fc7a376
Summary:
This is a baseline packer, to let us experiment with packing.
It chooses the dictionary that gives us the smallest size, but does nothing intelligent around key handling.
Note that at this point, packing is not always a win - e.g. the alias blobs are 432 bytes in total, but the resulting pack is 1528 bytes.
Reviewed By: ahornby
Differential Revision: D27795486
fbshipit-source-id: 05620aadbf7b680cbfcb0932778f5849eaab8a48
Summary:
This is not used on its own, but in subsequent diffs I will add a use-case in
the megarepo configs crate.
When built in non-fbcode mode, this crate does not export anything. I chose this approach, as opposed to exporting no-op stubs, to force the clients to pay attention and implement gating on their side too. This seems reasonable for a rather generic configo client.
Reviewed By: StanislavGlebik
Differential Revision: D27790753
fbshipit-source-id: d6dcec884ed7aa88abe5796ef0e58be8525893e2
Summary: Now that we're on rustc 1.51, the fork is no longer needed.
Reviewed By: dtolnay
Differential Revision: D27827632
fbshipit-source-id: 131841590d3987d53f5f8afb5ebc205cd36937fb
Summary:
There is a very frustrating operation that happens often when working on the
Mononoke code base:
- You want to add a flag
- You want to consume it in the repo somewhere
Unfortunately, when we need to do this, we end up having to thread it through from a
million places and parse it out in every single main() we have.
This is a mess, and it results in every single Mononoke binary starting with
heaps of useless boilerplate:
```
let matches = app.get_matches();
let (caching, logger, mut runtime) = matches.init_mononoke(fb)?;
let config_store = args::init_config_store(fb, &logger, &matches)?;
let mysql_options = args::parse_mysql_options(&matches);
let blobstore_options = args::parse_blobstore_options(&matches)?;
let readonly_storage = args::parse_readonly_storage(&matches);
```
So, this diff updates us to just use MononokeEnvironment directly in
RepoFactory, which means none of that has to happen: we can now add a flag,
parse it into MononokeEnvironment, and get going.
While we're at it, we can also remove blobstore options and all that jazz from
MononokeApiEnvironment since now it's there in the underlying RepoFactory.
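For contrast, after this diff the intent is that the same setup collapses to something like this (a hypothetical sketch; `init_mononoke_environment` is an illustrative name, not the actual API):
```
let matches = app.get_matches();
let env: MononokeEnvironment = matches.init_mononoke_environment(fb)?;
let factory = RepoFactory::new(env);
// New flags get parsed into MononokeEnvironment once, and RepoFactory
// consumes them - no per-binary threading and parsing required.
```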
Reviewed By: HarveyHunt
Differential Revision: D27767700
fbshipit-source-id: e1e359bf403b4d3d7b36e5f670aa1a7dd4f1d209
Summary:
I'd like to be able to track time series for access within Mononoke. The
underlying use case here is that I want to be able to track the max count of
connections in our SQL connection pools over time (and possibly other things in
the future).
Now, the obvious question is: why am I rolling my own? Well, as it turns out,
there isn't really an implementation of this that I can reuse:
- You might expect to be able to track the max of a value via fb303, but you
can't:
https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/fb303/ExportType.h?commit=0405521ec858e012c0692063209f3e13a2671043&lines=26-29
- You might go look in Folly, but you'll find that the time series there only
supports tracking Sum & Average, but I want my timeseries to track Max (and
in fact I'd like it to be sufficiently flexible to track anything I want):
https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/folly/stats/BucketedTimeSeries.h
It's not the first time I've run into a need for something like this. I needed
it in RendezVous to track connections over the last 2 N millisecond intervals,
and we needed it in metagit for host draining as well (note that the
implementation here is somewhat inspired by the implementation there).
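A cut-down sketch of the idea: a bucketed time series whose buckets fold samples with an arbitrary aggregation (here, max), rather than only sum/average. This is illustrative, not the actual crate; a real implementation would also expire stale buckets when the window wraps.
```
use std::time::{Duration, Instant};

// A ring of time buckets; each bucket folds samples with `fold`, so the same
// structure can track max, min, sum, or anything else.
struct TimeSeries<T, F: Fn(Option<T>, T) -> T> {
    buckets: Vec<Option<T>>,
    bucket_len: Duration,
    start: Instant,
    fold: F,
}

impl<T: Copy, F: Fn(Option<T>, T) -> T> TimeSeries<T, F> {
    fn new(num_buckets: usize, bucket_len: Duration, fold: F) -> Self {
        Self {
            buckets: vec![None; num_buckets],
            bucket_len,
            start: Instant::now(),
            fold,
        }
    }

    // Fold `value` into the bucket that `now` falls in.
    fn add(&mut self, now: Instant, value: T) {
        let elapsed = now.duration_since(self.start).as_millis();
        let idx = (elapsed / self.bucket_len.as_millis()) as usize % self.buckets.len();
        self.buckets[idx] = Some((self.fold)(self.buckets[idx], value));
    }
}

// Tracking the max SQL connection count per one-second bucket:
// let mut ts = TimeSeries::new(60, Duration::from_secs(1),
//     |acc: Option<u64>, v| acc.map_or(v, |a| a.max(v)));
// ts.add(Instant::now(), current_connection_count);
```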
Reviewed By: mzr
Differential Revision: D27678388
fbshipit-source-id: ba6d244b8bb848d4e1a12f9c6f54e3aa729f6c9c
Summary: Update the `curl` and `curl-sys` crates to use a patched version that supports `CURLOPT_SSLCERT_BLOB` and similar config options that allow the use of in-memory TLS credentials. These options were added last year in libcurl version `7.71.0`, but the Rust bindings have not yet been updated to support them. I intend to upstream this patch, but until then, this will allow us to use these options in fbcode.
Reviewed By: quark-zju
Differential Revision: D27633208
fbshipit-source-id: 911e0b8809bc0144ad8b32749e71208bd08458fd
Summary: This has been superseded by `RepoFactory`.
Reviewed By: krallin
Differential Revision: D27400617
fbshipit-source-id: e029df8be6cd2b7f3a5917050520b83bce5630e9
Summary: Use the equivalent function from `repo_factory`.
Reviewed By: krallin
Differential Revision: D27363470
fbshipit-source-id: dce3cf843174caa2f9ef7083409e7935749be4cd
Summary:
Add a factory for building development and production repositories.
This factory can be re-used to build many repositories, and they will share
metadata database factories and blobstores if their configs match.
Similarly, the factory will only load redacted blobs once per metadata
database config, rather than once per repo.
Reviewed By: krallin
Differential Revision: D27323369
fbshipit-source-id: 951f7343af97f5e507b76fb822ad2e66f4d8f3bd
Summary: Reduce the number of manual steps needed to restart a manual scrub by checkpointing its progress to a file.
Differential Revision: D27588450
fbshipit-source-id: cb0eda7d6ff57f3bb18a6669d38f5114ca9196b0
Summary:
Some manual scrub runs can take a long time. Provide progress feedback via logging.
Includes a --quiet option for when progress reporting is not required.
Reviewed By: farnz
Differential Revision: D27588449
fbshipit-source-id: 00840cdf2022358bc10398f08b3bbf3eeec2b299
Summary: Use the test factory for the remaining existing tests.
Reviewed By: StanislavGlebik
Differential Revision: D27169443
fbshipit-source-id: 00d62d7794b66f5d3b053e8079f09f2532d757e7
Summary:
Create a factory that can be used to build repositories in tests.
The test repo factory will be kept in a separate crate to the production repo factory, so that it can depend on a smaller set of dependencies: just those needed for in-memory test repos. This should eventually help make compilation speeds faster for tests.
A notable difference between the test repos produced by this factory and the ones produced by `blobrepo_factory` is that the new repos share the in-memory metadata database. This is closer to what we use in production, and in a couple of places it is relied upon and existing tests must use `dangerous_override` to make it happen.
Reviewed By: ahornby
Differential Revision: D27169441
fbshipit-source-id: 82541a2ae71746f5e3b1a2a8a19c46bf29dd261c
Summary:
We need to bump SCS counters expressing Mononoke's QPS. They will look something like:
`requests:mononoke:oregon:carolina` for requests coming from proxygen in prn to mononoke in frc.
CSLB expects regions' full names.
We're getting the src region from proxygen as a header.
Reviewed By: krallin
Differential Revision: D27082868
fbshipit-source-id: 12accb8a9df5cf6a80c2c281d2f61ac1e68176d1
Summary:
Add `repo_derived_data`. This is a new struct that encapsulates derived data
configuration and lease operations, and will be used in the facet-based
construction of repositories.
Reviewed By: ahornby
Differential Revision: D27169431
fbshipit-source-id: dee7c032deb93db8934736c111ba7238a6aaf935
Summary:
Add `repo_identity`. This is a new struct that encapsulates repository
identity and will be used in the facet-based construction of repositories.
Reviewed By: ahornby
Differential Revision: D27169445
fbshipit-source-id: 02a435bba54a633190c6d2e4316e86726aecfdf0
Summary:
Resolve a circular dependency whereby `BlobRepo` needs to depend on
`Arc<dyn SegmentedChangelog>`, but the segmented changelog implementation
depends on `BlobRepo`, by moving the trait definition to its own crate.
Reviewed By: sfilipco
Differential Revision: D27169423
fbshipit-source-id: 5bf7c632607dc8baba40d7a9d65e96e265d58496
Summary:
This introduces a basic building block for query batching. I called this
rendezvous, since it's about multiple queries meeting up in the same place :)
There are a few (somewhat conflicting) goals this tries to satisfy, so let's go
over them:
1), we'd like to reduce the total number of queries made by batch jobs. For
example, group hg bonsai lookups made by the walker. Those jobs are
characterized by the fact that they have a lot of queries to make, all the
time. Here's an example: https://fburl.com/ods/zuiep7yh.
2), we'd like to reduce the overall number of connections held to MySQL by
our tasks. The main way we achieve this is by reducing the maximum number of
concurrent queries. Indeed, a high total number of queries doesn't necessarily
result in a lot of connections as long as they're not concurrent, because we
can reuse connections. On the other hand, if you dispatch 100 concurrent
queries, that _does_ use 100 connections. This is something that applies to
batch jobs due to their query volume, but also to "interactive" jobs like
Mononoke Server or SCS, just not all the time. Here's an example:
https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but
sometimes spikes substantially).
2.1) It's also worth noting that concurrent queries are often the result of
many clients wanting the same data, so deduplication is also useful here.
3), we also don't want to impact the latency of interactive jobs when they
need to do a little query here or there (i.e. it's largely fine if our jobs all
hold a few connections to MySQL and use them somewhat consistently).
4), we'd like this to make it easier to do batching right. For example, if
you have 100 Bonsais to map to hg, you should be able to just map and call
`future::try_join_all` and have that do the right thing.
5), we don't want "bad" queries to affect other queries negatively. One
example would be the occasional queries we make to Bonsai <-> Hg mapping in
`known` for thousands (if not more) of rows.
6), we want this to be easy to incorporate into the codebase.
So, how do we try to address all of this? Here's how:
- We ... do batching, and we deduplicate requests in a batch. This is the
easier bit and should address #1, #2, #2.1, and #4.
- However, batching is conditional. We notably don't batch very large requests
with the rest (addresses #5). We also don't batch small queries all the time:
we only batch if we are observing a throughput of queries that suggests we
can find some benefit in batching (this targets #3).
- Finally, we have some utilities for common cases like having to group by repo
id (this is `MultiRendezVous`), and this is all configurable via tunables
(and the default is to not do anything).
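A greatly simplified sketch of the dispatch decision described above (the real implementation is driven by tunables and observed query throughput; names here are illustrative):
```
use std::collections::HashSet;

// Core decisions of a rendezvous-style batcher, reduced to their essence.
struct RendezVous {
    max_keys_per_dispatch: usize, // very large requests bypass batching (#5)
    min_qps_for_batching: u64,    // only batch under sustained load (#3)
}

impl RendezVous {
    // Deduplicate keys within a batch (#2.1): concurrent callers asking for
    // the same key end up sharing one query.
    fn dedupe(&self, keys: Vec<u64>) -> HashSet<u64> {
        keys.into_iter().collect()
    }

    // Decide whether a request joins a shared batch or dispatches alone.
    fn should_batch(&self, num_keys: usize, observed_qps: u64) -> bool {
        num_keys <= self.max_keys_per_dispatch && observed_qps >= self.min_qps_for_batching
    }
}
```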
Reviewed By: StanislavGlebik
Differential Revision: D27010317
fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07
Summary:
Our current streaming changelog updater logic is written in python, and it has a
few downsides:
1) It writes directly to manifold, which means it bypasses all the multiplexed
blobstore logic...
2) ...more importantly, we can't write to non-manifold blobstores at all.
3) There are no tests for the streaming changelogs
This diff moves the logic for the initial creation of a streaming changelog entry to
rust, which should fix the issues mentioned above. I want to highlight that
this implementation only works for the initial creation case, i.e. when there are no
entries in the database. Next diffs will add incremental update functionality.
Reviewed By: krallin
Differential Revision: D27008485
fbshipit-source-id: d9583bb1b98e5c4abea11c0a43c42bc673f8ed48
Summary:
BonsaiChangesets are rarely mutated, and their maps are stored in sorted order,
so we can use `SortedVectorMap` to load them more efficiently.
In the cases where mutable maps of filechanges are needed, we can use `BTreeMap`
during the mutation and then convert them to `SortedVectorMap` to store them.
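The pattern, roughly (a sketch assuming `SortedVectorMap` collects from an iterator the way `BTreeMap` does; the real call sites live in the bonsai types):
```
use sorted_vector_map::SortedVectorMap;
use std::collections::BTreeMap;

fn main() {
    // Mutate freely in a BTreeMap while building the changeset...
    let mut file_changes: BTreeMap<String, u64> = BTreeMap::new();
    file_changes.insert("dir/file".to_string(), 1);

    // ...then freeze into a SortedVectorMap for storage: a sorted Vec of
    // pairs deserializes and iterates more cheaply than a node-based tree.
    let frozen: SortedVectorMap<String, u64> = file_changes.into_iter().collect();
    assert_eq!(frozen.get("dir/file"), Some(&1));
}
```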
Reviewed By: mitrandir77
Differential Revision: D25615279
fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4
Summary:
Add a microbenchmark for deriving data with large directories.
This benchmark creates a commit with 100k files in a single directory, and then
derives data for that commit and 10 descendant commits, each of which adds,
modifies, and removes some files.
Reviewed By: ahornby
Differential Revision: D26947361
fbshipit-source-id: 4215f1ac806c53a112217ceb10e50cfad56f4f28
Summary: Rename this benchmark to a specific name so that we can add new benchmarks.
Differential Revision: D26947362
fbshipit-source-id: a1d060ee79781aa0ead51f284517471431418659