Commit Graph

44 Commits

Author SHA1 Message Date
Lukas Piatkowski
3c3de9e954 rust-shed/futures_01_ext: rename futures_ext to futures_01_ext
Summary: As part of the effort to deprecate futures 0.1 in favor of 0.3 I want to create a new futures_ext crate that will contain some of the extensions from futures_01_ext that still apply. But first I need to reclaim this crate name by renaming the old futures_ext crate. This will also make it easier to track which parts of the codebase still use the old futures.

Reviewed By: farnz

Differential Revision: D24725776

fbshipit-source-id: 3574d2a0790f8212f6fad4106655cd41836ff74d
2020-11-05 06:07:16 -08:00
Aida Getoeva
1240231656 mononoke/mysql: share single connection pool in the same SMC between shards
Summary:
In Mononoke we have historically used a connection pool of size 1 per shard for sharded DBs. With the Mysql FFI client this no longer makes sense: the client's connection pool is smart enough and designed to work with sharded DBs, so currently we don't even benefit from having a pool.

In this diff I added an API to create sharded connections: a single pool is shared between all the shards.
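
As an illustration, a minimal sketch of the shape such an API could take - one pool handed to every shard - using hypothetical names (`ShardedConnections`, `MysqlConnectionPool`), not the actual Mononoke interface:

```
use std::sync::Arc;

// Stand-in for the FFI client's smart connection pool.
struct MysqlConnectionPool;

// One pool shared by every shard, instead of a pool of size 1 per shard.
struct ShardedConnections {
    pool: Arc<MysqlConnectionPool>,
    shard_count: usize,
}

impl ShardedConnections {
    fn new(pool: MysqlConnectionPool, shard_count: usize) -> Self {
        Self {
            pool: Arc::new(pool),
            shard_count,
        }
    }

    // Every shard receives a handle to the same underlying pool; the
    // pool itself schedules queries across shards.
    fn connection_for_shard(&self, shard: usize) -> Arc<MysqlConnectionPool> {
        assert!(shard < self.shard_count);
        Arc::clone(&self.pool)
    }
}
```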

Reviewed By: farnz

Differential Revision: D24475317

fbshipit-source-id: b7142c030a10ccfde1d5a44943b38cfa70332c6a
2020-11-05 05:34:22 -08:00
Kostia Balytskyi
2ea25308ab commit_rewriting: use is_empty() where possible
Summary: `clippy` often complains about the use of `.len() != 0`, `.len() > 0` or `.len() == 0` and proposes to use `.is_empty()` instead. This diff does that across Mononoke.
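
For reference, the rewrite that clippy's `len_zero` lint asks for:

```
fn has_work(queue: &[u32]) -> bool {
    // What clippy flags:   queue.len() != 0
    // What this diff uses instead:
    !queue.is_empty()
}
```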

Reviewed By: aslpavel

Differential Revision: D24099427

fbshipit-source-id: 1bba2f958485b7efb3f41bf3eae820879c92b0e5
2020-10-04 10:03:42 -07:00
Thomas Orozco
ab5af4b053 mononoke: add a tunable for ratio of master fallbacks
Summary:
Let's make this configurable so we can control how many fallbacks we want to
allow if we're overloaded.

Reviewed By: farnz

Differential Revision: D24017088

fbshipit-source-id: 9bccaf831a28daff9696950ae8aac9d53e0c51c0
2020-10-01 01:06:28 -07:00
Aida Getoeva
b92c64af7d shed/sql: make queries! macros work with new Rust mysql client
Summary:
The shed/sql library is used mainly to communicate with Mysql DBs and to provide a nice abstraction layer over mysql (which is used in production) and sqlite (integration tests). The library provided an interface that was backed on the Mysql side by raw connections and by MyRouter.
This diff introduces a new backend - new Mysql client for Rust.

The new backend is exposed as a third variant of the current model: sqlite, mysql (raw conn and myrouter) and mysql2 (new client). The main reason for that is that the current shed/sql interface for Mysql
(1) heavily depends on the mysql_async crate, (2) introduces much more complexity than needed for the new client, and (3) seems likely to be refactored and cleaned up later, with the old pieces deprecated.
So, rather than overcomplicate things by trying to implement that interface for the new Mysql client, I simplified things by adding it as a third backend option.
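
Schematically, the three-backend model might look like the following (type and variant names are assumptions, not the real shed/sql definitions):

```
// Stand-ins for the three backends.
struct SqliteConnection;
struct MysqlAsyncConnection; // raw conn / MyRouter, via mysql_async
struct MysqlFfiClient;       // the new Rust Mysql client

// The connection type grows a third variant rather than forcing the
// new client through the mysql_async-shaped interface.
enum Connection {
    Sqlite(SqliteConnection),
    Mysql(MysqlAsyncConnection),
    Mysql2(MysqlFfiClient),
}
```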

Reviewed By: farnz

Differential Revision: D22458189

fbshipit-source-id: 4a484b5201a38cc017023c4086e9f57544de68b8
2020-09-11 06:33:37 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
Stanislau Hlebik
e308419b58 RFC mononoke: limit number of filenodes get_all_filenodes_maybe_stale
Summary:
In a repository with files with large histories we run into a lot of SqlTimeout
errors while fetching file history to serve getpack calls. However, fetching the
whole file history is not really necessary - the client knows how to work with
partial history, i.e. if the client misses some portion of history it will
just fetch it on demand.

This diff adds a way to limit how many entries are fetched; if the limit is
exceeded we return FilenodeRangeResult::TooBig. The downside is that we may
have to do more sequential database queries.
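
A hedged sketch of the shape this describes; the variant names come from the summary, everything else is assumed:

```
struct FilenodeInfo; // stand-in for the real type

enum FilenodeRangeResult {
    // At most `limit` entries existed; here they are.
    Present(Vec<FilenodeInfo>),
    // History exceeds the limit; the client should fetch the rest on demand.
    TooBig,
}

fn apply_limit(fetched: Vec<FilenodeInfo>, limit: usize) -> FilenodeRangeResult {
    if fetched.len() > limit {
        FilenodeRangeResult::TooBig
    } else {
        FilenodeRangeResult::Present(fetched)
    }
}
```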

Reviewed By: krallin

Differential Revision: D23025249

fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
2020-08-12 14:33:43 -07:00
Stanislau Hlebik
bdd494b2ce mononoke: fix filenodes cache key
Summary:
`is_tree` wasn't part of the cache key, which means we could have returned
incorrect history if we had a file and a directory with the same name.

This diff fixes it.
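
The bug class in miniature (field names are illustrative, not the real key):

```
// If `is_tree` is missing from this key, a file and a directory with the
// same path collide in the cache and can serve each other's history.
#[derive(Hash, PartialEq, Eq)]
struct CacheKey {
    repo_id: u32,
    path_hash: u64,
    is_tree: bool, // the field this diff adds to the key
}
```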

Reviewed By: krallin

Differential Revision: D23028527

fbshipit-source-id: 98a3b2028fa62231dfb570a76fb836374ce1eed0
2020-08-10 07:13:35 -07:00
Lukasz Piatkowski
0dd3c4e4bb add Mononoke integration tests CI (#26)
Summary:
This diff adds a minimal workflow for running integrations tests for Mononoke. Currently only one test is run and it fails.

This also splits the regular Mononoke CI into separate files for Linux and Mac to match the current style in the Eden repo.
There are also "scopeguard::defer" fixes here that somehow escaped the CI tests.
Some tweaks have been made to "integration_runner_real.py" to make it runnable outside FB context.
Lastly, the change from "[[ -v ... ]]" to "[[ -n "${...:-}" ]]" in "library.sh" was made because the former is not supported by the default Bash version preinstalled on modern macOS.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/26

Reviewed By: krallin

Differential Revision: D22541344

Pulled By: lukaspiatkowski

fbshipit-source-id: 5023d147823166a8754be852c29b1e7b0e6d9f5f
2020-07-16 12:16:10 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Jeremy Fitzhardinge
35b292ce9d eden: manual dependency fixes
Summary:
Tooling can't handle named_deps yet, but it can warn about them

P133451794

Reviewed By: StanislavGlebik

Differential Revision: D22083499

fbshipit-source-id: 46de533c19b13b2469e912165c1577ddb63d15cd
2020-06-17 17:55:04 -07:00
Jeremy Fitzhardinge
1b4edb5567 eden: remove unused Rust dependencies
Summary:
Remove unused dependencies for Rust targets.

This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.

Manual edits (named_deps) and misc output in P133451794

Reviewed By: dtolnay

Differential Revision: D22083498

fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
2020-06-17 17:55:03 -07:00
Stanislau Hlebik
6f9e685a1a mononoke: add counters for fetching disabled filenodes
Summary: Let's log to ods so that we can check what's happening with filenodes.

Reviewed By: ahornby

Differential Revision: D21904400

fbshipit-source-id: e602dfc338c02252cad286176a1965bdc7043d7f
2020-06-10 19:29:29 -07:00
Stanislau Hlebik
14e4d561b3 mononoke: add_filenodes now return FilenodeResult and removes TODO
Summary:
This diff migrates the add_filenodes method to return FilenodeResult.
That means that all filenodes methods now return FilenodeResult, and it's time
to remove the TODOs from derived_data filenodes.

Note that I had to change the test "derive_disabled_filenodes" a bit.
Previously the FilenodesOnlyPublic::mapping::get() method immediately returned
FilenodesOnlyPublic::Disabled, while now it returns None if the hg changeset is
not derived. This is an expected change in behaviour, so I just updated the test
to derive FilenodesOnlyPublic first, which in turn triggers generation of the hg changeset.

Reviewed By: ahornby

Differential Revision: D21904401

fbshipit-source-id: f6f4cd14e6cdce5a4b95d8f3f9acff305ae6fa88
2020-06-10 19:29:29 -07:00
Stanislau Hlebik
c34cfd9bbf mononoke: make get_filenode() return FilenodeResult
Summary:
Similar to get_all_filenodes_maybe_stale(), make this method return
FilenodeResult if filenodes are disabled.

Note: this diff adds one TODO in fetch_root_filenode, which will be removed
together with other TODOs in the next diff.

Reviewed By: ahornby

Differential Revision: D21904399

fbshipit-source-id: 1569579699c02eb07021f8143aa652aa192d23bc
2020-06-10 19:29:28 -07:00
Stanislau Hlebik
ceed8ea37f mononoke: add must_use to FilenodeResult
Summary: Let's make sure it's always processed
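
A minimal sketch of the effect, assuming the variant names used elsewhere in this log:

```
// Dropping a FilenodeResult without inspecting it now produces a
// compiler warning, so the Disabled case can't be silently ignored.
#[must_use = "filenodes may be disabled; handle the Disabled case"]
enum FilenodeResult<T> {
    Present(T),
    Disabled,
}
```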

Reviewed By: krallin

Differential Revision: D21904402

fbshipit-source-id: da95273d10ecf69ad99cf6fe5e41bb6bb20e8d59
2020-06-10 19:29:19 -07:00
Stanislau Hlebik
4c15790b84 mononoke: return FilenodeResult from get_all_filenodes_maybe_stale
Summary:
Let's return FilenodeResult from get_all_filenodes_maybe_stale and change
callers to deal with that.

The change is straightforward with the exception of `file_history.rs`.
get_all_filenodes_maybe_stale() is used here to prefetch a lot of filenodes in one
go. This diff changes it to return an empty vec in case filenodes are disabled.
Unfortunately this is not a great solution - since the prefetched files are empty,
get_file_history_using_prefetched() falls back to fetching filenodes
sequentially from the blobstore. That might be too slow, and the next diffs in
the stack will address this problem.
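
A sketch of the call-site pattern this describes, with assumed type definitions:

```
struct FilenodeInfo; // stand-in

enum FilenodeResult<T> {
    Present(T),
    Disabled,
}

// An empty prefetch when filenodes are disabled, so
// get_file_history_using_prefetched() falls back to the blobstore.
fn prefetched_or_empty(res: FilenodeResult<Vec<FilenodeInfo>>) -> Vec<FilenodeInfo> {
    match res {
        FilenodeResult::Present(filenodes) => filenodes,
        FilenodeResult::Disabled => Vec::new(),
    }
}
```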

Reviewed By: krallin

Differential Revision: D21881082

fbshipit-source-id: a86dfd48a92182381ab56994f6b0f4b14651ea14
2020-06-10 19:29:16 -07:00
David Tolnay
cf412e0d6b rustfmt: Use use_try_shorthand default
Summary:
I observed that for whatever reason our setting of `use_try_shorthand = true` in rustfmt.toml was causing entire functions to not get processed by rustfmt. Even files that contain neither `try` nor `?`. Remove it and reformat fbsource.

Documentation of that config:

- https://github.com/rust-lang/rustfmt/blob/master/Configurations.md#use_try_shorthand

We don't particularly care about the value anymore because nobody writes `r#try!(...)` in 2018 edition code.

Minimized:

```
fn f() {
    g(
        )
        // ...
        .h
}
```

This function gets formatted only if use_try_shorthand is not set.

The bug is fixed in the rustfmt 2.0 release candidate.

Reviewed By: jsgf

Differential Revision: D21878162

fbshipit-source-id: b028673c0eb703984d24bf0d2983453fc2a8c212
2020-06-10 19:29:15 -07:00
Stanislau Hlebik
ad514c1e2a mononoke: rename filenodes tunables to sql_timeout_knobs
Summary:
We are going to start using tunables in Mononoke in the next diffs, and the
name clash between "tunables" and "newfilenodes::tunables" makes it confusing.
Let's rename newfilenodes::tunables to sql_timeout_knobs

Reviewed By: krallin

Differential Revision: D21879093

fbshipit-source-id: ab0bae4be3c319dcb6afeecdd1c13df395e79e3b
2020-06-04 01:18:18 -07:00
Thomas Orozco
9ac8e0505b mononoke: update various error enums to use #[source]
Summary:
We have a number of error enums that wrap an existing errors, but fail to
register the underlying error as a `#[source]`. This results in truncated
context chains when we print the error. This fixes that. It also removes a
bunch of manual `From` implementation that can be provided by thiserror's
`#[from]`.

This also required updating the `Display` implementation for those errors. I've
opted for not printing the underlying error, since the context chain will
include it. This does mean that if we print one of those errors without the
context chain (i.e. `{}` as opposed to `{:#}` or `{:?}`), then we'll lose a
bit of context. That said, this should be OK, as we really shouldn't ever be
doing this, because we'd be missing the rest of the chain anyway.
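
A minimal example in the spirit of this change, using thiserror's attributes (the enum is illustrative, not one of the actual Mononoke error types):

```
use thiserror::Error;

#[derive(Error, Debug)]
enum ExampleError {
    // #[from] derives the From impl (replacing a manual one) and also
    // registers the inner error as #[source], keeping the context chain.
    #[error("could not reach blobstore")]
    Io(#[from] std::io::Error),

    // #[source] alone, for variants that should not get a From impl.
    // Note the Display message does not print the underlying error.
    #[error("metadata fetch failed")]
    Metadata(#[source] std::io::Error),
}
```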

Reviewed By: StanislavGlebik

Differential Revision: D21399490

fbshipit-source-id: a970a7ef0a9404e51ea3b59d783ceb7bf33f7328
2020-05-05 05:44:52 -07:00
Gabriel Russo
03d4e52ab3 Bump tokio to 0.2.13
Summary:
This is needed because the tonic crate (see the diff stack) relies on tokio ^0.2.13

We can't go to a newer version because a bug that affects mononoke was introduced in 0.2.14 (discussion started on T65261126). The issue was reported upstream: https://github.com/tokio-rs/tokio/issues/2390

This diff simply changed the version number on `fbsource/third-party/rust/Cargo.toml` and ran `fbsource/third-party/rust/reindeer/vendor`.

Also ran `buck run //common/rust/cargo_from_buck:cargo_from_buck` to fix the tokio version in the generated cargo files.

Reviewed By: krallin

Differential Revision: D21043344

fbshipit-source-id: e61797317a581aa87a8a54e9e2ae22655f22fb97
2020-04-15 12:18:00 -07:00
Jeremy Fitzhardinge
28830035dd rust: regenerate autocargo for tokio rollback
Reviewed By: dtolnay

Differential Revision: D20956714

fbshipit-source-id: f13256350cc7082543c7b69231a783b262f8a4d8
2020-04-10 01:12:57 -07:00
Xavier Deguillard
29727102db memcache: don't panic if Memcache fails to initialize
Summary: Simply return an error when that happens.

Reviewed By: dtolnay

Differential Revision: D20808660

fbshipit-source-id: 94ca1c6de5739e4e67f2db6be547ed92c5696e43
2020-04-02 10:07:23 -07:00
Mark Thomas
640f272598 migrate from sql_ext::SqlConstructors to sql_construct
Summary:
Migrate the configuration of sql data managers from the old configuration using `sql_ext::SqlConstructors` to the new configuration using `sql_construct::SqlConstruct`.

In the old configuration, sharded filenodes were included in the configuration of remote databases, even when that made no sense:
```
[storage.db.remote]
db_address = "main_database"
sharded_filenodes = { shard_map = "sharded_database", shard_num = 100 }

[storage.blobstore.multiplexed]
queue_db = { remote = {
    db_address = "queue_database",
    sharded_filenodes = { shard_map = "valid_config_but_meaningless", shard_num = 100 }
} }
```

This change separates out:
* **DatabaseConfig**, which describes a single local or remote connection to a database, used in configuration like the queue database.
* **MetadataDatabaseConfig**, which describes the multiple databases used for repo metadata.

**MetadataDatabaseConfig** is either:
* **Local**, which is a local sqlite database, the same as for **DatabaseConfig**; or
* **Remote**, which contains:
    * `primary`, the database used for main metadata.
    * `filenodes`, the database used for filenodes, which may be sharded or unsharded.

More fields can be added to **RemoteMetadataDatabaseConfig** when we want to add new databases.

New configuration looks like:
```
[storage.metadata.remote]
primary = { db_address = "main_database" }
filenodes = { sharded = { shard_map = "sharded_database", shard_num = 100 } }

[storage.blobstore.multiplexed]
queue_db = { remote = { db_address = "queue_database" } }
```

The `sql_construct` crate facilitates this by providing the following traits:

* **SqlConstruct** defines the basic rules for construction, and allows construction based on a local sqlite database.
* **SqlShardedConstruct** defines the basic rules for construction based on sharded databases.
* **FbSqlConstruct** and **FbShardedSqlConstruct** allow construction based on unsharded and sharded remote databases on Facebook infra.
* **SqlConstructFromDatabaseConfig** allows construction based on the database defined in **DatabaseConfig**.
* **SqlConstructFromMetadataDatabaseConfig** allows construction based on the appropriate database defined in **MetadataDatabaseConfig**.
* **SqlShardableConstructFromMetadataDatabaseConfig** allows construction based on the appropriate shardable databases defined in **MetadataDatabaseConfig**.

Sql database managers should implement the following (a rough sketch follows the list):

* **SqlConstruct** in order to define how to construct an unsharded instance from a single set of `SqlConnections`.
* **SqlShardedConstruct**, if they are shardable, in order to define how to construct a sharded instance.
* If the database is part of the repository metadata database config, either of:
    * **SqlConstructFromMetadataDatabaseConfig** if they are not shardable.  By default they will use the primary metadata database, but this can be overridden by implementing `remote_database_config`.
    * **SqlShardableConstructFromMetadataDatabaseConfig** if they are shardable.  They must implement `remote_database_config` to specify where to get the sharded or unsharded configuration from.
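
A rough sketch of how the construction traits fit together, assuming construction revolves around handing in sets of connections; names are drawn from the summary above, not the crate's verbatim API:

```
// Stand-in for a single set of read/write connections.
struct SqlConnections;

trait SqlConstruct: Sized {
    fn from_sql_connections(connections: SqlConnections) -> Self;
}

trait SqlShardedConstruct: Sized {
    fn from_sharded_connections(shards: Vec<SqlConnections>) -> Self;
}

// A shardable manager implements both: unsharded for sqlite and tests,
// sharded for production filenodes.
struct Filenodes {
    shards: Vec<SqlConnections>,
}

impl SqlConstruct for Filenodes {
    fn from_sql_connections(connections: SqlConnections) -> Self {
        Filenodes { shards: vec![connections] }
    }
}

impl SqlShardedConstruct for Filenodes {
    fn from_sharded_connections(shards: Vec<SqlConnections>) -> Self {
        Filenodes { shards }
    }
}
```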

Reviewed By: StanislavGlebik

Differential Revision: D20734883

fbshipit-source-id: bb2f4cb3806edad2bbd54a47558a164e3190c5d1
2020-04-02 05:27:16 -07:00
Lukas Piatkowski
1bee1993a3 mononoke: make newfilenodes and blobstore/factory OSS buildable
Summary: In the process the blobstore/factory/lib.rs was split into submodules - this way it was easier to untangle the dependencies and refactor it, so I've left the split in this diff.

Reviewed By: markbt

Differential Revision: D20302068

fbshipit-source-id: caa3a2b5487c30198c62f7e4f4e9cb7c488dc8de
2020-03-31 04:02:45 -07:00
Lukas Piatkowski
963f3cc724 mononoke: make blobstore/sqlblob buildable in OSS
Summary:
This shifts the responsibility of mocking all facebook-specific code
to mononoke's sql_ext crate. If OSS code calls into any of that code it will
most likely result in a panic.

Reviewed By: ahornby

Differential Revision: D20247580

fbshipit-source-id: 43f158d91aa32adaa5df6e3786243fb89c9ce961
2020-03-27 08:13:47 -07:00
Stanislau Hlebik
bf866d3a21 mononoke: log how many filenodes were inserted
Summary:
It was (or rather, might have been) useful during debugging of S197766.
Let's now track both a call count (i.e. how often the method was called)
and a count of how many filenodes were inserted.

Reviewed By: krallin

Differential Revision: D20519701

fbshipit-source-id: f19f413171fcbcc300deffbe29baa946ebbe8dce
2020-03-19 01:22:23 -07:00
Lukas Piatkowski
6365fa6509 rust-shed: add no-op memcache_stub implementation to the shed
Reviewed By: mitrandir77

Differential Revision: D20304739

fbshipit-source-id: bd2956619f6a5cf1551bccb921780e7a542e9859
2020-03-10 01:07:46 -07:00
Lukas Piatkowski
7ddcdd818c mononoke: make sql_ext OSS buildable
Summary: separate out the Facebook-specific pieces of the sql_ext crate

Reviewed By: ahornby

Differential Revision: D20218219

fbshipit-source-id: e933c7402b31fcd5c4af78d5e70adafd67e91ecd
2020-03-06 01:33:38 -08:00
David Tolnay
754a755eee rust: Rename tokio_preview:: to tokio::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces all dependencies on `//common/rust/renamed:tokio-preview` with `fbsource//third-party/rust:tokio-preview` and their uses in Rust code from `tokio_preview::` to `tokio::`.

This does not introduce any collisions with `tokio::` meaning 0.1 tokio because D20235404 previously renamed all of those to `tokio_old::` in crates that depend on both 0.1 and 0.2 tokio.

This is the tokio version of what D20213432 did for futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:tokio-preview, 1))" \
| xargs sed -i 's,\btokio_preview::,tokio::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:tokio-preview,fbsource//third-party/rust:tokio-preview,'
```

Reviewed By: k21

Differential Revision: D20236557

fbshipit-source-id: 15068b93a0a944d6249a1d9f63840a4c61c9c1ba
2020-03-05 14:25:10 -08:00
Thomas Orozco
f4f96c1100 mononoke/microwave: create repository snapshots for faster cache warmup
Summary:
This introduces a new binary and library (microwave: it makes warmup
faster..!) that can be used to accelerate cache warmup. The idea is that the
microwave binary will run cache warmup, capture things that are loaded
during it, and commit those to a file.

We can then use that file when starting up a host to get a head start on cache
warmup by injecting all those entries into our local cache before actually
starting cache warmup.

Currently, this only supports filenodes, but that's already a pretty good
improvement. Changesets should be easy to add as well. Blobs might require a
bit more work.
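
Schematically, the capture-and-replay idea might look like this (types and storage format are assumptions; the real binary would serialize the snapshot to a file):

```
use std::collections::HashMap;

// A snapshot of everything cache warmup loaded, keyed by cache key.
#[derive(Default)]
struct WarmupSnapshot {
    filenodes: HashMap<String, Vec<u8>>,
}

impl WarmupSnapshot {
    // Capture side: called for each entry loaded during warmup; the
    // whole snapshot is then committed to a file.
    fn record(&mut self, key: String, value: Vec<u8>) {
        self.filenodes.insert(key, value);
    }

    // Startup side: inject the snapshot into the local cache before the
    // real warmup runs, giving it a head start.
    fn replay_into(&self, cache: &mut HashMap<String, Vec<u8>>) {
        for (key, value) in &self.filenodes {
            cache.insert(key.clone(), value.clone());
        }
    }
}
```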

Reviewed By: StanislavGlebik

Differential Revision: D20219905

fbshipit-source-id: 82bb13ca487f82ca53b4a68a90ac5893895a96e9
2020-03-04 04:02:18 -08:00
Thomas Orozco
7f044a7b2e mononoke/walker: disable filenodes SQL timeouts
Summary:
The walker has been hitting the filenodes-enforced 5 second SQL timeout when
querying filenodes from MySQL.

It's not clear why that is, but looking at previous run history shows that we
occasionally have queries that take > 30 seconds to complete (none of those
show up in MySQL slow queries, though, and there's no particular load on the
hosts around that time, so it's not clear whether this is happening in MySQL or
on our end).

Anyhow, those queries would have worked in the old implementation (after a long
time), but they fail in the new one, since it enforces a 5-second timeout.

We should investigate why this is happening (and Alex has landed diffs to add
more reporting in the walker to that end), but in the meantime, there's no
reason to break the walker.

Reviewed By: farnz

Differential Revision: D20227842

fbshipit-source-id: 5ee5c8225b6474b66c1f48a10b4a2d671ebc79c6
2020-03-04 03:20:26 -08:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
Thomas Orozco
c6957c1f1e mononoke/newfilenodes: use for for_sharded_connection()
Summary: I canaried with this but I forgot to fold it in -_-

Reviewed By: HarveyHunt

Differential Revision: D20158157

fbshipit-source-id: 4a570bbca421d8c3e1e66605f164f2b8e2a433f6
2020-02-28 04:53:03 -08:00
Thomas Orozco
b7dfbdd09d mononoke/newfilenodes: stop using i8 internally for is_tree
Summary: Makes the code a little nicer to work with.

Reviewed By: HarveyHunt

Differential Revision: D20138720

fbshipit-source-id: 19f228782ab3582739e35fddcb2b0bf952110641
2020-02-27 12:34:23 -08:00
Thomas Orozco
ed602e6009 mononoke/newfilenodes: retry on master when paths are missing
Summary:
Paths are in a different replica, so they can be missing even if copy info is
present. Let's fall back to master in this case.

Differential Revision: D20098902

fbshipit-source-id: 838ab1c70a74420c431a2f442f1504c8edd29a2e
2020-02-27 12:34:23 -08:00
Thomas Orozco
4d2932c43b mononoke/newfilenodes: switch to a virtual sharding strategy
Summary:
Locking by physical shard worked earlier in this stack as indicated in the
benchmarks, but after Ondemand restored their fetching for www, it proved
insufficient in terms of parallelism, and resulted in substantially slower
gettreepacks.

Besides, with the "physical sharding" approach, we found ourselves between a rock and a hard place in terms of what to do with paths:

- We could keep holding the semaphore for a filenode while fetching paths. This is undesirable because it further limits our level of concurrency (because fetching a filenode + paths is going to be at least 2x as slow as fetching a filenode).
- We could fetch them without holding a lease at all. This is even more undesirable, because it means that when we release the semaphore for a given shard, we haven't filled the cache yet. This means that if we have a queue of 2 requests for the same bit of data, we're going to fetch twice (task A acquires the lock, goes to MySQL for the filenode, releases the lock and starts going to paths, at which point task B acquires the lock and goes to MySQL again since the filenode hasn't been filled yet).

To fix this, I had to add a dedicated cache for paths, and put it behind semaphores as well. In the example above, this would ensure task B finds a "partial filenode" in the cache and doesn't go to MySQL (instead, it goes straight to queuing for access to paths, where it will wait behind task A and also won't hit MySQL).

There are a few problems with this:

- It's a lot of extra complexity (because we need to handle half misses where we have the filenode but not the path).
- It ties our level of concurrency, for a second time, to the number of underlying physical shards, which is kinda meaningless when some of this data can be provided by Memcache to begin with.

This diff fixes both problems.

The root cause of our problem is that we're tying our level of concurrency to physical
MySQL shards, whereas what we actually want is a tunable level of concurrency
that matches our workload, yet effectively deduplicates queries.

In this diff, I'm updating our exclusive locking to be purely virtual. This
means that we're still not over-fetching, but we are no longer constrained by
the parallelism of the underlying DB (this does mean we might queue up requests
there, but they won't be duplicate requests).

This also results in simpler code, and opens up the way for further
improvements in the future, such as using Memcache lease-get operations to
further deduplicate calls, if we'd like.

As part of that, I've also updated our remote_cache to use the same CacheKey
entity as the local cache, to avoid spending time producing new keys when we
have perfectly good ones available.
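
A sketch of the virtual sharding idea, assuming tokio semaphores with one permit per virtual shard (names illustrative):

```
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;
use tokio::sync::Semaphore;

// A tunable number of "virtual shards": semaphores keyed on the cache
// key, decoupled from the number of physical MySQL shards.
struct VirtualShards {
    locks: Vec<Arc<Semaphore>>,
}

impl VirtualShards {
    fn new(virtual_shards: usize) -> Self {
        // One permit each: identical keys queue behind the same
        // semaphore and are deduplicated by the first holder's cache fill.
        let locks = (0..virtual_shards)
            .map(|_| Arc::new(Semaphore::new(1)))
            .collect();
        Self { locks }
    }

    fn lock_for<K: Hash>(&self, key: &K) -> Arc<Semaphore> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        Arc::clone(&self.locks[(hasher.finish() as usize) % self.locks.len()])
    }
}
```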

Reviewed By: StanislavGlebik

Differential Revision: D20097821

fbshipit-source-id: 03d7be9082982fc1c6ef365d541c1ed8ae3e6e8d
2020-02-27 12:34:23 -08:00
Thomas Orozco
b4e8201d4c mononoke/newfilenodes: track perf counters appropriately
Summary: Let's record perf counters properly.

Reviewed By: StanislavGlebik

Differential Revision: D20097823

fbshipit-source-id: 0daed281d3c080fcbe7b4fac996fb265bdd6d408
2020-02-27 12:34:22 -08:00
Thomas Orozco
500baffb5c mononoke/newfilenodes: add tests for cache fill behavior
Summary:
This adds a test for our cache fill behavior, which is to fill the remote cache
if we miss in local cache. I hadn't added this later and it's a little easier
to add now that the refactor for FilenodeInfo is through.

Reviewed By: ahornby

Differential Revision: D19905396

fbshipit-source-id: 88b5fd83f5d2213e91efc3c5dfb91dfe4e395136
2020-02-27 12:34:22 -08:00
Thomas Orozco
95d463ce47 mononoke/filenodes: Remove path from FilenodeInfo
Summary:
This updates our filenodes implementation to use different types for writing
(`PreparedFilenode`) and reading `(FilenodeInfo`).

The bottom line is that this avoids a bunch of cloning of paths on the read
path, which doesn't need to return the path to the caller, since the caller
already knows it! We can also take it out of Memcache, since we don't need
Memcache to tell us the path for a blob we could only possibly have found by
having the path to begin with.

This does update our filenodes serialization format. I bumped MC_CODEVER
accordingly.
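
Schematically (field names are assumptions based on this summary):

```
struct HgFileNodeId([u8; 20]);
struct RepoPath(String);

// Write side: the path travels with the filenode being inserted.
struct PreparedFilenode {
    path: RepoPath,
    info: FilenodeInfo,
}

// Read side: no path field at all, so there is nothing to clone when
// returning results to a caller that already knows the path.
struct FilenodeInfo {
    filenode: HgFileNodeId,
    p1: Option<HgFileNodeId>,
    p2: Option<HgFileNodeId>,
    copyfrom: Option<(RepoPath, HgFileNodeId)>,
}
```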

Reviewed By: StanislavGlebik

Differential Revision: D19905400

fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a
2020-02-27 12:34:21 -08:00
Thomas Orozco
a039745642 mononoke/newfilenodes: introduce timeouts talking to Memcache, MySQL
Summary:
Since we have one connection per shard, it's a good idea to make sure we don't
keep those locked for too long. This diff adds generous timeouts to protect
against this, as well as ODS reporting to track errors.
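
A minimal sketch of such a timeout wrapper, assuming a recent tokio; the 5-second value matches the filenodes timeout mentioned elsewhere in this log:

```
use std::future::Future;
use std::time::Duration;
use tokio::time::timeout;

// Cap any single query so a stuck connection can't hold its shard's
// semaphore indefinitely; the real code would also bump ODS counters.
async fn with_sql_timeout<T, F>(fut: F) -> Result<T, &'static str>
where
    F: Future<Output = T>,
{
    match timeout(Duration::from_secs(5), fut).await {
        Ok(value) => Ok(value),
        Err(_elapsed) => Err("SQL query timed out after 5s"),
    }
}
```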

Reviewed By: StanislavGlebik

Differential Revision: D19905393

fbshipit-source-id: ee4f4d3e33cf48a9002b016e31d37a401c6578f2
2020-02-27 12:34:20 -08:00
Thomas Orozco
c31b7d9ef9 mononoke/newfilenodes: introduce remote caching
Summary:
This introduces caching of filenodes to Memcache as in the old filenodes
implementation. The code was mostly ported over from the existing filenodes
implementation and converted to async / await. However, one key difference is
that the lookups happen once we hold the semaphore to talk to the underlying
MySQL shard.

The reason for this is:

- Reads to Memcache are really fast. They're often under 1ms. If you're going
  to miss in Memcache and have to go to SQL, it won't make you much slower.
- Reads to Memcache are kinda expensive CPU-wise. Data in Memcache is
  compressed, and we often see a lot of our CPU cycles spent talking to Memcache
  when we're under load.
- Memcache isn't an infinite resource. If we're reading the exact same
  key a hundred times, that's going to hit the same Memcache box. A bit of
  deduplication on our end is a nice thing to strive for. Besides, our own
  thread pool we use to talk to Memcache is limited in size.

From a performance perspective, this doesn't make things any slower, but
reduces CPU usage when we'd otherwise have a lot of duplicate fetching.

Finally, note that this update also includes support for dirty-tracking in our
local cache. We use this to know if we should fill the remote cache (if we 100%
hit in local cache, we don't fill the remote cache).
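
The fill policy described in the last paragraph, sketched with illustrative names:

```
// Dirty tracking: only fill Memcache when the local cache missed and we
// had to fetch the value ourselves.
enum LocalLookup<T> {
    Hit(T), // served entirely from local cache: skip the Memcache fill
    Miss,   // "dirty": whatever we fetch should be written back
}

async fn get_filenode(key: u64) -> Vec<u8> {
    match local_get(key) {
        LocalLookup::Hit(value) => value,
        LocalLookup::Miss => {
            // In the real flow this runs while holding the shard
            // semaphore, so queued duplicates coalesce on the fill.
            let value = match memcache_get(key).await {
                Some(value) => value,
                None => sql_get(key).await,
            };
            local_fill(key, &value);
            memcache_fill(key, &value).await; // fill because we were dirty
            value
        }
    }
}

// Stubs so the sketch stands alone.
fn local_get(_key: u64) -> LocalLookup<Vec<u8>> { LocalLookup::Miss }
fn local_fill(_key: u64, _value: &[u8]) {}
async fn memcache_get(_key: u64) -> Option<Vec<u8>> { None }
async fn memcache_fill(_key: u64, _value: &[u8]) {}
async fn sql_get(_key: u64) -> Vec<u8> { Vec::new() }
```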

Reviewed By: StanislavGlebik

Differential Revision: D19905390

fbshipit-source-id: 363f638bb24cf488c7cd3a8ecea43e93f8391d3f
2020-02-27 12:34:19 -08:00
Thomas Orozco
1c94a586f0 mononoke/newfilenodes: introduce local caching
Summary:
This is the meat of the change I'm trying to make here. This updates
newfilenodes to check their cache before dispatching queries to MySQL once they
acquire the connection.

Since we only get one connection per shard, this ensures that we don't query
several times for the same piece of data.

Note that the caching structure is a little different from the old one, which
cached entire filenode info. Instead, this now caches the exact data we'd get
out of MySQL, since we want to map MySQL queries 1-1 to cache lookups.

With this change, we also now have a local cache for file history queries.
Historically, we hadn't cached those at all, but with this change, we can get a
lot of value out of caching them even for a small period of time, in order to
de-amplify reads to MySQL and Memcache.

However, they are in separate cache pools to make sure they don't evict point
filenodes, which we use for gettreepack (and have a good hit rate, unlike
history blocks, which have a pretty poor hit rate).

Note that having those semaphored connections might feel a little scary, but
it's worth noting that the exact same bottleneck is implicitly present in the
existing filenodes implementation, since we can only have one active query to
any given shard at any given time. That said, this approach also gives us a little
more future flexibility, if we'd like, since we could map multiple semaphores to
"sub shards" that map N-to-1 to real, physical shards.

Reviewed By: HarveyHunt

Differential Revision: D19905391

fbshipit-source-id: 02b5efaa44789e6afcccdeb9ee2b4791f7c3c824
2020-02-27 12:34:19 -08:00
Thomas Orozco
ab4f7adaeb mononoke/newfilenodes: introduce a queue-conscious filenodes implementation
Summary:
This introduces a new implementation of filenodes that maintains its own
queuing on top of the queuing enforced by the SQL crate.

Later in this stack, the goal is for this implementation to avoid dispatching
duplicate queries when there is a lot of contention talking to MySQL, which
happens when large changes land and suddenly everyone wants the updated code.

The underlying goal is to avoid dispatching a lot of duplicate queries when
there is contention. Indeed, if there is contention, then the latency between
query and response increases. As a result, without visibility in the queue, the
following can happen:

- Task 1 looks for A in the cache. It misses
- Task 1 dispatches a SQL query
- Task 2 looks for A in the cache. It misses
- Task 2 dispatches a SQL query
- Task 3 looks for A in the cache. It misses
- Task 3 dispatches a SQL query
- ...
- Task 1's SQL query finally executes and fills the cache.
- All other queries execute anyway.

The longer the dispatch queue, the longer it takes to run those queries.
Looking at Mononoke's stats in prod, this happens pretty often:
https://pxl.cl/10xxmo (the spike at 3pm was a 10K-files change in fbsource, for
example).

The goal of this stack is to avoid this effect, by checking the cache only once
we know we're ready to go to SQL.
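
The ordering change at the heart of this stack, sketched with a tokio semaphore standing in for the queue (names illustrative):

```
use std::sync::Arc;
use tokio::sync::Semaphore;

async fn fetch(shard: Arc<Semaphore>, key: u64) -> Vec<u8> {
    // The naive version checks the cache *here*, before queuing: every
    // task arriving while the first query is in flight misses, then
    // dispatches its own duplicate SQL query.
    let _permit = shard.acquire().await.expect("semaphore closed");
    // Queue-conscious version: check the cache only once we're at the
    // front of the queue, so we see the previous holder's fill.
    if let Some(value) = cache_get(key) {
        return value;
    }
    let value = sql_query(key).await;
    cache_fill(key, &value);
    value
}

// Stubs so the sketch stands alone.
fn cache_get(_key: u64) -> Option<Vec<u8>> { None }
fn cache_fill(_key: u64, _value: &[u8]) {}
async fn sql_query(_key: u64) -> Vec<u8> { Vec::new() }
```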

In this particular diff, what's added is:

- The SQL read and write implementation. This is all implemented using new
  futures, but the logic should be largely unchanged from before (i.e. we store
  filenodes and their associated copy info in shards keyed by the filenode's path —
  not the source path if there is copy info — and paths in their own shard).
  The queries themselves are largely unchanged from the existing filenodes, with
  only a few tweaks:
  - Filenodes and copy info are now selected in one go.
  - There are types to distinguish path hashes and paths.
- The structs to support this implementation.

Reviewed By: StanislavGlebik

Differential Revision: D19905397

fbshipit-source-id: bec981e7bfb396d62eb06e5ce249c21555afc64b
2020-02-27 12:34:19 -08:00