sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 16:57:49 +03:00

Author	SHA1	Message	Date
David Tolnay	e988a88be9	rust: Rename futures_preview:: to futures:: Summary: Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/ This codemod replaces all dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`. This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on both 0.1 and 0.3 futures. Codemod performed by: ``` rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_|\b)rust(_|\b)' \ | sed 's,TARGETS$,:,' \ | xargs \ -x \ buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \ | xargs sed -i 's,\bfutures_preview::,futures::,' rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_|\b)rust(_|\b)' \ | xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,' ``` Reviewed By: k21 Differential Revision: D20213432 fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6	2020-03-03 11:01:20 -08:00
Thomas Orozco	c6957c1f1e	mononoke/newfilenodes: use for for_sharded_connection() Summary: I canaried with this but I forgot to fold it in -_- Reviewed By: HarveyHunt Differential Revision: D20158157 fbshipit-source-id: 4a570bbca421d8c3e1e66605f164f2b8e2a433f6	2020-02-28 04:53:03 -08:00
Thomas Orozco	b7dfbdd09d	mononoke/newfilenodes: stop using i8 internally for is_tree Summary: Makes the code a little nicer to work with. Reviewed By: HarveyHunt Differential Revision: D20138720 fbshipit-source-id: 19f228782ab3582739e35fddcb2b0bf952110641	2020-02-27 12:34:23 -08:00
Thomas Orozco	ed602e6009	mononoke/newfilenodes: retry on master when paths are missing Summary: Paths are in a different replica, so they can be missing even if copy info is present. Let's fall back to master in this case. Differential Revision: D20098902 fbshipit-source-id: 838ab1c70a74420c431a2f442f1504c8edd29a2e	2020-02-27 12:34:23 -08:00
Thomas Orozco	4d2932c43b	mononoke/newfilenodes: switch to a virtual sharding strategy Summary: Locking by physical shard worked earlier in this stack as indicated in the benchmarks, but after Ondemand restored their fetching for www, it proved insufficient in terms of parallelism, and resulted in substantially slower gettreepacks. Besides, with the "physical sharding" approach, we found ourselves between a rock and a hard place in terms of what to do with paths: - We could keep holding the semaphore for a filenode while fetching paths. This is undesirable because it further limits our level of concurrency (because fetching a filenode + paths is going to be at least 2x as slow as fetching a filenode). - We could fetch them without holding a lease at all. This is even more undesirable, because it means that when we release the semaphore for a given shard, we haven't filled the cache yet. This means that if we have a queue of 2 requests for the same bit of data, we're going to fetch twice (task A acquires the lock, goes to MySQL for the filenode, releases the lock and starts going to paths, at which point task B acquires the lock and goes to MySQL again since the filenode hasn't been filled yet). To fix this, I had to add a dedicated cache for paths, and put it behind semaphores as well. In the example above, this would ensure task B finds a "partial filenode" in the cache and doesn't go to MySQL (instead, it goes straight to queuing for access to paths, where it will wait behind task A and also won't hit MySQL). There are a few problems with this: - It's a lot of extra complexity (because we need to handle half misses where we have the filenode but not the path). - It ties together our level of concurrency a second time to that of the underlying number of physical shards, which is kinda meaningless when some of this data can be provided by Memcache to begin with. This diff fixes both problems.
The root cause of our problem is that we're tying our level of concurrency to physical MySQL shards, whereas what we actually want is a tunable level of concurrency that matches our workload, yet effectively deduplicates queries. In this diff, I'm updating our exclusive locking to be purely virtual. This means that we're still not over-fetching, but we are no longer constrained by the parallelism of the underlying DB (this does mean we might queue up requests there, but they won't be duplicate requests). This also results in simpler code, and opens up the way for further improvements in the future, such as using Memcache lease-get operations to further deduplicate calls, if we'd like. As part of that, I've also updated our remote_cache to use the same CacheKey entity as the local cache, to avoid spending time producing new keys when we have perfectly good ones available. Reviewed By: StanislavGlebik Differential Revision: D20097821 fbshipit-source-id: 03d7be9082982fc1c6ef365d541c1ed8ae3e6e8d	2020-02-27 12:34:23 -08:00
Thomas Orozco	b4e8201d4c	mononoke/newfilenodes: track perf counters appropriately Summary: Let's record perf counters properly. Reviewed By: StanislavGlebik Differential Revision: D20097823 fbshipit-source-id: 0daed281d3c080fcbe7b4fac996fb265bdd6d408	2020-02-27 12:34:22 -08:00
Thomas Orozco	500baffb5c	mononoke/newfilenodes: add tests for cache fill behavior Summary: This adds a test for our cache fill behavior, which is to fill the remote cache if we miss in local cache. I hadn't added this earlier and it's a little easier to add now that the refactor for FilenodeInfo is through. Reviewed By: ahornby Differential Revision: D19905396 fbshipit-source-id: 88b5fd83f5d2213e91efc3c5dfb91dfe4e395136	2020-02-27 12:34:22 -08:00
Thomas Orozco	95d463ce47	mononoke/filenodes: Remove path from FilenodeInfo Summary: This updates our filenodes implementation to use different types for writing (`PreparedFilenode`) and reading (`FilenodeInfo`). The bottom line is that this avoids a bunch of cloning of paths on the read path, which doesn't need to return the path to the caller, since the caller already knows it! We can also take it out of Memcache, since we don't need Memcache to tell us the path for a blob we could only possibly have found by having the path to begin with. This does update our filenodes serialization format. I bumped MC_CODEVER accordingly. Reviewed By: StanislavGlebik Differential Revision: D19905400 fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a	2020-02-27 12:34:21 -08:00
Thomas Orozco	a039745642	mononoke/newfilenodes: introduce timeouts talking to Memcache, MySQL Summary: Since we have one connection per shard, it's a good idea to make sure we don't keep those locked for too long. This diff adds generous timeouts to protect against this, as well as ODS reporting to track errors. Reviewed By: StanislavGlebik Differential Revision: D19905393 fbshipit-source-id: ee4f4d3e33cf48a9002b016e31d37a401c6578f2	2020-02-27 12:34:20 -08:00
Thomas Orozco	c31b7d9ef9	mononoke/newfilenodes: introduce remote caching Summary: This introduces caching of filenodes to Memcache as in the old filenodes implementation. The code was mostly ported over from the existing filenodes implementation, and converted to async / await. However, one key difference is that the lookups happen once we hold the semaphore to talk to the underlying MySQL shard. The reason for this is: - Reads to Memcache are really fast. They're often under 1ms. If you're going to miss in Memcache and have to go to SQL, it won't make you much slower. - Reads to Memcache are kinda expensive CPU-wise. Data in Memcache is compressed, and we often see a lot of our CPU cycles spent talking to Memcache when we're under load. - Memcache isn't an infinite resource. If we're reading the exact same key a hundred times, that's going to hit the same Memcache box. A bit of deduplication on our end is a nice thing to strive for. Besides, our own thread pool we use to talk to Memcache is limited in size. From a performance perspective, this doesn't make things any slower, but reduces CPU usage when we'd otherwise have a lot of duplicate fetching. Finally, note that this update also includes support for dirty-tracking in our local cache. We use this to know if we should fill the remote cache (if we 100% hit in local cache, we don't fill the remote cache). Reviewed By: StanislavGlebik Differential Revision: D19905390 fbshipit-source-id: 363f638bb24cf488c7cd3a8ecea43e93f8391d3f	2020-02-27 12:34:19 -08:00
Thomas Orozco	1c94a586f0	mononoke/newfilenodes: introduce local caching Summary: This is the meat of the change I'm trying to make here. This updates newfilenodes to check their cache before dispatching queries to MySQL once they acquire the connection. Since we only get one connection per shard, this ensures that we don't query several times for the same piece of data. Note that the caching structure is a little different from the old one, which cached entire filenode info. Instead, this now caches the exact data we'd get out of MySQL, since we want to map MySQL queries 1-1 to cache lookups. With this change, we also now have a local cache for file history queries. Historically, we hadn't cached those at all, but with this change, we can get a lot of value out of caching them even for a small period of time in order to de-amplify reads to MySQL and Memcache. However, they are in separate cache pools to make sure they don't evict point filenodes, which we use for gettreepack (and have a good hit rate, unlike history blocks, which have a pretty poor hit rate). Note that having those semaphored connections might feel a little scary, but it's worth noting that the exact same bottleneck is implicitly present in the existing filenodes implementation, since we can only have one active query to any given shard at a given time. That said, this approach also gives us a little more future flexibility, if we'd like, since we could map multiple semaphores to "sub shards" that map N-to-1 to real, physical shards. Reviewed By: HarveyHunt Differential Revision: D19905391 fbshipit-source-id: 02b5efaa44789e6afcccdeb9ee2b4791f7c3c824	2020-02-27 12:34:19 -08:00
Thomas Orozco	ab4f7adaeb	mononoke/newfilenodes: introduce a queue-conscious filenodes implementation Summary: This introduces a new implementation of filenodes that maintains its own queuing on top of the queuing enforced by the SQL crate. Later in this stack, the goal is for this implementation to avoid dispatching duplicate queries when there is a lot of contention talking to MySQL, which happens when large changes land and suddenly everyone wants the updated code. The underlying goal is to avoid dispatching a lot of duplicate queries when there is contention. Indeed, if there is contention, then the latency between query and response increases. As a result, without visibility in the queue, the following can happen: - Task 1 looks for A in the cache. It misses - Task 1 dispatches a SQL query - Task 2 looks for A in the cache. It misses - Task 2 dispatches a SQL query - Task 3 looks for A in the cache. It misses - Task 3 dispatches a SQL query - ... - Task 1's SQL query finally executes and fills the cache. - All other queries execute anyway. The longer the dispatch queue, the longer it takes to run those queries. Looking at Mononoke's stats in prod, this happens pretty often: https://pxl.cl/10xxmo (the spike at 3pm was a 10K-files change in fbsource, for example). The goal of this stack is to avoid this effect, by checking the cache only once we know we're ready to go to SQL. In this particular diff, what's added is: - The SQL read and write implementation. This is all implemented using new futures, but the logic should be largely unchanged from before (i.e. we store filenodes and their associated copy info in shards by the filenode's path, not the source path if there is copy info, and paths in their own shard). The queries themselves are largely unchanged from the existing filenodes, with only a few tweaks: - Filenodes and copy info are now selected in one go. - There are types to distinguish path hashes and paths. - The structs to support this implementation.
Reviewed By: StanislavGlebik Differential Revision: D19905397 fbshipit-source-id: bec981e7bfb396d62eb06e5ce249c21555afc64b	2020-02-27 12:34:19 -08:00
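
The newfilenodes commits above describe one recurring pattern: queue first, look in the cache only once the lock is held, and choose the lock by key ("virtual" shard) rather than by physical MySQL shard. What follows is only a rough sketch of that idea and not the Mononoke code. `DedupReader`, `virtual_shards`, `slow_backend_fetch` and `backend_queries` are hypothetical names, and it uses blocking `std` threads and mutexes where the commits describe async semaphores.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical reader: one shared cache plus a tunable number of
/// "virtual shard" locks, unrelated to any physical database shard count.
struct DedupReader {
    cache: Mutex<HashMap<String, String>>,
    virtual_shards: Vec<Mutex<()>>,
    backend_queries: AtomicUsize,
}

impl DedupReader {
    /// `virtual_shard_count` must be greater than zero.
    fn new(virtual_shard_count: usize) -> Self {
        DedupReader {
            cache: Mutex::new(HashMap::new()),
            virtual_shards: (0..virtual_shard_count).map(|_| Mutex::new(())).collect(),
            backend_queries: AtomicUsize::new(0),
        }
    }

    /// Map a key to a virtual shard by hashing it, so identical keys
    /// always queue behind the same lock.
    fn shard_for(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.virtual_shards.len()
    }

    /// Stand-in for a slow SQL query; counts how often it is called.
    fn slow_backend_fetch(&self, key: &str) -> String {
        self.backend_queries.fetch_add(1, Ordering::SeqCst);
        thread::sleep(Duration::from_millis(20));
        format!("value-for-{}", key)
    }

    fn get(&self, key: &str) -> String {
        // Queue first: wait for this key's virtual shard.
        let _permit = self.virtual_shards[self.shard_for(key)].lock().unwrap();

        // Check the cache only now that we are ready to go to the backend.
        // Whoever was ahead of us in the queue may already have filled it.
        let cached = self.cache.lock().unwrap().get(key).cloned();
        if let Some(hit) = cached {
            return hit;
        }

        // Miss: fetch and fill the cache before releasing the permit, so
        // the next task queued for this key finds the value.
        let value = self.slow_backend_fetch(key);
        self.cache.lock().unwrap().insert(key.to_string(), value.clone());
        value
    }
}
```

In this sketch, several threads sharing an `Arc<DedupReader>` and calling `get` with the same key serialize on one virtual shard, so only the first caller reaches `slow_backend_fetch`. Keys that hash to different virtual shards proceed in parallel, and the shard count can be tuned independently of the backing store.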


62 Commits