sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 01:07:15 +03:00

Author	SHA1	Message	Date
Arun Kulshreshtha	fa999d9de1	mononoke_api: add HgTreeContext Summary: Add an 'HgTreeContext' struct to the 'hg' module to allow querying for tree data in Mercurial-specific formats. This initial implementation's primary purpose is to enable getting the content of tree nodes in a format that can be written directly to Mercurial's storage. Reviewed By: krallin Differential Revision: D20159958 fbshipit-source-id: d229aee4d6c7d9ef45297c18de6e393d2a2dc83f	2020-03-03 15:11:03 -08:00
David Tolnay	e988a88be9	rust: Rename futures_preview:: to futures:: Summary: Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/ This codemod replaces all dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`. This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on both 0.1 and 0.3 futures. Codemod performed by: ``` rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_\|\b)rust(_\|\b)' \ \| sed 's,TARGETS$,:,' \ \| xargs \ -x \ buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \ \| xargs sed -i 's,\bfutures_preview::,futures::,' rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_\|\b)rust(_\|\b)' \ \| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,' ``` Reviewed By: k21 Differential Revision: D20213432 fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6	2020-03-03 11:01:20 -08:00
Stanislau Hlebik	b90a3e842a	common/rust: add fbinit::compat_test Summary: While we are transitioning from tokio 0.1 to tokio 0.2 we might need to use [tokio_compat](https://docs.rs/tokio-compat/0.1.4/tokio_compat/) crate. Let's add a helper macro similar to fbinit::test that uses tokio_compat runtime. Reviewed By: farnz Differential Revision: D20213814 fbshipit-source-id: 18976e953011c8ada1fa915686e2dcb76ea288d5	2020-03-03 10:18:02 -08:00
Thomas Orozco	83cd9eec54	mononoke/apiserver: run streams on a Tokio 0.2 runtime Summary: Well, we don't have a Tokio Compat runtime in Actix. This means Tokio 0.2 code (e.g. Tokio 0.2 timers) blows up when executed in the API Server. How do we fix this? By not running Mononoke code on Actix's runtime, and instead running it on a Mononoke runtime we instantiated. How do we do that? By passing a Tokio Compat Executor all the way down to the place where Actix is about to consume our stream ... and at that point, we spawn the stream on our runtime, and give Actix a dumb receiver that does work when polled on a Tokio 0.1 runtime. This feels like the end of the road for the API Server. Nothing about this is even remotely sane, but it should take us through the API Server's eventual demise and replacement with the Gotham-based EdenAPI Server, which runs on the runtime of our choice (i.e. Tokio 0.2). Reviewed By: farnz Differential Revision: D20222294 fbshipit-source-id: 1646e35fe05b131b030e4962c8a7f68f72995035	2020-03-03 10:18:02 -08:00
Doug Neal	1e088c0af2	mononoke: lfs_server: add optional client identities to ratelimit config Summary: * Added intermediate (de)serializers for config types, so that we generate full Identity objects at config load time * Implement FromStr for Identity * Compare configured identities to presented identities in ratelimit middleware in order to decide whether or not to apply the limit Reviewed By: krallin Differential Revision: D20139308 fbshipit-source-id: 340c300db549575eb6d06efcbe437c0b1db4927b	2020-03-03 09:33:03 -08:00
Stanislau Hlebik	a70ccf6f04	mononoke: make it clearer which repo is accessed in permission error Summary: Usually we have only one repo, but in case of xrepo_commit_lookup we actually have two. It's nice to know which permission failed Reviewed By: krallin Differential Revision: D20221509 fbshipit-source-id: ee98845767e72f99027ba18a8c5b374cb6f9f3ab	2020-03-03 07:22:50 -08:00
Alex Hornby	464ffc40eb	mononoke: pushrebase: fix casefolding_check usage during changeset creation Summary: Honor the repo casefolding_check setting as tested by test-pushrebase-allow-casefolding.t Reviewed By: StanislavGlebik Differential Revision: D20192411 fbshipit-source-id: 8da72049417015b1f284c115a53b13c26ce3c3f6	2020-03-03 03:57:32 -08:00
Alex Hornby	5491f049a4	mononoke: walker: publish per-node-type stats Summary: Publish per-node-type progress stats so we can correlate storage access/load to the type of node traversed. Reviewed By: farnz Differential Revision: D20181064 fbshipit-source-id: c741b526c50e86a3eee105fab57fd7bc3ecc063b	2020-03-03 03:47:57 -08:00
Alex Hornby	37da3ebd2b	mononoke: pushrebase: add tests for casefolding Summary: Add tests for existing default block casefolding_check behaviour, plus test demonstrating problem with casefolding_check=false Reviewed By: farnz Differential Revision: D20192412 fbshipit-source-id: 1aea0fc5581e0c44388a4224ca693698731d3cd5	2020-03-03 02:44:06 -08:00
David Tolnay	fe65402e46	rust: Move futures-old rdeps to renamed futures-old Summary: In targets that depend on both 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::. rs changes performed by: ``` rg \ --files-with-matches \ --type-add buck:TARGETS \ --type buck \ --glob '!/experimental' \ --regexp '(_\|\b)rust(_\|\b)' \ \| sed 's,TARGETS$,:,' \ \| xargs \ -x \ buck query "labels(srcs, rdeps(%Ss, fbsource//third-party/rust:futures-old, 1) intersect rdeps(%Ss, //common/rust/renamed:futures-preview, 1) )" \ \| xargs sed -i 's/\bfutures::/futures_old::/' ``` Reviewed By: jsgf Differential Revision: D20168958 fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0	2020-03-02 21:02:50 -08:00
Stanislau Hlebik	25c57e445c	mononoke: add create_warmer() function Summary: Small cleanup that removes a bunch of duplicate code. That should make it easier to add other types of derived data to the warmer Reviewed By: krallin Differential Revision: D20193169 fbshipit-source-id: 437fe7981d8a71164dc9edfcc423e8c41cbe0967	2020-03-02 10:08:09 -08:00
Arun Kulshreshtha	bd4a623ccb	mononoke_api: Add HgFileContext::new_check_exists Summary: Add a `new_check_exists` method to `HgFileContext` to allow looking up potentially nonexistent filenodes. Reviewed By: xavierd Differential Revision: D20159085 fbshipit-source-id: f6047f7a25f59594823672373d8b35adb49586e1	2020-03-02 09:41:21 -08:00
Arun Kulshreshtha	8ec76a0bce	mononoke_api: add hg module Summary: Add a new `hg` module to the `mononoke_api` crate that provides a `HgRepoContext` type, which can be used to query the repo for data in Mercurial-specific formats. This will be used in the EdenAPI server. Initially, the `HgRepoContext`'s functionality is limited to just getting the content of individual files. It will be expanded to support querying more things in later diffs. Reviewed By: markbt Differential Revision: D20117038 fbshipit-source-id: 23dd0c727b9e3d80bd6dc873804e41c7772f3146	2020-03-02 09:41:20 -08:00
Thomas Orozco	0dadca26e7	mononoke/gotham_ext: make MononokeHttpHandler middleware async & allow preemption Summary: This updates our middleware stack and introduces two new pieces of functionality: - Middleware can now be async. - Middleware can now preempt requests and dispatch a response. The underlying motivation for this is to allow implementing Mononoke LFS's rate limiting middleware in our existing middleware stack. Reviewed By: kulshrax Differential Revision: D20191213 fbshipit-source-id: fc1df7a14eb0bbefd965e32c1fca5557124076b5	2020-03-02 09:28:08 -08:00
Arun Kulshreshtha	615d8392bc	mononoke_api: update doc comments on file content methods Summary: D20121350 changed the methods for accessing file content on `FileContext` to no longer return `Stream`s. We should update the comments accordingly. Reviewed By: ahornby Differential Revision: D20160128 fbshipit-source-id: f5bfd7e31bc7e6db63f56b8f4fc238893aa09a90	2020-03-02 09:21:08 -08:00
Thomas Orozco	2d04773c23	mononoke/hg_sync_job: update Globalrevs in hgsql Summary: This updates the hg_sync_job to update Globalrevs in hgsql before attempting to sync bundles. This means that if we're syncing successfully, hg is in sync with Mononoke, and if we fail (which should be very uncommon to begin with!), hg might skip a little bit ahead, but that's OK. This only makes sense when generating bundles — when doing pushrebase, hg would be updating its own globalrevs. Reviewed By: StanislavGlebik Differential Revision: D20159262 fbshipit-source-id: 6736f8592682da1001c7c9c4c9444462b71913c2	2020-03-02 08:24:16 -08:00
Stanislau Hlebik	638e637ef6	RFC: mononoke: introduce unodes v2 Summary: Our previous implementation of unodes had a problem with diamond merges: because p1 and p2 might have the same file but with different content, unode derivation will always create a merge unode, which can be unexpected (the code comment in unodes/derive.rs has more info about it). This diff fixes the problem by introducing unodes v2. This allows us to import new repos with the new unode implementation while keeping the old repos on unode v1. This implementation uses a heuristic which should be fast and should do the correct thing most of the time. In some cases it might exclude some parts of the history completely. For example: O <- merge commit, doesn't change anything / \ P1 \| <- modified "file.txt" to "B" \| P2 <- modified "file.txt" to "B" \ / ROOT <- created "file.txt" with content "A" In that case the history of "file.txt" starting from the merge commit will contain only (P1, ROOT), but it won't contain P2. We also considered other options: 1) Move this heuristic to fastlog batch derived data. See D19973553 for more details about why we decided not to do it. 2) Filter out parent unodes that are ancestors of other parent unodes. This should always be correct, but it will be hard to implement, and it will be even harder to make sure it always has good performance. Reviewed By: krallin Differential Revision: D19978157 fbshipit-source-id: 445ddd5629669d987e7aa88c35fecf0b34a40da0	2020-03-02 05:27:31 -08:00
Stanislau Hlebik	d7a4ff29b5	mononoke: log derivations to separate scuba table Summary: I'd like to log all derivations to a single place so that it's easier to understand what was derived and where. Reviewed By: aslpavel Differential Revision: D20140004 fbshipit-source-id: 305ea533031a04ff95995a6fe2a6e57e95a87026	2020-03-02 04:30:12 -08:00
Alex Hornby	63937e3030	mononoke: walker: log the source node when validating Summary: Log the source node when validating so that we can more quickly reproduce any issues in a single step via the --walk-root option, rather than needing to run the entire walk again. Differential Revision: D20098200 fbshipit-source-id: 6b0d7d151c97f25080953d6c0fbf431dc2cec6a8	2020-03-02 02:29:34 -08:00
Stanislau Hlebik	168b74e38c	mononoke: fix logging in bookmarks Reviewed By: ahornby Differential Revision: D20161053 fbshipit-source-id: 7c69bf9421dd9e55bc2ca805c2f14b9c4cd0e669	2020-02-28 13:24:29 -08:00
Stanislau Hlebik	9cf34d97ca	mononoke: asyncify WarmBookmarksCache Reviewed By: ikostia Differential Revision: D20159967 fbshipit-source-id: dab201530416f17da4b4a3be6c4ecc04b2c10950	2020-02-28 13:24:28 -08:00
Thomas Orozco	82027505a0	mononoke/mercurial: add tests for metadata extraction Summary: I noticed in my earlier Bytes 0.5 diff that this doesn't have local test coverage (there might be things somewhere else in the test suite that look for it). Let's add some. Reviewed By: ahornby Differential Revision: D20139437 fbshipit-source-id: c17e4516574d674bb0b009cd1f322008fb3c1a79	2020-02-28 10:54:04 -08:00
Alex Hornby	938830d3f6	mononoke: walker: add ability to track route to node Summary: Add ability to track the route to a node, so that one can report the node from which a failing step started. Reviewed By: ikostia Differential Revision: D20097615 fbshipit-source-id: 4f2c000f54bd212225533e7f3570178020f34a9d	2020-02-28 09:01:35 -08:00
Kostia Balytskyi	cec057adc5	mononoke: add some perf counters for hydrated getbundle responses Summary: In case this starts to cause problems, let's have a way to correlate those problems with some exported metrics. Reviewed By: StanislavGlebik Differential Revision: D20158822 fbshipit-source-id: 6ac9e25861dbedaecdf04fd92bda835ae66535eb	2020-02-28 08:30:43 -08:00
Kostia Balytskyi	7ed52ee31b	mononoke: return hydrated bundles for infinitepush, if config says so Summary: ## Wider goal See D20068839 ## This diff This diff actually implements the conditional hydration of `getbundle` responses, as described in D20068839. Note that as well as implementing support for hydrated `getbundle` responses, this diff also implements support for changegroup v3 and LFS in such responses, which is needed if we are to do this in an LFS-enabled repository. Reviewed By: StanislavGlebik Differential Revision: D20068838 fbshipit-source-id: fbdd3f8f5fb7cd2cb60473a94094553a1d4b4d2f	2020-02-28 08:30:43 -08:00
Alex Hornby	7f09703c4c	mononoke: walker: log per-run session id to scuba for validate Summary: Extend the session id logging to the validate command by adding ability to set the progress reporters scuba builder. Reviewed By: ikostia Differential Revision: D20074153 fbshipit-source-id: ceaeebdb7eb976080061ad3b76b22d7a0f7bd891	2020-02-28 04:57:09 -08:00
Alex Hornby	7baf1066ab	mononoke: walker: fix performance regression in loading file data for compression-benefit Summary: Fix performance regression in loading file data in compression-benefit subcommand Reviewed By: StanislavGlebik Differential Revision: D20142143 fbshipit-source-id: 0b9d93feaddab1df4b9d5777e0637f35aed2feda	2020-02-28 04:57:08 -08:00
Thomas Orozco	c6957c1f1e	mononoke/newfilenodes: use for for_sharded_connection() Summary: I canaried with this but I forgot to fold it in -_- Reviewed By: HarveyHunt Differential Revision: D20158157 fbshipit-source-id: 4a570bbca421d8c3e1e66605f164f2b8e2a433f6	2020-02-28 04:53:03 -08:00
Kostia Balytskyi	d5080d20ce	mononoke: asyncify get_manifest_and_filenodes in getbundle_response Summary: ## Wider goal See D20068839 ## This diff Modernize this particular function Reviewed By: StanislavGlebik Differential Revision: D20097802 fbshipit-source-id: fe76aaf2c0b65cf9b47a1dedc66d417d22cad255	2020-02-28 04:36:38 -08:00
Kostia Balytskyi	7755c4c4e6	mononoke: asyncify prepare_filenode_entries_stream in getbundle_response Summary: ## Wider goal See D20068839 ## This diff Modernize this particular function. Reviewed By: krallin Differential Revision: D20097805 fbshipit-source-id: bbcf371921d3a709cc7178ec50b7729bddf1f630	2020-02-28 02:49:57 -08:00
Thomas Orozco	c680696e40	mononoke: defer hook loading Summary: Most binaries don't need hooks. Let's not require them. This might not be very long lived since Simon is working on removing lua hooks, but this was a trivial fix. Reviewed By: johansglock Differential Revision: D20140026 fbshipit-source-id: cc74b37459f63c5dd550c5779b72aa1d6531202c	2020-02-28 02:03:07 -08:00
Thomas Orozco	515f4a507d	mononoke/cachelob: remove Memcache blob write leases Summary: (this doesn't remove ad-hoc leases, like derived data) Let's see if this has any impact on performance. We no longer fail Manifold writes on conflicts, and Reviewed By: StanislavGlebik Differential Revision: D20038572 fbshipit-source-id: 4a972ff09ceb65e69a1d22a643a8f2d9b2ab1b17	2020-02-28 01:59:36 -08:00
David Tolnay	37a8401761	rust/thrift: Un-rename futures-preview dependency Summary: The Thrift generated code depends only on futures 0.3, not 0.1. Thus it isn't necessary to depend on renamed:futures-preview and we can depend on futures-preview directly, which is exposed to Rust code as `futures::`. Reviewed By: jsgf Differential Revision: D20145921 fbshipit-source-id: 5cae94ec6747a374c2bf05f124ab237c798de005	2020-02-27 22:27:58 -08:00
David Tolnay	d8bd00ce36	rust/thrift: Drop unused dependencies on old futures in various places Summary: The last uses of futures 0.1 were removed in D18411564 and D18392252. A later diff will switch thrift from using renamed:futures-preview to plain futures-preview to prepare for eliminating the -preview suffix. Reviewed By: jsgf Differential Revision: D20143832 fbshipit-source-id: b7fd79f18368ade59eeba6ed0ac09613000c046b	2020-02-27 22:24:10 -08:00
Jeff Zhang	c517e81329	Push `compat` down deeper into subcommands & make subcommand functions `async` in eden/mononoke/cmds/admin/main.rs Summary: Continue to push `compat()` deeper into subcommands. This enables us to refactor each file one at a time and ultimately remove the old futures from our code base. Reviewed By: farnz Differential Revision: D20132126 fbshipit-source-id: cc10dde6eda7ddcbf911dbe8d3ebe1713f8ec2ab	2020-02-27 12:39:28 -08:00
Thomas Orozco	b7dfbdd09d	mononoke/newfilenodes: stop using i8 internally for is_tree Summary: Makes the code a little nicer to work with. Reviewed By: HarveyHunt Differential Revision: D20138720 fbshipit-source-id: 19f228782ab3582739e35fddcb2b0bf952110641	2020-02-27 12:34:23 -08:00
Thomas Orozco	ed602e6009	mononoke/newfilenodes: retry on master when paths are missing Summary: Paths are in a different replica, so they can be missing even if copy info is present. Let's fall back to master in this case. Differential Revision: D20098902 fbshipit-source-id: 838ab1c70a74420c431a2f442f1504c8edd29a2e	2020-02-27 12:34:23 -08:00
Thomas Orozco	4d2932c43b	mononoke/newfilenodes: switch to a virtual sharding strategy Summary: Locking by physical shard worked earlier in this stack as indicated in the benchmarks, but after Ondemand restored their fetching for www, it proved insufficient in terms of parallelism, and resulted in substantially slower gettreepacks. Besides, with the "physical sharding" approach, we found ourselves between a rock and a hard place in terms of what to do with paths: - We could keep holding the semaphore for a filenode while fetching paths. This is undesirable because it further limits our level of concurrency (because fetching a filenode + paths is going to be at least 2x as slow as fetching a filenode). - We could fetch them without holding a lease at all. This is even more undesirable, because it means that when we release the semaphore for a given shard, we haven't filled the cache yet. This means that if we have a queue of 2 requests for the same bit of data, we're going to fetch twice (task A acquires the lock, goes to MySQL for the filenode, releases the lock and starts going to paths, at which point task B acquires the lock and goes to MySQL again since the filenode hasn't been filled yet). To fix this, I had to add a dedicated cache for paths, and put it behind semaphores as well. In the example above, this would ensure task B finds a "partial filenode" in the cache and doesn't go to MySQL (instead, it goes straight up to queuing for access to paths, where it will wait behind task A and also won't hit MySQL). There are a few problems with this: - It's a lot of extra complexity (because we need to handle half misses where we have the filenode but not the path). - It ties together our level of concurrency a second time to that of the underlying number of physical shards, which is kinda meaningless when some of this data can be provided by Memcache to begin with. This diff fixes both problems. The root cause of our problem is that we're tying our level of concurrency to physical MySQL shards, whereas what we actually want is a tunable level of concurrency that matches our workload, yet effectively deduplicates queries. In this diff, I'm updating our exclusive locking to be purely virtual. This means that we're still not over-fetching, but we are no longer constrained by the parallelism of the underlying DB (this does mean we might queue up requests there, but they won't be duplicate requests). This also results in simpler code, and opens up the way for further improvements in the future, such as using Memcache lease-get operations to further deduplicate calls, if we'd like. As part of that, I've also updated our remote_cache to use the same CacheKey entity as the local cache, to avoid spending time producing new keys when we have perfectly good ones available. Reviewed By: StanislavGlebik Differential Revision: D20097821 fbshipit-source-id: 03d7be9082982fc1c6ef365d541c1ed8ae3e6e8d	2020-02-27 12:34:23 -08:00
Thomas Orozco	b4e8201d4c	mononoke/newfilenodes: track perf counters appropriately Summary: Let's record perf counters properly. Reviewed By: StanislavGlebik Differential Revision: D20097823 fbshipit-source-id: 0daed281d3c080fcbe7b4fac996fb265bdd6d408	2020-02-27 12:34:22 -08:00
Thomas Orozco	500baffb5c	mononoke/newfilenodes: add tests for cache fill behavior Summary: This adds a test for our cache fill behavior, which is to fill the remote cache if we miss in local cache. I hadn't added this later and it's a little easier to add now that the refactor for FilenodeInfo is through. Reviewed By: ahornby Differential Revision: D19905396 fbshipit-source-id: 88b5fd83f5d2213e91efc3c5dfb91dfe4e395136	2020-02-27 12:34:22 -08:00
Thomas Orozco	95d463ce47	mononoke/filenodes: Remove path from FilenodeInfo Summary: This updates our filenodes implementation to use different types for writing (`PreparedFilenode`) and reading `(FilenodeInfo`). The bottom line is that this avoids a bunch of cloning of paths on the read path, which doesn't need to return the path to the caller, since the caller already knows it! We can also take it out of Memcache, since we don't need Memcache to tell us the path for a blob we could only possibly have found by having the path to begin with. This does update our filenodes serialization format. I bumped MC_CODEVER accordingly. Reviewed By: StanislavGlebik Differential Revision: D19905400 fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a	2020-02-27 12:34:21 -08:00
Thomas Orozco	7fa9607859	mononoke/newfilenodes: remove sqlfilenodes Summary: This removes the old sqlfilenodes implementation, since we're now using the new one. There's also a bit of cruft here and there we can get rid of. Reviewed By: StanislavGlebik Differential Revision: D19905395 fbshipit-source-id: 2526b6d65eeb981f5aedda9951b44b389ecec29d	2020-02-27 12:34:21 -08:00
Thomas Orozco	149e15f2ad	mononoke: use spawn_future in getpack to fetch history Summary: The former implementation would eagerly query Memcache when fetching history (due to how old futures work) for files in getpack, but the new one does not. This means the new one loses out on a lot of buffering, which the old one used to do. This diff emulates the old behavior by eagerly querying filenodes in getpack, which improves performance on a very big getpack (32K files) by about 3x, and makes it 30% faster than the old code, instead of > 2x slower. Note that I'm not certain we really want to do this kind of aggressive buffering in getpack long term, but for now, I'd like to keep this unchanged. Reviewed By: StanislavGlebik Differential Revision: D19905398 fbshipit-source-id: 49f9a2cd505a98123fd1dabb835e8e378d45c930	2020-02-27 12:34:21 -08:00
Thomas Orozco	f6866eb97d	mononoke: switch to new filenodes implementation Summary: This updates Mononoke to use the new filenodes implementation introduced earlier in this stack. See the test plan for detailed performance results supporting why I'm making this change. Reviewed By: StanislavGlebik Differential Revision: D19905394 fbshipit-source-id: 8370fd30c9cfd075c3527b9220e4cf4f604705ae	2020-02-27 12:34:20 -08:00
Thomas Orozco	a039745642	mononoke/newfilenodes: introduce timeouts talking to Memcache, MySQL Summary: Since we have one connection per shard, it's a good idea to make sure we don't keep those locked for too long. This diffs adds generous timeouts to protect against this, as well as ODS reporting to track errors. Reviewed By: StanislavGlebik Differential Revision: D19905393 fbshipit-source-id: ee4f4d3e33cf48a9002b016e31d37a401c6578f2	2020-02-27 12:34:20 -08:00
Thomas Orozco	c31b7d9ef9	mononoke/newfilenodes: introduce remote caching Summary: This introduces caching of filenodes to Memcache as in the old filenodes implementation. The code was mostly ported over from the existing filenodes implementation, and converted to async / await. However, one key difference is that the lookups happen once we hold the semaphore to talk to the underlying MySQL shard. The reason for this is: - Reads to Memcache are really fast. They're often under 1ms. If you're going to miss in Memcache and have to go to SQL, it won't make you much slower. - Reads to Memcache are kinda expensive CPU-wise. Data in Memcache is compressed, and we often see a lot of our CPU cycles spent talking to Memcache when we're under load. - Memcache isn't an infinite resource. If we're reading the exact same key a hundred times, that's going to hit the same Memcache box. A bit of deduplication on our end is a nice thing to strive for. Besides, our own thread pool we use to talk to Memcache is limited in size. From a performance perspective, this doesn't make things any slower, but reduces CPU usage when we'd otherwise have a lot of duplicate fetching. Finally, note that this update also includes support for dirty-tracking in our local cache. We use this to know if we should fill the remote cache (if we 100% hit in local cache, we don't fill the remote cache). Reviewed By: StanislavGlebik Differential Revision: D19905390 fbshipit-source-id: 363f638bb24cf488c7cd3a8ecea43e93f8391d3f	2020-02-27 12:34:19 -08:00
Thomas Orozco	1c94a586f0	mononoke/newfilenodes: introduce local caching Summary: This is the meat of the change I'm trying to make here. This updates newfilenodes to check their cache before dispatching queries to MySQL once they acquire the connection. Since we only get one connection per shard, this ensures that we don't query several times for the same piece of data. Note that the caching structure is a little different from the old one, which cached entire filenode info. Instead, this now caches the exact data we'd get out of MySQL, since we want to map MySQL queries 1-1 to cache lookups. With this change, we also now have a local cache for file history queries. Historically, we hadn't cached those at all, but with this change, we can get a lot of value out of caching them even for a small period of time in order to de-amplify reads to MySQL and Memcache. However, they are in separate cache pools to make sure they don't evict point filenodes, which we use for gettreepack (and have a good hit rate, unlike history blocks, which have a pretty poor hit rate). Note that having those semaphored connections might feel a little scary, but it's worth noting that the exact same bottleneck is implicitly present in the existing filenodes implementation, since we can only have one active query to any given shard at a given time. That said, this approach also gives us a little more future flexibility, if we'd like, since we could map multiple semaphores to "sub shards" that map N-to-1 to real, physical shards. Reviewed By: HarveyHunt Differential Revision: D19905391 fbshipit-source-id: 02b5efaa44789e6afcccdeb9ee2b4791f7c3c824	2020-02-27 12:34:19 -08:00
Thomas Orozco	ab4f7adaeb	mononoke/newfilenodes: introduce a queue-conscious filenodes implementation Summary: This introduces a new implementation of filenodes that maintains its own queuing on top of the queuing enforced by the SQL crate. Later in this stack, the goal is for this implementation to avoid dispatching duplicate queries when there is a lot of contention talking to MySQL, which happens when large changes land and suddenly everyone wants the updated code. The underlying goal is to avoid dispatching a lot of duplicate queries when there is contention. Indeed, if there is contention, then the latency between query and response increases. As a result, without visibility in the queue, the following can happen: - Task 1 looks for A in the cache. It misses - Task 1 dispatches a SQL query - Task 2 looks for A in the cache. It misses - Task 2 dispatches a SQL query - Task 3 looks for A in the cache. It misses - Task 3 dispatches a SQL query - ... - Task 1's SQL query finally executes and fills the cache. - All other queries execute anyway. The longer the dispatch queue, the longer it takes to run those queries. Looking at Mononoke's stats in prod, this happens pretty often: https://pxl.cl/10xxmo (the spike at 3pm was a 10K-files change in fbsource, for example). The goal of this stack is to avoid this effect, by checking the cache only once we know we're ready to go to SQL. In this particular diff, what's added is: - The SQL read and write implementation. This is all implemented using new futures, but the logic should be largely unchanged from before (i.e. we store filenodes and their associated copy info in shards by the filenode's path, not the source path if there is copy info, and paths in their own shard). The queries themselves are largely unchanged from the existing filenodes, with only a few tweaks: - Filenodes and copy info are now selected in one go. - There are types to distinguish path hashes and paths. - The structs to support this implementation. Reviewed By: StanislavGlebik Differential Revision: D19905397 fbshipit-source-id: bec981e7bfb396d62eb06e5ce249c21555afc64b	2020-02-27 12:34:19 -08:00
Thomas Orozco	341b4f1bc3	mononoke/filenodes: expect a `Vec` of filenodes to insert Summary: The API expects a stream of filenodes to insert, but we actually never used that ability. Instead, every single callsite has a `Vec`, which it converts to a stream and passes in. I'd like to change this for two reasons: - It's unnecessary - It makes the code more complex on the Filenodes implementation side, and less efficient, since we need to `chunk()` there in small chunks, which might not all be in the same shard. If we get the entire `Vec` at once, we can chunk on a per-shard basis (this happens later in this stack). Besides, if we end up having a stream and wanting the old behavior, we can always call `chunk()` on the stream and call `add_filenodes` on each batch (which is actually nicer because if you have a futures 0.2 stream that isn't static, you can do this, but you can't turn it into a `BoxStream`!). Reviewed By: StanislavGlebik Differential Revision: D19902537 fbshipit-source-id: a4c030c4a51afbb6e9db133b32464009eed197af	2020-02-27 12:34:18 -08:00
Stanislau Hlebik	cc8be5997e	mononoke: asyncify derived data Reviewed By: krallin Differential Revision: D20139701 fbshipit-source-id: 7f1c8370707eb415dd7e23d94eb923846f7ef59b	2020-02-27 12:17:54 -08:00
Alex Hornby	e70f3dc76c	mononoke: walker: log per-run session id to scuba for scrub Summary: Log a per-run session id to distinguish runs more easily. This diff adds the session for scrub logging; a following diff extends this to validate/progress logging. So that each tail has a separate session logged, setup is delayed until the start of each tail by passing it in as a function. Differential Revision: D19907398 fbshipit-source-id: 8e5470918112321866c67c9f94e703fd46e6a16b	2020-02-27 09:00:44 -08:00
Thomas Orozco	f1121ccef6	mononoke: add a @nocommit hook Reviewed By: HarveyHunt Differential Revision: D20139540 fbshipit-source-id: 0be6d1aa8ad7ad1197197ec886f0cf44bd6b864d	2020-02-27 08:28:05 -08:00
Thomas Orozco	26ae726af5	mononoke: update internals to Bytes 0.5 Summary: The Bytes 0.5 update left us in a somewhat undesirable position where every access to our blobstore incurs an extra copy whenever we fetch data out of our cache (by turning it from Bytes 0.5 into Bytes 0.4); we also have quite a few places where we convert in one direction then immediately into the other. Internally, we can start using Bytes 0.5 now. For example, this is useful when pulling data out of our blobstore and deserializing as Thrift (or conversely, when serializing and putting it into our blobstore). However, when we interface with Tokio (i.e. decoders & encoders), we still have to use Bytes 0.4. So, when needed, we convert our Bytes 0.5 to 0.4 there. The tradeoff idea is that we deal with more bytes internally than we end up sending to clients, so doing the Bytes conversion closer to the point of sending data to clients means fewer copies. We can also start removing those once we migrate to Tokio 0.2 (and newer versions of Hyper for HTTP services). Changes that were required: - You can't extend new bytes (because that implicitly copies). You need to use BytesMut instead, which I did where that was necessary (I also added calls in the Filestore to do that efficiently). - You can't create bytes from a `&'a [u8]`, unless `'a` is `'static`. You need to use `copy_from_slice` instead. - `slice_to` and `slice_from` have been replaced by a `slice()` function that takes ranges. Reviewed By: StanislavGlebik Differential Revision: D20121350 fbshipit-source-id: eb31af2051fd8c9d31c69b502e2f6f1ce2190cb1	2020-02-27 08:08:28 -08:00
Thomas Orozco	7698cded43	mononoke/hooks: add a signed source hook Reviewed By: HarveyHunt Differential Revision: D20139152 fbshipit-source-id: a0a48d447444cf969162f5f9655ab003e7ca2f76	2020-02-27 08:05:14 -08:00
Mateusz Kwapich	6f9f82767c	add git identifiers to Source Control Service Summary: This allows us to translate git hashes Reviewed By: markbt Differential Revision: D19972870 fbshipit-source-id: 871a4cf94d468d987221cb08fe7b6135050bac93	2020-02-27 08:05:14 -08:00
Mateusz Kwapich	5825db21c6	add the git<->bonsai translation to mononoke_api crate Reviewed By: markbt Differential Revision: D19972871 fbshipit-source-id: 79c0c59f0bd1bd033bf2a8999dbe56b60a7ac085	2020-02-27 08:05:13 -08:00
Mateusz Kwapich	3ff29a8810	make BonsaiGitMapping repo-specific Summary: Nearly all of the Mononoke SQL stores are instantiated once per repo but they don't store the `RepositoryId` anywhere so every method takes it as argument. And because providing the repo_id on every call is not ergonomic we tend to add methods to blob_repo that just call the right method with the right repo_id in one of the underlying stores (see `get_bonsai_from_globalrev` on blobrepo for example). Because my reviewers [pushed back](https://our.intern.facebook.com/intern/diff/D19972871/?transaction_id=196961774880671&dest_fbid=1282141621983439) when I tried to do the same for bonsai_git_mapping I've decided to make it right by adding the repo_id to the BonsaiGitMapping. Reviewed By: krallin Differential Revision: D20029485 fbshipit-source-id: 7585c3bf9cc8fa3cbe59ab1e87938f567c09278a	2020-02-27 08:05:13 -08:00
Kostia Balytskyi	7ee657f124	mononoke: asyncify signatures of two fns in getbundle_response Summary: ## Wider goal See D20068839 ## This diff Asyncifying only signatures allows us to independently work on function bodies, without touching the callsites later in the diff. Reviewed By: StanislavGlebik Differential Revision: D20097804 fbshipit-source-id: f1391a055947c7802f719bc99b9eae71a4ac39cd	2020-02-27 05:01:52 -08:00
Kostia Balytskyi	bd90a843a7	mononoke: asyncify diff_with_parents in getbundle_response Summary: ## Wider goal See D20068839 ## This diff Let's modernize this particular function Reviewed By: StanislavGlebik Differential Revision: D20097800 fbshipit-source-id: a919b5ad1b544a7b784668ca265e24c375100fa3	2020-02-27 05:01:51 -08:00
Kostia Balytskyi	90b03f5a0d	mononoke: call old-style Future OldFuture in getbundle_response Summary: ## Wider goal See D20068839 ## This diff This file contains a mix of old and new-style futures. It even has futures, which have items composed of futures. To be able to convert on one of the levels and not the other, we need to deal with the confusion. Let's have old things have `Old` in the name. Reviewed By: StanislavGlebik Differential Revision: D20097803 fbshipit-source-id: fedb3669ef34a8328ec389a30ff2c512ab363818	2020-02-27 05:01:51 -08:00
Kostia Balytskyi	4f2993c765	mononoke: move bundle generation bits from hg_sync_job into getbundle_response Summary: ## Wider goal We want the flexibility to return hydrated responses for `getbundle` wireproto requests for draft commits. This means that the responses will contain not only the commit data (as they do now), but also trees and files. For context, when an "unhydrated" response is returned for the `getbundle` request for a draft commit, we expect one of two things to happen later in the e2e scenario: - either `hg` client would immediately make another wireproto request (`gettreepack`, `getpackv1`) within the same client `hg` command execution - or a subsequent `hg update` call will cause another wireproto request In any case, another request is needed before the pulled commit can be used. This request can hit a different server, sometimes it can even be Mercurial instead of Mononoke. Specifically, it can be Mercurial instead of Mononoke if the `fallback` path markers are configured incorrectly. In that case we have a problem, as Mercurial is incapable of serving `gettreepack` or `getpackv1` for infinitepush commits. One way to deal with this is to always have correct path markers, which is prone to human mistakes. Another way is to guarantee that Mononoke returns everything in the original `getbundle` request. We don't want to do this for public commits, as `pull`s of public commits typically fetch thousands of those commits and never care about tree or file data for all but one of them. Draft commits are different however, as they are usually exactly what the client intends to use, so hydrating those is fine. Still, we want this behavior to be gated behind a config flag. ## This diff A lot of the needed code is already implemented in the hg-sync job, bundle generating variant. So prior to implementing the actual behavior described above, let's move the relevant bits to `getbundle_response`. Later we can comb them up a bit (asyncify) and use them to implement the needed behavior. Reviewed By: StanislavGlebik Differential Revision: D20068839 fbshipit-source-id: 0ab63d57b2d167401b7ee8864fe7760f5f65f8ec	2020-02-27 05:01:51 -08:00
Kostia Balytskyi	aac7bff59d	mononoke: pull config schema changes from configerator Summary: This is the moral equivalent of D20115877 in fbcode. See that diff for motivation. Reviewed By: StanislavGlebik Differential Revision: D20118575 fbshipit-source-id: 8f77f572068e611003b1344be3434f2d04ec56ca	2020-02-27 05:01:50 -08:00
Stanislau Hlebik	d5d3061168	mononoke: distinguish derived data waits with derived data generation Summary: Previously it was hard to tell whether the process was actually responsible for generating derived data or it was just waiting for it to be generated. Let's make this distinction clearer. Reviewed By: johansglock Differential Revision: D20138284 fbshipit-source-id: 52ae12679db2f61869f048baf2a603b456710a71	2020-02-27 03:15:39 -08:00
David Tolnay	de96589260	autocargo: Strip line comments Summary: These comments end up being a source of churn as we roll out D20125635, and anyway are not particularly meaningful after the transformations performed by autocargo. For example: ``` bytes = { version = "0.4", features = ["serde"] } # todo: remove ``` ^ This doesn't mean the generated Cargo.toml intends to drop its bytes dependency altogether, but just that will be migrated to a different version that is present in the third-party/rust/Cargo.toml but not visible in the generated Cargo.toml. Reviewed By: jsgf Differential Revision: D20128612 fbshipit-source-id: a9e7b29ddc4b26bc47a626dd73bdaa4771ee7b18	2020-02-26 16:31:52 -08:00
Stanislau Hlebik	98f6d5d1a8	mononoke: fix walker filenode walks Summary: Since Mononoke's filenodes were migrated to the derived data framework, the hg_linknode_populated alarm has been firing. The main reason was that there's now a delay between hg changeset being generated and filenodes being generated. This diff fixes it by making sure walker won't visit hg changesets without generated filenodes (note that walker will visit these changesets later, after filenodes are generated). Reviewed By: ahornby Differential Revision: D20067615 fbshipit-source-id: 285e9a3d8c89b85441491c889a8458c86ca0e3a8	2020-02-26 15:21:53 -08:00
Aida Getoeva	585899f419	mononoke/scs: use last change in file history Summary: There is no need to generate expensive file history stream if only one node is requested. I refactored code that generated stream of history commits, so it'd first yield the nodes and only then prefetch their parents. That will help to solve latency problem for the history request for only a single commit. I removed BFS queue and added two state variables: ready nodes and already processed: * The last are the nodes that were returned as a part of a history stream on the last iteration and now can be used to construct next BFS layer: prefetch fastlog batches, fill the commit graph, take parents in BFS order to form new bunch of nodes. * First are used if it's the first iteration - there are no processed nodes yet but there are some that are ready to be returned. I believe removing the queue I simplified the code and logic a little bit. Reviewed By: StanislavGlebik Differential Revision: D19818100 fbshipit-source-id: c30d28c623464ba3552a00e8542552f7655076ef	2020-02-26 08:09:12 -08:00
Alex Hornby	04e011525a	mononoke: walker: test validate scuba logging for non-public commits Summary: add test for scuba logging for non-public commits Reviewed By: StanislavGlebik Differential Revision: D20093721 fbshipit-source-id: eb0792bcae8ea27c11709181390efb0ac0c817ee	2020-02-26 06:16:29 -08:00
Stanislau Hlebik	7076fac933	mononoke: add exponential backoff Summary: During our tests we noticed that we can send too many blobstore read requests to the mapping. Let's add exponential backoff to prevent that Reviewed By: ikostia Differential Revision: D20116043 fbshipit-source-id: 6fecbda4c36a5065b77ba9df561c6d9c6a969089	2020-02-26 05:05:33 -08:00
Thomas Orozco	4ca1333b8a	mononoke/hooks: use a smaller test group for faster tests Reviewed By: ikostia Differential Revision: D20115985 fbshipit-source-id: 4f69fc84eee352bcc689918527c6d460fcf672ba	2020-02-26 04:44:39 -08:00
Thomas Orozco	c14a88bbef	mononoke: convert places that talk to Memcache to Bytes 0.5 Summary: Memcache doesn't care (because both old and new Bytes do `Into<IOBuf>`), but Thrift is Bytes 0.5. We have our caching ext layer in the middle, which wants Bytes 0.4. This means we end up copying things we don't need to copy. Let's update to fewer copies. I didn't update apiserver, because a) it's going away, and b) those bytes go into Actix, and Actix isn't upgrading to Bytes 0.5 any time soon! Besides, this doesn't actually need updating besides tests anyway. Reviewed By: dtolnay Differential Revision: D20006062 fbshipit-source-id: 42766363a0ff8494f18349bcc822b5238e1ec0cd	2020-02-26 03:30:47 -08:00
Jeff Zhang	33140b117c	Push `compat` down one level in eden/mononoke/cmds/admin/main.rs Summary: Moving `compat` one level down to the call sites of subcommand functions. Reviewed By: farnz Differential Revision: D20085398 fbshipit-source-id: 461e147d2ae6e560b3a75fb92fa6b23f9f54d13e	2020-02-25 10:22:03 -08:00
Stanislau Hlebik	19e1e94984	mononoke: add lease renewing to derived data Summary: During S196197 lease expired and we were rederiving the same derived data over and over again for a big commit. this diff adds lease renewal that should help with this problem. Reviewed By: HarveyHunt Differential Revision: D20093323 fbshipit-source-id: d139abf6659722f47ea40d9b2f279daa03623ff4	2020-02-25 09:22:46 -08:00
Stanislau Hlebik	4bd758289b	mononoke: async/await derive_may_panic() function Reviewed By: HarveyHunt Differential Revision: D20092945 fbshipit-source-id: 70ec1a8e5b9c99f3853a13bebe3657ece5ff9e9e	2020-02-25 09:22:46 -08:00
Stanislau Hlebik	3418318883	mononoke: do not generate hgchangesets unnecessarily in FilenodesOnlyPublicMapping Summary: fetch_root_filenode is called by FilenodesOnlyPublicMapping to figure out if filenodes were already derived. Previously it first derived hg changeset and then looked up the root manifest in db. However if hg changeset is not derived then filenodes couldn't possibly be derived either and we can return an answer faster. This is useful in the next diff where I change walker Reviewed By: ahornby Differential Revision: D20068819 fbshipit-source-id: 17f066c437e0b1f7bbeb8f6e247eadc9afe94f90	2020-02-25 08:07:07 -08:00
Thomas Orozco	f8fcbc9723	mononoke/blobstore_healer: wait for MyRouter properly Summary: The blobstore_healer has never waited for MyRouter before querying for slave status, but it ended up implicitly working because creating a blobstore required a SQL factory, and creating a SQL factory would result in waiting for MyRouter. Now that creating a blobstore doesn't require SQL factory unless you're going to actually use it (which the healer isn't: it doesn't use a multiplexblob, it uses the underlying blobstores instead), we no longer wait properly for MyRouter, so if MyRouter isn't there when we boot, we crash. This fixes that. Reviewed By: ahornby Differential Revision: D20094829 fbshipit-source-id: 82b7e8d893a01049d1f434ee8dff36a877a0d2f4	2020-02-25 07:03:28 -08:00
Alex Hornby	693e8dee0a	mononoke: walker: add support for loading by GitSha1 Aliases Summary: Add support for loading by GitSha1 Aliases. This relies on the change to Alias::GitSha1 earlier in stack. Reviewed By: ikostia Differential Revision: D19903577 fbshipit-source-id: 73cdccc04af61fa524c3683851d8af9ae90d31dc	2020-02-25 03:36:06 -08:00
Thomas Orozco	2a12e2beb6	mononoke/derived_data: log when we start deriving Summary: This should give us a slightly better idea of what hosts are doing to troubleshoot duplicate derivation. Also, let's make the logging a bit less confusing. Reviewed By: StanislavGlebik Differential Revision: D20070619 fbshipit-source-id: 91cc264b7043b8fc8c21c007832fba328ef0017d	2020-02-24 12:03:41 -08:00
Thomas Orozco	b3bebee0b4	mononoke: include DB config in multiplexed blobstore configuration Summary: This updates our multiplexed blobstore configuration to carry its own DB config. The upshot of this change is that we can move the blobstore sync queue (a fairly unruly table) to its own DB. Another nice side effect of this is that it cleans up a bunch of other code, by finally decoupling the blobstore config from the DB config. For example, places that need to instantiate a blobstore can now do so even without a DB config (such as wireproto logging). Obviously, this cannot land until we update the configs to include this. I'll do so in Configerator prior to landing the diff. Reviewed By: HarveyHunt Differential Revision: D19973905 fbshipit-source-id: 79e4ff92cdb989aab4532decd3fe4fd6c55e2bb2	2020-02-24 11:54:45 -08:00
Thomas Orozco	b7185f0f13	mononoke/metaconfig: tidy up blobstore creation Summary: I'd like to refactor our multiplex blob to store its DB using a different shard. In preparation of doing so, let's: - Extract parsing DB configs from storage configs - Tidy up some related places that take a reference when they actually need ownership (which is sort of wasteful). Reviewed By: StanislavGlebik Differential Revision: D19973906 fbshipit-source-id: 82baceb892e9e24e5fd0349ffa5503884c177a7a	2020-02-24 11:54:44 -08:00
Xavier Deguillard	401d44916b	add lfs_protocol to autocargo Summary: Now it no longer depends on mononoke_types, we can build it with cargo Reviewed By: krallin Differential Revision: D20070438 fbshipit-source-id: 1b2f9cc3640c58fd38e962c7c738d08cbb22a71d	2020-02-24 11:12:45 -08:00
Xavier Deguillard	934b64397b	convert to bytes 0.5 Summary: The bytes 0.5 is a dependency of newer tokio, it's also newer, and thus better. Staying on 0.4 means that copies between Bytes 0.4 and 0.5 need to be done, this will be especially bad in the LFS code since 10+MB buffer will have to be copied... One main API change is for the configparser. The code used to take Into<Bytes> for the keys, I switched it to AsRef<[u8]>. For hg_memcache_client, an extra copy is performed to build a Delta, since this code uses an old tokio, and is being replaced right now, the effort of switching to a new tokio and new bytes was not deemed worth it, the copy will do for now. Reviewed By: dtolnay Differential Revision: D20043137 fbshipit-source-id: 395bfc3749a3b1bdfea652262019ac6a086e61e0	2020-02-24 10:28:46 -08:00
Lukas Piatkowski	4aea99df4e	mononoke/blobstore: remove rocksdb blobstore and replace its usages with sqliteblob Summary: This is the second (and last) step on removing RocksDB as a blobstore. Check the task for more description. Context for OSS: > The issue with rocksblob (and to some extent sqlite) is that unless we > introduce a blobstore tier/thrift api (which is something I'm hoping to avoid > for xdb blobstore) we'd have to combine all the mononoke function like hg, > scs, LFS etc into one binary for it to have access to rocksdb, which would be > quite a big difference to how we deploy internally (Note: this ignores all push blocking failures!) Reviewed By: farnz Differential Revision: D20001261 fbshipit-source-id: c4b2b2a393b918d17680ad483aa1d77356f1d07c	2020-02-24 05:23:07 -08:00
Lukas Piatkowski	278ac5e1f9	mononoke: make mononoke_types OSS-buildable Summary: (Note: this ignores all push blocking failures!) Reviewed By: farnz Differential Revision: D19948740 fbshipit-source-id: 9d0cfc4ccbcb3c08bb969f23229ed3096470fa86	2020-02-24 05:23:07 -08:00
Alex Hornby	87112798b7	mononoke: walker: add option to start from non-bookmarks Summary: Add option to start the roots of the walk from any graph node, rather than just bookmarks. This is useful when reproducing issues loading a key, validating a changeset/filenode etc, or to get consistent results on things like sizing where specifying root by bookmark would result in changes between runs. Reviewed By: farnz Differential Revision: D19886707 fbshipit-source-id: b7361cbec894aba08b6f702ff0731b9b201224d3	2020-02-24 03:49:23 -08:00
Mark Thomas	70ffdc7293	add export Summary: Add `scsc export`. Analogous to `svn export`, this exports the contents of a directory within a commit to files on disk, without a local checkout. Reviewed By: mitrandir77 Differential Revision: D20006307 fbshipit-source-id: 5870712172cd8a030e85dbff75273c28ab0c332c	2020-02-24 03:00:22 -08:00
Thomas Orozco	5b07c8285e	mononoke: test-mononoke-admin.t: fixup replication lag match Summary: It's not always 0! (sometimes it's 1) Reviewed By: farnz Differential Revision: D20065610 fbshipit-source-id: b546befbf824713811fd7c011bbf4c246d3c696d	2020-02-24 02:57:18 -08:00
Stanislau Hlebik	ec76ba93c6	mononoke: convert some fastlog functions to async/await Reviewed By: farnz Differential Revision: D20059447 fbshipit-source-id: fa4a70b238ebc85ad5e589b06ee8a1ca6c0ea509	2020-02-24 00:53:56 -08:00
Xavier Deguillard	33020829b1	lfs_protocol: remove dependency on mononoke_types Summary: Mercurial wishes to use this crate, but pulling in mononoke_types brings way too many dependencies. Since the only reason mononoke_types is brought in is for the Sha256 type, let's just hardcode it to [u8; 32]. Reviewed By: krallin Differential Revision: D20003596 fbshipit-source-id: 53434143c61cd1a1275027200e1149040d30beae	2020-02-21 12:26:19 -08:00
Harvey Hunt	0ecac65ac4	mononoke: Remove restrict_users hook Summary: This hook was implemented to prevent incorrect users from moving a bookmark. However, it doesn't work and the functionality is now implemented by `is_allowed_user` in the pushrebase pipeline. Remove the unused hook. Reviewed By: johansglock Differential Revision: D20030479 fbshipit-source-id: bcbc9508eebe77cffbc7936382ba4d345b76f74f	2020-02-21 09:46:38 -08:00
Thomas Orozco	8086dc29c7	mononoke: add a limit_commit_message_length hook Summary: We're working towards sharding Bonsais. Let's make them easier to cache by also not allowing arbitrarily large commit messages. Reviewed By: StanislavGlebik Differential Revision: D20002994 fbshipit-source-id: b2319ac9d5709e968121d4299396e03a90df4a06	2020-02-21 07:18:15 -08:00
Mateusz Kwapich	42bfba7c99	add git mappings import option Summary: Let's import the info about corresponding git commits on blobimport whenever possible. Reviewed By: ikostia Differential Revision: D19877929 fbshipit-source-id: ba03d5de8ae8a9bd80084a8e858cd05e8f621193	2020-02-21 05:41:46 -08:00
Mateusz Kwapich	6111067524	add git mapping pushrebase hook Summary: Let's populate the bonsai<->git mapping on pushrebase of the commits that are coming from git. By this being a pushrebase hook we can have the accurate mappings available as soon as the bonsai commit is available. Corresponding configerator change: D19951607 Reviewed By: krallin Differential Revision: D19949472 fbshipit-source-id: b957cbcdd0f14450ceb090539814952db9872576	2020-02-21 05:41:45 -08:00
Mateusz Kwapich	38f7a24364	add a way to update git mappings inside SQL transaction Summary: During the pushrebase hook phase we'll need to reuse existing transaction. Reviewed By: krallin Differential Revision: D19949473 fbshipit-source-id: 7c53308724bec6df6d40933405f703c86be15a7a	2020-02-21 05:41:45 -08:00
Mateusz Kwapich	c2be00c45e	add git mappings to blobrepo Summary: By having it in blobrepo we can ensure that all parts of mononoke can access it easily Reviewed By: StanislavGlebik Differential Revision: D19949474 fbshipit-source-id: ac3831d61177c4ef0ad7db248f2a0cc5edb933b1	2020-02-21 05:41:44 -08:00
Mateusz Kwapich	5a53415bcb	add git mapping crate Summary: We need a table to store git<->bonsai mappings and a crate that would abstract operations on it: * it's going to be useful immediately to store git hashes for configerator commits and doing the hash translations via SCS. * it's going to be useful further down the line for real git support. NOTE: I'm explicitly using the name `SHA1` all over the place to minimize the confusion if we'll ever want to support other hashing schemes for git commits. (Git Community is working on SHA256 support nowadays). The corresponding AOSC diff: D19835975 Reviewed By: krallin Differential Revision: D19835974 fbshipit-source-id: 113640f4db9681b060892a8cedd93092799ab732	2020-02-21 05:41:44 -08:00
Mark Thomas	a9490441b2	add blame --parent Summary: Add the `--parent` flag to `scsc blame`. This runs blame against the first parent of the specified commit, rather than the commit itself. This allows users to copy and paste commit hashes from previous blame output in order to skip the commit, rather than having to look up the parent commit hash themselves. Reviewed By: StanislavGlebik Differential Revision: D20006308 fbshipit-source-id: d1c25aad8f236fe27e467e29f6a96c957b6c8c8f	2020-02-20 13:03:54 -08:00
Thomas Orozco	4a29fe400d	mononoke/blobstore_healer: migrate replication lag polling to async / await Summary: The former implementation here was a little difficult to work with, and resulted in a whole lot of cloning of closures, etc. This updates the implementation to be a little simpler on the whole (async / await is nicer for while loops, since you can use, well, loops) It does slightly change a few parts of the behavior: - The old implementation would wait for the replication lag duration. That's not really correct. As we've observed several times this week, replication lag usually drops quickly once it starts dropping. I.e. if the replication lag is 10 seconds, it doesn't take 10 seconds to catch up. This gets more important with big lag durations. - I updated replication lag to be u64 instead of usize. usize doesn't really make sense for something that has absolutely nothing to do with our pointer size. I also split out the logic for calculating how long we wait in a part that cares about whether we are busy and one that cares about replication lag (whereas the older one kinda mixed the two together). We wait for our own throttling (i.e. sleep for a sec if we didn't do anything) before we wait for replication lag, so the new behavior should have the desired behavior of: - If we don't have much work to do, we sleep 1 second between each iteration (but if we do have work, we don't). - No matter what, if we have replication lag, we wait until that passes before doing any work. The old one did that too, but it mixed the two calculations together, and was (at least in my opinion) kinda hard to reason about as a result. Reviewed By: StanislavGlebik Differential Revision: D19997587 fbshipit-source-id: 1de6a9f9c1ecb56e26c304d32b907103b47b4728	2020-02-20 12:26:51 -08:00
Thomas Orozco	be5d7343ce	mononoke/blobstore_healer: check for replication lag _before_ starting work Summary: We had crashloops on this (which I'm fixing earlier in this stack), which resulted in overloading our queue as we tried to repeatedly clear out 100K entries at a time, rebooted, and tried again. We can fix the root cause that caused us to die, but we should also make sure crashloops don't result in ignoring lag altogether. Also, while in there, convert some of this code to async / await to make it easier to work on. Reviewed By: HarveyHunt Differential Revision: D19997589 fbshipit-source-id: 20747e5a37758aee68b8af2e95786430de55f7b1	2020-02-20 12:26:51 -08:00
Thomas Orozco	6da3dc939a	mononoke/blobstore_sync_queue: delete in smaller batches Summary: Our blobstore_sync_queue selects entries with a limit on the number of unique keys it's going to load. Then, it tries to delete them. However, the number of entries might be (much) bigger than the number of keys. When we try to delete them, we time out waiting for MySQL because deleting 100K entries at once isn't OK. This results in crashlooping in the healer, where we start, delete 100K entries, then time out. This is actually double bad, because when we come back up we just go without checking replication lag first, so if we're crashlooping, we disregard the damage we're doing in MySQL (I'm fixing this later in this stack). So, let's be a bit more disciplined, and delete keys 10K at a time, at most. Reviewed By: HarveyHunt Differential Revision: D19997588 fbshipit-source-id: 2262f9ba3f7d3493d0845796ad8f841855510180	2020-02-20 12:26:50 -08:00
Thomas Orozco	ef1ffa31e8	mononoke/sql_ext: log which shard we are waiting for in myrouter Summary: MyRouter needs to be told which shards to watch. Since I'm adding a new shard, it'll be easier for everyone to know that they need to update their MyRouter configuration if we start logging the shard name we're trying to hit. Reviewed By: ikostia Differential Revision: D20001704 fbshipit-source-id: 8a9ff3521bc7e3c9b7ed39c6ae33d0ddc1d467b7	2020-02-20 07:55:04 -08:00
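
Note on 6da3dc939a (blobstore_sync_queue: delete in smaller batches): the batching idea described in that commit can be sketched roughly as below. This is a minimal illustration only, not the code from that diff. `delete_batch` and `delete_in_batches` are hypothetical names, and the SQL DELETE is replaced by a stub that just reports how many ids it was handed.

```rust
/// Hypothetical stand-in for one SQL DELETE statement (an assumption for
/// illustration); it only reports how many ids it was asked to delete.
fn delete_batch(ids: &[u64]) -> usize {
    ids.len()
}

/// Issue deletes for `ids` in batches of at most `batch_size` entries and
/// return the size of each batch that was issued, in order.
fn delete_in_batches(ids: &[u64], batch_size: usize) -> Vec<usize> {
    // `chunks` panics on a zero chunk size, so reject it with a clear message.
    assert!(batch_size > 0, "batch_size must be non-zero");
    ids.chunks(batch_size).map(delete_batch).collect()
}

fn main() {
    // 25K queue entries with a 10K cap per delete, as in the commit message.
    let ids: Vec<u64> = (0..25_000).collect();
    let batches = delete_in_batches(&ids, 10_000);
    println!("{:?}", batches);
}
```

Capping each statement keeps any single delete small enough to finish before the database times out, at the cost of more round trips.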

236 Commits