Summary:
In very mergy repos we can hit a combinatorial explosion by visiting the same
node over and over again. The derived data framework has the same problem, and this diff
fixes it.
I made a few attempts at implementing it:
**1** Use `bounded_traversal`, but change unfold to filter out parents that were already visited.
That wasn't correct because fold would then be called only with
the "unvisited" parents. For example, in a case like
```
D
/ \
C B
\ /
A
```
fold for C or B will be called with empty parents, which is incorrect.
**2** Use `bounded_traversal`, change unfold to filter out visited parents but
also remember the real parents.
That doesn't work either. The reason is that fold might be called before unfold
for the parents has finished. So in a case like
```
D
/ \
C B
\ /
A
|
...
thousands of commits
```
if C reaches A first, then B won't visit any other node, and we will try to
derive data for B. However, derived data for A might not be ready yet, so
deriving data for B might fail.
**3** Change bounded_traversal to support DAGs, not just trees.
From the two points above it's clear that bounded_traversal should really be
called bounded_tree_traversal, because on any other DAG it might hit a
combinatorial explosion. I looked into changing bounded_traversal to support
DAGs, and it was possible, but not easy. Specifically, we'd need to make sure
that fold for a node is only called after unfold has finished for all of its
parents, stop using integers for nodes, etc. It might also have a perf hit for
the tree case, but it's not clear how big that would be.
While I think supporting DAGs in bounded_traversal makes sense, I don't want to
block the derived data implementation on that. I'll create a separate task for it.
---------------------------------------------------------------------------------
The approach I took in the end was to use bounded_stream_traversal, which doesn't
visit the same node twice. This will find all the commits that need to be
regenerated, but it might return them in an arbitrary order. After that we need
to topo_sort the commits (note that I introduced a bug in hg changeset
generation in D16132403, so this diff fixes that as well).
This is not the most efficient implementation because it will generate the nodes
sequentially even if they could be generated in parallel (e.g. if the nodes are
in different branches). I don't think that's a huge concern, so I think it's worth
waiting for the bounded_dag_traversal implementation (see point 3 above).
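The approach above can be sketched as follows (a minimal, self-contained illustration with assumed names and integer node ids, not the real Mononoke API): walk the DAG with a visited set so each commit is expanded once, then topo-sort the visited commits with Kahn's algorithm so parents come before children.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

fn derivation_order(start: u64, parents: &HashMap<u64, Vec<u64>>) -> Vec<u64> {
    // Phase 1: BFS with a visited set; each node is expanded exactly once,
    // which avoids the combinatorial explosion on mergy DAGs.
    let mut visited = HashSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(n) = queue.pop_front() {
        if visited.insert(n) {
            if let Some(ps) = parents.get(&n) {
                queue.extend(ps);
            }
        }
    }
    // Phase 2: Kahn's algorithm over the visited set, emitting parents
    // before children so derivation happens in a safe order.
    let mut children: HashMap<u64, Vec<u64>> = HashMap::new();
    let mut in_deg: HashMap<u64, usize> = visited.iter().map(|&n| (n, 0)).collect();
    for &n in &visited {
        for &p in parents.get(&n).into_iter().flatten() {
            if visited.contains(&p) {
                children.entry(p).or_default().push(n);
                *in_deg.get_mut(&n).unwrap() += 1;
            }
        }
    }
    let mut ready: VecDeque<u64> = in_deg
        .iter()
        .filter(|(_, &d)| d == 0)
        .map(|(&n, _)| n)
        .collect();
    let mut order = Vec::new();
    while let Some(p) = ready.pop_front() {
        order.push(p);
        for &c in children.get(&p).into_iter().flatten() {
            let d = in_deg.get_mut(&c).unwrap();
            *d -= 1;
            if *d == 0 {
                ready.push_back(c);
            }
        }
    }
    order
}
```

On the diamond example above this emits A first and D last, with C and B in between in either order.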
---------------------------------------------------------------------------------
Finally, there were concerns about memory usage from the additional hashset that
keeps visited nodes. I think these concerns are unfounded, for a few reasons:
1) We have to keep the nodes we visited *anyway*, because we need to generate
derived data from parents to children. In fact, bounded_traversal keeps them in
a map as well.
It's true that bounded_traversal can do this a bit more efficiently in cases where
we have two different branches that do not intersect. I'd argue that's a rare
case which happens only on repo merges with two independent, equally sized
branches. But even then it's not a huge problem (see below).
2) The hashset just keeps commit ids, which are 32 bytes long. So even if we have 1M
commits to generate, that would take 32MB plus hashset overhead. And cases like
that should never happen in the first place - we do not expect to generate
derived data for 1M commits except during initial huge repo imports (and
for those cases we can afford a 32MB memory hit). If we're in a state where we
need to generate too many commits, we should just return an error to the user;
we'll add that in a later diff.
Reviewed By: krallin
Differential Revision: D16438342
fbshipit-source-id: 4d82ea6111ac882dd5856319a16dda8392dfae81
Summary:
Before this change, we would always include the shard id in our mysql-related fb303 counters. This is not ideal for two reasons:
- for the xdb blobstore we have 4K shards and 24 counters, so we were reporting 96K counters in total
- we rarely care about per-shard metrics anyway, since in most cases queries are uniformly distributed across shards
Therefore, let's change this approach to not use per-shard counters and instead use per-shardmap ones (when sharding is involved).
Reviewed By: krallin
Differential Revision: D16360591
fbshipit-source-id: b2df94a3ca9cacbf5c1f328b48e87b48cd18287e
Summary: This adds a few retries in create_raw_xdb_connection. This is intended as a first step towards solving some of the flakiness we've observed when connecting to MySQL through direct connections (sometimes, we fail to acquire certificates).
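A hedged sketch of the retry shape described here (illustrative names, not the actual create_raw_xdb_connection change): retry a fallible connection attempt a few times before giving up, which papers over transient failures such as occasionally failing to acquire certificates.

```rust
fn with_retries<T, E>(
    max_attempts: usize,
    mut attempt: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match attempt() {
            Ok(v) => return Ok(v),
            // Remember the error and try again (a real implementation would
            // likely also back off between attempts).
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}
```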
Reviewed By: farnz
Differential Revision: D16228401
fbshipit-source-id: 0804797aecfe0b917099191cd2a36ce4c077b949
Summary:
In earlier diffs in this stack, I updated the callsites that reference XDB tiers to use concrete &str types (which is what they were receiving until now ... but it wasn't spelled out as such).
In this diff, I'm updating them to use owned `String` instead, which lets us hoist up `to_string()` and `clone()` calls in the stack, rather than pass down references only to copy them later on.
This allows us to skip some unnecessary copies. It turns out we were doing quite a bit of "turn this String into a reference, pass it down the stack, then turn it back into a String".
Reviewed By: farnz
Differential Revision: D16260372
fbshipit-source-id: faec402a575833f6555130cccdc04e79ddb8cfef
Summary:
Instantiating a new DB connection may require remote calls to be made to e.g. Hipster to allocate a new certificate (this is only the case when connecting to MySQL).
Currently, our bindings to our underlying DB locator make a blocking call to pretend that this operation is synchronous: https://fburl.com/ytmljxkb
This isn't ideal, because this call might actually take time, and we might also occasionally want to retry it (we've had issues in our MySQL tests with acquiring certificates that retrying should resolve). Running this synchronously makes doing so inefficient.
This patch doesn't update that, but it fixes everything on the Rust side of things to stop expecting connections to return a `Result` (and to start expecting a Future instead).
In a follow up diff, I'll work on making the changes in common/rust/sql to start returning a Future here.
Reviewed By: StanislavGlebik
Differential Revision: D16221857
fbshipit-source-id: 263f9237ff9394477c65e455de91b19a9de24a20
Summary:
Add cachelib layer to `CacheManager`.
`CacheManager` behaviors:
| cachelib | memcache | Behavior |
| -- | -- | -- |
| Miss | Miss | Resolve `fill` future, fill both cache layers |
| Miss | Hit | Fetch data from memcache and fill cachelib with the data fetched |
| Hit | Miss | Return data fetched from cachelib, DO NOT fill memcache |
| Hit | Hit | Return data fetched from cachelib |
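The table above can be sketched as code (a simplified, synchronous stand-in for the real `CacheManager`, with plain maps playing the role of cachelib and memcache):

```rust
use std::collections::HashMap;

fn get_or_fill(
    cachelib: &mut HashMap<String, String>,
    memcache: &mut HashMap<String, String>,
    key: &str,
    fill: impl FnOnce() -> String,
) -> String {
    if let Some(v) = cachelib.get(key) {
        // cachelib hit: return immediately; memcache is NOT filled.
        return v.clone();
    }
    if let Some(v) = memcache.get(key).cloned() {
        // cachelib miss, memcache hit: backfill cachelib only.
        cachelib.insert(key.to_string(), v.clone());
        return v;
    }
    // Miss in both layers: resolve `fill` and populate both.
    let v = fill();
    cachelib.insert(key.to_string(), v.clone());
    memcache.insert(key.to_string(), v.clone());
    v
}
```

Not filling memcache on a cachelib hit matters: cachelib is per-host, so a cachelib hit says nothing about whether other hosts (which share memcache) still need the value, and writing it back on every hit would be wasted traffic.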
Reviewed By: StanislavGlebik
Differential Revision: D15929659
fbshipit-source-id: f7914efc7718c614f39a8fd6ad5e6588773fdd78
Summary: Add type safety to `abomonation_future_cache` by requiring usage of `VolatileLruCachePool`, and make that change for all usages of `LruCachePool`.
Reviewed By: farnz
Differential Revision: D15882275
fbshipit-source-id: 3f192142af254d7b6b8ea7f9cc586c2034c97b93
Summary: This updates SqlConstructors to expose a `with_xdb` method that accepts an optional myrouter port.
Reviewed By: krallin
Differential Revision: D15897639
fbshipit-source-id: 25047c24ef28c76d2a27a8d26de8ecad521a1f82
Summary:
This updates the mononoke server code to support booting without myrouter. This required 2 changes:
- There were a few callsites where we didn't handle not having a myrouter port.
- In our function that waits for myrouter, we were failing if we had no myrouter port, but that's not desirable: if we don't have a myrouter port, we simply don't need to wait.
Arguably, this isn't 100% complete yet. Notably, RepoReadWriteStatus still requires myrouter. I'm planning to create a bootcamp task for this since it's not blocking my work on adding integration tests, but it would be nice to have.
Speaking of further refactor, it'd be nice if we supported a `SqlConstructors::with_xdb` function that did this matching for us so we didn't have to duplicate it all over the place. I'm also planning to bootcamp this.
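The "no port means no wait" logic can be sketched like this (names and signature are illustrative, not the real Mononoke code; `wait_for_myrouter` is a hypothetical stand-in for the actual wait):

```rust
fn myrouter_ready(
    port: Option<u16>,
    wait_for_myrouter: impl FnOnce(u16) -> Result<(), String>,
) -> Result<(), String> {
    match port {
        // No myrouter port configured: nothing to wait for, report ready
        // immediately instead of failing.
        None => Ok(()),
        Some(p) => wait_for_myrouter(p),
    }
}
```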
Reviewed By: farnz
Differential Revision: D15855431
fbshipit-source-id: 96187d887c467abd48ac605c28b826d8bf09862b
Summary:
1. Add the function myrouter_ready to common/sql_ext/sr/lib.rs.
2. Refactor main.rs and repo_handlers.rs to use the new function.
Reviewed By: ikostia
Differential Revision: D15623501
fbshipit-source-id: 7b9d6c5fd7c33845148dfacefbcf1bf3c6afaa5d
Summary:
We don't control what `get_from_db` will do when asked to fetch no data. We can only hope it'll do the smart thing and not hit the DB or increment any monitoring counters.
This can be problematic, because it can result in confusing data in ODS. See T45198435 for a recent example of this.
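One way to sketch the guard (illustrative only; the real `get_from_db` and its key types are stand-ins here): short-circuit on an empty key set before the query layer is ever reached.

```rust
use std::collections::{HashMap, HashSet};

fn get_from_db_guarded(
    keys: &HashSet<u64>,
    get_from_db: impl FnOnce(&HashSet<u64>) -> HashMap<u64, String>,
) -> HashMap<u64, String> {
    if keys.is_empty() {
        // Nothing to fetch: skip the DB entirely, so no query is issued and
        // no monitoring counter is incremented.
        return HashMap::new();
    }
    get_from_db(keys)
}
```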
Reviewed By: StanislavGlebik
Differential Revision: D15620424
fbshipit-source-id: 629c2eaad00d4977b0598c26e1f2a2ca64a1d66e
Summary:
This updates caching_ext to record cache hit and miss stats. This makes
it easier to write tests that exercise this caching.
As part of this, I refactored the CachelibHandler and MemcacheHandler mocks to
use a shared MockStore implementation.
Reviewed By: StanislavGlebik
Differential Revision: D15220647
fbshipit-source-id: b0f70b9780f577226664ebf6760b5fc93d733cd3
Summary:
It seems redundant to require callers of open_ssl to also pass a
(mostly) identical string.
Also make open_ssl special-case filenodes with sharding (though filenodes
aren't currently opened through it).
Reviewed By: StanislavGlebik
Differential Revision: D15157834
fbshipit-source-id: 0df45307f17bdb2c021673b3153606031008bee2
Summary:
In the case of mononoke's admin tool it's annoying for users to be required to run myrouter in the background and provide the myrouter port to every command.
Thanks to this change it is no longer necessary to run admin commands through myrouter - the tool will simply use a direct connection to XDB using the sql crate.
It is important to note that a raw XDB connection via the sql crate doesn't have connection pooling and doesn't handle XDB failover, so it is crucial that it is never used for long-lived or request-heavy use cases like running the mononoke server or blobimport.
Reviewed By: jsgf
Differential Revision: D15174538
fbshipit-source-id: 299d3d7941ae6aec31961149f926c2a4965ed970
Summary:
Add a LABEL constant to the SqlConstructors trait to make it easier to identify
which table is being used, for stats and logging.
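A sketch of the shape this takes (the trait name matches the summary, but the implementor and the stats key format are assumed for illustration): an associated const on the trait lets stats and log lines be keyed per-table without threading strings through every call.

```rust
trait SqlConstructors {
    // Identifies the underlying table for stats and logging.
    const LABEL: &'static str;
}

// Hypothetical implementor, for illustration only.
struct Filenodes;
impl SqlConstructors for Filenodes {
    const LABEL: &'static str = "filenodes";
}

// Hypothetical consumer: build a per-table stats key from the label.
fn stats_prefix<T: SqlConstructors>() -> String {
    format!("mononoke.sql.{}", T::LABEL)
}
```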
Reviewed By: HarveyHunt
Differential Revision: D13457488
fbshipit-source-id: a061a9582bc1783604f249d5b7dcede4b1e1d3c5
Summary: Following D14935155, let's decrease number of write connections as well
Reviewed By: ikostia
Differential Revision: D14973547
fbshipit-source-id: c344ecc568be26287e998b45b6988744cb5e0a09
Summary:
Mononoke and hg both have their own implementation wrappers for lz4
compression, unify these to avoid duplication.
Reviewed By: StanislavGlebik
Differential Revision: D14131430
fbshipit-source-id: 3301b755442f9bea00c650c22ea696912a4a24fd
Summary: There's nothing Mercurial-specific about identifying a repo. This also outright removes some dependencies on mercurial-types.
Reviewed By: StanislavGlebik
Differential Revision: D13512616
fbshipit-source-id: 4496a93a8d4e56cd6ca319dfd8effc71e694ff3e
Summary:
Previously the `max_gen()` function did a linear scan through all the keys on
every call. Let's use the `UniqueHeap` data structure to track the maximum
generation number instead.
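A minimal sketch of the idea (assumed names, not the real `UniqueHeap`): a max-heap paired with a set so duplicate insertions are ignored, making the maximum generation number a `peek` away instead of a linear scan.

```rust
use std::collections::{BinaryHeap, HashSet};

struct UniqueMaxHeap {
    heap: BinaryHeap<u64>,
    seen: HashSet<u64>,
}

impl UniqueMaxHeap {
    fn new() -> Self {
        UniqueMaxHeap { heap: BinaryHeap::new(), seen: HashSet::new() }
    }

    fn push(&mut self, gen_num: u64) {
        // Only insert values we haven't seen, keeping the heap duplicate-free.
        if self.seen.insert(gen_num) {
            self.heap.push(gen_num);
        }
    }

    fn max_gen(&self) -> Option<u64> {
        // O(1) peek at the maximum, vs. a linear scan over all keys.
        self.heap.peek().copied()
    }
}
```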
Reviewed By: lukaspiatkowski
Differential Revision: D13275471
fbshipit-source-id: 21b026c54d4bc08b26a96102d2b77c58a981930f
Summary:
We've recently found that the `known()` wireproto request gets much slower when we
send more traffic to Mononoke jobs. Other wireproto methods looked fine; cpu
and memory usage were fine as well.
Background: a `known()` request takes a list of hg commit hashes and returns
which of them Mononoke knows about.
One thing we noticed is that the `known()` handler sends db requests sequentially.
Experiments with sending `known()` requests with commit hashes that Mononoke
didn't know about confirmed that its latency got higher the more parallel
requests we sent. We suspect this is because Mononoke has to send requests to
the db master, and we limit the number of master connections.
A thing that should help is batching the requests, i.e. instead of sending many
requests that each ask whether a single hg commit exists, sending one request for
many commits at once.
This change also required changes to the bonsai-mapping caching layer to
do batch cache requests.
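The batching idea can be sketched like this (illustrative, not the real wireproto handler; `fetch_existing` is a hypothetical stand-in for one round trip to the bonsai mapping):

```rust
use std::collections::HashSet;

fn known(
    requested: &[&str],
    fetch_existing: impl FnOnce(&[&str]) -> HashSet<String>,
) -> Vec<bool> {
    // One batched request for all hashes at once, instead of one
    // master-connection round trip per hash...
    let existing = fetch_existing(requested);
    // ...then answer per-hash membership locally.
    requested.iter().map(|h| existing.contains(*h)).collect()
}
```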
Reviewed By: lukaspiatkowski
Differential Revision: D13194775
fbshipit-source-id: 47c035959c7ee12ab92e89e8e85b723cb72738ae
Summary:
The default ServiceType is ServiceType.Any, so requests might go to the master in the master
region. This diff changes that.
Reviewed By: lukaspiatkowski, farnz
Differential Revision: D13021674
fbshipit-source-id: 928cf59b095549f3048411241116c097e1193c7d
Summary: Additionally use a lower max_number_of_concurrent_connections for read connections to master to avoid overloading it.
Reviewed By: farnz
Differential Revision: D12979366
fbshipit-source-id: 258dbae554155d7a33d619f445293092940aad61
Summary:
We were using an incorrect buffer size. It's *very* surprising that our servers
weren't continuously crashing. However, see the test plan - it really looks
like `LZ4_compressBound()` is the correct option here.
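For reference, `LZ4_compressBound()` returns the worst-case compressed size for a given input size, so the destination buffer can never be overrun even for incompressible data. Its documented formula, mirrored here as a sketch (ignoring the C macro's max-input-size guard), is:

```rust
// Worst-case LZ4 compressed size for `input_size` bytes of input
// (mirrors the LZ4_COMPRESSBOUND formula from lz4.h).
fn lz4_compress_bound(input_size: usize) -> usize {
    input_size + input_size / 255 + 16
}
```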
Reviewed By: farnz
Differential Revision: D9738590
fbshipit-source-id: d531f32e79ab900f40d46b7cb6dac01dff8e9cdc
Summary:
Backout D9124508.
This is actually more complex than it seems. It breaks the non-buck build
everywhere:
- hgbuild on all platforms. POSIX platforms break because `hg archive` will
miss `scm/common`. Windows build breaks because of symlink.
- `make local` on GitHub repo because `failure_ext` is not public. The `pylz4`
Cargo.toml has missing dependencies.
Fixing them correctly seems non-trivial. Therefore let's backout the change to
unblock builds quickly.
The linter change is kept in case we'd like to try again in the future.
Reviewed By: simpkins
Differential Revision: D9225955
fbshipit-source-id: 4170a5f7664ac0f6aa78f3b32f61a09d65e19f63
Summary: Moved the lz4 compression code into a separate module in `scm/common/pylz4` and redirected code referencing the former two files to the new module
Reviewed By: quark-zju, mitrandir77
Differential Revision: D9124508
fbshipit-source-id: e4796cf36d16c3a8c60314c75f26ee942d2f9e65
Summary:
This is a series of patches which adds Cargo.toml files to all the crates and tries to build them. There is an individual patch for each crate which tells whether that crate currently builds successfully using cargo or not, and if not, the reason why.
The reasons why some crates don't build:
* the failure_ext and netstring crates are internal
* an error related to tokio_io; there might be a patched version of tokio_io internally
* actix-web depends on httparse, which uses nightly features
All builds were done using rustc version `rustc 1.27.0-dev`.
Pull Request resolved: https://github.com/facebookexperimental/mononoke/pull/7
Differential Revision: D8778746
Pulled By: jsgf
fbshipit-source-id: 927a7a20b1d5c9643869b26c0eab09e90048443e
Summary:
Unify all uses of Sqlite and of Mysql.
This supersedes D8712926.
Reviewed By: farnz
Differential Revision: D8732579
fbshipit-source-id: a02cd04055a915e5f97b540d6d98e2ff2d707875
Summary:
We had a memory leak because the context wasn't cleaned up afterwards. This diff
fixes it.
Reviewed By: farnz
Differential Revision: D8236762
fbshipit-source-id: f82b061f3f541d9104d1185ed04ea21224b7d5bc
Summary: We are going to add CompressContext in the next diff
Reviewed By: farnz
Differential Revision: D8236761
fbshipit-source-id: 0df55b9bc5e9fd78ac8c060576513c1216641ead
Summary: Will be used in the remotefilelog getfiles method.
Reviewed By: jsgf
Differential Revision: D6884919
fbshipit-source-id: e8037123a4843322c29b37c6b5749444781f4fa7
Summary:
Add a separate crate that uses lz4 in the same way as the python lz4 library. The
main difference is that the first 4 bytes are the length of the raw data in le32
format. The reason for moving this into a separate crate is to use pylz4 in the
remotefilelog getfiles method.
Also removed one panic and replaced it with an error.
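The framing described above can be sketched like this (compression itself is elided; `compress` is a placeholder for the real lz4 call): the payload is prefixed with the *uncompressed* length as 4 little-endian bytes, matching the python lz4 library's format.

```rust
// Prefix the compressed payload with the raw-data length as le32.
fn frame(raw: &[u8], compress: impl Fn(&[u8]) -> Vec<u8>) -> Vec<u8> {
    let mut out = (raw.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(&compress(raw));
    out
}

// Read back the raw-data length from a framed payload's 4-byte header.
fn raw_len(framed: &[u8]) -> u32 {
    u32::from_le_bytes([framed[0], framed[1], framed[2], framed[3]])
}
```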
Reviewed By: jsgf
Differential Revision: D6884918
fbshipit-source-id: 1b05381c045a1f138ab28820175289233b07a91d