sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 08:47:12 +03:00

Author	SHA1	Message	Date
Thomas Orozco	1ee93bdcfb	mononoke/rendezvous: use in-flight connection count to decide when to batch Summary: After doing some local benchmarking (using MononokeApi instantiation as the benchmark), one thing that's apparent is that we have quite a few parameters here and that tuning them is likely to be a challenge. One parameter in particular is the batch "objective", which controls how many requests we want to see in the last batching interval before we choose to batch (this is `rendezvous_dispatch_min_threshold`). The problem with this is that there is no good, real-world metric to set it based on. This is in contrast to the other parameters we have, which do have some reasonable metric to compare to: - rendezvous_dispatch_delay_ms: this is overhead we add to queries, so it should be small & on the order of query execution latency (i.e. a few ms). - rendezvous_dispatch_max_threshold: this controls how big our batches get, so it should be on the order of what makes a SQL query too big (i.e. less than a hundred records). In contrast, we want to set `rendezvous_dispatch_min_threshold` such that batching kicks in before we start using too many concurrent connections (which is what query batching seeks to reduce), but the problem is that those two numbers aren't directly connected. One clear problem, for example, is that if our DB is in-region vs. out-of-region, then for a given query execution time, and a desired concurrency level before batching kicks in, we'd need different values of `rendezvous_dispatch_min_threshold` (it would have to kick in faster for the out-of-region workload). So, this diff updates rendezvous to actually track concurrent connection count before we force batching. This is the actual metric we care about here, and it has pretty natural "real world" values we can look at to decide where to set it (our connection pool, which is limited to 100 concurrent connections, and our open connection baseline).
Note: I set this at 5 because that's more or less what servers look like outside of spikes for Bonsai hg mapping, and for Changesets where I'm planning to introduce this in the future: - bonsai: https://fburl.com/ods/6d4a9qb5 - changesets: https://fburl.com/ods/kuq5x1vw (note: to make sense of this, focus on just one server, otherwise the constant spikes we get sort of hide the big picture). Reviewed By: farnz Differential Revision: D27792603 fbshipit-source-id: 1a9189f6b50d48444b3373bd1cb14dc51b85a6d2	2021-04-16 10:27:44 -07:00
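The idea in the commit above (dispatch immediately while few connections are in flight, and fall back to batching past that point) can be sketched as follows. This is only an illustration under stated assumptions, not the rendezvous implementation: `InFlightTracker`, `InFlightGuard`, `try_dispatch` and the `free_connections` parameter are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical tracker of queries currently holding a connection.
pub struct InFlightTracker {
    count: Arc<AtomicUsize>,
    /// Assumed knob: how many queries may run unbatched at once.
    free_connections: usize,
}

/// Releases one in-flight slot when dropped (i.e. when the query completes).
pub struct InFlightGuard(Arc<AtomicUsize>);

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

impl InFlightTracker {
    pub fn new(free_connections: usize) -> Self {
        Self {
            count: Arc::new(AtomicUsize::new(0)),
            free_connections,
        }
    }

    /// Returns a guard if the query may be dispatched right away, or `None`
    /// if the caller should instead queue the query for the next batch.
    pub fn try_dispatch(&self) -> Option<InFlightGuard> {
        let previous = self.count.fetch_add(1, Ordering::SeqCst);
        if previous < self.free_connections {
            Some(InFlightGuard(self.count.clone()))
        } else {
            // Over budget: undo the reservation and ask the caller to batch.
            self.count.fetch_sub(1, Ordering::SeqCst);
            None
        }
    }
}
```

With a budget of 5 (the value mentioned in the note above), the first five concurrent queries would go out unbatched and later ones would wait for a batch until a guard is dropped.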
Kostia Balytskyi	fc3908e9fa	repo_client: log full gettreepack args to scuba in verbose mode Summary: This will allow us to have greater visibility into what's going on when there are production issues. Note: for getpack, the params data model is `[MPath, [Node]]`. In practice there seems to always just be 1 node per mpath. However, to preserve the mapping, I log every mpath in a separate sample. Reviewed By: ahornby Differential Revision: D26690685 fbshipit-source-id: 36616256747b61390b0435467892daeff2b4dd07	2021-04-14 08:29:59 -07:00
Thomas Orozco	3c88bd8832	mononoke/timeseries: track count of valid buckets Summary: It's useful when operating with timeseries to know what range of data has been populated. This diff adds support for this in mononoke/timeseries, by tracking the number of buckets that fall within intervals where data was provided. Reviewed By: mitrandir77 Differential Revision: D27734229 fbshipit-source-id: 3058a7ce4da67666e8ce8a46e34e277b69153ea4	2021-04-13 06:24:37 -07:00
Thomas Orozco	87aed04d37	mononoke/sql_ext: publish SQL max open connections stat Summary: Like it says in the title, this adds support for publishing our max open connections to ODS. Note that this is a little more involved than I would like for it to be, but there is no way to get direct access to this information. This means we need to: - Expose how many open connections we have in flight (this is done earlier in this stack in the Rust MySQL bindings). - Periodically get this information out of MySQL and put it in a timeseries. - Get the max out of said timeseries and publish it to a counter so that it can be fetched in ODS. This is what this diff does. Note that I've only done this for read pools, largely because I think they're the ones we tend to exhaust the most and I'd like to see if there is value in exposing those counters before I use them. We do the aggregation on a dedicated thread here. I contemplated making this a Tokio task, but I figured making it a thread would make it easier to see if it's misbehaving in any way (also: note that the SQL client allocates a bunch of threads already anyway). Reviewed By: HarveyHunt Differential Revision: D27678955 fbshipit-source-id: c7b386f3a182bae787d77e997d108d8a74a6402b	2021-04-13 03:05:23 -07:00
Thomas Orozco	d677947066	metagit/hosts-down-tailer: use mononoke/common/timeseries Summary: Like it says in the title. This is a place where we use timeseries so we might as well use that shared crate. Reviewed By: mzr Differential Revision: D27678389 fbshipit-source-id: 9b5d4980a1ddb5ce2a01c8ef417c78b1c3da80b7	2021-04-12 05:22:33 -07:00
Thomas Orozco	e64012ad9e	mononoke/timeseries: introduce a basic crate for tracking time series Summary: I'd like to be able to track time series for access within Mononoke. The underlying use case here is that I want to be able to track the max count of connections in our SQL connection pools over time (and possibly other things in the future). Now, the obvious question is: why am I rolling my own? Well, as it turns out, there isn't really an implementation of this that I can reuse: - You might expect to be able to track the max of a value via fb303, but you can't: https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/fb303/ExportType.h?commit=0405521ec858e012c0692063209f3e13a2671043&lines=26-29 - You might go look in Folly, but you'll find that the time series there only supports tracking Sum & Average, but I want my timeseries to track Max (and in fact I'd like it to be sufficiently flexible to track anything I want): https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/folly/stats/BucketedTimeSeries.h It's not the first time I've run into a need for something like this. I needed it in RendezVous to track connections over the last 2 N millisecond intervals, and we needed it in metagit for host draining as well (note that the implementation here is somewhat inspired by the implementation there). Reviewed By: mzr Differential Revision: D27678388 fbshipit-source-id: ba6d244b8bb848d4e1a12f9c6f54e3aa729f6c9c	2021-04-12 05:22:33 -07:00
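To make the "track Max over recent intervals" use case concrete, here is a minimal bucketed time series sketch. It is not the API of the mononoke/timeseries crate: `MaxTimeseries`, its methods, and the use of plain integer interval indices instead of wall-clock time are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-window series: one slot per interval, oldest first.
pub struct MaxTimeseries {
    buckets: VecDeque<Option<u64>>,
    /// Index of the interval stored in the last slot.
    newest: u64,
}

impl MaxTimeseries {
    pub fn new(bucket_count: usize, start_interval: u64) -> Self {
        assert!(bucket_count > 0);
        let mut buckets = VecDeque::with_capacity(bucket_count);
        buckets.resize(bucket_count, None);
        Self { buckets, newest: start_interval }
    }

    /// Slides the window forward so that `interval` is the newest slot.
    fn advance_to(&mut self, interval: u64) {
        if interval <= self.newest {
            return;
        }
        // No need to rotate more often than there are buckets.
        let steps = (interval - self.newest).min(self.buckets.len() as u64);
        for _ in 0..steps {
            self.buckets.pop_front();
            self.buckets.push_back(None);
        }
        self.newest = interval;
    }

    /// Records `value` for `interval`, keeping the per-bucket maximum.
    pub fn record(&mut self, interval: u64, value: u64) {
        self.advance_to(interval);
        let age = (self.newest - interval) as usize;
        if age >= self.buckets.len() {
            return; // Sample is older than the window: drop it.
        }
        let idx = self.buckets.len() - 1 - age;
        let slot = &mut self.buckets[idx];
        *slot = Some(slot.map_or(value, |v| v.max(value)));
    }

    /// Maximum over all buckets that currently hold data.
    pub fn max(&self) -> Option<u64> {
        self.buckets.iter().flatten().copied().max()
    }
}
```

A periodic sampler could call `record` with the current open-connection count and publish `max()` to a counter, which mirrors the flow described in 87aed04d37.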
Thomas Orozco	c934b67e5b	mononoke: remove all trivial usage of async-unit Summary: I'd like to just get rid of that library since it's one more place where we specify the Tokio version and that's a little annoying with the Tokio 1.x update. Besides, this library is largely obsoleted by `#[fbinit::test]` and `#[tokio::test]`. Reviewed By: farnz Differential Revision: D27619147 fbshipit-source-id: 4a316b81d882ea83c43bed05e873cabd2100b758	2021-04-07 07:26:57 -07:00
Aida Getoeva	01b38dfa5e	mononoke/mysql: log connection/query ODS counters by the shardmap and label Summary: It is useful to have latency stats grouped by the shardmap and label to easily identify where the problem comes from if something is broken. This diff switches a single histogram used for all the MySQL use-cases into a set of histograms: one per `shardmap:label`. It also makes the histograms a bit smaller, as we don't actually have numbers as big as 10s per conn/query. There is only one case where the histogram is created per shard instead of per shardmap: the `xdb.hgsql` DB with 9 shards. This happens because we connect to each shard as an individual tier: https://fburl.com/diffusion/um8lt7cr. {F582699426} Reviewed By: farnz Differential Revision: D27503833 fbshipit-source-id: 40c7eb64df7ae0694f63d3644231f240df8212ec	2021-04-07 05:14:03 -07:00
Thomas Orozco	a1e2833377	mononoke/rendezvous: reduce histogram size Summary: There was no reason for this to be this large, and it's causing issues with repo construction since it's pretty expensive to construct as a result (D27501915 (`69896e90b5`)). Let's just make it much smaller. Reviewed By: StanislavGlebik Differential Revision: D27591073 fbshipit-source-id: 1c986cb922d70b10c39711c57ac9f5899ed7496c	2021-04-06 06:53:58 -07:00
Mark Juggurnauth-Thomas	69896e90b5	bonsai_hg_mapping: construct rendezvous connections in a blocking closure Summary: `RendezVousConnection::new` can block for some time doing work on the CPU, specifically creating the stats objects. This causes problems for other futures during repo construction. Instead, move rendez-vous construction to a `spawn_blocking` closure, so that it doesn't interfere with the other futures. Since `SqlBonsaiHgMapping::from_sql_connections` is not async, and is part of the SqlConstruct trait, we must convert this to the builder pattern so that we can defer rendez-vous construction to later on. Reviewed By: farnz Differential Revision: D27501915 fbshipit-source-id: 9c58c32411301128424985deeab127d052c43532	2021-04-01 08:27:15 -07:00
Gus Wynn	fc46c24e8f	update tokio to 1.4.0 Summary: https://github.com/tokio-rs/tokio/releases/tag/tokio-1.4.0 I want the `biased;` option in `tokio::select!` Reviewed By: ahornby Differential Revision: D27435341 fbshipit-source-id: c29ca954c319327f62466131ae04483ad091bf49	2021-03-31 10:44:20 -07:00
Stanislau Hlebik	a8b983db80	mononoke: Back out "mononoke/mysql: group ODS counters by shardmap" Summary: Original commit changeset: 0708a4b0dc37 It seems to be the cause of SQL timeouts on Mononoke startup. Differential Revision: D27337030 fbshipit-source-id: 7b154c09397b0e297e18b186a6338ab801b1769d	2021-03-26 01:01:37 -07:00
Mark Juggurnauth-Thomas	64461bb361	test_repo_factory: use test factory for remaining tests Summary: Use the test factory for the remaining existing tests. Reviewed By: StanislavGlebik Differential Revision: D27169443 fbshipit-source-id: 00d62d7794b66f5d3b053e8079f09f2532d757e7	2021-03-25 07:34:51 -07:00
Aida Getoeva	5f67b9dde7	mononoke/mysql: group ODS counters by shardmap Summary: Grouping stats by the shardmap can help to detect and root-cause issues. This diff adds a label to the `MysqlConnection`, and Mononoke will now log counters by shardmap. Reviewed By: StanislavGlebik Differential Revision: D26994369 fbshipit-source-id: 0708a4b0dc3762f5f9152b83200173cd8b241abc	2021-03-23 11:07:26 -07:00
Thomas Orozco	a3a0347639	mononoke/rendezvous: introduce query batching Summary: This introduces a basic building block for query batching. I called this rendezvous, since it's about multiple queries meeting up in the same place :) There are a few (somewhat conflicting) goals this tries to satisfy, so let's go over them: 1), we'd like to reduce the total number of queries made by batch jobs. For example, group hg bonsai lookups made by the walker. Those jobs are characterized by the fact that they have a lot of queries to make, all the time. Here's an example: https://fburl.com/ods/zuiep7yh. 2), we'd like to reduce the overall number of connections held to MySQL by our tasks. The main way we achieve this is by reducing the maximum number of concurrent queries. Indeed, a high total number of queries doesn't necessarily result in a lot of connections as long as they're not concurrent, because we can reuse connections. On the other hand, if you dispatch 100 concurrent queries, that _does_ use 100 connections. This is something that applies to batch jobs due to their query volume, but also to "interactive" jobs like Mononoke Server or SCS, just not all the time. Here's an example: https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but sometimes spikes substantially). 2.1) It's also worth noting that concurrent queries are often the result of many clients wanting the same data, so deduplication is also useful here. 3), we also don't want to impact the latency of interactive jobs when they need to do a little query here or there (i.e. it's largely fine if our jobs all hold a few connections to MySQL and use them somewhat consistently). 4), we'd like this to make it easier to do batching right. For example, if you have 100 Bonsais to map to hg, you should be able to just map and call `future::try_join_all` and have that do the right thing. 5), we don't want "bad" queries to affect other queries negatively.
One example would be the occasional queries we make to Bonsai <-> Hg mapping in `known` for thousands (if not more) of rows. 6), we want this to be easy to incorporate into the codebase. So, how do we try to address all of this? Here's how: - We ... do batching, and we deduplicate requests in a batch. This is the easier bit and should address #1, #2 and #2.1, #4. - However, batching is conditional. We notably don't batch very large requests with the rest (addresses #5). We also don't batch small queries all the time: we only batch if we are observing a throughput of queries that suggests we can find some benefit in batching (this targets #3). - Finally, we have some utilities for common cases like having to group by repo id (this is `MultiRendezVous`), and this is all configurable via tunables (and the default is to not do anything). Reviewed By: StanislavGlebik Differential Revision: D27010317 fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07	2021-03-19 08:50:40 -07:00
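The two batching rules described above (deduplicate keys within a batch, and keep very large requests out of the shared batch) can be sketched as a pure planning step. This is an illustration only: `DispatchPlan`, `plan_dispatch` and the `large_request_cutoff` parameter are hypothetical, and keys are modeled as `u64` for simplicity.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of planning one dispatch round.
#[derive(Debug, PartialEq)]
pub struct DispatchPlan {
    /// Deduplicated keys from all small requests, sent as a single query.
    pub batch: BTreeSet<u64>,
    /// Requests considered too large to share a batch; each is sent alone.
    pub solo: Vec<Vec<u64>>,
}

/// Splits pending requests into one shared, deduplicated batch and a list
/// of oversized requests that are dispatched on their own.
pub fn plan_dispatch(requests: Vec<Vec<u64>>, large_request_cutoff: usize) -> DispatchPlan {
    let mut batch = BTreeSet::new();
    let mut solo = Vec::new();
    for keys in requests {
        if keys.len() > large_request_cutoff {
            // Assumed rule: a request above the cutoff never joins the batch.
            solo.push(keys);
        } else {
            // Inserting into a set removes keys shared between requests.
            batch.extend(keys);
        }
    }
    DispatchPlan { batch, solo }
}
```

Two callers asking for keys `[1, 2]` and `[2, 3]` would result in a single query for `{1, 2, 3}`, while a request for thousands of rows would be left to run separately.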
Mark Juggurnauth-Thomas	33ec4db653	bounded_traversal: require futures to be boxed Summary: Bounded traversal's internal book-keeping moves the futures returned from fold and unfold callbacks around while they are being queued to be scheduled. If these futures are large, then this can result in a significant portion of bounded traversal's CPU time being spent on `memcpy`ing these futures around. This can be prevented by always boxing the futures that are returned to bounded traversal. Make this a requirement by changing the type from `impl Future<...>` to `BoxFuture<...>`. Reviewed By: mitrandir77 Differential Revision: D26997706 fbshipit-source-id: 23a3583adc23c4e7d3607a78e82fc9d1056691c3	2021-03-12 08:12:57 -08:00
Mark Juggurnauth-Thomas	91358f3716	mononoke_types: use SortedVectorMap for BonsaiChangeset Summary: BonsaiChangesets are rarely mutated, and their maps are stored in sorted order, so we can use `SortedVectorMap` to load them more efficiently. In the cases where mutable maps of filechanges are needed, we can use `BTreeMap` during the mutation and then convert them to `SortedVectorMap` to store them. Reviewed By: mitrandir77 Differential Revision: D25615279 fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4	2021-03-11 04:28:43 -08:00
Thomas Orozco	306c9dc658	mononoke: update async_limiter to tokio_shim Summary: Metagit depends on this, and I'd like to update Metagit to Tokio 1.0, possibly independently of Mononoke. Reviewed By: farnz Differential Revision: D26945751 fbshipit-source-id: 552c831964f31d155783af87e7931b2c824e2471	2021-03-10 11:01:28 -08:00
Andrey Chursin	0be8e8ce29	vfs: introduce AsyncVfs Summary: AsyncVfs provides an async VFS interface. It will be used in the native checkout instead of the current approach, which spawns blocking Tokio tasks for VFS actions. Reviewed By: quark-zju Differential Revision: D26801250 fbshipit-source-id: bb26c4fc8acac82f4b55bb3f2f3964a6d0b64014	2021-03-05 21:47:51 -08:00
Thomas Orozco	2a803fc10d	third-party/rust: update futures Summary: Those newer versions of Futures have compatibility improvements with Tokio, notably: - https://github.com/rust-lang/futures-rs/pull/2333 - https://github.com/rust-lang/futures-rs/pull/2358 Reviewed By: farnz Differential Revision: D26778794 fbshipit-source-id: 5a9dc002083e5edfa5c614d8d2242e586a93fcf6	2021-03-04 06:42:55 -08:00
Thomas Orozco	ef7045e818	common/rust: use fbinit-tokio Summary: This diffs add a layer of indirection between fbinit and tokio, thus allowing us to use fbinit with tokio 0.2 or tokio 1.x. The way this works is that you specify the Tokio you want by adding it as an extra dependency alongside `fbinit` in your `TARGETS` (before this, you had to always include `tokio-02`). If you use `fbinit-tokio`, then `#[fbinit::main]` and `#[fbinit::test]` get you a Tokio 1.x runtime, whereas if you use `fbinit-tokio-02`, you get a Tokio 0.2 runtime. This diff is big, because it needs to change all the TARGETS that reference this in the same diff that introduces the mechanism. I also didn't produce it by hand. Instead, I scripted the transformation using this script: P242773846 I then ran it using: ``` { hg grep -l "fbinit::test"; hg grep -l "fbinit::main" } \| \ sort \| \ uniq \| \ xargs ~/codemod/codemod.py \ && yes \| arc lint \ && common/rust/cargo_from_buck/bin/autocargo ``` Finally, I grabbed the files returned by `hg grep`, then fed them to: ``` arc lint-rust --paths-from ~/files2 --apply-patches --take RUSTFIXDEPS ``` (I had to modify the file list a bit: notably I removed stuff from scripts/ because some of that causes Buck to crash when running lint-rust, and I also had to add fbcode/ as a prefix everywhere). Reviewed By: mitrandir77 Differential Revision: D26754757 fbshipit-source-id: 326b1c4efc9a57ea89db9b1d390677bcd2ab985e	2021-03-03 04:09:15 -08:00
Lukas Piatkowski	f317302b0f	autocargo v1: reformating of oss-dependencies, workspace and patch sections and thrift files to match v2 Summary: For dependencies V2 puts "version" as the first attribute of dependency or just after "package" if present. Workspace section is after patch section in V2 and since V2 autoformats patch section then the third-party/rust/Cargo.toml manual entries had to be formatted manually since V1 takes it as it is. The thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that have been incorrectly left when the corresponding Cargo.toml was removed. Reviewed By: ikostia Differential Revision: D26618363 fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92	2021-02-25 15:10:56 -08:00
Thomas Orozco	8e626f0c02	mononoke/async_limiter: use tokio_shim Summary: I'd like to prepare the migration to Tokio 1.0 and this is one bit of code that needs some non-trivial changes since in Tokio 1.0, Sleep is no longer Unpin. Reviewed By: farnz Differential Revision: D26610033 fbshipit-source-id: 1db4c1686fcd010e2158bcf4bb25f1e15dd19603	2021-02-25 02:11:30 -08:00
Michael Samoylenko	1043ffb436	Fix warnings Summary: ``` misa@devvm1346 /d/u/m/f/f/ctp (default)> cargo check --tests ... warning: unused import: `FutureExt` --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/oss.rs:14:30 \| 14 \| use futures_ext::{BoxFuture, FutureExt}; \| ^^^^^^^^^ \| = note: `#[warn(unused_imports)]` on by default warning: unused import: `futures_old::future::ok` --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/oss.rs:15:5 \| 15 \| use futures_old::future::ok; \| ^^^^^^^^^^^^^^^^^^^^^^^ warning: unused variable: `pool_config` --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/lib.rs:109:32 \| 109 \| Self::Mysql(_, pool_config) => { \| ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_pool_config` \| = note: `#[warn(unused_variables)]` on by default warning: 3 warnings emitted ``` Reviewed By: farnz Differential Revision: D26637776 fbshipit-source-id: a7309fa02f4b40fc46a6be0bb64ecb9eceefc104	2021-02-24 11:01:02 -08:00
Alex Hornby	da20841ec9	mononoke: shard walk by sql tier and shard number Summary: Scrubbing a repo is highly concurrent as it's mostly IO-bound. As such, we can end up waiting on the SQL connection pool for connections when it allows fewer than scheduled_max connections. This change makes bounded_traversal_unique calls from the walker aware of the database tier and shard a Node may connect to, so that execution can be limited to the bounds of what the connection pool can support without waiting. We still end up waiting for the connection, but now it's done in bounded_traversal_unique, rather than in connection pool code, and we are thus a) able to process other Nodes while waiting and b) not subject to connection pool timeouts. Differential Revision: D26524074 fbshipit-source-id: 19125388c730f5cef7e9de34b5b550efa8e6b825	2021-02-23 11:07:48 -08:00
Alex Hornby	aa8f84ad4c	mononoke: async myrouter_ready() Summary: Small clean up. Allows us to pass Logger by reference, removing the FIXME in blobrepo factory Reviewed By: farnz Differential Revision: D26551592 fbshipit-source-id: d6bb04b8bb3034ad056f071b67b5ae0ce3c6f224	2021-02-23 10:55:42 -08:00
Alex Hornby	199538edfa	mononoke: expose per_key_limit() in sql_ext Summary: It's useful to know the max number of connections to a shard that it is sensible to open. Used in next diff in stack. Differential Revision: D26543419 fbshipit-source-id: 57e7c3295a5b5db1572f26954ae0dfb04c84b374	2021-02-23 02:51:50 -08:00
Alex Hornby	5cafeff1db	mononoke: allow bounded_traversal_unique steps to output None Summary: The walker mostly checks for duplicates before emitting a new edge, at the same time recording the edge as visited to prevent duplicate edges. However for derived data where the node may or may not be present, the node isn't considered visited until the node data is successfully loaded and seen in state.rs record_resolved_visit(). In such cases multiple copies of a node could be enqueued, and then we need to run each one. With this change, where the walker can detect that such a step has completed previously, it will now short circuit the step and return None. Differential Revision: D26369917 fbshipit-source-id: c2bdbbabfaa80dbb7cc7d2bc25a17230531ae111	2021-02-23 02:51:49 -08:00
Thomas Orozco	f06dc957ae	mononoke: log identities as norm vector Summary: In EdenAPI this is logged as a vector (and in all our other services), but in Mononoke Server we log it as a string. Let's fix this up. This is worth doing now since right now we end up logging to 2 columns with the same name and a different type. Reviewed By: ahornby Differential Revision: D26542737 fbshipit-source-id: 2f12c9e475061b1c21c71bade99b83cc070006e8	2021-02-22 11:48:47 -08:00
Thomas Orozco	097e4ad00c	mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively) Summary: The earlier diffs in this stack have removed all our dependencies on the Tokio 0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so we don't need this anymore. We do still have some deps on `tokio-io`, but this is just traits + helpers, so this doesn't actually prevent us from removing the 0.1 runtime! Note that we still have a few transitive dependencies on Tokio 0.1: - async-unit uses tokio-compat - hg depends on tokio-compat too, and we depend on it in tests This isn't the end of the world though, we can live with that :) Reviewed By: ahornby Differential Revision: D26544410 fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8	2021-02-22 09:22:42 -08:00
Thomas Orozco	f7d5b3db46	mononoke: remove tokio-timer dependencies Summary: Like it says in the title. Reviewed By: ahornby Differential Revision: D26513724 fbshipit-source-id: 5d1f986af17c948ad24e3d378a7623a0d97f5cf4	2021-02-22 09:22:41 -08:00
Thomas Orozco	0734a61cb1	common/rust: remove tracing Summary: This was a thing that was only ever used in Mononoke, and we don't think it's usable and haven't been using it. Let's get rid of it. As-is, it won't even work for most people due to its (indirect) dependency on Tokio 0.1. Reviewed By: StanislavGlebik Differential Revision: D26512243 fbshipit-source-id: faa16683f2adb20dfba43c4768486b982bc02ff9	2021-02-22 09:22:41 -08:00
Lukas Piatkowski	cd0b6d50e2	autocargo v1: changes to match autocargo v2 generation results. Summary: The changes (and fixes) needed were: - Ignore rules that are not rust_library or thrift_library (previously only ignore rust_bindgen_library, so that binary and test dependencies were incorrectly added to Cargo.toml) - Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl` - Rearrange some attributes, like features, authors, edition etc. - Authors to use " instead of ' - Features to be sorted - Sort all dependencies as one instead of grouping third party and fbcode dependencies together - Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is. Reviewed By: zertosh Differential Revision: D26544150 fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb	2021-02-19 11:03:55 -08:00
Lukas Piatkowski	87ddbe2f74	autocargo v1: update autocargo field format to allow transition to autocargo v2 Summary: Autocargo V2 will use a more structured format for the autocargo field; with the help of the `cargo_toml` crate it will be easy to deserialize and handle it. Also, the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might often be wrong, especially if someone tries to publish a package from fbcode, in which case the private facebook folders might be shipped. Let's just not set it, and in the new system one will be able to set it explicitly via an autocargo parameter on a rule. Reviewed By: ahornby Differential Revision: D26339606 fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651	2021-02-12 23:28:25 -08:00
Alex Hornby	5d7a62e13c	mononoke: add duplicate detecting variant of bounded traversal stream Summary: Sometimes many nodes will unfold with a common child, in which case it is desirable that they aren't scheduled at the same time. This adds bounded_traversal_unique as a new variant to prevent those duplicates. Doing the duplicate detection inside bounded_traversal_unique means we only need to keep scheduled_max copies of keys and can keep it in a regular HashMap, vs doing it at bounded_traversal_stream call sites which would mean keeping a copy for everything in the unscheduled queue and using DashMap or a lock around HashMap. Reviewed By: farnz Differential Revision: D26319137 fbshipit-source-id: 3174ce9e7db4edeb107d26f72575de886e6b2e39	2021-02-12 10:14:44 -08:00
Thomas Orozco	2a21e4fb17	third-party/rust: update Tokio to 0.2.25 + add a patch to disable coop scheduling Summary: See the patch & motivation here: `818f943db3` Reviewed By: StanislavGlebik Differential Revision: D26399890 fbshipit-source-id: e184a3f6c1dd03cb4cdb7ea18073c3392d7ce355	2021-02-12 04:56:23 -08:00
Alex Hornby	3291f638bc	mononoke: add a new test for bounded_traversal_stream's parents Summary: New test to check that the path to each unfolded node is as expected Differential Revision: D26319141 fbshipit-source-id: e67052fd8de3e2e8c6d9287a25f52f9511e9d6c8	2021-02-10 07:34:01 -08:00
Alex Hornby	8cb43ffde8	mononoke: extract bounded_traversal_stream tests in preparation for a new variant Summary: Split out the bounded_traversal_stream test in preparation for a new variant so we can be sure same expectations apply to both. The code to build a test tree was common in a few places I touched, so extracted it to a function. Differential Revision: D23757523 fbshipit-source-id: fbc7844d8445586b13de3a3ccf4f0fb0041bcd6f	2021-02-10 07:34:01 -08:00
Stefan Filip	0a308f9f84	update Cargo.toml after assert_matches update Summary: cargo autocargo Reviewed By: fanzeyi Differential Revision: D26316542 fbshipit-source-id: f9e12a9d7b3b4e03a6f7b074ea2873ad6dcc82ad	2021-02-08 10:23:00 -08:00
Kostia Balytskyi	27fb78b1fa	scuba: add a way to log sampled out rows in verbose mode Summary: This allows us to log sampled messages, but reserves an option of falling back to full verbose logging in critical situations. Note that while this might be a desired behavior in most cases, it's certainly not always the right thing to do: sometimes sampled data needs to remain sampled, even for verbose logging. Reviewed By: ahornby Differential Revision: D26148454 fbshipit-source-id: c6ff9d1b05c9cec4895181e008ef6483884bb483	2021-02-04 13:51:26 -08:00
Thomas Orozco	c88a08b9df	mononoke: add futures_watchdog, a crate to help find stalls Summary: Like it says in the title. This adds a crate that provides a combinator that lets us easily find stalls caused by futures that stay in `poll()` for too long. The goal is to make this minimal overhead for whoever is using it: all you need is to import it + give it a logger. It automatically looks up the line where it's called and gives it back to you in logs. This uses the `track_caller` functionality to make this work. Reviewed By: farnz Differential Revision: D26250068 fbshipit-source-id: a1458e5adebac7eab6c2de458f679c7215147937	2021-02-04 10:40:04 -08:00
Mark Juggurnauth-Thomas	792d18eef6	bounded_traversal: add ordered stream Summary: Add `bounded_traversal_ordered_stream`. This function operates much like `bounded_traversal_stream`, in that it traverses a tree producing a stream of visited leaves. The difference is that the order of produced items is maintained. Key differences are: * The `unfold` method produces a sequence of `OrderedTraversal` nodes, rather than separate output and recursion sequences. The order between `Output` variants and the result of recursively expanding `Recurse` variants is what is maintained. * The `unfold` method, as well as the initial values, must provide estimates of the number of output items that the recursive result expands to. This is used to delay expanding of later items while earlier items are being expanded. * There is an additional dimension to bound. The `queue_max` parameter bounds the size of the queue of unyielded output elements. Recursive steps will not be scheduled for unfolding until there is sufficient capacity in the queue for the items they will produce. The bound is a soft bound: to ensure progress can always be made even if some `unfold` output produce more than `queue_max` elements, the queue is permitted to grow beyond `queue_max` with the output of one additional `unfold` call. Reviewed By: StanislavGlebik Differential Revision: D25867667 fbshipit-source-id: 884bffbeee3862cce56df78084d57ca62089814c	2021-02-02 09:00:17 -08:00
Mark Juggurnauth-Thomas	1cd098181c	bounded_traversal: use standard futures types instead of custom Job Summary: Replace `common::Job` by using `futures::Join` and `futures::Ready`. We still need a heterogeneous variant of `Either`, where the output types of the two futures differ, so extract this from `Job` as `common::Either2`, which returns `either::Either<LeftFuture::Out, RightFuture::Out>`. Reviewed By: ahornby Differential Revision: D25867668 fbshipit-source-id: 13c90b212c64ca5eae67217a1cecd9aee5e40a38	2021-01-29 03:14:41 -08:00
Thomas Orozco	e9656892e8	mononoke: fix some broken oss build Summary: Not much to add .. Guess we gotta update a stub here. Reviewed By: ahornby Differential Revision: D26124590 fbshipit-source-id: efc4f324b5fed15cff46b358c2b491480e9b73fb	2021-01-28 04:29:58 -08:00
Stanislau Hlebik	734928ecb9	mononoke: move functions from rsync admin to copy_utils Summary: I plan to use these functions in the megarepotool, so let's move them to a library that can be used in both. Reviewed By: krallin Differential Revision: D26015773 fbshipit-source-id: 0d2d28d86471c417508494883b69fb64e1bea328	2021-01-27 02:47:04 -08:00
Daniel Xu	5715e58fce	Add version specification to internal dependencies Summary: Lots of generated code in this diff. Only code change was in `common/rust/cargo_from_buck/lib/cargo_generator.py`. Path/git-only dependencies (i.e. `mydep = { path = "../foo/bar" }`) are not publishable to crates.io. However, we are allowed to specify both a path/git _and_ a version. When building locally, the path/git is chosen. When publishing, the version on crates.io is chosen. See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations . Note that I understand that not all autocargo projects are published on crates.io (yet). The point of this diff is to allow projects to slowly start getting uploaded. The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed without further modification. Reviewed By: lukaspiatkowski Differential Revision: D26028982 fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c	2021-01-25 22:10:24 -08:00
Thomas Orozco	4dd3461824	third-party/rust: update Tokio 0.2.x to 0.2.24 & futures 1.x to 1.30 Summary: When we tried to update to Tokio 0.2.14, we hit lots of hangs. Those were due to incompatibilities between Tokio 0.2.14 and Futures 1.29. We fixed some of the bugs (and others had been fixed and were pending a release), and Futures 1.30 has now been released, which unblocks our update. This diff updates Tokio accordingly (the previous diff in the stack fixes an incompatibility). The underlying motivation here is to ease the transition to Tokio 1.0. Ultimately we'll be pulling in those changes one way or another, so let's get started on this incremental first step. Reviewed By: farnz Differential Revision: D25952428 fbshipit-source-id: b753195a1ffb404e0b0975eb7002d6d67ba100c2	2021-01-25 08:06:55 -08:00
Kostia Balytskyi	0d52cff58c	iterhelpers: add chunk_by_accumulation Summary: This implements chunking of the original iterator by saturation of some accumulator until a condition starts to be satisfied. Note: I tried looking through `Vec`, `itertools` and `Iterator` methods, and did not find anything that would allow me to express this easily. Reviewed By: StanislavGlebik Differential Revision: D25947821 fbshipit-source-id: 9e4dd738ecd2ab06ebb69123e4a03059f96b3fb6	2021-01-19 07:13:52 -08:00
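One way to read "chunking by saturation of some accumulator" is sketched below. The signature is an assumption and may differ from the actual `iterhelpers` function: this version takes an initial accumulator, a step function, and a predicate, and closes the current chunk as soon as the predicate holds.

```rust
/// Hypothetical sketch: groups `items` into chunks, folding each item into an
/// accumulator and closing the chunk once `is_saturated` returns true.
pub fn chunk_by_accumulation<T, A, I>(
    items: I,
    init: A,
    mut add: impl FnMut(A, &T) -> A,
    mut is_saturated: impl FnMut(&A) -> bool,
) -> Vec<Vec<T>>
where
    I: IntoIterator<Item = T>,
    A: Clone,
{
    let mut chunks = Vec::new();
    let mut current = Vec::new();
    let mut acc = init.clone();
    for item in items {
        acc = add(acc, &item);
        current.push(item);
        if is_saturated(&acc) {
            // Close this chunk and start a fresh accumulator.
            chunks.push(std::mem::take(&mut current));
            acc = init.clone();
        }
    }
    // Keep any trailing items that never saturated the accumulator.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

For instance, summing integers with a saturation condition of "sum is at least 5" would split `1, 2, 3, 4, 5` into `[1, 2, 3]` and `[4, 5]`.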
Radu Szasz	5fb5d23ec8	Make tokio-0.2 include test-util feature Summary: This feature is useful for testing time-dependent stuff (e.g. it allows you to stop/forward time). It's already included in the buck build. Reviewed By: SkyterX Differential Revision: D25946732 fbshipit-source-id: 5e7b69967a45e6deaddaac34ba78b42d2f2ad90e	2021-01-18 10:38:08 -08:00
Alex Hornby	d17ec72093	mononoke: add log tag filtering to cmdlib Summary: Add arguments to cmdlib so we can filter log messages by the slog tag, using new Drains added in slog_ext. To use tagging from slog the form is: ``` const FOO_TAG: &str = "foo"; info!(logger, #FOO_TAG, "hello foo!"); ``` Reviewed By: krallin Differential Revision: D25837627 fbshipit-source-id: b164d508a2e82a80c4ff6f5f35c0c722257b9a2a	2021-01-15 03:13:27 -08:00


129 Commits