Commit Graph

129 Commits

Author SHA1 Message Date
Thomas Orozco
1ee93bdcfb mononoke/rendezvous: use in-flight connection count to decide when to batch
Summary:
After doing some local benchmarking (using MononokeApi instantiation as the
benchmark), one thing that's apparent is that we have quite a few parameters
here and that tuning them is likely to be a challenge.

One parameter in particular is the batch "objective", which controls how many
requests we want to see in the last batching interval before we choose to
batch (this is `rendezvous_dispatch_min_threshold`).

The problem with this is that there is no good, real-world metric to
set it based on. This is in contrast to the other parameters we have, which do
have some reasonable metric to compare to:

- rendezvous_dispatch_delay_ms: this is overhead we add to queries, so it
  should be small & on the order of query execution latency (i.e. a few ms).
- rendezvous_dispatch_max_threshold: this controls how big our batches get, so
  it should be on the order of what makes a SQL query too big (i.e. less than
  a hundred records).

In contrast, we want to set `rendezvous_dispatch_min_threshold` such that
batching kicks in before we start using too many concurrent connections (which
is what query batching seeks to reduce), but the problem is that those two
numbers aren't directly connected. One clear problem, for example, is that if
our DB is in-region vs. out-of-region, then for a given query execution time,
and a desired concurrency level before batching kicks in, we'd need different
values of `rendezvous_dispatch_min_threshold` (it would have to kick in faster
for the out-of-region workload).

So, this diff updates rendezvous to actually track concurrent connection count
before we force batching. This is the actual metric we care about here, and it
has pretty natural "real world" values we can look at to decide where to set
it (our connection pool, which is limited to 100 concurrent connections, and
our open connection baseline).

Note: I set this at 5 because that's more or less what servers look like
outside of spikes for Bonsai hg mapping, and for Changesets, where I'm planning
to introduce this in the future:

- bonsai: https://fburl.com/ods/6d4a9qb5
- changesets: https://fburl.com/ods/kuq5x1vw (note: to make sense of this,
  focus on just one server, otherwise the constant spikes we get sort of hide
  the big picture).
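
To illustrate the decision described above, here is a minimal std-only sketch
(not the code in this diff). `InFlight`, `Permit` and `try_dispatch_now` are
hypothetical names; the only value taken from this summary is the threshold
of 5. A query goes out immediately while fewer than `free_connections` queries
are in flight, and is told to wait for a batch otherwise.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical tracker: counts queries currently holding a connection.
pub struct InFlight {
    count: AtomicUsize,
    /// Connections we allow before batching kicks in (5 in this diff).
    free_connections: usize,
}

/// Guard that gives its slot back when dropped.
pub struct Permit(Arc<InFlight>);

impl Drop for Permit {
    fn drop(&mut self) {
        self.0.count.fetch_sub(1, Ordering::SeqCst);
    }
}

impl InFlight {
    pub fn new(free_connections: usize) -> Arc<Self> {
        Arc::new(Self {
            count: AtomicUsize::new(0),
            free_connections,
        })
    }
}

/// Try to dispatch right away. Returns `None` when the caller should wait
/// for the next batch instead of opening one more connection.
pub fn try_dispatch_now(tracker: &Arc<InFlight>) -> Option<Permit> {
    let previous = tracker.count.fetch_add(1, Ordering::SeqCst);
    if previous < tracker.free_connections {
        Some(Permit(tracker.clone()))
    } else {
        // Over the limit: undo the increment and ask the caller to batch.
        tracker.count.fetch_sub(1, Ordering::SeqCst);
        None
    }
}
```

With `InFlight::new(5)`, the sixth concurrent caller gets `None` until one of
the first five permits is dropped.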

Reviewed By: farnz

Differential Revision: D27792603

fbshipit-source-id: 1a9189f6b50d48444b3373bd1cb14dc51b85a6d2
2021-04-16 10:27:44 -07:00
Kostia Balytskyi
fc3908e9fa repo_client: log full gettreepack args to scuba in verbose mode
Summary:
This will allow us to have greater visibility into what's going on when there are production issues.

Note: for getpack, the params data model is `[MPath, [Node]]`. In practice there seems to always just be 1 node per mpath. However, to preserve the mapping, I log every mpath in a separate sample.

Reviewed By: ahornby

Differential Revision: D26690685

fbshipit-source-id: 36616256747b61390b0435467892daeff2b4dd07
2021-04-14 08:29:59 -07:00
Thomas Orozco
3c88bd8832 mononoke/timeseries: track count of valid buckets
Summary:
It's useful when operating with timeseries to know what range of data has been
populated. This diff adds support for this in mononoke/timeseries, by tracking
the number of buckets that fall within intervals where data was provided.
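
As a rough illustration (a sketch under assumptions, not the crate's API):
`Series`, `update` and `valid_buckets` below are hypothetical names, and an
update is assumed to cover a half-open range of bucket indices. A bucket
counts as valid once any update has covered it.

```rust
/// Hypothetical fixed-size series that remembers which buckets were covered
/// by an update.
pub struct Series {
    values: Vec<u64>,
    valid: Vec<bool>,
}

impl Series {
    pub fn new(len: usize) -> Self {
        Self {
            values: vec![0; len],
            valid: vec![false; len],
        }
    }

    /// Report `value` for buckets `start..end` (end exclusive, clamped to the
    /// series length). Each covered bucket keeps the max value seen.
    pub fn update(&mut self, start: usize, end: usize, value: u64) {
        let end = end.min(self.values.len());
        for i in start..end {
            self.values[i] = self.values[i].max(value);
            self.valid[i] = true;
        }
    }

    /// Number of buckets that fall within intervals where data was provided.
    pub fn valid_buckets(&self) -> usize {
        self.valid.iter().filter(|v| **v).count()
    }
}
```

A caller can compare `valid_buckets()` to the series length to tell a
genuinely low reading from a series that has not been filled in yet.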

Reviewed By: mitrandir77

Differential Revision: D27734229

fbshipit-source-id: 3058a7ce4da67666e8ce8a46e34e277b69153ea4
2021-04-13 06:24:37 -07:00
Thomas Orozco
87aed04d37 mononoke/sql_ext: publish SQL max open connections stat
Summary:
Like it says in the title, this adds support for publishing our max open
connections to ODS. Note that this is a little more involved than I would like
for it to be, but there is no way to get direct access to this information.

This means, we need to:

- Expose how many open connections we have in flight (this is done earlier in
  this stack in the Rust MySQL bindings).
- Periodically get this information out of MySQL and put it in a timeseries.
- Get the max out of said timeseries and publish it to a counter so that it can
  be fetched in ODS.

This is what this diff does. Note that I've only done this for read pools,
largely because I think they're the ones we tend to exhaust the most and I'd
like to see if there is value in exposing those counters before I use them.

We do the aggregation on a dedicated thread here. I contemplated making this a
Tokio task, but I figured making it a thread would make it easier to see if
it's misbehaving in any way (also: note that the SQL client allocates a bunch
of threads already anyway).

Reviewed By: HarveyHunt

Differential Revision: D27678955

fbshipit-source-id: c7b386f3a182bae787d77e997d108d8a74a6402b
2021-04-13 03:05:23 -07:00
Thomas Orozco
d677947066 metagit/hosts-down-tailer: use mononoke/common/timeseries
Summary:
Like it says in the title. This is a place where we use timeseries so we might
as well use that shared crate.

Reviewed By: mzr

Differential Revision: D27678389

fbshipit-source-id: 9b5d4980a1ddb5ce2a01c8ef417c78b1c3da80b7
2021-04-12 05:22:33 -07:00
Thomas Orozco
e64012ad9e mononoke/timeseries: introduce a basic crate for tracking time series
Summary:
I'd like to be able to track time series for access within Mononoke. The
underlying use case here is that I want to be able to track the max count of
connections in our SQL connection pools over time (and possibly other things in
the future).

Now, the obvious question is: why am I rolling my own? Well, as it turns out,
there isn't really an implementation of this that I can reuse:

- You might expect to be able to track the max of a value via fb303, but you
  can't:

https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/fb303/ExportType.h?commit=0405521ec858e012c0692063209f3e13a2671043&lines=26-29

- You might go look in Folly, but you'll find that the time series there only
  supports tracking Sum & Average, but I want my timeseries to track Max (and
  in fact I'd like it to be sufficiently flexible to track anything I want):

https://www.internalfb.com/intern/diffusion/FBS/browse/master/fbcode/folly/stats/BucketedTimeSeries.h

It's not the first time I've run into a need for something like this. I needed
it in RendezVous to track connections over the last 2 N millisecond intervals,
and we needed it in metagit for host draining as well (note that the
implementation here is somewhat inspired by the implementation there).
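
To make the "flexible enough to track anything" idea concrete, here is a
minimal sketch (not the crate introduced in this diff). `TimeSeries`, `update`
and `iter` are hypothetical names, and intervals are plain integer indices
rather than real timestamps. Each bucket is a caller-chosen type `T`, and the
caller decides how to fold a value into the newest bucket.

```rust
use std::collections::VecDeque;

/// Hypothetical window over the last `capacity` intervals, one bucket each.
pub struct TimeSeries<T> {
    buckets: VecDeque<T>,
    capacity: usize,
    /// Index of the interval held by the newest bucket.
    newest: u64,
}

impl<T: Default> TimeSeries<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "a time series needs at least one bucket");
        let mut buckets = VecDeque::with_capacity(capacity);
        buckets.push_back(T::default());
        Self {
            buckets,
            capacity,
            newest: 0,
        }
    }

    /// Advance to `interval`, dropping buckets that fell out of the window,
    /// then let the caller fold a value into the newest bucket. Updates for
    /// an interval older than the newest one land in the newest bucket.
    pub fn update(&mut self, interval: u64, fold: impl FnOnce(&mut T)) {
        while self.newest < interval {
            if self.buckets.len() == self.capacity {
                self.buckets.pop_front();
            }
            self.buckets.push_back(T::default());
            self.newest += 1;
        }
        if let Some(bucket) = self.buckets.back_mut() {
            fold(bucket);
        }
    }

    /// Buckets from oldest to newest.
    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        self.buckets.iter()
    }
}
```

Tracking a max then looks like `ts.update(t, |b| *b = (*b).max(n))` followed
by `ts.iter().copied().max()`. A sum or average only changes the closure.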

Reviewed By: mzr

Differential Revision: D27678388

fbshipit-source-id: ba6d244b8bb848d4e1a12f9c6f54e3aa729f6c9c
2021-04-12 05:22:33 -07:00
Thomas Orozco
c934b67e5b mononoke: remove all trivial usage of async-unit
Summary:
I'd like to just get rid of that library since it's one more place where we
specify the Tokio version and that's a little annoying with the Tokio 1.x
update. Besides, this library is largely obsoleted by `#[fbinit::test]` and
`#[tokio::test]`.

Reviewed By: farnz

Differential Revision: D27619147

fbshipit-source-id: 4a316b81d882ea83c43bed05e873cabd2100b758
2021-04-07 07:26:57 -07:00
Aida Getoeva
01b38dfa5e mononoke/mysql: log connection/query ODS counters by the shardmap and label
Summary:
It is useful to have latency stats grouped by the shardmap and label to easily identify where the problem comes from if something is broken.

This diff switches a single histogram used for all the MySQL use-cases into a set of histograms: one per `shardmap:label`. It also makes the histograms a bit smaller as we don't actually have such big numbers as 10s per conn/query.

There is only one case where the histogram is created per shard instead of per shardmap: the `xdb.hgsql` DB with 9 shards. This happens because we connect to each shard as if it were an individual tier: https://fburl.com/diffusion/um8lt7cr.

{F582699426}

Reviewed By: farnz

Differential Revision: D27503833

fbshipit-source-id: 40c7eb64df7ae0694f63d3644231f240df8212ec
2021-04-07 05:14:03 -07:00
Thomas Orozco
a1e2833377 mononoke/rendezvous: reduce histogram size
Summary:
There was no reason for this to be this large, and it's causing issues with
repo construction since it's pretty expensive to construct as a result
(D27501915 (69896e90b5)).

Let's just make it much smaller.

Reviewed By: StanislavGlebik

Differential Revision: D27591073

fbshipit-source-id: 1c986cb922d70b10c39711c57ac9f5899ed7496c
2021-04-06 06:53:58 -07:00
Mark Juggurnauth-Thomas
69896e90b5 bonsai_hg_mapping: construct rendezvous connections in a blocking closure
Summary:
`RendezVousConnection::new` can block for some time doing work on the CPU,
specifically creating the stats objects.  This causes problems for other
futures during repo construction.

Instead, move rendez-vous construction to a `spawn_blocking` closure, so that
it doesn't interfere with the other futures.

Since `SqlBonsaiHgMapping::from_sql_connections` is not async, and is part of
the SqlConstruct trait, we must convert this to the builder pattern so that we
can defer rendez-vous construction to later on.

Reviewed By: farnz

Differential Revision: D27501915

fbshipit-source-id: 9c58c32411301128424985deeab127d052c43532
2021-04-01 08:27:15 -07:00
Gus Wynn
fc46c24e8f update tokio to 1.4.0
Summary:
https://github.com/tokio-rs/tokio/releases/tag/tokio-1.4.0
I want the `biased;` option in `tokio::select!`

Reviewed By: ahornby

Differential Revision: D27435341

fbshipit-source-id: c29ca954c319327f62466131ae04483ad091bf49
2021-03-31 10:44:20 -07:00
Stanislau Hlebik
a8b983db80 mononoke: Back out "mononoke/mysql: group ODS counters by shardmap"
Summary:
Original commit changeset: 0708a4b0dc37

It seems to be the cause of SQL timeouts on Mononoke startup.

Differential Revision: D27337030

fbshipit-source-id: 7b154c09397b0e297e18b186a6338ab801b1769d
2021-03-26 01:01:37 -07:00
Mark Juggurnauth-Thomas
64461bb361 test_repo_factory: use test factory for remaining tests
Summary: Use the test factory for the remaining existing tests.

Reviewed By: StanislavGlebik

Differential Revision: D27169443

fbshipit-source-id: 00d62d7794b66f5d3b053e8079f09f2532d757e7
2021-03-25 07:34:51 -07:00
Aida Getoeva
5f67b9dde7 mononoke/mysql: group ODS counters by shardmap
Summary:
Grouping stats by the shardmap can help to detect and root-cause issues.
This diff adds a label to the `MysqlConnection`, and Mononoke will now log counters by shardmap.

Reviewed By: StanislavGlebik

Differential Revision: D26994369

fbshipit-source-id: 0708a4b0dc3762f5f9152b83200173cd8b241abc
2021-03-23 11:07:26 -07:00
Thomas Orozco
a3a0347639 mononoke/rendezvous: introduce query batching
Summary:
This introduces a basic building block for query batching. I called this
rendezvous, since it's about multiple queries meeting up in the same place :)

There are a few (somewhat conflicting) goals this tries to satisfy, so let's go
over them:

1), we'd like to reduce the total number of queries made by batch jobs. For
example, group hg bonsai lookups made by the walker. Those jobs are
characterized by the fact that they have a lot of queries to make, all the
time. Here's an example: https://fburl.com/ods/zuiep7yh.

2), we'd like to reduce the overall number of connections held to MySQL by
our tasks. The main way we achieve this is by reducing the maximum number of
concurrent queries. Indeed, a high total number of queries doesn't necessarily
result in a lot of connections as long as they're not concurrent, because we
can reuse connections. On the other hand, if you dispatch 100 concurrent
queries, that _does_ use 100 connections. This is something that applies to
batch jobs due to their query volume, but also to "interactive" jobs like
Mononoke Server or SCS, just not all the time. Here's an example:
https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but
sometimes spikes substantially).

2.1) It's also worth noting that concurrent queries are often the result of
many clients wanting the same data, so deduplication is also useful here.

3), we also don't want to impact the latency of interactive jobs when they
need to do a little query here or there (i.e. it's largely fine if our jobs all
hold a few connections to MySQL and use them somewhat consistently).

4), we'd like this to make it easier to do batching right. For example, if
you have 100 Bonsais to map to hg, you should be able to just map and call
`future::try_join_all` and have that do the right thing.

5), we don't want "bad" queries to affect other queries negatively. One
example would be the occasional queries we make to Bonsai <-> Hg mapping in
`known` for thousands (if not more) of rows.

6), we want this to be easy to incorporate into the codebase.

So, how do we try to address all of this? Here's how:

- We ... do batching, and we deduplicate requests in a batch. This is the
  easier bit and should address #1, #2 and #2.1, #4.
- However, batching is conditional. We notably don't batch very large requests
  with the rest (addresses #5). We also don't batch small queries all the time:
  we only batch if we are observing a throughput of queries that suggests we
  can find some benefit in batching (this targets #3).
- Finally, we have some utilities for common cases like having to group by repo
  id (this is `MultiRendezVous`), and this is all configurable via tunables
  (and the default is to not do anything).
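
For illustration only, the first two bullets can be sketched like this (this
is not the RendezVous implementation). `plan_batch` and `MAX_BATCHED_REQUEST`
are hypothetical, keys are plain `u32`s, and the throughput check from the
second bullet is left out. Small requests are merged into one deduplicated
batch, while oversized requests are kept apart so they can't slow down
everyone else.

```rust
use std::collections::BTreeSet;

/// Hypothetical cap: requests with more keys than this are not batched.
const MAX_BATCHED_REQUEST: usize = 3;

/// Split pending requests into one deduplicated batch, plus the oversized
/// requests that should be dispatched on their own.
pub fn plan_batch(requests: &[Vec<u32>]) -> (BTreeSet<u32>, Vec<&Vec<u32>>) {
    let mut batch = BTreeSet::new();
    let mut standalone = Vec::new();
    for request in requests {
        if request.len() > MAX_BATCHED_REQUEST {
            // Very large request: keep it out of the shared batch (#5).
            standalone.push(request);
        } else {
            // The set deduplicates keys wanted by several callers (#2.1).
            batch.extend(request.iter().copied());
        }
    }
    (batch, standalone)
}
```

Each caller would then pick its own keys out of the single batched response.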

Reviewed By: StanislavGlebik

Differential Revision: D27010317

fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07
2021-03-19 08:50:40 -07:00
Mark Juggurnauth-Thomas
33ec4db653 bounded_traversal: require futures to be boxed
Summary:
Bounded traversal's internal book-keeping moves the futures returned from fold and unfold callbacks around while they are being queued to be scheduled.  If these futures are large, then this can result in a significant portion of bounded traversal's CPU time being spent on `memcpy`ing these futures around.

This can be prevented by always boxing the futures that are returned to bounded traversal.  Make this a requirement by changing the type from `impl Future<...>` to `BoxFuture<...>`.
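
The size difference is easy to see with std alone. This is a sketch: the
`BoxFuture` alias below mirrors the shape of `futures::future::BoxFuture`, and
`large_future` and `sizes` are hypothetical helpers. A future that owns a
large buffer is that large wherever it is moved, while the boxed version is a
single fat pointer.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;

/// Same shape as `futures::future::BoxFuture`, spelled out with std only.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// A future that carries a large buffer in its own state.
pub fn large_future() -> impl Future<Output = usize> + Send {
    let buf = [1u8; 4096];
    async move { buf.iter().map(|b| *b as usize).sum::<usize>() }
}

/// Returns (inline size, boxed size) in bytes.
pub fn sizes() -> (usize, usize) {
    let inline = large_future();
    let inline_size = size_of_val(&inline);
    // Boxing moves the state to the heap once; later moves only copy the
    // pointer and vtable.
    let boxed: BoxFuture<'static, usize> = Box::pin(inline);
    (inline_size, size_of_val(&boxed))
}
```

Here the inline future is at least 4096 bytes, whereas the boxed one is two
machine words, which is what gets moved around while it sits in a queue.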

Reviewed By: mitrandir77

Differential Revision: D26997706

fbshipit-source-id: 23a3583adc23c4e7d3607a78e82fc9d1056691c3
2021-03-12 08:12:57 -08:00
Mark Juggurnauth-Thomas
91358f3716 mononoke_types: use SortedVectorMap for BonsaiChangeset
Summary:
BonsaiChangesets are rarely mutated, and their maps are stored in sorted order,
so we can use `SortedVectorMap` to load them more efficiently.

In the cases where mutable maps of filechanges are needed, we can use `BTreeMap`
during the mutation and then convert them to `SortedVectorMap` to store them.
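
A minimal sketch of that mutate-then-store flow, assuming a simplified
stand-in type: `SortedVecMap` below is hypothetical and is not the real
`SortedVectorMap` API. Because `BTreeMap` iterates in key order, converting it
into a sorted vector needs no extra sort, and lookups become binary searches
over contiguous memory.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a sorted-vector map: entries are kept sorted by
/// key in one contiguous Vec.
pub struct SortedVecMap<K, V>(Vec<(K, V)>);

impl<K: Ord, V> SortedVecMap<K, V> {
    /// Look up a key by binary search.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.0
            .binary_search_by(|(k, _)| k.cmp(key))
            .ok()
            .map(|i| &self.0[i].1)
    }
}

impl<K: Ord, V> From<BTreeMap<K, V>> for SortedVecMap<K, V> {
    /// BTreeMap yields entries in key order, so the Vec is already sorted.
    fn from(map: BTreeMap<K, V>) -> Self {
        SortedVecMap(map.into_iter().collect())
    }
}
```

Code that needs to edit file changes would build a `BTreeMap`, mutate it
freely, and call `SortedVecMap::from(map)` just before storing.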

Reviewed By: mitrandir77

Differential Revision: D25615279

fbshipit-source-id: 796219c1130df5cb025952bb61002e8d2ae898f4
2021-03-11 04:28:43 -08:00
Thomas Orozco
306c9dc658 mononoke: update async_limiter to tokio_shim
Summary:
This is depended on by Metagit, and I'd like to update Metagit to Tokio 1.0,
possibly independently of Mononoke.

Reviewed By: farnz

Differential Revision: D26945751

fbshipit-source-id: 552c831964f31d155783af87e7931b2c824e2471
2021-03-10 11:01:28 -08:00
Andrey Chursin
0be8e8ce29 vfs: introduce AsyncVfs
Summary:
AsyncVfs provides async vfs interface.
It will be used in the native checkout instead of the current approach, which spawns blocking tokio tasks for VFS actions.

Reviewed By: quark-zju

Differential Revision: D26801250

fbshipit-source-id: bb26c4fc8acac82f4b55bb3f2f3964a6d0b64014
2021-03-05 21:47:51 -08:00
Thomas Orozco
2a803fc10d third-party/rust: update futures
Summary:
Those newer versions of Futures have compatibility improvements with Tokio,
notably:

- https://github.com/rust-lang/futures-rs/pull/2333
- https://github.com/rust-lang/futures-rs/pull/2358

Reviewed By: farnz

Differential Revision: D26778794

fbshipit-source-id: 5a9dc002083e5edfa5c614d8d2242e586a93fcf6
2021-03-04 06:42:55 -08:00
Thomas Orozco
ef7045e818 common/rust: use fbinit-tokio
Summary:
This diff adds a layer of indirection between fbinit and tokio, thus allowing
us to use fbinit with tokio 0.2 or tokio 1.x.

The way this works is that you specify the Tokio you want by adding it as an
extra dependency alongside `fbinit` in your `TARGETS` (before this, you had to
always include `tokio-02`).

If you use `fbinit-tokio`, then `#[fbinit::main]` and `#[fbinit::test]` get you
a Tokio 1.x runtime, whereas if you use `fbinit-tokio-02`, you get a Tokio 0.2
runtime.

This diff is big, because it needs to change all the TARGETS that reference
this in the same diff that introduces the mechanism. I also didn't produce it
by hand.

Instead, I scripted the transformation using this script: P242773846

I then ran it using:

```
{ hg grep -l "fbinit::test"; hg grep -l "fbinit::main"; } | \
  sort | \
  uniq | \
  xargs ~/codemod/codemod.py \
&&  yes | arc lint \
&& common/rust/cargo_from_buck/bin/autocargo
```

Finally, I grabbed the files returned by `hg grep`, then fed them to:

```
arc lint-rust --paths-from ~/files2 --apply-patches --take RUSTFIXDEPS
```

(I had to modify the file list a bit: notably I removed stuff from scripts/ because
some of that causes Buck to crash when running lint-rust, and I also had to add
fbcode/ as a prefix everywhere).

Reviewed By: mitrandir77

Differential Revision: D26754757

fbshipit-source-id: 326b1c4efc9a57ea89db9b1d390677bcd2ab985e
2021-03-03 04:09:15 -08:00
Lukas Piatkowski
f317302b0f autocargo v1: reformatting of oss-dependencies, workspace and patch sections and thrift files to match v2
Summary:
For dependencies V2 puts "version" as the first attribute of dependency or just after "package" if present.
Workspace section is after patch section in V2 and since V2 autoformats patch section then the third-party/rust/Cargo.toml manual entries had to be formatted manually since V1 takes it as it is.
The thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that have been incorrectly left when the corresponding Cargo.toml was removed.

Reviewed By: ikostia

Differential Revision: D26618363

fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92
2021-02-25 15:10:56 -08:00
Thomas Orozco
8e626f0c02 mononoke/async_limiter: use tokio_shim
Summary:
I'd like to prepare the migration to Tokio 1.0 and this is one bit of code
that needs some non-trivial changes since in Tokio 1.0, Sleep is no longer
Unpin.

Reviewed By: farnz

Differential Revision: D26610033

fbshipit-source-id: 1db4c1686fcd010e2158bcf4bb25f1e15dd19603
2021-02-25 02:11:30 -08:00
Michael Samoylenko
1043ffb436 Fix warnings
Summary:
```
misa@devvm1346 /d/u/m/f/f/ctp (default)> cargo check --tests
...
warning: unused import: `FutureExt`
  --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/oss.rs:14:30
   |
14 | use futures_ext::{BoxFuture, FutureExt};
   |                              ^^^^^^^^^
   |
   = note: `#[warn(unused_imports)]` on by default

warning: unused import: `futures_old::future::ok`
  --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/oss.rs:15:5
   |
15 | use futures_old::future::ok;
   |     ^^^^^^^^^^^^^^^^^^^^^^^

warning: unused variable: `pool_config`
   --> /data/users/misa/fbsource/fbcode/eden/mononoke/common/rust/sql_ext/src/lib.rs:109:32
    |
109 |                 Self::Mysql(_, pool_config) => {
    |                                ^^^^^^^^^^^ help: if this is intentional, prefix it with an underscore: `_pool_config`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: 3 warnings emitted
```

Reviewed By: farnz

Differential Revision: D26637776

fbshipit-source-id: a7309fa02f4b40fc46a6be0bb64ecb9eceefc104
2021-02-24 11:01:02 -08:00
Alex Hornby
da20841ec9 mononoke: shard walk by sql tier and shard number
Summary:
Scrubbing a repo is highly concurrent as it's mostly IO bound.  As such we can end up waiting on the SQL connection pool for connections when it allows fewer than scheduled_max connections.

This change makes bounded_traversal_unique calls from the walker aware of the database tier and shard a Node may connect to, so that execution can be limited to the bounds of what the connection pool can support without waiting.

We still end up waiting for the connection, but now it's done in bounded_traversal_unique, rather than in connection pool code, and are thus a) able to process other Nodes while waiting and b) not subject to connection pool timeouts.
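
A rough std-only sketch of the idea (not the walker's code): `ShardKey`,
`ShardedScheduler` and its methods are hypothetical, and the tier names in any
usage are made up. Nodes whose shard is already at its limit stay queued while
nodes for other shards, or nodes that need no database at all, go ahead.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical key: (tier name, shard number).
pub type ShardKey = (&'static str, usize);

pub struct ShardedScheduler<N> {
    queue: VecDeque<(Option<ShardKey>, N)>,
    in_flight: HashMap<ShardKey, usize>,
    per_key_limit: usize,
}

impl<N> ShardedScheduler<N> {
    pub fn new(per_key_limit: usize) -> Self {
        Self {
            queue: VecDeque::new(),
            in_flight: HashMap::new(),
            per_key_limit,
        }
    }

    /// Queue a node. `None` means the node does not touch a database.
    pub fn enqueue(&mut self, key: Option<ShardKey>, node: N) {
        self.queue.push_back((key, node));
    }

    /// Pop the first queued node whose shard still has capacity.
    pub fn next_runnable(&mut self) -> Option<(Option<ShardKey>, N)> {
        let limit = self.per_key_limit;
        let in_flight = &self.in_flight;
        let pos = self.queue.iter().position(|(key, _)| match key {
            Some(k) => in_flight.get(k).copied().unwrap_or(0) < limit,
            None => true,
        })?;
        let (key, node) = self.queue.remove(pos)?;
        if let Some(k) = key {
            *self.in_flight.entry(k).or_insert(0) += 1;
        }
        Some((key, node))
    }

    /// Mark a previously returned node as finished, freeing its slot.
    pub fn complete(&mut self, key: Option<ShardKey>) {
        if let Some(k) = key {
            if let Some(n) = self.in_flight.get_mut(&k) {
                *n = n.saturating_sub(1);
            }
        }
    }
}
```

The waiting happens in the scheduler's own queue, so a blocked shard never
holds up nodes that could run elsewhere.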

Differential Revision: D26524074

fbshipit-source-id: 19125388c730f5cef7e9de34b5b550efa8e6b825
2021-02-23 11:07:48 -08:00
Alex Hornby
aa8f84ad4c mononoke: async myrouter_ready()
Summary: Small clean up. Allows us to pass Logger by reference, removing the FIXME in blobrepo factory

Reviewed By: farnz

Differential Revision: D26551592

fbshipit-source-id: d6bb04b8bb3034ad056f071b67b5ae0ce3c6f224
2021-02-23 10:55:42 -08:00
Alex Hornby
199538edfa mononoke: expose per_key_limit() in sql_ext
Summary: It's useful to know the max number of connections to a shard that it is sensible to open. Used in next diff in stack.

Differential Revision: D26543419

fbshipit-source-id: 57e7c3295a5b5db1572f26954ae0dfb04c84b374
2021-02-23 02:51:50 -08:00
Alex Hornby
5cafeff1db mononoke: allow bounded_traversal_unique steps to output None
Summary:
The walker mostly checks for duplicates before emitting a new edge, at the same time recording the edge as visited to prevent duplicate edges.

However for derived data where the node may or may not be present, the node isn't considered visited until the node data is successfully loaded and seen in state.rs record_resolved_visit().
In such cases multiple copies of a node could be enqueued, and then we need to run each one.

With this change, where the walker can detect that such a step has completed previously, it will now short circuit the step and return None.

Differential Revision: D26369917

fbshipit-source-id: c2bdbbabfaa80dbb7cc7d2bc25a17230531ae111
2021-02-23 02:51:49 -08:00
Thomas Orozco
f06dc957ae mononoke: log identities as norm vector
Summary:
In EdenAPI this is logged as a vector (and in all our other services), but in
Mononoke Server we log it as a string. Let's fix this up. This is worth doing
now since right now we end up logging to 2 columns with the same name and a
different type.

Reviewed By: ahornby

Differential Revision: D26542737

fbshipit-source-id: 2f12c9e475061b1c21c71bade99b83cc070006e8
2021-02-22 11:48:47 -08:00
Thomas Orozco
097e4ad00c mononoke: remove tokio-compat (i.e. use tokio 0.2 exclusively)
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.

We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!

Note that we still have a few transitive dependencies on Tokio 0.1:

- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests

This isn't the end of the world though, we can live with that :)

Reviewed By: ahornby

Differential Revision: D26544410

fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
2021-02-22 09:22:42 -08:00
Thomas Orozco
f7d5b3db46 mononoke: remove tokio-timer dependencies
Summary: Like it says in the title.

Reviewed By: ahornby

Differential Revision: D26513724

fbshipit-source-id: 5d1f986af17c948ad24e3d378a7623a0d97f5cf4
2021-02-22 09:22:41 -08:00
Thomas Orozco
0734a61cb1 common/rust: remove tracing
Summary:
This was a thing that was only ever used in Mononoke, and we don't think it's
usable and haven't been using it. Let's get rid of it. As-is, it won't even work
for most people due to its (indirect) dependency on Tokio 0.1.

Reviewed By: StanislavGlebik

Differential Revision: D26512243

fbshipit-source-id: faa16683f2adb20dfba43c4768486b982bc02ff9
2021-02-22 09:22:41 -08:00
Lukas Piatkowski
cd0b6d50e2 autocargo v1: changes to match autocargo v2 generation results.
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously we only ignored rust_bindgen_library, so binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.

Reviewed By: zertosh

Differential Revision: D26544150

fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
2021-02-19 11:03:55 -08:00
Lukas Piatkowski
87ddbe2f74 autocargo v1: update autocargo field format to allow transition to autocargo v2
Summary:
Autocargo V2 will use a more structured format for autocargo field
with the help of `cargo_toml` crate it will be easy to deserialize and handle
it.

Also the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might often be wrong, especially if someone tries to publish a package from fbcode, in which case the private facebook folders might be shipped. Let's just not set it; in the new system one will be able to set it explicitly via an autocargo parameter on a rule.

Reviewed By: ahornby

Differential Revision: D26339606

fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
2021-02-12 23:28:25 -08:00
Alex Hornby
5d7a62e13c mononoke: add duplicate detecting variant of bounded traversal stream
Summary:
Sometimes many nodes will unfold with a common child, in which case it is desirable that they aren't scheduled at the same time.

This adds bounded_traversal_unique as a new variant to prevent those duplicates.

Doing the duplicate detection inside bounded_traversal_unique means we only need to keep scheduled_max copies of keys and can keep it in a regular HashMap, vs doing it at bounded_traversal_stream call sites which would mean keeping a copy for everything in the unscheduled queue and using DashMap or a lock around HashMap.
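
To show why the bookkeeping stays small, here is a minimal sketch (not the
bounded_traversal_unique implementation). `UniqueScheduler` and its methods
are hypothetical. The set only ever holds keys that are currently scheduled,
so it is bounded by `scheduled_max`, and because a single owner drives it, a
plain `HashSet` is enough.

```rust
use std::collections::{HashSet, VecDeque};
use std::hash::Hash;

pub struct UniqueScheduler<K, T> {
    unscheduled: VecDeque<(K, T)>,
    /// Keys currently running; never larger than `scheduled_max`.
    scheduled: HashSet<K>,
    scheduled_max: usize,
}

impl<K: Hash + Eq + Clone, T> UniqueScheduler<K, T> {
    pub fn new(scheduled_max: usize) -> Self {
        Self {
            unscheduled: VecDeque::new(),
            scheduled: HashSet::new(),
            scheduled_max,
        }
    }

    pub fn push(&mut self, key: K, item: T) {
        self.unscheduled.push_back((key, item));
    }

    /// Schedule the first queued item whose key is not already running, as
    /// long as fewer than `scheduled_max` items are running.
    pub fn schedule_next(&mut self) -> Option<(K, T)> {
        if self.scheduled.len() >= self.scheduled_max {
            return None;
        }
        let scheduled = &self.scheduled;
        let pos = self
            .unscheduled
            .iter()
            .position(|(k, _)| !scheduled.contains(k))?;
        let (key, item) = self.unscheduled.remove(pos)?;
        self.scheduled.insert(key.clone());
        Some((key, item))
    }

    /// Mark a key as finished so a queued duplicate may run.
    pub fn finish(&mut self, key: &K) {
        self.scheduled.remove(key);
    }
}
```

Two parents that unfold to the same child both get queued, but the second
copy is only scheduled after the first one has finished.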

Reviewed By: farnz

Differential Revision: D26319137

fbshipit-source-id: 3174ce9e7db4edeb107d26f72575de886e6b2e39
2021-02-12 10:14:44 -08:00
Thomas Orozco
2a21e4fb17 third-party/rust: update Tokio to 0.2.25 + add a patch to disable coop scheduling
Summary:
See the patch & motivation here:

818f943db3

Reviewed By: StanislavGlebik

Differential Revision: D26399890

fbshipit-source-id: e184a3f6c1dd03cb4cdb7ea18073c3392d7ce355
2021-02-12 04:56:23 -08:00
Alex Hornby
3291f638bc mononoke: add a new test for bounded_traversal_stream's parents
Summary: New test to check that the path to each unfolded node is as expected

Differential Revision: D26319141

fbshipit-source-id: e67052fd8de3e2e8c6d9287a25f52f9511e9d6c8
2021-02-10 07:34:01 -08:00
Alex Hornby
8cb43ffde8 mononoke: extract bounded_traversal_stream tests in preparation for a new variant
Summary:
Split out the bounded_traversal_stream test in preparation for a new variant so we can be sure the same expectations apply to both.

The code to build a test tree was common in a few places I touched, so extracted it to a function.

Differential Revision: D23757523

fbshipit-source-id: fbc7844d8445586b13de3a3ccf4f0fb0041bcd6f
2021-02-10 07:34:01 -08:00
Stefan Filip
0a308f9f84 update Cargo.toml after assert_matches update
Summary: cargo autocargo

Reviewed By: fanzeyi

Differential Revision: D26316542

fbshipit-source-id: f9e12a9d7b3b4e03a6f7b074ea2873ad6dcc82ad
2021-02-08 10:23:00 -08:00
Kostia Balytskyi
27fb78b1fa scuba: add a way to log sampled out rows in verbose mode
Summary:
This allows us to log sampled messages, but reserves an option of falling back to full verbose logging in critical situations.

Note that while this might be a desired behavior in most cases, it's certainly not always the right thing to do: sometimes sampled data needs to remain sampled, even for verbose logging.

Reviewed By: ahornby

Differential Revision: D26148454

fbshipit-source-id: c6ff9d1b05c9cec4895181e008ef6483884bb483
2021-02-04 13:51:26 -08:00
Thomas Orozco
c88a08b9df mononoke: add futures_watchdog, a crate to help find stalls
Summary:
Like it says in the title. This adds a crate that provides a combinator that
lets us easily find stalls caused by futures that stay in `poll()` for too
long.

The goal is to make this minimal overhead for whoever is using it: all you need
is to import it + give it a logger. It automatically looks up the line where
it's called and gives it back to you in logs. This uses the `track_caller`
functionality to make this work.
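
Here is a minimal sketch of how such a combinator can be put together with
std only (this is not the futures_watchdog API). `Watched`, `watched`,
`max_poll` and `slow_polls` are hypothetical, and a `Vec<String>` stands in
for the logger. `#[track_caller]` together with `Location::caller()` records
where the wrapper was created, and each `poll()` of the inner future is timed.

```rust
use std::future::Future;
use std::panic::Location;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::{Duration, Instant};

/// Hypothetical wrapper that times every `poll()` of the inner future.
pub struct Watched<F> {
    inner: Pin<Box<F>>,
    location: &'static Location<'static>,
    max_poll: Duration,
    /// Stand-in for a logger: one entry per slow poll.
    pub slow_polls: Vec<String>,
}

/// `#[track_caller]` makes `Location::caller()` report the caller's line.
#[track_caller]
pub fn watched<F: Future>(inner: F, max_poll: Duration) -> Watched<F> {
    Watched {
        inner: Box::pin(inner),
        location: Location::caller(),
        max_poll,
        slow_polls: Vec::new(),
    }
}

impl<F: Future> Future for Watched<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // The inner future is boxed, so `Watched` is Unpin and `get_mut`
        // is available without unsafe code.
        let this = self.get_mut();
        let start = Instant::now();
        let result = this.inner.as_mut().poll(cx);
        let elapsed = start.elapsed();
        if elapsed > this.max_poll {
            this.slow_polls.push(format!(
                "slow poll ({:?}) at {}:{}",
                elapsed,
                this.location.file(),
                this.location.line()
            ));
        }
        result
    }
}
```

A real version would write to the logger instead of collecting strings, and
would probably avoid the extra box.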

Reviewed By: farnz

Differential Revision: D26250068

fbshipit-source-id: a1458e5adebac7eab6c2de458f679c7215147937
2021-02-04 10:40:04 -08:00
Mark Juggurnauth-Thomas
792d18eef6 bounded_traversal: add ordered stream
Summary:
Add `bounded_traversal_ordered_stream`.  This function operates much like
`bounded_traversal_stream`, in that it traverses a tree producing a stream of
visited leaves.  The difference is that the order of produced items is
maintained.

Key differences are:

* The `unfold` method produces a sequence of `OrderedTraversal` nodes, rather
  than separate output and recursion sequences.  The order between `Output`
  variants and the result of recursively expanding `Recurse` variants is
  what is maintained.

* The `unfold` method, as well as the initial values, must provide estimates
  of the number of output items that the recursive result expands to.  This is
  used to delay expanding of later items while earlier items are being expanded.

* There is an additional dimension to bound.  The `queue_max` parameter bounds
  the size of the queue of unyielded output elements.  Recursive steps will not
  be scheduled for unfolding until there is sufficient capacity in the queue
  for the items they will produce.  The bound is a soft bound: to ensure progress
  can always be made even if some `unfold` output produce more than `queue_max`
  elements, the queue is permitted to grow beyond `queue_max` with the output of
  one additional `unfold` call.

Reviewed By: StanislavGlebik

Differential Revision: D25867667

fbshipit-source-id: 884bffbeee3862cce56df78084d57ca62089814c
2021-02-02 09:00:17 -08:00
Mark Juggurnauth-Thomas
1cd098181c bounded_traversal: use standard futures types instead of custom Job
Summary:
Replace `common::Job` by using `futures::Join` and `futures::Ready`.

We still need a heterogeneous variant of `Either`, where the output types of the
two futures differ, so extract this from `Job` as `common::Either2`, which
returns `either::Either<LeftFuture::Out, RightFuture::Out>`.

Reviewed By: ahornby

Differential Revision: D25867668

fbshipit-source-id: 13c90b212c64ca5eae67217a1cecd9aee5e40a38
2021-01-29 03:14:41 -08:00
Thomas Orozco
e9656892e8 mononoke: fix some broken oss build
Summary: Not much to add .. Guess we gotta update a stub here.

Reviewed By: ahornby

Differential Revision: D26124590

fbshipit-source-id: efc4f324b5fed15cff46b358c2b491480e9b73fb
2021-01-28 04:29:58 -08:00
Stanislau Hlebik
734928ecb9 mononoke: move functions from rsync admin to copy_utils
Summary:
I plan to use these functions in the megarepotool, so let's move them to a
library that can be used in both.

Reviewed By: krallin

Differential Revision: D26015773

fbshipit-source-id: 0d2d28d86471c417508494883b69fb64e1bea328
2021-01-27 02:47:04 -08:00
Daniel Xu
5715e58fce Add version specification to internal dependencies
Summary:
Lots of generated code in this diff. Only code change was in
`common/rust/cargo_from_buck/lib/cargo_generator.py`.

Path/git-only dependencies (ie `mydep = { path = "../foo/bar" }`) are not
publishable to crates.io. However, we are allowed to specify both a path/git
_and_ a version. When building locally, the path/git is chosen. When publishing,
the version on crates.io is chosen.

See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations .

Note that I understand that not all autocargo projects are published on crates.io (yet).
The point of this diff is to allow projects to slowly start getting uploaded.
The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed
without further modification.

Reviewed By: lukaspiatkowski

Differential Revision: D26028982

fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c
2021-01-25 22:10:24 -08:00
Thomas Orozco
4dd3461824 third-party/rust: update Tokio 0.2.x to 0.2.24 & futures 1.x to 1.30
Summary:
When we tried to update to Tokio 0.2.14, we hit lots of hangs. Those were due
to incompatibilities between Tokio 0.2.14 and Futures 1.29. We fixed some of
the bugs (and others had been fixed and were pending a release), and Futures
1.30 has now been released, which unblocks our update.

This diff updates Tokio accordingly (the previous diff in the stack fixes an
incompatibility).

The underlying motivation here is to ease the transition to Tokio 1.0.
Ultimately we'll be pulling in those changes one way or another, so let's
get started on this incremental first step.

Reviewed By: farnz

Differential Revision: D25952428

fbshipit-source-id: b753195a1ffb404e0b0975eb7002d6d67ba100c2
2021-01-25 08:06:55 -08:00
Kostia Balytskyi
0d52cff58c iterhelpers: add chunk_by_accumulation
Summary:
This implements chunking of the original iterator by saturation of some
accumulator until a condition starts to be satisfied.

Note: I tried looking through `Vec`, `itertools` and `Iterator` methods, and
did not find anything that would allow me to express this easily.
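
One possible reading of this, as a sketch: the signature below is hypothetical
and the real helper in `iterhelpers` may differ, in particular on whether the
item that saturates the accumulator closes the current chunk (assumed here) or
opens the next one.

```rust
/// Hypothetical signature. Folds items into an accumulator and closes the
/// current chunk as soon as `is_saturated` becomes true.
pub fn chunk_by_accumulation<T, A: Default>(
    items: impl IntoIterator<Item = T>,
    add: impl Fn(A, &T) -> A,
    is_saturated: impl Fn(&A) -> bool,
) -> Vec<Vec<T>> {
    let mut chunks = Vec::new();
    let mut current = Vec::new();
    let mut acc = A::default();
    for item in items {
        acc = add(acc, &item);
        current.push(item);
        if is_saturated(&acc) {
            // Close this chunk and start accumulating from scratch.
            chunks.push(std::mem::take(&mut current));
            acc = A::default();
        }
    }
    // Whatever is left did not saturate the accumulator; keep it anyway.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

For example, chunking `[4, 3, 5, 1, 1, 9, 2]` with a running sum and a
`sum >= 7` condition yields `[[4, 3], [5, 1, 1], [9], [2]]`.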

Reviewed By: StanislavGlebik

Differential Revision: D25947821

fbshipit-source-id: 9e4dd738ecd2ab06ebb69123e4a03059f96b3fb6
2021-01-19 07:13:52 -08:00
Radu Szasz
5fb5d23ec8 Make tokio-0.2 include test-util feature
Summary:
This feature is useful for testing time-dependent stuff (e.g. it
allows you to stop/forward time). It's already included in the buck build.

Reviewed By: SkyterX

Differential Revision: D25946732

fbshipit-source-id: 5e7b69967a45e6deaddaac34ba78b42d2f2ad90e
2021-01-18 10:38:08 -08:00
Alex Hornby
d17ec72093 mononoke: add log tag filtering to cmdlib
Summary:
Add arguments to cmdlib so we can filter log messages by the slog tag, using new Drains added in slog_ext.

To use tagging from slog the form is:

```
const FOO_TAG: &str = "foo";
info!(logger, #FOO_TAG, "hello foo!");
```

Reviewed By: krallin

Differential Revision: D25837627

fbshipit-source-id: b164d508a2e82a80c4ff6f5f35c0c722257b9a2a
2021-01-15 03:13:27 -08:00