Summary:
This is the second diff in a stack that will change how multiple targets are generated inside Cargo.toml files.
Previously, every target was generated independently, which guaranteed the creation of invalid Cargo.toml files, since multiple `[package]` or `[dependencies]` sections would be added.
In this diff, cargo_validator starts to expect only one generated section in Cargo.toml files, and cargo_generator starts to generate a single section instead of multiple.
Reviewed By: farnz
Differential Revision: D18114194
fbshipit-source-id: 306b2fa297cf33a1e607d6914513f76a7e1c5580
Summary: If a node was already resolved, the old implementation mishandled the result: it tried to update the current node, which had not been created by the time of the update. This would happen because `process_unfold` would call `enqueue_unfold` before the current node had been created; if the child had already been resolved (the execution tree contains `Node::Done(value)`), it would try to update the current node by calling `update_location`, which in turn would fail.
Reviewed By: StanislavGlebik
Differential Revision: D18373666
fbshipit-source-id: fe1dca89f2f5015985fb4b04d671750fa3e84c37
Summary:
The big change here is updating to quickcheck 0.9, and removing the local
change to re-export `rand`. This also updates all the `rand*` crates to their
current versions, which match Quickcheck's API requirements. There were quite a
lot of API changes, including ascii's loss of quickcheck support.
Reviewed By: dtolnay
Differential Revision: D18198542
fbshipit-source-id: 31957eb8aad1d1f81c8216e8a80c2712007415c7
Summary: This diff also fixes some failing doc tests in the crates
Reviewed By: StanislavGlebik
Differential Revision: D17830921
fbshipit-source-id: 18e626766d3775284ce3a51940d762b934dd306d
Summary:
This diff updates all license headers to use the new text and style.
Also, a few internal files were missing the header, but now they have it.
`fbcode/common/rust/netstring/` had the internal header, but now it has
GPLV2PLUS - since that goes to Mononoke's Github too.
Differential Revision: D17881539
fbshipit-source-id: b70d2ee41d2019fc7c2fe458627f0f7c01978186
Summary: I think these are left over from pre-2018 code where they may have been necessary. In 2018 edition, import paths in `use` always begin with a crate name or `crate`/`super`/`self`, so `use $ident;` always refers to a crate. Since extern crates are always in scope in every module, `use $ident` does nothing.
Reviewed By: Imxset21
Differential Revision: D17290473
fbshipit-source-id: 23d86e5d0dcd5c2d4e53c7a36b4267101dd4b45c
Summary: Update bounded_traversal_stream to take an IntoIterator of initial values. This allows simultaneous navigation of a graph from multiple roots.
Reviewed By: farnz
Differential Revision: D17163609
fbshipit-source-id: c999e7653cb620c215331ecc46f5a800ced8ef37
Summary:
This diff introduces `bounded_traversal_dag`, which can handle arbitrary DAGs and detect invalid DAGs with cycles, but it has limitations in comparison to `bounded_traversal`:
- `bounded_traversal_dag` keeps the `Out` result of the computation for all nodes,
while `bounded_traversal` only keeps results for nodes that have not been completely
evaluated
- `In` has additional constraints to be `Eq + Hash + Clone`
- `Out` has additional constraint to be `Clone`
Reviewed By: krallin
Differential Revision: D16621004
fbshipit-source-id: b9f60e461d5d50e060be4f5bb6b970f16a9b99f9
Summary:
This adds support for verifying the hashes that were provided by writers (if any) when committing a Store. This lets writers do conditional writes (i.e. if the writer knows the Sha 256 of their content, then they can ask the blobstore to verify said Sha 256 when uploading).
Note that any uploaded chunks will not be cleaned up if a conditional write fails.
Reviewed By: StanislavGlebik
Differential Revision: D16440669
fbshipit-source-id: 88bc99e646616997a4e9d7e59d59315c18f47da9
Summary:
Removing `single_visit_bounded_traversal`:
- it is not supposed to just throw away visited children
- the current implementation is not used anywhere and probably should not be
Reviewed By: krallin
Differential Revision: D16562314
fbshipit-source-id: d823f4f75a34b65107dc4313f0208486c35acdee
Summary:
This adds a split_err helper, which splits an error from a stream into
a separate future. This allows the stream to be infallible.
Reviewed By: sunshowers
Differential Revision: D15746093
fbshipit-source-id: f2d3c10620365daff497c350865a928bd45da8cf
Summary:
stream_clone() takes a stream of cloneable items (and errors) and clones it n
ways to n streams. There's no buffering - all output streams must consume each
item before the next input is consumed. Output streams can be dropped
independently; the input is dropped if all outputs are dropped.
Reviewed By: Imxset21
Differential Revision: D15746068
fbshipit-source-id: 7cf1e92b36449ae2112c91ef393d885e9e16c0ae
Summary:
In very mergy repos we can hit a combinatoric explosion by visiting the same
node over and over again. The derived data framework has the same problem, and this diff
fixes it.
I had a few attempts at implementing it:
**1** Use `bounded_traversal`, but change unfold to filter out parents that were already visited.
That wasn't correct because fold would then be called only with
the "unvisited" parents. For example, in a case like
```
D
/ \
C B
\ /
A
```
fold for C or B will be called with empty parents, and that's incorrect.
**2** Use `bounded_traversal`, change unfold to filter out visited parents but
also remember real parents.
That doesn't work either. The reason is that fold might be called before unfold
for the parents has finished, so in a case like
```
D
/ \
C B
\ /
A
|
...
thousands of commits
```
If C reaches A first, then B won't visit any other node, and it will try to
derive data for B. However derived data for A might not be ready yet, so
deriving data for B might fail.
**3** Change bounded_traversal to support DAGs, not just trees.
From the two points above it's clear that bounded_traversal should really be called
bounded_tree_traversal, because on any other DAG it might hit a combinatoric
explosion. I looked into changing bounded_traversal to support DAGs, and that
was possible but not easy. Specifically, we need to make sure that all
unfold operations are called after fold operations, stop using integers for
nodes, etc. It might also have a perf hit for the tree case, but it's not clear how
big that is.
While I think supporting DAGs in bounded_traversal makes sense, I don't want to
block the derived data implementation on it. I'll create a separate task for that.
---------------------------------------------------------------------------------
The approach I took in the end was to use bounded_stream_traversal in a way that doesn't
visit the same node twice. Doing this will find all commits that need to be
regenerated, but it might return them in an arbitrary order. After that we need
to topo-sort the commits (note that I introduced a bug in hg changeset
generation in D16132403, so this diff fixes that as well).
This is not the most optimal implementation because it will generate the nodes
sequentially even if they could be generated in parallel (e.g. if the nodes are
in different branches). I don't think it's a huge concern, so I think it's worth
waiting for the bounded_dag_traversal implementation (see point 3 above).
---------------------------------------------------------------------------------
Finally, there were concerns about memory usage from the additional hashset that
keeps visited nodes. I think these concerns are unfounded for a few reasons:
1) We have to keep the nodes we visited *anyway* because we need to generate
derived data from parents to children. In fact, bounded_traversal keeps them in
a map as well.
It's true that bounded traversal can do it a bit more efficiently in cases where
we have two different branches that do not intersect. I'd argue that's a rare
case that happens only on repo merges with two independent, equally sized
branches. But even in that case it's not a huge problem (see below).
2) The hashset just keeps commit ids, which are 32 bytes long. So even if we have 1M
commits to generate, that would take 32MB plus hashset overhead. And cases like
that should never happen in the first place - we do not expect to generate
derived data for 1M commits except for the initial huge repo imports (and
for those cases we can afford a 32MB memory hit). If we are in a state where we
need to generate too many commits, we should just return an error to the user;
we'll add that in later diffs.
Reviewed By: krallin
Differential Revision: D16438342
fbshipit-source-id: 4d82ea6111ac882dd5856319a16dda8392dfae81
Summary:
Add a new combinator that can be used to inspect the result of a
future. This is useful when you want to run the same code for inspect_err and inspect.
Also, update the inspect_err combinator to not panic if polled twice - instead
make it act like a fused future.
Reviewed By: farnz
Differential Revision: D16359181
fbshipit-source-id: 14948b851867d5792c76cc679297b23b1e8a6adc
Summary:
Adds a `collect` method that converts a `Stream` into a `Future` of
all of its elements, where the elements themselves are stored in a generic
collection whose type can be determined by the caller.
Reviewed By: StanislavGlebik
Differential Revision: D16283392
fbshipit-source-id: 27ef22fbf35c1d9bfe6590d50321e62685604e9e
Summary: Implements an `inspect_err` helper that allows reacting to a failed future without consuming it.
Reviewed By: StanislavGlebik
Differential Revision: D16225137
fbshipit-source-id: fe4678f029615e8ae42b30c6c908a31dfc5e8a86
Summary:
This diff sets two Rust lints to warn in fbcode:
```
[rust]
warn_lints = bare_trait_objects, ellipsis_inclusive_range_patterns
```
and fixes occurrences of those warnings within common/rust, hg, and mononoke.
Both of these lints are set to warn by default starting with rustc 1.37. Enabling them early avoids writing even more new code that needs to be fixed when we pull in 1.37 in six weeks.
Upstream tracking issue: https://github.com/rust-lang/rust/issues/54910
Reviewed By: Imxset21
Differential Revision: D16200291
fbshipit-source-id: aca11a7a944e9fa95f94e226b52f6f053b97ec74
Summary:
Previously:
```
error[E0599]: no method named `boxify` found for type `futures::future::result_::FutureResult<_, _>` in the current scope
--> experimental/dtolnay/thttp/main.rs:23:21
|
23 | let transport = try_boxfuture!(HttpClient::new(ENDPOINT));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: items from traits can only be used if the trait is in scope
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
help: the following trait is implemented but not in scope, perhaps add a `use` for it:
|
1 | use futures_ext::FutureExt;
|
```
Reviewed By: Imxset21
Differential Revision: D16134586
fbshipit-source-id: 324d2c7223c5e71ffc1f4390cd8fc0b488243c00
Summary:
When the underlying stream in a `BytesStream` hits EOF, and the `BytesStream` is asked for more data, it'll busy loop until more data becomes available.
This happens because when the `BytesStream` hits EOF on the underlying stream, it'll record that the stream is done, and won't ever poll the underlying stream again in `poll_buffer`. However, in `poll_buffer_until`, we essentially busy loop through calls to `poll_buffer`.
When that happens, the process goes into an infinite loop of:
- Checking that it doesn't have enough bytes.
- Observing that the stream is done.
- Calling `poll_buffer` and not making any progress.
Instead, the right behavior would be to return a successful read of length zero in `read()` to indicate EOF to the caller. This is what this patch does.
It's worth noting that even if the underlying stream returned more data after it reported it was exhausted (callers should avoid polling them again ... or use `fuse()`), the `BytesStream` won't ever poll for that data, since it skips the poll because `stream_done` is set.
Reviewed By: farnz
Differential Revision: D16130672
fbshipit-source-id: a61c39feb1aa1ac74bbef909e47d698051477533
Summary:
```
pub fn top_level_launch<F>(future: F) -> Result<F::Item, F::Error>
where
F: Future + Send + 'static,
F::Item: Send,
F::Error: Send;
```
Starts the Tokio runtime using the supplied future to bootstrap the execution.
### Similar APIs
This function is equivalent to [`tokio::run`](https://docs.rs/tokio/0.1.22/tokio/fn.run.html) except that it also returns the future's resolved value as a Result. Thus it requires `F::Item: Send` and `F::Error: Send`.
This function has the same signature as [`Future::wait`](https://docs.rs/futures/0.1.28/futures/future/trait.Future.html#method.wait) which also goes from `F: Future` -> `Result<F::Item, F::Error>`, but `wait` requires an ambient futures runtime to already exist.
### Details
This function does the following:
- Start the Tokio runtime using a default configuration.
- Spawn the given future onto the thread pool.
- Block the current thread until the runtime shuts down.
- Send ownership of the future's resolved value back to the caller's thread.
Note that the function will not return immediately once the original future has completed. Instead it waits for the entire runtime to become idle.
### Panics
This function panics if called from the context of an executor.
Reviewed By: bolinfest
Differential Revision: D16116198
fbshipit-source-id: 52009e17c1ccc566e0c156a86f8fd5844e0e2491
Summary: Introduce a `bounded_traversal_stream` operation which is similar to `bounded_traversal` but does not do `fold`, hence it is not structure preserving, and returns a stream as its result.
Reviewed By: farnz
Differential Revision: D16108231
fbshipit-source-id: 3854ff5e18bcf2d7aa3dc75a2b66d27c59ea4f2c
Summary: After using it for some time I found that it is more convenient to introduce an additional parameter `OutCtx`, which represents all the information about a node needed for the `fold` operation, instead of using `In`. Note this implementation is strictly more general, and isomorphic to the previous one if `OutCtx == In`.
Reviewed By: farnz
Differential Revision: D16029193
fbshipit-source-id: f562baa023d737ce1db2936987f6c59bcd0c3761
Summary: Since we own the `In` elements, `unfold` can take a mutable reference instead of an immutable one. This can simplify some code; see `BlobRepo::find_entries_in_manifest` for an example.
Reviewed By: farnz
Differential Revision: D15939878
fbshipit-source-id: 9c767240bec279f24f922e0771ac919072b3a56c
Summary:
I find that this is a common pattern:
```
// extra block, needed only to reduce the scope of the lock guard
let output = {
    let mut value = mutex_value.lock().expect("lock poisoned");
    ... // do some stuff here
};
```
This extension simplifies this pattern to
```
let output = mutex_value.with(|value| /* do some stuff here */);
```
Reviewed By: Imxset21, farnz
Differential Revision: D15577135
fbshipit-source-id: 6b22b20dda79e532ff5ec8ce75cda8b1c1404368
Summary: `bounded_traversal` traverses an implicit asynchronous tree specified by the `init` and `unfold` arguments, and also does a backward pass with the `fold` operation. All `unfold` and `fold` operations are executed in parallel if they do not depend on each other (i.e. are not related by an ancestor-descendant relation in the implicit tree), with the amount of concurrency constrained by `scheduled_max`.
Reviewed By: jsgf
Differential Revision: D15197796
fbshipit-source-id: 1145497f5cb1c0effee47a4d27698bcf9d88f840
Summary:
We've hit an issue of slow pushes to Mononoke when a commit modifies a lot of
files (>500 in our case). It turned out that the problem was that
we have only one master write connection open, and each blobstore write
requires a write to MySQL because of the multiplexed blobstore. Because we have
only one connection open, all our MySQL writes are serialized, and the push
takes too much time. It's especially bad in non-master regions.
To mitigate the issue, let's add batching to the blobstore sync queue. When
clients call `blobstore_sync_queue.add(...)`, we'll send the new entry via a
channel to a separate task that sends the writes in batches. That allows us to
increase throughput significantly.
Reviewed By: jsgf
Differential Revision: D15248288
fbshipit-source-id: 22bab284b0cbe552b4b51bab4027813b4278fd14
Summary:
This test was failing consistently on my devserver when I was testing an update to tp2 crates.io. The update did not have any future or tokio changes. This test also fails on other devservers.
This test was added in D9561367.
The test fails on my devserver with:
```
test test::asynchronize_parallel ... FAILED
failures:
---- test::asynchronize_parallel stdout ----
thread 'test::asynchronize_parallel' panicked at 'Parallel sleep time 43.284909ms much greater than
40ms for 20 threads - each thread sleeps for 20ms', common/rust/futures-ext/src/lib.rs:741:9
note: Run with `RUST_BACKTRACE=1` for a backtrace.
failures:
test::asynchronize_parallel
```
Reviewed By: jsgf
Differential Revision: D14055989
fbshipit-source-id: 14a87a1dfe6ff7e273a08052695fd6c9c9ed37c7
Summary:
We've had this problem for almost a year now, and I've finally made some
progress.
The problem was tests failing randomly with errors like
```
- remote: * pushrebase failed * (glob)
- remote: msg: "pushrebase failed Conflicts([PushrebaseConflict { left: MPath(\"1\"), right: MPath(\"1\") }])"
+ remote: Jan 25 08:46:24.067 ERRO Error in hgcli proxy, error: Connection reset by peer (os error 104), root_cause: Os {
+ remote: code: 104,
+ remote: kind: ConnectionReset,
+ remote: message: "Connection reset by peer"
remote: * backtrace* (glob)
```
or
```
remote: * pushrebase failed * (glob)
remote: msg: "pushrebase failed Conflicts([PushrebaseConflict { left: MPath(\"1\"), right: MPath(\"1\") }])"
+ remote: Jan 25 08:47:59.966 ERRO Error in hgcli proxy, error: Connection reset by peer (os error 104), root_cause: Os {
+ remote: code: 104,
+ remote: kind: ConnectionReset,
+ remote: message: "Connection reset by peer"
+ remote: }, backtrace:
```
Note that the problems are slightly different. In the first case the actual error message is completely lost and we get an unnecessary
ConnectionReset message. In the second case it's just `ConnectionReset`.
This diff fixes the problem of the lost error message (problem #1) and hides `ConnectionReset` problem (problem #2).
Problem #1 was due to a bug in streamfork. Before this diff, if streamfork hit
an error, it might not have sent already-received input to one of the
outputs. This diff fixes that.
This diff just hides Problem #2. If we see a ConnectionReset then an error
won't be reported. That's a hack which should be fixed, but at the moment
a) The bug is not easily debuggable
b) The problem is not urgent and shouldn't cause problems
In some cases the server actually sends a connection reset, but in that case
mercurial still gives us a self-explanatory message:
```
abort: stream ended unexpectedly (got 0 bytes, expected 4)
```
Reviewed By: lukaspiatkowski
Differential Revision: D13818558
fbshipit-source-id: 7a2cba8cd0fcef8211451df3dea558fe2d60fa60
Summary:
Previously we had a timeout per session, i.e. multiple wireproto commands would
share the same timeout. That had a few disadvantages:
1) The main disadvantage was that if a connection timed out we didn't log
stats such as number of files, response size etc., and we didn't log parameters
to scribe. The latter is an even bigger problem, because we usually want to
replay requests that were slow and timed out, not the requests that finished
quickly.
2) The less important disadvantage is that we have clients that make small
requests to the server and then keep the connection open for a long time.
Eventually we kill the connection and log it as an error. With this change
the connection will stay open until the client closes it. That might potentially
be a problem, and if that's the case then we can reintroduce a per-connection
timeout.
Initially I was planning to use tokio::util::timer to implement all the
timeouts, but it has different behaviour for streams - it only allows setting a
per-item timeout, while we want a timeout for the whole stream.
(https://docs.rs/tokio/0.1/tokio/timer/struct.Timeout.html#futures-and-streams)
To overcome this I implemented a simple combinator, StreamWithTimeout, which does
exactly what I want.
Reviewed By: HarveyHunt
Differential Revision: D13731966
fbshipit-source-id: 211240267c7568cedd18af08155d94bf9246ecc3
Summary:
`asynchronize` does two conceptually separate things:
1. Given a closure that can do blocking I/O or is CPU heavy, create a future
that runs that closure inside a Tokio task.
2. Given a future, run it on a new Tokio task and shuffle the result back to
the caller via a channel.
Split these two things out into their own functions - one to make the future,
one to spawn it and recover the result. For now, this is no net change - but
`spawn_future` is likely to come in useful once we need more parallelism than
we get from I/O alone, and `closure_to_blocking_future` at least signals intent
when we allow a long-running function to take over a Tokio task.
Reviewed By: jsgf
Differential Revision: D9635812
fbshipit-source-id: e15aeeb305c8499219b89a542962cb7c4b740354
Summary:
`asynchronize` currently does not warn the event loop that it's
running blocking code, so we can end up starving the thread pool of threads.
We can't use `blocking` directly, because it won't spawn a synchronous task
onto a fresh Tokio task, so your "parallel" futures end up running in series.
Instead, use it inside `asynchronize` so that we can pick up extra threads in
the thread pool as and when we need them due to heavy load.
While in here, fix up `asynchronize` to only work on synchronous tasks and
push the boxing out one layer. Filenodes needs a specific change that's
worth extra eyes.
Reviewed By: jsgf
Differential Revision: D9631141
fbshipit-source-id: 06f79c4cb697288d3fadc96448a9173e38df425f
Summary:
We have suspect timings in Mononoke where `asynchronize` is used to
turn a blocking function into a future. Add a test case to ensure that
`asynchronize` itself cannot be causing accidental serialization.
Reviewed By: jsgf
Differential Revision: D9561367
fbshipit-source-id: 14f03e3f003f258450bb897498001050dee0b40d
Summary: The latest release of `tokio` updates `tokio::timer` to include a new `Timeout` type and a `.timeout()` method on `Future`s. As such, our internal implementation of `.timeout()` in `FutureExt` is no longer needed.
Reviewed By: jsgf
Differential Revision: D9617519
fbshipit-source-id: b84fd47a3ee4fc1f7c0a52e308317b93f28f04da
Summary:
It makes startup unbearably slow and doesn't add any benefit at all. Revert it.
Reviewed By: purplefox
Differential Revision: D9358741
fbshipit-source-id: 26469941304f737c856a6ffca5e577848ad30955
Summary:
Should be functionally equivalent and semantically more appropriate.
This also makes a couple of small API changes:
- The inner function is expected to just return a Result - IntoFuture is
overkill if it's supposed to be synchronous in the first place
- `asynchronize` itself returns `impl Future` rather than being intrinsically
boxed.
- Restructure dieselfilenodes::add_filenodes to only asynchronize the insert
itself.
Reviewed By: farnz
Differential Revision: D8959535
fbshipit-source-id: fef9164e3be0069bd0d93573642cd57bb5babb73
Summary: I was surprised these didn't exist, but they work fine.
Reviewed By: Imxset21
Differential Revision: D9016243
fbshipit-source-id: 8405009f536b2e1aa0c9c28be59bb8c1d9ab7a4f
Summary: As the comment on the macro says: if the condition is not met, return a BoxFuture with an Err inside.
Reviewed By: farnz
Differential Revision: D8877731
fbshipit-source-id: 7f31a1739155201ea2be30901b8cda2511f49b03
Summary: Iterating over the server code is a bit painful and it has grown a lot; splitting it should speed up future refactorings and make it more maintainable.
Reviewed By: jsgf, StanislavGlebik
Differential Revision: D8859811
fbshipit-source-id: 7c56f9f835f45eca322955cb3b9eadd87fbb30a1
Summary:
This is a series of patches which adds Cargo.toml files to all the crates and tries to build them. There is an individual patch for each crate which tells whether that crate currently builds successfully using cargo, and if not, the reason why.
The reasons some crates don't build:
* failure_ext and netstring are internal crates
* errors related to tokio_io; there might be a patched version of tokio_io internally
* actix-web depends on httparse, which uses nightly features
All builds were done using rustc version `rustc 1.27.0-dev`.
Pull Request resolved: https://github.com/facebookexperimental/mononoke/pull/7
Differential Revision: D8778746
Pulled By: jsgf
fbshipit-source-id: 927a7a20b1d5c9643869b26c0eab09e90048443e