Commit Graph

27 Commits

Author SHA1 Message Date
David Tolnay
fe65402e46 rust: Move futures-old rdeps to renamed futures-old
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.

rs changes performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs,
        rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
        intersect
        rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
    )" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```
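For illustration, a hedged sketch (not taken from the codemod output) of what a call site looks like after the rename, assuming the renamed 0.1 crate is imported as `futures_old`:

```
use futures_old::{future, Future};

// futures 0.1 code now refers to the crate as `futures_old`, leaving the plain
// `futures::` name free for the 0.3 crate once it is flipped over.
fn ready_value() -> impl Future<Item = u64, Error = ()> {
    future::ok(1)
}
```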

Reviewed By: jsgf

Differential Revision: D20168958

fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
2020-03-02 21:02:50 -08:00
Stanislau Hlebik
25c57e445c mononoke: add create_warmer() function
Summary:
Small cleanup that removes a bunch of duplicate code.
That should make it easier to add other types of derived data to the warmer.
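
A hypothetical sketch of the kind of helper this describes (types, names and signatures are assumptions, not the actual Mononoke code): one constructor builds a warmer for any derived data type instead of repeating the same boilerplate per type:

```
// Hypothetical types for illustration only.
struct Warmer {
    name: &'static str,
    warm: Box<dyn Fn(&str) + Send + Sync>, // takes a changeset id, derives the data
}

fn create_warmer<F>(name: &'static str, warm: F) -> Warmer
where
    F: Fn(&str) + Send + Sync + 'static,
{
    Warmer { name, warm: Box::new(warm) }
}

fn example_warmers() -> Vec<Warmer> {
    vec![
        create_warmer("unodes", |cs_id| { /* derive unodes for cs_id */ let _ = cs_id; }),
        create_warmer("blame", |cs_id| { /* derive blame for cs_id */ let _ = cs_id; }),
    ]
}
```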

Reviewed By: krallin

Differential Revision: D20193169

fbshipit-source-id: 437fe7981d8a71164dc9edfcc423e8c41cbe0967
2020-03-02 10:08:09 -08:00
Stanislau Hlebik
638e637ef6 RFC: mononoke: introduce unodes v2
Summary:
Our previous implementation of unodes had a problem with diamond merges:
essentially, when p1 and p2 have the same file but different unodes, a merge
unode is always created, which can be unexpected
(the code comment in unodes/derive.rs has more info about it).

This diff fixes the problem by introducing unodes v2. This allows us to import
new repos with new unode implementation while keeping the old repos with unode
v1.

This implementation uses a heuristic which should be fast and should do the
correct thing most of the time. In some cases it might exclude some parts of
the history completely. For example:

     O <- merge commit, doesn't change anything
    / \
   P1  |  <- modified "file.txt" to "B"
   |   P2    <- modified "file.txt" to "B"
   \  /
    ROOT <- created "file.txt" with content "A"

In that case the history of "file.txt" starting from the merge commit will contain only (P1, ROOT),
but it won't contain P2.
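
A hedged sketch of the kind of heuristic described above (types and names are hypothetical, not the actual code in unodes/derive.rs): when every parent carries the same file content, reuse one parent's unode instead of creating a merge unode, which is what drops P2 from the example history:

```
// Hypothetical types for illustration only.
#[derive(Clone, Copy, PartialEq)]
struct ContentId([u8; 32]);

struct FileUnode {
    content: ContentId,
}

enum Resolution {
    ReuseFirstParent, // parents agree on content: no merge unode is created
    MergeUnode,       // contents differ: fall back to creating a merge unode
}

fn resolve_file_parents(parents: &[FileUnode]) -> Option<Resolution> {
    let first = parents.first()?;
    if parents.iter().all(|p| p.content == first.content) {
        Some(Resolution::ReuseFirstParent)
    } else {
        Some(Resolution::MergeUnode)
    }
}
```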

We also considered other options:
1) Move this heuristic to fastlog batch derived data. See D19973553 for more
details about why we decided not to do it.

2) Filter out parent unodes that are ancestors of other parent unodes. This should
always be correct, but it would be hard to implement, and even harder to make
sure it always has good performance.

Reviewed By: krallin

Differential Revision: D19978157

fbshipit-source-id: 445ddd5629669d987e7aa88c35fecf0b34a40da0
2020-03-02 05:27:31 -08:00
Stanislau Hlebik
d7a4ff29b5 mononoke: log derivations to separate scuba table
Summary: I'd like to log all derivations to a single place so that it's easier to understand what was derived and where

Reviewed By: aslpavel

Differential Revision: D20140004

fbshipit-source-id: 305ea533031a04ff95995a6fe2a6e57e95a87026
2020-03-02 04:30:12 -08:00
Thomas Orozco
95d463ce47 mononoke/filenodes: Remove path from FilenodeInfo
Summary:
This updates our filenodes implementation to use different types for writing
(`PreparedFilenode`) and reading (`FilenodeInfo`).

The bottom line is that this avoids a bunch of cloning of paths on the read
path, which doesn't need to return the path to the caller, since the caller
already knows it! We can also take it out of Memcache, since we don't need
Memcache to tell us the path for a blob we could only possibly have found by
having the path to begin with.

This does update our filenodes serialization format. I bumped MC_CODEVER
accordingly.
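
A hedged sketch of the shape of the split (field sets are assumptions based on this description, not the exact Mononoke definitions); the path lives only on the write-side type, since a reader already knows the path it asked for:

```
// Placeholder aliases so the sketch stands alone; the real types come from the
// mononoke_types / mercurial_types crates.
type RepoPath = String;
type HgFileNodeId = [u8; 20];
type HgChangesetId = [u8; 20];

// Write side: inserting a filenode still needs to say which path it belongs to.
pub struct PreparedFilenode {
    pub path: RepoPath,
    pub info: FilenodeInfo,
}

// Read side: the caller looked the filenode up by path, so the path is not
// repeated here (and no longer needs to be stored in Memcache either).
pub struct FilenodeInfo {
    pub filenode: HgFileNodeId,
    pub p1: Option<HgFileNodeId>,
    pub p2: Option<HgFileNodeId>,
    pub copyfrom: Option<(RepoPath, HgFileNodeId)>,
    pub linknode: HgChangesetId,
}
```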

Reviewed By: StanislavGlebik

Differential Revision: D19905400

fbshipit-source-id: 6037802c1773de564cade8e264d36087382ee15a
2020-02-27 12:34:21 -08:00
Thomas Orozco
341b4f1bc3 mononoke/filenodes: expect a Vec of filenodes to insert
Summary:
The API expects a stream of filenodes to insert, but we actually never used
that ability. Instead, every single callsite has a `Vec`, which it converts to
a stream and passes in.

I'd like to change this for two reasons:

- It's unnecessary
- It makes the code more complex on the Filenodes implementation side, and less
  efficient, since we need to `chunk()` there in small chunks, which might not
  all be in the same shard. If we get the entire `Vec` at once, we can chunk on a
  per-shard basis (this happens later in this stack).

Besides, if we end up having a stream and wanting the old behavior, we can
always call `chunk()` on the stream and call `add_filenodes` on each batch (which
is actually nicer, because if you have a futures 0.2 stream that isn't static,
you can do this, but you can't turn it into a `BoxStream`!).

Reviewed By: StanislavGlebik

Differential Revision: D19902537

fbshipit-source-id: a4c030c4a51afbb6e9db133b32464009eed197af
2020-02-27 12:34:18 -08:00
Stanislau Hlebik
cc8be5997e mononoke: asyncify derived data
Reviewed By: krallin

Differential Revision: D20139701

fbshipit-source-id: 7f1c8370707eb415dd7e23d94eb923846f7ef59b
2020-02-27 12:17:54 -08:00
Thomas Orozco
26ae726af5 mononoke: update internals to Bytes 0.5
Summary:
The Bytes 0.5 update left us in a somewhat undesirable position where every
access to our blobstore incurs an extra copy whenever we fetch data out of our
cache (by turning it from Bytes 0.5 into Bytes 0.4). We also have quite a few
places where we convert in one direction and then immediately back in the other.

Internally, we can start using Bytes 0.5 now. For example, this is useful when
pulling data out of our blobstore and deserializing as Thrift (or conversely,
when serializing and putting it into our blobstore).

However, when we interface with Tokio (i.e. decoders & encoders), we still have
to use Bytes 0.4.  So, when needed, we convert our Bytes 0.5 to 0.4 there.

The idea behind this tradeoff is that we deal with more bytes internally than we
end up sending to clients, so doing the Bytes conversion closer to the point of
sending data to clients means fewer copies.

We can also start removing those once we migrate to Tokio 0.2 (and newer
versions of Hyper for HTTP services).

Changes that were required (see the sketch after this list):

- You can't extend new bytes (because that implicitly copies). You need to use
  BytesMut instead, which I did where that was necessary (I also added calls in
  the Filestore to do that efficiently).
- You can't create bytes from a `&'a [u8]`, unless `'a` is `'static`. You need
  to use `copy_from_slice` instead.
- `slice_to` and `slice_from` have been replaced by a `slice()` function that
  takes ranges.
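
A minimal sketch of these three changes, assuming the `bytes = "0.5"` crate (not the actual Mononoke call sites):

```
use bytes::{Bytes, BytesMut};

// 1) Accumulate in BytesMut and freeze, instead of extending Bytes in place.
fn build_payload(chunks: &[&[u8]]) -> Bytes {
    let mut buf = BytesMut::new();
    for chunk in chunks {
        buf.extend_from_slice(chunk);
    }
    buf.freeze()
}

// 2) Borrowed (non-'static) slices must be copied explicitly.
fn from_borrowed(data: &[u8]) -> Bytes {
    Bytes::copy_from_slice(data)
}

// 3) slice() takes a range instead of the old slice_to/slice_from pair.
fn first_four(data: &Bytes) -> Bytes {
    data.slice(..4) // panics if data is shorter than 4 bytes
}
```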

Reviewed By: StanislavGlebik

Differential Revision: D20121350

fbshipit-source-id: eb31af2051fd8c9d31c69b502e2f6f1ce2190cb1
2020-02-27 08:08:28 -08:00
Stanislau Hlebik
d5d3061168 mononoke: distinguish derived data waits with derived data generation
Summary:
Previously it was hard to tell whether the process was actually responsible
for generating derived data or was just waiting for it to be generated.

Let's make this distinction clearer.

Reviewed By: johansglock

Differential Revision: D20138284

fbshipit-source-id: 52ae12679db2f61869f048baf2a603b456710a71
2020-02-27 03:15:39 -08:00
Stanislau Hlebik
98f6d5d1a8 mononoke: fix walker filenode walks
Summary:
Since Mononoke's filenodes were migrated to the derived data framework, the
hg_linknode_populated alarm has been firing. The main reason was that there's
now a delay between an hg changeset being generated and its filenodes being generated.

This diff fixes it by making sure the walker won't visit hg changesets without
generated filenodes (note that the walker will visit these changesets later,
after the filenodes are generated).

Reviewed By: ahornby

Differential Revision: D20067615

fbshipit-source-id: 285e9a3d8c89b85441491c889a8458c86ca0e3a8
2020-02-26 15:21:53 -08:00
Aida Getoeva
585899f419 mononoke/scs: use last change in file history
Summary:
There is no need to generate an expensive file history stream if only one node is requested.

I refactored the code that generates the stream of history commits so that it first yields the nodes and only then prefetches their parents. That helps solve the latency problem for a history request for only a single commit.

I removed the BFS queue and added two state variables: ready nodes and already-processed nodes (sketched below):
* The processed nodes are the ones that were returned as part of the history stream on the last iteration; they can now be used to construct the next BFS layer: prefetch fastlog batches, fill the commit graph, and take parents in BFS order to form the new bunch of nodes.
* The ready nodes are used on the first iteration: there are no processed nodes yet, but there are some that are ready to be returned.

I believe that by removing the queue I simplified the code and logic a little bit.
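
A hedged sketch of the queue-less iteration (names and types are assumptions, not the scs code): each step either drains the initially ready nodes or expands the nodes returned on the previous step into the next BFS layer:

```
struct HistoryState<N> {
    ready: Vec<N>,      // nodes ready to be returned on the first iteration
    processed: Vec<N>,  // nodes returned on the last iteration
}

fn next_layer<N: Clone>(
    state: &mut HistoryState<N>,
    parents_of: impl Fn(&N) -> Vec<N>,
) -> Vec<N> {
    let batch = if !state.ready.is_empty() {
        // First iteration: nothing has been processed yet, return the ready nodes.
        std::mem::take(&mut state.ready)
    } else {
        // Later iterations: expand the previously returned nodes into their parents
        // (prefetching fastlog batches and filling the commit graph happen here).
        state.processed.iter().flat_map(|n| parents_of(n)).collect()
    };
    state.processed = batch.clone();
    batch
}
```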

Reviewed By: StanislavGlebik

Differential Revision: D19818100

fbshipit-source-id: c30d28c623464ba3552a00e8542552f7655076ef
2020-02-26 08:09:12 -08:00
Stanislau Hlebik
7076fac933 mononoke: add exponential backoff
Summary:
During our tests we noticed that we can send too many blobstore read requests to the
mapping. Let's add exponential backoff to prevent that.
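
A minimal sketch of the idea, in plain synchronous Rust (not the actual async Mononoke code): retry the read, doubling the wait between attempts up to a cap:

```
use std::thread;
use std::time::Duration;

fn with_backoff<T, E>(mut op: impl FnMut() -> Result<T, E>, max_attempts: u32) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let max_delay = Duration::from_secs(5);
    for _ in 1..max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(_) => {
                thread::sleep(delay);
                delay = (delay * 2).min(max_delay); // exponential growth, capped
            }
        }
    }
    op() // last attempt: propagate the error if it still fails
}
```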

Reviewed By: ikostia

Differential Revision: D20116043

fbshipit-source-id: 6fecbda4c36a5065b77ba9df561c6d9c6a969089
2020-02-26 05:05:33 -08:00
Stanislau Hlebik
19e1e94984 mononoke: add lease renewing to derived data
Summary:
During S196197 the lease expired and we were re-deriving the same derived data over and over again for a big commit.
This diff adds lease renewal, which should help with this problem.
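
A minimal sketch of the idea using plain threads (not the Mononoke lease implementation, which is async): while a long derivation runs, a background loop keeps renewing the lease so it can't expire mid-derivation:

```
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn derive_with_lease_renewal(renew: impl Fn() + Send + 'static, derive: impl FnOnce()) {
    let done = Arc::new(AtomicBool::new(false));
    let done_for_renewer = Arc::clone(&done);
    let renewer = thread::spawn(move || {
        while !done_for_renewer.load(Ordering::SeqCst) {
            renew(); // push the lease expiry further into the future
            thread::sleep(Duration::from_secs(10));
        }
    });
    derive(); // the long-running derivation itself
    done.store(true, Ordering::SeqCst);
    renewer.join().unwrap();
}
```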

Reviewed By: HarveyHunt

Differential Revision: D20093323

fbshipit-source-id: d139abf6659722f47ea40d9b2f279daa03623ff4
2020-02-25 09:22:46 -08:00
Stanislau Hlebik
4bd758289b mononoke: async/await derive_may_panic() function
Reviewed By: HarveyHunt

Differential Revision: D20092945

fbshipit-source-id: 70ec1a8e5b9c99f3853a13bebe3657ece5ff9e9e
2020-02-25 09:22:46 -08:00
Stanislau Hlebik
3418318883 mononoke: do not generate hgchangesets unnecessarily in FilenodesOnlyPublicMapping
Summary:
fetch_root_filenode is called by FilenodesOnlyPublicMapping to figure out if
filenodes were already derived. Previously it first derived the hg changeset and
then looked up the root manifest in the db. However, if the hg changeset is not
derived then filenodes couldn't possibly be derived either, so we can return an
answer faster.

This is useful in the next diff, where I change the walker.

Reviewed By: ahornby

Differential Revision: D20068819

fbshipit-source-id: 17f066c437e0b1f7bbeb8f6e247eadc9afe94f90
2020-02-25 08:07:07 -08:00
Thomas Orozco
2a12e2beb6 mononoke/derived_data: log when we start deriving
Summary:
This should give us a slightly better idea of what hosts are doing to
troubleshoot duplicate derivation.

Also, let's make the logging a bit less confusing.

Reviewed By: StanislavGlebik

Differential Revision: D20070619

fbshipit-source-id: 91cc264b7043b8fc8c21c007832fba328ef0017d
2020-02-24 12:03:41 -08:00
Stanislau Hlebik
ec76ba93c6 mononoke: convert some fastlog functions to async/await
Reviewed By: farnz

Differential Revision: D20059447

fbshipit-source-id: fa4a70b238ebc85ad5e589b06ee8a1ca6c0ea509
2020-02-24 00:53:56 -08:00
Stanislau Hlebik
74a8eb4968 fastlog: convert derive_parents to async/await
Summary:
I'm going to modify it in the next diff, so let's make it async.

Note that we used `spawn_future()` before, which I replaced with `tokio::spawn()`
here. It's not really clear if we need it at all; I'll experiment with that later.
Removing it will make the code cleaner.

Reviewed By: krallin

Differential Revision: D19973315

fbshipit-source-id: cbbb9a88f4424e6e717caf1face6807ab6c32438
2020-02-19 21:28:21 -08:00
Mark Thomas
a8f06f75c0 derived_data: add DeriveError for when derivation is disabled
Summary:
Currently if derivation of a particular derived data type is disabled, but a
client makes a request that requires that derived data type, we will fail with
an internal error.

This is not ideal, as internal errors should indicate something is wrong, but
in this case Mononoke is behaving correctly as configured.

Convert these errors to a new `DeriveError` type, and plumb this back up to
the SCS server.  The SCS server converts these to a new `RequestError`
variant: `NOT_AVAILABLE`.
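
A hedged sketch (variant names and fields are assumptions, not the actual Mononoke definition) of distinguishing "derivation disabled by config" from genuine failures, so callers such as the SCS server can map the former to a NOT_AVAILABLE `RequestError`:

```
#[derive(Debug)]
pub enum DeriveError {
    // The requested derived data type is disabled for this repo: expected, not a bug.
    Disabled(&'static str),
    // Everything else remains an internal error.
    Internal(String),
}

impl std::fmt::Display for DeriveError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            DeriveError::Disabled(ty) => write!(f, "derivation of {} is disabled", ty),
            DeriveError::Internal(msg) => write!(f, "{}", msg),
        }
    }
}

impl std::error::Error for DeriveError {}
```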

Reviewed By: krallin

Differential Revision: D19943548

fbshipit-source-id: 964ad0aec3ab294e4bce789e6f38de224bed54fa
2020-02-19 09:28:09 -08:00
Thomas Orozco
16384599a8 mononoke (+ rust/shed/async_unit): update async_unit to expect async fn's
Summary:
This allows code that is being exercised under async_unit to call into code
that expects a Tokio 0.2 environment (e.g. 0.2 timers).

Unfortunately, this requires turning off LSAN for the async_unit tests, since
it looks like LSAN and Tokio 0.2 don't work very well together, resulting in
LSAN reporting leaked memory for some TLS structures that were initialized by
tokio-preview (regardless of whether the Runtime is being dropped):
https://fb.workplace.com/groups/rust.language/permalink/3249964938385432/

Considering async_unit is effectively only used in Mononoke, and Mononoke
already turns off LSAN in tests for precisely this reason ... it's probably
reasonable to do the same here.

The main body of changes here is also about updating the majority of our
tests to stop calling wait(), and to use this new async unit everywhere. This is
effectively a pretty big batch conversion of all of our tests to use async fns
instead of the former approaches. I've also updated a substantial number of
utility functions to be async fns.

A few notable changes here:

- Some pushrebase tests were pretty flaky — the race they look for isn't
  deterministic. I added some actual waiting (using pushrebase hooks) to make
  it more deterministic.  This is kinda copy pasted from the globalrev hook
  (where I had introduced this first), but this will do for now.
- The multiplexblob tests don't work at all with new futures, because they call
  `poll()` all over the place. I've updated them to new futures, which required
  a bit of reworking.
- I took out a couple tests in async unit that were broken anyway.

Reviewed By: StanislavGlebik

Differential Revision: D19902539

fbshipit-source-id: 352b4a531ef5fa855114c1dd8bb4d70ed967dd55
2020-02-18 01:55:00 -08:00
Stanislau Hlebik
ee7c0a8d26 backfill_derived_data mononoke: fail if derived data disabled
Summary:
Now let's fail if we try to derive data that's not enabled in the config.
Note that this diff also adds a `derive_unsafe()` method, which should be used in the backfiller.

Reviewed By: krallin

Differential Revision: D19807956

fbshipit-source-id: c39af8555164314cafa9691197629ab9ddb956f1
2020-02-16 04:56:34 -08:00
Pavel Aslanov
34d578198e extend blame with origin_offset
Summary: This diff extends the `Blame` structure with `origin_line` (the line at the moment a hunk was introduced).
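
A hedged sketch of the shape of the change (field names are assumptions, not the actual `Blame` definition; the title's `origin_offset` and the summary's `origin_line` refer to the same idea): each blame range additionally records where the hunk sat when it was first introduced:

```
// Hypothetical, simplified representation for illustration only.
struct BlameRange {
    offset: u32,        // first line of the range in the current file
    length: u32,        // number of lines in the range
    csid: [u8; 32],     // commit that introduced the range (placeholder id type)
    origin_offset: u32, // new: offset of the range at the moment it was introduced
}
```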

Reviewed By: mitrandir77

Differential Revision: D19499125

fbshipit-source-id: 2fa6bc4ee62d484dd6dc6bee17cb1d04ba1db158
2020-02-13 08:56:32 -08:00
Stanislau Hlebik
6a1358467e mononoke: change how we generate filenodes for octopus merges
Summary:
In the next diff I'm going to refactor the BlobRepo::get_manifest_from_bonsai()
function and remove the IncompleteFilenodes structure. While doing so I noticed
that the behaviour of get_manifest_from_bonsai() and FilenodesOnlyPublic is
different with regards to generating hg filenodes for octopus merges. In
particular, get_manifest_from_bonsai() at the moment doesn't generate filenodes
for paths that came from stepparents, but FilenodesOnlyPublic does generate
them.

I'd say both of these approaches are equally reasonable; however, let's try to
preserve the behaviour.

Reviewed By: krallin

Differential Revision: D19856420

fbshipit-source-id: 9f8ae656205f39536c450b885fc4d8d6a2534456
2020-02-12 23:56:23 -08:00
Pavel Aslanov
c902acbc3f remove the need to pass mapping to ::derive method
Summary: remove the need to pass mapping to `::derive` method

Reviewed By: StanislavGlebik

Differential Revision: D19856560

fbshipit-source-id: 219af827ea7e077a4c3e678a85c51dc0e3822d79
2020-02-12 10:22:39 -08:00
Lukasz Piatkowski
542d1f93d3 Manual synchronization of fbcode/eden and facebookexperimental/eden
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.

Reviewed By: StanislavGlebik

Differential Revision: D19722832

fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
2020-02-11 11:42:43 +01:00
Stanislau Hlebik
2b7e7e5676 mononoke: log if derived data is not enabled
Summary:
Before we start blocking generation of derived data, let's start by logging if
derived data is not specified.

Reviewed By: farnz

Differential Revision: D19791523

fbshipit-source-id: 15bed8463f8a021de76a2878f06ec95d9fef877f
2020-02-10 01:44:09 -08:00
Lukasz Piatkowski
e8d62b64d5 mononoke: move the codebase under eden/ directory
fbshipit-source-id: 43a0252cb3ec42aa365f20d1b6faa4d24d74c9b8
2020-02-06 13:46:04 +01:00