Commit Graph

85 Commits

Author SHA1 Message Date
Pavel Aslanov
905c8b213e move Filenodes to BlobRepo::attributes
Summary: Move `Filenodes` to `BlobRepo::attributes` as it is mercurial specific.

Reviewed By: ikostia

Differential Revision: D21662418

fbshipit-source-id: 87648a3e6fd7382437424df3ee60e1e582b6b958
2020-06-22 07:29:19 -07:00
Pavel Aslanov
a1f5e45a5a BlobRepoHg extension trait.
Summary: This diff introduces `BlobRepoHg` extension trait for `BlobRepo` object. Which contains mercurial specific methods that were previously part of `BlobRepo`. This diff also stars moving some of the methods from BlobRepo to BlobRepoHg.

Reviewed By: ikostia

Differential Revision: D21659867

fbshipit-source-id: 1af992915a776f6f6e49b03e4156151741b2fca2
2020-06-22 07:29:19 -07:00
Pavel Aslanov
2a746920b6 make it possible to store arbitrary type in blobrepo with dependency on it
Summary:
This diff adds additional filed `BlobRepo::attributes`  which can store attributes of arbitrary type. This will help store opaque types inside blobrepo without creating dependency on a crate which contains type definition for this attribute. This diff also moves `BonsaiHgMapping` inside attributes set.
- This work will allow to move mercurial changeset generation logic to derive data infrastructure

Reviewed By: ikostia

Differential Revision: D21640438

fbshipit-source-id: 3abd912e7227738a73ea9b17aabdda72a33059aa
2020-06-22 07:29:19 -07:00
Jeremy Fitzhardinge
35b292ce9d eden: manual dependency fixes
Summary:
Tooling can't handle named_deps yet, but it can warn about them

P133451794

Reviewed By: StanislavGlebik

Differential Revision: D22083499

fbshipit-source-id: 46de533c19b13b2469e912165c1577ddb63d15cd
2020-06-17 17:55:04 -07:00
Jeremy Fitzhardinge
1b4edb5567 eden: remove unused Rust dependencies
Summary:
Remove unused dependencies for Rust targets.

This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.

Manual edits (named_deps) and misc output in P133451794

Reviewed By: dtolnay

Differential Revision: D22083498

fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
2020-06-17 17:55:03 -07:00
Kostia Balytskyi
d176764593 derived_data: add splay to lease retry backoff times
Reviewed By: StanislavGlebik

Differential Revision: D21975819

fbshipit-source-id: 8acb7d5d7795d19c036f1f0e232e6eff97711d9e
2020-06-11 02:52:00 -07:00
David Tolnay
5609aa9509 cargo_from_buck: Hide generated code from rustfmt
Differential Revision: D21966217

fbshipit-source-id: f540a2e233cc2645ae71df5134b1a810d4425053
2020-06-10 19:29:49 -07:00
Aida Getoeva
4776c29077 mononoke/fastlog: clean up ops module
Summary: This diff resolves TODO to make a helper function a crate private function as the only dependency (APIServer) is now gone.

Reviewed By: StanislavGlebik

Differential Revision: D21954238

fbshipit-source-id: 7cfc49534b7ab8af6035f7a87fdc763127dce318
2020-06-10 19:29:48 -07:00
Stanislau Hlebik
ffd60c01b0 blobimport: do not use batch_derive
Summary:
batch_derive() is a dangerous function to use. I'd love to delete it but this
function is very useful for backfilling, so unfortunately I can't.

The problem arises when one tries to backfill blame and unodes simultaneously
(or just derive blame  which in turn derives unodes). While batch_derive()
tries to be careful with inserting "outer" derived data's mappings (i.e. blame
mapping), it doesn't do it for inner derived data mappings (i.e. unodes). So we
might end up in the situation where we insert unodes mapping before we inserted
all the manifests for it. If this thing fails in the middle of derivation then
we have a corruption.

Let's do not use it in blobimport. It will make derivation slower, but I'd
rather make it slower than incorrect.

Reviewed By: farnz

Differential Revision: D21905619

fbshipit-source-id: c0227df195a8cf4482b2452ca928acbc5750b3e5
2020-06-10 19:29:31 -07:00
Stanislau Hlebik
14e4d561b3 mononoke: add_filenodes now return FilenodeResult and removes TODO
Summary:
This diff migrates add_filenodes method to return FilenodeResult.
That means that all filenodes methods now return FilenodeResult and it's time
now to remove TODOs from derived_data filenodes.

Note that I had to change the test "derive_disabled_filenodes" a bit.
Previously FilenodesOnlyPublic::mapping::get() method immediately returned
FilenodesOnlyPublic::Disabled, while now it returns None if hg changeset is not
derived. This is an expected change in behaviour, so I just updated the test to
try to derive FilenodesOnlyPublic first, which in turns triggers generation of hg changeset.

Reviewed By: ahornby

Differential Revision: D21904401

fbshipit-source-id: f6f4cd14e6cdce5a4b95d8f3f9acff305ae6fa88
2020-06-10 19:29:29 -07:00
Stanislau Hlebik
c34cfd9bbf mononoke: make get_filenode() return FilenodeResult
Summary:
Similar to get_all_filenodes_maybe_stale() make this method return
FilenodeResult if filenodes are disabled.

Note: this diff adds one TODO in fetch_root_filenode, which will be removed
together with other TODOs in the next diff.

Reviewed By: ahornby

Differential Revision: D21904399

fbshipit-source-id: 1569579699c02eb07021f8143aa652aa192d23bc
2020-06-10 19:29:28 -07:00
David Tolnay
cf412e0d6b rustfmt: Use use_try_shorthand default
Summary:
I observed that for whatever reason our setting of `use_try_shorthand = true` in rustfmt.toml was causing entire functions to not get processed by rustfmt. Even files that contain neither `try` nor `?`. Remove it and reformat fbsource.

Documentation of that config:

- https://github.com/rust-lang/rustfmt/blob/master/Configurations.md#use_try_shorthand

We don't particularly care about the value anymore because nobody writes `r#try!(...)` in 2018 edition code.

Minimized:

```
fn f() {
    g(
        )
        // ...
        .h
}
```

This function gets formatted only if use_try_shorthand is not set.

The bug is fixed in the rustfmt 2.0 release candidate.

Reviewed By: jsgf

Differential Revision: D21878162

fbshipit-source-id: b028673c0eb703984d24bf0d2983453fc2a8c212
2020-06-10 19:29:15 -07:00
Stanislau Hlebik
a4b21e589f mononoke: return dummy FilenodesOnlyPublic value if filenodes are disabled
Summary:
In the next diffs we'll make it possible to disable filenodes in Mononoke. See
D21787848 and attached task for more details, but TL;DR is that if xdb is down
we still want to serve "hg update" traffic.

If filenodes are disabled we obviously can't generate filenodes for new
commits. So one option would be to just return an error from
FilenodesOnlyPublic::derive(...) call. But that would mean that any attempt to
call derivation would fail, and e.g. Mononoke servers won't be able to start up
- (see https://fburl.com/diffusion/roau028d). We could change callers to always
process errors from FilenodesOnlyPublic, but I think it would be harder to
enforce and easier to forget.

So this diff changes FilenodesOnlyPublic to be an enum, and
FilenodesOnlyPublic::Disabled is returned immediately if filenodes are
disabled. For callers it means that they can't rely on filenodes being present
in db even after FilenodesOnlyPublic were derived. That's the whole of the
stack, and the next diffs will update the callers to properly deal with missing
filenodes.

One caveat is that when we re-enable filenodes back we might need to derive
them for a lot of commits.
I don't expect it to happen often (i.e. if xdb is down then we probably can't
commit anyway), but if somehow it happened, then we should be a bit more
careful with re-enabling them after the problem was fixed. For example, we can
first derive all the filenodes locally by e.g. running backfill_derived_data,
and only after that has finished successfully we can re-enable them.

Reviewed By: krallin

Differential Revision: D21840328

fbshipit-source-id: ce9594d4a21110a5cb392c3049ccaede064c1e66
2020-06-03 04:21:23 -07:00
Stanislau Hlebik
a47388f536 mononoke: remove DefferedDerivedMapping
Summary:
DefferedDerivedMapping was added so that we can make deriving stack of commits faster - it does it by postponing updating
derived data mapping (e.g. writing to a blobstore) until the whole stack is derived.
While it probably makes derivation a bit faster, we now think it's better to remove it. A few reasons:

1) It's confusing to understand and it already caused us ubns before
2) It's increases write amplification - because we release the lease before we wrote to a blobstore, writers will try to rederive the same commit a few times. That has caused us a ubn today

Reviewed By: farnz

Differential Revision: D20113854

fbshipit-source-id: 169e05febcd382334bf4da209a20aace0b7c2333
2020-06-02 01:46:08 -07:00
Stanislau Hlebik
fde6230ea2 RFC: introduce FilenodeResult
Summary:
The motivation for the whole stack:
At the moment if mysql is down then Mononoke is down as well, both for writes
and for reads. However we can relatively easily improve the situation.
During hg update client sends getpack() requests to fetch files, and currently
for each file fetch we also fetch file's linknode. However hg client knows how
to deal with null linknodes [1], so when mysql is unavailable we can disable
filenode fetching completely and just return null linknodes. So the goal of this stack is to
add a knob (i.e. a tunable) that can turn things filenode fetches on and off, and make
sure the rest of the code deals nicely with this situation.

Now, about this diff. In order to force callers to deal with the fact that
filenodes might unavailable I suggest to add a special type of result, which (in
later diffs) will be returned by every filenodes methods.

This diff just introduces the FilenodeResult and convert BlobRepo filenode
methods to return it. The reason why I converted BlobRepo methods first
is to limit the scope of changes but at the same time show how the callers' code will look
like after FilenodeResult is introduced, and get people's thoughts of whether
it's reasonable or not.

Another important change I'd like to introduce in the next diffs is modifying FilenodesOnlyPublic
derived data to return success if filenodes knob is off. If we don't do that
then any attempt to derive filenodes might fail which in turn would lead to the
same problem we have right now - people won't be able to do hg update/hg
pull/etc if mysql is down.

[1] null linknodes might make some client side operation slower (e.g. hg rebase/log/blame),
so we should use it only in sev-like situations

Reviewed By: krallin

Differential Revision: D21787848

fbshipit-source-id: ad48d5556e995af09295fa43118ff8e3c2b0e53e
2020-06-01 05:27:34 -07:00
Stanislau Hlebik
7ccd28de14 scs_server: add track_history_across_deletions parameter
Reviewed By: krallin

Differential Revision: D21718738

fbshipit-source-id: b7dd7e05b1773ac1a442d89991549f0f97e1e55b
2020-05-26 05:38:55 -07:00
Stanislau Hlebik
a5eaa551e8 mononoke: support history across deletions
Summary:
now fastlog will be able to show a history of file like that:

```
Added
|
Deleted
|
Added
```

We do it by using deleted file manifest to find where a file was deleted. We do
it in case we find a unode which doesn't have a parent.

Differential Revision: D21664248

fbshipit-source-id: 8e80fccc87361e66e4e00ae2da3671ed8327f1f8
2020-05-21 02:46:21 -07:00
Stanislau Hlebik
090d543bf7 mononoke: avoid one special case in fastlog
Summary:
... or rather make this special case a bit smaller.

In fastlog we have a `prefetch` which was already returned to the caller,
however its parents hasn't been processed yet. Previously we had two separate
places where these parents were processed, now we have only one.

This will make it easier to add support for history across deletions in the
next diff.

Differential Revision: D21664246

fbshipit-source-id: a5a109fafaeec28a0e7a36ffa44545aebc032b00
2020-05-21 02:46:21 -07:00
Stanislau Hlebik
8505bf3772 mononoke: move functionality to fetch linknodes in a separate function
Summary: it will be used in the next diffs in the stack

Differential Revision: D21664247

fbshipit-source-id: 6ebdba72b1e8382940a91f5fbec414e7bcc8160b
2020-05-20 12:26:16 -07:00
Stanislau Hlebik
ca44297680 fastlog: create visited one level up
Summary:
Previously we created `visited` and immediately drop it.
Let's instead create it one level up. I don't think it'd matter in practice,
but it should be cleaner

Reviewed By: HarveyHunt

Differential Revision: D21662456

fbshipit-source-id: 36823309ea09448c5be10c2655ea0f317649f290
2020-05-20 12:26:16 -07:00
Stefan Filip
60966c93e7 autocargo: regenerate
Summary: maintenance

Reviewed By: StanislavGlebik

Differential Revision: D21640322

fbshipit-source-id: d0b2ce604735c05d540d06835c8e4c8a940fbf5c
2020-05-19 16:08:40 -07:00
Stanislau Hlebik
5f8ab2526c mononoke: make sure commit is regenerated when backfill_derived_data single is
Summary:
subcommand_single calls `derived_data_utils.regenerate(vec![cs_id])` with the
intention that derived data for this commit will be regenerated. However
previously it didn't work because DerivedDataUtils::derive() was ignoring
regenerate parameter. This diff fixes it.

Reviewed By: krallin

Differential Revision: D21527344

fbshipit-source-id: 56d93135071a7f3789262b7a9d9ad84a0896c895
2020-05-13 03:27:46 -07:00
Stanislau Hlebik
50b71ac322 mononoke: log the oldest underived ancestor
Summary:
This diff logs the delay in deriving data. In particular it logs how much time
has left since an underived commit was created.

Note that this code makes an assumption about monotonic dates - for repos with pushrebase
repos that should be the case.

Reviewed By: krallin

Differential Revision: D21427265

fbshipit-source-id: bfddf594467dfd2424f711f895275fb54a4e1c60
2020-05-08 07:47:19 -07:00
Stanislau Hlebik
849f30d5dd mononoke: move topo_sort to find_underived
Summary: Small refactoring that will make the next diffs easier

Differential Revision: D21426166

fbshipit-source-id: f3c3ae00794046828eaf3c0912dbabc233c97e77
2020-05-08 07:47:18 -07:00
Stanislau Hlebik
d3fd4f09a2 mononoke: renamed futures to futures-old
Reviewed By: krallin

Differential Revision: D21425900

fbshipit-source-id: 121bf1d9e5527bd6f035f1f7a506ad0f75de68fc
2020-05-06 07:55:04 -07:00
Stanislau Hlebik
1b9bf79c69 mononoke: asyncify is_derived
Reviewed By: krallin

Differential Revision: D21425882

fbshipit-source-id: 9c3969da679339e13818928a0c8de06f6ae78958
2020-05-06 07:55:04 -07:00
Stanislau Hlebik
043887c13e mononoke: asyncify find_underived
Reviewed By: krallin

Differential Revision: D21425800

fbshipit-source-id: e445bb8e10fd8c41b0322cbe4de5f6fd0a4e924c
2020-05-06 07:55:03 -07:00
Mistral Orhan Jean-Pierre Contrastin
5fe820dd06 Expose ctime from Blobstore::get() in mononoke
Summary:
- Change get return value for `Blobstore` from `BlobstoreBytes` to `BlobstoreGetData` which include `ctime` metadata
- Update the call sites and tests broken due to this change
- Change `ScrubHandler::on_repair` to accept metadata and log ctime
- `Fileblob` and `Manifoldblob` attach the ctime metadata
- Tests for fileblob in `mononoke:blobstore-test` and integration test `test-walker-scrub-blobstore.t`
- Make cachelib based caching use `BlobstoreGetData`

Reviewed By: ahornby

Differential Revision: D21094023

fbshipit-source-id: dc597e888eac2098c0e50d06e80ee180b4f3e069
2020-05-06 00:55:07 -07:00
Thomas Orozco
9ac8e0505b mononoke: update various error enums to use #[source]
Summary:
We have a number of error enums that wrap an existing errors, but fail to
register the underlying error as a `#[source]`. This results in truncated
context chains when we print the error. This fixes that. It also removes a
bunch of manual `From` implementation that can be provided by thiserror's
`#[from]`.

This also required updating the `Display` implementation for those errors. I've
opted for not printing the underlying error, since the context chain will
include it. This does mean that if we print one of those errors without the
context chain (i.e. `{}` as opposed to `{:#}` or `{:?}`), then we'll lose out a
bit of context. That said, this should be OK, as we really shouldn't ever being
do this, because we'd be missing the rest of the chain anyways.

Reviewed By: StanislavGlebik

Differential Revision: D21399490

fbshipit-source-id: a970a7ef0a9404e51ea3b59d783ceb7bf33f7328
2020-05-05 05:44:52 -07:00
Thomas Orozco
fd6b1d4ec6 common/rust/failure_ext: get rid of error chain
Summary:
This removes our own (Mononoke's) implementation of failure chains, and instead
replaces them with usage of Anyhow. This doesn't appear to be used anywhere
besides Mononoke.

The historical motivation for failure chains was to make context introspectable
back when we were using Failure. However, we're not using Failure anymore, and
Anyhow does that out of the box with its `context` method, which you can
downcast to the original error or any of the context instances:

https://docs.rs/anyhow/1.0.28/anyhow/trait.Context.html#effect-on-downcasting

Reviewed By: StanislavGlebik

Differential Revision: D21384015

fbshipit-source-id: 1dc08b4b38edf8f9a2c69a1e1572d385c7063dbe
2020-05-05 05:44:52 -07:00
Aida Getoeva
2009db293b mononoke: use deleted manifest for merged file history
Summary:
Merge react and userprofile_shell into master, rewrite profile to use react and revert changes to p page

```

Reviewed By: markbt

Differential Revision: D21285297

fbshipit-source-id: f60006738af068982a25fa368fd260ba5b7f3781
2020-04-29 15:21:59 -07:00
Aida Getoeva
ce373ec652 mononoke: derive deleted files manifest for merge commits
Summary:
Design doc for the Deleted Files Manifest derived data structure - https://fb.quip.com/Wk4XAi7dyXRr.

**This diff adds derivation logic for merge commits.**

For linear case each deleted manifest node has an optional liknode, that points to the changeset id where the path was deleted. If the liknode is None it means that the path is or was a directory and its subentries were deleted, but the path itself still exists.

The idea for merge commits is that paths deleted in merges now will  have liknodes pointing to the merge commit. So if the path was deleted in both branches of the merge commit, then deleted manifest of the merge commit will have a linknode for the path pointing to the merge commit itself. (For the examples see the test cases below.)

**To generate file history for the deleted file:**

If there is no unode for the given commit and path:
 * search for the path in the Deleted files manifest
 * if there is no node, there was no such path in the repo
 * from the manifest's liknode get the changeset id where the file was deleted
 * get the changeset's parents
 * for each parent try to find unode, if there is unode, then traverse history
 * otherwise check deleted manifest again (going back to the beginning)

This logic is going to be implemented in another diff.

Reviewed By: StanislavGlebik

Differential Revision: D21182002

fbshipit-source-id: 4dfa5ae0dab20a51d78e75c0fafb624f1a652fff
2020-04-27 18:23:45 -07:00
Aida Getoeva
65cd75c65d mononoke: asyncify part of the deleted manifest derivation
Summary: In this diff I moved some of the deleted files manifest derivation methods from the old futures to async/await. This allows me to introduce changes into the derivation logic using new standard futures.

Reviewed By: StanislavGlebik

Differential Revision: D21177622

fbshipit-source-id: fd5920cce11095885f167dad89ee974308501025
2020-04-27 18:23:44 -07:00
Mark Thomas
fc436fd46f deleted_files_manifest: deduplicate writes for manifest blobs
Summary:
When generating the deleted file manifest, there may be many duplicates of entries, particularly for deleted leaf files, which will always have the same manifest and thus the same manifest ID.

To prevent overloading of the blobstore with too many writes for the same key, deduplicate the writes before sending them.

Differential Revision: D21257640

fbshipit-source-id: 8f367496109b3e2b0a802fed96747cc290b295ec
2020-04-27 10:55:46 -07:00
Stanislau Hlebik
403347ee10 mononoke: add dry-run mode for backfilling fsnodes
Summary:
This diff adds a special dry-run mode of backfilling (for now only fsnodes are
supported). It does by keeping all derived data in memory (i.e. nothing is
written to blobstore) and periodically cleaning entries that can no longer
be referenced.

This mode can be useful to e.g. estimate size of derived data before actually
running the derivation.

Note that it requires --readonly-storage in order to make sure that we don't
accidentally write anything to e.g. mysql.

Reviewed By: ahornby

Differential Revision: D21088989

fbshipit-source-id: aeb299d5dd90a7da1e06a6be0b6d64b814bc7bde
2020-04-24 04:05:53 -07:00
Stanislau Hlebik
57c127089c mononoke: remove trace uploading while deriving data
Reviewed By: krallin

Differential Revision: D21173488

fbshipit-source-id: 44421bc95d30f9f94fd5072ac3122a6211056d24
2020-04-22 01:41:05 -07:00
Aida Getoeva
b1d61b0073 mononoke: asyncify fastlog/ops tests
Summary: Moved tests to the new mechanism of creating changesets, refactored and asyncified tests

Reviewed By: farnz

Differential Revision: D21050192

fbshipit-source-id: d97b1cf0ab92aecc2c35d95c8f9331cead906867
2020-04-20 12:55:33 -07:00
Aida Getoeva
fc24741435 mononoke: asyncify the rest of fastlog/ops
Reviewed By: StanislavGlebik

Differential Revision: D21050191

fbshipit-source-id: e6dcc80b3b71b68e937f8ae8d0dc5be662cbe0f1
2020-04-20 12:55:33 -07:00
Stanislau Hlebik
6ca292ef57 mononoke: add derive_fsnode_in_batch
Summary:
This adds a special mode of deriving fsnodes that will be used in
backfill_derived_data. From my experiments it looks like it got 5-10 times
faster.

I tried to explain how it works in the comments - lmk if that's not enough.

Reviewed By: mitrandir77

Differential Revision: D21067817

fbshipit-source-id: ff72a079754a2c15f65852c28d80f723961b53c4
2020-04-20 06:57:40 -07:00
Gabriel Russo
03d4e52ab3 Bump tokio to 0.2.13
Summary:
This is needed because the tonic crate (see the diff stack) relies on tokio ^0.2.13

We can't go to a newer version because a bug that affects mononoke was introduced on 0.2.14 (discussion started on T65261126). The issue was reported upstream https://github.com/tokio-rs/tokio/issues/2390

This diff simply changed the version number on `fbsource/third-party/rust/Cargo.toml` and ran `fbsource/third-party/rust/reindeer/vendor`.

Also ran `buck run //common/rust/cargo_from_buck:cargo_from_buck` to fix the tokio version on generated cargo files

Reviewed By: krallin

Differential Revision: D21043344

fbshipit-source-id: e61797317a581aa87a8a54e9e2ae22655f22fb97
2020-04-15 12:18:00 -07:00
Stanislau Hlebik
d3ec8dd0f3 mononoke: add batch_derived() method
Summary:
A new method on BonsaiDerived trait that derives data for a batch of commits.
Default implementation just derives them in parallel, so it's not particularly
useful. However it might be overriden if a particular derived data has a more
efficinet way of deriving a batch of commits

Reviewed By: farnz

Differential Revision: D21039983

fbshipit-source-id: 3c6a7eaa682f5eaf6b8a768ca61d6f8a8f1258a7
2020-04-15 08:59:10 -07:00
Stanislau Hlebik
584728bd56 mononoke: warmup content metadata for fsnodes
Summary: It makes it backfill a great deal faster

Reviewed By: krallin

Differential Revision: D21040292

fbshipit-source-id: f6d06cbc76e710b4812f15e85eba73b24cdbbd3e
2020-04-15 08:21:28 -07:00
Aida Getoeva
25eff1c91e mononoke/scs-log: integrate deleted manifest (linear)
Summary:
Use deleted manifest to search deleted paths in the repos with linear history. For merged history it returns error as there was no such path.
Commit, where the path was deleted, is returned as a first commit in the history stream, the rest is a history before deletion.

Reviewed By: StanislavGlebik

Differential Revision: D20897083

fbshipit-source-id: e75e53f93f0ca27b51696f416b313466b9abcee8
2020-04-14 18:27:39 -07:00
Jeremy Fitzhardinge
28830035dd rust: regenerate autocargo for tokio rollback
Reviewed By: dtolnay

Differential Revision: D20956714

fbshipit-source-id: f13256350cc7082543c7b69231a783b262f8a4d8
2020-04-10 01:12:57 -07:00
Aida Getoeva
2df76d79c8 mononoke/scs-log: add history stream terminator
Summary:
`log_v2` supports time-filters and that means it needs to be able to drop the history stream if the commits got older than the given time frame. (if not it just traverses the whole history...)
However, it cannot be done from the SCS commit_path API or from changeset_path, because they already receive history stream where commits are not ordered by creation time. And the naive solution "if next commit in the stream is older than `after_ts` then drop" won't work: there might be another branch (commit after the current one) which is still **in** the time frame.

I added a terminator-function to the `list_file_history` that is called on changeset id, for which a new fastlog batch is going to be fetched. If terminator returns true, then the fastlog is not fetched and the current history branch is dropped. All ready nodes are still streamed.
```
For example, if we have a history of the file changes like this:

      A 03/03    ^|
      |           |
      B 02/03     |
      |           |  - one fastlog batch
      C 01/03     |
      | \         |
02/01 D  E 10/02 _|  - let assume, that fastlog batches for D and E ancestors are needed to prefetch
      |  |
01/01 F  G 05/02

# Example 1

We query "history from A after time 01/02"

The old version would fetch all the commits and then filter them in `commit_path`. We would fetch both fastlog batches for the D branch and E branch.

With the terminator, `list_file_history` will call terminator on commit D and get `true` in return and then will drop the D branch,
then it will call terminator on E and get `false` and proceed with fetching fastlog for the E branch.

# Example 2

We query "history from A after time 01/04"

The old version would fetch all the commits and then filter them in `commit_path`, despite the fact that
the very first commit is already older than needed.

With the terminator it will call terminator on A and get `true` and won't proceed any further.

Reviewed By: StanislavGlebik

Differential Revision: D20801029

fbshipit-source-id: e637dcfb6fddceb4a8cfc29d08b427413bf42e79
2020-04-07 07:08:24 -07:00
Aida Getoeva
2dcfcbac62 mononoke/fastlog: asyncify part of ops
Summary: Asyncified main functions of the fastlog/ops, so it'd be easier to modify them and proceed with the new features.

Reviewed By: StanislavGlebik

Differential Revision: D20801028

fbshipit-source-id: 2a03eedca776c6e1048a72c7bd613a6ef38c5c17
2020-04-07 07:08:24 -07:00
Aida Getoeva
514a72e835 mononoke/fastlog: yield all ready nodes
Summary:
`list_file_history` implements BFS on the commit graph and returns a stream of changeset ids using `bounded_traversal_stream`.
The old version iterated BFS "levels" and each iteration streamed all nodes on the current level. For example, for the commit graph:
```
1 <- start # 1 level
|
2          # 2
| \
3  4       # 3
|  |
```
there would be 3 iterations and on each nodes would be yielded: [1], [2], [3, 4]. If there was need to prefetch fastlog batch or batches, it prefetched in parallel batch/batches for changesets on the same level.

The implementation was a bit hacky and it is a bit unfortunate that we need to make 100 iterations to stream changesets that are ready and do not require fetching fastlog. I also needed some simplification so I could then add a terminator function (3rd diff in the stack) on the fastlog batch prefetching stage (and add Deleted Manifest intefration).
So now the `bounded_traversal_stream` keeps a BFS-queue as a state and on each iteration streams all nodes in the queue until it hits the node for which it needs prefetch fastlog batch and goes to the next iteration.
```
state - [queue, prefetch_cs_id]
* on each iteration:
  1. If prefetch_cs_id.is_some() =>
        - fetch fastlog batch for prefetch_cs_id
        - fill the commit graph
        - add parents of the prefetch_cs_id to the bfs queue
  2. Get from the queue all nodes until we meet changeset without fetched parents.
      Mark this node as `prefetch_cs_id` for the next iteration.
  3. Stream ready nodes and go to the next iteration.
```
Thus
- we still fetch fastlog batches on demand and not before we really need them
- if we have 100 ready to be yield commits in the queue, we won't do 100 iterations and stream them in one go
- now if we need prefetch fastlog batches for 2 branches on the same "bfs level" we will do it one by one and not in parallel, but this situation is pretty uncommon
- code is simpler and allows to integrate Deleted Manifest and add terminator function.

Reviewed By: StanislavGlebik

Differential Revision: D20768401

fbshipit-source-id: cdba40539a842b3628826f6c72a29514da0d539e
2020-04-06 21:13:08 -07:00
David Tolnay
1a86366f0e third-party/rust: Turn off async-trait/support_old_nightly
Summary:
This diff turns off the support_old_nightly feature of async-trait (https://github.com/dtolnay/async-trait/blob/0.1.24/Cargo.toml#L28-L32) everywhere in fbcode. I am getting ready to remove the feature upstream. It was an alternative implementation of async-trait that produces worse error messages but supports some older toolchains dating back to before stabilization of async/await that the default implementation does not support.

This diff includes updating async-trait from 0.1.24 to 0.1.29 to pull in fixes for some patterns that used to work in the support_old_nightly implementation but not the default implementation.

Differential Revision: D20805832

fbshipit-source-id: cd34ce55b419b5408f4f7efb4377c777209e4a6d
2020-04-02 17:01:24 -07:00
Lukas Piatkowski
bf34f084d0 mononoke: make blobrepo and its dependencies OSS buildable
Reviewed By: markbt

Differential Revision: D20495840

fbshipit-source-id: 3bbefae1923dc84e3daea158a24c0d2a802cc9a9
2020-03-31 04:02:45 -07:00
Steven Troxler
41378d4f37 Deprecate rust-crypto in eden/monanoke/derived_data
Summary:
The rust-crypto library appears to be unmaintained, switching
 - `crypto::digest::Digest` to `digest::Digest`
 - `crypto::sha1::Sha1` to `sha1::Sha1`
 - `crypto::sha2::Sha256` to `sha2::Sha256`

Reviewed By: jsgf

Differential Revision: D20456962

fbshipit-source-id: 2e3406dedba05245265d96b480c35ba2421aa3fd
2020-03-19 19:00:06 -07:00