Commit Graph

682 Commits

Thomas Orozco
8f6f3b8834 mononoke/admin: asyncify the hg_sync subcommand
Summary:
This code had grown into a pretty big monster of a future. This makes it a bit
easier to modify and work with.

Reviewed By: StanislavGlebik

Differential Revision: D21227210

fbshipit-source-id: 5982daac4d77d60428e80dc6a028cb838e6fade0
2020-04-27 07:24:41 -07:00
Kostia Balytskyi
0f5e91a1d6 commit_validator: fix using wrong repos for noop-filenode checks
Summary:
Current implementation of validator incorrectly asks large repos to operate on
small repo filenodes and vice versa.

In this diff I both fix this issue and implement a bit more reliance on the
newtypes, so that it makes more sense to the compiler.

Reviewed By: farnz

Differential Revision: D21250880

fbshipit-source-id: 1b325dca2218aa1ec7fbc0c6d0654e75ca3bffe3
2020-04-27 02:19:12 -07:00
Alex Hornby
872e95b8da mononoke: walker: avoid duplicate fsnode visits
Summary: Avoid duplicate fsnode visits by adding state tracking for them.

Reviewed By: StanislavGlebik

Differential Revision: D21250279

fbshipit-source-id: c37f7c5f20b07a15a809171f91bd12649f4f818d
2020-04-27 01:06:59 -07:00
Kostia Balytskyi
842cc18863 remove old comment and attribute
Reviewed By: farnz

Differential Revision: D21245424

fbshipit-source-id: a58d1f451341f374734b4518a2ed465a60809f0b
2020-04-25 13:55:43 -07:00
Stanislau Hlebik
b28b879846 mononoke: small refactoring before introducing Cleaner for unodes
Summary:
In the next diffs I'd like to introduce cleaner for unodes. This diff just
moves a bunch of code around to make reviewing next diffs easier

Reviewed By: krallin

Differential Revision: D21226921

fbshipit-source-id: c9f9b37bf9b11f36f8fc070dfa293fd8e6025338
2020-04-24 10:52:58 -07:00
Alex Hornby
bd57f89319 mononoke: walker remove ResolvedNode struct
Summary: Small simplification: remove ResolvedNode struct as it is OutgoingEdge plus a NodeData

Reviewed By: krallin

Differential Revision: D21134360

fbshipit-source-id: 5fdf7ccb176263bf923b0ab60f0cadb2aa4ccd43
2020-04-24 08:04:01 -07:00
Alex Hornby
b3353518ea mononoke: walker: simplify PathTrackingRoute evolution
Summary: Remove some duplication in the walker path sampling logic by moving PathTrackingRoute evolution up to the struct itself

Reviewed By: krallin

Differential Revision: D20996285

fbshipit-source-id: 165639a3a1608c0b48dc6a7d5ae261613bc90995
2020-04-24 08:04:01 -07:00
Stanislau Hlebik
31f5b6dbcb mononoke: remove unused impls
Reviewed By: krallin

Differential Revision: D21227405

fbshipit-source-id: 3579c0bb390353cabf7bd8a36d5f3a6d92a60e48
2020-04-24 05:27:10 -07:00
Lukas Piatkowski
22a9d3e1cc mononoke/configerator structs: add shipit and autocargo configs for all configerator structs
Summary: The configerator structs are used in many top level functions in Mononoke and are required in order to build all the code on GitHub.

Reviewed By: ahornby

Differential Revision: D21130546

fbshipit-source-id: 7f17d92173f5ecf7c3406ae4202359a0db8df84a
2020-04-24 04:33:53 -07:00
Alex Hornby
378559fb29 mononoke: walker: add fsnodes derivation to test blobimport
Summary:
Add --derived-data-type=fsnodes to blobimport in a couple of walker tests so we have test data present to load.

Includes a small change to library.sh to add the default_setup_pre_blobimport entry point used by these tests.

Reviewed By: StanislavGlebik

Differential Revision: D21202480

fbshipit-source-id: d7eb3e5736531a11da87d92d0d03a528ff2c91a7
2020-04-24 04:29:52 -07:00
Stanislau Hlebik
e10e5349e7 mononoke: remove boilerplate with auto_impl
Reviewed By: krallin

Differential Revision: D21207457

fbshipit-source-id: 32a9afe4eb4214ffa88d7ef756112e7e9033337e
2020-04-24 04:09:30 -07:00
Stanislau Hlebik
403347ee10 mononoke: add dry-run mode for backfilling fsnodes
Summary:
This diff adds a special dry-run mode of backfilling (for now only fsnodes are
supported). It does so by keeping all derived data in memory (i.e. nothing is
written to blobstore) and periodically cleaning entries that can no longer
be referenced.

This mode can be useful to e.g. estimate the size of derived data before actually
running the derivation.

Note that it requires --readonly-storage in order to make sure that we don't
accidentally write anything to e.g. mysql.

Reviewed By: ahornby

Differential Revision: D21088989

fbshipit-source-id: aeb299d5dd90a7da1e06a6be0b6d64b814bc7bde
2020-04-24 04:05:53 -07:00
Xavier Deguillard
413d2b3aba remotefilelog: enable uploading LFS blobs
Summary:
This adds the proper hooks in the right place to upload the LFS blobs and write
to the bundle as LFS pointers. That last part is a bit hacky as we're writing
the pointer manually, but until that code is fully Rust, I don't really see a
good way of doing it.

Reviewed By: DurhamG

Differential Revision: D20843139

fbshipit-source-id: f2ef7b045c6604398b89580b468c354d14de1660
2020-04-23 14:00:23 -07:00
Arun Kulshreshtha
43eae85091 gotham_ext: add into_handler_response method to HttpError
Summary: Add a convenience method to HttpError to return a JSON-formatted representation of the error to the client.

Reviewed By: krallin

Differential Revision: D21193939

fbshipit-source-id: e1ff1555b0016f46dbcd1847239d96daf8b45685
2020-04-23 13:58:04 -07:00
Arun Kulshreshtha
631eba23ce lfs_protocol: move LFS MIME type into protocol crate
Summary: In the process of factoring out generally useful parts of the LFS server to `gotham_ext`, it seemed like the `lfs_protocol` crate was the logical place for the Git LFS MIME type constant to live. In addition to moving it, this diff swaps out the deprecated `lazy_static` crate with `once_cell`.
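As an illustration of the `lazy_static` → `once_cell` swap described above (the constant name and exact definition here are assumptions, not the crate's real code):

```
use mime::Mime;
use once_cell::sync::Lazy;

// Hypothetical constant: the Git LFS content type, parsed once on first access.
pub static GIT_LFS_MIME: Lazy<Mime> = Lazy::new(|| {
    "application/vnd.git-lfs+json"
        .parse()
        .expect("valid MIME type")
});
```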

Reviewed By: krallin

Differential Revision: D21193938

fbshipit-source-id: 81dc23e8f37a6c0a45ae44443807e5e21214bcd5
2020-04-23 13:58:04 -07:00
Arun Kulshreshtha
86fc9a4fdb gotham_ext: move body types into gotham_ext
Summary: Move the various body types out of the LFS server into `gotham_ext`. `StreamBody` was intentionally left behind for now since it contains some LFS-specific logic that would need to be factored out before it can be moved.

Reviewed By: krallin

Differential Revision: D21193941

fbshipit-source-id: 638b9e93e9dc7385f7fde9dbb3a2392ad0e18385
2020-04-23 13:58:03 -07:00
Arun Kulshreshtha
ff0ab62e33 gotham_ext: move SignalStream into gotham_ext
Summary: Move `SignalStream` out of the LFS server into `gotham_ext`. This is a step towards extracting all of the functionality needed to support streaming bodies in `gotham_ext`.

Reviewed By: krallin

Differential Revision: D21193940

fbshipit-source-id: 832a5254c80e4ee085ece371b45b38a4519403f3
2020-04-23 13:58:03 -07:00
Arun Kulshreshtha
25a3cfe0b5 gotham_ext: move HttpError into gotham_ext
Summary: Move the `HttpError` type out of the LFS server into `gotham_ext` so it can be used by the EdenAPI server too.

Reviewed By: krallin

Differential Revision: D21193937

fbshipit-source-id: dff59e3ae995fe5771db47174a96e31b2c9f4c73
2020-04-23 13:58:03 -07:00
Mark Thomas
1135339320 mutationstore: add logging and perf counters
Summary:
Add debug logging and perf counters for the number of mutation entries stored
during `add_entries`, and the number of mutation entries fetched during
`all_predecessors`.

Reviewed By: StanislavGlebik

Differential Revision: D21065934

fbshipit-source-id: 9b2ff9720116e6a168706f994655daffb18d0ffc
2020-04-23 08:58:09 -07:00
Mark Thomas
0dc75881d6 mutationstore: add add_entries method
Summary:
This commit adds the `add_entries` method to the mutation store, which
allows Mononoke to add new entries to the store for a given set of
commits.

It is expected that the client will provide all of the mutation entries related
to the commits it is sending.  This may be too much information, as in normal
operation we will already know about the predecessors.  In that case we can
avoid additional work by just storing the entries directly related to the new
commits.

If the client has made commits while offline, then some mutation entries may
refer to predecessors that are not known locally.  In this case we must search
through the mutation history looking for the last commits we did know about,
and add all the in-between entries, too.

Reviewed By: StanislavGlebik

Differential Revision: D20287383

fbshipit-source-id: e5fb42bc4da7873c3a5aafd83684d374d9155bca
2020-04-23 08:58:09 -07:00
Mark Thomas
6f8737d116 mutationstore: a store for commit mutation information
Summary:
Add the Mononoke Mercurial mutation store.  This stores mutation information
for draft commits so that it can be shared between clients. The mutation
entries themselves are stored in a database, and the mutation store provides
abstractions for adding and querying them.

This commit adds the `all_predecessors` method to the mutation store, which
allows Mononoke to fetch all predecessors for a given set of commits.  This
will be used to serve mutation information to clients who are pulling draft
commits.

Reviewed By: krallin

Differential Revision: D20287381

fbshipit-source-id: b4455514cb8a22bef2b9bf0229db87c2a0404448
2020-04-23 08:58:09 -07:00
Mark Thomas
4ceb020d3a log: improve default log output
Summary:
Make the default output for `scsc log` shorter by only including the first line of the commit message, and omitting less interesting fields like commit extras.

The full details are hidden behind a `--verbose` flag, similar to `hg log`.

Reviewed By: mitrandir77

Differential Revision: D21202318

fbshipit-source-id: f15a0f8737f17e3189ea1bbe282d78a9c7199dd9
2020-04-23 08:36:43 -07:00
Harvey Hunt
b24074ac35 mononoke: Add get_config_path helper to cmdlib
Summary:
Multiple functions in cmdlib were looking for
`"mononoke-config-path"`. Make it into a constant and provide a helper function
to reduce duplication.

Further, update `read_common_config` to accept `impl AsRef<Path>` to make
calling the function easier.

Reviewed By: farnz

Differential Revision: D21202528

fbshipit-source-id: 96cad817ed47be0f207965ad2bc33af13ca8b5fd
2020-04-23 07:43:37 -07:00
Mark Thomas
0518d6a0fb scs_server: add commit_history to return the history of commits
Summary:
Add the `commit_history` method to the source control service.

This methods returns the history of the commit, directly from the changelog.

This differs from `commit_path_history` in that it includes all commits, whereas `commit_path_history` will skip commits that do not contain changes that affect the path.

Differential Revision: D21201705

fbshipit-source-id: dbf1f446c106620620343122176eccd5d809779c
2020-04-23 07:14:49 -07:00
Mark Thomas
6424064215 mononoke_api: add changeset history
Summary:
Add a method to get the history of a changeset.  This differs from the history
of a changeset path, even the history of the root directory, in that all
changesets are included, even empty ones.

Differential Revision: D21179877

fbshipit-source-id: e19aac75fc40d8e9a3beb134e16a8cdfe882b791
2020-04-23 07:14:49 -07:00
Mark Thomas
d878593d3d mononoke_api: add test for commit_path_history
Summary: Add a new unit test for `commit_path_history`.  We will use this test to contrast against `commit_history`.

Differential Revision: D21179878

fbshipit-source-id: 318aa34d8d80f61c1e52d4053d5aead5a71e864c
2020-04-23 07:14:49 -07:00
Lukas Piatkowski
8bba936e5f mononoke/permission_checker: introduce MembershipChecker and its first usage in hooks
Summary: The new MembershipChecker and PermissionChecker traits will generalize access to various permission/acl systems (like LDAP) and leave the implementation details hidden behind an object trait.

Reviewed By: StanislavGlebik

Differential Revision: D21067811

fbshipit-source-id: 3bccd931f8acdb6c1e0cff4cb71917c9711b590b
2020-04-23 03:44:09 -07:00
Stanislau Hlebik
dcf66ebc11 mononoke: add walker for fsnodes
Summary: Make it possible to traverse fsnodes in walker.

Reviewed By: ahornby

Differential Revision: D21153883

fbshipit-source-id: 047ab73466f48048a34cb52e7e0f6d04cda3143b
2020-04-23 01:24:20 -07:00
Stanislau Hlebik
2a5cdfec02 mononoke: split warmup from backfill_derived_data
Summary: File is getting too large - let's split it

Reviewed By: farnz

Differential Revision: D21180807

fbshipit-source-id: 43f0af8e17ed9354a575b8f4dac6a9fe888e8b6f
2020-04-23 00:16:30 -07:00
Johannes Schmid
13b7123a82 Emit error when encountering invalid parts in parse_node
Summary: Certain node types are not to be followed by any text/parts. This change makes parse_node report an error if they are.

Reviewed By: ahornby

Differential Revision: D21129504

fbshipit-source-id: 4a4c2755d786e81ab43cec86d5fd5189ea0e138f
2020-04-22 08:04:32 -07:00
Stanislau Hlebik
570371193c mononoke: spawn cache update
Reviewed By: ahornby

Differential Revision: D21134813

fbshipit-source-id: 4d82bd5729125299b9414c8a36aced7f32a5ed74
2020-04-22 07:28:23 -07:00
Alex Hornby
57c3e7830e mononoke: walker: remove WalkState
Summary: Small cleanup: the WalkState type just wraps an Arc; we can replace it by implementing the WalkVisitor trait for Arc<WalkVisitor>.
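Roughly, this is the blanket-delegation pattern; a minimal sketch with a made-up trait rather than the real WalkVisitor:

```
use std::sync::Arc;

// Stand-in trait; the real WalkVisitor has a richer interface.
trait WalkVisitor {
    fn visit(&self, node: &str);
}

// With this impl, an Arc<V> is itself a WalkVisitor, so no WalkState wrapper is needed.
impl<V: WalkVisitor + ?Sized> WalkVisitor for Arc<V> {
    fn visit(&self, node: &str) {
        (**self).visit(node)
    }
}
```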

Reviewed By: farnz

Differential Revision: D21129622

fbshipit-source-id: 12a8a8d3b5ba2e459658a3cc71021c8d700db3b8
2020-04-22 06:48:53 -07:00
Alex Hornby
7fe3f85192 mononoke: walker: async the subcommand entry points
Summary: Async the walker's subcommand entry points, removing some of the rightward shift!

Reviewed By: krallin

Differential Revision: D20996244

fbshipit-source-id: 23d5d8f61da1a4c4fc5e46213a1842ea9721f07f
2020-04-22 06:48:53 -07:00
Alex Hornby
20b268cd68 mononoke: walker: add path to the OutgoingEdge
Summary:
For some nodes like FileContent from a BonsaiChangeset, the file path is not part of node identity, but it is important for tracking which nodes are related to which paths.

This change adds an optional path field to the OutgoingEdge so that it can be used in route generation and as part of edge identity for sampling.

It's optional, as some walks don't need the paths, for example scrub.
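A sketch of the shape this describes (field and type names simplified; not the real walker definitions):

```
// The path rides along with the edge; it is not part of the target node's identity,
// and walks that don't need it (e.g. scrub) simply leave it as None.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct OutgoingEdge {
    label: String,        // stand-in for the real EdgeType
    target: String,       // stand-in for the real Node
    path: Option<String>, // optional path for route generation and sampling identity
}
```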

Reviewed By: farnz

Differential Revision: D20835653

fbshipit-source-id: f609c953da8bfa0cdfdfb26328149d567c73dbc9
2020-04-22 06:48:52 -07:00
Alex Hornby
b3c58f9c06 mononoke: walker: use WrappedPath to Arc all MPaths held in the graph
Summary: Use WrappedPath to Arc all MPaths held in the graph. Doing it with a wrapper type rather than Option<Arc<MPath>> so that we keep the option of switching to interning later.

Reviewed By: krallin

Differential Revision: D20868508

fbshipit-source-id: e47458c4b274e4cd7bd0c400cd033eebc3c85d14
2020-04-22 06:48:52 -07:00
Alex Hornby
ba43f6ec4c mononoke: walker: publish stats for scrub blobstore_keys and blobstore_bytes
Summary: Add stats reporting for the blobstore_keys and blobstore_bytes stats gathered during scrub

Reviewed By: krallin

Differential Revision: D20597720

fbshipit-source-id: 82c3bb3afa308e91779648ceb74ba10ea5fcf399
2020-04-22 06:48:52 -07:00
Alex Hornby
de301b108e mononoke: walker: add blobstore usage by key type to scrub progress reporting
Summary: Report stats by the node type (e.g. FileContent, HgManifest etc) for blobstore usage when scrubbing so we can see how large each type is.

Reviewed By: StanislavGlebik

Differential Revision: D20564327

fbshipit-source-id: 55efd7671f893916d8f85fa9a93f95c97a098af4
2020-04-22 06:48:51 -07:00
Kostia Balytskyi
b59886c7f8 mononoke: fix how infinitepush is detected
Summary:
Correctly identify infinitepush without bookmarks as infinitepush instead of plain push.

Current behavior would sometimes pass `infinitepush` bundles through the `push` pipeline. Interestingly, this does not result in any user-visible effects at the moment. However, in the future we may want to diverge these pipelines:
- maybe we want to disable `push`, but enable `infinitepush`
- maybe there will be performance optimizations, applicable only to infinitepush

In any case, the fact that things worked so far is a consequence of a historical accident, and we may not want to keep it this way. Let's have correct identification.

Reviewed By: StanislavGlebik

Differential Revision: D18934696

fbshipit-source-id: 69650ca2a83a83e2e491f60398a4e03fe8d6b5fe
2020-04-22 05:13:36 -07:00
Kostia Balytskyi
c62631136f remove unneeded 'static lifetimes
Summary: Clippy is complaining about those.

Reviewed By: krallin

Differential Revision: D21165588

fbshipit-source-id: 7d2248b6291fafac593ab0a3af0baf5e805fa53d
2020-04-22 02:49:01 -07:00
Lukas Piatkowski
449594be46 mononoke/hooks: fix to panic the server when AclChecker is unreachable
Summary:
In the next diffs with permission_checker the panic is changed to anyhow::Error.

The previous behavior of this code was that when the AclChecker update failed
after 10s, this fact was ignored and the hooks were simply not using ACLs. This
diff fixes it so that the server exits when the AclChecker update times out.

Reviewed By: johansglock

Differential Revision: D21155944

fbshipit-source-id: ab4a5071acbe6a1282a7bc5fdbf301b4bd53a347
2020-04-22 02:45:03 -07:00
Stanislau Hlebik
57c127089c mononoke: remove trace uploading while deriving data
Reviewed By: krallin

Differential Revision: D21173488

fbshipit-source-id: 44421bc95d30f9f94fd5072ac3122a6211056d24
2020-04-22 01:41:05 -07:00
Kostia Balytskyi
591dcad489 unbundle: fix typo
Reviewed By: krallin

Differential Revision: D21156819

fbshipit-source-id: 1fce4d19b4ccde10b43d81d19d262bf49a821712
2020-04-21 16:16:40 -07:00
Alex Hornby
12585058cf mononoke: walker: update compression-benefit to report progress by node type
Summary:
Allows us to see the sizes for each node type (e.g. manifests, bonsais etc), and extends the default reporting to all types.

The progress.rs changes update its summary by type reporting to be reusable, and then it is reused by the changes to sizing.rs.

Reviewed By: krallin

Differential Revision: D20560962

fbshipit-source-id: f09b45b34f42c5178ba107dd155abf950cd090a7
2020-04-21 08:29:21 -07:00
Harvey Hunt
76fb3aaf52 mononoke: blobimport: Print bookmark names as strings in blobimport
Summary:
Previously, bookmark names were printed from blobimport as a Vec<u8>. This
made the logs hard to reason about, e.g.

    `current version of bookmark [109, 97, 115, 116, 101, 114] couldn't be imported,`

Update blobimport's logging to convert the bookmark names to strings before
printing them.
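The fix amounts to converting the bytes before formatting them; a minimal illustration:

```
fn main() {
    let bookmark: Vec<u8> = b"master".to_vec();

    // Before: prints "[109, 97, 115, 116, 101, 114]"
    println!("current version of bookmark {:?} couldn't be imported", bookmark);

    // After: prints "master"
    println!(
        "current version of bookmark {} couldn't be imported",
        String::from_utf8_lossy(&bookmark)
    );
}
```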

Reviewed By: johansglock

Differential Revision: D21154568

fbshipit-source-id: 549a05ca97c97533b91228b34878c28129c73677
2020-04-21 07:08:50 -07:00
Thomas Orozco
aae2721caf mononoke_hg_sync_job: don't fail if the Globalrev counter is where we want it
Summary:
This was how this was supposed to work all along, but there was a bug in the
sense that if the counter is already where we want to set it, then the update affects 0
rows. This is a bit of a MySQL idiosyncrasy — ideally we would set
CLIENT_FOUND_ROWS on our connections in order to be consistent with SQLite.

That said, for now, considering we are the only ones touching this counter, and
considering this code isn't intended to be long-lived, it seems reasonable to
just check the counter after we fail to set it.

(see https://dev.mysql.com/doc/refman/8.0/en/mysql-affected-rows.html for
context)
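A sketch of the check-after-failed-update logic, with hypothetical helpers standing in for the real SQL queries:

```
use anyhow::{bail, Error};

// Hypothetical stand-ins for the real queries.
async fn update_counter_to(_desired: u64) -> Result<u64, Error> {
    Ok(0) // returns the number of affected rows
}
async fn read_counter() -> Result<u64, Error> {
    Ok(42)
}

// Without CLIENT_FOUND_ROWS, MySQL reports 0 affected rows when the UPDATE leaves
// the value unchanged, so re-check the counter before treating that as a failure.
async fn set_counter_checked(desired: u64) -> Result<(), Error> {
    if update_counter_to(desired).await? == 0 && read_counter().await? != desired {
        bail!("counter update affected 0 rows and the stored value differs");
    }
    Ok(())
}
```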

Reviewed By: HarveyHunt

Differential Revision: D21153966

fbshipit-source-id: 663881c29a11a619ec9ab20c4291734ff13d798a
2020-04-21 06:09:19 -07:00
Stanislau Hlebik
a8399e7632 mononoke: change bonsai_changeset_step to be a normal async function
Summary: Will change it in the next diffs

Reviewed By: krallin

Differential Revision: D21153958

fbshipit-source-id: ef7c28f821ce0669e960f0e778d0dd29b32d4cd2
2020-04-21 05:23:57 -07:00
Alex Hornby
15f98fe58c mononoke: walker: fix flaky integration tests
Summary:
These failed when I did a local run against fbcode warm

count-objects.t and enabled-derive.t were flaky depending on the exact path taken through the graph.

Reviewed By: ikostia

Differential Revision: D21092866

fbshipit-source-id: ac4371cf81128b4d38cd764d86fc45d44d639ecc
2020-04-21 05:23:16 -07:00
Aida Getoeva
df3b589b73 mononoke: add deleted manifest to the warm bookmarks cache
Reviewed By: StanislavGlebik

Differential Revision: D21119586

fbshipit-source-id: edc7f84184853c27cc156ed5527de7a973f0f7f4
2020-04-21 04:32:56 -07:00
Kostia Balytskyi
89c9d652a2 unbundle: log resolved unbundle type to scuba as well
Summary: This is useful during investigations. ODS only works for stats.

Reviewed By: krallin

Differential Revision: D21144414

fbshipit-source-id: 0fbb95a79c324d270c8d6dc4770d7729c7b23694
2020-04-21 04:27:35 -07:00
Kostia Balytskyi
7141232a6d admin: move arg definitions into subcommand files
Summary:
Now that subcommand building is extracted into separate files, it feels logical
to also put arg definitions there.

Reviewed By: StanislavGlebik

Differential Revision: D21143851

fbshipit-source-id: fee7ce72a544cf66e6bc26b7128aa95b2b9ea5f3
2020-04-21 02:30:08 -07:00
Kostia Balytskyi
1e95aa7293 admin: move subcommand definitions into subcommand files
Summary:
This feels like a more natural place to store them. Also, it will make
`main.rs` more readable.

Reviewed By: StanislavGlebik

Differential Revision: D21143850

fbshipit-source-id: 6ab3ec268beea92d7f897860f7688a775d60c4bf
2020-04-21 02:30:07 -07:00
Aida Getoeva
b1d61b0073 mononoke: asyncify fastlog/ops tests
Summary: Moved tests to the new mechanism of creating changesets; refactored and asyncified the tests.

Reviewed By: farnz

Differential Revision: D21050192

fbshipit-source-id: d97b1cf0ab92aecc2c35d95c8f9331cead906867
2020-04-20 12:55:33 -07:00
Aida Getoeva
fc24741435 mononoke: asyncify the rest of fastlog/ops
Reviewed By: StanislavGlebik

Differential Revision: D21050191

fbshipit-source-id: e6dcc80b3b71b68e937f8ae8d0dc5be662cbe0f1
2020-04-20 12:55:33 -07:00
Mark Thomas
ed9ddfcba0 mononoke: configerator-thrift-update
Reviewed By: farnz

Differential Revision: D21130778

fbshipit-source-id: e44c6d442ff7620d54e17cd30ba610283f83468e
2020-04-20 12:48:06 -07:00
Kostia Balytskyi
e7df58e848 mononoke: [RFC] migrate bits of validation code to Small/Large newtypes
Summary:
This is a POC attempt to increase the type safety of the megarepo codebase by introducing the `Small`/`Large` [newtype](https://doc.rust-lang.org/rust-by-example/generics/new_types.html) wrappers for some of the function arguments.

As an example, compare these two function signatures:
```
pub async fn verify_filenode_mapping_equivalence<'a>(
    ctx: CoreContext,
    source_hash: ChangesetId,
    source_repo: &'a BlobRepo,
    target_repo: &'a BlobRepo,
    moved_source_repo_entries: &'a PathToFileNodeIdMapping,
    target_repo_entries: &'a PathToFileNodeIdMapping,
    reverse_mover: &'a Mover,
) -> Result<(), Error>
```
and
```
async fn verify_filenode_mapping_equivalence<'a>(
    ctx: CoreContext,
    source_hash: Large<ChangesetId>,
    large_repo: &'a Large<BlobRepo>,
    small_repo: &'a Small<BlobRepo>,
    moved_large_repo_entries: &'a Large<PathToFileNodeIdMapping>,
    small_repo_entries: &'a Small<PathToFileNodeIdMapping>,
    reverse_mover: &'a Mover,
) -> Result<(), Error>
```

In the first case, it is possible to call function with the source and target repo inverted accidentally, whereas in the second one it is not.

Reviewed By: StanislavGlebik

Differential Revision: D20463053

fbshipit-source-id: 5f4f9ac918834dbdd75ed78623406aa777950ace
2020-04-20 11:30:05 -07:00
Kostia Balytskyi
57272c81e8 mononoke: [RFC] introduce generic Large and Small newtype wrappers
Summary:
To be used in the next diff. The idea is to provide a little more type safety at least on the function call boundary. We still have to access the wrapped type every so often, since we need to run the actual business logic on it. It still seems safer this way, though.

Note: if this and the following diff look ok to people, I want to go over the actual commit sync lib and do the same there, add `Source` and `Target` wrappers, define wrapper-sensitive `Mover` and `BookmarkRenamer` types and so on.
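A minimal sketch of what such wrappers could look like (the real definitions in the diff may differ, e.g. in which traits they derive or implement):

```
use std::ops::Deref;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Small<T>(pub T);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Large<T>(pub T);

// Deref keeps it cheap to call methods of the wrapped type while the small/large
// distinction stays visible at function boundaries.
impl<T> Deref for Small<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Deref for Large<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}
```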

Reviewed By: StanislavGlebik

Differential Revision: D20463054

fbshipit-source-id: 9049cd05327ee203f94df4e493e20b2615e617f1
2020-04-20 11:30:05 -07:00
Lukas Piatkowski
1b7e0dbfd6 Re-sync with internal repository 2020-04-20 16:35:04 +02:00
Thomas Orozco
825b0a1daa mononoke/hg_sync: fix off-by-one error in globalrev sync
Summary:
The ID in Hgsql is supposed to be the next globalrev to assign, not the last one
that was assigned. We would have otherwise noticed during the rollout since
we'd have seen that the counter in Mercurial wasn't `globalrev(master) + 1`
(and we could have fixed it up manually before it had any impact), but let's
fix it now.

Reviewed By: StanislavGlebik

Differential Revision: D21089653

fbshipit-source-id: 0a37e1b7299a0606788bd87f788799db6e3d55f4
2020-04-20 07:16:44 -07:00
Stanislau Hlebik
7a03ef85bc mononoke: do not cache gets in MemWritesBlobstore
Summary:
At the moment MemWritesBlobstore stores both writes and reads. This is not
always desirable - in particular, in the next diff I'd like to add a --dry-run
mode to backfill_derived_data which would keep everything that was derived in
memory. In that case storing all data that we read from the blobstore might be too
much. Given that we don't need this functionality, let's just remove it.

While there, also add a method to get access to the underlying cache - again, it
will be used in the next diff.
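A toy sketch of the resulting behaviour (the real Blobstore trait is async and richer; this only shows the caching rules):

```
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

trait Blobstore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

struct MemWritesBlobstore<T> {
    inner: T,
    cache: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl<T: Blobstore> MemWritesBlobstore<T> {
    // Writes are kept in memory only.
    fn put(&self, key: String, value: Vec<u8>) {
        self.cache.lock().unwrap().insert(key, value);
    }

    // Reads check the in-memory writes first, then fall through to the inner
    // blobstore; the result of the inner read is no longer cached.
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        if let Some(value) = self.cache.lock().unwrap().get(key) {
            return Some(value.clone());
        }
        self.inner.get(key)
    }

    // Access to the underlying cache, as mentioned above.
    fn get_cache(&self) -> &Arc<Mutex<HashMap<String, Vec<u8>>>> {
        &self.cache
    }
}
```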

Reviewed By: krallin

Differential Revision: D21088794

fbshipit-source-id: 91c8729d748d8ad8d9a70e6f8d5e15afe5021e8c
2020-04-20 06:57:41 -07:00
Stanislau Hlebik
6ca292ef57 mononoke: add derive_fsnode_in_batch
Summary:
This adds a special mode of deriving fsnodes that will be used in
backfill_derived_data. From my experiments it looks like it got 5-10 times
faster.

I tried to explain how it works in the comments - lmk if that's not enough.

Reviewed By: mitrandir77

Differential Revision: D21067817

fbshipit-source-id: ff72a079754a2c15f65852c28d80f723961b53c4
2020-04-20 06:57:40 -07:00
Stanislau Hlebik
d34a940ab5 mononoke: explicitly enable derived data type
Summary: See comments for more details

Reviewed By: krallin

Differential Revision: D21088800

fbshipit-source-id: b4c187b5d4d476602e69d26d71d3fe1252fd78e6
2020-04-20 06:57:40 -07:00
Stanislau Hlebik
1e908ed410 mononoke: change stream::iter into a for loop
Summary: I think it's more readable this way

Reviewed By: krallin

Differential Revision: D21088598

fbshipit-source-id: 1608c250701ae6870094f0f61c0c2ce4e2c12ebf
2020-04-20 06:57:40 -07:00
Stanislau Hlebik
2d56b1d530 mononoke: move backfill derived data to a separate directory
Summary:
In the next few diffs I'm going to add more functionality to backfill derived
data. The file has grown quite big already, so I'd rather put this new
functionality in a separate file. This diff does the first step - it just moves
a file to a separate directory.

Reviewed By: farnz

Differential Revision: D21087813

fbshipit-source-id: 4a8e3eac4b8d478aa4ceca6bb55fa0d2973068ba
2020-04-20 06:57:39 -07:00
Stanislau Hlebik
74fe56b5d8 mononoke: fix timeout in integration tests
Reviewed By: krallin

Differential Revision: D21129373

fbshipit-source-id: 7c47e5b5a156babfc8ad9819af44f807a1a036d1
2020-04-20 05:28:52 -07:00
Stanislau Hlebik
3193f4fab3 mononoke: fail with non-zero exit code in serve_forever if failure happened
Summary:
Previously our jobs would exit with error code 0 even if there was an
actual error. The reason for this was that the error was just ignored (or rather
just printed to stderr).
This is not a huge problem, but it makes the tw output confusing - it shows that
the task was "Completed" while in reality it "Failed".

Reviewed By: ahornby

Differential Revision: D20693297

fbshipit-source-id: 4f615e2ef11f2edbb9bdbcf49cb1635929fdae89
2020-04-17 14:57:56 -07:00
Simon Farnsworth
fec914e397 Make unsharded SQLBlob able to use myrouter
Summary: For some reason, direct connections don't work from devvm4263.prn3.facebook.com, complaining about an IP address mismatch. Rather than debug that, just use MyRouter.

Reviewed By: StanislavGlebik

Differential Revision: D21089203

fbshipit-source-id: 88489a6a3b83de885c7c6e9405b325b56b807e12
2020-04-17 08:38:59 -07:00
Simon Farnsworth
f5d2983bc7 Recreate blobstore for each benchmark
Summary:
I was getting weird results from SQLBlob, which I traced to connections being closed a certain amount of time after the first query.

Fix this by recreating the blob store (thus using new connections) each time.

Reviewed By: StanislavGlebik

Differential Revision: D21089041

fbshipit-source-id: e94f8993d64dbd81d9f122f92d64aa92dad8514f
2020-04-17 08:38:59 -07:00
Thomas Orozco
518167f581 mononoke: allow for HgsqlGlobalrevs name not matching the HgsqlName
Reviewed By: ahornby

Differential Revision: D21088256

fbshipit-source-id: 6ed2969d41ade83d1a603e319450be7decd3f151
2020-04-17 06:24:10 -07:00
Thomas Orozco
c12037e4af mononoke/hook_tailer: make concurrency configurable
Summary: That is helpful when e.g. benchmarking on the most humongous commits.

Reviewed By: farnz

Differential Revision: D21064716

fbshipit-source-id: 62973d8e4f0352a2d963bb2e8a87bdced6dedc85
2020-04-17 04:52:28 -07:00
Thomas Orozco
805a150bb6 mononoke/hook_tailer: support passing a list of changesets to tail
Summary:
This makes it easier to test performance on a specific set of commits. As part
of that, I've also updated our file reading to be async since why not.

Reviewed By: farnz

Differential Revision: D21064609

fbshipit-source-id: d446ab5fb5597b9113dbebecf97f7d9b2d651684
2020-04-17 04:52:28 -07:00
Thomas Orozco
d6d5129fa3 mononoke: add smoke tests for the hook tailer
Summary:
Let's try and make sure this doesn't bitrot again by adding a smoke test. Note
that there are no hooks configured here, so this exercises everything but the
actual hook running, but for now this is probably fine.

Note that this required updating the hook tailer to use the repository config
for the hook manager, since you can't start AclChecker in a test otherwise.

Reviewed By: StanislavGlebik

Differential Revision: D21063378

fbshipit-source-id: c7336bc883dca2722b189449a208e9381196300e
2020-04-17 04:52:27 -07:00
Thomas Orozco
cc45bb8d56 mononoke/hook_tailer: use csid_resolve for exclusions
Summary: There's no reason to force using Hg Changeset IDs here.

Reviewed By: StanislavGlebik

Differential Revision: D21063377

fbshipit-source-id: e4c3943449c33340159afbef819dd3dbf786bf5c
2020-04-17 04:52:27 -07:00
Thomas Orozco
1e28a37e0c mononoke/hook_tailer: remove a bit more dead code
Summary: What it says in the title

Reviewed By: farnz

Differential Revision: D21043171

fbshipit-source-id: 151a49cc0847b1b4f577df631c3cc6bb5ebfa77e
2020-04-17 04:52:26 -07:00
Thomas Orozco
0d07c819a9 mononoke/hook_tailer: allow for logging stats to a csv
Summary:
I'd like to make sure that performance of our hooks is now linear with regard
to the changeset size after D21039811.

Reviewed By: farnz

Differential Revision: D21043172

fbshipit-source-id: a36edc5cfdb26fc63160dfdbc47be157b7506523
2020-04-17 04:52:26 -07:00
Thomas Orozco
53cb9829c6 mononoke/hook_tailer: stream outcomes
Summary:
Rather than buffer everything, let's stream outcomes as we go. Also, let's
track the number of changesets we accepted or rejected, as opposed to the hook
instance count (my goal is to output all that in a CSV if we want more detail).

Reviewed By: StanislavGlebik

Differential Revision: D21043173

fbshipit-source-id: 1b20339a52ac95a0a771b9ef469d19dd14ffc2c3
2020-04-17 04:52:26 -07:00
Egor Tkachenko
ba141a2d70 mononoke: opsfiles: Port chef_chef_test.sh hook
Summary: Porting chef_chef_test.sh hook into mononoke rust hooks

Reviewed By: HarveyHunt

Differential Revision: D21040287

fbshipit-source-id: 663d79f6d1e467be57fd82c7e06660971c8bd90d
2020-04-16 16:46:02 -07:00
Arun Kulshreshtha
13c6ccdab6 edenapi_server: use constants for long argument names
Summary: Use constants for flag names instead of duplicating them as hardcoded strings.
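The pattern looks roughly like this (argument names here are made up, and clap 2.x-style builders are assumed):

```
const ARG_LISTEN_HOST: &str = "listen-host";
const ARG_LISTEN_PORT: &str = "listen-port";

fn app<'a, 'b>() -> clap::App<'a, 'b> {
    clap::App::new("edenapi_server")
        .arg(
            clap::Arg::with_name(ARG_LISTEN_HOST)
                .long(ARG_LISTEN_HOST)
                .takes_value(true),
        )
        .arg(
            clap::Arg::with_name(ARG_LISTEN_PORT)
                .long(ARG_LISTEN_PORT)
                .takes_value(true),
        )
}

fn main() {
    let matches = app().get_matches();
    // The same constant is reused at the read site, so there is no string to mistype.
    let _host = matches.value_of(ARG_LISTEN_HOST);
}
```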

Reviewed By: xavierd

Differential Revision: D21072045

fbshipit-source-id: 4617d169d034e05dcf11eb138ad0b6eaf915edec
2020-04-16 16:13:03 -07:00
Arun Kulshreshtha
fc741586d2 Add integration test for EdenAPI server
Summary: Add a simple integration test for the EdenAPI server which just starts up a server and hits its health_check endpoint. This will be expanded in later diffs to perform actual testing.

Reviewed By: krallin

Differential Revision: D21054212

fbshipit-source-id: a3be8ddabb3960d709a1e83599bc6a90ebe49b25
2020-04-16 10:03:13 -07:00
Arun Kulshreshtha
8adac0ec89 edenapi_server: use TLS session data middleware
Summary:
Add TLS session data middleware to the EdenAPI server to allow inspecting HTTPS traffic to the server for debugging purposes.

An additional side effect of setting this up is that we use `bind_server_with_socket_data` to bind to the listening socket, wherein we construct a `TlsSocketData`, which constructs a `TlsCertificateIdentities` containing the client certificates, which are later added to the server's State by Gotham. Having the client certificate information in the State is necessary in order for the ClientIdentityMiddleware to work, and will allow enforcement of ACLs based on the provided client identities.

Reviewed By: krallin

Differential Revision: D21054213

fbshipit-source-id: 7002c73b7458f21e3c4a51a3029d27d1dea7a927
2020-04-16 10:03:12 -07:00
Thomas Orozco
4b8e0b670d mononoke/blobstore_healer: wait for MyRouter
Summary:
We used to implicitly do this when creating the sync queue (though it wasn't
needed there - if we don't wait we crash later when checking for replication
lag), but we no longer do after the SqlConstruct refactor.

This fixes that so now we can start the healer again.

Reviewed By: farnz

Differential Revision: D21063118

fbshipit-source-id: 24f236d10b411bc9a5694b42c19bf2afa352a54c
2020-04-16 09:46:14 -07:00
Thomas Orozco
1ebbe25ed8 mononoke/blobstore_healer: add more Context to errors
Summary:
Being told `Input/output error: Connection refused (os error 111)` isn't very
helpful when things are broken. However, being told:

```
Execution error: While waiting for replication

Caused by:
    0: While fetching repliction lag for altoona
    1: Input/output error: Connection refused (os error 111)
```

Is nicer.
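The layered output above is what anyhow-style context produces; a small sketch with a hypothetical helper standing in for the real replication-lag query:

```
use anyhow::{anyhow, Context, Result};

// Hypothetical stand-in for the real replication-lag query.
async fn fetch_replication_lag(_shard: &str) -> Result<u64> {
    Err(anyhow!("Input/output error: Connection refused (os error 111)"))
}

// Each layer adds one line of context, producing the nested "Caused by" output.
async fn wait_for_replication(shard: &str) -> Result<()> {
    let _lag = fetch_replication_lag(shard)
        .await
        .with_context(|| format!("While fetching replication lag for {}", shard))
        .context("While waiting for replication")?;
    Ok(())
}
```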

Reviewed By: farnz

Differential Revision: D21063120

fbshipit-source-id: 1408b9eca025b120790a95d336895d2f50be3d5d
2020-04-16 09:46:14 -07:00
Thomas Orozco
27bb95826c mononoke/sql_construct: provide more context when failing to create connections
Summary: What it says in the title.

Reviewed By: ahornby

Differential Revision: D21063122

fbshipit-source-id: fc7c4075af31548180b64ff21472bc32e5625960
2020-04-16 09:46:14 -07:00
Thomas Orozco
87622937f5 mononoke/blobstore_healer: remove more old futures from main
Summary:
This turns out quite nice because we had some futures there that were always
`Ok`, and now we can use `Output` instead of `Item` and `Error`.

Reviewed By: ahornby

Differential Revision: D21063119

fbshipit-source-id: ab5dc67589f79c898d742a276a9872f82ee7e3f9
2020-04-16 09:46:13 -07:00
Thomas Orozco
fe971aef07 mononoke/blobstore_healer: asyncify maybe_schedule_healer_for_storage
Summary:
I'd like to do a bit of work on this, so might as well convert it to async /
await first.

Reviewed By: ahornby

Differential Revision: D21063121

fbshipit-source-id: e388d59cecf5ba68d9bdf551868cea79765606f7
2020-04-16 09:46:13 -07:00
Simon Farnsworth
483eac115b Use standard DB config for SQL blob
Summary: I'm going to want to be able to test against a single ephemeral shard, as well as production use against a real DB. Use the standard config to make that possible.

Reviewed By: ahornby

Differential Revision: D21048697

fbshipit-source-id: 644854e2c831a9410c782ca1fddc1c4b5f324d03
2020-04-16 06:05:18 -07:00
Kostia Balytskyi
220edc6740 admin: add a subcommand to manipulate mutable_counters
Summary:
This is generally something I wanted to have for a long time: instead of having to open a writable db shell, now we can just use the admin command. Also, this will be easier to document in the oncall wikis.

NB: this is lacking the `delete` functionality atm, but that one is almost never needed.

Reviewed By: krallin

Differential Revision: D21039606

fbshipit-source-id: 7b329e1782d1898f1a8a936bc711472fdc118a96
2020-04-16 03:19:44 -07:00
Thomas Orozco
b9bc56ada5 mononoke/hook_tailer: asyncify everything that's left
Summary: As it says in the title

Reviewed By: farnz

Differential Revision: D21042082

fbshipit-source-id: 0d5fb63ab380aa53a04352a8d8a474390127f68c
2020-04-16 02:15:24 -07:00
Thomas Orozco
e58a0868d5 mononoke/hook_tailer: remove continuous mode
Summary:
We don't use this anymore (instead we just do backtesting in bulk). Let's get
rid of it.

Reviewed By: farnz

Differential Revision: D21042083

fbshipit-source-id: af5aea3033a4d58ba61b8f22d7dc1249a112933e
2020-04-16 02:15:23 -07:00
Thomas Orozco
2ecf51e7af mononoke/hook_tailer: asyncify run_with_limit
Summary:
I'd like to clean up this code a little bit since I'm going to make a few
changes and would like to avoid mixing too many old and new futures.

Reviewed By: farnz

Differential Revision: D21042081

fbshipit-source-id: d6a807ce9c60d09d82c6b8c6866ea23b8ef45f21
2020-04-16 02:15:23 -07:00
Thomas Orozco
d37d58b51d mononoke/hook_tailer: remove dead code
Summary:
run_in_range isn't being used anywhere. Let's get rid of it. Also, let's not
make run_in_range0 a method on Tailer since it's more of a helper function.

Reviewed By: farnz

Differential Revision: D21042084

fbshipit-source-id: 2678a94ce4b0b6ae1c97e47eb02652bcbf238b0d
2020-04-16 02:15:23 -07:00
Thomas Orozco
10b815e1eb mononoke/hook_tailer: remove redundant roundtrip through hg cs id
Summary: What it says in the title.

Reviewed By: farnz

Differential Revision: D21042080

fbshipit-source-id: c5dbcc6179d01da2748d18ecae5b737c436e68a9
2020-04-16 02:15:22 -07:00
Thomas Orozco
c89216e5db mononoke/hook_tailer: remove ad-hoc logging setup
Summary:
The hook_tailer is broken in mode/dev right now because it blows up with a
debug assertion in clap complaining that `--debug` is being added twice. This
is because it sets up its own logging, which is really not needed.

Let's just take this all out: it's not necessary.

Reviewed By: farnz

Differential Revision: D21040108

fbshipit-source-id: 75ec70717ffcd0778730a0960607c127a958fe52
2020-04-16 02:15:22 -07:00
Thomas Orozco
a327fcb460 mononoke/hook_tailer: use csid_resolve
Summary: It's nice to be able to use a Bonsai ID if that's what you have.

Reviewed By: farnz

Differential Revision: D21040109

fbshipit-source-id: 4dfc447437053f9d7f4a1c9b3753d51fe5d02491
2020-04-16 02:15:22 -07:00
Arun Kulshreshtha
8c68227742 edenapi_server: add startup debug logging
Summary: Add a few debug-level log lines during server startup so we can see which part of startup is slow.

Reviewed By: quark-zju

Differential Revision: D21054216

fbshipit-source-id: 5dfb7b58fffb360506f34e3f2bb9e8b51fcc5e6b
2020-04-15 23:17:22 -07:00
Xavier Deguillard
4973c55030 exchange: always call prepushoutgoing hooks
Summary:
Previously, an extension adding the "changeset" pushop might forget to call the
prepushoutgoing hooks, preventing them from being called.

Reviewed By: DurhamG

Differential Revision: D21008487

fbshipit-source-id: a6bc506c7e1695854aca3d3b2cd118ef1c390c52
2020-04-15 20:22:18 -07:00
Arun Kulshreshtha
291aee8c21 edenapi_server: remove extraneous #![deny(warnings)]
Summary: `#![deny(warnings)]` does nothing outside of the crate root file, so this was a no-op.

Reviewed By: singhsrb

Differential Revision: D21054214

fbshipit-source-id: dc1931c0a186eb42aae7700dd006550616f29a70
2020-04-15 17:52:05 -07:00
Gabriel Russo
03d4e52ab3 Bump tokio to 0.2.13
Summary:
This is needed because the tonic crate (see the diff stack) relies on tokio ^0.2.13

We can't go to a newer version because a bug that affects mononoke was introduced on 0.2.14 (discussion started on T65261126). The issue was reported upstream https://github.com/tokio-rs/tokio/issues/2390

This diff simply changed the version number on `fbsource/third-party/rust/Cargo.toml` and ran `fbsource/third-party/rust/reindeer/vendor`.

Also ran `buck run //common/rust/cargo_from_buck:cargo_from_buck` to fix the tokio version on generated cargo files

Reviewed By: krallin

Differential Revision: D21043344

fbshipit-source-id: e61797317a581aa87a8a54e9e2ae22655f22fb97
2020-04-15 12:18:00 -07:00
Mark Thomas
235c9a5cd9 getbundle: compute full set of new draft commits
Summary:
In getbundle, we compute the set of new draft commit ids.  This is used to
include tree and file data in the bundle when draft commits are fully hydrated,
and will also be used to compute the set of mutation information we will
return.

Currently this calculation only computes the non-common draft heads.  It
excludes all of the ancestors, which should be included.  This is because it
re-uses the prepare_phases code, which doesn't quite do what we want.

Instead, separate out these calculations into two functions:

  * `find_new_draft_commits_and_public_roots` finds the draft heads
    and their ancestors that are not in the common set, as well as the
    public roots the draft commits are based on.
  * `find_phase_heads` finds and generates phase head information for
    the public heads, draft heads, and the nearest public ancestors of the
    draft heads.

Reviewed By: StanislavGlebik

Differential Revision: D20871337

fbshipit-source-id: 2f5804253b8b4f16b649d737f158fce2a5102002
2020-04-15 11:00:33 -07:00
Xavier Deguillard
643e69e045 remotefilelog: do not write delta in bundle2
Summary:
Computing deltas forces the client to have the previous version locally, which it
may not have, forcing a full fetch of the blob just to then compute a delta. Since
deltas are a way to save on bandwidth usage, fetching a blob to compute one
negates their benefits.

Reviewed By: DurhamG

Differential Revision: D20999424

fbshipit-source-id: ae958bb71e6a16cfc77f9ccebd82eec00ffda0db
2020-04-15 10:26:39 -07:00
Stanislau Hlebik
d3ec8dd0f3 mononoke: add batch_derived() method
Summary:
A new method on the BonsaiDerived trait that derives data for a batch of commits.
The default implementation just derives them in parallel, so it's not particularly
useful. However, it might be overridden if a particular derived data type has a more
efficient way of deriving a batch of commits.
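A simplified sketch of the shape (identifiers are made up; the real trait works on changeset ids and repo context):

```
use anyhow::Error;
use futures::future::{try_join_all, BoxFuture};

trait BatchDerive: Sized + Send + 'static {
    // Derive data for a single commit.
    fn derive(csid: u64) -> BoxFuture<'static, Result<Self, Error>>;

    // Default: derive each commit in parallel. Types with a cheaper batch
    // strategy can override this.
    fn batch_derived(csids: Vec<u64>) -> BoxFuture<'static, Result<Vec<Self>, Error>> {
        Box::pin(try_join_all(csids.into_iter().map(Self::derive)))
    }
}
```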

Reviewed By: farnz

Differential Revision: D21039983

fbshipit-source-id: 3c6a7eaa682f5eaf6b8a768ca61d6f8a8f1258a7
2020-04-15 08:59:10 -07:00
Stanislau Hlebik
584728bd56 mononoke: warmup content metadata for fsnodes
Summary: It makes backfilling a great deal faster.

Reviewed By: krallin

Differential Revision: D21040292

fbshipit-source-id: f6d06cbc76e710b4812f15e85eba73b24cdbbd3e
2020-04-15 08:21:28 -07:00
Thomas Orozco
fec12c95f1 mononoke/hooks: compute the changeset id once, be O(N) as opposed to O(N^2)
Summary:
Unfortunately, `BonsaiChangeset::get_changeset_id()` is a fairly expensive
operation, since it'll clone, serialize, and hash the changeset. In hooks in
particular, since we run this once per hook execution (and therefore once per
file), that can become a problem.

Indeed, on a commit with 1K file changes, hooks run for ~30 seconds
(P129058164). According to perf, the overwhelming majority of that time is
spent in computing hashes of bonsai changesets. For a commit with 10K changes,
which took 3.5 hours, the time is spent there as well.

This diff updates hooks to compute the changeset id just once, which brings our
time down to O(N) (where N = file changes).
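The shape of the fix, with simplified stand-in types (the real code runs async hooks per file change):

```
struct BonsaiChangeset {
    file_changes: Vec<String>,
}

impl BonsaiChangeset {
    // Stand-in for the real clone + serialize + hash of the changeset.
    fn get_changeset_id(&self) -> u64 {
        self.file_changes.iter().map(|f| f.len() as u64).sum()
    }
}

fn run_file_hooks(cs: &BonsaiChangeset) {
    // Hoisted: one expensive id computation per changeset, not one per file.
    let cs_id = cs.get_changeset_id();
    for file in &cs.file_changes {
        println!("running hooks on {} for changeset {}", file, cs_id);
    }
}
```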

Reviewed By: StanislavGlebik

Differential Revision: D21039811

fbshipit-source-id: 73f9939ffc7d095e717bdb5efc46dbf4ad312c65
2020-04-15 06:29:50 -07:00
Thomas Orozco
2d56af23be mononoke/hook_tailer: log completion time
Summary: This is generally helpful to log — see later in this stack.

Reviewed By: HarveyHunt

Differential Revision: D21039810

fbshipit-source-id: 4087db70b3f56f47270c10eb31a37f33c61778df
2020-04-15 06:29:49 -07:00
Kostia Balytskyi
cf10fe8689 admin: make sure bookmark operations create syncable log entries
Summary:
This is important for various syncs.

Note: there's an obvious race condition, TOCTTOU is non-zero for existing bookmark locations. I don't think this is a problem, as we can always re-run the admin.

Reviewed By: StanislavGlebik

Differential Revision: D21017448

fbshipit-source-id: 1e89df0bb33276a5a314301fb6f2c5049247d0cf
2020-04-15 04:17:42 -07:00
Aida Getoeva
25eff1c91e mononoke/scs-log: integrate deleted manifest (linear)
Summary:
Use the deleted manifest to search for deleted paths in repos with linear history. For merged history it returns an error, as there was no such path.
The commit where the path was deleted is returned as the first commit in the history stream; the rest is the history before deletion.

Reviewed By: StanislavGlebik

Differential Revision: D20897083

fbshipit-source-id: e75e53f93f0ca27b51696f416b313466b9abcee8
2020-04-14 18:27:39 -07:00
Kostia Balytskyi
f31680f160 admin: change "blacklisted" to "redacted" in admin and tests
Summary:
Some time ago we decided on the "redaction" naming for this feature. A few
places were left unfixed.

Reviewed By: xavierd

Differential Revision: D21021354

fbshipit-source-id: 18cd86ae9d5c4eb98b843939273cfd4ab5a65a3a
2020-04-14 16:18:35 -07:00
Thomas Orozco
69b09c0854 mononoke/hg_sync_job: use hgsql name in integration test
Summary: What it says in the title.

Reviewed By: farnz

Differential Revision: D20943176

fbshipit-source-id: 8fae9b0bad32e2b6ede3c02305803c857c93f5e7
2020-04-14 10:26:11 -07:00
Thomas Orozco
eefb43237c mononoke/repo_read_write_status: use HgsqlName
Summary:
We should use the HgsqlName to check the repo lock, because that's the one
Mercurial uses in the repo lock there.

Reviewed By: farnz

Differential Revision: D20943177

fbshipit-source-id: 047be6cb31da3ee006c9bedc3de21d655a4c2677
2020-04-14 10:26:11 -07:00
Thomas Orozco
d84ba9caae mononoke/hg_sync_job: use the hgsql repo name for globalrevs
Summary:
The name for the repository in hgsql might not match that of the repository itself.
Let's use the hgsql repo name instead of the repo name for syncing globalrevs.

Reviewed By: farnz

Differential Revision: D20943175

fbshipit-source-id: 605c623918fd590ba3b7208b92d2fedf62062ae1
2020-04-14 10:26:10 -07:00
Thomas Orozco
5186d4e694 mononoke/metaconfig: include the repository hgsql name in the config
Summary:
This parses the Hgsql name out of the repo config. While in there, I also
noticed that our tests force us to have a default impl right now (there are
otherwise waaaay too many fields to specify), but at the same time we don't use
it everywhere. So, in an effort to clean up, I updated hooks to use a default.

I added a newtype wrapper for the hgsql name, since this will let me update the
globalrev syncer and SQL repo lock implementation to require a HgsqlName
instead of a string and have the compiler prove that all callsites are doing
so.

Reviewed By: farnz

Differential Revision: D20942177

fbshipit-source-id: bfbba6ba17cf3e3cad0be0f8406e41e5a6e6c3d4
2020-04-14 10:26:10 -07:00
Thomas Orozco
92d3000204 mononoke: sync repos.thrift from Configerator
Summary:
See D20941946 for why this is being added. This just brings in the updated
Thrift definition.

Reviewed By: farnz

Differential Revision: D20942176

fbshipit-source-id: c060f80666cb79f1498023276b7a09ec12bf52b4
2020-04-14 10:26:10 -07:00
Steven Troxler
a70c6755e4 Asyncify prefetch_content code
Summary:
This diff may not have quite the right semantics.

It switches `prefetch_content` to async syntax,
in the process getting rid of the old function `spawn_future`,
which assumes old-style futures, in favor of using
`try_for_each_concurrent` to handle concurrency.

In the process, we were able to remove a couple levels of clones.

I *think* that the old code - in which each call to `spawn_future`
would spin off its own future on the side but then also wait
for completion, and then we buffered - would run at most 256
versions of `prefetch_content_node` at a time, and the current
code is the same. But it's possible that I've either halved or
doubled the concurrency somehow here, if I lost track of the
details.

Reviewed By: krallin

Differential Revision: D20665559

fbshipit-source-id: d95d50093f7a9ea5a04c835baea66e07a7090d14
2020-04-14 10:19:00 -07:00
Lukas Piatkowski
6afe62eeaa eden/scm: split revisionstore into types and rest of logic
Summary:
The revisionstore is a large crate with many dependencies; split out the types part, which is most likely to be shared between different pieces of eden/mononoke infrastructure.

With this split it was easy to get eden/mononoke/mercurial/bundles

Reviewed By: farnz

Differential Revision: D20869220

fbshipit-source-id: e9ee4144e7f6250af44802e43221a5b6521d965d
2020-04-14 07:50:19 -07:00
Steven Troxler
5c87595a4b Change fetch_all_public_changesets to new stream API
Summary:
By switching to the new futures api, we can save a few heap allocations
and reduce indentation of the code.

Reviewed By: krallin

Differential Revision: D20666338

fbshipit-source-id: 730a97e0365c31ec1a8ab2995cba6dcbf7982ecd
2020-04-14 07:12:33 -07:00
Simon Farnsworth
92fce3d518 Clean out unused deps from our TARGETS files
Summary:
We had accumulated lots of unused dependencies, and had several test_deps in deps instead. Clean this all up to reduce build times and speed up autocargo processing.

Net removal is of around 500 unneeded dependency lines, which represented false dependencies; by removing them, we should get more parallelism in dev builds, and less overbuilding in CI.

Reviewed By: krallin, StanislavGlebik

Differential Revision: D20999762

fbshipit-source-id: 4db3772cbc3fb2af09a16601bc075ae8ed6f0c75
2020-04-14 03:38:11 -07:00
Thomas Orozco
ee2e6fd8e2 mononoke/blobrepo: make RepoBlobstore an actual struct
Summary:
RepoBlobstore is currently a type alias for the underlying blobstore type. This
is a bit less than ideal for a few reasons:

- It means we can't add convenience methods on it. Notably, getting access to
  the underlying blobstore can be helpful in tests, but as-is we cannot do that
  (see the test that I updated in the LFS server change in this diff for an
  example).
- Since the various blobstores we use for wrapping are blobstores themselves,
  it is possible when deconstructing the repo blobstore to accidentally forget
  to remove one layer. By making the internal blobstore a `T`, we can let the
  compiler prove that deconstructing the `RepoBlobstore` is done properly.

Most of the changes in this diff are slight refactorings to make this compile
(e.g. removing obsolete trait bounds, etc.), but there are a couple functional
changes:

- I've extracted the RedactedBlobstore configuration into its own Arc. This
  enables us to pull it back out of a RedactedBlobstore without having to copy
  the actual data that's in it.
- I've removed `as_inner()` and `into_inner()` from `RedactedBlobstore`. Those
  methods didn't really make sense. They had 2 use cases:
  - Deconstruct the `RedactedBlobstore` (to rebuild a new blobstore). This is
    better handled by `as_parts()`.
  - Get the underlying blobstore to make a request. This is better handled by
    yielding the blobstore when checking for access, which also ensures you
    cannot accidentally bypass redaction by using `as_inner()` (this also
    allowed me to remove a clone on the blobstore in the process).

Reviewed By: farnz

Differential Revision: D20941351

fbshipit-source-id: 9fa566702598b916cb87be6b3f064cd7e8e0b3e0
2020-04-14 03:19:25 -07:00
Kostia Balytskyi
66eb788549 admin: report metadata for filenodes
Summary:
Filenode envelopes have metadata, let's display it as well.
Although I've never seen it being non-empty, whenever I investigate some
filenode difference, I would like to know for sure.

Reviewed By: StanislavGlebik

Differential Revision: D20951954

fbshipit-source-id: 188321591e0d591d31e1ca765994f953dc23221c
2020-04-14 02:01:35 -07:00
Simon Farnsworth
e58925a771 Make clippy happier with vec initialization in blobstore benchmark
Summary: It says I was doing it the slow way. Do it the fast way.
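The lint involved is presumably something like clippy's `slow_vector_initialization`; the rewrite it asks for looks like this:

```
fn main() {
    let len = 1024;

    // The slow way: allocate, then fill in a second step.
    let mut slow = Vec::with_capacity(len);
    slow.resize(len, 0u8);

    // The fast way: allocate and fill in one step.
    let fast = vec![0u8; len];

    assert_eq!(slow, fast);
}
```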

Reviewed By: krallin

Differential Revision: D20926911

fbshipit-source-id: 65790d510d626e70a402c22a2df5d7606427aa7f
2020-04-13 08:37:59 -07:00
Simon Farnsworth
10a1fc24b7 Use the standard caching options to enable caching-assisted blobstore benchmarking
Summary: In production, we'll never look at blobstores on their own. Use the standard cachelib and memcache layers in benchmarks to test with caching.

Reviewed By: krallin

Differential Revision: D20926910

fbshipit-source-id: 030dcf7ced76293eda269a31adc153eb6d51b48a
2020-04-13 08:37:59 -07:00
Simon Farnsworth
d11ae2dcc8 Add read benchmarks to the blobstore benchmark set
Summary: This lets us look at a blobstore's behaviour for repeated single reads, parallel same-blob reads, and parallel reads to multiple blobs.

Reviewed By: krallin

Differential Revision: D20920206

fbshipit-source-id: 24d9a58024318ff3454fbbf44d6f461355191c55
2020-04-13 08:37:59 -07:00
Simon Farnsworth
f8cc1c6e97 Delete HgChangeset hook handling completely
Summary: Not in use any more - all hooks are now Bonsai form - so remove it.

Reviewed By: krallin

Differential Revision: D20891164

fbshipit-source-id: b92f169a0ec3a4832f8e9ec8dc9696ce81f7edb3
2020-04-11 04:26:37 -07:00
Simon Farnsworth
25b29257a3 Port hooks which now run on modified, not just added files
Summary: These hooks now run on modified files, not just added files, after porting to Bonsai form.

Reviewed By: krallin

Differential Revision: D20891166

fbshipit-source-id: 93a142f91c0bea7f5fe5e541530c644d215dce3a
2020-04-11 04:26:37 -07:00
Jeremy Fitzhardinge
28830035dd rust: regenerate autocargo for tokio rollback
Reviewed By: dtolnay

Differential Revision: D20956714

fbshipit-source-id: f13256350cc7082543c7b69231a783b262f8a4d8
2020-04-10 01:12:57 -07:00
Stanislau Hlebik
8ffb6af331 mononoke: measure the whole duration of deriving chunks
Summary:
We were excluding warmup, which might take a noticeable amount of time. Let's
measure everything.

Reviewed By: krallin

Differential Revision: D20920211

fbshipit-source-id: f48b0c2425eb2bae2991fa537dde1bc61b5e44ac
2020-04-09 23:40:30 -07:00
Jun Wu
94b9ef5625 add getcommitdata to wireproto capabilities
Summary: This allows the client to do proper feature detection.

Reviewed By: krallin

Differential Revision: D20910379

fbshipit-source-id: c7b9d4073e94518835b39809caf8b068f70cbc2f
2020-04-09 12:57:07 -07:00
Jun Wu
3bfb9da1f5 include sorted parents in getcommitdata output
Summary:
The Mercurial SHA1 is defined as:

  sorted([p1, p2]) + content

The client wants to be able to verify the commit hashes returned by
getcommitdata. Therefore, also write the sorted parents so the client can
calculate the SHA1 easily without fetching SHA1s of parents. This is
useful because we also want to make commit SHA1s lazy on client-side.

I also changed the NULL behavior so the server does not return
content for the NULL commit, as it will fail the SHA1 check.
The server expects the client to already know how to handle
the NULL special case.
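For reference, a sketch of the client-side verification this enables, using the sha1 crate (the helper name is made up):

```
use sha1::{Digest, Sha1};

// Mercurial node hash: SHA1 over the two parent hashes in sorted order, then the content.
fn hg_node_hash(p1: &[u8; 20], p2: &[u8; 20], content: &[u8]) -> Vec<u8> {
    let (lo, hi) = if p1 <= p2 { (p1, p2) } else { (p2, p1) };
    let mut hasher = Sha1::new();
    hasher.update(lo);
    hasher.update(hi);
    hasher.update(content);
    hasher.finalize().to_vec()
}
```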

Reviewed By: krallin

Differential Revision: D20910380

fbshipit-source-id: 4a9fb8ef705e93c759443b915dfa67d03edaf047
2020-04-09 11:04:22 -07:00
Thomas Orozco
d37a0bd373 mononoke/lfs_server: add option to disable ACL Checker
Summary:
This makes sense to have when running locally. Of you're running Mononoke LFS
locally, then implicitly your access is governed by whether you have access to
the underlying data. If you are on the source control team and you do have
access, it makes sense to let you run without ACL checks (since you could
rebuild from source anyway).

Reviewed By: farnz

Differential Revision: D20897249

fbshipit-source-id: 43e8209952f22aa68573c9b94a34e83f2c88f11b
2020-04-08 11:58:10 -07:00
Thomas Orozco
c55140f290 mononoke/lfs_server: download: handle redaction
Summary:
When a client requests a blob that is redacted, we should tell them that,
instead of returning a 500. This does that, we now return a `410 Gone` when
redacted content is accessed.

Reviewed By: farnz

Differential Revision: D20897251

fbshipit-source-id: fc6bd75c82e0cc92a5dbd86e95805d0a1c8235fb
2020-04-08 11:58:09 -07:00
Thomas Orozco
0a21ab46c4 mononoke/lfs_server: ignore redaction errors in batch
Summary:
If a blob is redacted, we shouldn't crash in batch. Instead, we should return
that the blob exists, and let the download path return to the client the
information that the blob is redacted. This diff does that.

Reviewed By: HarveyHunt

Differential Revision: D20897247

fbshipit-source-id: 3f305dfd9de4ac6a749a9eaedce101f594284d16
2020-04-08 11:58:09 -07:00
Thomas Orozco
77149d7ee8 mononoke/lfs_server: don't return a 502 on batch error
Summary:
502 made a bit of sense since we can occasionally proxy things to upstream, but
it's not very meaningful because our inability to service a batch request is
never fully upstream's fault (it would not be a failure if we had everything
internally).

So, let's just return a 500, which makes more sense.

Reviewed By: farnz

Differential Revision: D20897250

fbshipit-source-id: 239c776d04d2235c95e0fc0c395550f9c67e1f6a
2020-04-08 11:58:09 -07:00
Thomas Orozco
ee45e76fcf mononoke/lfs_server: ignore failures from upstream if internal can satisfy
Summary:
I noticed this while doing some unrelated work on this code. Basically, if we
get an error from upstream, then we shouldn't return an error to the client
*unless* upstream being down means we are unable to satisfy their request
(meaning, we are unable to say whether a particular piece of content is
definitely present or definitely missing).

This diff fixes that. Instead of checking for a success when hearing from
upstream _then_ running our routing logic, let's instead only fail if, in the
course of trying to route the client, we discover that we need a URL from
upstream AND upstream has failed.

Concretely, this means that if upstream blew up but internal has all the data
we want, we ignore the fact that upstream is down. In practice, internal is
usually very fast (because it's typically all locally-cached) so this is
unlikely to really occur in real life, but it's still a good idea to account
for this failure scenario.
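
A sketch of that routing decision under assumed types (not the actual lfs_server implementation):

```
// Hypothetical routing outcome.
enum Source {
    Internal,
    Upstream(String), // upstream URL to hand to the client
}

// Sketch: an upstream failure only matters if we actually need upstream.
fn route(
    present_internally: bool,
    upstream_url: Result<Option<String>, anyhow::Error>,
) -> Result<Option<Source>, anyhow::Error> {
    if present_internally {
        // Internal can satisfy the request: ignore any upstream failure.
        return Ok(Some(Source::Internal));
    }
    // Only now does a failed upstream lookup become the client's problem.
    match upstream_url? {
        Some(url) => Ok(Some(Source::Upstream(url))),
        None => Ok(None), // definitely missing everywhere
    }
}
```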

Reviewed By: HarveyHunt

Differential Revision: D20897252

fbshipit-source-id: f5a8598e8a9da382d0d7fa6ea6a61c2eee8ae44c
2020-04-08 11:58:08 -07:00
Thomas Orozco
368d43cb71 mononoke_types: add Sha256 stubs
Summary: Like it says in the title.

Reviewed By: farnz

Differential Revision: D20897248

fbshipit-source-id: bf17ee8bdec85153eed3c8265304af79ec9a8877
2020-04-08 11:58:08 -07:00
Thomas Orozco
6130f1290f mononoke/blobrepo_factory: add a builder for test repos
Summary:
Right now we have a couple of functions, but they're not easily composable. I'd
like to make the redacted blobs configurable when creating a test repo, but I
also don't want to have 2 new variants, so let's create a little builder for
test repos.

This should make it easier to extend in the future to add more customizability
to test repos, which should in turn make it easier to write unit tests :)
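
To illustrate the shape of such a builder (names here are hypothetical, not the actual blobrepo_factory API):

```
use std::collections::HashMap;

// Hypothetical builder for test repos.
#[derive(Default)]
struct TestRepoBuilder {
    redacted: Option<HashMap<String, u64>>,
}

struct TestRepo {
    redacted: HashMap<String, u64>,
}

impl TestRepoBuilder {
    fn new() -> Self {
        Self::default()
    }

    // Each knob is a chainable, optional setter...
    fn redacted(mut self, redacted: HashMap<String, u64>) -> Self {
        self.redacted = Some(redacted);
        self
    }

    // ...and build() assembles the repo once, so adding a new knob later
    // doesn't force another constructor variant.
    fn build(self) -> TestRepo {
        TestRepo {
            redacted: self.redacted.unwrap_or_default(),
        }
    }
}
```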

Reviewed By: HarveyHunt

Differential Revision: D20897253

fbshipit-source-id: 3cb9b52ffda80ccf5b9a328accb92132261616a1
2020-04-08 11:58:08 -07:00
Steven Troxler
10bf48e871 Extract async fn tail_one_iteration
Summary:
This asyncifies the internals of `subcommand_tail`, which
loops over a stream, by taking the operation performed in
the loop and making it an async function.

The resulting code saves a few heap allocations by reducing
clones, and is also *much* less indented, which helps with
readability.
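
A sketch of the pattern under hypothetical names (not the actual subcommand code):

```
use futures::stream::{Stream, StreamExt};

// Sketch: pull the loop body into its own async fn and drive it from a
// plain loop instead of a nested combinator chain.
async fn tail_one_iteration(entry: u64) -> Result<(), anyhow::Error> {
    // ...the work that used to live inline in the stream combinators...
    println!("processing entry {}", entry);
    Ok(())
}

async fn subcommand_tail(
    mut entries: impl Stream<Item = u64> + Unpin,
) -> Result<(), anyhow::Error> {
    while let Some(entry) = entries.next().await {
        // One flat call per iteration, one level of indentation.
        tail_one_iteration(entry).await?;
    }
    Ok(())
}
```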

Reviewed By: krallin

Differential Revision: D20664511

fbshipit-source-id: 8e81a1507e37ad2cc59e616c739e19574252e72c
2020-04-08 11:19:35 -07:00
Lukas Piatkowski
c7d12b648f mononoke/mercurial: make revlog crate OSS buildable
Reviewed By: krallin

Differential Revision: D20869309

fbshipit-source-id: bc234b6cfcb575a5dabdf154969db7577ebdb5c5
2020-04-08 09:49:11 -07:00
Simon Farnsworth
4135c567a8 Port over all hooks whose behaviour doesn't change from Mercurial form to Bonsai form
Summary: These hooks behave the same way in Mercurial and Bonsai form. Port them over to operate on the Bonsai form.

Reviewed By: krallin

Differential Revision: D20891165

fbshipit-source-id: cbcdf217398714642d2f2d6669376defe8b944d7
2020-04-08 08:59:01 -07:00
Simon Farnsworth
da7cbd7f36 Run Bonsai hooks as well as old-style hooks
Summary: Running hooks only on the Mercurial form isn't scalable long term - move the consumers of hooks to run on both forms for a transition period.

Reviewed By: krallin

Differential Revision: D20879136

fbshipit-source-id: 4630cafaebbf6a26aa6ba92bd8d53794a1d1c058
2020-04-08 08:59:00 -07:00
Simon Farnsworth
c59ae3274b Teach hook loader to load new (Bonsai) form hooks
Summary: To use Bonsai-based hooks, we need to be able to load them. Make it possible.

Reviewed By: krallin

Differential Revision: D20879135

fbshipit-source-id: 9b44d7ca83257c8fc30809b4b65ec27a8e9a8209
2020-04-08 08:59:00 -07:00
Simon Farnsworth
b66d875fa5 Move hooks over from an internal representation based on HgChangesets to BonsaiChangesets
Summary: We want all hooks to run against the Bonsai form, not the Mercurial form. Create a second form of hooks (currently not used) which acts on Bonsai changesets. Later diffs in the stack will move us over to Bonsai only, and remove support for hooks derived from Mercurial changesets.

Reviewed By: krallin

Differential Revision: D20604846

fbshipit-source-id: 61eece8bc4ec5dcc262059c19a434d5966a8d550
2020-04-08 08:59:00 -07:00
Steven Troxler
afdb247802 Swap out a while loop instead of .and_then + .fold
Summary:
Thanks to StanislavGlebik for this idea: we can turn the loop over
uploaded changesets into straightforward imperative code instead
of using `.and_then` + `.fold`, by taking the next chunk in a
while loop.

The resulting code is probably easier to understand (depending on whether
you come from a functional background, I guess), and it's less indented,
which is definitely more readable.
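
A sketch of the imperative shape, assuming a `futures` 0.3 stream of chunks (the chunk type is hypothetical):

```
use futures::stream::{self, TryStreamExt};

// Sketch: take the next chunk in a while-let loop instead of folding
// an accumulator over the stream.
async fn upload_all(chunks: Vec<Vec<u32>>) -> Result<usize, anyhow::Error> {
    let mut chunks = stream::iter(chunks.into_iter().map(Ok::<_, anyhow::Error>));
    let mut uploaded = 0;
    while let Some(chunk) = chunks.try_next().await? {
        // Straight-line imperative code; no accumulator threading.
        uploaded += chunk.len();
    }
    Ok(uploaded)
}
```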

Reviewed By: StanislavGlebik

Differential Revision: D20881862

fbshipit-source-id: 7ecf76a2fae3eb0e6c24a1ee14e0684b6334b087
2020-04-08 08:19:32 -07:00
Steven Troxler
aabbd3b66a Minor cleanups of blobimport_lib/lib.rs
Summary:
A couple of minor improvements, removing some overhead:
 - We don't need to pass cloned structs to `derive_data_for_csids`;
   refs work just fine
 - We can strip out one of the boxing blocks by directly assigning
   an `async` block to `globalrevs_work`
   - We can't do the same for `synced_commit_mapping_work` because
     we have to iterate over `chunk` in synchronous code, so that
     `chunk` can later be consumed by the line defining `changesets`.

Reviewed By: StanislavGlebik

Differential Revision: D20863304

fbshipit-source-id: 14cad3324978a66bcf325b77df7803d77468d30b
2020-04-08 08:19:32 -07:00
Steven Troxler
814f428f03 Asyncify the max_rev code
Summary:
This wound up being a little tricky, because
`async move` blocks capture any data they use,
and most of the fields of the `Blobimport` struct
are values rather than refs.

The easiest solution that I came up with, which looks
a little weird but works better than anything else
I tried, is to just inject a little block of code
(which I commented so it will hopefully be clear to
future readers) taking refs of anything that we need
to use in an async block but also have available later.

In the process, we are able to strip out a layer of
clones, which should improve efficiency a bit.
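
A sketch of the "take refs up front" trick with a hypothetical struct (not the real Blobimport fields):

```
// Hypothetical stand-in for the Blobimport struct.
struct Blobimport {
    repo_name: String,
    changesets: Vec<u32>,
}

impl Blobimport {
    async fn import(self) {
        // The commented "take refs" block: async move captures whatever it
        // mentions by value, so borrow up front anything we also need later.
        let repo_name = &self.repo_name;

        let max_rev_work = async move {
            // Only the reference is moved into the future, not the String.
            println!("computing max rev for {}", repo_name);
        };
        max_rev_work.await;

        // Still usable here, because the block above captured only a ref.
        println!("{} changesets in {}", self.changesets.len(), self.repo_name);
    }
}
```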

Reviewed By: StanislavGlebik

Differential Revision: D20862358

fbshipit-source-id: 186bf9939b9496c432ff0d9a01e602da47f4b5d4
2020-04-08 08:19:32 -07:00
Lukas Piatkowski
2e7baa454b mononoke: cover more crates with OSS buildability that depend on cmdlib crate
Reviewed By: krallin

Differential Revision: D20735803

fbshipit-source-id: d4159d16384ff795717f6ccdd278b6b4af45d1ab
2020-04-08 03:09:07 -07:00
Lukas Piatkowski
8e9df760c5 mononoke: make cmdlib OSS buildable
Summary: Some methods that were unused or barely used outside of the cmdlib crate were made non-public (parse_caching, CachelibSettings, init_cachelib_from_settings).

Reviewed By: krallin

Differential Revision: D20671251

fbshipit-source-id: 232e786fa5af5af543239aca939cb15ca2d6bc10
2020-04-08 03:09:06 -07:00
Stefan Filip
d1ba21803a version: warn users when they are running an old build
Summary:
Old is defined as being based on a commit that is more than 30 days old.
The build date is taken from the version string.
One observation is that if we fail to release for more than 30 days, then all
users will start seeing this message without any way of turning it off. It doesn't
seem worthwhile to add a config for silencing it, though.
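
A sketch of the check under an assumed version-string format (the real format and parsing may differ):

```
use chrono::NaiveDate;
use chrono::Utc;

// Sketch: warn when the build date embedded in the version string is more
// than `max_age_days` old. The "name_YYYYMMDD" suffix format is an assumption.
fn is_old_build(version: &str, max_age_days: i64) -> bool {
    let build_date = version
        .rsplit('_')
        .next()
        .and_then(|d| NaiveDate::parse_from_str(d, "%Y%m%d").ok());
    match build_date {
        Some(date) => (Utc::now().naive_utc().date() - date).num_days() > max_age_days,
        // If the version string has no parsable date, don't nag the user.
        None => false,
    }
}
```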

Reviewed By: quark-zju

Differential Revision: D20825399

fbshipit-source-id: f97518031bbda5e2c49226f3df634c5b80651c5b
2020-04-07 14:25:38 -07:00
Stanislau Hlebik
b2a8862a9a mononoke: add a test backfill derived data
Summary:
I decided to go with an integration test because backfilling derived data at the
moment requires two separate calls - a first one to prefetch changesets, and a
second one to actually run the backfill. So an integration test is better suited for this
case than unit tests.

While doing so I noticed that fetch_all_public_changesets actually won't fetch
all changesets - it loses the last commit because t_bs_cs_id_in_range was
returning an exclusive range (i.e. max_id was not included). I fixed the bug and made the name clearer.
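
A sketch of the off-by-one, with a hypothetical in-memory stand-in for the range query:

```
// Exclusive upper bound: the changeset with id == max_id is silently dropped.
fn ids_in_range_exclusive(ids: &[u64], min_id: u64, max_id: u64) -> Vec<u64> {
    ids.iter().copied().filter(|&id| id >= min_id && id < max_id).collect()
}

// The fix: include max_id, and name the function so the bound is obvious.
fn ids_in_range_inclusive(ids: &[u64], min_id: u64, max_id: u64) -> Vec<u64> {
    ids.iter().copied().filter(|&id| id >= min_id && id <= max_id).collect()
}
```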

Reviewed By: krallin

Differential Revision: D20891457

fbshipit-source-id: f6c115e3fcc280ada26a6a79e1997573f684f37d
2020-04-07 08:44:25 -07:00
Aida Getoeva
2df76d79c8 mononoke/scs-log: add history stream terminator
Summary:
`log_v2` supports time filters, which means it needs to be able to drop the history stream once the commits get older than the given time frame (otherwise it just traverses the whole history...).
However, this cannot be done from the SCS commit_path API or from changeset_path, because they already receive a history stream in which commits are not ordered by creation time. And the naive solution "if the next commit in the stream is older than `after_ts` then drop" won't work: there might be another branch (a commit after the current one) which is still **in** the time frame.

I added a terminator function to `list_file_history` that is called on the changeset id for which a new fastlog batch is about to be fetched. If the terminator returns true, the fastlog batch is not fetched and the current history branch is dropped. Nodes that are already ready are still streamed.
For example, if we have a history of the file changes like this:

```
      A 03/03    ^|
      |           |
      B 02/03     |
      |           |  - one fastlog batch
      C 01/03     |
      | \         |
02/01 D  E 10/02 _|  - let's assume that fastlog batches for D and E ancestors need to be prefetched
      |  |
01/01 F  G 05/02
```

# Example 1

We query "history from A after time 01/02"

The old version would fetch all the commits and then filter them in `commit_path`. We would fetch both fastlog batches for the D branch and E branch.

With the terminator, `list_file_history` will call the terminator on commit D, get `true` in return, and drop the D branch;
it will then call the terminator on E, get `false`, and proceed with fetching the fastlog batch for the E branch.

# Example 2

We query "history from A after time 01/04"

The old version would fetch all the commits and then filter them in `commit_path`, despite the fact that
the very first commit is already older than needed.

With the terminator, it will call the terminator on A, get `true`, and won't proceed any further.
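
A sketch of what a terminator could look like for the `after_ts` filter (hypothetical synchronous types; the real terminator operates on changeset ids and is async):

```
// Hypothetical stand-in for the information the terminator inspects.
struct ChangesetInfo {
    timestamp: i64, // commit time, seconds since epoch
}

// Returns a closure telling list_file_history to stop prefetching the next
// fastlog batch once a branch has fallen behind `after_ts`: everything
// further down that branch can only be older.
fn make_after_ts_terminator(after_ts: i64) -> impl Fn(&ChangesetInfo) -> bool {
    move |cs| cs.timestamp < after_ts
}
```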

Reviewed By: StanislavGlebik

Differential Revision: D20801029

fbshipit-source-id: e637dcfb6fddceb4a8cfc29d08b427413bf42e79
2020-04-07 07:08:24 -07:00
Aida Getoeva
2dcfcbac62 mononoke/fastlog: asyncify part of ops
Summary: Asyncified the main functions of fastlog/ops, so it'll be easier to modify them and proceed with the new features.

Reviewed By: StanislavGlebik

Differential Revision: D20801028

fbshipit-source-id: 2a03eedca776c6e1048a72c7bd613a6ef38c5c17
2020-04-07 07:08:24 -07:00
Thomas Orozco
7e2ad0b529 mononoke/fastreplay: handle Gettreepack for designated nodes
Summary: We need to parse `directories` here. Let's do so.

Reviewed By: HarveyHunt

Differential Revision: D20869830

fbshipit-source-id: 74830aa0045b801fba089812447fb61d7d09ad14
2020-04-07 04:36:07 -07:00
Thomas Orozco
edadb9307a mononoke/repo_client: record depth
Summary: As it says in the title!

Reviewed By: HarveyHunt

Differential Revision: D20869828

fbshipit-source-id: df7728ce548739ef2dadad1629817fb56c166b66
2020-04-07 04:36:06 -07:00