Commit Graph

3609 Commits

Author SHA1 Message Date
David Tolnay
1ae4fb3039 Update to new configerator client
Summary: `//common/rust/shed/cached_config` is the center of a dependency graph that still uses the old configerator client only because cached_config uses it. This diff switches all of these over to the new client.

Reviewed By: farnz

Differential Revision: D30357631

fbshipit-source-id: 9a9df74096aa38a06371c6bc787245af71175e48
2021-08-17 11:02:08 -07:00
Yan Soares Couto
1d639f5a93 derived_data: introduce DerivedDataManager
Summary:
The `DerivedDataManager` will manage the ordering of derivation for derived
data, taking into account dependencies between types as well as the topological
ordering of the repository.  It will replace the free functions in
`derived_data` as well as much of the `utils` crate.

This is the first step: it introduces the manager, although currently it only takes
over management of the derived data lease.

Reviewed By: mitrandir77

Differential Revision: D30281634

fbshipit-source-id: 04c3a34d97ea02cc8c26d34096cca341e800da9b
2021-08-17 10:30:07 -07:00
Yan Soares Couto
be8daaa23c derived_data: make mapping not depend on BlobRepo
Summary:
In preparation for the derived data manager, ensure that derived data
mappings do not require a `BlobRepo` reference.

The main use for this was to log to scuba.  This functionality is extracted out
to the new `BonsaiDerivedMappingContainer`, which now contains just enough
information to be able to log to scuba.

Reviewed By: mitrandir77

Differential Revision: D30135447

fbshipit-source-id: 1daa468a87f297adc531cb214dda3fa7fe9b15da
2021-08-17 10:30:07 -07:00
Stanislau Hlebik
dc8bf342da mononoke: set mutable renames while creating move commits
Reviewed By: mitrandir77

Differential Revision: D30338443

fbshipit-source-id: de5e39aad224c29cfe0bbdce011624037811aa36
2021-08-17 08:01:28 -07:00
Stanislau Hlebik
995a0a1bd5 mononoke: introduce DirectoryMultiMover
Summary:
We have a mover only for files, and it doesn't quite work for directories - at
the very least a directory can be None (i.e. the root of the repo).

In the next diffs we'll start recording file and directory renames during
megarepo operations, so let's add DirectoryMultiMover in preparation for that.
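
The shape of such a mover can be sketched as follows (the types and the prefixing rule here are illustrative stand-ins, not the actual Mononoke `MPath`-based DirectoryMultiMover):

```rust
// Hypothetical stand-in for Mononoke's MPath: None represents the repo root,
// which has no path of its own.
type DirPath = Option<String>;

// A directory multi-mover maps one source directory to zero or more target
// directories. This example prefixes everything with "megarepo/" and maps
// the root (None) to the "megarepo" directory itself.
fn move_dir(path: &DirPath) -> Vec<DirPath> {
    match path {
        Some(p) => vec![Some(format!("megarepo/{}", p))],
        None => vec![Some("megarepo".to_string())],
    }
}

fn main() {
    assert_eq!(
        move_dir(&Some("lib".to_string())),
        vec![Some("megarepo/lib".to_string())]
    );
    // Unlike the file mover, the root case must be handled explicitly.
    assert_eq!(move_dir(&None), vec![Some("megarepo".to_string())]);
}
```

The key difference from the file-only mover is the explicit `None` arm for the repo root.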

Reviewed By: mitrandir77

Differential Revision: D30338444

fbshipit-source-id: 4fed5f50397a7d3d8b77f23552921d515a684604
2021-08-17 08:01:28 -07:00
Simon Farnsworth
bd66f8a79b Add megarepo API to create a release branch point
Summary:
AOSP megarepo wants to create release branches from existing branches, and then update configs to follow only release-ready code.

Provide the primitive they need to do this, which takes an existing commit and config, and creates a new config that tracks the same sources. The `change_target_config` method can then be used to shift from mainline to the release branch.

Reviewed By: StanislavGlebik

Differential Revision: D30280537

fbshipit-source-id: 43dac24451cf66daa1cd825ada8f685957cc33c1
2021-08-17 06:56:29 -07:00
Egor Tkachenko
51367815cd Moving thrift targets
Summary:
I was adding thrift fiddle support to my derivation service like this https://www.internalfb.com/intern/wiki/Rust-at-facebook/Thrift/Writing_a_Rust_Thrift_Server/#thrift-fiddle and ran into errors with the generated thrift code P439166788. After searching a little bit I came across a post in the Thrift Users group with this comment https://fb.workplace.com/groups/thriftusers/posts/497757970933620/?comment_id=498850394157711
So in this diff I'm moving all `thrift_library` targets into the directory together with the .thrift file itself.

Reviewed By: ahornby

Differential Revision: D30300919

fbshipit-source-id: bb2d7e2a98d6ba783e6249963b3a1dfcd6d62669
2021-08-17 06:49:44 -07:00
Stanislau Hlebik
06b34ca66e mononoke: use mutable_renames in fastlog
Summary:
Let's make it possible to query mutable renames from fastlog. For now this is
very basic support, i.e. we don't support middle-of-history renames, blame is
not supported, etc.

Mutable rename logic is gated by a tunable, so we can roll it back quickly in
case of problems.

Reviewed By: ahornby

Differential Revision: D30279932

fbshipit-source-id: 0e8e329e8ab4d4980ab401bd103e6c97419d0f67
2021-08-17 01:18:59 -07:00
Stanislau Hlebik
56519f10aa mononoke: add mutable renames to repo factories
Summary:
Let's make it possible to build mutable renames using repo factories. It will
be used in the next diffs.

Differential Revision: D30279930

fbshipit-source-id: 57e873c69495e541daf943a47e6cb46fc19b221b
2021-08-17 01:18:58 -07:00
Alex Hornby
7813e241df mononoke: remove check_lock_repo from repo_client unbundle
Summary:
We currently do repo lock checks in a loop during unbundle. However, we now do a repo lock check in bookmarks_movement::PushrebaseOntoBookmarkOp::run(), making the loop and check in repo_client unbundle redundant.

Cons: it will no longer terminate early. Pros: database load should be reduced.

Reviewed By: StanislavGlebik

Differential Revision: D30331806

fbshipit-source-id: 16ee72e570184c20ac08d3fa6d8f9f333c91deb7
2021-08-16 15:12:50 -07:00
Yan Soares Couto
7a1085d2ac Add validation that only snapshots can contain untracked/missing files
Summary:
- This diff adds validation so that changesets that are not snapshots cannot have untracked or missing files.
- It removes the THIS IS A SNAPSHOT commit message.
- It makes the snapshot created by `hg snapshot createremote` be an actual snapshot.

Differential Revision: D30159184

fbshipit-source-id: 976968c0c2222f950a4a937aa805b25dc07c9207
2021-08-16 09:19:06 -07:00
Yan Soares Couto
f761a291a7 Add snapshot_state to bonsai changeset
Summary:
This diff adds some data to BonsaiChangeset that tells whether it is a snapshot or not.

For now, this marks every changeset as not being a snapshot. The next diff will add validation to snapshots, some tests, and mark the current `snapshot createremote` command as uploading snapshots.

Reviewed By: markbt

Differential Revision: D30158530

fbshipit-source-id: 9835450ac44e39ce8d653938f3a629f081247d2f
2021-08-16 09:19:05 -07:00
Yan Soares Couto
c1e83d3dbd snapshot: Also upload new files, untracked, and removed
Summary:
This diff makes the snapshot command upload all the types of files that were previously not handled (added/untracked/missing/deleted), using the new types of file changes added in the previous diff.

Next steps:
- Add some indicator to Bonsai Changeset saying it is a snapshot. Verify only snapshots can have certain file changes (untracked/missing).
- Upload the files and the changeset inside an ephemeral bubble instead of in the main blobstore
- Start writing the `snapshot restore` command

Differential Revision: D30137673

fbshipit-source-id: 555238f1d64a5438cde35a843043884a939de4fe
2021-08-16 09:19:05 -07:00
Simon Farnsworth
bfb9db07b7 Reformat Mononoke thrift files to match arc f
Summary: I have my editor set up to format on save - let's unify this on the standard FB format for thrift files, so that I don't create junk.

Reviewed By: ahornby

Differential Revision: D30285082

fbshipit-source-id: 17b09635a2473174a92e29bb042432dbac44865a
2021-08-16 04:42:52 -07:00
Aida Getoeva
e2d57e9f02 mononoke/multiplex: add multiplex logging
Summary:
The current Mononoke Blobstore Trace scuba table is used with the idea of having a record per blobstore and operation. This diff adds logging to the new scuba table of the combined multiplexed operations' outcome, such as the time spent on a `put` including both the sync-queue and blobstore writes, or a record of the "some failed, others none" cases in `get`/`is_present`.

This helps to see the real time spent on writes and reads, and to assess the impact of changes coming in `get` and `is_present`.

Reviewed By: ahornby

Differential Revision: D30248284

fbshipit-source-id: f79050ced32ba77bd2e220e242407bcd711a9b6d
2021-08-16 04:25:33 -07:00
Stanislau Hlebik
0445c36fdd mononoke: store hashed paths in mutable_renames
Summary:
Mysql has a limit on the length of the [key used in the index](https://dev.mysql.com/doc/refman/8.0/en/innodb-limits.html#:~:text=The%20index%20key%20prefix%20length,REDUNDANT%20or%20COMPACT%20row%20format.)

So we can't use the full path in the index as-is. We could use just a prefix of
the path in the index, but that won't allow us to make [the index
unique](https://stackoverflow.com/questions/15157227/mysql-varchar-index-length).

So let's do the same thing we do in filenodes - hash the path and store the hash ->
path mapping in a separate table. Because we do a similar thing in filenodes,
we can reuse some of the code we used there.
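
The scheme - index a fixed-width hash of the path, and keep a separate hash -> path table to resolve it back - can be sketched like this (using std's `DefaultHasher` purely as a stand-in for whatever hash the production tables actually use):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hash a repo path down to a fixed-width key that fits comfortably within
// MySQL's index key length limit, unlike an arbitrarily long path.
fn path_hash(path: &str) -> u64 {
    let mut h = DefaultHasher::new();
    path.hash(&mut h);
    h.finish()
}

fn main() {
    // "paths" plays the role of the separate hash -> path table.
    let mut paths: HashMap<u64, String> = HashMap::new();
    let p = "fbcode/eden/mononoke/lib.rs";
    let key = path_hash(p);
    paths.insert(key, p.to_string());

    // The renames table stores only `key`; recovering the path is one lookup.
    assert_eq!(paths.get(&path_hash(p)).map(String::as_str), Some(p));
}
```

A fixed-width key also makes a unique index possible, which a path prefix would not.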

Reviewed By: markbt

Differential Revision: D30301729

fbshipit-source-id: 32c058a163e5ff541641c6049a74168ceba66a74
2021-08-13 11:23:56 -07:00
Thomas Orozco
de5b8e2dcb rust: ignore metadata-sys rules in Autocargo
Summary:
Autocargo only allows 1 rust-library per Cargo.toml, but right now we have 3
per Thrift library so that doesn't work:

https://www.internalfb.com/intern/sandcastle/log/?instance_id=27021598231105145&step_id=27021602582167211&step_index=13&name=Run%20config

There's little benefit in Autocargo-ifying those rules anyway since they're only of
use to Thrift servers, and this doesn't work at all in our OSS builds, so let's
see if we can just noop them. That'll make the crate not exist at all as a
dep, but given that it exists only to link to a C++ library that
Autocargo doesn't know how to build anyway, that seems OK?

drop-conflicts

Reviewed By: markbt

Differential Revision: D30304720

fbshipit-source-id: 047524985b2dadab8610267c05e3a1b3770e84e6
2021-08-13 10:43:40 -07:00
Alex Hornby
133bd8567a mononoke: fix unused parameters warning in oss build
Summary: Spotted this in passing

Reviewed By: krallin

Differential Revision: D30305517

fbshipit-source-id: aa043f2daa8ede7d1f3aee4f49346ab9f14c8d01
2021-08-13 10:37:18 -07:00
Yan Soares Couto
54a885805b Get a RepoBlobstore from a bubble
Summary:
This is basically a refactor.

Before this diff, `bubble.handle(main)` could be used to access things in bubble with fallback. With this diff, `bubble.wrap_repo_blobstore(main)` can be used for the same effect.

The difference is **the type**, which now is `RepoBlobstore` instead of `EphemeralHandle`. Both are blobstores and work the same way for fetching/putting, but on the following diffs I will want to replace some code (e.g. that creates a changeset) to use the ephemeral blobstore for snapshots, and in order to reuse the same code (which expects `RepoBlobstore`), we need the change of types.

This is part of BlobRepo refactoring as well, as what I'm gonna do is replace BlobRepo with a different facet container that has a RepoBlobstore inside.

Reviewed By: markbt

Differential Revision: D30282624

fbshipit-source-id: 4132797104ecd2596e7da91b1daacc1c6fc85934
2021-08-13 04:55:37 -07:00
Stanislau Hlebik
b4c8d2a0cd mononoke: small refactoring in fastlog
Summary:
Added the try_continue_traversal_when_no_parents function. For now it continues
traversal only if the deleted file manifest found something, but it will be
extended in the next diff to also use mutable renames.

Differential Revision: D30279931

fbshipit-source-id: b2cdae62d7841cfa0834ac1dd280ffb8dafa43ef
2021-08-12 15:27:17 -07:00
Yan Soares Couto
07d66e6df1 Add more types to FileChange struct
Summary: This adds types to the FileChange thrift and rust structs to deal with additional possible snapshot states, that is, untracked and missing files. Conflicted files are not handled yet.

Reviewed By: StanislavGlebik

Differential Revision: D30103162

fbshipit-source-id: 59faa9e4af8dca907b1ec410b8af74985d85b837
2021-08-12 12:40:48 -07:00
Stanislau Hlebik
d627cc4238 mononoke: introduce simple mutable renames
Summary:
We'd like to have support for mutable renames i.e. make it possible to mark a
file or directory as renamed from another file/directory.

This diff adds a MySQL table which will store these renames. A few notes:
1) Note that the table stores `src_unode_id` - this is an optimization to make
fastlog traversal after renames faster.
2) I've been considering whether we want to use "insert or ignore" or "insert
... on duplicate key update". I opted for the latter - with "insert or ignore"
we'd need to handle the additional case when the insert was ignored, and that
would make the implementation harder. Besides, I think that overwriting should
be fine given that we can always change mutable renames later.
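
The table and the upsert semantics discussed above might look roughly like this (a sketch reconstructed from the description; column names and types are illustrative, not the actual migration):

```sql
-- Illustrative sketch of a mutable renames table; src_unode_id is stored
-- as an optimization for fastlog traversal after renames.
CREATE TABLE mutable_renames (
  repo_id INT NOT NULL,
  dst_cs_id VARBINARY(32) NOT NULL,
  dst_path_hash VARBINARY(32) NOT NULL,
  src_cs_id VARBINARY(32) NOT NULL,
  src_path_hash VARBINARY(32) NOT NULL,
  src_unode_id VARBINARY(32) NOT NULL,
  PRIMARY KEY (repo_id, dst_cs_id, dst_path_hash)
);

-- "insert ... on duplicate key update": a re-recorded rename overwrites the
-- old row, instead of the insert being silently ignored.
INSERT INTO mutable_renames
  (repo_id, dst_cs_id, dst_path_hash, src_cs_id, src_path_hash, src_unode_id)
VALUES
  (1, x'AA', x'BB', x'CC', x'DD', x'EE')
ON DUPLICATE KEY UPDATE
  src_cs_id = VALUES(src_cs_id),
  src_path_hash = VALUES(src_path_hash),
  src_unode_id = VALUES(src_unode_id);
```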

Differential Revision: D30277300

fbshipit-source-id: 35e5cab79f0db5a1ecf28a8a9b9f9b86d0f42fb6
2021-08-12 10:41:44 -07:00
Arun Kulshreshtha
a69d9d91de third-party/rust: disable lzma and xz features for async-compression
Summary:
The `async-compression` crate is currently only used by Mononoke for zstd and gzip compression. We'd like to use it in Mercurial too (primarily for zstd), but unlike Mononoke, Mercurial needs to build on all platforms.

One of `async-compression`'s dependencies, `lzma-sys`, has a complicated build script which required extensive fixups (see D21455819). Unfortunately, `lzma-sys` currently does not build on Windows with Buck because the [fixups.toml](https://www.internalfb.com/code/fbsource/[ba27fac3d5b5]/third-party/rust/fixups/lzma-sys/fixups.toml?lines=35) and [a fixup header file](https://www.internalfb.com/code/fbsource/[468048d6e50b]/third-party/rust/fixups/lzma-sys/include/config.h?lines=34) both enable pthreads, which causes conditional compilation to attempt to include POSIX dependencies.  The exact error is:

```
third-party/rust/vendor/lzma-sys-0.1.16/xz-5.2/src/common\mythread.h:103:10: fatal error: 'sys/time.h' file not found
```
Given that this crate is currently only used by Mercurial and Mononoke, and only for the zstd algorithm in practice, as a quick workaround let's just disable LZMA and XZ support. This is unfortunate; it would be better to figure out how to make the fixup work correctly, but Buck on Windows is such a niche use case at the moment that I'm not really sure where to begin.
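
On the consumer side, the workaround amounts to opting out of the default feature set and listing only the codecs in use; a hedged Cargo.toml sketch (version and exact feature names illustrative - check the crate's feature list for the pinned version):

```toml
[dependencies]
# Only zstd and gzip are used in practice; leaving lzma/xz out avoids
# building lzma-sys, which fails on Windows under Buck.
async-compression = { version = "0.3", default-features = false, features = ["zstd", "gzip"] }
```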

Reviewed By: dtolnay

Differential Revision: D30271553

fbshipit-source-id: 76560c39b6f2d8750fa34c30ccb3e7db734e92a7
2021-08-12 09:30:58 -07:00
Alex Hornby
3f8c9ed354 mononoke: add cli arguments for manifold_request_priority
Summary: This diff adds a CLI option to be able to override the manifold request priority for a particular job or command line run.

Reviewed By: HarveyHunt

Differential Revision: D30277209

fbshipit-source-id: 58217c11234133dfc68e11b230d99066dd783600
2021-08-12 08:49:36 -07:00
Alex Hornby
850be082c9 mononoke: add tunable for manifold API shared priorities
Summary: Add tunable for manifold apikey priorities

Reviewed By: StanislavGlebik

Differential Revision: D30256021

fbshipit-source-id: ac95824f357bddd7eb38c19b5001abc483de8425
2021-08-12 08:49:36 -07:00
Yan Soares Couto
63caa0ff60 Add redaction to EphemeralBlobstore
Summary:
- This diff makes EphemeralBlobstore use RepoBlobstore under the hood, which gives it redaction for free.
- It also refactors and simplifies the ephemeral blobstore code, removing unnecessary separation between repo-less and repo-full ephemeral blobstore.
- It will be useful later on to have a RepoBlobstore so we can transparently adapt code that deals with BlobRepo to also work with ephemeral stuff.

Differential Revision: D30229056

fbshipit-source-id: 956f1e8fecc2b3fa518eb11268fbbbfd27c4f5dd
2021-08-12 08:00:42 -07:00
Liubov Dmitrieva
20ad5b713e support streaming for file uploads rather than buffering the whole content
Summary:
support streaming for file uploads rather than buffering the whole content

This is the preferable way for big files. We currently use the LFS endpoint instead for big files, but not in snapshots.

Also, if we enable streaming, we have the option not to use the LFS endpoint for file uploads in the future.

**[Land after hg release with D30100887 (fe8ed9d28c) has been fully rolled out]**

Reviewed By: yancouto

Differential Revision: D30158390

fbshipit-source-id: b62c498b8bdf23a5f413f6e4b71d7433906e4611
2021-08-12 07:41:30 -07:00
Stanislau Hlebik
16cb44997b mononoke: remove unused function
Reviewed By: krallin

Differential Revision: D30249839

fbshipit-source-id: d3f13b1c4017ad7bc8996a636d84263a251f9bc2
2021-08-12 01:10:40 -07:00
Stanislau Hlebik
e2623b1f7b mononoke: use add_opt in streaming_clone
Summary: It's nicer.

Reviewed By: krallin

Differential Revision: D30254092

fbshipit-source-id: f5178a9458985dfcb9f63e6b0d5ef77c02798b0b
2021-08-11 13:51:57 -07:00
Yan Soares Couto
4d52344fee Use FileChange enum instead of Option<FileChange>
Summary:
for now this changes:
```
struct FileChange {
  ...stuff
}
fn f(x: Option<FileChange>)
```
to
```
struct TrackedFileChange {
  ...stuff
}
enum FileChange {
  TrackedChange(TrackedFileChange),
  Deleted,
}
fn f(x: FileChange)
```

This makes it much clearer that `None` actually means the file was deleted. It will also be useful since in the next diff I will add more stuff inside FileChange (for untracked changes), and this refactor will make that easy.

(The refactor from using `Option` to putting it all inside the enum isn't really necessary, but IMO it looks much clearer, so I did it.)
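
One payoff of the enum over `Option<FileChange>` is that call sites must name the deletion case explicitly; a minimal sketch (types heavily simplified from the real ones):

```rust
// Simplified stand-ins for the real Mononoke types.
#[derive(Debug, PartialEq)]
struct TrackedFileChange {
    size: u64,
}

#[derive(Debug, PartialEq)]
enum FileChange {
    TrackedChange(TrackedFileChange),
    Deleted,
}

// With the enum, "deleted" is spelled out instead of hiding behind `None`,
// and adding new variants (e.g. untracked) later is a compile-checked change.
fn describe(change: &FileChange) -> String {
    match change {
        FileChange::TrackedChange(tc) => format!("changed ({} bytes)", tc.size),
        FileChange::Deleted => "deleted".to_string(),
    }
}

fn main() {
    let tracked = FileChange::TrackedChange(TrackedFileChange { size: 7 });
    assert_eq!(describe(&tracked), "changed (7 bytes)");
    assert_eq!(describe(&FileChange::Deleted), "deleted");
}
```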

Reviewed By: StanislavGlebik

Differential Revision: D30103454

fbshipit-source-id: afd2f29dc96baf9f3d069ad69bb3555387cff604
2021-08-11 08:56:40 -07:00
Yan Soares Couto
e061f30645 Improve ODS logging on unbundle
Summary:
The time between logging total_unbundles and the actual successes is the time of the unbundle operation, which may be long.

This makes the alarm on D30222804 much less accurate, as success and fail for the same operation might fall in different buckets.

This diff changes two things:
- total_unbundles are logged at the end of the unbundle operation, which should make tracking more accurate when compared against successes.
- resolver_error is now logged in more cases that would previously error but not be logged

I created a wrapper function in order to make sure it always logs, as before it might not log if there were some early errors.
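
The wrapper pattern described - count the attempt and its outcome together at the end of the operation so both land in the same time bucket - can be sketched as follows (counter names and the plain-struct "ODS" sink are illustrative, not the real interface):

```rust
// Illustrative stand-in for an ODS/stats counter sink.
#[derive(Default)]
struct Counters {
    total_unbundles: u32,
    successes: u32,
    resolver_errors: u32,
}

// Wrap the whole unbundle so counters are bumped exactly once per call,
// after the operation finishes - even when it errors early.
fn run_unbundle<F>(counters: &mut Counters, op: F) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String>,
{
    let result = op();
    counters.total_unbundles += 1; // logged at the end, next to the outcome
    match &result {
        Ok(()) => counters.successes += 1,
        Err(_) => counters.resolver_errors += 1,
    }
    result
}

fn main() {
    let mut c = Counters::default();
    let _ = run_unbundle(&mut c, || Ok(()));
    let _ = run_unbundle(&mut c, || Err("early resolver error".to_string()));
    // Success and failure for the same window stay in the same bucket.
    assert_eq!((c.total_unbundles, c.successes, c.resolver_errors), (2, 1, 1));
}
```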

Differential Revision: D30248117

fbshipit-source-id: 4ec0c148dd7aa818b6d204fafecacacf4d267be7
2021-08-11 08:48:52 -07:00
Yan Soares Couto
2326e89b64 Don't specify repoid when rewrapping RepoBlobstore
Summary:
RepoBlobstore already has all the information necessary for rebuilding itself, we don't need to pass in `repo_id` again.

This is easier to use and less error-prone.

Reviewed By: markbt

Differential Revision: D30227978

fbshipit-source-id: b73407d5f022ce5614ee2fa9734f5a8b0c860fe7
2021-08-11 07:34:08 -07:00
David Tolnay
f9ae9f62c1 Use safe cxx signature for metadata creation
Summary:
This diff cleans up all remaining places that Thrift metadata is being created as a raw pointer, which had to be converted unsafely to UniquePtr in D30180770 (ff5931b944).

It also eliminates all the places that definitions of `MetadataFunc` and `RustThriftMetadata` were duplicated across the codebase. It would have been UB if any of these were to fall out of sync, as I discovered when trying to adjust the representation of RustThriftMetadata in D30180770 (ff5931b944).

Reviewed By: guswynn

Differential Revision: D30182979

fbshipit-source-id: 3313440313f28863ac378986c04522d358cb4fd5
2021-08-11 05:23:27 -07:00
David Tolnay
fab18899a4 Remove unused port arg from ServiceFramework constructors
Summary:
The port argument to the `ServiceFramework` constructor has been totally unused since D28431427. You can see all of the call sites in this codemod are *always* first passing the port to `ThriftServerBuilder::with_port`, then passing the resulting ThriftServer to a `ServiceFramework` constructor, so ServiceFramework can just obtain the correct port out of the given ThriftServer.

 ---

API before:

```
impl ServiceFramework {
    pub fn from_server(name: &str, server: ThriftServer, port: u16) -> Result<Self>;
    pub fn from_primary_server(name: &str, server: ThriftServer, port: u16) -> Result<Self>;
}
```

API after:

```
impl ServiceFramework {
    pub fn from_server(name: &str, server: ThriftServer) -> Result<Self>;
    pub fn from_primary_server(name: &str, server: ThriftServer) -> Result<Self>;
}
```

 ---

Call site before:

```
let server = runtime.spawn(async move {
    let thrift: ThriftServer = ThriftServerBuilder::new(fb)
        .with_port(args.port)                                                 //<----------
        .with_factory(exec, move || service)
        .build();

    let mut svc_framework =
        ServiceFramework::from_server("example_server", thrift, args.port)?;  //<----------

    svc_framework.add_module(BuildModule)?;
    svc_framework.add_module(ThriftStatsModule)?;
    svc_framework.add_module(Fb303Module)?;
    svc_framework.serve().await
});
```

Call site after:

```
let server = runtime.spawn(async move {
    let thrift: ThriftServer = ThriftServerBuilder::new(fb)
        .with_port(args.port)
        .with_factory(exec, move || service)
        .build();

    let mut svc_framework =
        ServiceFramework::from_server("example_server", thrift)?;

    svc_framework.add_module(BuildModule)?;
    svc_framework.add_module(ThriftStatsModule)?;
    svc_framework.add_module(Fb303Module)?;
    svc_framework.serve().await
});
```

Differential Revision: D30180773

fbshipit-source-id: 16cf32b582161395eab5af3f8aaef6015e69cd9f
2021-08-11 05:23:27 -07:00
David Tolnay
6d2591eb93 Make Thrift server creation safe using cxx UniquePtr
Summary:
The safe signature is possible as of {D30180770 (ff5931b944)} but has been separated out of that diff because it requires this codemod of a large number of downstream service implementations.

 ---

API before:

```
impl ThriftServerBuilder {
    pub unsafe fn with_metadata(mut self, metadata: *mut RustThriftMetadata) -> Self;
}
```

API after:

```
impl ThriftServerBuilder {
    pub fn with_metadata(mut self, metadata: UniquePtr<RustThriftMetadata>) -> Self;
}
```

 ---

Call site before:

```
let thrift = unsafe {
    ThriftServerBuilder::new(fb)
        .with_port(thrift_options.port)
        .with_metadata(create_metadata())
        .with_max_requests(thrift_options.max_requests)
        .with_factory(exec, move || service)
        .build()
};
```

Call site after:

```
let thrift = ThriftServerBuilder::new(fb)
    .with_port(thrift_options.port)
    .with_metadata(create_metadata())
    .with_max_requests(thrift_options.max_requests)
    .with_factory(exec, move || service)
    .build();
```

Reviewed By: guswynn

Differential Revision: D30180772

fbshipit-source-id: f8137b9f91b7c7b5de5bdee9dfd0b7925399cee2
2021-08-11 04:28:52 -07:00
David Tolnay
ff5931b944 Switch srserver binding to cxx
Summary:
This diff eliminates the unsafe code from the bindgen-based ThriftServer and ServiceFramework binding in favor of a simpler safe binding based on CXX.

Followup codemods:

- {D30180771}
- {D30180772}
- {D30180773}

Reviewed By: Imxset21

Differential Revision: D30180770

fbshipit-source-id: e80f0c36f5a816d85a4810e275a97d402b5db4e4
2021-08-11 04:25:55 -07:00
Jan Mazur
a16e3b1d5f log more info when derivation is slow
Summary: Let's log more to understand issue of slow derivation better.

Reviewed By: StanislavGlebik

Differential Revision: D30133815

fbshipit-source-id: 7f77422bb7728931608c156b191935a5ecc8f8fa
2021-08-11 03:40:19 -07:00
Stanislau Hlebik
7ebfee2d81 mononoke: make parameters not required in streaming_clone_warmup
Summary: They don't have to be required anymore

Reviewed By: markbt

Differential Revision: D30245130

fbshipit-source-id: 6563026f648439e5cda5d0e72ae40c0feec43ad9
2021-08-11 02:11:05 -07:00
Kuba Zika
823df51cd2 Drop redundant bits from no_bad_filenames
Summary: Remove redundant functionality from `no_bad_filenames` repo hook.

Reviewed By: krallin

Differential Revision: D30051283

fbshipit-source-id: d110626a9a7338865e33b4ce2b44e69cad1a33e8
2021-08-10 16:58:34 -07:00
Yan Soares Couto
955f21511b Delete RepoBlobstoreArgs
Summary: This was a "builder" class, but it was highly unnecessary. It had two fields, but only one was ever used, and it was simpler to just replace it with "constructor" methods.

Differential Revision: D30162340

fbshipit-source-id: ed75d9979794c00ca18aa95fdb01688831ec4b5a
2021-08-10 06:09:23 -07:00
Stanislau Hlebik
1d0c535123 mononoke: support reading streaming changelog chunks with tag
Reviewed By: ahornby

Differential Revision: D30015772

fbshipit-source-id: ca19f41b95ce0db43895b3c53009538d5712e239
2021-08-10 05:13:54 -07:00
Stanislau Hlebik
9076af5e7c mononoke: add "success" field to streaming changelog
Summary: It makes it clearer whether the push was successful or not.

Reviewed By: krallin

Differential Revision: D30217873

fbshipit-source-id: 6b7c3af5d794ce53504e5f92fd4d5cd6e763acc0
2021-08-10 04:08:37 -07:00
Stanislau Hlebik
ba6d889922 mononoke: log streaming changelog updates
Summary:
I'd like to replace our current python streaming changelog builder with the new
rust streaming changelog builder. One thing that's still missing is monitoring
and alarms. This diff adds basic support for that - let's log to scuba every time
we update the streaming changelog. Later we can add a detector that alarms if scuba has had no updates in a while.

Reviewed By: Croohand

Differential Revision: D30157560

fbshipit-source-id: 9740c8462ca2edf18adfe1b65b271fa0a8618cb4
2021-08-08 23:49:23 -07:00
Liubov Dmitrieva
fe8ed9d28c file upload: pass content size as a parameter
Summary:
file upload: pass content size as a parameter

We shouldn't rely on the body size, because that would not allow us to add compression and also wouldn't allow us to implement streaming.

Reviewed By: yancouto

Differential Revision: D30100887

fbshipit-source-id: c16f79fa71fe320f61d15e1328b67026f586a1dc
2021-08-06 05:37:05 -07:00
Pranjal Raihan
54762c1866 Allow list of services in metadata
Summary:
I'm changing the semantics of `metadata.thrift` here slightly.
 ---
Before:
- `ThriftServiceContext` contains `ThriftService` inline.
- `ThriftMetadata` only contains other services *referred* to by the primary service (in `ThriftServiceContext`). This includes all base classes **but not the service itself**.
- Understanding the service class hierarchy requires traversing through each `parent` field and looking up the name in `ThriftMetadata`.
 ---
After:
- `ThriftServiceContextRef` contains just the service name.
- `ThriftMetadata` now includes the service itself (previously inlined in `ThriftServiceContext`).
- `services` field lists all names of services in the class hierarchy in order from most to least derived. **These semantics are needed to support `MultiplexAsyncProcessorFactory` where the concept of a single `parent` falls apart**.
 ---

After migrating all clients, we can remove `ThriftServiceContext` completely. It's now deprecated.

For `py3`, I've removed `extractMetadataFromServiceContext` because it's no longer needed. All it was doing was adding the inline `ThriftServiceContext` into the metadata's map... which we do by default now.

Reviewed By: yfeldblum

Differential Revision: D29952004

fbshipit-source-id: 13c62aafabbfc287ad64489c02104dd977be71ce
2021-08-05 14:24:22 -07:00
Yan Soares Couto
fe1728f79b Verify uploaded bonsai changeset
Summary:
Now we validate the bonsai changeset uploaded via edenapi by using the `RepoWriteContext.create_changeset` function, instead of directly creating the changeset with `BonsaiChangesetMut`.

I left a comment with a possible future improvement, where we can use upload tokens on `create_changeset` to avoid querying the blobstore for file size.

Differential Revision: D30045939

fbshipit-source-id: 84bb383879f8a25464044487eb99bd38b2849537
2021-08-05 09:29:45 -07:00
Yan Soares Couto
cd8fde2864 Simplify server bonsai changeset upload
Summary:
This simplifies both client and server code to make bonsai changeset uploading simpler for snapshots, as we only need a single commit, no mutations, etc.

This will make it easier to validate the bonsai changeset on the next diff.

It is fine to change both client and server as this code is not yet in production, so we don't need to worry about rollout order.

Reviewed By: StanislavGlebik

Differential Revision: D30044542

fbshipit-source-id: d14bf58d671bc3bb5ff54b07c21f1781a043e0cf
2021-08-05 09:29:45 -07:00
Yan Soares Couto
f64520a312 On lookup call, return file size metadata
Summary:
This diff addresses [this comment](https://www.internalfb.com/diff/D29849964 (4bde7b7488)?dst_version_fbid=244353817531864&transaction_fbid=342353780770798).

- It removes the bit of code in `process_files_upload` that adds file size to the metadata.
- In order for this not to break the bonsai upload, I made it so the lookup call returns upload tokens with file size when looking up a file.
- Took the opportunity to do some refactoring
  - Consolidated duplicated functions in `convert_file_to_content_id`, and added some helpful From implementations to make calling it more ergonomic.
  - `convert_file_to_content_id` now doesn't fail when the file doesn't exist, and instead returns an Option (also fixed the callsite)

Reviewed By: liubov-dmitrieva

Differential Revision: D30016963

fbshipit-source-id: aae8a085d7a207e50679bb1210277a9e21a32de8
2021-08-05 09:29:45 -07:00
Yan Soares Couto
2681fcbf34 snapshot: Print changeset ID on createremote
Summary: Using changes from D29995429, this returns the upload token of the changeset upload in the uploadsnapshot response.

Reviewed By: StanislavGlebik

Differential Revision: D30012368

fbshipit-source-id: 5ca54763153a474d1ce3c38ddeaa0efff071b09c
2021-08-05 09:29:44 -07:00
Yan Soares Couto
4ed5f8726f Add ChangesetId to UploadToken and use it on /upload/changeset/bonsai
Summary:
Using the new macros from previous diffs, this creates a new `ChangesetId` edenapi type and adds it to AnyId, which allows it to be used from UploadToken.

It then adds the lookup method for it, and returns it from upload_bonsai_changeset call (instead of a fake HgId UploadToken).

This will be used so that the client can know the changeset id of the uploaded snapshot.

Reviewed By: StanislavGlebik

Differential Revision: D29995429

fbshipit-source-id: e2ee4b9b0ac21d6f5394afacbfed1802da64013b
2021-08-05 09:29:44 -07:00
Jan Mazur
5949160ff0 don't log unique, structured data into one scuba column
Summary: As in the title.

Reviewed By: farnz

Differential Revision: D30129586

fbshipit-source-id: cc541c90a55394b878657d8471182e76dbe7619f
2021-08-05 04:31:54 -07:00
Jan Mazur
32219290d1 let client connect to local proxy port
Summary: This will make LFS and mononoke wireproto traffic go through a http proxy. It's behind `--config auth_proxy.http_proxy`.

Reviewed By: farnz

Differential Revision: D29935440

fbshipit-source-id: be9a5fb7579ad8d750edf4b3c3a24fac7005679c
2021-08-04 10:55:02 -07:00
Kuba Zika
bab69077dd Split no_bad_filenames repo hook
Summary:
Split extension filtering functionality from `no_bad_filenames`.

This diff does not modify `no_bad_filenames`. I am planning to
land and deploy this diff, then update the Mononoke configuration
to start using the new hooks, then land and deploy the next diff
which will remove redundant functionality from `no_bad_filenames`.

Reviewed By: krallin

Differential Revision: D29997126

fbshipit-source-id: 2b76b6275b491f3e8950ec4cfd2b4a3dacb929c9
2021-08-04 09:35:07 -07:00
Alex Hornby
2f28c4121c rust: remove chashmap from cargo vendoring
Summary: Previous diffs switched all our usage from chashmap to dashmap as dashmap upstream is more responsive. Now remove chashmap from the cargo vendoring.

Reviewed By: dtolnay

Differential Revision: D30046522

fbshipit-source-id: 111ef9375bd8095f8b7c95752ecbc1988fb0438d
2021-08-04 07:31:08 -07:00
Stanislau Hlebik
ea794c29b4 mononoke: make it possible to automatically select parents when manually rewriting
Summary:
One of the things that megarepo_tool can do is to manually rewrite a commit
from one repo to another with a particular commit remapping version.

e.g.

```

source repo
X
|
P

target repo
X'  <- rewritten X
|
A   <- commit that exists only in target repo
|
P'  <- rewritten P

```

Previously it always required manually setting the parents in the target repo,
i.e. in the example above we'd need to specify that A is the new parent of the
rewritten commit.

However this is not always convenient. Sometimes we just want megarepo_tool to take
the parents in the source repo (i.e. P in the example above), remap them to the large
repo (i.e. P' in the example above), and use P' as the target repo parent.

This diff adds a special option that lets us do so.
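The parent selection described above can be sketched as a lookup in a remapping table (an illustrative model only; the real tool works on changeset ids and the commit sync mappings):

```rust
use std::collections::HashMap;

// Hypothetical sketch: remap source-repo parents into the target repo,
// failing when a parent has no known remapping.
fn remap_parents(
    source_parents: &[&str],
    remapping: &HashMap<&str, &str>,
) -> Result<Vec<String>, String> {
    source_parents
        .iter()
        .map(|p| {
            remapping
                .get(p)
                .map(|t| t.to_string())
                .ok_or_else(|| format!("parent {} was never remapped to the target repo", p))
        })
        .collect()
}

fn main() {
    let mut remapping = HashMap::new();
    remapping.insert("P", "P'");
    // X's parent P remaps to P', which becomes the target-repo parent.
    assert_eq!(remap_parents(&["P"], &remapping), Ok(vec!["P'".to_string()]));
    // A parent with no remapping is an error, so the caller can fall back
    // to specifying parents manually.
    assert!(remap_parents(&["Q"], &remapping).is_err());
    println!("ok");
}
```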

Reviewed By: farnz

Differential Revision: D30040016

fbshipit-source-id: 116dbe1803857053336ca76d0a65dbca8b14bd73
2021-08-03 08:46:41 -07:00
Alex Hornby
535a580fb1 mononoke: removed chashmap from skiplist
Summary: Switch from chashmap to dashmap as dashmap upstream is more responsive.

Reviewed By: StanislavGlebik

Differential Revision: D30044747

fbshipit-source-id: a8eef5140542ddce4199bd052af01f41c75b53e8
2021-08-03 01:44:36 -07:00
Mark Juggurnauth-Thomas
676ac14070 mononoke_api: remove dependency on futures-old
Summary: This no longer depends on old-style futures, so we can remove the dependency.

Reviewed By: quark-zju

Differential Revision: D27708962

fbshipit-source-id: fd66fb2934ff631abe0bfcdae843fcc9b10d5fdc
2021-08-02 14:14:19 -07:00
Simon Farnsworth
370a536f4a Provide a way to write Megarepo configs to disk for testing
Summary:
In integration tests, we want to be able to run through the megarepo processing, and then check that configs have persisted correctly, so that we can start async workers after sending a config change down, and see the change be picked up.

Make it possible

Reviewed By: StanislavGlebik

Differential Revision: D30012106

fbshipit-source-id: f944165e7b93451180a78d8287db8a59d71bbe13
2021-08-02 13:53:04 -07:00
Alex Hornby
e389e2a151 mononoke: remove chashmap usage from blobrepo_utils
Summary: Switch from chashmap to dashmap as dashmap upstream is more responsive.

Reviewed By: StanislavGlebik

Differential Revision: D30044510

fbshipit-source-id: 8003ecba2f9c5d16e9cb6dced28f3785a062870d
2021-08-02 13:02:29 -07:00
Stanislau Hlebik
5e8e82fba8 mononoke: add "tag" to streaming_changelog_chunks
Reviewed By: krallin

Differential Revision: D30015700

fbshipit-source-id: df8b61a69d781e1e8d7ab2e2cbaa148c4859cb97
2021-08-02 10:33:44 -07:00
Stanislau Hlebik
9a0d8a1019 mononoke: start logging repo name to scribe commit queue
Summary:
Let's log the repo name since it's clearer for people than repo ids. In my mind, logging
repo ids was a mistake - the repo id is an implementation detail (we use repo ids
because they are more efficient to store in an xdb table than strings), and
Mononoke users shouldn't need to care about repo ids. So let's start logging
repo names.

Reviewed By: krallin

Differential Revision: D30040409

fbshipit-source-id: 71c2794d8122e616850662cda27c8092d382de7a
2021-08-02 06:48:48 -07:00
Yan Soares Couto
c7f872602c Common implementation for all hash types
Summary:
The code was heavily duplicated between ContentId, Sha1 and Sha256, and in this stack I plan to implement even more hashes, so I made this macro, which makes it really easy to do so.

I took the opportunity to make the inner field not public, and only accessible via from/into.
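The shape of the macro can be sketched like this (names are illustrative, not the real Mononoke macro): one `macro_rules!` invocation generates each newtype hash wrapper with a private inner field, accessible only via From/Into.

```rust
// Illustrative sketch of the macro approach: a single macro expands to a
// newtype per hash, keeping the inner bytes private.
macro_rules! define_hash_type {
    ($name:ident, $len:expr) => {
        #[derive(Clone, Copy, PartialEq, Eq, Debug)]
        pub struct $name([u8; $len]);

        impl From<[u8; $len]> for $name {
            fn from(bytes: [u8; $len]) -> Self {
                $name(bytes)
            }
        }

        impl From<$name> for [u8; $len] {
            fn from(hash: $name) -> Self {
                hash.0
            }
        }
    };
}

define_hash_type!(Sha1, 20);
define_hash_type!(Sha256, 32);

fn main() {
    // The inner field is not public; round-trip only via From/Into.
    let sha1 = Sha1::from([0u8; 20]);
    let bytes: [u8; 20] = sha1.into();
    assert_eq!(bytes, [0u8; 20]);
    println!("ok");
}
```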

Reviewed By: liubov-dmitrieva

Differential Revision: D29992279

fbshipit-source-id: c0b7225a3634071a1b1513119ec516d14bd8fd9e
2021-08-02 05:37:20 -07:00
Yan Soares Couto
cbde50c591 Create simple bonsai commit
Summary:
On `createremote`, create a changeset using the create_bonsai_changesets method created earlier.

For now, this changeset is created with a bunch of placeholders, but I added todos for all of those things and will tackle them over the next diffs; otherwise this would be a massive diff and hard to review.

Reviewed By: liubov-dmitrieva

Differential Revision: D29990295

fbshipit-source-id: 6b4c97887c0b4c017c586bf0ea06f12df9d07d23
2021-08-02 05:37:20 -07:00
Yan Soares Couto
ab8d843ebd Upload modified files
Summary: This makes the `createremote` command also upload modified files to Mononoke, which will later be used to populate the snapshot.

Reviewed By: liubov-dmitrieva

Differential Revision: D29989181

fbshipit-source-id: 5a3b8d7133d6b27ea291ca01d14432a38d92f866
2021-08-02 05:37:20 -07:00
Yan Soares Couto
715a04f253 Create ephemeral bubble from createremote command
Summary:
This starts adding very basic behaviour to the createremote command.

- Added a `uploadsnapshot` method to the python/rust api. This will be used by the `createremote` command. It will create a bubble, and upload a snapshot to it. For now it just creates a bubble. The request/response objects are still subject to change.
- Added basic code to `createremote` that calls `uploadsnapshot`. It gets the modified files, but for now does nothing with them. I believe I'll have to read their content in Rust code, as they are not in the "hg filestore", since they're not committed.

Reviewed By: StanislavGlebik

Differential Revision: D29987594

fbshipit-source-id: d1e332bb6d1baf9e90efdd2173474e8f3ebcc0e7
2021-08-02 05:37:20 -07:00
David Tolnay
c86c8d71ce Update to Rust 1.54.0
Summary: Release notes: https://blog.rust-lang.org/2021/07/29/Rust-1.54.0.html

Reviewed By: zertosh

Differential Revision: D30007531

fbshipit-source-id: ad85c526cb24bd111bef820c7887c4f5e44b79fe
2021-07-30 08:42:15 -07:00
Yan Soares Couto
2c70b3a9bf Initial integration test
Summary:
Created a very simple integration test, which will be useful as test plan for a lot of diffs.

It modifies a file and calls `hg snapshot createremote`, which for now does nothing. It already sets up edenapi, which will be necessary in following diffs.

Reviewed By: liubov-dmitrieva

Differential Revision: D29961686

fbshipit-source-id: 20e89a2d011daa35243d3a99d90a468f90000f15
2021-07-30 03:24:34 -07:00
David Tolnay
aa8152f1dd Make thrift-generated dyn async traits future compatible
Summary:
Using the Thrift-generated server traits as dyn trait objects was emitting future-compatibility warnings with recent versions of rustc, due to a now-fixed soundness hole in the trait object system:

```
error: the trait `x_account_aggregator_if::server::XAccountAggregator` cannot be made into an object
     |
     = this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
```

This diff pulls in https://github.com/dtolnay/async-trait/releases/tag/0.1.51 which results in the Thrift-generated server traits no longer hitting the problematic pattern.

Reviewed By: zertosh

Differential Revision: D29979939

fbshipit-source-id: 3e6e976181bfcf35ed453ae681baeb76a634ddda
2021-07-29 16:25:33 -07:00
Liubov Dmitrieva
8e99088d58 Update backup state in hg cloud upload command
Summary:
Update backup state in `hg cloud upload` command

The backup state is used by `hg sl`, so it would be nice to keep it up-to-date after `hg cloud upload` command, similar to old `hg cloud backup`.

Also, we should add the heads that we filtered in order to update the backup state correctly.

So, it will now return the list of uploaded heads as nodes (including filtered ones) and the list of failed commits as nodes (not only heads).

Reviewed By: markbt

Differential Revision: D29878296

fbshipit-source-id: 5848e9f86175fbdc56db123cf7ba0d5fc51273b0
2021-07-29 12:11:57 -07:00
Stanislau Hlebik
95e9e913bf mononoke: use shared error to improve error messages when inserting into the blobstore sync queue
Summary:
We didn't print the underlying causes of insertion failure. The reason was that
```
let s = format!("failed to insert {}", err);
```

used `{}`, and in order to print causes we need either `{:#}` or `{:?}` - see https://docs.rs/anyhow/1.0.42/anyhow/struct.Error.html#display-representations.

However krallin suggested that we can achieve the same by converting the error to SharedError instead of stringifying it. Let's do that instead.
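The formatting difference can be illustrated with a std-only error chain (illustrative types, not the real blobstore sync queue code): `{}` loses the cause, while walking `source()` - which is what anyhow's `{:#}` representation does - keeps it.

```rust
use std::error::Error;
use std::fmt;

// Illustrative wrapper error whose `Display` drops the cause, the way
// `format!("failed to insert {}", err)` did with an anyhow error.
#[derive(Debug)]
struct InsertError {
    cause: std::num::ParseIntError,
}

impl fmt::Display for InsertError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to insert")
    }
}

impl Error for InsertError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

// Walk the source chain, similar to anyhow's `{:#}` representation.
fn full_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut source = err.source();
    while let Some(e) = source {
        out.push_str(&format!(": {}", e));
        source = e.source();
    }
    out
}

fn main() {
    let err = InsertError { cause: "x".parse::<i32>().unwrap_err() };
    // `{}` shows only the top-level message, hiding the cause.
    assert_eq!(format!("{}", err), "failed to insert");
    // The chained form keeps the underlying ParseIntError message.
    assert!(full_chain(&err).contains("invalid digit"));
    println!("{}", full_chain(&err));
}
```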

Reviewed By: krallin

Differential Revision: D29985083

fbshipit-source-id: 8ae3abcfc4db9ef62581a3e20462eb6bbfb401b6
2021-07-29 11:13:05 -07:00
Stanislau Hlebik
eb55bb4284 mononoke: make sure multiplexed blobstore write succeeds if all underlying
Summary:
We had somewhat inconsistent behaviour in multiplexed blobstore:
1) If on_put handlers are too slow (i.e. they are slower than all blobstores) then we
succeed as soon as all blobstores were successful (regardless of the value of
minimum_successful_writes). It doesn't matter if on_put handlers fail or
succeed, we've already returned success to our user.
2) However if all writes to the queue quickly fail, then we return a failure
even if writes to all blobstore were successful.

#2 seems like a change in behaviour from an old diff D17421208 (9de1de2d8b), and not a
desirable one - if blobstore sync queue is unavailable and it responds with
failures quickly, then blobstore writes will always fail even if all blobstores
are healthy.

So this diff makes it so that we always succeed if all blobstore puts were
successful, regardless of success or failures of on_put handlers.
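The decision after this diff can be modeled roughly like this (an assumed simplification using success counts, not the real MultiplexedBlobstore future handling):

```rust
// Toy model of the success rule: all-puts-succeeded always wins, regardless
// of what the on_put queue handlers did.
fn multiplexed_put_ok(
    puts_ok: usize,
    puts_total: usize,
    queue_write_ok: bool,
    minimum_successful_writes: usize,
) -> bool {
    // After this diff: if every underlying blobstore put succeeded, the
    // overall write succeeds even when the on_put handlers all failed.
    if puts_ok == puts_total {
        return true;
    }
    // Partial success still needs the queue entry so the missing copies
    // can be healed later (assumed simplification of the real logic).
    puts_ok >= minimum_successful_writes && queue_write_ok
}

fn main() {
    // All blobstores ok, queue down: now a success (previously a failure).
    assert!(multiplexed_put_ok(3, 3, false, 1));
    // Partial success with a queue entry is still acceptable.
    assert!(multiplexed_put_ok(2, 3, true, 2));
    // Partial success without a queue entry is not.
    assert!(!multiplexed_put_ok(2, 3, false, 2));
    println!("ok");
}
```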

Reviewed By: liubov-dmitrieva

Differential Revision: D29985084

fbshipit-source-id: 64338d552be45a70d9b1d16dfbe7d10346ab539c
2021-07-29 08:37:53 -07:00
Yan Soares Couto
e2a7bc89db Move SCS tests out of OSS
Reviewed By: StanislavGlebik

Differential Revision: D29940309

fbshipit-source-id: ca22a2b47265f92f4e6bd29f8c5959511f8bad04
2021-07-29 05:36:40 -07:00
Stanislau Hlebik
03f5a60109 mononoke: log resulting cs_id from megarepo calls
Summary: It's useful for debugging

Reviewed By: mojsarn

Differential Revision: D29960133

fbshipit-source-id: e026b473b4a9fecebe41f2fff22dd57d514e51ab
2021-07-29 02:33:58 -07:00
Stanislau Hlebik
da4c040209 mononoke: use bookmark subscription in derived data tailer
Summary:
At the moment we read all bookmarks from the leader database all the time. This is
quite wasteful for repos with a large number of bookmarks. Let's instead use
BookmarksSubscription - it uses the bookmark update log to read only the bookmarks
that changed.

Reviewed By: krallin

Differential Revision: D29964975

fbshipit-source-id: 1cd8bc61c363e8254f0663139f90fef24b9df93e
2021-07-29 02:09:58 -07:00
Stanislau Hlebik
90ec89c6ae mononoke: track idle time in derived data tailer
Summary:
It's nice to know how much time the tailer spends deriving things, and how long it's
idling. It can hint at how much headroom we have.

Reviewed By: farnz

Differential Revision: D29963128

fbshipit-source-id: 179c140d20f1097e7059a13549e39ae63ffd8198
2021-07-29 02:09:58 -07:00
Stanislau Hlebik
fbcb42a51f mononoke: remove dry_run functionality from backfill_derived_data
Summary:
It wasn't really ever used, and it's quite complicated and unnecessary. Let's
just remove it

Reviewed By: krallin

Differential Revision: D29963129

fbshipit-source-id: d31ec788fe31e010dcc8f110431f4e4fbda21778
2021-07-29 02:09:58 -07:00
Stanislau Hlebik
b9ce9c0933 mononoke: make sync_changeset return result immediately if it was computed
Summary:
Just as with D29874802 and D29848377, let's make sure that if the same
sync_changeset request is sent again, we return the same result.

Reviewed By: mojsarn

Differential Revision: D29876414

fbshipit-source-id: 91c3bd38983809da8ce246f44066204df667bb12
2021-07-28 10:03:26 -07:00
Stanislau Hlebik
34f0396fa0 mononoke: move bookmark in sync_changeset conditionally
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create and it won't move bookmarks to unpredictable places)
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if the current location of the bookmark is at the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move a bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful) while the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check if this request was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

Now that we have target_location in sync_changeset() method,
let's move bookmark in sync_changeset conditionally, just as in D29874803 (5afc48a292).

This would prevent race conditions from happening when the same sync_changeset
method is executing twice.
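The conditional move amounts to a compare-and-swap on the bookmark (a minimal sketch with string names standing in for changeset ids, not the real bookmarks transaction code):

```rust
use std::collections::HashMap;

// Sketch of a compare-and-swap bookmark move: the move succeeds only when
// the bookmark is still where the client expects it.
fn move_bookmark_conditional(
    bookmarks: &mut HashMap<String, String>,
    name: &str,
    expected_old_target: &str,
    new_target: &str,
) -> Result<(), String> {
    let current = bookmarks.get(name).cloned();
    match current {
        Some(cur) if cur == expected_old_target => {
            bookmarks.insert(name.to_string(), new_target.to_string());
            Ok(())
        }
        Some(cur) => Err(format!(
            "bookmark {} is at {}, expected {}",
            name, cur, expected_old_target
        )),
        None => Err(format!("bookmark {} does not exist", name)),
    }
}

fn main() {
    let mut bookmarks = HashMap::new();
    bookmarks.insert("target".to_string(), "A".to_string());
    // The first of two identical concurrent requests wins the race...
    assert!(move_bookmark_conditional(&mut bookmarks, "target", "A", "B").is_ok());
    // ...and the duplicate fails cleanly instead of re-moving the bookmark.
    assert!(move_bookmark_conditional(&mut bookmarks, "target", "A", "B").is_err());
    println!("ok");
}
```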

Reviewed By: krallin

Differential Revision: D29876413

fbshipit-source-id: c076e14171c6615fba2cedf4524d442bd25f83ab
2021-07-28 10:03:26 -07:00
Stanislau Hlebik
c5162598f0 mononoke: add target_location to sync_changeset method
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create and it won't move bookmarks to unpredictable places)
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if the current location of the bookmark is at the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move a bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful) while the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check if this request was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

We already have it for change_target_config, and it's useful to prevent races
and inconsistencies. That's especially important given that our async request
worker might run a few identical sync_changeset methods at the same time, and
target_location can help process this situation correctly.

Let's add target_location to sync_changeset, and while there I also updated the
comment for these fields in other methods. The comment said

```
// This operation will succeed only if the
// `target`'s bookmark is still at the same location
// when this operation tries to advance it
```

This is not always the case - the operation might succeed if the same operation has been
re-sent twice; see previous diffs for more explanation and motivation.

Reviewed By: krallin

Differential Revision: D29875242

fbshipit-source-id: c14b2148548abde984c3cb5cc62d04f920240657
2021-07-28 10:03:26 -07:00
Stanislau Hlebik
c9473f74f6 mononoke: make change_target_config return result immediately if it was computed
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create and it won't move bookmarks to unpredictable places)
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if the current location of the bookmark is at the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move a bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful) while the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check if this request was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

Same as with D29848377 - if the result was already computed and the client retries the
same request, then return it.

Differential Revision: D29874802

fbshipit-source-id: ebc2f709bc8280305473d6333d0725530c131872
2021-07-28 10:03:26 -07:00
Stanislau Hlebik
47e92203dc mononoke: make add_sync_target return result immediately if it was computed
Summary:
# Goal of the stack

The goal of this stack is to make the megarepo API safer to use. In particular, we want to achieve:
1) If the same request is executed a few times, then it won't corrupt the repo in any way (i.e. it won't create commits that the client didn't intend to create and it won't move bookmarks to unpredictable places)
2) If a request finished successfully, but we failed to send the success to the client, then repeating the same request will finish successfully.

Achieving #1 is necessary because async_requests_worker might execute a few requests at the same time (though this should be rare). Achieving #2 is necessary because if we fail to send a successful response to the client (e.g. because of network issues), we want the client to retry and get this successful response back, so that the client can continue with their next request.

In order to achieve #1 we make all bookmark moves conditional, i.e. we move a bookmark only if the current location of the bookmark is at the place where the client expects it. This should help achieve goal #1, because even if we have two requests executing at the same time, only one of them will successfully move a bookmark.

However, once we achieve #1 we have a problem with #2 - if a request was successful, but we failed to send a successful reply back to the client, then the client will retry the request, and it will fail, because the bookmark is already at the new location (because the previous request was successful) while the client expects it to be at the old location (because the client doesn't know that the request was successful). To fix this issue, before executing the request we check if this request was already successful, and we do it heuristically by checking the request parameters and verifying the commit remapping state. This doesn't protect against malicious clients, but it should protect from issue #2 described above.

So the whole stack of diffs is the following:
1) take a method from megarepo api
2) implement a diff that makes bookmark moves conditional
3) Fix the problem #2 by checking if a previous request was successful or not

# This diff

If a previous add_sync_target() call was successful on the Mononoke side, but we
failed to deliver this result to the client (e.g. because of network issues), then the
client would just retry the call. Before this diff that wouldn't work (i.e. we
would just fail to create the bookmark because it already exists). This diff fixes
it by checking the commit the bookmark points to and checking whether it looks like
it was created by a previous add_sync_target call. In particular, it checks
that the remapping state file matches the request parameters, and that the config
version is the same.

Differential Revision: D29848377

fbshipit-source-id: 16687d975748929e5eea8dfdbc9e206232ec9ca6
2021-07-28 10:03:26 -07:00
Stanislau Hlebik
e17e77eea3 mononoke: add repo_id parameter when finding abandoned requests
Summary:
Addressing comment from
https://www.internalfb.com/diff/D29845826 (f4a078e257)?transaction_fbid=1017293239127849

Reviewed By: krallin

Differential Revision: D29955591

fbshipit-source-id: a99bdd9dd8181e5cba54944d4957ce56b8ecb4f3
2021-07-28 06:23:31 -07:00
Yan Soares Couto
cc498b04c4 Use common response for methods with upload tokens
Summary:
There were 3 places that use the same type of response:
```
Response {
   index: usize,
   token: UploadToken,
}
```

This diff merges all of them by using a single `UploadTokensResponse`. I'm still using aliases (`use as`) for all of them; if desired I can rename everywhere to use the actual type `UploadTokensResponse`.

Reviewed By: liubov-dmitrieva

Differential Revision: D29878626

fbshipit-source-id: 92af2d4c40eae42edd0a8594642ef0b816df4feb
2021-07-28 02:16:35 -07:00
Yan Soares Couto
4bde7b7488 Use bonsai changeset upload on client
Summary:
## High level goal

This stack aims to add a way to upload commits directly using the bonsai format via edenapi, instead of using the hg format and converting on server.

The reason this is necessary is that snapshots will be uploaded on bonsai format directly, as hg format doesn't support them. So this is a stepping stone to do that, first being implemented on commit cloud upload, as that code already uses eden api, and later will be used by the snapshotting commands.

## This diff

This diff actually ties everything together from the stack and makes it work end to end. By creating the following client side changes:
- Add some config to use the bonsai format when uploading via EdenApi. The config is disabled by default.
- Add wrapper around new uploadfileblobs method (from D29799484 (8586ae1077))
- Getting the correct data to call the bonsai changeset upload endpoint created on D29849963 (b6548a10cb)
  - Some fields are String and not bytes
  - Some fields are renamed
  - File size and type can be acquired from file context. file content id, which is also required, is obtained as a response from the uploadfileblobs method: Behaviour added on D29879617 (9aae11a5ab)

Reviewed By: liubov-dmitrieva

Differential Revision: D29849964

fbshipit-source-id: a039159f927f49bbc45d4e0160ec1d3a01334eca
2021-07-28 02:16:35 -07:00
Stanislau Hlebik
ad0c9b7e2c mononoke: add more scuba logging to async request worker
Summary: It's nice to understand what's going on

Reviewed By: liubov-dmitrieva

Differential Revision: D29846694

fbshipit-source-id: 7551199ef4529e45c0eb23f79c0cc4a71ba54d0f
2021-07-27 14:12:54 -07:00
Stanislau Hlebik
f4a078e257 mononoke: make sure async megarepo requests are picked up by another worker if current worker dies
Summary:
High-level goal of this diff:
We have a problem in long_running_request_queue - if a tw job dies in the
middle of processing a request then this request will never be picked up by any
other job, and will never be completed.
The idea of the fix is fairly simple - while a job is executing a request it
needs to constantly update inprogress_last_updated_at field with the current
timestamp. In case a job dies then other jobs would notice that timestamp
hasn't been updated for a while and mark this job as "new" again, so that
somebody else can pick it up.
Note that it obviously doesn't prevent all possible race conditions - the worker
might just be too slow and not update the inprogress timestamp in time, but
that race condition we'd handle on other layers i.e. our worker guarantees that
every request will be executed at least once, but it doesn't guarantee that it will
be executed exactly once.

Now a few notes about implementation:
1) I intentionally separated the methods for finding abandoned requests and for marking them new again. I did so to make it easier to log which requests were abandoned (logging will come in the next diffs).

2) My original idea (D29821091) had an additional field called execution_uuid, which would be changed each time a new worker claims a request. In the end I decided it's not worth it - while execution_uuid can reduce the likelihood of two workers running at the same time, it doesn't eliminate it completely. So I decided that execution_uuid doesn't really give us much.

3) It's possible that two workers will be executing the same request and updating the same inprogress_last_updated_at field. As I mentioned above, this is expected, and the request implementation needs to handle it gracefully.
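The healing pass from point 1 can be sketched as follows (a toy in-memory model; the real queue is a SQL table with more fields):

```rust
// Toy model of the healing pass: any inprogress request whose heartbeat is
// older than `timeout_secs` is considered abandoned. Marking it "new" again
// is a separate step, so the abandonment can be logged first.
#[derive(Debug, PartialEq)]
enum Status {
    New,
    InProgress { inprogress_last_updated_at: u64 },
}

fn find_abandoned(requests: &[(u64, Status)], now: u64, timeout_secs: u64) -> Vec<u64> {
    requests
        .iter()
        .filter_map(|(id, status)| match status {
            Status::InProgress { inprogress_last_updated_at }
                if now.saturating_sub(*inprogress_last_updated_at) > timeout_secs =>
            {
                Some(*id)
            }
            _ => None,
        })
        .collect()
}

fn main() {
    let requests = vec![
        (1, Status::InProgress { inprogress_last_updated_at: 100 }), // stale heartbeat
        (2, Status::InProgress { inprogress_last_updated_at: 990 }), // still heartbeating
        (3, Status::New),
    ];
    // With now=1000 and a 60s timeout, only request 1 looks abandoned.
    assert_eq!(find_abandoned(&requests, 1000, 60), vec![1]);
    println!("ok");
}
```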

Reviewed By: krallin

Differential Revision: D29845826

fbshipit-source-id: 9285805c163b57d22a1936f85783154f6f41df2f
2021-07-27 14:12:53 -07:00
Stanislau Hlebik
9271300067 mononoke: mark some fields as nullable
Summary:
Currently they get zeros by default, but having NULL here seems like a nicer
option.

Reviewed By: krallin

Differential Revision: D29846254

fbshipit-source-id: 981d979055eca91594ef81f0d6dc4ba571a2e8be
2021-07-27 14:12:53 -07:00
Stanislau Hlebik
3b7d6bdfae mononoke: bring long_running_request_queue in sync with what we have in prod
Reviewed By: krallin

Differential Revision: D29817070

fbshipit-source-id: 37b029e74c54df7ff5a7bd4a1c8ef3f85fff127c
2021-07-27 14:12:53 -07:00
Stanislau Hlebik
8cd6278de9 mononoke: implement ensure_ancestors_of option for bookmarks
Summary:
This option lets us specify that a given bookmark (or bookmarks, if they are
specified via a regex) is allowed to move only if it stays an ancestor of a
given bookmark.
Note - this is a sev followup, and we intend to use it for */stable bookmarks
(e.g. fbcode/stable, fbsource/stable etc). They are always intended to be an
ancestor of master
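The check being enforced is an ancestry walk (a minimal sketch with string names and a parent map standing in for the real commit graph and changeset ids):

```rust
use std::collections::{HashMap, HashSet};

// Sketch: walk the ancestors of `descendant` looking for the bookmark's
// proposed target. The real implementation works on the commit graph.
fn is_ancestor(parents: &HashMap<&str, Vec<&str>>, candidate: &str, descendant: &str) -> bool {
    let mut stack = vec![descendant];
    let mut seen = HashSet::new();
    while let Some(commit) = stack.pop() {
        if commit == candidate {
            return true;
        }
        if seen.insert(commit) {
            if let Some(ps) = parents.get(commit) {
                stack.extend(ps.iter().copied());
            }
        }
    }
    false
}

fn main() {
    // master is C, with history C -> B -> A; a side commit D branches off A.
    let mut parents = HashMap::new();
    parents.insert("C", vec!["B"]);
    parents.insert("B", vec!["A"]);
    parents.insert("D", vec!["A"]);
    // A stable bookmark may move to B, an ancestor of master...
    assert!(is_ancestor(&parents, "B", "C"));
    // ...but not to D, which is not an ancestor of master.
    assert!(!is_ancestor(&parents, "D", "C"));
    println!("ok");
}
```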

Reviewed By: krallin

Differential Revision: D29878144

fbshipit-source-id: a5ce08a09328e6a19af4d233c1a273a5e620b9ce
2021-07-27 12:47:22 -07:00
Yan Soares Couto
5ee6cc870b Read parents hgid from blobrepo when not local
Summary:
## High level goal

This stack aims to add a way to upload commits directly using the bonsai format via edenapi, instead of using the hg format and converting on server.

The reason this is necessary is that snapshots will be uploaded on bonsai format directly, as hg format doesn't support them. So this is a stepping stone to do that, first being implemented on commit cloud upload, as that code already uses eden api, and later will be used by the snapshotting commands.

## This diff

This diff fixes the bonsai changeset upload endpoint by making it look up the parent changesets from their hgids by querying blobrepo. The inner map is not enough, as the bottom of the stack always has a parent outside of the stack.

Reviewed By: liubov-dmitrieva

Differential Revision: D29880356

fbshipit-source-id: b6b5428159e8c74f5a910f39dadb98aa10c78542
2021-07-27 05:46:41 -07:00
Yan Soares Couto
b6548a10cb Add upload bonsai changeset endpoint
Summary:
## High level goal

This stack aims to add a way to upload commits directly using the bonsai format via edenapi, instead of using the hg format and converting on server.

The reason this is necessary is that snapshots will be uploaded on bonsai format directly, as hg format doesn't support them. So this is a stepping stone to do that, first being implemented on commit cloud upload, as that code already uses eden api, and later will be used by the snapshotting commands.

## This diff

This diff creates an endpoint on eden api which uploads a commit using the bonsai format.

It also adds all the necessary types to represent a bonsai commit (basically the same as hg commit, but no manifests, and a bit more detail on how each file changed) via the wire, and related boilerplate.

Reviewed By: liubov-dmitrieva

Differential Revision: D29849963

fbshipit-source-id: 2ff44d53874449ae4373a0135a60ead40c541309
2021-07-27 05:46:40 -07:00
Stanislau Hlebik
30395f41e2 mononoke: print latest error when reading megarepo configs
Summary: It makes it easier to understand what went wrong

Reviewed By: krallin

Differential Revision: D29894836

fbshipit-source-id: 1bc759067350b823d388fcab9a8cee41da4423af
2021-07-27 02:13:09 -07:00
Stanislau Hlebik
5dcc30a4b1 mononoke: fix megarepo logging to use correct method name
Reviewed By: krallin

Differential Revision: D29894709

fbshipit-source-id: 3f33df57cd0c32b40eb55dc02ef3820138a423d0
2021-07-27 02:13:09 -07:00
Arun Kulshreshtha
14d8c051c1 third-party/rust: remove patch from curl and curl-sys
Summary:
The patches to these crates have been upstreamed.

allow-large-files

Reviewed By: jsgf

Differential Revision: D29891894

fbshipit-source-id: a9f2ee0744752b689992b770fc66b6e66b3eda2b
2021-07-26 15:00:16 -07:00
Mark Juggurnauth-Thomas
c8b33fd580 blame: enable batch derivation of blame_v2
Summary:
Implement batch derivation of blame V2.

Blame derivations are independent so long as the two commits do not change or
delete any of the same files.  We can re-use the existing batching code so long
as we change it to split the stacks on *any* change (not just a
change-vs-delete conflict).
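The splitting rule can be modeled roughly like this (an assumed reduction of the real batching code, with file path lists standing in for changesets):

```rust
use std::collections::HashSet;

// Simplified model of the batching rule: walk the stack in topological
// order and cut a new batch whenever a commit touches any file already
// touched in the current batch, since those blame derivations conflict.
fn split_batches(stack: &[Vec<&str>]) -> Vec<Vec<usize>> {
    let mut batches = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut touched: HashSet<&str> = HashSet::new();
    for (i, files) in stack.iter().enumerate() {
        if files.iter().any(|f| touched.contains(f)) {
            batches.push(std::mem::take(&mut current));
            touched.clear();
        }
        touched.extend(files.iter().copied());
        current.push(i);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}

fn main() {
    // Commits 0 and 1 touch disjoint files, so their blame can be derived
    // in one batch; commit 2 touches a.rs again and starts a new batch.
    let stack = vec![vec!["a.rs"], vec!["b.rs"], vec!["a.rs", "c.rs"]];
    assert_eq!(split_batches(&stack), vec![vec![0, 1], vec![2]]);
    println!("ok");
}
```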

Reviewed By: StanislavGlebik

Differential Revision: D29776514

fbshipit-source-id: b06289467c9ec502170c2f851b07569214b6ff0a
2021-07-26 07:09:35 -07:00
Stanislau Hlebik
116a51bd40 ConfigHandle: use fbthrift deserialization config reading
Summary:
I noticed that reading one of the mononoke configs was failing with

```
invalid type: string "YnrbN4fJXYGlR1EzoxLRvVbibyUiRM/HZThRJnKBThA", expected
a sequence at line 2587 column 61
```

The problem is coming from the fact that configerator configs use thrift simple
json encoding, which is different from normal json encoding. At the very least
the difference is in how binary fields are encoded - thrift simple json
encoding uses base64 to encode them. [1]

Because of this encoding difference reading the configs with binary fields in
them fails.

This diff fixes it by using simple_json deserialization for
get_config_handle()... but the existing callers used the old broken
`get_config_handle()` which is
incompatible with the new one. Old `get_config_handle()` relied on the fact
that serde::Deserializer can be used to deserialize the config, while thrift
simple json doesn't implement serde::Deserializer.

As a first step I migrated existing callers to use old deprecated method, and
we can migrate them to the new one as needed.

[1] It was a bit hard to figure out for sure what kind of encoding is used, but
discussion in
https://fb.workplace.com/groups/configerator.users/posts/3062233117342191
suggests that it's thrift simple json encoding after all

Reviewed By: farnz

Differential Revision: D29815932

fbshipit-source-id: 6a823d0e01abe641e0e924a1b2a4dc174687c0b4
2021-07-25 08:53:08 -07:00
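The encoding difference is easy to reproduce: thrift simple json renders a `binary` field as a base64 string, while a plain serde JSON deserializer expects a byte sequence like `[222, 173, 190, 239]`. Below is a self-contained illustration with a hand-rolled base64 encoder (for illustration only — the real fix, as in this diff, is to use the fbthrift simple-json deserializer):

```rust
// Standard base64 alphabet (RFC 4648).
const ALPHABET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal base64 encoder, enough to show what a binary field looks like in
// thrift simple json output.
fn base64(data: &[u8]) -> String {
    let mut out = String::new();
    for chunk in data.chunks(3) {
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | b[2] as u32;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

fn main() {
    let secret: &[u8] = &[0xde, 0xad, 0xbe, 0xef];
    // serde_json would emit this field as a sequence: [222, 173, 190, 239].
    // Thrift simple json emits a base64 string instead, which a serde
    // Deserializer expecting a sequence rejects ("invalid type: string").
    assert_eq!(base64(secret), "3q2+7w==");
    println!("binary field as simple json: \"{}\"", base64(secret));
}
```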
Stanislau Hlebik
5afc48a292 mononoke: move bookmark in change_target_config conditionally
Summary:
Do a similar change to change_target_config as we've done for add_sync_target
in D29848378. Move the bookmark only if it points to the expected commit. That
makes it safer to deal with cases where the same change_target_config request
executes twice.

Reviewed By: mojsarn

Differential Revision: D29874803

fbshipit-source-id: d21a3029ee58e2a8acc41e37284d0dd03d2803a3
2021-07-24 03:55:08 -07:00
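The conditional move amounts to a compare-and-set on the bookmark position. A minimal sketch with a plain map standing in for bookmark storage (names hypothetical — this is not the actual Mononoke bookmarks API):

```rust
use std::collections::HashMap;

// Only move the bookmark if it currently points at the commit we expect,
// so a replayed request cannot clobber a newer bookmark position.
fn move_bookmark_if_expected(
    bookmarks: &mut HashMap<String, String>,
    name: &str,
    expected_old: &str,
    new: &str,
) -> bool {
    match bookmarks.get(name) {
        Some(current) if current.as_str() == expected_old => {
            bookmarks.insert(name.to_string(), new.to_string());
            true
        }
        _ => false, // bookmark missing or already moved elsewhere
    }
}

fn main() {
    let mut bm = HashMap::new();
    bm.insert("target".to_string(), "aaa".to_string());
    assert!(move_bookmark_if_expected(&mut bm, "target", "aaa", "bbb"));
    // Replaying the same request now fails instead of rewinding the bookmark.
    assert!(!move_bookmark_if_expected(&mut bm, "target", "aaa", "bbb"));
    assert_eq!(bm["target"], "bbb");
    println!("bookmark at {}", bm["target"]);
}
```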
Stanislau Hlebik
4f632c4e8b mononoke: create bookmark in add_sync_target
Summary:
This is the first diff that tries to make megarepo asynchronous methods
idempotent - replaying the same request twice shouldn't cause corruption on the
server. At the moment this is not the case - if we have a runaway
add_sync_target call, then in the end it moves the bookmark to a random place,
even if the same add_sync_target call already succeeded, with a few more calls on
top.

add_sync_target should create a new bookmark, and if a bookmark already exists
it's better to not move it to a random place.

This diff does that; however, it creates another problem - if a request was successful on the Mononoke side but we failed to deliver the successful result to the client (e.g. due to network issues), then retrying the request would fail because the bookmark already exists. This problem will be addressed in the next diff.

Reviewed By: mojsarn

Differential Revision: D29848378

fbshipit-source-id: 8a58e35c26b989a7cbd4d4ac4cbae1691f6e9246
2021-07-24 03:55:08 -07:00
Michael Voznesensky
7ed41f2b36 Bump configerator, add support for config driven no parent commits
Summary: As discussed, extends Mononoke service to support commits w/o parents for the AI Infra usecase.

Reviewed By: markbt

Differential Revision: D29810303

fbshipit-source-id: f07fd7f1521ffe1cea85f1f54e71fe37fc39bb62
2021-07-23 13:40:18 -07:00
Stanislau Hlebik
d1e86ab457 mononoke: add more logging to add sync target call
Summary: It's nice to be able to keep track of what's going on

Reviewed By: mwdevine

Differential Revision: D29790543

fbshipit-source-id: b855d72efe8826a99b3a6a562722e299e9cbfece
2021-07-22 14:52:03 -07:00
Yan Soares Couto
3f8de3336a Add bubble id to upload files call
Summary:
Added an optional argument to `/upload/file`, that allows specifying a bubble id, which will be used to upload the file into the ephemeral blobstore instead of the main one.

This is necessary in order to create a snapshot, as all files must be in the ephemeral blobstore.

Reviewed By: liubov-dmitrieva

Differential Revision: D29734333

fbshipit-source-id: c1dcf8d5a78819925f8defbfbd7d06b0f6a9e973
2021-07-22 13:47:12 -07:00
Yan Soares Couto
681a5305e3 Use NonZeroU64 as BubbleId
Summary: Insert ids are always positive, so let's use `NonZeroU64` instead of `u64`. This is more restrictive, which is good, and has the added benefit that `Option<NonZeroU64>` doesn't use any additional space, because of compiler optimizations.

Reviewed By: StanislavGlebik

Differential Revision: D29733877

fbshipit-source-id: 8a0e1a1bd84bcedbba51840f1da8f8cac79bca42
2021-07-22 13:47:12 -07:00
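The space claim above is easy to verify: the compiler uses the forbidden zero value as the `None` niche, so `Option<NonZeroU64>` is exactly as large as a bare `u64`:

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

fn main() {
    // Niche optimization: zero is forbidden, so it encodes `None` for free.
    assert_eq!(size_of::<Option<NonZeroU64>>(), size_of::<u64>());
    // A plain Option<u64> needs a separate discriminant (plus padding).
    assert_eq!(size_of::<Option<u64>>(), 2 * size_of::<u64>());
    // Construction is fallible, which also encodes "ids are positive".
    assert!(NonZeroU64::new(0).is_none());
    assert_eq!(NonZeroU64::new(42).unwrap().get(), 42);
    println!("Option<NonZeroU64> is {} bytes", size_of::<Option<NonZeroU64>>());
}
```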
Yan Soares Couto
16fbc46f2c Add ephemeral handle
Summary: Ephemeral handle is a blobstore that's built from a bubble and a "main blobstore", which first attempts to read from the ephemeral blobstore, but falls back to the main one. Will be used to read/write stuff in snapshots.

Reviewed By: liubov-dmitrieva

Differential Revision: D29733408

fbshipit-source-id: f15ae9d3009632cd71fafa88eac09986e0b958e7
2021-07-22 13:47:12 -07:00
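The fallback-read behaviour can be sketched with plain maps standing in for blobstores (hypothetical names — the real `Blobstore` trait is async and keyed differently):

```rust
use std::collections::HashMap;

// Reads try the ephemeral (bubble) store first and fall back to the main
// store; writes for snapshot data go to the bubble.
struct EphemeralHandle {
    bubble: HashMap<String, Vec<u8>>,
    main: HashMap<String, Vec<u8>>,
}

impl EphemeralHandle {
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.bubble.get(key).or_else(|| self.main.get(key))
    }

    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.bubble.insert(key.to_string(), value);
    }
}

fn main() {
    let mut h = EphemeralHandle { bubble: HashMap::new(), main: HashMap::new() };
    h.main.insert("persisted".to_string(), b"old".to_vec());
    h.put("snapshot", b"new".to_vec());
    // Falls back to the main store for already-persisted blobs...
    assert_eq!(h.get("persisted").unwrap(), b"old");
    // ...but serves ephemeral snapshot blobs from the bubble.
    assert_eq!(h.get("snapshot").unwrap(), b"new");
    println!("fallback reads ok");
}
```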
Liubov Dmitrieva
bf41088ef2 Move EdenApi Uploads code from commit cloud extension to core
Summary:
Move EdenApi Uploads code from commit cloud extension to core

So this can be later used for pushes as well. The code is not commit cloud specific.

The function takes revs and returns uploaded, failed lists that are also revs.

Reviewed By: yancouto

Differential Revision: D29846299

fbshipit-source-id: e3a7fbc56f0b651c738dc06da7fdb7cde4feedf7
2021-07-22 11:52:57 -07:00
Mark Juggurnauth-Thomas
3dca04d257 tests: update test-backfill-derived-data.t
Summary: This test is overly reliant on exact logging output, and the output has changed.  Update the test for the new output, and make it a bit more lenient in the process.

Reviewed By: StanislavGlebik

Differential Revision: D29787827

fbshipit-source-id: 3e8aa77d2edcf3d0ca95c0d17d0b4e3845b78ae3
2021-07-22 09:34:14 -07:00
CodemodService Bot
0a402ce760 Daily common/rust/cargo_from_buck/bin/autocargo
Reviewed By: krallin

Differential Revision: D29841733

fbshipit-source-id: c9da8e0324f402f3b9726f2733b51de56abde8f6
2021-07-22 09:22:41 -07:00
Liubov Dmitrieva
33ab763498 Improve integration tests coverage for hg cloud upload and hg cloud sync
Summary:
Improve integration tests coverage for `hg cloud upload` and `hg cloud sync` with enabled upload.

This includes end2end tests for uploading mutation information, pulling commits from another repo,
generally how uploads behaves after a rebase, after file moves, after editing a commit message, how copy_from data has been preserved.

Reviewed By: markbt

Differential Revision: D29816436

fbshipit-source-id: 2aa421c8479683721984e13d537c34df8b1ca2d1
2021-07-22 08:31:19 -07:00
Liubov Dmitrieva
34fb6f721e update test certificates for another 10 years rather than the default 1 year
Summary: update test certificates for another 10 years rather than the default 1 year

Reviewed By: markbt

Differential Revision: D29846930

fbshipit-source-id: 98bc139c21e4d9e4cb5bab46485d849345bcc43d
2021-07-22 08:18:48 -07:00
Mark Juggurnauth-Thomas
1f81d30c93 source_control_service: add tree_exists and commit_path_exists
Summary:
Add methods to easily determine whether a tree exists, or whether anything
(either a file or a tree) exists at a particular path.

Reviewed By: StanislavGlebik

Differential Revision: D29815982

fbshipit-source-id: f3fb1919545bdcb46ed663a0a514338dc137abee
2021-07-22 00:09:27 -07:00
Stanislau Hlebik
46827b3756 mononoke: remove unused method
Summary:
This is not used. Even though this method has the "right intention" (i.e. we
need to start marking long running requests as new), I'm not sure we can use it
as is. So let's just delete it for now.

Reviewed By: farnz

Differential Revision: D29817068

fbshipit-source-id: 84d392fea01dfb5fb7bc56f0072baf2cf70b39f4
2021-07-21 12:02:59 -07:00
Jan Mazur
4d602331a2 print underlying errors in debug mode
Summary: Currently we only print what's in the `error` annotation.

Reviewed By: krallin

Differential Revision: D29794843

fbshipit-source-id: a2c411208d7be8fd856dd9b3f82fd96a4ed37aee
2021-07-21 07:19:32 -07:00
Liubov Dmitrieva
9c75e9b2b0 bugfix: fix uninitalized state variable
Summary:
Bugfix: fix an uninitialized state variable and add a test.

In rare cases the variable is used further down in the code.

Reviewed By: StanislavGlebik

Differential Revision: D29815203

fbshipit-source-id: e117df5575f025787d94f0a8ed4a171408e361d0
2021-07-21 05:38:51 -07:00
Stanislau Hlebik
eb0aebc24c mononoke: make sure we don't reload redaction config unnecessarily
Summary:
We seem to be reloading it every minute, even though we are supposed to reload
only when it's changed. That's probably not a huge deal, but we just get a
spammy stderr message. Let's remove it.

Reviewed By: yancouto

Differential Revision: D29789760

fbshipit-source-id: 65a39cca67636ae71befb963c78b6473b5b9f3fc
2021-07-21 01:32:43 -07:00
Stanislau Hlebik
5c7d31b3ae mononoke: fix integration tests
Summary:
mysql tests were failing because of invalid config with

```
+  E0719 14:48:27.582197 1846476 [main] eden/mononoke/cmdlib/src/helpers.rs:318] Execution error: unknown keys in config parsing: `{"blobstore.ephemeral_blobstore.?.metadata.?.filenodes", "blobstore.ephemeral_blobstore.?.metadata.?.mutation", "blobstore.ephemeral_blobstore.?.metadata.?.primary"}`
```

See example - https://www.internalfb.com/intern/testinfra/diagnostics/6473924511735259.562949979040542.1626706163/

This diff fixes it

Reviewed By: akushner

Differential Revision: D29812804

fbshipit-source-id: c71f7f38103194137523ca947e4b23819da37c35
2021-07-21 01:32:43 -07:00
Liubov Dmitrieva
26e149e737 rename cbor_stream => cbor_stream_filtered
Summary:
Rename to avoid confusion. The function filters errors from the underlying stream.

The first error and number of errors are logged to scuba but the errors are not passed to the client.

Reviewed By: kulshrax

Differential Revision: D29734930

fbshipit-source-id: 503adaa9e618d931a354011ef83c3ab22eb3b9bf
2021-07-20 03:50:09 -07:00
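The filtering behaviour can be sketched synchronously, with a `Vec` of `Result`s standing in for the CBOR stream (hypothetical signature; the real function logs the first error and the error count to scuba rather than stderr):

```rust
// Drop errors from the underlying "stream", counting them, so the client
// only observes successful items.
fn filtered<T, E: std::fmt::Debug>(items: Vec<Result<T, E>>) -> (Vec<T>, usize) {
    let mut errors = 0;
    let ok = items
        .into_iter()
        .filter_map(|r| match r {
            Ok(v) => Some(v),
            Err(e) => {
                errors += 1; // the real service records this server-side
                eprintln!("dropped error: {:?}", e);
                None
            }
        })
        .collect();
    (ok, errors)
}

fn main() {
    let input: Vec<Result<i32, &str>> = vec![Ok(1), Err("boom"), Ok(2)];
    let (ok, errors) = filtered(input);
    assert_eq!(ok, vec![1, 2]);
    assert_eq!(errors, 1);
    println!("{} errors filtered out", errors);
}
```

The rename makes this behaviour explicit at the call site: the client never sees the dropped errors.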
Yan Soares Couto
a47540a494 Add /ephemeral/prepare endpoint
Summary:
Using all the preparations added in the stack, this diff adds the `/:repo/ephemeral/prepare` endpoint to eden api.

It simply creates an ephemeral bubble and returns its id via the call.

Reviewed By: markbt

Differential Revision: D29698714

fbshipit-source-id: 5bc289cad97657db850b151849784e50a17a9da6
2021-07-19 09:53:04 -07:00
Yan Soares Couto
58448b16d5 Add ephemeral blobstore to inner repo
Summary: This allows ephemeral blobstore to be used in places that have a Repo context, like in the eden api, which will be used on the next diff to implement a new endpoint on eden api to create a bubble.

Reviewed By: markbt

Differential Revision: D29697657

fbshipit-source-id: b7e83c5c7c5e77243f0dba29c024d9f66ca4b2f9
2021-07-19 09:53:04 -07:00
Yan Soares Couto
e20022a088 Build ephemeral blobstore on repo factory
Summary:
Config for the ephemeral blobstore and some code for creating ephemeral blobstores were already added; this diff ties them together by making the ephemeral blobstore be built using the default config on RepoFactory, so it can easily be used as a Repo attribute in other places.

I was able to do this easily because I stopped using `BackingBlobstore` and started simply using `dyn Blobstore` in the ephemeral blobstore. Using BackingBlobstore would require some significant changes, because:
1. Building of blobstores is not ergonomic, it is quite hard and requires a bunch of manual code to be able to build some subtrait of Blobstore.
2. A lot of the blobstores "wrappers" do not implement things like BlobstoreKeySource, which would need to be implemented individually (example: D29678881 (817948ca75) would be just the start).

Reviewed By: markbt

Differential Revision: D29677545

fbshipit-source-id: 0f5cffe6bdfece1aaa74339ef40376d1ff27e6c2
2021-07-19 09:53:04 -07:00
Yan Soares Couto
eccda5507b Use Reloader on segmented changelog periodic reloader
Summary:
Use the class added on previous diff on segmented changelog periodic reloader as well.

To do this, I needed to add some changes to reloader:
- Add auto implementation of `Loader` trait for functions
- Add a tokio notify, as that was used on tests in segmented changelog

Reviewed By: markbt

Differential Revision: D29524220

fbshipit-source-id: 957f21db91f410fcdabb0d1c16d5c4f615892ab6
2021-07-19 05:17:50 -07:00
Yan Soares Couto
817948ca75 Implement BlobstoreKeySource for Prefixer and Counting blobstores
Summary: If we ever want to start using things like BlobstoreKeySource more extensively, we'll need to implement it for a lot of blobstores. This starts that, though it's not used for now.

Reviewed By: ahornby

Differential Revision: D29678881

fbshipit-source-id: 918a169b8b934c6f5e1eefaba7d11dc220eb7c59
2021-07-19 05:04:34 -07:00
Aida Getoeva
0619e705ef mononoke/multiplex: put sync-queue lookup for get under the tunable
Summary: This is needed so that sync-queue lookups and lookups in the second blobstore can be disabled later on.

Reviewed By: StanislavGlebik

Differential Revision: D29663435

fbshipit-source-id: abb5109de6063158a7ff0a116a5c1d336bfdb43f
2021-07-17 17:00:16 -07:00
Aida Getoeva
008c99e5fa mononoke/multiplex: remove fail_if_unsure where possible
Summary: This just helps to understand where we definitely have to fail in case of "ProbablyNotPresent" and work on those in the future.

Reviewed By: StanislavGlebik

Differential Revision: D29663436

fbshipit-source-id: c8428115f3c9637114e3964c948123d473207d53
2021-07-17 17:00:16 -07:00
Stanislau Hlebik
5ce118e1c2 mononoke: remove segmented changelog from BlobRepo
Summary:
Segmented changelog is initialized in every BlobRepo, and that's quite annoying
- a lot of spam goes to stderr in jobs like the hg sync job, which don't use
segmented changelog at all.

At the same time segmented changelog is only used in mononoke api, so we can
just initialize segmented changelog in InnerRepo, and remove from BlobRepo
completely.

Reviewed By: markbt

Differential Revision: D29735623

fbshipit-source-id: 9137c9266169b7ef16b1c6c0b80cae896214203b
2021-07-16 11:10:00 -07:00
Stanislau Hlebik
e5f8b1588e mononoke: read just written sync target config
Reviewed By: markbt

Differential Revision: D29728973

fbshipit-source-id: b1d9020222c9d4494b206dabeb7e92c9c45d35b7
2021-07-16 07:43:08 -07:00
Liubov Dmitrieva
99ff270a78 Clean Up: remove everything about the infinitepush write path
Summary:
Remove everything about the infinitepush write path.

The infinitepush path had been split in two for migration purposes. It is now time to clean up.

Reviewed By: StanislavGlebik

Differential Revision: D29711414

fbshipit-source-id: c61799fe124e2def4254cdd45e550c82c501e514
2021-07-15 07:37:55 -07:00
Harvey Hunt
b3a504d191 mononoke: lfs: Remove throttle limits
Summary:
Now that the new `rate_limiting` crate is being used by LFS server we
can remove the throttle limits code and config.

Differential Revision: D29396505

fbshipit-source-id: 19638bd93ad9dea2638e8501837c6c13e4dd48ff
2021-07-15 04:09:51 -07:00
Mark Juggurnauth-Thomas
0914112d96 update test debug strings
Summary:
Integration tests rely on specific debug output.  This changed for `String`
in Rust 1.53, so update accordingly.

Reviewed By: yancouto

Differential Revision: D29696713

fbshipit-source-id: 751d72660f1d8772d754ab404192281857b32b2f
2021-07-14 10:18:30 -07:00
Meyer Jacobs
f564998c4f edenapi: add new file aux data attribute
Summary: Adds a new file attribute, `FileAuxData` (based on Mononoke's `ContentMetadata`)

Reviewed By: DurhamG

Differential Revision: D29557288

fbshipit-source-id: 59251ebe8ddf2009d7bcf44a83eab68d49c817de
2021-07-13 15:17:30 -07:00
Meyer Jacobs
6c21aa14c9 edenapi: implement file content attribute
Summary:
Implement the `content` attribute.

Introduce a new `FileContent` type which stores the hg file blob and metadata, and modify `FileEntry` to allow constructing `FileEntry` with optional `FileContent` builder-style.

Reviewed By: DurhamG

Differential Revision: D29647203

fbshipit-source-id: b956c294d03dc81affc90d7274b2e430a3556e96
2021-07-13 15:17:30 -07:00
Meyer Jacobs
7c0b422b37 edenapi: introduce file attributes support
Summary:
Add support for optional file attributes to EdenApi, with `content` there as a placeholder.

Modifies the `FileRequest` type, adding a vec of `FileSpec`, which allows the client to specify desired attributes per-key. The existing `keys` field will be treated as a request for the content attribute and may be used in combination with the new per-key attributes.

Reviewed By: DurhamG

Differential Revision: D29634709

fbshipit-source-id: 6571837f87d1635e8529490e10dbe4ba054b7348
2021-07-13 15:17:30 -07:00
Durham Goode
1183f14f11 treemanifest: disable flatcompat by default
Summary:
This was a hack to allow the tests to produce the same hashes as
before. Let's disable this and fix the remaining test failures. A future diff
will remove the feature entirely.

Where possible I changed input hashes to desc() and output hashes to globs so
hopefully future hash changes are a little easier.

Differential Revision: D29567762

fbshipit-source-id: cf5150c112c56b08f583feba80e5a636cc07db0a
2021-07-13 15:04:57 -07:00
Liubov Dmitrieva
6b9587c207 remove deprecated commands
Summary: Clean Up. Remove deprecated commands.

Reviewed By: singhsrb

Differential Revision: D29677591

fbshipit-source-id: c4da701e9eedaa2f4dcd59b3e95c924aede74bf7
2021-07-13 13:48:06 -07:00
Durham Goode
5cb2db12f1 tests: remove accidental treemanifestserver.py
Summary:
This was accidentally committed in an earlier diff. It's unused, so
let's delete it.

Reviewed By: krallin

Differential Revision: D29668138

fbshipit-source-id: 105bf466665c447c37c73462e102d8771d0368ee
2021-07-13 13:32:16 -07:00
Liubov Dmitrieva
990a246aa8 Support exchange of mutation information during changesets uploads
Summary:
Support exchange of mutation information during changeset uploads.

Add a new API for mutation information.

Add an implementation for this new API.

Add client side support.

Reviewed By: markbt

Differential Revision: D29661255

fbshipit-source-id: 1d8cfa356599c215460aee49dd0c78b11af987b8
2021-07-13 01:56:06 -07:00
Mark Juggurnauth-Thomas
ba8784f812 scs_server: support blame_v2
Summary:
Allow SCS server to use blame V2 to serve blame requests, if it is enabled.

This uses `CompatBlame` so that it can use either blame V1 or blame V2.

Reviewed By: liubov-dmitrieva

Differential Revision: D29645410

fbshipit-source-id: 8d02e295995439c3b64e0128bdb5e6f5f6153159
2021-07-12 02:45:23 -07:00
Mark Juggurnauth-Thomas
7567684c3a admin: support blame_v2 in the blame subcommands
Summary:
Allow access to blame V2 in `mononoke_admin` by using `fetch_blame_compat`.

This uses `CompatBlame` to provide blame support using either blame V1 or blame V2.

Reviewed By: liubov-dmitrieva

Differential Revision: D29492859

fbshipit-source-id: 38c73690d36b57be73cec98ae2a013f16b3e0f7a
2021-07-12 02:45:23 -07:00
Mark Juggurnauth-Thomas
af0dbd16fd blame_v2: implement fetch_blame_compat
Summary:
Implement `fetch_blame_compat`, which will fetch either blame V1 or blame V2,
depending on the repo config, and return a compatibility adapter that can be
used by code to use both kinds.

Reviewed By: StanislavGlebik

Differential Revision: D29492857

fbshipit-source-id: 88d68ef2988e316642a5ebd9aa38b541c02c5da4
2021-07-12 02:45:23 -07:00
Mark Juggurnauth-Thomas
a11563a2a6 blame_v2: implement derivation of blame_v2
Summary:
Add `blame_version` to `BlameDeriveOptions`, and if this is set to `V2`, derive
a V2 blame root for the changeset.

Blame V2 and its roots are in a separate blobstore key space, so this derivation
is entirely independent of blame V1.

The key prefix for blame V2 roots is `derived_root_blame_v2`, even though this
is slightly different to the prefix for blame V1.  This is so that it matches
other derived data roots (e.g. unode V2).  Similarly, `BlameRoot` becomes
`RootBlameV2` so that it matches the other root types.

Blame V2 uses a separate mapping for blame roots, which contain the root
unode manifest id as additional data.

Differential Revision: D29492858

fbshipit-source-id: de2799040129e1ab90cc6bd8f775a6d47c607db7
2021-07-12 02:45:23 -07:00
Mark Juggurnauth-Thomas
d3400bc439 blame: extract mapping and derivation for v1 from derived module
Summary:
Split the `derived` module into `derive_v1`, which handles derivation of blame
V1, and `mapping_v1`, which handles the derived data mapping.

This is in preparation for introducing derivation of blame V2.

Reviewed By: StanislavGlebik

Differential Revision: D29463127

fbshipit-source-id: ae3add600ca62141e7f25713367680b667507da3
2021-07-12 02:45:23 -07:00
Mark Juggurnauth-Thomas
bbd0dcfa5d blame: extract function for fetching content
Summary:
Extract `fetch_content_for_blame` to a separate module so we can re-use it in
blame V2.

The method previously returned nested `Result`s, which can be confusing as in
most contexts, the blame being rejected is not actually an error.  Switch to an
explicit enum to make it clearer what the inner result represents.

Reviewed By: yancouto

Differential Revision: D29462095

fbshipit-source-id: 52ffcb4173a3b36f4b6cdafe4f42a4cafd993f49
2021-07-12 02:45:23 -07:00
Liubov Dmitrieva
43187d53ad edenapi: Implement uploading changesets in hg cloud upload command
Summary:
Implement using of uploading changesets in `hg cloud upload` command.

This is the last part for `hg cloud upload` - uploading changesets via Edenapi
test
```
# machine #2
liubovd {emoji:1f352}  ~/fbsource
 [15] → hg pull -r 0b6075b4bda143d5212c1525323fb285d96a1afb
pulling from mononoke://mononoke.c2p.facebook.net/fbsource
connected to twshared27150.03.cln3.facebook.com session RaIPDgvF6l8rmXkA
abort: 0b6075b4bda143d5212c1525323fb285d96a1afb not found!
```

```
# machine #1
devvm1006.cln0 {emoji:1f440}   ~/fbsource/fbcode/eden/scm
 [6] →  EDENSCM_LOG="edenapi::client=info" ./hg cloud upload
Jul 11 13:26:26.322  INFO edenapi::client: Requesting lookup for 1 item(s)
commitcloud: head '0b6075b4bda1' hasn't been uploaded yet
Jul 11 13:26:26.472  INFO edenapi::client: Requesting lookup for 6 item(s)
commitcloud: queue 1 commit for upload
Jul 11 13:26:26.648  INFO edenapi::client: Requesting lookup for 1 item(s)
commitcloud: queue 0 files for upload
Jul 11 13:26:26.698  INFO edenapi::client: Requesting lookup for 4 item(s)
commitcloud: queue 4 trees for upload
Jul 11 13:26:27.393  INFO edenapi::client: Requesting trees upload for 4 item(s)
commitcloud: uploaded 4 trees
commitcloud: uploading commit '0b6075b4bda143d5212c1525323fb285d96a1afb'...
Jul 11 13:26:28.426  INFO edenapi::client: Requesting changesets upload for 1 item(s)
commitcloud: uploaded 1 commit
```

```
# machine #2
liubovd {emoji:1f352}  ~/fbsource
 [16] → hg pull -r 0b6075b4bda143d5212c1525323fb285d96a1afb
pulling from mononoke://mononoke.c2p.facebook.net/fbsource
connected to twshared16001.08.cln2.facebook.com session QCpy1x9yrflRF6xF
searching for changes
adding commits
adding manifests
adding file changes
added 895 commits with 0 changes to 0 files
(running background incremental repack)
prefetching trees for 4 commits

liubovd {emoji:1f352}  ~/fbsource
 [17] → hg up 0b6075b4bda143d5212c1525323fb285d96a1afb
warning: watchman has recently started (pid 93231) - operation will be slower than usual
connected to twshared32054.08.cln2.facebook.com session Hw91G8kRYzt4c5BV
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

liubovd {emoji:1f352}  ~/fbsource
 [18] → hg diff -c .
connected to twshared0965.07.cln2.facebook.com session rrYSvRM6pnBYZ2Fn
 diff --git a/fbcode/eden/scm/test b/fbcode/eden/scm/test
new file mode 100644
 --- /dev/null
+++ b/fbcode/eden/scm/test
@@ -0,0 +1,1 @@
+test
```

Initial perf wins:

Having a large stack of 6 commits (total 24 files changed), tested *adding a single line to a file at the top commit*. We can see at least 2X win but it should be more because I have tested with a local instance of edenapi service that runs on my devserver.

```
╷
╷ @  5582fc8ee  6 minutes ago  liubovd
╷ │  test
╷ │
╷ o  d55f9bb65  86 minutes ago  liubovd  D29644738
╷ │  [hg] edenapi: Implement using of uploading changesets in `hg cloud upload` command
╷ │
╷ o  561149783  Friday at 15:10  liubovd  D29644797
╷ │  [hg] edenapi: Add request handler for uploading hg changesets
╷ │
╷ o  c3dda964a  Friday at 15:10  liubovd  D29644800
╷ │  [edenapi_service] Add new /:repo/upload/changesets endpoint
╷ │
╷ o  28ce2fa0c  Friday at 15:10  liubovd  D29644799
╷ │  [hg] edenapi/edenapi_service: Add new API for uploading Hg Changesets
╷ │
╷ o  13325b361  Yesterday at 15:23  liubovd  D29644798
╭─╯  [edenapi_service] Implement uploading of hg changesets
```

```
# adding new line to a file test in the test commit, and then run:
devvm1006.cln0 {emoji:1f440}   ~/fbsource/fbcode/eden/scm
 [8] → time hg cloud upload
commitcloud: head '4e4f947d73e6' hasn't been uploaded yet
commitcloud: queue 1 commit for upload
commitcloud: queue 0 files for upload
commitcloud: queue 4 trees for upload
commitcloud: uploaded 4 trees
commitcloud: uploading commit '4e4f947d73e676b63df7c90c4e707d38e6d0a93b'...
commitcloud: uploaded 1 commit

real	0m3.778s
user	0m0.017s
sys	0m0.027s
```

```
# adding another new line to a file test in the test commit, and then run:
devvm1006.cln0 {emoji:1f440}   ~/fbsource/fbcode/eden/scm
 [11] → time hg cloud backup
connected to twshared30574.02.cln2.facebook.com session uvOvhxtBfeM7pMgl
backing up stack rooted at 13325b3612d2
commitcloud: backed up 1 commit

real	0m7.507s
user	0m0.013s
sys	0m0.030s
```

Test force mode of the new command that reupload everything:
```
devvm1006.cln0 {emoji:1f440}   ~/fbsource/fbcode/eden/scm
 [13] → time hg cloud upload --force
commitcloud: head '5582fc8ee382' hasn't been uploaded yet
commitcloud: queue 6 commits for upload
commitcloud: queue 24 files for upload
commitcloud: uploaded 24 files
commitcloud: queue 61 trees for upload
commitcloud: uploaded 61 trees
commitcloud: uploading commit '13325b3612d20c176923d1aab8a28383cea2ba9a'...
commitcloud: uploading commit '28ce2fa0c6a02de57cdc732db742fd5c8f2611ad'...
commitcloud: uploading commit 'c3dda964a71b65f01fc4ccadc9429ee887ea982c'...
commitcloud: uploading commit '561149783e2fb5916378fe27757dcc2077049f8c'...
commitcloud: uploading commit 'd55f9bb65a0829b1731baa686cb8a6e0c5500cc2'...
commitcloud: uploading commit '5582fc8ee382c4c367a057db2a1781377bf55ba4'...
commitcloud: uploaded 6 commits

real	0m7.830s
user	0m0.011s
sys	0m0.032s
```

We can see the time is similar to the current `hg cloud backup` command.

Reviewed By: markbt

Differential Revision: D29644738

fbshipit-source-id: cbbfcb2e8018f83f323f447848b3b6045baf47c5
2021-07-12 02:13:31 -07:00
Liubov Dmitrieva
f8784a6020 Add new /:repo/upload/changesets endpoint
Summary: add new /:repo/upload/changesets endpoint

Reviewed By: markbt

Differential Revision: D29644800

fbshipit-source-id: 7d17c9c7a52e0e528a1d2cb7ea323a9abf13cf93
2021-07-12 02:13:31 -07:00
Liubov Dmitrieva
804fc98c3f Implement uploading of hg changesets
Summary:
implement uploading of hg changesets

For now, reuse the upload code path from unbundle, calling it with empty filenodes and manifests.

Those are used for parents validation but this is not needed for us because we load trees and filenodes and their parents to construct the bonsai cs.

We might want to rewrite it as cleaner code later and separate it from unbundle, but for now reusing the function is the easiest way because we know the implementation is correct and already has logging.

Reviewed By: markbt

Differential Revision: D29644798

fbshipit-source-id: 27217d3061ab8d9712417facdbfbbc7e3aebfc5b
2021-07-12 02:13:31 -07:00
Jun Wu
5af32aa8b9 make are_heads_assigned work with duplicated heads
Summary: There is a code path where `heads` contains duplicated items. Be compatible with it.

Reviewed By: andll

Differential Revision: D29645743

fbshipit-source-id: ff73bc51e877c2d02fcfff28cd1001e70478f212
2021-07-09 18:24:25 -07:00
Stanislau Hlebik
d74cc69de4 mononoke: avoid creating deletion commit on megarepo mainline
Summary:
In previous diff we started creating deletion commits on megarepo mainline.
This is not great since it breaks bisects, and this diff avoids that.

The way it does it is the following:
1) First do the same thing we did before - create deletion commit, and then
create a merge commit with p1 as deletion commit and p2 as an addition commit.
Let's call it "fake merge", since this commit won't be used for our mainline
2) Generate manifest for our "fake merge", and then use this manifest to
generate a bonsai diff. But this time make p1 the old target commit (i.e. remove
the deletion commit as if it never existed).
3) Use generated bonsai diff to create a commit.

So in short we split the procedure in two - first generate and validate the
resulting manifest (this is what we use "fake merge" commit for), and then
generate the bonsai changeset using this manifest. It's unfortunate that in
order to generate the resulting manifest we actually need to create a commit and
save it to the blobstore. If we had in-memory manifests we could have avoided
that, but alas we don't have them yet.

This way of creating bonsai changesets is a bit unconventional, but I think it has the benefit of relying on tools that we are confident work (i.e. bonsai_diff), so we don't need to reimplement all the bonsai logic again.

Reviewed By: mitrandir77

Differential Revision: D29633340

fbshipit-source-id: eebdb0e4db5abbab9346c575b662b7bb467497c4
2021-07-09 05:23:43 -07:00
Stanislau Hlebik
58ffbc5cec mononoke: redo the way we create merge bonsai changesets in change_target_config
Summary:
Initially I just wanted to address comments from D29515737 (fa8796ae19) about unnecessary
manifest retraversals, but there were a few more problems:
1) We didn't detect file conflicts in the final merge commit correctly. For
example, if the additions_merge commit added a file "dir/1.txt" but there was
already a file "dir" in the target changeset, we wouldn't detect this conflict.
2) What's worse is that we might produce invalid bonsai merge changeset
ourselves. Say, if we delete "source_1/dir/file.txt", and then add file
"source_1/dir" in the additions merge commit, then the resulting bonsai
changeset should have a "source_1/dir" entry.

This diff does the following:
1) Adds more tests to cover different corner cases - some of them were failing
before this diff.
2) Improves logic to verify file conflicts
3) Instead of trying to generate correct merge bonsai changeset it simplifies
the task and creates a separate deletion commit.

Note that creating a deletion commit on the mainline is something we want to
avoid to not break bisects. This will be addressed in the next diff.

Reviewed By: mitrandir77

Differential Revision: D29633341

fbshipit-source-id: 8f755d852212fbce8f9331049bf836c1d0a4ef42
2021-07-09 05:23:43 -07:00
Stanislau Hlebik
a24c68cf27 mononoke: log bookmark name when bookmark is moved/created/deleted/pushrebased
Reviewed By: mitrandir77

Differential Revision: D29634346

fbshipit-source-id: 25e98921410e8a481f3468264d0a1f084b89b1ba
2021-07-09 05:23:43 -07:00
Mateusz Kwapich
3a41e7fbc3 megarepo_add_branching_sync_target method
Summary: This new method will allow the megarepo customers to create a sync target that's branching off the existing target. This feature is meant to be used for release branches.

Reviewed By: StanislavGlebik

Differential Revision: D29275281

fbshipit-source-id: 7b58d5cc49c99bbc5f7e01814178376aa3abfcdf
2021-07-09 05:23:43 -07:00
Liubov Dmitrieva
ebdae10209 edenapi: end to end integration test for hg cloud upload
Summary:
First integration test for the `hg cloud upload` command.

We will be able to cover more cases once last part (uploading of changesets) will be implemented.

Reviewed By: markbt

Differential Revision: D29612725

fbshipit-source-id: cb8fedfc4e8c2408bccaa4195dc1e5c0758d742a
2021-07-09 03:23:45 -07:00
Liubov Dmitrieva
9aaf619762 Implement upload trees API
Summary: Implement upload trees endpoint

Reviewed By: markbt

Differential Revision: D29556346

fbshipit-source-id: 415285f2fba0b3f18a75f616649e31f78afca2b9
2021-07-09 03:23:45 -07:00
Liubov Dmitrieva
d327996144 edenapi: upload filenodes (client side)
Summary:
upload filenodes (client side)

On the client side I implemented file upload and filenodes upload in the same API: repo.edenapi.uploadfiles

This is because we should use the tokens from the file upload part to feed them into the filenodes upload request.

Reviewed By: markbt

Differential Revision: D29549091

fbshipit-source-id: 436de187c8dce9a603c0c0a182e88b582a2d8001
2021-07-07 11:31:05 -07:00
Alex Hornby
4db26bffd3 mononoke: update bundle to use byteorder::BigEndian
Summary: update bundle to use byteorder::BigEndian in preparation for the Bytes upgrade. New versions of Bytes no longer re-export it.

Differential Revision: D29561928

fbshipit-source-id: ce44d9c27f9786a4bcec8f7166763c95828847e8
2021-07-07 07:52:59 -07:00
Yan Soares Couto
b60cfff714 Use Reloader on redacted config
Summary: Use the class added on previous diff on redacted config as well

Reviewed By: mitrandir77

Differential Revision: D29521423

fbshipit-source-id: 70f5a1cbce80a0068a0f438b7d217bfffb6a1592
2021-07-07 06:21:38 -07:00
Yan Soares Couto
f6a6b6a337 Extract periodic reloader to common class and use it in skiplist
Summary:
I've seen periodic reloading of stuff in at least three places in mononoke (two of which I added: skiplists and redaction config; there's also one in segmented changelog, and there might be more).

This stack extracts that logic to a common place, so we don't need to reinvent that logic all the time, and it's easier to do it the next time.
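A minimal sketch of the shared reload pattern (names and details here are illustrative, not the actual `Reloader` API; the real one also spawns a background task that triggers reloads on an interval):

```rust
use std::sync::{Arc, RwLock};

/// Illustrative sketch of a shared periodic-reload helper: readers get a
/// cheap `Arc` clone of the current value, and a reload atomically swaps
/// in a freshly loaded one.
pub struct Reloader<T> {
    current: RwLock<Arc<T>>,
    load: Box<dyn Fn() -> T + Send + Sync>,
}

impl<T> Reloader<T> {
    pub fn new(load: Box<dyn Fn() -> T + Send + Sync>) -> Self {
        // Load once eagerly so `get` always has a value to hand out.
        let initial = Arc::new(load());
        Reloader {
            current: RwLock::new(initial),
            load,
        }
    }

    /// Readers clone the current Arc; they are only blocked for the
    /// duration of the pointer swap, never for a full reload.
    pub fn get(&self) -> Arc<T> {
        self.current.read().unwrap().clone()
    }

    /// Re-run the loader and atomically publish the fresh value.
    pub fn reload(&self) {
        let fresh = Arc::new((self.load)());
        *self.current.write().unwrap() = fresh;
    }
}
```

In the real code this shape lets skiplists, redaction config and segmented changelog share one reload loop instead of each reimplementing it.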

Reviewed By: mitrandir77

Differential Revision: D29520651

fbshipit-source-id: 59820c03f168cb25e2c6345e36746121451f34e2
2021-07-07 06:21:38 -07:00
Stanislau Hlebik
42c8cc1247 mononoke: remove globalrev sql syncer
Summary: We don't need it anymore, and we recently had a SEV that was caused by the globalrev sql syncer. Let's remove it.

Reviewed By: mitrandir77

Differential Revision: D29557246

fbshipit-source-id: c7d0232203b098dff3d750d34093877240d961c4
2021-07-07 04:25:49 -07:00
Mateusz Kwapich
051894b81d add fb303 flags to async request worker
Summary: needed to set up tw health check

Reviewed By: StanislavGlebik

Differential Revision: D29580808

fbshipit-source-id: 6a3833d652979915fd44dc6d89511192397d8b96
2021-07-07 03:47:07 -07:00
Mateusz Kwapich
fa8796ae19 change_target_config implementation
Summary: The implementation of the change_target_config method.

Reviewed By: StanislavGlebik

Differential Revision: D29515737

fbshipit-source-id: 748278e73b1ed727550f3f05451b508a70be07db
2021-07-06 08:32:48 -07:00
Mateusz Kwapich
28d69d60c8 use the SourceName newtype where possible
Summary:
I got frustrated with the fact that half of the functions in
megarepo_api required the source name to be wrapped in the newtype and the
other half didn't. This refactor unifies it everywhere except the thrift
datastructure itself - not sure if we can affect thrift codegen in this way.

Reviewed By: StanislavGlebik

Differential Revision: D29515474

fbshipit-source-id: 2d55a03cf396b174b0228c3fcc627b2296600400
2021-07-06 08:32:48 -07:00
Mateusz Kwapich
ae57ff3ccc make writing state optional in create_merge_commits
Summary:
The merge commit in the case of change_target_sync_config won't represent any
consistent state of the target, so we don't want to write the remapping state
file there.

Reviewed By: StanislavGlebik

Differential Revision: D29515476

fbshipit-source-id: b0703be1127af6582785510fde51ff8501fb4f17
2021-07-06 08:32:48 -07:00
Mateusz Kwapich
15f3eadc49 make create_move_commits take just sources
Summary:
In the case of change_target_sync_config we'll be creating move commits only for
a subset of sources, so let's change the function signature to make it possible
to specify such a subset.

Reviewed By: StanislavGlebik

Differential Revision: D29515475

fbshipit-source-id: 31002ec56dad872948bcbc79b0ed5fdb794e1f10
2021-07-06 08:32:48 -07:00
Mateusz Kwapich
85f31f3f85 move reusable functions to common
Summary:
The `change_target_config` method's responsibilities have a huge intersection
with `add_target_config`'s: the change method needs to know how to merge new
sources into the target, and the whole "create move commits, then create merge
commits" flow can be reused.

Reviewed By: StanislavGlebik

Differential Revision: D29515301

fbshipit-source-id: c15f95875cbcbf5aad00e5047f6a8ffb55c4da31
2021-07-06 08:32:48 -07:00
Aida Getoeva
498416a53c mononoke/blobstore: single lookup for is_present multiplex
Summary:
Currently `is_present` makes a blobstore lookup and, if it couldn't determine whether the key exists, checks the sync-queue (in case the key was written recently), may then check the multiplexed stores again, and fails if still unsure. This brings unnecessary complication and makes the multiplex blobstore less reliable.
More details in: https://fb.quip.com/wOCeAhGx6Oa1

This diff allows us to get rid of the queue and second store lookups and move the decision-making to the callers. The new logic is behind a tunable for a safer rollout.

*This diff is safe to land.*

Reviewed By: StanislavGlebik

Differential Revision: D29428268

fbshipit-source-id: 9fc286ed4290defe16d58b2b9983e3baaf1a3fe4
2021-07-05 11:13:18 -07:00
Harvey Hunt
fcaa5c72d6 mononoke: Implement loadshedding checks
Summary:
Now that Mononoke uses the `rate_limiting` library we can shed load if
a server is overloaded. Add load shedding checks to the entry points for
wireproto and EdenAPI HTTP traffic.

At the time of writing, there aren't any load shedding limits configured, so
this change won't have any effect.

Differential Revision: D29396504

fbshipit-source-id: c90cc40fc2609bdae1a267be3a1aecfe7fd33b7b
2021-07-05 10:18:52 -07:00
Harvey Hunt
14ba455e60 mononoke: Use new rate limiting crate
Summary:
Update Mononoke server to use the new `rate_limiting` crate. This diff
also removes the old rate limiting library.

Differential Revision: D29396507

fbshipit-source-id: 05adb9322705b771a739c8bcaf2816c95218a42d
2021-07-05 10:18:51 -07:00
Harvey Hunt
a92eae78ae mononoke: lfs: Use new load shedding config
Summary:
Replace the LFS server's load shedding logic with that provided by the
`rate_limiting` crate.

Differential Revision: D29396503

fbshipit-source-id: a71812a55b9c9f111ee2861dc1b131ad20ca82d2
2021-07-05 10:18:51 -07:00
Harvey Hunt
7b40d3af0d mononoke: Add new rate limiting library
Summary:
Add a new rate limiting library that also supports load shedding when
an individual server is overloaded. This library provides a few benefits:

- The code can be shared between the LFS server and Mononoke server.
- The library supports more complex expressions of which clients to apply a
  rate limit to (e.g. 10% of sandcastle and mactest machines).
- The rate limiting `Target` can be expanded in the future as the client
  provides more information (e.g. client region).
- Mononoke server will be able to loadshed if an individual host is overloaded,
  as we can currently do with the LFS server.

I've added this library as a separate crate rather than rewriting
`load_limiter` to make it easier to review. The next diff will make use of the
new library and remove the old one.

Reviewed By: StanislavGlebik

Differential Revision: D29396509

fbshipit-source-id: 2fbc04e266b18392062e6f952075efd5e24e89ba
2021-07-05 10:18:51 -07:00
Aida Getoeva
8cf1889499 mononoke/blobstore: new is_present semantics via enum
Summary:
This diff introduces new `is_present` semantics that allow moving the decision logic for the complex multiplex `is_present` result from the multiplex to the callers.

In the current API, the `is_present` call returns `Result<bool>`; if it couldn't determine whether the key exists, it checks the sync-queue (in case the key was written recently), may then check the multiplexed stores again, and fails if still unsure. This brings unnecessary complication and makes the multiplex blobstore less reliable.
More details in: https://fb.quip.com/wOCeAhGx6Oa1

This change allows us to get rid of the queue and second store lookups and move the decision-making to the callers.

*This diff shouldn't change behaviour; it just replaces the bool value with an enum and adds conversions where needed.*
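A minimal Rust sketch of the tri-state result this diff describes (variant and method names are illustrative, not the exact Mononoke API; the real "unknown" variant carries an error rather than a string):

```rust
/// Illustrative sketch of the tri-state `is_present` result: the
/// multiplex reports what it actually knows, and the caller decides
/// what to do about the unknown case.
#[derive(Debug, PartialEq)]
pub enum BlobstoreIsPresent {
    /// The key is definitely present in at least one inner blobstore.
    Present,
    /// Every inner blobstore answered, and none of them has the key.
    Absent,
    /// Some inner stores failed to answer, so presence is unknown; the
    /// caller decides whether to retry, consult a queue, or fail.
    ProbablyNotPresent(String),
}

impl BlobstoreIsPresent {
    /// How a caller that previously consumed `Result<bool>` might map
    /// the enum back, treating "unknown" as an error.
    pub fn assume_absent_is_error(self) -> Result<bool, String> {
        match self {
            BlobstoreIsPresent::Present => Ok(true),
            BlobstoreIsPresent::Absent => Ok(false),
            BlobstoreIsPresent::ProbablyNotPresent(reason) => Err(reason),
        }
    }
}
```

This keeps the multiplex itself simple: it never second-guesses a partial answer, it just reports it.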

Reviewed By: StanislavGlebik

Differential Revision: D29377462

fbshipit-source-id: 4b70f772d2ed70d9fffda015ba06c3f16bf1475d
2021-07-05 09:17:29 -07:00
Aida Getoeva
fb5c2ad210 mononoke/blobstore: add multiplex is_present test
Summary: This diff adds a new unit-test for multiplex `is_present`.

Reviewed By: StanislavGlebik

Differential Revision: D29488020

fbshipit-source-id: c7b167c19b4c371be9d3be03be3351590af77040
2021-07-05 09:17:29 -07:00
Harvey Hunt
5487586135 mononoke: Don't use rate limiting prefix from config
Summary:
The rate limits for commits support the ability to apply a specific
prefix to the load limiting category. However, we haven't used this
functionality. Remove it to make subsequent work on the rate limiting logic
easier to implement.

Reviewed By: StanislavGlebik

Differential Revision: D29396506

fbshipit-source-id: ac518ccd74f6fac49ab85f87f1500787b5db955e
2021-07-05 07:09:53 -07:00
Harvey Hunt
cd7cdb334e mononoke: lfs: Remove jitter and sleep
Summary:
These config options are now unused as the client performs its own
jitter and exponential backoff. These values have been 0 in config for a while
now, so removing the code for them should have no behavioural impact.

Reviewed By: StanislavGlebik

Differential Revision: D29396508

fbshipit-source-id: 0a9d7efdd37516bee85ff9a34bfe9aa286ce4c0e
2021-07-05 07:09:53 -07:00
Harvey Hunt
fcaefcaeb3 rust: ratelim: Update to new futures
Summary: Update the ratelim crate to use the latest version of Rust's futures.

Reviewed By: StanislavGlebik

Differential Revision: D29061726

fbshipit-source-id: e7514d5802ad13cfe7e96f67fe17d208209967eb
2021-07-05 07:09:53 -07:00
Durham Goode
954fa919ac treemanifest: make ondemandfetch the default
Summary:
All our clients fetch with ondemandfetch set to true. Let's enable it
by default and remove the other fetch path.

Reviewed By: quark-zju

Differential Revision: D29148507

fbshipit-source-id: ea348aedba495d9d3a8652c4289178c08dae2f08
2021-07-01 09:31:15 -07:00
Durham Goode
1ecd0244f3 treemanifest: support designatednodes in python server
Summary:
All our client fetches now use the designatednodes api call. Let's
support it on the treemanifest python server so we can simplify the client code
around this one way of fetching.

Reviewed By: quark-zju

Differential Revision: D29148505

fbshipit-source-id: 22a92cdcfb105d8861b590de683e1bc12618abae
2021-07-01 09:31:15 -07:00
Durham Goode
14b3b10632 treemanifest: remove server conditions
Summary:
Now that the tests are using the copy of treemanifest we can remove the
server logic. To start with, let's remove all the conditional paths for the
server.

A later diff will remove the server specific storage bits (like revlog usage).

Reviewed By: quark-zju

Differential Revision: D29120431

fbshipit-source-id: aceb7ee265ce7333e26065202f114fed93895619
2021-07-01 09:31:15 -07:00
Durham Goode
8fb6b52ac1 treemanifest: move test server logic to separate extension
Summary:
The treemanifest store and pull/push logic is overly complicated.
Untangling it is a bit tricky since it needs to support both server and client
use cases. Since we no longer care about the server code except for tests, let's
copy the treemanifest extension and use it for the server repo in tests.

A future diff will take advantage of this to delete all the server logic from
the main treemanifest extension.

Reviewed By: quark-zju

Differential Revision: D29115069

fbshipit-source-id: 8b7080aa6c7de77be058b34baad5e976cd7c1acf
2021-07-01 09:31:15 -07:00
Yan Soares Couto
29ef2722fb Initialise redaction config using configerator
Summary:
This diff ties all the previous diffs together by making the redaction config be read from configerator by default when created from RepoFactory.

The logic for fetching from XDB still exists, and can be used if a tunable is enabled, though this needs a service restart. I plan to leave it there as a possible fallback until we successfully add some real-world thing to the new redaction config and it works.

Reviewed By: StanislavGlebik

Differential Revision: D29033596

fbshipit-source-id: 4c5e97f542457dc25cf234d31182c3106173585d
2021-06-30 12:19:13 -07:00
Mark Juggurnauth-Thomas
1eff08edc2 unodes: fix find_unode_renames for files copied multiple times
Summary:
`find_unode_renames` uses a hashmap for renames keyed on the source path. This
means that when files are copied multiple times, only the last copy is kept.

Implement a new version, which takes multiple renames into account.  Prepare
for blame V2 by including the parent index and source path in the generated
rename source.
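The difference between the two shapes can be sketched like this (paths and types are simplified stand-ins for the real unode rename structures):

```rust
use std::collections::HashMap;

/// Sketch of the V1 bug: keying renames by source path in a plain map
/// means a later copy from the same source overwrites the earlier one.
fn renames_v1(copies: &[(&str, &str)]) -> HashMap<String, String> {
    copies
        .iter()
        .map(|(src, dst)| (src.to_string(), dst.to_string()))
        .collect()
}

/// The fixed shape: one source path maps to every destination it was
/// copied to, so no copy is lost.
fn renames_v2(copies: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    let mut out: HashMap<String, Vec<String>> = HashMap::new();
    for (src, dst) in copies {
        out.entry(src.to_string()).or_default().push(dst.to_string());
    }
    out
}
```

With a file copied to two places, V1 keeps only one of them while V2 keeps both, which is the behaviour blame V2 needs.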

The existing implementation is retained so that we don't change how blame V1 is
computed.

Reviewed By: StanislavGlebik

Differential Revision: D29454872

fbshipit-source-id: e6de5cad3f00ac8413d25c385d42619c8cb02a31
2021-06-30 12:06:46 -07:00
Mark Juggurnauth-Thomas
567573e0c1 blame_v2: fix parent index determination in mononoke_admin
Summary:
The parent index for blame ranges is the index of the first *changeset* parent
that contains the file, not the index of the unode parent, which is what is
currently used for non-copy parents.  In fact, unode parents are guaranteed to
exist, so currently this will always be 0.

Change the calculation to work out which changeset parent contains the file.

Differential Revision: D29453805

fbshipit-source-id: 98950090a7d87e4ece22d8047dc3794ac52ec4a2
2021-06-30 09:32:45 -07:00
Jan Mazur
d92440d984 hard fail when lfs credentials not found
Summary:
If the default is to pass certificates, we should hard fail every single time they are not found instead of silently making an unauthenticated request. This will surface issues much more quickly.

best_match_for can return `Ok(None)`.

Reviewed By: johansglock

Differential Revision: D29159532

fbshipit-source-id: ff28a627d91a9cf37258a97dc2c7f709ba8d00c2
2021-06-30 09:09:20 -07:00
Yan Soares Couto
8e3c29e8d1 Create class to load redaction config from configerator
Summary:
Adds class `ConfigeratorRedactedBlobs` that reads redaction data from configerator, and reloads it when necessary.

The class does this:
- Reads `RedactionSets` from configerator.
- For each key there, read `RedactionConfigBlobstore` looking for a `RedactionKeyList` with that key (these were populated by D29033598).
- From the keys listed, builds the map of redacted blobs, with the same format as before when it was fetched from XDB.
- Periodically checks if the config changed and, if so, reloads the map of redactions. (This should only happen when we land a new redaction config change, which should be very rare.)
- We use ArcSwap to keep the config, as a good way to provide read-only access with eventual reloading.
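The map-building step above can be sketched roughly like this (types and the loader callback are illustrative stand-ins, not the real configerator/blobstore APIs):

```rust
use std::collections::HashMap;

/// Sketch of expanding the configerator redaction config into the same
/// per-blob-key map shape the old XDB-backed path produced.
/// `config` pairs a redaction task with the blobstore key of its
/// RedactionKeyList; `load_key_list` stands in for the blobstore fetch.
fn build_redaction_map(
    config: &[(String, String)],
    load_key_list: impl Fn(&str) -> Vec<String>,
) -> HashMap<String, String> {
    let mut redacted = HashMap::new();
    for (task, list_key) in config {
        // Each RedactionKeyList expands to many redacted blob keys, all
        // attributed to the same redaction task.
        for blob_key in load_key_list(list_key) {
            redacted.insert(blob_key, task.clone());
        }
    }
    redacted
}
```

Because the result has the same shape as before, the rest of the redaction machinery doesn't need to care whether the map came from XDB or configerator.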

Not implemented on this diff:
- Creation of `ConfigeratorRedactedBlobs`, or adding it to `RedactedBlobs` enum.

It's not used in this diff, will be used in the future, I split it mostly to make it easier to review.

Reviewed By: StanislavGlebik

Differential Revision: D29033595

fbshipit-source-id: 36603685433b6dd153f2c23123907f7311c20a32
2021-06-30 08:57:30 -07:00
Yan Soares Couto
86292afd15 Add command to add RedactionKeyList to RedactionConfigBlobstore on mononoke_admin
Summary:
This diff adds two subsubcommands to mononoke admin:
```
- admin redaction create-key-list
- admin redaction create-key-list-from-ids
```

The first works similarly to `admin redaction add`, but instead of inserting into XDB, it creates a RedactionKeyList object, inserts it into the blobstore and prints its contents.

The second just takes a list of keys as raw input. Usually, the first one should be used; the second will **only** be used to migrate the current setup into configerator.

Reviewed By: markbt

Differential Revision: D29033598

fbshipit-source-id: 4d181d2b5c7701c7e88114a26d9219cada86c618
2021-06-30 08:57:30 -07:00
Yan Soares Couto
7e2a209bdb Create RedactionConfigBlobstore
Summary:
This diff adds a class RedactionConfigBlobstore, which can be built using `RepoFactory`.

It is a prefix blobstore (with prefix "redactionconfig") that's created independently of all other repo blobstores, and it's where we'll store the blobs related to redaction key lists.

I created this because AFAICT using RepoFactory we can only build RepoBlobstore, which is partitioned by repo; we don't want that for redaction, as it is generic across all repos.

It's not used in this diff, will be used in the future, I split it mostly to make it easier to review.

Reviewed By: markbt

Differential Revision: D29033599

fbshipit-source-id: 5c44a73e2097c0de3abad038f45166f42a14a70b
2021-06-30 08:57:30 -07:00
Yan Soares Couto
b4f3a36802 Sync changes from D29360425 to fbcode
Summary: Syncing configerator changes from D29360425, and fixing all tests. Not used yet.

Reviewed By: markbt

Differential Revision: D29363416

fbshipit-source-id: d2de13d32bcec2e7fbff20204be8d9a8d65c0efe
2021-06-30 08:57:30 -07:00
Yan Soares Couto
1bcae1ae65 Add redaction config to common config, don't use it yet
Summary:
This reads the config added in D29305462. It populates the `CommonConfig` struct and also adds it to `RepoFactory`, but doesn't use it anywhere yet; that will be done in the next diff.

There is a single behaviour change in this diff, which I believe should be harmless but is noted in the comments in case it isn't.

Reviewed By: markbt

Differential Revision: D29272581

fbshipit-source-id: 62cd7dc78478c1d8cb212eafdd789527ead50ef6
2021-06-30 08:57:30 -07:00
Stanislau Hlebik
9e98685fff mononoke: add --logview-additional-level-filter
Summary:
While debugging T94402830 I noticed that the logview category was using quite a
lot of scribe quota, even though we don't really use it much. The reason it was
using so much scribe quota is that we were logging all stderr messages to
logview, even "DEBUG" ones. That might also be why we didn't use logview - it
was too spammy.

Let's make it possible to add an additional log level filter for logview, so
that we can log only, e.g., warn messages and above.

Reviewed By: Croohand

Differential Revision: D29456888

fbshipit-source-id: 8cc66773ca8d82b00c3337937f519f6140fc8c9d
2021-06-30 06:44:51 -07:00
Stanislau Hlebik
8a7211c8e8 mononoke: fix GetGlobalrev sql query
Summary:
The type of the "value" field is char, so GetGlobalrevCounter was failing with

```
 MySQL Value Error: Expected row field to be RowField::Long, but got
 RowField::Bytes
```

The reason we haven't seen it before is that this query is called only when
IncreaseGlobalrevCounter didn't change a single row, and that's usually not the
case.

Reviewed By: HarveyHunt

Differential Revision: D29482685

fbshipit-source-id: 32073edcf5d57d7dad275a65c2e0f67b7321cef2
2021-06-30 03:57:02 -07:00
Stanislau Hlebik
5ae7ef3a4b mononoke: make mononoke_api stderr output less spammy
Summary:
The scs server output is full of messages that tell us that everything is ok. They
are noisy and not particularly useful - let's log only if we have a non-zero lag.

Reviewed By: farnz

Differential Revision: D29457065

fbshipit-source-id: ab759745455f3b560e6230ade9f8a9095a3d961e
2021-06-30 00:52:10 -07:00
Liubov Dmitrieva
04aa0405e8 add 'upload/filenodes' request
Summary:
add 'upload/filenodes' request

This API must be called after file content has been uploaded. It requires a valid upload token for already uploaded file content.

The token can contain a file content id of different types (canonical, sha1, sha256). It may or may not contain the content size.

Reviewed By: StanislavGlebik

Differential Revision: D29197219

fbshipit-source-id: 3de31831ab06265675617a5c43cbd4be91f5cbe2
2021-06-29 19:28:45 -07:00
Mark Juggurnauth-Thomas
626095504d metaconfig: add BlameVersion
Summary: Add the `BlameVersion` enum to distinguish configured blame versions.

Reviewed By: StanislavGlebik

Differential Revision: D29453807

fbshipit-source-id: c9f912714010585fbed05f56bbae8e0fb3e92e44
2021-06-29 10:32:40 -07:00
Stanislau Hlebik
39e915d8d9 mononoke: allow creation of multiple symlinks that point to the same directory
Summary:
Previously it wasn't possible because the symlink target was the key in the map
that mega_grepo_sync was sending to scs, so we couldn't have two different
symlinks pointing to the same symlink target. However, we actually need that -
some AOSP repos have different symlink sources that point to the same symlink
target.

This diff fixes it by swapping the key and value in the `linkfiles` map.

Differential Revision: D29359634

fbshipit-source-id: da74d6e934350822d82d2135ab06c754824525c9
2021-06-28 04:04:46 -07:00
Xavier Deguillard
41897e3acc third-party: patch os_info to properly support Centos Stream
Summary:
This is just updating the os_info crate to my fork with a fix for Centos
Stream: https://github.com/stanislav-tkach/os_info/pull/267

Reviewed By: quark-zju

Differential Revision: D29410043

fbshipit-source-id: 3642e704f5a056e75fee4421dc59020fde13ed5e
2021-06-25 21:07:33 -07:00
Daniel Xu
431a4ed16b Fix autocargo skew
Summary: I think someone landed a dependency change or something and forgot to update autocargo

Reviewed By: dtolnay

Differential Revision: D29402335

fbshipit-source-id: e9a4906bf249470351c2984ef64dfba9daac8891
2021-06-25 17:23:33 -07:00
Arun Kulshreshtha
60139f5316 localrepo: add option to explicitly enable or disable EdenAPI globally
Summary: Add an option to allow manually forcing EdenAPI to be enabled or disabled. This is useful in a variety of cases, such as bypassing the normal EdenAPI activation logic in tests, or to forcibly disable EdenAPI in cases where it isn't working correctly.

Differential Revision: D29377923

fbshipit-source-id: f408efe2a46ef3f1bd2914669310c3445c7d4121
2021-06-25 15:33:00 -07:00
Mark Juggurnauth-Thomas
a818b2a73d mononoke_api: detect multiple copies or renames when diffing
Summary:
When diffing a changeset with its parents, if a file is copied to multiple places, then we should include all of those copies in the diff.

Furthermore, if the file is also removed, then the *first* of those copies
should be considered a move.  Note that "first" here means the first in the
lexicographic ordering of the repository manifest.
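The rule can be sketched like this (simplified types, not the actual mononoke_api code):

```rust
/// Sketch of the copy/move classification described above: if the
/// source file was removed and copied to several destinations, the
/// lexicographically first destination (manifest order) is reported as
/// a move and the rest as copies.
fn classify_copies(
    source_removed: bool,
    mut destinations: Vec<String>,
) -> (Option<String>, Vec<String>) {
    // The repository manifest is ordered lexicographically, so sorting
    // recovers which copy is "first".
    destinations.sort();
    if source_removed && !destinations.is_empty() {
        let moved_to = destinations.remove(0);
        (Some(moved_to), destinations)
    } else {
        (None, destinations)
    }
}
```

So a file removed and copied to both `z.txt` and `a.txt` shows up as a move to `a.txt` plus a copy to `z.txt`.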

Reviewed By: liubov-dmitrieva

Differential Revision: D29359516

fbshipit-source-id: eeed630c2e4d20f3fb8c923611a0433c74fd25d0
2021-06-25 11:17:26 -07:00
Jeremy Fitzhardinge
174de901a2 thrift/rust: remove new constructor for newtype typedefs
Summary:
A `new` constructor isn't necessary because it's identical to just
`TypeName`. Now that user-provided constructor can be included, it occupies
valuable namespace.

#forcetdhashing

Reviewed By: krallin

Differential Revision: D29387037

fbshipit-source-id: 7de343c13842c74772f7eca83ddd7019e1040c5c
2021-06-25 10:15:36 -07:00
Jun Wu
4b7bcc2553 dag: rename parents_and_head to parents_head_and_roots
Summary: The returned value now includes roots. Rename the function to clarify.

Reviewed By: kulshrax

Differential Revision: D29383072

fbshipit-source-id: 02a255ce20d9797f482f6fe1c716f2d79a12d4e0
2021-06-25 09:29:03 -07:00
Stanislau Hlebik
5ef7ba764b mononoke: do not check non-prefix free paths and verify_config earlier
Summary:
1) It turned out it's possible to have non-prefix-free paths in AOSP manifests,
so we have to remove this check for now.
2) Also, let's verify the config earlier so that we can return an error to the
user faster.

Differential Revision: D29335602

fbshipit-source-id: 3dd72d63a370515eca5d356b3b98bb2ac2245aee
2021-06-25 09:26:33 -07:00
Egor Tkachenko
9961e81a80 Add logging of rebased changeset to scuba
Summary:
When we do pushrebase, the changesets sent to us by the client will be rebased and get a new hash, which is not available in mononoke_test_perf at the moment.
Let's log the rebased changeset_id.

Reviewed By: Croohand

Differential Revision: D29362816

fbshipit-source-id: bebab24b12de1be9a9b81502453fcf44444f94b5
2021-06-25 06:57:34 -07:00
Thomas Orozco
8c83bd9a1c third-party/rust: update Tokio to 1.7.1
Summary: There is a regression in 1.7.0 (which we're on at the moment) so we might as well update.

Reviewed By: zertosh, farnz

Differential Revision: D29358047

fbshipit-source-id: 226393d79c165455d27f7a09b14b40c6a30d96d3
2021-06-25 06:17:41 -07:00
Stanislau Hlebik
1ab8052daf mononoke: add a command to fetch deleted file manifest
Summary: It's nice to be able to inspect it

Differential Revision: D29366561

fbshipit-source-id: 12b3cb31e9c5821d942a0a10f97962e3ae1ddc41
2021-06-25 03:29:25 -07:00
Yan Soares Couto
af0811bdbc Add type RedactionKeyList
Summary:
This adds the blob object RedactionKeyList, which just contains a list of Strings, each of which will be a key to be redacted.

This will be stored on the blobstore, while a key to this object will be stored in configerator.

Some stuff that might be worth discussing:
- This class just holds a list of strings; per se it doesn't have much to do with redaction. If we want to change this to a more generic object like `KeyList`, I'm happy to do it. By default I'll leave it like this.
- I used serde (more precisely, JSON) to (de)serialise it. The only reason I did this was to keep it as simple as possible; from what I see, other objects need to define a thrift struct with the same layout and then write `into/from_thrift` implementations. If preferred, I can do that.

It's not used in this diff, will be used in the future, I split it mostly to make it easier to review.

Reviewed By: markbt

Differential Revision: D29033597

fbshipit-source-id: 5550dbf58c5214201b739f8150fd06471bd67ab8
2021-06-25 03:15:28 -07:00
Andrey Chursin
7ed94dde6a OnDemandUpdateSegmentedChangelog: build up master before generating pull data
Summary: This is required to make sure segmented changelog has all the data needed

Reviewed By: quark-zju

Differential Revision: D29347285

fbshipit-source-id: 82ee1ffca178492b7ad363c53cee7ec57058733f
2021-06-24 13:58:02 -07:00
Andrey Chursin
dc97c2544a edenapi_service: add fast forward pull handler
Reviewed By: quark-zju

Differential Revision: D29342138

fbshipit-source-id: 056dad3bb7c207b1f0e9d0ee50a95e96ad690254
2021-06-24 13:58:02 -07:00
Robin Håkanson
4056944213 Add git LFS support to gitimport and grepo branch_forest.
Summary:
Add git LFS support to gitimport and grepo branch_forest.

I did not want to add the parsing of .gitattributes and .lfsconfig to the gitimport library. This needs to be done by the users of gitimport before the import is started, and the GitImportLfs object needs to be configured accordingly. Currently we are extracting this data from the manifest files for the "g"repo imports.

I am not sure the simple git-lfs download client works with git-lfs server backends other than Dewey, but it is a fairly simple implementation and should be easy to extend to be more generic.

Reviewed By: farnz

Differential Revision: D29082867

fbshipit-source-id: a7b0272147b3d44a0b6b9782d2a1b8ec94653b8f
2021-06-24 13:49:20 -07:00
Stanislau Hlebik
129d4fa88f mononoke: support multiple directories in mononoke_admin rsync
Summary: It's useful to be able to copy multiple dirs at once

Reviewed By: markbt

Differential Revision: D29358375

fbshipit-source-id: f1cc351195cc2c19de36a1b6936b598e314848c3
2021-06-24 11:44:34 -07:00
Stanislau Hlebik
1044dd545d mononoke: support mononoke admin convert for git
Summary:
Previously only conversion between bonsai and hg was supported. Let's add git
as well.

Obviously you can use `scsc lookup`, but mononoke_admin can be useful for repos
that are not on scs yet.

Reviewed By: farnz

Differential Revision: D29360793

fbshipit-source-id: eb2b71eab192b3456ba3d580f7eb8c4a85b2fd1d
2021-06-24 07:32:51 -07:00
Yan Soares Couto
a3e0290fe1 Move CoreContext creation in repo_factory to a new function
Summary: Very simple refactor. This logic was already used twice and I will use it another time in following diffs.

Reviewed By: markbt

Differential Revision: D29033594

fbshipit-source-id: 96040a2eee2b58f6851646e51b67c46c6bf334fe
2021-06-24 06:33:04 -07:00
Mark Juggurnauth-Thomas
728f145e78 ephemeral_blobstore: add ephemeral blobstore
Summary:
Implement get and put for the ephemeral blobstore.  This allows blobs to
be stored and retrieved in bubbles.

Ephemeral bubbles always have a repo associated with them when they are opened,
to simplify blob prefixing.  It is valid for a bubble id to have multiple repos
associated with it, but they must be accessed separately, and in practice this
won't be used.

Reviewed By: StanislavGlebik

Differential Revision: D29067722

fbshipit-source-id: d870f695fc1d0c825fdaec9337c82a13209165ce
2021-06-24 04:13:58 -07:00
Mark Juggurnauth-Thomas
3c9bf458be metaconfig: add ephemeral blobstore config
Summary:
Extend metaconfig to include configuration for the ephemeral blobstore.

An ephemeral blobstore is optional: repos without an ephemeral blobstore cannot
store ephemeral commits or snapshots.

Reviewed By: StanislavGlebik

Differential Revision: D29067719

fbshipit-source-id: fe7d42173d5c34a937c99c72f4b2bd08af503889
2021-06-24 04:13:58 -07:00
Mark Juggurnauth-Thomas
5716174f8f packblob: generalise key prefixes
Summary:
Packblob currently expects key prefixes of the form `repoNNNN.` to be stripped, but also allows keys without this prefix. For the ephemeral blobstore we want to allow prefixes of the form `ephXXX.repoNNNN.` as well.

Generalise packblob so that we can have multiple key prefixes.

Packblob will enforce that none of the blobs in the packblob have a prefix that matches any of the patterns - this will prevent us from accidentally storing `repoNNNN.`-prefixed blobs in an ephemeral blobstore that requires `ephXXX.repoNNNN.` prefixes, for example.
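The generalised prefix handling can be sketched roughly as follows (the exact matching rules in packblob are an assumption here, sketched for illustration):

```rust
/// Sketch of generalised key-prefix stripping: remove an optional
/// `ephXXX.` component and then an optional `repoNNNN.` component,
/// leaving the bare blob key.
fn strip_key_prefixes(key: &str) -> &str {
    let key = strip_numbered(key, "eph");
    strip_numbered(key, "repo")
}

/// Strip a single `<prefix><digits>.` component if present.
fn strip_numbered<'a>(key: &'a str, prefix: &str) -> &'a str {
    if let Some(rest) = key.strip_prefix(prefix) {
        if let Some(dot) = rest.find('.') {
            if dot > 0 && rest[..dot].chars().all(|c| c.is_ascii_digit()) {
                return &rest[dot + 1..];
            }
        }
    }
    key
}
```

The enforcement described above is then a matter of rejecting any blob whose key still matches one of these patterns after stripping.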

Reviewed By: liubov-dmitrieva

Differential Revision: D29067720

fbshipit-source-id: 953909d47c9c4af91b529bcc684340d26411463d
2021-06-24 04:13:58 -07:00
Alex Hornby
196ade1c06 mononoke: extract chunking params in walker
Summary: Make it clearer which of the TailParams are only required when chunking, removing parallel Option<> so that all items that should be set together are inside one optional item.

Reviewed By: farnz

Differential Revision: D29264647

fbshipit-source-id: d64cddf94b35e62d6e50cd8afe906eef2444c730
2021-06-24 01:49:39 -07:00
Alex Hornby
3d59baacd5 mononoke: check if chunking in walker defer_visit()
Summary: Makes defer_visit return a Result, so we can detect if it is called when not chunking.

Reviewed By: farnz

Differential Revision: D29268346

fbshipit-source-id: b8ea503c2848adb5d7ca3fb0e61399be2930c3de
2021-06-24 01:49:39 -07:00
Andrey Chursin
ea95fbdee8 api: introduce segmented_changelog_pull_fast_forward_master
Reviewed By: quark-zju

Differential Revision: D29319057

fbshipit-source-id: 88ff9e1f4acc0109c8a1e4978914f84832ebeb36
2021-06-23 14:51:39 -07:00
Andrey Chursin
f9b85a5a93 segmented_changelog: impl for ReadOnlySegmentedChangelog::pull_fast_forward_master
Summary: This is roughly similar to the algorithm in NameDag.

Reviewed By: quark-zju

Differential Revision: D29318721

fbshipit-source-id: 51a9123daa2b4cf0fbe2346a8a0c7e75172d9afb
2021-06-23 14:51:39 -07:00
Andrey Chursin
b13454d54b segmented_changelog: introduce SegmentedChangelog::pull_fast_forward_master
Summary: The naming is used in other parts of the dag crate - this introduces the Mononoke-side binding for the corresponding functions on the dag side.

Reviewed By: quark-zju

Differential Revision: D29318722

fbshipit-source-id: e9eea5536b041b6ab2ce578914817bca43a10d48
2021-06-23 14:51:39 -07:00
Stanislau Hlebik
3c14f3c20b mononoke: fix symlink handling in megarepo_api
Summary:
Path should be relative to the symlink path, not to the repo root. This diff
fixes it

Reviewed By: farnz

Differential Revision: D29327682

fbshipit-source-id: a51161a8039a88263fe941562f2c2134aa5d4fef
2021-06-23 04:20:33 -07:00
Meyer Jacobs
b5858adee1 scmstore: update remaining tests
Summary: Update the remaining tests for scmstore. In each of these cases we're just disabling scmstore for various reasons. I think the `test-lfs-bundle.t` and `test-lfs.t` failures represent a legitimate issue with scmstore's contentstore fallback, but I don't think it should block the rollout.

Reviewed By: kulshrax

Differential Revision: D29289515

fbshipit-source-id: 10d055bf679db8efdeb16ac96b7ed597d7b6d82c
2021-06-22 13:14:58 -07:00
Stanislau Hlebik
56c926297f mononoke: reuse hg manifest from parents if they are identical
Summary:
This is a follow-up to D28903515 (9a3fbfe311). In D28903515 (9a3fbfe311) we added support for reusing
hg filenodes if the parent has the same filenode. However, we weren't reusing
manifests even if a parent had an identical manifest, and this diff adds
support for doing so.

There's one caveat - we try to reuse parent manifests only if there is more
than one parent manifest. See the explanation in the comments.
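As a sketch of the idea (type and function names here are illustrative, not the actual Mononoke API): when deriving a manifest for a merge commit, if the freshly computed manifest is identical to one of the parents' manifests, reuse the parent's id instead of uploading a new object:

```rust
#[derive(Clone, PartialEq, Eq, Debug)]
struct HgManifestId(String);

// Hypothetical helper: `computed` is the manifest we just derived,
// `parents` are the parent commits' manifest ids.
fn reuse_or_keep(computed: HgManifestId, parents: &[HgManifestId]) -> HgManifestId {
    // Per the caveat above, only attempt reuse when there is more than
    // one parent manifest.
    if parents.len() > 1 {
        if let Some(p) = parents.iter().find(|p| **p == computed) {
            // An identical manifest already exists: reuse it, skip the upload.
            return p.clone();
        }
    }
    computed
}
```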

Reviewed By: farnz

Differential Revision: D29098908

fbshipit-source-id: 5ecfdc4b022ffc7620501cc024e7a659fb82f768
2021-06-22 11:50:02 -07:00
Andres Suarez
fc37fea20c Update itertools 0.8.2 to 0.10.1
Reviewed By: dtolnay

Differential Revision: D29286012

fbshipit-source-id: 6923c0b750692e6932e85fd539b076b172ff43b7
2021-06-22 04:09:00 -07:00
Alex Hornby
4c94a2bfc3 mononoke: no need to revisit deferred nodes in OldestFirst mode
Summary:
In the walker, an Option<NodeData> value of None is used to indicate that no data could be found for a node, and that for derived data mappings we should try again to load it later, when it may have been derived.

When a node is outside the chunk boundary this isn't appropriate; we should just mark it as visited and move on, which is what this change does.

Reviewed By: farnz

Differential Revision: D29230223

fbshipit-source-id: c2afdee9b914af89c7954c8e6a7d17a174df7ed1
2021-06-22 01:41:39 -07:00
Meyer Jacobs
43a75431bb scmstore: update additional test
Summary: Only four tests remaining after this.

Reviewed By: kulshrax

Differential Revision: D29229656

fbshipit-source-id: 56c0a17f6585263e983ce8bc3c345b1f266422e0
2021-06-21 20:32:50 -07:00
Meyer Jacobs
88ab7198bc scmstore: update more tests
Summary: Update more tests to avoid relying on pack files and legacy LFS, and override configs in `test-inconsistent-hash.t` to continue using pack files even after the scmstore rollout, to test Mononoke's response to corruption, which is not currently as easy to do with indexedlog.

Reviewed By: quark-zju

Differential Revision: D29229650

fbshipit-source-id: 11fe677fcecbb19acbefc9182b17062b8e1644d8
2021-06-21 20:32:50 -07:00
Andrew Gallagher
05cf7acd77 object-0.25.3: patch SHT_GNU_versym entsize fix
Summary:
Pull in a patch which fixes writing out an incorrect entsize for the
`SHT_GNU_versym` section:
ddbae72082

Reviewed By: igorsugak

Differential Revision: D29248208

fbshipit-source-id: 90bbaa179df79e817e3eaa846ecfef5c1236073a
2021-06-21 09:31:49 -07:00
Yan Soares Couto
73212fc9bf Return Arc instead of reference
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3

On RedactedBlobs, let's return an `Arc<HashMap>` instead of `&HashMap`.

This is not needed now, but when reloading information from configerator, we won't be able to return a reference, only a pointer.
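A minimal sketch of the pattern (types simplified; not the real RedactedBlobs API): once the map can be swapped out at runtime, a getter hands out a cheap `Arc` clone rather than a borrow tied to the struct:

```rust
use std::collections::HashMap;
use std::sync::Arc;

struct RedactedBlobs {
    // In reality this would be reloadable from configerator; a plain
    // field is enough to show the ownership pattern.
    map: Arc<HashMap<String, String>>,
}

impl RedactedBlobs {
    // Returning Arc<HashMap> instead of &HashMap: cloning the Arc only
    // bumps a refcount, and the caller keeps a stable snapshot even if
    // the field is later replaced with a freshly loaded map.
    fn redacted(&self) -> Arc<HashMap<String, String>> {
        Arc::clone(&self.map)
    }
}
```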

Reviewed By: StanislavGlebik

Differential Revision: D28962040

fbshipit-source-id: 0848acc1a81a87c0b51d968efe31f61dacd57c47
2021-06-21 08:42:16 -07:00
Yan Soares Couto
f0a287580e Add wrapper around redacted blobs
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3

Instead of using `HashMap<String, RedactedMetadata>` everywhere, let's use an `Arc<RedactedBlobs>` object from which we can borrow the map instead. The borrow function is async because it will need to be when we're fetching from configerator, as it may need to rebuild the redaction data.

Wrapping it in `Arc` also makes it possible to re-use the same data across repos; I believe right now it's cloned everywhere.

In later diffs I'll use this enum to add a new way to fetch configs.

Reviewed By: markbt

Differential Revision: D28935506

fbshipit-source-id: befa96810ee7ebb9487f99f9e769a945981b58ed
2021-06-21 08:42:16 -07:00
Simon Farnsworth
23cd985c98 Add a tool to check working copy equivalence between git and Mononoke
Summary:
We're doing imports for AOSP megarepo work, and want a tool to quickly check that our imports are what we expect.

Use libgit2 and a simple LFS parser to read git SHA-256 entries, and FSNodes to get the Mononoke entries to match

Reviewed By: StanislavGlebik

Differential Revision: D29169743

fbshipit-source-id: 1ef1e2c780b8742c7fa5f15f9ee01bc0481a6543
2021-06-21 07:35:31 -07:00
Simon Farnsworth
e07bd8ab5a Fix up building of the test case for scrubbing
Summary: This is a minimal fix so that it builds, not enough to test the new bit, but enough to unbreak contbuild

Reviewed By: yancouto, HarveyHunt

Differential Revision: D29263246

fbshipit-source-id: c5430ff4bc885103664c33caca90af5819d97ddd
2021-06-21 07:29:25 -07:00
Alex Hornby
51ee68fa24 mononoke: improve couple of ifs in walker
Summary: Spotted this in passing. Save a DashMap lookup in the OldestFirst case by checking the enum first

Reviewed By: farnz

Differential Revision: D29232280

fbshipit-source-id: 72e93ee704767a42c36ffeec505fd79a22c4d88e
2021-06-21 02:37:24 -07:00
Stanislau Hlebik
fdaea05176 mononoke: log when derived data mapping was inserted
Summary:
At the moment we have a few ways of deriving data:
1) "normal", which is used by most of the mononoke code. In this case we insert
derived data mapping after all the data for a given derived data type was
safely saved.
2) "backfill", which is used when we backfilling a lot of commits. In this case
we write all the data to in-memory blobstore first, and only later we save data
to real blobstore, and then write derived data mapping
3) "batch", when we derive data for a few commits at once. It can be combined
with "backfill" mode.

We also have a special scuba table for derived data derivation, however there
are a few problems with it.

Only "normal" mode has good and predictable logging i.e. it logs once before we
attempt to derive a commit, and once after commit was derived or failed.

"backfill" logs right after data for a given commit was "derived"; however, this is an in-memory
derivation, and at this point no data has been saved to the blobstore.
So if the backfill process crashes a bit later, then the commit might not be derived
after all, and it's impossible to tell just by looking at the scuba table.

With "batch" mode it's even worse - we don't get any logs at all.

A bigger refactoring is needed here, because currently the process of
derivation is very hard to grok. But for now I suggest slightly improving
scuba logging by also logging an event when a derived data mapping was actually written (or failed to be
written). After this diff we'll get the following:

1) "normal" mode will get three entries in scuba table in this order: derivation start,
mapping written, derivation end,
2) "backfill" mode will also get three entries in the scuba table, but in a different
order: derivation start, derivation end, mapping written
3) "batch" mode will get one entry for writing the mapping. Not great, but
better than nothing!

Reviewed By: farnz

Differential Revision: D29231404

fbshipit-source-id: 2c601e7dc58c00e22fda1ddd542833a818d1d023
2021-06-21 01:19:52 -07:00
Stanislau Hlebik
222352e0a5 mononoke: move derived data logging code to a separate file
Summary: Just moving some code around to make the derive_impl file a bit smaller

Reviewed By: farnz

Differential Revision: D29231405

fbshipit-source-id: c923f42710f4be98147bc58d5b828d5d6c7bf1a6
2021-06-21 01:19:52 -07:00
Simon Farnsworth
3404fb6b66 New manual_scrub mode for checking that a write-mostly store is populated
Summary:
I'm seeing significant Zippy load when I do a check scrub of our big repo to make sure that it's all in SQL Blobstore as well as our main blob stores.

Teach scrub to not bother talking to the main blobstores unless the write-mostly blobstore is either missing the data or unable to retrieve it.

Reviewed By: ahornby

Differential Revision: D29233349

fbshipit-source-id: 1127129ff283477558cddb03686c3c13aee47fb5
2021-06-18 10:26:22 -07:00
Aida Getoeva
b340165c59 mononoke/eden: reduce the number of ODS timeseries
Summary: We have over [17M timeseries](https://www.internalfb.com/intern/ods/category?cat_id=1475&selection=timeseries) now with the [edenapi far ahead](https://fburl.com/scuba/gorilla_keys/yurnzsfi). Let's not group the timeseries by repo name, as it's not very useful (we can look into Scuba for more details), and remove some of the percentiles.

Reviewed By: ahornby

Differential Revision: D29196854

fbshipit-source-id: 0158fe9e9526fb3db35a4ac6234bf580cbd6805b
2021-06-18 04:16:59 -07:00
Andres Suarez
845128485c Update bytecount
Reviewed By: dtolnay

Differential Revision: D29213998

fbshipit-source-id: 92e7a9de9e3d03f04b92a77e16fa0e37428fe2fb
2021-06-17 19:50:32 -07:00
Davide Cavalca
b82c5672fc Update several rust crate versions
Summary: Update versions for several of the crates we depend on.

Reviewed By: danobi

Differential Revision: D29165283

fbshipit-source-id: baaa9fa106b7dad000f93d2eefa95867ac46e5a1
2021-06-17 16:38:19 -07:00
Liubov Dmitrieva
1b818d114d add an option to pass some metadata in the token
Summary:
add an option to pass some metadata in the token

This will be used for content tokens, for example. We would like to guarantee that the specific content has been uploaded and that it had the specific length. This will be used for hg filenode uploads.
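As an illustrative sketch (field and type names are assumptions, not the real token format), the metadata lets a later step trust that the content exists with a given length without re-reading it:

```rust
// Hypothetical token shapes for illustration only.
struct FileContentTokenMetadata {
    content_size: u64,
}

struct UploadToken {
    content_id: String,
    metadata: Option<FileContentTokenMetadata>,
}

// E.g. an hg filenode upload could check the claimed size against the
// token instead of fetching the blob again.
fn token_guarantees_size(token: &UploadToken, expected: u64) -> bool {
    matches!(&token.metadata, Some(m) if m.content_size == expected)
}
```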

Reviewed By: markbt

Differential Revision: D29136295

fbshipit-source-id: 2fbd3917ee0a55f43216351fdbc1a6686eb80176
2021-06-17 08:22:33 -07:00
Liubov Dmitrieva
500a232716 implement upload of file content into blobstore
Summary:
Upload file content into the blobstore.

The existing Mononoke API already validates the provided hashes and calculates the missing one.

We would probably need to write to all multiplexed blobstores, but multiplexing will be addressed separately.

Reviewed By: markbt

Differential Revision: D29103111

fbshipit-source-id: 0cac837efc238f618a35420523279fb7aa91668a
2021-06-17 08:22:33 -07:00
Alex Hornby
8f2b3a8a9d mononoke: sqlblob allow inline mysql puts
Summary: Allow puts to sqlblob with mysql backing to use the InlineBase64 hash type.

Reviewed By: farnz

Differential Revision: D28829452

fbshipit-source-id: 265cf45e55284d34d3002a9db205e14eaee4fa39
2021-06-17 07:26:45 -07:00
Stanislau Hlebik
fbc07cb4c3 mononoke: make chunk size configurable in regenerate_filenodes binary
Summary:
It's useful to have this configurable.
While here, also use slog instead of println to attach timestamps as well.

Reviewed By: Croohand

Differential Revision: D29165693

fbshipit-source-id: d844926560b15042445d5861a281870ac102d12e
2021-06-17 03:07:24 -07:00
Thomas Orozco
b170b80412 mononoke: add an --oncall argument to megarepo bind commits
Summary:
Like it says in the title. Let's allow specifying an oncall here since that
oncall will be tasked with retroactive review of the commit.

Reviewed By: StanislavGlebik

Differential Revision: D29162534

fbshipit-source-id: 9ed3ac43c38a1120bb16a2f5b5218fdbf80e0d47
2021-06-16 08:50:52 -07:00
CodemodService Bot
4c4dfd45ad Daily common/rust/cargo_from_buck/bin/autocargo
Reviewed By: krallin

Differential Revision: D29158387

fbshipit-source-id: 48a0b590e01083d762bbed2b7e272cbefc72641f
2021-06-16 04:50:15 -07:00
Stanislau Hlebik
8b83a0463a mononoke: bump bonsai_hg_mapping cache key again
Differential Revision: D29160193

fbshipit-source-id: cd8db604b74470c7cde28c8062f0d86fa097a794
2021-06-16 03:58:38 -07:00
Stanislau Hlebik
daca08abb7 mononoke: bump filenodes memcache
Summary:
Similar to D29098920 (9a813fb14b), I'm regenerating filenodes now. Because of that I need to
bump the cache key.

Reviewed By: mitrandir77

Differential Revision: D29134968

fbshipit-source-id: 2f2b5b41bedcb0a037be7eade74e7b45a8990880
2021-06-16 02:23:19 -07:00
Mark Juggurnauth-Thomas
11228c2240 mononoke_types: make repo id suffix pattern a separate constant
Summary:
The `REPO_PREFIX_REGEX` attempts to match on the `.` of the repo id suffix;
however, since `.` is a regex metacharacter, it needs to be escaped. Currently
this is done inline in the regex itself, which is fragile.

Instead, let's have a separate constant for "the pattern that matches the repo
suffix", which has the value `"\."`.
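A sketch of the pattern (the constant name and regex shape are illustrative): keeping the escaped dot in its own constant makes the metacharacter handling explicit instead of burying a backslash inside a larger regex string:

```rust
// The escaped "." that terminates a repo id prefix; kept separate so the
// regex metacharacter escaping is visible at a glance.
const REPO_ID_SUFFIX_PATTERN: &str = r"\.";

// Illustrative builder for a prefix-matching pattern like "^repo[0-9]+\.".
fn repo_prefix_pattern() -> String {
    format!("^repo[0-9]+{}", REPO_ID_SUFFIX_PATTERN)
}
```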

Reviewed By: liubov-dmitrieva

Differential Revision: D29067723

fbshipit-source-id: 1e1a0a3ecf2b02d4e44fad1eaab57804f3dd020c
2021-06-15 08:29:31 -07:00
Mark Juggurnauth-Thomas
99b78bb74b memblob: implement BlobstoreKeySource
Summary:
Allow enumeration of memblob keys using `BlobstoreKeySource`.

To allow this, we have to change the map type from `HashMap` to `BTreeMap`.
Since memblob is only used in tests, this should be ok.
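A minimal sketch of why `BTreeMap` helps here (simplified types; not the actual memblob code): ordered keys make range enumeration a single `range` call, which `HashMap` cannot offer:

```rust
use std::collections::BTreeMap;
use std::ops::RangeInclusive;

// Enumerate all keys within an inclusive range, in sorted order.
fn keys_in_range(
    store: &BTreeMap<String, Vec<u8>>,
    range: RangeInclusive<String>,
) -> Vec<String> {
    store.range(range).map(|(k, _)| k.clone()).collect()
}
```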

Reviewed By: liubov-dmitrieva

Differential Revision: D28224475

fbshipit-source-id: b46ef758e0d92a44359ccd1c0a8b61ccf4548a80
2021-06-15 08:29:31 -07:00
Mark Juggurnauth-Thomas
fb646e5edc blobstore: enumeration ranges are inclusive
Summary:
Manifold enumeration ranges are inclusive. Update the documentation of the
options that ultimately feed into this accordingly.

To avoid future confusion, use Rust's inclusive ranges to initialize these, and
remove the exclusive range option.

The fileblob implementation was actually performing exclusive checks at both
ends, rather than inclusive ones.  Correct this by implementing `RangeBounds`
and using `range.contains` instead.

Reviewed By: liubov-dmitrieva

Differential Revision: D28224481

fbshipit-source-id: 7244588271d7754d6c6820790cbd76574b296d7b
2021-06-15 08:29:31 -07:00
Mateusz Kwapich
ef473588c3 test showcasing the problem with zstd update
Summary:
After zstd update (D28897019 (f89dbebae8)) we've seen errors in mononoke production that
weren't caught by tests. This tests showcases this specific error so we don't
reintroduce the problem in the future and so we can verify bug when we have it.

Reviewed By: ahornby

Differential Revision: D29112406

fbshipit-source-id: 8de73066f7c27ba6c9f56e112964ae733f541bf1
2021-06-15 04:47:11 -07:00
Stanislau Hlebik
9a813fb14b mononoke: bump bonsai_hg_mapping memcache
Summary:
I had to delete a bunch of bonsai hg mapping entries for aosp repos. Let's
bump memcache to make sure they are no longer in the cache.

Reviewed By: ahornby

Differential Revision: D29098920

fbshipit-source-id: a997a1362196efe3a77281d67c6419aff72cad14
2021-06-15 02:11:26 -07:00
Stanislau Hlebik
e65b2f0555 mononoke: reorganize merge_even/merge_uneven fixtures
Summary:
Just moving the code around.
This made it easier for me to understand what's going on.

Reviewed By: farnz

Differential Revision: D29098907

fbshipit-source-id: 07826e2408d7b62487ac1ed20ef0bded8ac4de6c
2021-06-14 07:17:50 -07:00
Aida Getoeva
6b85b79e4a mononoke/blobstore: add mysql sync queue latency logging
Summary:
This diff adds logging for the sync queue writes latency to the Mononoke Blobstore Trace Scuba table and creates a new column `Queue`, which can be used later on for the WAL logging purposes too.

This is needed to better understand out current situation with the MySQL base queue and compare it with the other solutions.

Reviewed By: ahornby

Differential Revision: D29015723

fbshipit-source-id: a2d7ad17101bb456ceae9060c39bfecb06644326
2021-06-14 06:17:35 -07:00
Simon Farnsworth
58f1cba9ba Simplify the logic in store_file_change
Summary: In the previous diff, we took out most of the complexity around Mercurial-style filenodes. Rearrange the code to have fewer early returns, and thus be easier to follow when trying to debug

Reviewed By: yancouto

Differential Revision: D29060994

fbshipit-source-id: 830f6c8d4a42725d7096a0d5c4a4a4d6797b187a
2021-06-11 12:24:56 -07:00