Commit Graph

368 Commits

Jun Wu
6ecf255fcf test-infinitepush-mutation: use modern configs
Summary:
Enable remotenames and narrow-heads.

Local bookmarks are replaced by remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200503

fbshipit-source-id: 41ac4f4f606011dcaf6d0d9867b01fb77b9a79d8
2020-06-29 13:00:07 -07:00
Jun Wu
95cf4a2a39 test-infinitepush-hydrated: use modern configs
Summary:
Enable remotenames and narrow-heads.

Phase exchange is gone because of narrow-heads.
The remotenames extension was written suboptimally, so it issues a second
bookmarks request (which can hopefully be removed by having selective
pull everywhere and migrating pull to use the new API).

Reviewed By: krallin

Differential Revision: D22200506

fbshipit-source-id: c522bb9fc1396d813e0f1f380c4290445bab3db3
2020-06-29 13:00:07 -07:00
Jun Wu
ffde9f50e9 test-infinitepush-commits-disabled: use modern configs
Summary:
Enable remotenames and narrow-heads. The `master_bookmark` is no longer a local
bookmark in the client repo.

Reviewed By: krallin

Differential Revision: D22200513

fbshipit-source-id: bc3c1715ce21f45a35bc67148eb00e44944bea6e
2020-06-29 13:00:06 -07:00
Jun Wu
8ecb79a921 test-gettreepack-sparse-update: use modern configs
Summary: Enable remotenames and narrow-heads.

Reviewed By: krallin

Differential Revision: D22201083

fbshipit-source-id: 585dff69db9dd725c8fa1090d47c85b150f979da
2020-06-29 13:00:06 -07:00
Jun Wu
439c029007 test-gettreepack-designated-nodes: use modern configs
Summary:
Enable remotenames and narrow-heads. The server gets one more request from
remotenames.

Reviewed By: krallin

Differential Revision: D22200502

fbshipit-source-id: 26bc28b19438c7be4a19eae6be728c83b113f822
2020-06-29 13:00:06 -07:00
Jun Wu
93318c255b test-bookmark-hg-kind: use modern configs
Summary:
Enable remotenames and narrow-heads. The client gets remote bookmarks instead
of local bookmarks during clone and phases are decided by remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200515

fbshipit-source-id: 12a9e892855b3a8f62f01758565de5f224c4942b
2020-06-29 13:00:06 -07:00
Jun Wu
8bde7d1316 tests: show remotenames in tglogpnr
Summary:
Change the template to show remote bookmarks, which will be more relevant once
we migrate to modern configs. Namely, phases will be decided by remote bookmarks.

The named branches logic was mostly removed from the code base. Therefore
drop the `{branches}` template.

Reviewed By: StanislavGlebik

Differential Revision: D22200512

fbshipit-source-id: 8eca3a71ff88b8614023f4920a448156fcd712d5
2020-06-29 13:00:06 -07:00
Jun Wu
dbd29b7d06 tests: turn on narrow-heads for some tests
Summary: With narrow-heads, the phase exchange step is skipped.

Reviewed By: krallin

Differential Revision: D22200504

fbshipit-source-id: 6ab366e7e68eb3b82f52acaa8f488747435e0ecf
2020-06-29 13:00:06 -07:00
Jun Wu
889beacdf1 tests: enable narrow-heads for Mononoke tests
Summary:
Most tests pass without changes. Some incompatible tests are added to the
special list.

Reviewed By: krallin

Differential Revision: D22200505

fbshipit-source-id: 091464bbc7c9c532fed9ef91f2c955d6e4f2df0b
2020-06-29 13:00:06 -07:00
Stanislau Hlebik
04ce32014d mononoke: log pushed commits to scribe
Summary: This is the final diff of the stack - it starts logging pushed commits to scribe

Reviewed By: farnz

Differential Revision: D22212755

fbshipit-source-id: ec09728408468acaeb1c214d43f930faac30899b
2020-06-29 12:15:22 -07:00
Stanislau Hlebik
8a137ae922 mononoke: add Scribe
Summary:
At the moment we can't test logging to scribe easily - we don't have a way to
mock it. Scribe is supposed to help with that.

It will let us configure all scribe logs to go to a directory on a
filesystem, similar to the way we configure scuba. The Scribe itself will
be stored in CoreContext.
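
As a sketch of the kind of mockable abstraction this describes (hedged: the variant and method names here are illustrative, not the real Mononoke API):

```
use std::fs::OpenOptions;
use std::io::Write;
use std::path::PathBuf;

/// Illustrative stand-in: production offers samples to the real scribe
/// service, while tests redirect each category to a file in a directory,
/// so assertions can simply read the files back.
#[derive(Clone)]
pub enum Scribe {
    Production,
    LocalDirectory(PathBuf),
}

impl Scribe {
    pub fn offer(&self, category: &str, sample: &str) -> std::io::Result<()> {
        match self {
            // The real implementation would call the scribe client here.
            Scribe::Production => Ok(()),
            // Test mode: append the sample to <dir>/<category>.
            Scribe::LocalDirectory(dir) => {
                let mut file = OpenOptions::new()
                    .create(true)
                    .append(true)
                    .open(dir.join(category))?;
                writeln!(file, "{}", sample)
            }
        }
    }
}
```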

Reviewed By: farnz

Differential Revision: D22237730

fbshipit-source-id: 144340bcfb1babc3577026191428df48e30a0bb6
2020-06-29 12:15:22 -07:00
Jun Wu
4902a3300c tests: enable narrow-heads by default
Summary: Many tests are incompatible, but many are passing.

Reviewed By: kulshrax

Differential Revision: D22052475

fbshipit-source-id: 1f30ac2b0fe034175d5ae818ec2be098dbd5283d
2020-06-29 11:29:04 -07:00
Simon Farnsworth
7e9b8dd9e9 Remove last vestiges of Lua hooks from tests
Summary:
For Lua hooks, we needed to know whether to run the hook per file or per changeset. Rust hooks know this implicitly, as they're built into the server.

Stop having the tests set an unnecessary config.

Reviewed By: krallin

Differential Revision: D22282799

fbshipit-source-id: c9f6f6325823d06d03341f04ecf7152999fcdbe7
2020-06-29 10:03:22 -07:00
Harvey Hunt
026710c2cd mononoke: Remove --config_path from server arguments
Summary:
D21642461 (46d2b44c0e) converted Mononoke server to use the
`--mononoke-config-path` common argument style to select a config path.

Now that this change has been running for a while, remove the extra logic in
the server that allowed it to accept both the deprecated `--config_path / -P`
and the new arg.

Reviewed By: ikostia

Differential Revision: D22257386

fbshipit-source-id: 7da4ed4e0039d3659f8872693fa4940c58bae844
2020-06-29 07:28:36 -07:00
Kostia Balytskyi
fb3eea2b56 commit_validator: get rid of unneeded bookmark rewriting
Summary:
`get_entry_with_small_repo_mapings` is a function that turns a `CommitEntry`
struct into a `CommitEntryWithSmallReposMapped` struct - the idea being that this
function looks up the hashes of the commits into which the original commit from the
large repo got rewritten (in practice rewriting may have happened in the small
-> large direction, but that is not important for the purpose of this job). So it
establishes a mapping. Before this diff, it actually established a
`Large<ChangesetId> -> Option<(Small<ChangesetId>, Option<BookmarkName>)>`
mapping, meaning that it recorded into which bookmark the large-repo bookmark was
rewritten. This was useless information (as evidenced by the fact that it was
ignored by the `prepare_entry` function, which turns
`CommitEntryWithSmallReposMapped` into `EntryPreparedForValidation`). It is
useless because bookmarks are mutable and it is impossible to do historic
validation of the correctness of bookmark renaming: bookmarks may have been
correctly renamed when commits were pushed, but they may be incorrectly renamed
now, and vice versa. To deal with bookmarks, we have a separate job,
`bookmarks_validator`.

So this diff stops recording this useless information. As a bonus, this will
make migration onto `LiveCommitSyncConfig` easier.
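
Schematically, the type change looks like this (a sketch using the codebase's Small/Large newtype notation; the stand-in types are illustrative):

```
// Illustrative stand-ins for the real Mononoke types.
struct Small<T>(T);
struct Large<T>(T);
struct ChangesetId;
struct BookmarkName;

// Before: the bookmark rename was recorded, then ignored downstream.
type BeforeMapping =
    fn(Large<ChangesetId>) -> Option<(Small<ChangesetId>, Option<BookmarkName>)>;

// After: only the commit mapping is established.
type AfterMapping = fn(Large<ChangesetId>) -> Option<Small<ChangesetId>>;
```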

Reviewed By: StanislavGlebik

Differential Revision: D22235389

fbshipit-source-id: c02b3f104a8cbd1aaf76100aa0930efeac475d42
2020-06-29 01:48:52 -07:00
Kostia Balytskyi
c01294e8d6 backsyncer_cmd: use LiveCommitSyncConfig
Summary:
This diff migrates `backsyncer_cmd` (the thing that runs in the separate backsyncer job, as opposed to the backsyncer triggered from the push-redirector) onto `LiveCommitSyncConfig`. Specifically, this means that on every iteration of the loop that calls `backsync_latest`, we reload `CommitSyncConfig` from configerator, build a new `CommitSyncer` from it, and then pass that `CommitSyncer` to `backsync_latest`.

One choice made here is to *not* create `CommitSyncer` on every iteration of the inner loop of `backsync_latest` and handle live configs outside. The reason for this is twofold:
- `backsync_latest` is called from `PushRedirector` methods, and `PushRedirector` is recreated on each `unbundle` using `LiveCommitSyncConfig`. That call provides an instance of `CommitSyncer` used to push-redirect the commit we want to backsync. It seems strictly incorrect to try and maybe use a different instance.
- because of some other consistency concerns (different jobs getting `CommitSyncConfig` updates at different times), any sync config change needs to go through the following loop:
  - lock the repo
  - land the change
  - wait some time, until all the possible queues (x-repo sync and backsync) are drained
  - unlock the repo
- this means that it's ok to have the config refreshed outside of `backsync_latest`
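
Roughly, the outer loop now has this shape (a hedged sketch: the stub types and free functions below are placeholders for the real crate APIs):

```
// Hedged sketch of the reload-per-iteration pattern; the real types live in
// Mononoke's cross-repo sync crates and differ in detail.
struct CommitSyncConfig;
struct CommitSyncer;

fn load_live_commit_sync_config() -> CommitSyncConfig {
    // In reality: read the hot-reloaded config from configerator.
    CommitSyncConfig
}

fn build_commit_syncer(_config: &CommitSyncConfig) -> CommitSyncer {
    CommitSyncer
}

fn backsync_latest(_syncer: &CommitSyncer) {
    // Backsync whatever bookmark moves are currently pending.
}

fn main() {
    for _ in 0..3 {
        // Reload the config on every iteration so that config changes are
        // picked up without restarting the job.
        let config = load_live_commit_sync_config();
        let syncer = build_commit_syncer(&config);
        backsync_latest(&syncer);
    }
}
```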

Reviewed By: farnz

Differential Revision: D22206992

fbshipit-source-id: 83206c3ebdcb2effad7b689597a4522f9fd8148a
2020-06-26 13:40:31 -07:00
Viet Hung Nguyen
fa1caa8c4e mononoke/repo_import: Add gitimport functionality and integration test
Summary: I have previously moved the gitimport functionality (D22159880 (2cf5388835)) into a separate library, since repo_import shares similar behaviours. In this diff, I set up repo_import to be able to call gitimport to get the commits and changes. (Next steps include using Mover to set the paths of the files in the commits given by gitimport)

Reviewed By: StanislavGlebik

Differential Revision: D22233127

fbshipit-source-id: 4680c518943936f3e29d21c91a2bad60108e49dd
2020-06-25 19:54:38 -07:00
Simon Farnsworth
454de31134 Switch Loadable and Storable interfaces to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.

This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.
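
As an illustration of the shape of the change (simplified; not the exact Mononoke trait definition):

```
use futures::future::BoxFuture;

// Hedged sketch: the real Loadable trait takes a context and a blobstore and
// has a richer error type; this only shows the change of future style.
trait Loadable {
    type Value: Send + 'static;

    // Returns a new-style (std::future) boxed future with a 'static
    // lifetime. Callers that still speak futures 0.1 now add `.compat()`
    // at the call site instead of the trait doing it for them.
    fn load(&self) -> BoxFuture<'static, Result<Self::Value, String>>;
}
```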

Reviewed By: krallin

Differential Revision: D22164315

fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
2020-06-25 08:45:37 -07:00
Stanislau Hlebik
b0e910655a mononoke: allow pushing only a single bookmark during push
Summary:
Push supported multiple bookmarks in theory, but in practice we never used it.
Since we want to start logging pushed commits in the next diffs, we need to decide what to do with
bookmarks, since at the moment we can log only a single bookmark to scribe.

Let's just allow a single bookmark push.

Reviewed By: farnz

Differential Revision: D22212674

fbshipit-source-id: 8191ee26337445ce2ef43adf1a6ded3e3832cc97
2020-06-25 05:51:30 -07:00
Kostia Balytskyi
8c50e0d870 unbundle: use live_commit_sync_config for push redirection
Summary:
This diff enables `unbundle` flow to start creating `push_redirector` structs from hot-reloaded `CommitSyncConfig` (by using the `LiveCommitSyncConfig` struct).

Using `LiveCommitSyncConfig` unfortunately means that those tests which don't use standard fixtures need to have both the `.toml` and the `.json` commit sync configs present, which is a little verbose. But it's not too horrible.

Reviewed By: StanislavGlebik

Differential Revision: D21962960

fbshipit-source-id: d355210b5dac50d1b3ad277f99af5bab56c9b62e
2020-06-25 03:28:08 -07:00
Viet Hung Nguyen
ebd041b0ec mononoke/tests: modified paths to absolute
Summary: When running integration tests we should make the paths absolute, but so far they were kept relative. This resulted in breaking the tests.

Reviewed By: krallin

Differential Revision: D22209498

fbshipit-source-id: 54ca3def84abf313db32aecfac503c7a42ed6576
2020-06-24 11:17:07 -07:00
Thomas Orozco
ce7f53422f mononoke/lfs_server: support the client not having the data it wants to send us
Summary:
This diff is probably going to sound weird ... but xavierd and I both think
this is the best approach for where we are right now. Here is why this is
necessary.

Consider the following scenario

- A client creates a LFS object. They upload it to Mononoke LFS, but not
  upstream.
- The client shares this (e.g. with Sandcastle), and includes a LFS pointer.
- The client tries to push this commit

When this happens, the client might not actually have the object locally.
Indeed, the only pieces of data the client is guaranteed to have is
locally-authored data.

Even if the client does have the blob, that's going to be in the hgcache, and
uploading from the hgcache is a bit sketchy (because, well, it's a cache, so
it's not like it's normally guaranteed to just hold data there for us to push
it to the server).

The problem boils down to a mismatch of assumptions between client and server:

- The client assumes that if the data wasn't locally authored, then the server
  must have it, and will never request this piece of data again.
- The server assumes that if the client offers a blob for upload, it can
  request this blob from the client (and the client will send it).

Those assumptions are obviously not compatible, since we can serve
not-locally-authored data from LFS and yet want the client to upload it, either
because it is missing in upstream or locally.

This leaves us with a few options:

- Upload from the hg cache. As noted above, this isn't desirable, because the
  data might not be there to begin with! Populating the cache on demand (from
  the server) just to push data back to the server would be quite messy.
- Skip the upload entirely, either by having the server not request the upload
  if the data is missing, by having the server report that the upload is
  optional, or by having the client not offer LFS blobs it doesn't have to the
  server, or finally by having the client simply disobey the server if it
  doesn't have the data the server is asking for.

So, why can we not just skip the upload? The answer is: for the same reason we
upload to upstream to begin with. Consider the following scenario:

- Misconfigured client produces a commit, and uploads it to upstream.
- Misconfigured client shares the commit with Sandcastle, and includes a LFS
  pointer.
- Sandcastle wants to push to master, so it goes to check if the blob is
  present in LFS. It isn't (Mononoke LFS checks both upstream and internal, and
  only finds the blob in upstream, so it requests that the client submit the
  blob), but it's also not locally authored, so we skip the upload.
- The client tries to push to Mononoke

This push will fail, because it'll reference LFS data that is not present in
Mononoke (it's only in upstream).

As for how we fix this: the key guarantee made by our proxying mechanism is
that if you write to either LFS server, your data is readable in both (the way
we do this is that if you write to Mononoke LFS, we write it to upstream too,
and if you write to upstream, we can read it from Mononoke LFS too).

What does not matter there is where the data came from. So, when the client
uploads, we simply let it submit a zero-length blob, and if it does, we take that
to mean that the client doesn't think it authored the data (and thinks we have
it), so we try to figure out where the blob is on the server side.
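
A sketch of the server-side decision this implies (illustrative control flow with hypothetical helper names, not the actual LFS server code):

```
// Illustrative only: a zero-length body for a non-empty object means "the
// client does not have this blob", so the server locates it on its side.
fn handle_upload(body: &[u8], expected_size: u64) -> Result<(), String> {
    if body.is_empty() && expected_size > 0 {
        // The client signalled it doesn't have the data; find it ourselves
        // (e.g. copy it from upstream to internal storage, or vice versa).
        return resolve_blob_server_side();
    }
    store_blob(body)
}

fn resolve_blob_server_side() -> Result<(), String> {
    Ok(())
}

fn store_blob(_body: &[u8]) -> Result<(), String> {
    Ok(())
}
```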

Reviewed By: xavierd

Differential Revision: D22192005

fbshipit-source-id: bf67e33e2b7114dfa26d356f373b407f2d00dc70
2020-06-24 10:02:01 -07:00
Thomas Orozco
266607333c hg/mononoke: fix broken test message expectation
Summary:
I landed D22118926 (e288354caf) yesterday expecting those messages, at about the
same time as xavierd landed D21987918 (4d13ce1bcc), which removed them. This
removes them from the tests.

Reviewed By: StanislavGlebik

Differential Revision: D22204980

fbshipit-source-id: 6b1d696c93a07e942f86cd8df9a8e43037688728
2020-06-24 03:27:55 -07:00
Xavier Deguillard
dc8c24ab30 remotefilelog: enable the rust stores by default
Summary:
The Rust store code has been enabled everywhere for a few weeks now, let's
enable it by default in the code. Future changes will remove the config as well
as all the code associated with the non-Rust store code.

The various test changes are due to small differences between the Rust code and
the Python one, the biggest being their handling of corrupted packfiles. The
old code silently ignored them, while the new one errors out for local
packfiles. The test-lfs-bundle.t difference is just due to an ordering
difference between Python and Rust.

Reviewed By: kulshrax

Differential Revision: D21985744

fbshipit-source-id: 10410560193476bc303a72e7583f84924a6de820
2020-06-23 18:47:44 -07:00
Thomas Orozco
e288354caf sparse: prefetch trees before iterating through the whole manifest
Summary:
If we're going to iterate through the whole manifest, we should probably
prefetch it. Otherwise, we might end up doing a whole lot of sequential
fetching. We saw this this week when a change landed in sparse profiles that
caused requests to Mononoke to increase 100-fold.

Unfortunately, I don't think we can selectively only fetch the things we are
missing, so this just goes ahead and fetches everything unconditionally. If
there is a better way to do this, I'm all ears.

Reviewed By: StanislavGlebik, xavierd

Differential Revision: D22118926

fbshipit-source-id: f809fa48a7ff7b449866b42b247bf1da30097caa
2020-06-23 08:37:23 -07:00
Kostia Balytskyi
6b370f24e3 tests: add configerator commitsync fixtures
Summary: This will be used in the following diffs. It just adds commitsync fixtures in a single place, so that we can later play with them in integration tests.

Reviewed By: StanislavGlebik

Differential Revision: D21952665

fbshipit-source-id: 2933a9f7ea8343d5d52e6c3207e7d78a3ef0be25
2020-06-23 04:33:17 -07:00
Pavel Aslanov
a1f5e45a5a BlobRepoHg extension trait.
Summary: This diff introduces the `BlobRepoHg` extension trait for the `BlobRepo` object, which contains Mercurial-specific methods that were previously part of `BlobRepo`. This diff also starts moving some of the methods from BlobRepo to BlobRepoHg.

Reviewed By: ikostia

Differential Revision: D21659867

fbshipit-source-id: 1af992915a776f6f6e49b03e4156151741b2fca2
2020-06-22 07:29:19 -07:00
Lukas Piatkowski
6ebd409406 mononoke/integration tests: separate out facebook-specific code for running integration tests
Summary: Not all facebook-specific code was moved out of integration_runner_real.py, but removing part of the code that is left would make the code less readable; the rest of it will be removed while integration_runner_real.py is made usable for OSS.

Reviewed By: farnz

Differential Revision: D22114948

fbshipit-source-id: d9c532a6a9ea653de2b12cffc92fbf45826dad37
2020-06-22 06:36:12 -07:00
Durham Goode
7f1588131b py3: set LANG="en_US.UTF-8" for most tests
Summary:
We support unicode file paths, and in python 3 those get passed to
python libraries as unicode strings. The tests set LANG=C, which means the python
library tries to convert the path to ascii, but fails for any non-ascii
characters. Let's switch to LANG="en_US.UTF-8" to match our production
behavior and make tests about unicode paths work.

Reviewed By: xavierd

Differential Revision: D22098359

fbshipit-source-id: c3057edc66e6e32f7b8b49374e622d02bd05711f
2020-06-19 13:40:17 -07:00
Simon Farnsworth
5f2b7259cd Run hooks in the large repo as well as the small when pushredirection is in place
Summary: Megarepo is simplified if we can avoid copying hooks everywhere - run megarepo hooks as well as small repo hooks during pushredirection.

Reviewed By: StanislavGlebik

Differential Revision: D20652331

fbshipit-source-id: f42216797b9061db10b50c1440253de1f56d6b85
2020-06-18 07:33:46 -07:00
Alex Hornby
9c53e07e46 mononoke: add optional compress to packblob put
Summary:
Add optional compress on put controlled by a command line option.

Other than costing some CPU time, this may be a good option when populating repos from existing uncompressed stores to new stores.

Reviewed By: farnz

Differential Revision: D22037756

fbshipit-source-id: e75190ddf9cfd4ed3ea9a18a0ec6d9342a90707b
2020-06-17 02:35:04 -07:00
Alex Hornby
6658b17fb6 mononoke: add packblob to blobstore_factory
Summary: mononoke: add packblob to blobstore_factory

Reviewed By: farnz

Differential Revision: D21924406

fbshipit-source-id: d42702b11bdc0b187467869a7959f68022c60ab2
2020-06-16 04:11:43 -07:00
Arun Kulshreshtha
977c3c73e3 edenapi_server: rename the subtree endpoint to complete_trees
Summary:
Rename the `subtree` endpoint on the EdenAPI server to `complete_trees` to better express what it does (namely, fetching complete trees, in contrast to the lighter-weight `/trees` endpoint that serves individual tree nodes). This endpoint is not used by anything yet, so there isn't much risk in renaming it at this stage.

In addition to renaming the endpoint, the relevant request struct has been renamed to `CompleteTreeRequest` to better evoke its purpose, and the relevant client and test code has been updated accordingly. Notably, now that the API server is gone, we can remove the usage of this type from Mononoke's `hgproto` crate, thereby cleaning up our dependency graph a bit.

Reviewed By: krallin

Differential Revision: D22033356

fbshipit-source-id: 87bf6afbeb5e0054896a39577bf701f67a3edfec
2020-06-15 13:40:44 -07:00
Thomas Orozco
51623fbdbd mononoke/tests: disable mutation.record everywhere
Summary:
This is consistent with what is being done for now in hg for tests that haven't been migrated
to modern configurations yet, and ensures we get stable commit hashes in our tests: D21899139. It's already explicitly turned on in tests that want it.

In the future, this should probably be updated to use "modern configs" like the
Mercurial tests do.

Reviewed By: ikostia

Differential Revision: D22016705

fbshipit-source-id: b27f6423bf4ec5244ef3ce2e7676306165a331a8
2020-06-12 04:17:18 -07:00
Thomas Orozco
304701f890 mononoke/tests: remove expecting a message that is gone
Summary: This message is gone; we shouldn't expect it anymore: D21913608

Reviewed By: ikostia

Differential Revision: D22016684

fbshipit-source-id: 97d86e9750e775c1bb3a1e75939f506cd35851c0
2020-06-12 04:17:17 -07:00
Thomas Orozco
01a84fabe4 mononoke: turn off mutation in test-commitcloud
Summary:
This test broke when this got turned on for all tests (D21899139). It's not
enabled for other commit cloud tests there, so let's be consistent.

Reviewed By: ikostia

Differential Revision: D22016686

fbshipit-source-id: 5f4385b60fd31c89e335e971f262da1226f32254
2020-06-12 04:17:17 -07:00
Kostia Balytskyi
87970c4168 tests: fix broken test-infinitepush-bookmarks-disabled.t
Reviewed By: markbt

Differential Revision: D21997584

fbshipit-source-id: b2a1f0daf540911cdde9735a53f4d2b6d3d6984e
2020-06-11 09:22:28 -07:00
Aida Getoeva
11c4601550 mononoke: remove apiserver source code
Summary:
APIServer is deprecated; tw jobs are stopped and deleted.
This diff removes the source code.

Reviewed By: krallin

Differential Revision: D21839442

fbshipit-source-id: 5a4089d9205d8b0061c8aa01dcd74674fe9baca8
2020-06-10 19:29:47 -07:00
Egor Tkachenko
d72c2b0b60 mononoke: opsfiles: Port deny_from_corp.sh hook
Summary: Ported deny_from_corp hook into mononoke.

Reviewed By: krallin

Differential Revision: D21329467

fbshipit-source-id: f5fa7a745b09a83b2624dd074155901f0bd31a55
2020-06-10 19:29:47 -07:00
Arun Kulshreshtha
9c689337cd edenapi_types: rename depth to length in history request
Summary: D21880220 renamed the `depth` parameter in Mononoke's history fetching functions to be `length`. This diff makes the same change for EdenAPI's `HistoryRequest` struct.

Reviewed By: StanislavGlebik

Differential Revision: D21948599

fbshipit-source-id: fe8649a5789f07d8262ad3d5e2f477a8b50f2c6f
2020-06-10 19:29:45 -07:00
Arun Kulshreshtha
898ddfe519 edenapi_server: add subtree endpoint
Summary:
Add a new `subtree` endpoint to the EdenAPI server, which fetches trees using the underlying server-side logic for the `gettreepack` wire protocol command. This is intended for use in situations where compatibility with `gettreepack` is desired when using HTTP for tree fetching.

The name of the endpoint is up for bikeshedding. It seemed weird to name the endpoint `gettreepack` since that name is a verb and refers to a "pack" which is not relevant in this context (there are no wirepacks or packfiles involved). I chose the name `subtree` since the endpoint logically returns all of the nodes underneath a given node in the tree (though in the most frequent case, the node will be the root node and therefore the subtree will be the entire tree).

In practice, this initial implementation is not ideal because it buffers the trees in memory, which is problematic because `gettreepack` requests are likely to produce a very large number of trees. Later in this stack, the endpoint will be updated to produce a streaming response instead.

Reviewed By: StanislavGlebik

Differential Revision: D21782764

fbshipit-source-id: 726925858352c33c923da1979da9d20fbcf930f6
2020-06-10 19:29:44 -07:00
Johan Schuijt-Li
1d4c5cbfc4 mononoke: replace instances of whitelist/blacklist/blackhole
Summary:
There are people that are hurt by usage of these terms; this should be more
than enough reason to replace them. The newly chosen terms are more
self-explanatory as well.

This doesn't yet touch the actual config files, as that requires a bit more
effort than one diff and will require more coordination.

Reviewed By: krallin

Differential Revision: D21924440

fbshipit-source-id: e24fc638dc8c9d6d20b6f3fa5f0d0bbc91bbf77b
2020-06-10 19:29:30 -07:00
Stanislau Hlebik
6ca1d57cb8 mononoke: add an integration test where filenodes are disabled
Summary:
This test checks that we can start Mononoke and serve pull/push/update with
filenodes disabled.

Reviewed By: ahornby

Differential Revision: D21904753

fbshipit-source-id: 86690c5ed5ce7d022844809b09beb25c7961cac8
2020-06-10 19:29:29 -07:00
Mateusz Kwapich
11fbd214e5 add common-base subcommand to scsc
Summary: This will allow me to test it easily.

Reviewed By: StanislavGlebik

Differential Revision: D21840079

fbshipit-source-id: 1b3743da9c7908eb0dedd665aa24a4bf7aabd94f
2020-06-10 19:29:14 -07:00
Arun Kulshreshtha
3223d12f99 edenapi: add data subcommand to read_res
Summary: Previously, `read_res` was called `data_util` and only dealt with EdenAPI data responses. Support for history responses was added later as a `history` subcommand. For consistency, let's move the top-level commands for data responses underneath a new `data` subcommand. When support for additional response types is added in the future, those can also go under their own subcommands.

Reviewed By: quark-zju

Differential Revision: D21825197

fbshipit-source-id: f5cb759a68324e7d0f98e3448bd5d1cba6417bad
2020-06-02 12:49:18 -07:00
Arun Kulshreshtha
735b112d97 edenapi: rename data_util to read_res
Summary: Give this tool a more descriptive name. (It reads EdenAPI responses, so `read_res` seemed fitting.)

Reviewed By: quark-zju

Differential Revision: D21796964

fbshipit-source-id: 8a4ee365aa3bcf115fc7a3452406ed96b4a25edc
2020-06-02 12:49:18 -07:00
Stanislau Hlebik
094bf1d44f mononoke: process wantslfspointers from clienttelemtry
Summary:
See D21765065 for more context. The TL;DR is that we want to control
the lfs rollout from the client side, to make sure we don't put lfs pointers in
the shared memcache.

Reviewed By: xavierd

Differential Revision: D21822159

fbshipit-source-id: daea6078d95eb4e9c040d353a20bcdf1b6ae07b1
2020-06-01 15:19:36 -07:00
Kostia Balytskyi
5cd4583c5a cross_repo: record sync map version_name in synced_commit_mapping
Summary:
Our `CommitSyncConfig` struct now contains a `version_name` field, which is intended to be used as an identifier of an individual version of the `commitsyncmap` in use. We want to record this value in the `synced_commit_mapping` table, so that later it is possible to attribute a commit sync map to a given commit sync.

This is part of a broader goal of adding support for sync map versioning. The broader goal is important, as it allows us to move faster (and troubleshoot better) when sync maps need changing.

Note that when a commit is preserved across repos, we set `version_name` to `NULL`, as it makes no sense to attribute commit sync maps to those cases.

Reviewed By: farnz

Differential Revision: D21765408

fbshipit-source-id: 11a77cc4d926e4a4d72322b51675cb78eabcedee
2020-06-01 07:16:52 -07:00
Arun Kulshreshtha
10fa44290e edenapi: use array to specify keys in history requests
Summary: Update the JSON format for history requests to use an array rather than an object to represent keys, for the same reason as D21412989. (Namely, that it's possible for two keys to share the same path, making the path unsuitable for use as a field name in a JSON object.)

Reviewed By: xavierd

Differential Revision: D21782763

fbshipit-source-id: eb04013795d1279ecbf00a8a0be106318695bd05
2020-05-29 15:45:10 -07:00
Stanislau Hlebik
7ef7fca580 mononoke: support ManualMove in hg sync job
Reviewed By: krallin

Differential Revision: D21762295

fbshipit-source-id: 4ab66634a0e9976ca2c0de4b2841c0b9bf4afd35
2020-05-28 07:24:05 -07:00
Mark Thomas
354c42f32c tests_utils: add_file content can take ownership
Summary:
Change the signature of `CreateCommitContext::add_file` and its associated
functions so that content is `impl Into<String>`, rather than
`impl AsRef<str>`.  The content will immediately be converted to a `String`
anyway, so we can avoid a string copy if the caller already has a string that
can be moved.
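
A hedged sketch of the signature change (free functions standing in for the real `CreateCommitContext` methods):

```
// Before: content was borrowed, then copied into a String.
fn add_file_before(content: impl AsRef<str>) -> String {
    content.as_ref().to_string() // always allocates a copy
}

// After: content is taken by value, so an owned String just moves.
fn add_file_after(content: impl Into<String>) -> String {
    content.into() // a String argument is moved, a &str is copied once
}

fn main() {
    let owned = String::from("file contents");
    let stored = add_file_after(owned); // reuses the existing allocation
    assert_eq!(stored, "file contents");
    let _ = add_file_after("literal"); // &str still works
    let _ = add_file_before("literal");
}
```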

Reviewed By: krallin

Differential Revision: D21743429

fbshipit-source-id: d54914386439489fe4e47e37ff9a75c52b1a0443
2020-05-28 06:22:33 -07:00
Mark Thomas
807b7a4261 test_utils: add drawdag support
Summary:
Add support for drawdag in Mononoke unit tests.  Tests can use ASCII DAGs to construct
commit graphs, and can optionally customize the content of each commit.
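
A sketch of the idea (the `create_from_dag` helper mentioned in the comments is hypothetical; the real entry point in the test utils may be named differently):

```
// Hypothetical test shape; the real helper names in tests_utils may differ.
fn main() {
    // The ASCII graph below describes four commits: A; B with parent A;
    // and C and D, both with parent B.
    let dag = r#"
        C D
        |/
        B
        |
        A
    "#;
    // A hypothetical create_from_dag(ctx, &repo, dag) would parse this,
    // create the commits (optionally with customized file contents), and
    // return a map from name ("A", "B", ...) to ChangesetId.
    println!("{}", dag);
}
```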

Reviewed By: krallin

Differential Revision: D21743431

fbshipit-source-id: 9e6a52d1efe67ef4a5519ed7783f953fef7358f1
2020-05-28 06:22:33 -07:00
Stanislau Hlebik
e4023eaf19 mononoke: remove UseExistingIfAvailable from hg sync job
Summary:
It was used only once for testing push redirection. We no longer need it, so
I'd like to delete it to remove this old code and also to make it easier to
support ManualMove bookmarks.

Differential Revision: D21745630

fbshipit-source-id: 362952d95edb923cc4b60359321b563c1e4961de
2020-05-27 13:38:02 -07:00
Kostia Balytskyi
01695d59e8 commitcloud_backfiller: add tests for forward and reverse fillers
Summary:
This diff extends the integration test for the forward filler to execute queue operations, as well as the core business logic.
It also adds a test for the reverse filler, which does the same, but in a different direction.

Reviewed By: krallin

Differential Revision: D21628705

fbshipit-source-id: fb4ee0ecacc990d073425f3f37f794f74c057ea2
2020-05-26 07:00:12 -07:00
Kostia Balytskyi
56eea5b6f6 commitcloud_backfiller: add reverse filler
Summary:
This diff finally introduces the continuous reverse filler. Specifically, it adds a cli (and underlying wiring) to operate the filler logic on the `reversefillerqueue` table.

To achieve this:
- the filler class is turned into a base class with two subclasses for the forward and reverse fillers
- the main file is renamed from `forwardfiller.py` into `filler.py`, to better reflect the independence of direction.

Reviewed By: krallin

Differential Revision: D21628259

fbshipit-source-id: 5676a162a62f0dc6fe80e6300b72d30370fc80b4
2020-05-26 07:00:12 -07:00
Stanislau Hlebik
7ccd28de14 scs_server: add track_history_across_deletions parameter
Reviewed By: krallin

Differential Revision: D21718738

fbshipit-source-id: b7dd7e05b1773ac1a442d89991549f0f97e1e55b
2020-05-26 05:38:55 -07:00
Alex Hornby
a7a4dcd046 mononoke: add devdb support to integration test runner
Summary: Add devdb support to the integration test runner so that one can more easily inspect mysql test data; this also makes it easier to run tests with a locally modified schema.

Reviewed By: HarveyHunt

Differential Revision: D21645234

fbshipit-source-id: ec75d70ef59f04548c7346a122298567dd09c264
2020-05-26 02:36:12 -07:00
Stefan Filip
a2254714c3 mononoke: update bulkops::fetch_all_public_changesets to return commits in order
Summary:
At first glance people will assume that changesets are returned in the same
order that they were added to the database, or that at least commits are
returned in a deterministic fashion. That didn't happen, because both the
changeset ids and the changeset entries were retrieved without any order.
This diff updates the function to return results in the order they were added
to the database.

Reviewed By: krallin

Differential Revision: D21676663

fbshipit-source-id: 912e6bea0532796b1d8e44e47d832c0420d97bc1
2020-05-21 20:43:45 -07:00
Alex Hornby
83dddd4c6f mononoke: fix test-blobstore_healer.t log expectations with replication enabled
Summary: Fix test log expectations when replication is enabled.

Reviewed By: ikostia

Differential Revision: D21644474

fbshipit-source-id: fe1968994da427e2810be1bdea8fa56387d3f00f
2020-05-20 03:20:33 -07:00
Stanislau Hlebik
e30a12ce58 mononoke: remove getfiles from traffic replay
Reviewed By: farnz

Differential Revision: D21622533

fbshipit-source-id: 3225e287df42c1bac8ad8f67cdb05ec33f27dfdd
2020-05-19 04:43:01 -07:00
Stanislau Hlebik
54c036e8b2 mononoke: record git mapping while doing a push
Summary:
Currently we record them only during pushrebase. Let's record during push as
well.

To simplify things a little bit let's allow only a very simple push case:
1) Single bookmark.
2) All pushed commits should be reachable by this bookmark.

Reviewed By: krallin

Differential Revision: D21451337

fbshipit-source-id: bf2f1e6025ac116fb8096824b7c4c6440d073874
2020-05-18 05:41:41 -07:00
Jun Wu
06f03628aa infinitepush: remove legacy auto pull logic
Summary: The revset autopull now covers the infinitepush autopull logic.

Reviewed By: DurhamG

Differential Revision: D21526664

fbshipit-source-id: 90cfdebc99bb69b3e45eadcbf4b0d764e0cd68c6
2020-05-14 12:47:35 -07:00
Alex Hornby
cd9346f7da mononoke: walker: add --sample-path-regex option
Summary:
Add a --sample-path-regex option for use in the corpus dumper so we can dump out just a subset of directories from a repo.

This is most useful on large repos.

Reviewed By: farnz

Differential Revision: D21325548

fbshipit-source-id: bfda87aa76fbd325e4e01c2df90b5dcfc906a8f6
2020-05-14 04:22:22 -07:00
Alex Hornby
5df1989251 mononoke: walker: corpus dump bytes to inflight area and then move
Summary:
Update the corpus walker to dump the sampled bytes as early as possible to the Inflight area of the output dir, then move them to their final location once the path is known.

When walking large files and manifests this uses a lot less memory than holding the bytes in a map!

The layout is changed to make comparison by file type easier: we get a top-level dir per extension, e.g. all .json files are under FileContent/byext/json

This also reduces the number of bytes taken from the sampling fingerprint used to make directories; 8 was overkill, 3 is enough to limit directory size.

Reviewed By: farnz

Differential Revision: D21168633

fbshipit-source-id: e0e108736611d552302e085d91707cca48436a01
2020-05-14 04:22:22 -07:00
Alex Hornby
288d03af6e mononoke: walker: add corpus dumper for space analysis
Summary:
Add corpus dumper for space analysis

This reuses the path based tracking from compression-benefit and the size sampling from scrub.

The core new functionality is the dump to disk from inside corpus stream.

Reviewed By: StanislavGlebik

Differential Revision: D20815125

fbshipit-source-id: 01fdc9dd69050baa8488177782cbed9e445aa3f7
2020-05-14 02:32:51 -07:00
Jun Wu
39bd5d8634 context: remove "is a remote bookmark or commit, try to 'hg pull' it first" message
Summary:
We now have auto pull logic that covers most unknown rev use-cases. The hint
message is no longer necessary. It's also unclear how to use `hg pull`
correctly. For example, should it be `-r`, `-B remote/foo` or `-B foo`?

Reviewed By: DurhamG

Differential Revision: D21526667

fbshipit-source-id: 40583bfb094e52939130250dd71b96db4d725ad5
2020-05-13 19:27:41 -07:00
Mark Thomas
1110a36017 add test-infinitepush-mutation.t to the mysql tests
Summary:
The `test-infinitepush-mutation.t` test covers the new mutation database, so
add it to the mysql tests.

Reviewed By: krallin

Differential Revision: D21548966

fbshipit-source-id: 0dc1f90129fa61fb6db1c1b5a747efa3d20041f5
2020-05-13 11:00:57 -07:00
Mark Thomas
14dfeecda8 getbundle: include mutations in getbundle response for draft commits
Summary:
When the client pulls draft commits, include mutation information in the bundle
response.

Reviewed By: farnz

Differential Revision: D20871339

fbshipit-source-id: a89a50426fbd8f9ec08bbe43f16fd0e4e3424e0b
2020-05-13 11:00:56 -07:00
Mark Thomas
5774dbde9d unbundle: accept mutation entries and store them in the mutation store
Summary:
Advertise support for `b2x:infinitepushmutation`.  When the client sends us
mutation information, store it in the mutation store.

Reviewed By: mitrandir77

Differential Revision: D20871340

fbshipit-source-id: ab0b3a20f43a7d97b3c51dcc10035bf7115579af
2020-05-13 11:00:56 -07:00
Stanislau Hlebik
d1b8399a16 mononoke: allow overriding lfs params in sync job
Reviewed By: krallin

Differential Revision: D21500773

fbshipit-source-id: b280b6759b0be066025f33bbf0b12a3359d227ba
2020-05-13 01:26:58 -07:00
Arun Kulshreshtha
7514241c38 edenapi_server: add history endpoint
Summary: Add a `/history` endpoint that serves EdenAPI history data. Like the other endpoints, this one currently buffers the response in memory, and will be modified to return a streaming response in a later diff.

Reviewed By: krallin

Differential Revision: D21489463

fbshipit-source-id: 259d2d1b7d700251fe902f1ac741545e5895404a
2020-05-12 16:26:22 -07:00
Arun Kulshreshtha
3ac2032c07 edenapi_server: split up test-edenapi-server.t
Summary: Break up the EdenAPI server integration tests to prevent the test from getting too long.

Reviewed By: krallin

Differential Revision: D21464056

fbshipit-source-id: 076aaf8717547fe9188f40c078d577961c02325d
2020-05-12 16:26:21 -07:00
Arun Kulshreshtha
4af81d590e edenapi_server: add trees endpoint
Summary: Add an endpoint that serves trees. Uses the same underlying logic as the files endpoint, and returns the requested nodes in a CBOR DataResponse.

Reviewed By: krallin

Differential Revision: D21412987

fbshipit-source-id: a9bcc169644a5889c3118a3207130228a5246b2f
2020-05-12 16:26:20 -07:00
Arun Kulshreshtha
40928f027c make_req: take array instead of object as input for data requests
Summary: Change `make_req` to take a JSON array as input when constructing `DataRequest`s instead of a JSON object. This is more correct because DataRequests can include multiple `Key`s with the same path; this cannot be represented as an object since an object is effectively a hash map wherein we would have duplicate keys.
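
For example, a request body along these lines (hedged: the exact key layout the tool expects may differ) can contain two keys with the same path, which an object keyed by path could not represent:

```
use serde_json::json;

fn main() {
    // Two keys sharing the same path but different nodes: representable in
    // an array, impossible as fields of an object keyed by path.
    let request = json!([
        ["fbcode/foo.txt", "1111111111111111111111111111111111111111"],
        ["fbcode/foo.txt", "2222222222222222222222222222222222222222"],
    ]);
    println!("{}", serde_json::to_string_pretty(&request).unwrap());
}
```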

Reviewed By: quark-zju

Differential Revision: D21412989

fbshipit-source-id: 07a092a15372d86f3198bea2aa07b973b1a8449d
2020-05-12 16:26:20 -07:00
Arun Kulshreshtha
955a057e8f edenapi_server: add files endpoint
Summary:
Add an endpoint that serves Mercurial file data.

The data for all files involved is fetched concurrently from Mononoke's backend but in this initial version the results are first buffered in memory before the response is returned; I plan to change this to stream the results in a later diff.

For now this version demonstrates the basic functionality as well as things like ACL enforcement (a valid client identity header with appropriate access permissions must be present for requests to succeed).

Reviewed By: krallin

Differential Revision: D21330777

fbshipit-source-id: c02a70dff1f646d02d75b9fc50c19e79ad2823e6
2020-05-12 16:26:19 -07:00
Thomas Orozco
72b949340f mononoke: pretty-print root-cause
Summary:
Right now, we debug-print the root cause and pretty-print everything else. This
is pretty bad because the root cause is usually the one thing we would want to
pretty-print so we can add instructions there (such as "your hooks failed, fix
it").

This fixes it so we pretty-print the root cause instead, while also
debug-printing the whole error, which gives us more developer-friendly context
and is easier for automation to match on.

This is actually in common/rust ... but we're the only people using it AFAICT.

Reviewed By: StanislavGlebik

Differential Revision: D21522518

fbshipit-source-id: 10158811574b56024e14852229e4541da19d5609
2020-05-12 07:59:42 -07:00
Thomas Orozco
4408577028 mononoke: improve reporting of case conflicts
Summary:
At least let's tell the user what to do about the problem and, where we can,
what the conflicting file was (see the attached task).

Reviewed By: farnz

Differential Revision: D21459412

fbshipit-source-id: 52b90cf7d41ebe6550083c6673b4e93b10edf5e2
2020-05-12 06:44:39 -07:00
Alex Hornby
1c044613f8 mononoke: walker: filter the repo path by node type
Summary:
Not all node types can have an associated path.

Reset the tracked path to None if the route is taking us through a node type that can't have a repo path.

Reviewed By: krallin

Differential Revision: D21228372

fbshipit-source-id: 2b1e291f09232500adce79c630d428f09cd2d2cc
2020-05-11 12:00:59 -07:00
Alex Hornby
d64505bfff mononoke: walker: add --sample-offset so whole repo can be sampled in slices
Summary:
Add a new --sample-offset argument so that, in combination with the existing --sample-rate, the whole repo can be sampled in slices

For --sample-rate=N, this allows us to scrub or corpus dump 1/Nth of the repo at a time, which is particularly useful for corpus dumping on machines with limited disk.

Also factored out the sampling args construction as 3 of the 4 walk variants use them (only validate does not)
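
The slicing condition is presumably along these lines (a hedged sketch; the walker's real fingerprint type and sampling plumbing differ):

```
// Hedged sketch: with --sample-rate=N and --sample-offset=K, a node is
// sampled when its fingerprint falls into slice K, so runs with
// K = 0, 1, ..., N-1 together cover the whole repo exactly once.
fn in_slice(fingerprint: u64, sample_rate: u64, sample_offset: u64) -> bool {
    sample_rate != 0 && fingerprint % sample_rate == sample_offset
}

fn main() {
    let rate = 4u64;
    for fp in 0..16u64 {
        let slices: Vec<u64> = (0..rate).filter(|&k| in_slice(fp, rate, k)).collect();
        assert_eq!(slices.len(), 1); // each node lands in exactly one slice
    }
}
```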

Reviewed By: krallin

Differential Revision: D21158486

fbshipit-source-id: 94f98ceb71c22e0e9d368a563cdb04225b6fc459
2020-05-11 12:00:58 -07:00
Arun Kulshreshtha
1918683317 edenapi_server: add repos endpoint
Summary:
Add a simple `/repos` endpoint that returns the list of repos available in a JSON response.

While the handler itself is quite simple, this diff establishes the general pattern by which new handlers will be added to the server.

Reviewed By: krallin

Differential Revision: D21330778

fbshipit-source-id: 77f57c969c34c8c1f7c94979fac383ec442a1e14
2020-05-08 12:07:02 -07:00
Kostia Balytskyi
3f9ba38f09 unbundle: save infinitepush unbundles into reversefillerqueue
Summary:
We want to be able to record all the bundles Mononoke processes, to be later
replayed by Mercurial.

Reviewed By: krallin

Differential Revision: D21427622

fbshipit-source-id: b88e10e03d07dae35369286fe31022f36a1ee5cf
2020-05-07 05:12:14 -07:00
Lukas Piatkowski
ff2eddaffb mononoke: reverse autocargo include list to excludes
Summary: Cover as much of the remaining code as possible with `Cargo.toml`s; for the rest, create an exclusion list in the autocargo config.

Reviewed By: krallin

Differential Revision: D21383620

fbshipit-source-id: 64cc78a38ce0ec482966f32a2963ab4939e20eba
2020-05-06 08:43:18 -07:00
Mistral Orhan Jean-Pierre Contrastin
5fe820dd06 Expose ctime from Blobstore::get() in mononoke
Summary:
- Change the `get` return value for `Blobstore` from `BlobstoreBytes` to `BlobstoreGetData`, which includes `ctime` metadata
- Update the call sites and tests broken due to this change
- Change `ScrubHandler::on_repair` to accept metadata and log ctime
- `Fileblob` and `Manifoldblob` attach the ctime metadata
- Tests for fileblob in `mononoke:blobstore-test` and integration test `test-walker-scrub-blobstore.t`
- Make cachelib based caching use `BlobstoreGetData`
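
A simplified sketch of the new return type's shape (hedged; the real `BlobstoreGetData` differs in detail):

```
// Simplified stand-ins for the real types.
pub struct BlobstoreBytes(pub Vec<u8>);

pub struct BlobstoreMetadata {
    // Creation time, when the backing store can provide one.
    pub ctime: Option<i64>,
}

// get() now returns bytes plus metadata instead of bare bytes.
pub struct BlobstoreGetData {
    pub metadata: BlobstoreMetadata,
    pub bytes: BlobstoreBytes,
}

impl BlobstoreGetData {
    // Call sites that only want the payload strip the metadata.
    pub fn into_bytes(self) -> BlobstoreBytes {
        self.bytes
    }
}
```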

Reviewed By: ahornby

Differential Revision: D21094023

fbshipit-source-id: dc597e888eac2098c0e50d06e80ee180b4f3e069
2020-05-06 00:55:07 -07:00
Thomas Orozco
cef5d8d956 mononoke: test-linknodes: add more testing
Summary:
This test is flaky right now, but it's not clear why. I'm also unable to repro.
Let's add more logging.

Reviewed By: StanislavGlebik

Differential Revision: D21405284

fbshipit-source-id: 3ce5768066091de61e62339286410a6223d251d5
2020-05-05 10:32:55 -07:00
Thomas Orozco
70007f049e mononoke/lfs_server: add a test for error formatting
Summary:
I'm going to send a diff to get rid of failure chains, and the LFS Server
actually uses them quite a bit. Let's make sure we don't affect the error
rendering there.

Reviewed By: StanislavGlebik

Differential Revision: D21383032

fbshipit-source-id: e0ec9c88760e7fd48d39fa1570efd1870a9ef532
2020-05-05 05:44:52 -07:00
Thomas Orozco
c63ac4a8eb mononoke: fix a broken test
Summary:
Looks like this broke yesterday. There was a Reindeer update yesterday IIRC, so
I'm guessing that's the cause. In any case, this is easy to fix forward.

Reviewed By: farnz

Differential Revision: D21399830

fbshipit-source-id: 5cf33411e089a8c675a8b3fdf7b6ae5ae267058d
2020-05-05 02:47:03 -07:00
Thomas Orozco
cfde4afe90 mononoke/gitimport: support read-only mode
Summary:
This adds support for running Gitimport with `--readonly-storage`. The way we
do this is by masking the various storages we use (blobstore, changesets,
bonsai).

Reviewed By: markbt

Differential Revision: D21347939

fbshipit-source-id: 68084ba0d812dc200776c761afdfe41bab9a6d82
2020-05-04 07:18:02 -07:00
Thomas Orozco
28eee11931 mononoke/gitimport: improve concurrency
Summary:
The original gitimport wasn't really designed for concurrency, since it did
commits one by one. With this update, we can now derive Bonsais from multiple
commits in parallel, and use multiple threads to communicate with the Git
repository (which is actually somewhat expensive when that's all we do).

We also store Bonsais iteratively. There is a bit of extra work that could also
be done here by saving Bonsais asynchronously to the Blobstore, and inserting
a single batch into Changesets once we're finished.

Reviewed By: farnz

Differential Revision: D21347941

fbshipit-source-id: e0ea86bf4d164599df1370844d3f0301d1031801
2020-05-04 07:18:02 -07:00
Thomas Orozco
bc7e31cdd1 mononoke/gitimport: allow deriving a range of commits
Summary:
This adds support for deriving commits within a range in gitimport, which gets
us one step closer to resumable gitimport. The primary goal of this is to
evaluate whether using Gitimport for Configerator might be suitable.

Differential Revision: D21347942

fbshipit-source-id: aa3177466e389ceb675328999ccf836f29912698
2020-05-04 07:18:01 -07:00
Thomas Orozco
57ccda8e9c mononoke/gitimport: add derive hg functionality
Summary:
This adds some basic functionality for deriving hg manifests in gitimport. I'd
like to add this to do some correctness testing on importing Git manifests from
Configerator.

Differential Revision: D21347940

fbshipit-source-id: 6f819fa8a62b3088fb163138fc23910b8f2ff3ce
2020-05-04 07:18:01 -07:00
Harvey Hunt
3cd49f9d3c mononoke: Add tunables - a simple form of config hot reloading
Summary:
Currently, Mononoke's configs are loaded at startup and only refreshed
during restart. There are some exceptions to this, including throttling limits.
Other Mononoke services (such as the LFS server) have their own implementations
of hot reloadable configs; however, there isn't a universally agreed-upon method.

Static configs makes it hard to roll out features gradually and safely. If a
bad config option is enabled, it can't be rectified until the entire tier is
restarted. However, Mononoke's code is structured with static configs in mind
and doesn't support hot reloading. Changing this would require a lot of work
(imagine trying to swap out blobstore configs during run time) and wouldn't
necessarily provide much benefit.

Instead, add a subset of hot reloadable configs called tunables. Tunables are
accessible from anywhere in the code and are cheap to read as they only require
reading an atomic value. This means that they can be used even in hot code
paths.

Currently tunables support reloading boolean values and i64s. In the future,
I'd like to expand tunables to include more functionality, such as a rollout
percentage.

The `--tunables-config` flag points to a configerator spec that exports a
Tunables thrift struct. This allows different tiers and Mononoke services to
have their own tunables. If this isn't provided, `MononokeTunables::default()`
will be used.

This diff adds a proc_macro that will generate the relevant `get` and `update`
methods for the fields added to a struct which derives `Tunables`. This struct is
then stored in a `once_cell` and can be accessed using `tunables::tunables()`.

To add a new tunable, add a field to the `MononokeTunables` struct that is of
type `AtomicBool` or `AtomicI64`. Update the relevant tunables configerator
config to include your new field, with the exact same name.

Removing a tunable from `MononokeTunables` is fine, as is removing a tunable
from configerator.

If the `--tunables-config` path isn't passed, then a default tunables config
located at `scm/mononoke/tunables/default` will be loaded. There is also the
`--disable-tunables` flag that won't load anything from configerator; it
will instead use the `Tunable` struct's `default()` method to initialise it.
This is useful in integration tests.
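
A hedged sketch of the resulting usage pattern (written out by hand here; in the real code the getters and updaters are macro-generated, and the field names are illustrative):

```
use std::sync::atomic::{AtomicBool, AtomicI64, Ordering};

// Sketch of the shape of a tunables struct; in Mononoke this would be
// derived via the Tunables proc_macro rather than hand-written.
#[derive(Default)]
struct MononokeTunables {
    enable_some_feature: AtomicBool,
    some_limit: AtomicI64,
}

impl MononokeTunables {
    // Reads are cheap atomic loads, safe on hot paths.
    fn enable_some_feature(&self) -> bool {
        self.enable_some_feature.load(Ordering::Relaxed)
    }
    fn some_limit(&self) -> i64 {
        self.some_limit.load(Ordering::Relaxed)
    }
    // The hot-reload thread would call updaters like this when the
    // configerator config changes.
    fn update_some_limit(&self, value: i64) {
        self.some_limit.store(value, Ordering::Relaxed);
    }
}

fn main() {
    let tunables = MononokeTunables::default();
    tunables.update_some_limit(42);
    assert!(!tunables.enable_some_feature());
    assert_eq!(tunables.some_limit(), 42);
}
```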

Reviewed By: StanislavGlebik

Differential Revision: D21177252

fbshipit-source-id: 02a93c1ceee99066019b23d81ea308e4c565d371
2020-04-30 16:08:30 -07:00
Kostia Balytskyi
e016ea16a2 commit cloud forwardfiller: add a test to exercise the flow
Summary:
Building on the previous two commits, this adds a test which performs the following steps:
- does an infintiepush push to a Mercurial server
- looks into the `forwardfillerqueue`
- runs commitcloud forwardfiller's `fill-one` to replay the bundle to Mononoke
- verifies that this action causes the commit to appear in Mononoke

As this test uses `getdb.sh` from the Mercurial test suite, it needs to be whitelisted from network blackholing (Note: we whitelist mononoke tests, which run with `--mysql` automatically, but this one is different, so we need to add it manually. See the bottom diff of the stack for why we don't use `--mysql` and ephemeral shards here).

Reviewed By: krallin

Differential Revision: D21325071

fbshipit-source-id: d4d6cbdb10a2bcf955ee371278bf2bbbd5f5122c
2020-04-30 13:00:22 -07:00
Thomas Orozco
48d3c33411 mononoke/microwave: warmup from last derived state
Summary:
Microwave doesn't normally allow writes, which can cause cache warmup to fail
if master has underived commits. So, let's go back in bookmarks history to
whatever is most recent and derived. We can do so using the existing logic we
use in the warm bookmarks cache.

Reviewed By: farnz

Differential Revision: D21325485

fbshipit-source-id: 11e758cd512a22e02704ac34458fead18c284c20
2020-04-30 12:47:35 -07:00
Lukas Piatkowski
764023bc99 mononoke: replace all remaining usages of aclchecker with permission_checker
Summary: The changes to server/context, gotham_ext and the code that depends on them are the only remaining places where aclchecker is used directly, and it is not easy to split this diff to convert them separately.

Reviewed By: krallin

Differential Revision: D21067809

fbshipit-source-id: a041ab141caa6fe6871e1fda6013e33f1f09bc56
2020-04-29 11:57:34 -07:00
Thomas Orozco
4acf76de95 mononoke/admin: add support for showing arbitrary bundles in hg sync
Summary:
It can be quite convenient when you have a bundle ID to be able to quickly
translate it to a Bonsai and get information about said Bonsai in order to e.g.
identify whether it was particularly large. This adds that functionality.

Reviewed By: StanislavGlebik

Differential Revision: D21228409

fbshipit-source-id: fc2ff938ff16e99c88b3e522b7ac39c4f39d60f2
2020-04-27 07:24:41 -07:00
Thomas Orozco
8f6f3b8834 mononoke/admin: asyncify the hg_sync subcommand
Summary:
This code had grown into a pretty big monster of a future. This makes it a bit
easier to modify and work with.

Reviewed By: StanislavGlebik

Differential Revision: D21227210

fbshipit-source-id: 5982daac4d77d60428e80dc6a028cb838e6fade0
2020-04-27 07:24:41 -07:00
Alex Hornby
378559fb29 mononoke: walker: add fsnodes derivation to test blobimport
Summary:
Add --derived-data-type=fsnodes to blobimport in a couple of walker tests so we have test data present to load.

Includes a small change to library.sh to add a default_setup_pre_blobimport entry point used by these tests

Reviewed By: StanislavGlebik

Differential Revision: D21202480

fbshipit-source-id: d7eb3e5736531a11da87d92d0d03a528ff2c91a7
2020-04-24 04:29:52 -07:00
Xavier Deguillard
413d2b3aba remotefilelog: enable uploading LFS blobs
Summary:
This adds the proper hooks in the right place to upload the LFS blobs and write
them to the bundle as LFS pointers. That last part is a bit hacky as we're writing
the pointer manually, but until that code is fully Rust, I don't really see a
good way of doing it.

Reviewed By: DurhamG

Differential Revision: D20843139

fbshipit-source-id: f2ef7b045c6604398b89580b468c354d14de1660
2020-04-23 14:00:23 -07:00
Mark Thomas
6f8737d116 mutationstore: a store for commit mutation information
Summary:
Add the Mononoke Mercurial mutation store.  This stores mutation information
for draft commits so that it can be shared between clients. The mutation
entries themselves are stored in a database, and the mutation store provides
abstractions for adding and querying them.

This commit adds the `all_predecessors` method to the mutation store, which
allows Mononoke to fetch all predecessors for a given set of commits.  This
will be used to serve mutation information to clients who are pulling draft
commits.
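
A hedged sketch of the query's shape (the store layout and signature here are simplified stand-ins, not the real API):

```
use std::collections::{HashMap, HashSet};

// Simplified stand-in for a Mercurial changeset hash.
type HgChangesetId = [u8; 20];

struct HgMutationStore {
    // successor -> its immediate predecessors
    predecessors: HashMap<HgChangesetId, Vec<HgChangesetId>>,
}

impl HgMutationStore {
    /// Transitively collect all predecessors of the given commits, in the
    /// spirit of the real `all_predecessors` used when serving pulls.
    fn all_predecessors(&self, heads: &[HgChangesetId]) -> HashSet<HgChangesetId> {
        let mut seen = HashSet::new();
        let mut stack: Vec<HgChangesetId> = heads.to_vec();
        while let Some(cs) = stack.pop() {
            if let Some(preds) = self.predecessors.get(&cs) {
                for &p in preds {
                    if seen.insert(p) {
                        stack.push(p);
                    }
                }
            }
        }
        seen
    }
}
```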

Reviewed By: krallin

Differential Revision: D20287381

fbshipit-source-id: b4455514cb8a22bef2b9bf0229db87c2a0404448
2020-04-23 08:58:09 -07:00
Mark Thomas
4ceb020d3a log: improve default log output
Summary:
Make the default output for `scsc log` shorter by only including the first line of the commit message, and omitting less interesting fields like commit extras.

The full details are hidden behind a `--verbose` flag, similar to `hg log`.

Reviewed By: mitrandir77

Differential Revision: D21202318

fbshipit-source-id: f15a0f8737f17e3189ea1bbe282d78a9c7199dd9
2020-04-23 08:36:43 -07:00
Stanislau Hlebik
dcf66ebc11 mononoke: add walker for fsnodes
Summary: Make it possible to traverse fsnodes in walker.

Reviewed By: ahornby

Differential Revision: D21153883

fbshipit-source-id: 047ab73466f48048a34cb52e7e0f6d04cda3143b
2020-04-23 01:24:20 -07:00
Alex Hornby
20b268cd68 mononoke: walker: add path to the OutgoingEdge
Summary:
For some nodes like FileContent from a BonsaiChangeset, the file path is not part of the node identity, but it is important for tracking which nodes are related to which paths.

This change adds an optional path field to the OutgoingEdge so that it can be used in route generation and as part of edge identity for sampling.

It's optional as some walks don't need the paths, for example scrub.

Reviewed By: farnz

Differential Revision: D20835653

fbshipit-source-id: f609c953da8bfa0cdfdfb26328149d567c73dbc9
2020-04-22 06:48:52 -07:00
Alex Hornby
de301b108e mononoke: walker: add blobstore usage by key type to scrub progress reporting
Summary: Report stats by the node type (e.g. FileContent, HgManifest etc) for blobstore usage when scrubbing so we can see how large each type is.

Reviewed By: StanislavGlebik

Differential Revision: D20564327

fbshipit-source-id: 55efd7671f893916d8f85fa9a93f95c97a098af4
2020-04-22 06:48:51 -07:00
Kostia Balytskyi
b59886c7f8 mononoke: fix how infinitepush is detected
Summary:
Correctly identify infinitepush without bookmarks as infinitepush instead of plain push.

Current behavior would sometimes pass `infinitepush` bundles through the `push` pipeline. Interestingly, this does not result in any user-visible effects at the moment. However, in the future we may want to diverge these pipelines:
- maybe we want to disable `push`, but enable `infinitepush`
- maybe there will be performance optimizations, applicable only to infinitepush

In any case, the fact that things worked so far is a consequence of a historical accident, and we may not want to keep it this way. Let's have correct identification.

Reviewed By: StanislavGlebik

Differential Revision: D18934696

fbshipit-source-id: 69650ca2a83a83e2e491f60398a4e03fe8d6b5fe
2020-04-22 05:13:36 -07:00
Lukas Piatkowski
449594be46 mononoke/hooks: fix to panic the server when AclChecker is unreachable
Summary:
In the next diffs with permission_checker, the panic is changed to anyhow::Error.

The previous behavior of this code was that when the AclChecker update failed
after 10s this fact was ignored and the hooks were simply not using ACLs. This
diff fixes it so that the server exits when the AclChecker update times out.

Reviewed By: johansglock

Differential Revision: D21155944

fbshipit-source-id: ab4a5071acbe6a1282a7bc5fdbf301b4bd53a347
2020-04-22 02:45:03 -07:00
Alex Hornby
12585058cf mononoke: walker: update compression-benefit to report progress by node type
Summary:
Allows us to see the sizes for each node type (e.g. manifests, bonsais etc), and extends the default reporting to all types.

The progress.rs changes make its summary-by-type reporting reusable; it is then reused by the changes to sizing.rs.

Reviewed By: krallin

Differential Revision: D20560962

fbshipit-source-id: f09b45b34f42c5178ba107dd155abf950cd090a7
2020-04-21 08:29:21 -07:00
Thomas Orozco
aae2721caf mononoke_hg_sync_job: don't fail if the Globalrev counter is where we want it
Summary:
This was how this was supposed to work all along, but there was a bug in the
sense that if the counter is already where we want to set it, then the update affects 0
rows. This is a bit of a MySQL idiosyncrasy — ideally we would set
CLIENT_FOUND_ROWS on our connections in order to be consistent with SQLite.

That said, for now, considering we are the only ones touching this counter, and
considering this code isn't intended to be long-lived, it seems reasonable to
just check the counter after we fail to set it.

(see https://dev.mysql.com/doc/refman/8.0/en/mysql-affected-rows.html for
context)

Reviewed By: HarveyHunt

Differential Revision: D21153966

fbshipit-source-id: 663881c29a11a619ec9ab20c4291734ff13d798a
2020-04-21 06:09:19 -07:00
Alex Hornby
15f98fe58c mononoke: walker: fix flaky integration tests
Summary:
These failed when I did a local run against fbcode warm

count-objects.t and enabled-derive.t were flaky depending on the exact path taken through the graph.

Reviewed By: ikostia

Differential Revision: D21092866

fbshipit-source-id: ac4371cf81128b4d38cd764d86fc45d44d639ecc
2020-04-21 05:23:16 -07:00
Kostia Balytskyi
e7df58e848 mononoke: [RFC] migrate bits of validation code to Small/Large newtypes
Summary:
This is a POC attempt to increase the type safety of the megarepo codebase by introducing the `Small`/`Large` [newtype](https://doc.rust-lang.org/rust-by-example/generics/new_types.html) wrappers for some of the function arguments.

As an example, compare these two function signatures:
```
pub async fn verify_filenode_mapping_equivalence<'a>(
    ctx: CoreContext,
    source_hash: ChangesetId,
    source_repo: &'a BlobRepo,
    target_repo: &'a BlobRepo,
    moved_source_repo_entries: &'a PathToFileNodeIdMapping,
    target_repo_entries: &'a PathToFileNodeIdMapping,
    reverse_mover: &'a Mover,
) -> Result<(), Error>
```
and
```
async fn verify_filenode_mapping_equivalence<'a>(
    ctx: CoreContext,
    source_hash: Large<ChangesetId>,
    large_repo: &'a Large<BlobRepo>,
    small_repo: &'a Small<BlobRepo>,
    moved_large_repo_entries: &'a Large<PathToFileNodeIdMapping>,
    small_repo_entries: &'a Small<PathToFileNodeIdMapping>,
    reverse_mover: &'a Mover,
) -> Result<(), Error>
```

In the first case, it is possible to accidentally call the function with the source and target repo inverted, whereas in the second one it is not.
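
For illustration, a minimal sketch of such newtype wrappers (a generic sketch, not the actual definitions from this diff):
```
// Small<T> and Large<T> wrap the same inner type but are distinct
// types, so swapping a small-repo value for a large-repo one (or
// vice versa) becomes a compile-time error.
pub struct Small<T>(pub T);
pub struct Large<T>(pub T);

impl<T> Small<T> {
    // Unwrapping is explicit, so crossing the boundary is visible.
    pub fn into_inner(self) -> T {
        self.0
    }
}

impl<T> Large<T> {
    pub fn into_inner(self) -> T {
        self.0
    }
}
```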

Reviewed By: StanislavGlebik

Differential Revision: D20463053

fbshipit-source-id: 5f4f9ac918834dbdd75ed78623406aa777950ace
2020-04-20 11:30:05 -07:00
Thomas Orozco
825b0a1daa mononoke/hg_sync: fix off-by-one error in globalrev sync
Summary:
The ID in Hgsql is supposed to be the next globalrev to assign, not the last one
that was assigned. We would have otherwise noticed during the rollout since
we'd have seen that the counter in Mercurial wasn't `globalrev(master) + 1`
(and we could have fixed it up manually before it had any impact), but let's
fix it now.

Reviewed By: StanislavGlebik

Differential Revision: D21089653

fbshipit-source-id: 0a37e1b7299a0606788bd87f788799db6e3d55f4
2020-04-20 07:16:44 -07:00
Stanislau Hlebik
74fe56b5d8 mononoke: fix timeout in integration tests
Reviewed By: krallin

Differential Revision: D21129373

fbshipit-source-id: 7c47e5b5a156babfc8ad9819af44f807a1a036d1
2020-04-20 05:28:52 -07:00
Thomas Orozco
518167f581 mononoke: allow for HgsqlGlobalrevs name not matching the HgsqlName
Reviewed By: ahornby

Differential Revision: D21088256

fbshipit-source-id: 6ed2969d41ade83d1a603e319450be7decd3f151
2020-04-17 06:24:10 -07:00
Thomas Orozco
805a150bb6 mononoke/hook_tailer: support passing a list of changesets to tail
Summary:
This makes it easier to test performance on a specific set of commits. As part
of that, I've also updated our file reading to be async since why not.

Reviewed By: farnz

Differential Revision: D21064609

fbshipit-source-id: d446ab5fb5597b9113dbebecf97f7d9b2d651684
2020-04-17 04:52:28 -07:00
Thomas Orozco
d6d5129fa3 mononoke: add smoke tests for the hook tailer
Summary:
Let's try and make sure this doesn't bitrot again by adding a smoke test. Note
that there are no hooks configured here, so this exercises everything but the
actual hook running, but for now this is probably fine.

Note that this required updating the hook tailer to use the repository config
for the hook manager, since you can't start AclChecker in a test otherwise.

Reviewed By: StanislavGlebik

Differential Revision: D21063378

fbshipit-source-id: c7336bc883dca2722b189449a208e9381196300e
2020-04-17 04:52:27 -07:00
Arun Kulshreshtha
fc741586d2 Add integration test for EdenAPI server
Summary: Add a simple integration test for the EdenAPI server which just starts up a server and hits its health_check endpoint. This will be expanded in later diffs to perform actual testing.

Reviewed By: krallin

Differential Revision: D21054212

fbshipit-source-id: a3be8ddabb3960d709a1e83599bc6a90ebe49b25
2020-04-16 10:03:13 -07:00
Thomas Orozco
87622937f5 mononoke/blobstore_healer: remove more old futures from main
Summary:
This turns out quite nice because we had some futures there that were always
`Ok`, and now we can use `Output` instead of `Item` and `Error`.

Reviewed By: ahornby

Differential Revision: D21063119

fbshipit-source-id: ab5dc67589f79c898d742a276a9872f82ee7e3f9
2020-04-16 09:46:13 -07:00
Kostia Balytskyi
220edc6740 admin: add a subcommand to manipulate mutable_counters
Summary:
This is generally something I wanted to have for a long time: instead of having to open a writable db shell, now we can just use the admin command. Also, this will be easier to document in the oncall wikis.

NB: this is lacking the `delete` functionality atm, but that one is almost never needed.

Reviewed By: krallin

Differential Revision: D21039606

fbshipit-source-id: 7b329e1782d1898f1a8a936bc711472fdc118a96
2020-04-16 03:19:44 -07:00
Xavier Deguillard
4973c55030 exchange: always call prepushoutgoing hooks
Summary:
Previously, an extension adding the "changeset" pushop might forget to call the
prepushoutgoing hooks, preventing them from being called.

Reviewed By: DurhamG

Differential Revision: D21008487

fbshipit-source-id: a6bc506c7e1695854aca3d3b2cd118ef1c390c52
2020-04-15 20:22:18 -07:00
Mark Thomas
235c9a5cd9 getbundle: compute full set of new draft commits
Summary:
In getbundle, we compute the set of new draft commit ids.  This is used to
include tree and file data in the bundle when draft commits are fully hydrated,
and will also be used to compute the set of mutation information we will
return.

Currently this calculation only computes the non-common draft heads.  It
excludes all of the ancestors, which should be included.  This is because it
re-uses the prepare_phases code, which doesn't quite do what we want.

Instead, separate out these calculations into two functions:

  * `find_new_draft_commits_and_public_roots` finds the draft heads
    and their ancestors that are not in the common set, as well as the
    public roots the draft commits are based on.
  * `find_phase_heads` finds and generates phase head information for
    the public heads, draft heads, and the nearest public ancestors of the
    draft heads.
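
A rough sketch of the first of these two computations (the `Id` type and graph lookups are hypothetical stand-ins for the actual repo APIs):
```
use std::collections::HashSet;

type Id = u64; // stand-in for a changeset id

// Walk from the draft heads towards their ancestors, stopping at
// anything the client already has (common) or anything public.
fn find_new_draft_commits_and_public_roots(
    draft_heads: &[Id],
    common: &HashSet<Id>,
    is_public: impl Fn(Id) -> bool,
    parents: impl Fn(Id) -> Vec<Id>,
) -> (HashSet<Id>, HashSet<Id>) {
    let mut drafts = HashSet::new();
    let mut public_roots = HashSet::new();
    let mut stack: Vec<Id> = draft_heads.to_vec();
    while let Some(id) = stack.pop() {
        if common.contains(&id) || drafts.contains(&id) {
            continue; // already known to the client, or already visited
        }
        if is_public(id) {
            // A public commit the drafts are based on.
            public_roots.insert(id);
            continue;
        }
        drafts.insert(id);
        // Unlike the old code, ancestors are included, not just heads.
        stack.extend(parents(id));
    }
    (drafts, public_roots)
}
```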

Reviewed By: StanislavGlebik

Differential Revision: D20871337

fbshipit-source-id: 2f5804253b8b4f16b649d737f158fce2a5102002
2020-04-15 11:00:33 -07:00
Xavier Deguillard
643e69e045 remotefilelog: do not write delta in bundle2
Summary:
Computing a delta forces the client to have the previous version locally, which it
may not have, forcing a full fetch of the blob just to compute the delta. Since
deltas are a way to save on bandwidth usage, fetching a blob to compute one
negates its benefits.

Reviewed By: DurhamG

Differential Revision: D20999424

fbshipit-source-id: ae958bb71e6a16cfc77f9ccebd82eec00ffda0db
2020-04-15 10:26:39 -07:00
Kostia Balytskyi
cf10fe8689 admin: make sure bookmark operations create syncable log entries
Summary:
This is important for various syncs.

Note: there's an obvious race condition: the TOCTTOU window is non-zero for existing bookmark locations. I don't think this is a problem, as we can always re-run the admin.

Reviewed By: StanislavGlebik

Differential Revision: D21017448

fbshipit-source-id: 1e89df0bb33276a5a314301fb6f2c5049247d0cf
2020-04-15 04:17:42 -07:00
Aida Getoeva
25eff1c91e mononoke/scs-log: integrate deleted manifest (linear)
Summary:
Use the deleted manifest to search deleted paths in repos with linear history. For merged history it returns an error, as if there were no such path.
The commit where the path was deleted is returned as the first commit in the history stream; the rest is the history before the deletion.

Reviewed By: StanislavGlebik

Differential Revision: D20897083

fbshipit-source-id: e75e53f93f0ca27b51696f416b313466b9abcee8
2020-04-14 18:27:39 -07:00
Kostia Balytskyi
f31680f160 admin: change "blacklisted" to "redacted" in admin and tests
Summary:
Some time ago we decided on the "redaction" naming for this feature. A few
places were left unfixed.

Reviewed By: xavierd

Differential Revision: D21021354

fbshipit-source-id: 18cd86ae9d5c4eb98b843939273cfd4ab5a65a3a
2020-04-14 16:18:35 -07:00
Thomas Orozco
69b09c0854 mononoke/hg_sync_job: use hgsql name in integration test
Summary: What it says in the title.

Reviewed By: farnz

Differential Revision: D20943176

fbshipit-source-id: 8fae9b0bad32e2b6ede3c02305803c857c93f5e7
2020-04-14 10:26:11 -07:00
Thomas Orozco
d84ba9caae mononoke/hg_sync_job: use the hgsql repo name for globalrevs
Summary:
The name for a repository in hgsql might not match that of the repository itself.
Let's use the hgsql repo name instead of the repo name for syncing globalrevs.

Reviewed By: farnz

Differential Revision: D20943175

fbshipit-source-id: 605c623918fd590ba3b7208b92d2fedf62062ae1
2020-04-14 10:26:10 -07:00
Simon Farnsworth
92fce3d518 Clean out unused deps from our TARGETS files
Summary:
We had accumulated lots of unused dependencies, and had several test_deps in deps instead. Clean this all up to reduce build times and speed up autocargo processing.

The net removal is around 500 unneeded dependency lines, which represented false dependencies; by removing them, we should get more parallelism in dev builds, and less overbuilding in CI.

Reviewed By: krallin, StanislavGlebik

Differential Revision: D20999762

fbshipit-source-id: 4db3772cbc3fb2af09a16601bc075ae8ed6f0c75
2020-04-14 03:38:11 -07:00
Thomas Orozco
77149d7ee8 mononoke/lfs_server: don't return a 502 on batch error
Summary:
502 made a bit of sense since we can occasionally proxy things to upstream, but
it's not very meaningful because our inability to service a batch request is
never fully upstream's fault (it would not be a failure if we had everything
internally).

So, let's just return a 500, which makes more sense.

Reviewed By: farnz

Differential Revision: D20897250

fbshipit-source-id: 239c776d04d2235c95e0fc0c395550f9c67e1f6a
2020-04-08 11:58:09 -07:00
Stefan Filip
d1ba21803a version: warn users when they are running an old build
Summary:
Old is defined by being based on a commit that is more than 30 days old.
The build date is taken from the version string.
One observation is that if we fail to release in more than 30 days then all
users will start seeing this message without any way of turning it off. It doesn't
seem worthwhile to add a config for silencing it, though.

Reviewed By: quark-zju

Differential Revision: D20825399

fbshipit-source-id: f97518031bbda5e2c49226f3df634c5b80651c5b
2020-04-07 14:25:38 -07:00
Stanislau Hlebik
b2a8862a9a mononoke: add a test backfill derived data
Summary:
I decided to go with an integration test because backfilling derived data at the
moment requires two separate calls - a first one to prefetch changesets, and a
second one to actually run the backfill. So an integration test is better suited for this
case than unit tests.

While doing so I noticed that fetch_all_public_changesets actually won't fetch
all changesets - it loses the last commit because t_bs_cs_id_in_range was
returning an exclusive range (i.e. max_id was not included). I fixed the bug and made the name clearer.

Reviewed By: krallin

Differential Revision: D20891457

fbshipit-source-id: f6c115e3fcc280ada26a6a79e1997573f684f37d
2020-04-07 08:44:25 -07:00
Thomas Orozco
edadb9307a mononoke/repo_client: record depth
Summary: As it says in the title!

Reviewed By: HarveyHunt

Differential Revision: D20869828

fbshipit-source-id: df7728ce548739ef2dadad1629817fb56c166b66
2020-04-07 04:36:06 -07:00
Thomas Orozco
1c982d5258 mononoke/unbundle_replay: report size of the unbundle
Summary: This is helpful to draw conclusions as to how fast it is.

Reviewed By: StanislavGlebik

Differential Revision: D20872108

fbshipit-source-id: d323358bbba29de310d6dfb4c605e72ce550a019
2020-04-07 01:05:32 -07:00
Mateusz Kwapich
549eb41059 run blobstore healer integration test with mysql
Summary: So we're sure that all the queries work not only in sqlite.

Reviewed By: krallin

Differential Revision: D20839958

fbshipit-source-id: 9d05cc175d65396af7495b31f8c6958ac7bd8fb6
2020-04-06 09:57:24 -07:00
Alex Hornby
7060cd47d6 mononoke: walker: use sampling blobstore in compression-benefit
Summary:
Use the new sampling blobstore and sampling key in the existing compression-benefit subcommand and check the new vs old reported sizes.

The overall idea for these changes is that the walker uses a CoreContext tagged with a SamplingKey to correlate walker steps for a node to the underlying blobstore reads; this allows us to track the overall byte size (used in scrub stats) or the data itself (used in compression benefit) per node type.

The SamplingVisitor and NodeSamplingHandler cooperate to gather the sampled data into the maps in NodeSamplingHandler, which the output stream from the walk then operates on, e.g. to compress the blobs and report on compression benefit.

The main new logic sits in sampling.rs; it is used from sizing.rs (and later in this stack, from scrub.rs).

Reviewed By: krallin

Differential Revision: D20534841

fbshipit-source-id: b20e10fcefa5c83559bdb15b86afba209c63119a
2020-04-02 09:08:05 -07:00
Mark Thomas
640f272598 migrate from sql_ext::SqlConstructors to sql_construct
Summary:
Migrate the configuration of sql data managers from the old configuration using `sql_ext::SqlConstructors` to the new configuration using `sql_construct::SqlConstruct`.

In the old configuration, sharded filenodes were included in the configuration of remote databases, even when that made no sense:
```
[storage.db.remote]
db_address = "main_database"
sharded_filenodes = { shard_map = "sharded_database", shard_num = 100 }

[storage.blobstore.multiplexed]
queue_db = { remote = {
    db_address = "queue_database",
    sharded_filenodes = { shard_map = "valid_config_but_meaningless", shard_num = 100 }
}
```

This change separates out:
* **DatabaseConfig**, which describes a single local or remote connection to a database, used in configuration like the queue database.
* **MetadataDatabaseConfig**, which describes the multiple databases used for repo metadata.

**MetadataDatabaseConfig** is either:
* **Local**, which is a local sqlite database, the same as for **DatabaseConfig**; or
* **Remote**, which contains:
    * `primary`, the database used for main metadata.
    * `filenodes`, the database used for filenodes, which may be sharded or unsharded.

More fields can be added to **RemoteMetadataDatabaseConfig** when we want to add new databases.

New configuration looks like:
```
[storage.metadata.remote]
primary = { db_address = "main_database" }
filenodes = { sharded = { shard_map = "sharded_database", shard_num = 100 } }

[storage.blobstore.multiplexed]
queue_db = { remote = { db_address = "queue_database" } }
```

The `sql_construct` crate facilitates this by providing the following traits:

* **SqlConstruct** defines the basic rules for construction, and allows construction based on a local sqlite database.
* **SqlShardedConstruct** defines the basic rules for construction based on sharded databases.
* **FbSqlConstruct** and **FbShardedSqlConstruct** allow construction based on unsharded and sharded remote databases on Facebook infra.
* **SqlConstructFromDatabaseConfig** allows construction based on the database defined in **DatabaseConfig**.
* **SqlConstructFromMetadataDatabaseConfig** allows construction based on the appropriate database defined in **MetadataDatabaseConfig**.
* **SqlShardableConstructFromMetadataDatabaseConfig** allows construction based on the appropriate shardable databases defined in **MetadataDatabaseConfig**.

Sql database managers should implement:

* **SqlConstruct** in order to define how to construct an unsharded instance from a single set of `SqlConnections`.
* **SqlShardedConstruct**, if they are shardable, in order to define how to construct a sharded instance.
* If the database is part of the repository metadata database config, either of:
    * **SqlConstructFromMetadataDatabaseConfig** if they are not shardable.  By default they will use the primary metadata database, but this can be overridden by implementing `remote_database_config`.
    * **SqlShardableConstructFromMetadataDatabaseConfig** if they are shardable.  They must implement `remote_database_config` to specify where to get the sharded or unsharded configuration from.
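
As a rough sketch, an unsharded store might implement the construction trait along these lines (the crate paths and exact trait items here are assumptions based on the summary above, not verified signatures):
```
use sql_construct::SqlConstruct; // assumed crate path
use sql_ext::SqlConnections; // assumed crate path

pub struct MyMetadataStore {
    connections: SqlConnections,
}

impl SqlConstruct for MyMetadataStore {
    // Label used for stats and error reporting (assumed trait item).
    const LABEL: &'static str = "my_metadata_store";
    // Schema applied when building a local sqlite instance (assumed).
    const CREATION_QUERY: &'static str =
        "CREATE TABLE IF NOT EXISTS my_metadata (id INTEGER PRIMARY KEY)";

    fn from_sql_connections(connections: SqlConnections) -> Self {
        Self { connections }
    }
}
```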

Reviewed By: StanislavGlebik

Differential Revision: D20734883

fbshipit-source-id: bb2f4cb3806edad2bbd54a47558a164e3190c5d1
2020-04-02 05:27:16 -07:00
Jun Wu
2aec2dbcb6 commitcloud: migrate to tech-debt-free repo.pull for pulling
Summary:
The new API does not do anything cloud sync does not want (bookmarks, obsmarkers,
prefetch, etc.), so the wrappers that disabled those features are removed.

This solves a "lagged master" issue where selectivepull adds `-B master` to
pull extra commits but cloud sync cannot hide them without narrow-heads. Now
cloud sync just does not pull the extra commits.

Reviewed By: sfilipco

Differential Revision: D20808884

fbshipit-source-id: 0e60d96f6bbb9d4ce02c04e8851fc6bda442c764
2020-04-01 19:40:57 -07:00
Lukas Piatkowski
bf34f084d0 mononoke: make blobrepo and its dependencies OSS buildable
Reviewed By: markbt

Differential Revision: D20495840

fbshipit-source-id: 3bbefae1923dc84e3daea158a24c0d2a802cc9a9
2020-03-31 04:02:45 -07:00
Thomas Orozco
11af551491 mononoke/benchmark_filestore: make it work again
Summary:
This had bitrotted due to two different changes:

- D19473960 put it on a v2 runtime, but the I/O is v1 so that doesn't work (it
  panics).
- The clap update a couple months ago made duplicate arguments illegal, and a
  month before that we had put `debug` in the logger args (arguably where it
  belongs), so this binary was now setting `debug` twice, which would now panic.

Evidently, just static typing wasn't quite enough to keep this working through
time (though that's perhaps due to the fact that both of those changes were
invisible to the type system), so I also added a smoke test for this.

Reviewed By: farnz

Differential Revision: D20618785

fbshipit-source-id: a1bf33783885c1bb2fe99d3746d1b73853bcdf38
2020-03-30 07:32:20 -07:00
Thomas Orozco
8315336b2c mononoke/unbundle_replay: run hooks
Summary:
As the name indicates, this updates unbundle_replay to run hooks. Hook failures
don't block the replay, but they're logged to Scuba.

Differential Revision: D20693851

fbshipit-source-id: 4357bb0d6869a658026dbc5421a694bc4b39816f
2020-03-30 06:25:08 -07:00
Thomas Orozco
066cdcfb3d mononoke/unbundle_replay: also report recorded duration
Summary: This will make it easier to compare performance.

Differential Revision: D20674164

fbshipit-source-id: eb1a037b0b060c373c1e87635f52dd228f728c89
2020-03-30 06:25:07 -07:00
Thomas Orozco
213276eff5 mononoke/unbundle_replay: add Scuba reporting
Summary: This adds some Scuba reporting to unbundle_replay.

Differential Revision: D20674162

fbshipit-source-id: 59e12de90f5fca8a7c341478048e68a53ff0cdc1
2020-03-30 06:25:07 -07:00
Thomas Orozco
13f24f7425 mononoke/unbundle_replay: unbundle concurrently, derive filenodes concurrently
Summary:
This updates unbundle_replay to do things concurrently where possible.
Concretely, this means we ingest unbundles concurrently, derive filenodes
concurrently, and only do the actual pushrebase sequentially. This
lets us get ahead on work wherever we can, and makes the process faster.

Doing unbundles concurrently isn't actually guaranteed to succeed, since it's
*possible* that an unbundle coming in immediately after a pushrebase actually
depends on the commits created in said pushrebase. In this case, we simply retry
the unbundle when we're ready to proceed with the pushrebase (in the code, this
is the `Deferred` variant). This is fine from a performance perspective.

As part of this, I've also moved the loading of the bundle to processing, as
opposed to the hg recording client (the motivation for this is that we want to
do this loading in parallel as well).

This will also let us run hooks in parallel once I add this in.

Reviewed By: StanislavGlebik

Differential Revision: D20668301

fbshipit-source-id: fe2c62ca543f29254b4c5a3e138538e8a3647daa
2020-03-30 06:25:07 -07:00
Thomas Orozco
8ce3d94187 mononoke/unbundle_replay: add support for replaying a bookmark
Summary:
This adds support for replaying the updates to a bookmark through unbundle
replay. The goal is to be able to run this as a process that keeps a bookmark
continuously updated.

There is still a bit of work here, since we don't yet allow the stream to pause
until a bookmark update becomes available (i.e. once caught up, it will exit).
I'll introduce this in another diff.

Note that this is only guaranteed to work if there is a single bookmark in the
repo. With more, it could fail if a commit is first introduced in a bookmark that
isn't the one being replayed here, and later gets introduced in said bookmark.

Reviewed By: StanislavGlebik

Differential Revision: D20645159

fbshipit-source-id: 0aa11195079fa6ac4553b0c1acc8aef610824747
2020-03-30 06:25:04 -07:00
Thomas Orozco
6b1894cec9 mononoke/unbundle_replay: derive filenodes
Summary:
We normally derive those lazily when accepting pushrebase, but we do derive
them eagerly in blobimport. For now, let's be consistent with blobimport.

This ensures that we don't lazily generate them, which would require read traffic,
and gives a picture a little more consistent with what an actual push would look like.

Reviewed By: ikostia

Differential Revision: D20623966

fbshipit-source-id: 2209877e9f07126b7b40561abf3e6067f7a613e6
2020-03-30 06:25:04 -07:00
Thomas Orozco
7ca14665a2 mononoke/unbundle_replay: use repo pushrebase hooks
Summary:
This updates unbundle_replay to account for pushrebase hooks, notably to assign
globalrevs.

To do so, I've extracted the creation of pushrebase hooks in repo_client and
reused it in unbundle_replay. I also had to update unbundle_replay to no longer
use `args::get_repo` since that doesn't give us access to the config (which we
need to know what pushrebase hooks to enable).

Reviewed By: ikostia

Differential Revision: D20622723

fbshipit-source-id: c74068c920822ac9d25e86289a28eeb0568768fc
2020-03-30 06:25:03 -07:00
Thomas Orozco
3804f1ca16 mononoke: introduce unbundle_replay
Summary:
This adds a unbundle_replay Rust binary. Conceptually, this is similar to the
old unbundle replay Python script we used to have, but there are a few
important differences:

- It runs fully in-process, as opposed to pushing to a Mononoke host.
- It will validate that the pushrebase being produced is consistent with what
  is expected before moving the bookmark.
- It can find sources to replay from the bookmarks update log (which is
  convenient for testing).

Basically, this is to writes and to the old unbundle replay mechanism what
Fastreplay is to reads and to the traffic replay script.

There is still a bit of work to do here, notably:

- Make it possible to run this in a loop to ingest updates iteratively.
- Run hooks.
- Log to Scuba!
- Add the necessary hooks (notably globalrevs)
- Set up pushrebase flags.

I would also like to see if we can disable the presence cache here, which would
let us also use this as a framework for benchmarking work on push performance,
if / when we need that.

Reviewed By: StanislavGlebik

Differential Revision: D20603306

fbshipit-source-id: 187c228832fc81bdd30f3288021bba12f5aca69c
2020-03-30 06:25:03 -07:00
Aida Getoeva
5e7c092cba mononoke/scs-log: do not accept timestamps <= 0
Summary:
SCS log accepts two dates/timestamps to filter history by the commit creation time. Each timestamp is an `i64`, and a zero or negative timestamp still represents a perfectly valid time in the past.
The time filters are pretty expensive: they require sequential changeset info fetching and checking the date.

It turned out that some of the requests have time filters but don't seem to mean it: both their after and before timestamps equal zero. And there are lots of such queries: https://fburl.com/scuba/mononoke_scs_server/g345na72. This causes SCS log to traverse the whole history for a path, which turns into hours of fetching cs infos and fastlog batches.

I've decided to consider only timestamps greater than 0 as valid: only after 1970-01-01 00:00:00 UTC.
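
The resulting check is tiny; a sketch (the function name is hypothetical):
```
// Treat only strictly positive timestamps (after the Unix epoch)
// as real time filters; zero or negative values are rejected.
fn is_valid_time_filter(ts: i64) -> bool {
    ts > 0
}
```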

Reviewed By: StanislavGlebik

Differential Revision: D20670210

fbshipit-source-id: f59c425779a37ecac489dbba2ed3fd547987ee62
2020-03-26 14:06:31 -07:00
Jun Wu
8bd55436a4 pull: automatically pull selective bookmarks unless it's a no-argument pull
Summary:
This enforces certain selective pull logic in core. Namely, it rewrites `pull -r X`
to `pull -r X -B master`.

Unlike selectivepull in remotenames, `pull` (which pulls everything) won't be
rewritten to `pull -B master` (which pulls fewer commits and names).

Therefore this change always adds more commits to pull, and so should not
break existing users. Eventually we want the "not pulling everything" behavior,
but right now this just fixes `pull -r X` to also update important remote names.

Reviewed By: markbt

Differential Revision: D20531121

fbshipit-source-id: af457b5ddb1265b61956eb2ee6afb7b7208293e0
2020-03-26 10:54:09 -07:00
Aida Getoeva
54575f37be mononoke/scs: separate blame integration test from test-scs.t
Summary:
This diff moves blame integration tests out from the main `test-scs.t`.
The change makes `test-scs.t` be able to complete and not time out anymore.

Reviewed By: ikostia

Differential Revision: D20629281

fbshipit-source-id: 67ba047442e7216a8addd0945c94d2f932eca08a
2020-03-25 09:00:20 -07:00
Aida Getoeva
aa5da4eaa3 scs: separate lookup integration test from test-scs.t
Summary: This diff moves lookup integration tests out of `test-scs.t`.

Reviewed By: krallin

Differential Revision: D20620537

fbshipit-source-id: a8e1020901271b0e66dd4caa43ad3eddbf887a41
2020-03-25 09:00:20 -07:00
Aida Getoeva
fde579782f scs: fix the blank line in test-scs
Reviewed By: markbt

Differential Revision: D20628121

fbshipit-source-id: 6a9d72eab4d2ce8deb4640225080de32a96b9caf
2020-03-24 13:50:24 -07:00
Aida Getoeva
f06fed3a79 scs: fix test-scs.t
Summary: Fix test that was broken by D20557896

Reviewed By: krallin

Differential Revision: D20619387

fbshipit-source-id: 3fd7ee501144528b18b7162f73dcf3d251fd5f2f
2020-03-24 10:21:05 -07:00
Aida Getoeva
1306d4f4e7 scs: enable Scuba logging for perf counters
Summary:
Mononoke has performance counters that allow logging the number of blobstore gets and puts, Memcache hits and misses, etc. These metrics are very useful for debugging issues in production.

The APIServer logs perf counters, but SCS apparently doesn't, so this diff adds the logging.

Reviewed By: mitrandir77

Differential Revision: D20598544

fbshipit-source-id: 5071434a90ae6bf356326a97f1d0593901912ef7
2020-03-23 19:01:00 -07:00
Aida Getoeva
66f007f083 scs: separate scs-diff test
Summary: Move integration tests for `scsc diff` to the `test-scs-diff.t`.

Reviewed By: krallin

Differential Revision: D20557896

fbshipit-source-id: 201b6b5046babfef56db87c0cce70ab8cb6ae62c
2020-03-23 08:36:51 -07:00
Aida Getoeva
4819a08901 scs: separate log test
Summary:
`test-scs.t` is very big and takes too much time to run.
I'm moving integration tests for `scsc log` to the `test-scs-log.t` file.

Reviewed By: krallin

Differential Revision: D20537505

fbshipit-source-id: 8f4a06ad4b48f34eb131d095ec21bd2d08cfe9d9
2020-03-23 08:36:50 -07:00
Aida Getoeva
4815913d9e mononoke/scs: use changeset info in changeset context
Summary:
Changeset info is less expensive to load than a bonsai changeset, so we would like to use it in SCS as the source of commit info where possible.

This diff adds a method to the Repo object that checks whether `changeset_info` derivation is enabled for the repo in the `DerivedDataConfig`. If derivation is enabled, then SCS derives this info; otherwise it waits for the bonsai and converts it into the changeset info. The bonsai fields aren't copied but moved into the `ChangesetInfo`.

Reviewed By: StanislavGlebik

Differential Revision: D20282403

fbshipit-source-id: b8ddad50dcd5c6de109728b2081ca5a13f440988
2020-03-19 12:16:40 -07:00
Thomas Orozco
956c768095 mononoke/repo_client: add telemetry for designated nodes
Summary:
Now that Arun is about to roll this out to the team, we should get some more
logging in place server side. This updates the designated nodes handling code
to report whether it was enabled (and log prior to the request as well).

Reviewed By: HarveyHunt

Differential Revision: D20514429

fbshipit-source-id: 76ce62a296fe27310af75c884a3efebc5f210a8a
2020-03-18 12:57:34 -07:00
Simon Farnsworth
a908be34b3 Modernise hooks support
Summary: Migrate hooks to new futures and thus modern tokio. In the process, replace Lua hooks with Rust hooks, and add fixes for the few cases where Lua was too restrictive about what could be done.

Reviewed By: StanislavGlebik

Differential Revision: D20165425

fbshipit-source-id: 7bdc6820144f2fdaed653a34ff7c998913007ca2
2020-03-18 09:17:17 -07:00
Harvey Hunt
d1b4f83bf5 mononoke: Log number of possible LFS fetches for a getpack request.
Summary:
Update the `getpack` code to calculate how many files (and their total
size) would be served over LFS.

NOTE: The columns have `Possible` in their names as we might not have LFS
enabled, in which case we aren't actually fetching this many blobs from an LFS
server.

Reviewed By: farnz

Differential Revision: D20444137

fbshipit-source-id: 85506d8c468cfdc470684dd216567f1848c43d08
2020-03-16 14:11:49 -07:00
Stanislau Hlebik
5790f7e176 mononoke: add lfs rollout percentage
Summary:
Allow gradually rolling out LFS. A lot of the details are covered in D20441254,
so I won't repeat them here. I'd only mention that in order for fastreplay to
correctly calculate percentages, this diff starts to log client_hostname for
fastreplay.
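
For illustration, a stable percentage rollout keyed on the client hostname could look like this (the hashing scheme is an assumption, not the actual mechanism):
```
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// The same hostname always lands in the same bucket, so a given
// client consistently gets (or doesn't get) LFS during the rollout.
fn lfs_enabled_for(hostname: &str, rollout_percentage: u64) -> bool {
    let mut hasher = DefaultHasher::new();
    hostname.hash(&mut hasher);
    (hasher.finish() % 100) < rollout_percentage
}
```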

Reviewed By: ikostia

Differential Revision: D20441264

fbshipit-source-id: e272176f68879f6c545784609799d21daedec5eb
2020-03-16 08:18:41 -07:00
Mateusz Kwapich
0cb4e3eca6 return generation numbers
Summary:
let's return the generation numbers as part of commit info - they are cheap to
obtain.

Reviewed By: StanislavGlebik

Differential Revision: D20426744

fbshipit-source-id: 50c7017c55aeba04fb9059e2c1db19f2fb0a6e5e
2020-03-13 08:31:07 -07:00
Kostia Balytskyi
23dee89ed0 mononoke: make commit validator manifest-diff-based
Summary:
Currently, the x-repo commit validator runs a full working copy comparison every time. This is slow, as it requires fetching full manifests for the commit versions in both the small and the large repo. For the leaves that differ, filenodes are also fetched so that `ContentId`s can be compared. Because it's slow, we skip the majority of commits and validate only 1 in ~40 `bookmarks_update_log` entries. Obviously, this is not ideal.

To address this, this diff introduces a new way to run validation: based on comparing full manifest diffs. For each entry of the `bookmarks_update_log` of the large repo, we:
- unfold it into a list of commits it introduces into the repository
- fetch each commit
- see how it rewrites to small repos
- compute manifest differences between the commit and its parent in each involved repo
- compare those differences
- check that the topological order relationship remains sane (i.e. if `a` is a p1 of `b` in a small repo, then `a'` must be a p1 of `b'` in a large repo, where `a'` and `b'` are remappings of `a` and `b`)
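
A tiny sketch of that last check in isolation (identifiers and lookups are hypothetical stand-ins):
```
type Id = u64; // stand-in for a changeset id

// If `a` is the p1 of `b` in the small repo, then the remapping of
// `a` must be the p1 of the remapping of `b` in the large repo.
fn p1_relationship_sane(
    a: Id,
    b: Id,
    small_p1: impl Fn(Id) -> Option<Id>,
    large_p1: impl Fn(Id) -> Option<Id>,
    remap: impl Fn(Id) -> Id,
) -> bool {
    match small_p1(b) {
        Some(p) if p == a => large_p1(remap(b)) == Some(remap(a)),
        _ => true, // premise doesn't hold; nothing to check
    }
}
```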

In addition, this diff adds some integration tests and gets rid of the skipping logic.

Reviewed By: StanislavGlebik

Differential Revision: D20007320

fbshipit-source-id: 6e4647e9945e1da40f54b7f5ed79651927b7b833
2020-03-13 07:46:48 -07:00
Stanislau Hlebik
e3bf91e944 mononoke: add generate_placeholder_diff option
Summary:
Let's use functionality added in D20389226 so that we can generate
diffs even for large files.

I contemplated two approaches: either "silently" generate a
placeholder diff for files that are over the limit, or add a new option
where the client can request these placeholders. I chose the latter for a few
reasons:

1) To me option #1 might cause surprises, e.g. diffing a single large file
doesn't fail, but diffing two files whose size is (LIMIT / 2 + 1) will fail.
Option #2 lets the client be very explicit about what it needs - and it also
won't return a placeholder when the actual content is expected!
2) Option #2 makes the client think about what it wants returned,
and that seems in line with source control thrift API design in general, i.e.
commit_file_diffs is non-trivial to use by design, because it forces the client to
think about edge cases. And it seems that forcing the client to think about an additional
important edge case (i.e. large files) makes sense.
3) The code change is simpler with option #2.

I also thought about adding generate_placeholder_diff parameter to CommitFileDiffsParams,
but that makes integration with scs_helper.py harder.

Reviewed By: markbt

Differential Revision: D20417787

fbshipit-source-id: ab9b32fd7a4768043414ed7d8bf39e3c6f4eb53e
2020-03-12 12:00:26 -07:00
Thomas Orozco
62a027768f mononoke: add a test for BFS fetching over SSH
Summary:
This adds a test demonstrating that we can perform BFS fetching over SSH. The
test should demonstrate that the fetches are:

- Done in a BFS fashion (we don't fetch the entire trees before comparing them,
  instead we do a fetch at each layer in the tree). In particular, note that
  the cc tree, which is unchanged, doesn't get explored at all.
- Done using the right paths / filenodeids, and therefore the right linknodes
  are located.

Reviewed By: farnz

Differential Revision: D20387124

fbshipit-source-id: b014812b0e6e85a5cdf6abefe3fe4f47b004461e
2020-03-12 11:16:07 -07:00
Thomas Orozco
399fd6c573 mononoke/{edenapi,lfs}_server: update to new Hyper, new Bytes, new Gotham
Summary:
This updates the lfs server and eden api server to use a newer version of
Gotham, which comes along with an updated version of Bytes and Hyper.

A few things had to change for this:

- New bytes don't support concatenation, so we need to fold them ourselves,
  except...
- ... new Hyper bodies don't tell you how big they are (either in requests or
  responses), so we need to inspect headers to find the size instead (I added
  this in `gotham_ext::body_ext::BodyExt`, although it arguably belongs more in
  a `hyper_ext` crate, but creating a new crate for just this seems overkill).
- New Hyper requires its data stream to be `Sync` for reasons that have more to
  do with developer experience than runtime
  (https://github.com/hyperium/hyper/pull/1857). Unfortunately, our Filestore
  streams aren't `Sync`, because our `BoxFuture` contains a `dyn Future` that
  isn't explicitly `Sync` (which is how we pull things out of blobstores). Even
  if `BoxFuture` contained a `Sync` future, that still wouldn't be enough
  anyway, because `compat()` explicitly implements `!Sync` on the stream it
  returns. I'll ask upstream in Hyper if this can possibly change in the
  future, but for now we can work around it by wrapping the stream in a
  channel. I'll keep an eye out for performance here.
- When I updated our "pre state data" tweaks on top of Gotham, I renamed those
  to "socket data", since that's a better name or what they are (hence the
  changes here).
- I updated the lfs_protocol to stop depending on Hyper and instead depend on
  http, since that's all we need here.

As you review this, please pay close attention to the updated implementation of
`SignalStream`. Since this is a custom `Stream` in new futures, it requires a
bit of `unsafe { ... }`.

Note that, unfortunately, the diff includes both the Gotham update and the
server updates, since they have to happen together.

Reviewed By: kulshrax, dtolnay

Differential Revision: D20342689

fbshipit-source-id: a490db96ca7c4da8ff761cb80c1e7e3c836bad87
2020-03-11 10:22:28 -07:00
Thomas Orozco
04f347484b mononoke: allow selecting a priority in hgcli, and passing it to Mononoke
Summary:
This adds the ability to specify a priority in hgcli, and to pass it on to
Mononoke. This will be used to replay commit cloud traffic at a lower priority.

Reviewed By: farnz

Differential Revision: D20038573

fbshipit-source-id: 4055d28ee295e2b15c15945bd3741f6d739ead3a
2020-03-11 08:54:51 -07:00
Thomas Orozco
c5917acc3f mononoke: context_concurrency_blobstore
Summary:
This adds a blobstore that can reach into a CoreContext in order to identify
the allowed level of concurrency for blobstore requests initiated by this
CoreContext. This will let us replay infinitepush bundles with limits on a
per-request basis.

Reviewed By: farnz

Differential Revision: D20038575

fbshipit-source-id: 07299701879b7ae65ad9b7ff6e991ceddf062b24
2020-03-11 08:54:51 -07:00
Mateusz Kwapich
9fd7f0d2b4 improve the check for conflicts during insert
Summary:
Before, we assumed that if the rows_affected count doesn't match the number of
entries we were trying to insert, we have a conflict. Let's verify whether we really
have a conflict, or whether we're trying to insert the same entry twice.
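
A sketch of the distinction being made (types simplified; the real check runs against the database):
```
#[derive(PartialEq)]
struct Entry {
    key: u64,
    value: String,
}

// An apparent conflict is only a real conflict if a row with the
// same key already exists with *different* contents; re-inserting
// an identical entry is harmless.
fn is_real_conflict(existing: Option<&Entry>, new: &Entry) -> bool {
    match existing {
        None => false,
        Some(e) => e != new,
    }
}
```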

Reviewed By: krallin

Differential Revision: D20343219

fbshipit-source-id: 19e032439fdd65f5fe1afe1a10b401bc2fe33462
2020-03-10 05:47:05 -07:00
Mateusz Kwapich
e1bf77097f test showing the blobimport problem.
Summary: Running blobimport twice on the same commit seems to cause problems.

Reviewed By: krallin

Differential Revision: D20343218

fbshipit-source-id: 4d572630e7c15c219bee8db15cc879b2cb8602fe
2020-03-10 05:47:05 -07:00
Alex Hornby
b8ca854c0b mononoke: walker: add ability to walk all published bookmarks
Summary: Add the ability to walk all published bookmarks, as there may be multiple important bookmarks.

Reviewed By: krallin

Differential Revision: D20249806

fbshipit-source-id: aff2ee1ec7d51a9e4fb6e1e803612abd207fd6cb
2020-03-10 05:26:35 -07:00
Thomas Orozco
3ee98c82e2 mononoke/microwave: add support for changesets
Summary:
This updates microwave to also support changesets, in addition to filenodes.
Those create a non-trivial amount of SQL load when we warm up the cache (due to
sequential reads), which we can eliminate by loading them through microwave.

They're also a bottleneck when manifests are loaded already.

Note: as part of this, I've updated the Microwave wrapper methods to panic if
we try to access a method that isn't instrumented. Since we'd be running
the Microwave builder in the background, this feels OK (because then we'd find
out if we call them during cache warmup unexpectedly).

Reviewed By: farnz

Differential Revision: D20221463

fbshipit-source-id: 317023677af4180007001fcaccc203681b7c95b7
2020-03-05 11:57:43 -08:00
Thomas Orozco
dd38f1fdb2 mononoke/cache_warmup: conditionally use microwave for faster warmup
Summary:
This incorporates microwave into the cache warmup process. See earlier in this
stack for a description of what this does, how it works, and why it's useful.

Reviewed By: ahornby

Differential Revision: D20219904

fbshipit-source-id: 52db74dc83635c5673ffe97cd5ff3e06faba7621
2020-03-05 11:57:43 -08:00
Thomas Orozco
275e4eff76 mononoke/mercurial: remove incorrect FileBytes Extend implementation
Summary:
This removes the Extend implementation for FileBytes, which was incorrect (it
discarded existing data!). I had introduced this as a backwards compatibility
shim when doing the Bytes 0.4 to Bytes 0.5 migration :/

We don't really need this shim, considering:

- The only place that really matters that uses this is the remotefilelog crate,
  where we have a content id, and where we should use `filestore::fetch_concat`
  instead.
- The other places are tests (or close to abandonware...), which can do their
  own folding.

Longer term, I'd like to remove the whole `Content` stream in hg entries, so
those callsites can use the filestore methods, which a) have test coverage
(unlike ad-hoc folds, which often don't), and b) are more efficient since
they know how large the destination buffer needs to be ahead of time, and don't
need to re-allocate.

To make sure this fixes the bug, I also introduced tests for the remotefilelog
crate. As expected, the chunked variant fails without this fix.

Reviewed By: mitrandir77

Differential Revision: D20248978

fbshipit-source-id: 1b554d3e595eb867b6b6cf4204d31f27dd90a111
2020-03-04 08:51:42 -08:00
Mateusz Kwapich
1e33cd40b6 a small tool to backfill git mappings
Summary:
The git mappings are normally populated during blobimport of the repo but we
need something for the repos we've already imported.

Reviewed By: markbt

Differential Revision: D20160768

fbshipit-source-id: 9e37c7d0f12682e73ca9990e56e4d827e9861a9f
2020-03-04 06:08:43 -08:00
David Tolnay
e988a88be9 rust: Rename futures_preview:: to futures::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.

This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```

Reviewed By: k21

Differential Revision: D20213432

fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
2020-03-03 11:01:20 -08:00
Thomas Orozco
83cd9eec54 mononoke/apiserver: run streams on a Tokio 0.2 runtime
Summary:
Well, we don't have a Tokio Compat runtime in Actix. This means Tokio 0.2 code
(e.g. Tokio 0.2 timers) blows up when executed in the API Server.

How do we fix this? By not running Mononoke code on Actix's runtime, and
instead running it on a Mononoke runtime we instantiated.

How do we do that? By passing a Tokio Compat Executor all the way down to the
place where Actix is about to consume our stream ... and at that point, we
spawn the stream on our runtime, and give Actix a dumb receiver that does work
when polled on a Tokio 0.1 runtime.

This feels like the end of the road for the API Server. Nothing about this is
even remotely sane, but it should take us through the API Server's eventual
demise and replacement with the Gotham-based EdenAPI Server, which runs on the
runtime of our choice (i.e. Tokio 0.2).

Reviewed By: farnz

Differential Revision: D20222294

fbshipit-source-id: 1646e35fe05b131b030e4962c8a7f68f72995035
2020-03-03 10:18:02 -08:00
Doug Neal
1e088c0af2 mononoke: lfs_server: add optional client identities to ratelimit config
Summary:
* Added intermediate (de)serializers for config types, so that we generate full Identity objects at config load time
* Implemented FromStr for Identity
* Compared configured identities to presented identities in the ratelimit middleware in order to decide whether or not to apply the limit
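
A sketch of that comparison (types are simplified stand-ins):
```
// Apply the limit only if one of the configured identities matches
// an identity presented by the client.
fn should_apply_limit(configured: &[String], presented: &[String]) -> bool {
    configured.iter().any(|id| presented.contains(id))
}
```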

Reviewed By: krallin

Differential Revision: D20139308

fbshipit-source-id: 340c300db549575eb6d06efcbe437c0b1db4927b
2020-03-03 09:33:03 -08:00
Alex Hornby
464ffc40eb mononoke: pushrebase: fix casefolding_check usage during changeset creation
Summary: Honor the repo casefolding_check setting as tested by test-pushrebase-allow-casefolding.t

Reviewed By: StanislavGlebik

Differential Revision: D20192411

fbshipit-source-id: 8da72049417015b1f284c115a53b13c26ce3c3f6
2020-03-03 03:57:32 -08:00
Alex Hornby
37da3ebd2b mononoke: pushrebase: add tests for casefolding
Summary: Add tests for the existing default casefolding_check blocking behaviour, plus a test demonstrating the problem with casefolding_check=false.

Reviewed By: farnz

Differential Revision: D20192412

fbshipit-source-id: 1aea0fc5581e0c44388a4224ca693698731d3cd5
2020-03-03 02:44:06 -08:00
David Tolnay
fe65402e46 rust: Move futures-old rdeps to renamed futures-old
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.

rs changes performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs,
        rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
        intersect
        rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
    )" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```

Reviewed By: jsgf

Differential Revision: D20168958

fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
2020-03-02 21:02:50 -08:00
Thomas Orozco
2d04773c23 mononoke/hg_sync_job: update Globalrevs in hgsql
Summary:
This updates the hg_sync_job to update Globalrevs in hgsql before attempting to
sync bundles. This means that if we're syncing successfully, hg is in sync with
Mononoke, and if we fail (which should be very uncommon to begin with!), hg
might skip a little bit ahead, but that's OK.

This only makes sense when generating bundles — when doing pushrebase, hg would
be updating its own globalrevs.

Reviewed By: StanislavGlebik

Differential Revision: D20159262

fbshipit-source-id: 6736f8592682da1001c7c9c4c9444462b71913c2
2020-03-02 08:24:16 -08:00
Kostia Balytskyi
7ed52ee31b mononoke: return hydrated bundles for infinitepush, if config says so
Summary:
## Wider goal
See D20068839

## This diff
This diff actually implements the conditional hydration of `getbundle`
responses, as described in D20068839.

Note that as well as implementing support for hydrated `getbundle` responses, this diff also implements support for changegroup v3 and lfs in such responses, which is needed if we are to do this kind of thing in an LFS-enabled repository.

Reviewed By: StanislavGlebik

Differential Revision: D20068838

fbshipit-source-id: fbdd3f8f5fb7cd2cb60473a94094553a1d4b4d2f
2020-02-28 08:30:43 -08:00
Thomas Orozco
26ae726af5 mononoke: update internals to Bytes 0.5
Summary:
The Bytes 0.5 update left us in a somewhat undesirable position where every
access to our blobstore incurs an extra copy whenever we fetch data out of our
cache (by turning it from Bytes 0.5 into Bytes 0.4) — we also have quite a few
places where we convert in one direction and then immediately back into the other.

Internally, we can start using Bytes 0.5 now. For example, this is useful when
pulling data out of our blobstore and deserializing as Thrift (or conversely,
when serializing and putting it into our blobstore).

However, when we interface with Tokio (i.e. decoders & encoders), we still have
to use Bytes 0.4.  So, when needed, we convert our Bytes 0.5 to 0.4 there.

The tradeoff idea is that we deal with more bytes internally than we end up
sending to clients, so doing the Bytes conversion closer to the point of
sending data to clients means less copies.

We can also start removing those once we migrate to Tokio 0.2 (and newer
versions of Hyper for HTTP services).

Changes that were required:

- You can't extend new bytes (because that implicitly copies). You need to use
  BytesMut instead, which I did where that was necessary (I also added calls in
  the Filestore to do that efficiently).
- You can't create bytes from a `&'a [u8]`, unless `'a` is  `'static`. You need
  to use `copy_from_slice` instead.
- `slice_to` and `slice_from` have been replaced by a `slice()` function that
  takes ranges.
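
For illustration, the Bytes 0.5 equivalents of the patterns above (a sketch against the `bytes` 0.5 API):
```
use bytes::{Bytes, BytesMut};

fn bytes_05_patterns() {
    // Extending: accumulate in BytesMut, then freeze into Bytes.
    let mut buf = BytesMut::with_capacity(16);
    buf.extend_from_slice(b"hello ");
    buf.extend_from_slice(b"world");
    let joined: Bytes = buf.freeze();

    // Creating from a non-'static slice: copy explicitly.
    let local = vec![1u8, 2, 3];
    let copied = Bytes::copy_from_slice(&local);

    // slice_to / slice_from are replaced by slice() with ranges.
    let head = joined.slice(..5);
    let tail = joined.slice(6..);

    assert_eq!(&head[..], b"hello");
    assert_eq!(&tail[..], b"world");
    assert_eq!(&copied[..], &[1, 2, 3]);
}
```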

Reviewed By: StanislavGlebik

Differential Revision: D20121350

fbshipit-source-id: eb31af2051fd8c9d31c69b502e2f6f1ce2190cb1
2020-02-27 08:08:28 -08:00
Mateusz Kwapich
6f9f82767c add git identifiers to Source Control Service
Summary: This allows us to translate git hashes

Reviewed By: markbt

Differential Revision: D19972870

fbshipit-source-id: 871a4cf94d468d987221cb08fe7b6135050bac93
2020-02-27 08:05:14 -08:00
Mateusz Kwapich
3ff29a8810 make BonsaiGitMapping repo-specific
Summary:
Nearly all of the Mononoke SQL stores are instantiated once per repo, but they don't store the `RepositoryId` anywhere, so every method takes it as an argument. And because providing the repo_id on every call is not ergonomic, we tend to add methods to blob_repo that just call the right method with the right repo_id on one of the underlying stores (see `get_bonsai_from_globalrev` on blobrepo for example).

Because my reviewers [pushed back](https://our.intern.facebook.com/intern/diff/D19972871/?transaction_id=196961774880671&dest_fbid=1282141621983439) when I tried to do the same for bonsai_git_mapping, I've decided to make it right by adding the repo_id to the BonsaiGitMapping.

Reviewed By: krallin

Differential Revision: D20029485

fbshipit-source-id: 7585c3bf9cc8fa3cbe59ab1e87938f567c09278a
2020-02-27 08:05:13 -08:00
Stanislau Hlebik
98f6d5d1a8 mononoke: fix walker filenode walks
Summary:
Since Mononoke's filenodes were migrated to the derived data framework, the
hg_linknode_populated alarm has been firing. The main reason was that there's
now a delay between an hg changeset being generated and its filenodes being generated.

This diff fixes it by making sure the walker won't visit hg changesets without
generated filenodes (note that the walker will visit these changesets later, after their
filenodes have been generated).

Reviewed By: ahornby

Differential Revision: D20067615

fbshipit-source-id: 285e9a3d8c89b85441491c889a8458c86ca0e3a8
2020-02-26 15:21:53 -08:00
Aida Getoeva
585899f419 mononoke/scs: use last change in file history
Summary:
There is no need to generate an expensive file history stream if only one node is requested.

I refactored the code that generates the stream of history commits, so it first yields the nodes and only then prefetches their parents. That helps solve the latency problem for a history request for only a single commit.

I removed the BFS queue and added two state variables: ready nodes and already-processed nodes:
* The latter are the nodes that were returned as part of the history stream on the last iteration and can now be used to construct the next BFS layer: prefetch fastlog batches, fill the commit graph, take parents in BFS order to form a new bunch of nodes.
* The former are used on the first iteration - there are no processed nodes yet, but there are some that are ready to be returned.

I believe that by removing the queue I simplified the code and logic a little bit.

Reviewed By: StanislavGlebik

Differential Revision: D19818100

fbshipit-source-id: c30d28c623464ba3552a00e8542552f7655076ef
2020-02-26 08:09:12 -08:00
Alex Hornby
04e011525a mononoke: walker: test validate scuba logging for non-public commits
Summary: add test for scuba logging for non-public commits

Reviewed By: StanislavGlebik

Differential Revision: D20093721

fbshipit-source-id: eb0792bcae8ea27c11709181390efb0ac0c817ee
2020-02-26 06:16:29 -08:00
Thomas Orozco
b3bebee0b4 mononoke: include DB config in multiplexed blobstore configuration
Summary:
This updates our multiplexed blobstore configuration to carry its own DB
config. The upshot of this change is that we can move the blobstore sync queue
(a fairly unruly table) to its own DB.

Another nice side effect of this is that it cleans up a bunch of other code, by
finally decoupling the blobstore config from the DB config. For example,
places that need to instantiate a blobstore can now do so even without a DB
config (such as wireproto logging).

Obviously, this cannot land until we update the configs to include this. I'll
do so in Configerator prior to landing the diff.

Reviewed By: HarveyHunt

Differential Revision: D19973905

fbshipit-source-id: 79e4ff92cdb989aab4532decd3fe4fd6c55e2bb2
2020-02-24 11:54:45 -08:00
Lukas Piatkowski
4aea99df4e mononoke/blobstore: remove rocksdb blobstore and replace its usages with sqliteblob
Summary:
This is the second (and last) step in removing RocksDB as a blobstore.
Check the task for more description.

Context for OSS:
> The issue with rocksblob (and to some extent sqlite) is that unless we
> introduce a blobstore tier/thift api (which is something I'm hoping to avoid
> for xdb blobstore) we'd have to combine all the mononoke function like hg,
> scs, LFS etc into one binary for it to have access to rocksdb, which would be
> quite a big difference to how we deploy internally

(Note: this ignores all push blocking failures!)

Reviewed By: farnz

Differential Revision: D20001261

fbshipit-source-id: c4b2b2a393b918d17680ad483aa1d77356f1d07c
2020-02-24 05:23:07 -08:00
Mark Thomas
70ffdc7293 add export
Summary:
Add `scsc export`.  Analogous to `svn export`, this exports the contents of a
directory within a commit to files on disk, without a local checkout.

Reviewed By: mitrandir77

Differential Revision: D20006307

fbshipit-source-id: 5870712172cd8a030e85dbff75273c28ab0c332c
2020-02-24 03:00:22 -08:00
Thomas Orozco
5b07c8285e mononoke: test-mononoke-admin.t: fixup replication lag match
Summary: It's not always 0! (sometimes it's 1)

Reviewed By: farnz

Differential Revision: D20065610

fbshipit-source-id: b546befbf824713811fd7c011bbf4c246d3c696d
2020-02-24 02:57:18 -08:00
Mateusz Kwapich
42bfba7c99 add git mappings import option
Summary: Let's import the info about corresponding git commits on blobimport whenever possible.

Reviewed By: ikostia

Differential Revision: D19877929

fbshipit-source-id: ba03d5de8ae8a9bd80084a8e858cd05e8f621193
2020-02-21 05:41:46 -08:00
Mateusz Kwapich
6111067524 add git mapping pushrebase hook
Summary:
Let's populate the bonsai<->git mapping on pushrebase of the commits that are
coming from git. By making this a pushrebase hook, we can have the accurate mappings
available as soon as the bonsai commit is available.

Corresponding configerator change: D19951607

Reviewed By: krallin

Differential Revision: D19949472

fbshipit-source-id: b957cbcdd0f14450ceb090539814952db9872576
2020-02-21 05:41:45 -08:00
Mark Thomas
a9490441b2 add blame --parent
Summary:
Add the `--parent` flag to `scsc blame`.  This runs blame against the first
parent of the specified commit, rather than the commit itself.  This allows
users to copy and paste commit hashes from previous blame output in order to
skip the commit, rather than having to look up the parent commit hash
themselves.

Reviewed By: StanislavGlebik

Differential Revision: D20006308

fbshipit-source-id: d1c25aad8f236fe27e467e29f6a96c957b6c8c8f
2020-02-20 13:03:54 -08:00
Thomas Orozco
4a29fe400d mononoke/blobstore_healer: migrate replication lag polling to async / await
Summary:
The former implementation here was a little difficult to work with, and
resulted in a whole lot of cloning of closures, etc.

This updates the implementation to be a little simpler on the whole (async /
await is nicer for while loops, since you can use, well, loops)

It does slightly change a few parts of the behavior:

- The old implementation would wait for the replication lag duration. That's
  not really correct. As we've observed several times this week, replication
  lag usually drops quickly once it starts dropping. I.e. if the replication
  lag is 10 seconds, it doesn't take 10 seconds to catch up. This gets more
  important with big lag durations.
- I updated replication lag to be u64 instead of usize. usize doesn't really
  make sense for something that has absolutely nothing to do with our pointer
  size.

I also split out the logic for calculating how long we wait in a part that
cares about whether we are busy and one that cares about replication lag
(whereas the older one kinda mixed the two together). We wait for our own
throttling (i.e. sleep for a sec if we didn't do anything) before we wait for
replication lag, so the new behavior should have the desired behavior of:

- If we don't have much work to do, we sleep 1 second between each iteration
  (but if we do have work, we don't).
- No matter what, if we have replication lag, we wait until that passes before
  doing any work.

The old one did that too, but it mixed the two calculations together, and was
(at least in my opinion) kinda hard to reason about as a result.
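
A minimal sketch of the shape of that loop (the timing values follow the summary; `tokio::time::sleep` from current Tokio is used for illustration, and the lag source is a stand-in):
```
use std::time::Duration;

// One iteration of the healer loop: apply our own throttling first,
// then wait out replication lag before doing any work.
async fn healer_iteration(
    was_busy: bool,
    current_lag: impl Fn() -> Duration,
    do_work: impl FnOnce() -> bool,
) -> bool {
    if !was_busy {
        // Didn't do much last time: sleep a second between iterations.
        tokio::time::sleep(Duration::from_secs(1)).await;
    }
    // Poll rather than sleeping for the full reported lag, since lag
    // usually drops much faster than it accrued.
    while current_lag() > Duration::ZERO {
        tokio::time::sleep(Duration::from_secs(1)).await;
    }
    do_work()
}
```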

Reviewed By: StanislavGlebik

Differential Revision: D19997587

fbshipit-source-id: 1de6a9f9c1ecb56e26c304d32b907103b47b4728
2020-02-20 12:26:51 -08:00
Thomas Orozco
be5d7343ce mononoke/blobstore_healer: check for replication lag _before_ starting work
Summary:
We had crahsloops on this (which I'm fixing earlier in this stack), which
resulted in overloading our queue as we tried to repeatedly clear out 100K
entries at a time, rebooted, and tried again.

We can fix the root cause that caused us to die, but we should also make sure
crashloops don't result in ignoring lag altogether.

Also, while in there, convert some of this code to async / await to make it
easier to work on.

Reviewed By: HarveyHunt

Differential Revision: D19997589

fbshipit-source-id: 20747e5a37758aee68b8af2e95786430de55f7b1
2020-02-20 12:26:51 -08:00
Thomas Orozco
58126d90d6 mononoke: log input size
Summary:
This adds some basic logging for input size for Gettreepack and Getpack. This
might make it easier to understand "poison pill" requests that take out the
host before it has a chance to finish the request.

Reviewed By: StanislavGlebik

Differential Revision: D19974661

fbshipit-source-id: deae13428ae2d1857872185de2b6c0a8bcaf3334
2020-02-20 02:24:10 -08:00
Thomas Orozco
c899ed7249 test-gitimport-octopus: don't expect a specific number of commits to verify
Summary:
bonsai_verify occasionally visits the same commit twice (I found out by adding
logging and noting that it occasionally visits the same commit twice). Let's
allow this here.

Reviewed By: StanislavGlebik

Differential Revision: D19951390

fbshipit-source-id: 3e470476c6bc43ffd62cf24c3486dfcc7133de6c
2020-02-19 10:16:38 -08:00
Doug Neal
8e684cfda7 mononoke: lfs_server: add jitter field to ratelimit struct
Summary: Add the max_jitter_ms field to the rate limiting config struct, and to the integration test.

Reviewed By: HarveyHunt

Differential Revision: D19905068

fbshipit-source-id: b44251c456a45bc494d1080e405f2d009becc0d2
2020-02-18 07:47:09 -08:00
Thomas Orozco
49808a4410 mononoke/hg_sync_job: use 0.2 runtime
Summary:
This is required for 0.2 timers or runtime-reliant code to work within the sync
job. To achieve this, we need to get rid of Tokio 0.1 fs code, which is
incompatible with Tokio 0.2 because it uses `blocking()`.

Reviewed By: ikostia

Differential Revision: D19909434

fbshipit-source-id: 58781e858dd55a9a5fc10a004e8ebdace1a533a4
2020-02-18 07:42:41 -08:00