Commit Graph

1391 Commits

Simon Farnsworth
0c3fe9b20f Fully asyncify blobstore sync queue
Summary: Move it from `'static` BoxFutures to async_trait and lifetimes

Reviewed By: markbt

Differential Revision: D22927171

fbshipit-source-id: 637a983fa6fa91d4cd1e73d822340cb08647c57d
2020-08-05 15:41:15 -07:00
David Tolnay
014f40209b Back out "rust: 1.45.2 update"
Summary:
This is a backout of D22912569 (34760b5164), which is breaking opt-clang-thinlto builds on platform007 (S206790).

Original commit changeset: 5ffdc48adb1f

Reviewed By: aaronabramov

Differential Revision: D22956288

fbshipit-source-id: 45940c288d6f10dfe5457d295c405b84314e6b21
2020-08-05 13:28:13 -07:00
Viet Hung Nguyen
f2ee103884 mononoke/repo_import: add more meaningful print outs and save hashes
Summary:
Added more logs when running the binary to be able to track the progress more easily.
Saved bonsai hashes into a file. In case we fail at deriving data types, we can still try to derive them manually with the saved hashes and avoid running the whole tool again.

Reviewed By: StanislavGlebik

Differential Revision: D22943309

fbshipit-source-id: e03a74207d76823f6a2a3d92a1e31929a39f39a5
2020-08-05 12:46:14 -07:00
Mark Thomas
cbd105a73e hook_tailer: reduce default concurrency to 20
Summary:
Large commits and many hooks can mean that checking 100 commits at a time overloads
the system.  Reduce the default concurrency to something more reasonable.

While we're here, let's use the proper mechanism for default values in clap.

Reviewed By: ikostia

Differential Revision: D22945597

fbshipit-source-id: 0f0a086c3b74bec614ada44a66409c8d2b91fe69
2020-08-05 10:34:05 -07:00
Mark Thomas
e12728305c hook_tailer: make command line arguments consistent
Summary:
Argument names should be `snake_case`.  Long options should be `--kebab-case`.

Retain the old long options as aliases for compatibility.

Reviewed By: HarveyHunt

Differential Revision: D22945600

fbshipit-source-id: a290b3dc4d9908eb61b2f597f101b4abaf3a1c13
2020-08-05 10:34:05 -07:00
Mark Thomas
b2b895353f hook_tailer: add --exclude-merges to skip merge commits
Summary: Add `--exclude-merges` which will skip merge commits.

Reviewed By: HarveyHunt

Differential Revision: D22945598

fbshipit-source-id: 3c20cf049bbe15a975671e8792259b460356804a
2020-08-05 10:34:05 -07:00
Mark Thomas
57626bec98 hook_tailer: add --log-interval to log every N commits
Summary:
Add `--log-interval` to log every N commits, so that it can be seen to be
making progress in the logs.

The default is set to 500, which logs about once every 10 seconds on my devserver.
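
The interval check itself is simple; a minimal sketch (function name hypothetical, not the hook_tailer code) of deciding when to emit a progress line:

```rust
/// Returns true when a progress line should be emitted, i.e. after
/// every `interval` commits (hypothetical helper, not the hook_tailer API).
fn should_log(commits_processed: u64, interval: u64) -> bool {
    interval > 0 && commits_processed > 0 && commits_processed % interval == 0
}
```

With the default interval of 500, this fires at commits 500, 1000, 1500, and so on.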

Reviewed By: HarveyHunt

Differential Revision: D22945599

fbshipit-source-id: 7fc09b907793ea637289c9018958013d979d6809
2020-08-05 10:34:05 -07:00
Simon Farnsworth
99247529d5 Wishlist priority connections should use background mode
Summary: Commitcloud fillers use wishlist priority because we want them to wait their turn behind other users; let's also stop them from flooding the blobstore healer queue by making them background priority.

Reviewed By: ahornby

Differential Revision: D22867338

fbshipit-source-id: 5d16438ea185b580f3537e3c4895a545483eca7a
2020-08-05 06:35:46 -07:00
Simon Farnsworth
aa94fb9581 Add a multiplex mode that doesn't update the sync queue
Summary:
Backfillers and other housekeeping processes can run so far ahead of the blobstore sync queue that we can't empty it from the healer task as fast as the backfillers can fill it.

Work around this by providing a new mode that background tasks can use to avoid filling the queue if all the blobstores are writing successfully. This has a side-effect of slowing background tasks to the speed of the slowest blobstore, instead of allowing them to run ahead at the speed of the fastest blobstore and relying on the healer ensuring that all blobs are present.

Future diffs will add this mode to appropriate tasks
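
The decision described above can be sketched as follows (names hypothetical; the real MultiplexedBlobstore put path is asynchronous and considerably more involved):

```rust
/// Given the outcome of writing one blob to every blobstore in a multiplex,
/// decide whether a sync-queue entry is needed. In the new background mode
/// an entry is only enqueued when at least one underlying write failed, so
/// fully successful puts from backfillers never touch the healer queue.
/// (Hypothetical sketch, not the real multiplexer.)
fn needs_sync_queue_entry(write_results: &[Result<(), String>]) -> bool {
    write_results.iter().any(|r| r.is_err())
}
```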

Reviewed By: ikostia

Differential Revision: D22866818

fbshipit-source-id: a8762528bb3f6f11c0ec63e4a3c8dac08d0b4d8e
2020-08-05 06:35:46 -07:00
Stanislau Hlebik
f13067b0da mononoke: add manual_commit_sync to megarepotool
Summary:
This operation is useful immediately after a small repo is merged into a large repo.
See example below

```
  B' <- manually synced commit from small repo (in small repo it is commit B)
  |
  BM <- "big merge"
 /  \
...  O <- big move commit i.e. commit that moves small repo files in correct location
     |
     A <- commit that was copied from small repo. It is identical between small and large repos.
```

Immediately after a small repo is merged into a large one, we need to indicate that commit B and all of
its small-repo ancestors need to be based on top of the "big merge" commit in the large repo rather than on top of
commit A.
The function below can be used to achieve exactly that.

Reviewed By: ikostia

Differential Revision: D22943294

fbshipit-source-id: 33638a6e2ebae13a71abd0469363ce63fb6b014f
2020-08-05 05:55:15 -07:00
Simon Farnsworth
33c2a0c846 Update auto_impl to 0.4
Summary: We were using a git snapshot of auto_impl from somewhere between 0.3 and 0.4; 0.4 fixes a bug around `Self: 'lifetime` constraints on methods that blocks work I'm doing in Mononoke, so update.

Reviewed By: dtolnay

Differential Revision: D22922790

fbshipit-source-id: 7bb68589a1d187393e7de52635096acaf6e48b7e
2020-08-04 18:12:45 -07:00
Kostia Balytskyi
c8e3c27a65 megarepo: test invisible merge e2e
Reviewed By: StanislavGlebik

Differential Revision: D22924237

fbshipit-source-id: ba13d610c26c1b0be4f4afa75de93568359457c6
2020-08-04 12:21:13 -07:00
Stefan Filip
7392392a33 server: add commit/location_to_hash path
Summary:
EdenAPI endpoint for segmented changelog. It translates a path in the
graph to the hash of the commit that the path lands on.
Paths are expected to point to unique commits.

This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow-up changes will look more at
what shape the request and response should have.
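
At its core, resolving a segmented changelog location means walking a fixed number of first-parent edges from a known descendant. A toy sketch under that assumption (plain hash maps, not the actual segmented changelog data structures):

```rust
use std::collections::HashMap;

/// Toy first-parent graph: c -> b -> a.
fn example_graph() -> HashMap<&'static str, &'static str> {
    [("c", "b"), ("b", "a")].into_iter().collect()
}

/// Walk `distance` first-parent edges from `start`, returning the commit
/// the location lands on, or None if the walk runs off the root.
/// (Hypothetical model; the server resolves this against segments.)
fn location_to_hash(
    first_parent: &HashMap<&str, &str>,
    start: &str,
    distance: usize,
) -> Option<String> {
    let mut current = start;
    for _ in 0..distance {
        current = *first_parent.get(current)?;
    }
    Some(current.to_string())
}
```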

Reviewed By: kulshrax

Differential Revision: D22702016

fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
2020-08-04 11:22:39 -07:00
Stefan Filip
2f3e569120 mononoke_api: add segmented changelog location to hash translation
Summary:
This functionality is going to be used in EdenApi. The translation is required
to unblock removing the changelog from the local copy of the repositories.
However the functionality is not going to be turned on in production just yet.

Reviewed By: kulshrax

Differential Revision: D22869062

fbshipit-source-id: 03a5a4ccc01dddf06ef3fb3a4266d2bfeaaa8bd2
2020-08-04 11:22:39 -07:00
Stefan Filip
4261013101 metaconfig: add segmented changelog config
Summary:
To start the only configuration available is whether the functionality provided
by this component is available in any shape or form. By default the component
is going to be disabled for all repositories. We will enable it first for
bootstrapped repositories and, after additional tooling is added, for production
repositories.

Reviewed By: kulshrax

Differential Revision: D22869061

fbshipit-source-id: fbaed88f2f45e064c0ae1bc7762931bd780c8038
2020-08-04 11:22:39 -07:00
Santiago Alfonso Muñoz Rodriguez
007dc93916 Enumeration API for BlobStore keys
Summary:
- Enumerate API now provided via trait BlobstoreKeySource
- Implementation for Fileblob and ManifoldBlob
- Modified populate_healer to use new api
- Modified fixrepocontents to use new api

Reviewed By: ahornby

Differential Revision: D22763274

fbshipit-source-id: 8ee4503912bf40d4ac525114289a75d409ef3790
2020-08-04 06:54:18 -07:00
Alex Hornby
f7210430d9 mononoke: check whether to emit an edge earlier from the walker, remaining types
Summary: Update all the remaining steps in the walker to use the new early checks, so as to prune unnecessary edges earlier in the walk.

Reviewed By: farnz

Differential Revision: D22847412

fbshipit-source-id: 78c499a1870f97df7b641ee828fb8ec58303ebef
2020-08-04 06:47:38 -07:00
Alex Hornby
5fb309a7b2 mononoke: check whether to emit an edge from the walker earlier
Summary:
Check whether to emit an edge from the walker earlier to reduce vec allocation of unnecessary edges that would immediately be dropped in WalkVistor::visit.

The VisitOne trait is introduced as a simpler API than Visitor that can be used to check if one edge needs to be visited, and the Checker struct in walk.rs is a helper around it that will only call the VisitOne API if necessary. Checker also takes on responsibility for respecting keep_edge_paths when returning paths, so that parameter has been removed for migrated steps.

To keep the diff size reasonable, this change has all the necessary Checker/VisitOne changes but only converts hg_manifest_step, with the remainder of the steps converted next in the stack. Marked todos labelling unmigrated types as always-emit types will be removed as part of converting the remaining steps.

Reviewed By: farnz

Differential Revision: D22864136

fbshipit-source-id: 431c3637634c6a02ab08662261b10815ea6ce293
2020-08-04 04:30:49 -07:00
Stanislau Hlebik
fe60eeff85 mononoke: megarepotool support for gradual merge
Summary:
This tool can be used in tandem with the pre_merge_delete tool to merge one large
repository into another in a controlled manner - the size of the working copy
will be increased gradually.

Reviewed By: ikostia

Differential Revision: D22894575

fbshipit-source-id: 0055d3e080c05f870cfd0026174365813b0eb253
2020-08-04 02:53:15 -07:00
Simon Farnsworth
f7e8931a56 Add a minimum successful writes count for MultiplexedBlobstore
Summary:
There are two reasons to want a write quorum:

1. One or more blobstores in the multiplex are experimental, and we don't want to accept a write unless the write is in a stable blobstore.
2. To reduce the risk of data loss if one blobstore loses data at a bad time.

Make it possible to require a minimum number of successful writes.
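
A hedged sketch of the quorum rule (names hypothetical; the real code threads this through the multiplexer's asynchronous put path):

```rust
/// Accept a multiplexed put only if at least `min_successful` of the
/// underlying writes succeeded; otherwise surface every error we saw.
/// Sketch only -- the real implementation is asynchronous.
fn check_write_quorum(
    results: Vec<Result<(), String>>,
    min_successful: usize,
) -> Result<(), Vec<String>> {
    let (oks, errs): (Vec<_>, Vec<_>) = results.into_iter().partition(|r| r.is_ok());
    if oks.len() >= min_successful {
        Ok(())
    } else {
        Err(errs.into_iter().map(|r| r.unwrap_err()).collect())
    }
}
```

Returning the full error list on failure, rather than just the most recent error, matches the motivation given elsewhere in this log for surfacing all put failures.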

Reviewed By: krallin

Differential Revision: D22850261

fbshipit-source-id: ed87d71c909053867ea8b1e3a5467f3224663f6a
2020-08-04 02:45:38 -07:00
Jeremy Fitzhardinge
34760b5164 rust: 1.45.2 update
Summary: A couple of features stabilized, so drop their `#![feature(...)]` lines.

Reviewed By: eugeneoden, dtolnay

Differential Revision: D22912569

fbshipit-source-id: 5ffdc48adb1f57a1b845b1b611f34b8a7ceff216
2020-08-03 19:29:17 -07:00
Kostia Balytskyi
6824787241 library.sh: add absolute config paths everywhere
Summary:
In several places in `library.sh` we had `--mononoke-config-path
mononoke-config`. This ensured that we could not run such commands from
non-`$TESTTMP` directories. Let's fix that.

Reviewed By: StanislavGlebik

Differential Revision: D22901668

fbshipit-source-id: 657bce27ce6aee8a88efb550adc2ee5169d103fa
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
fe487f9e8b push_redirector: add contexts
Summary: The more contexts the better. Makes debugging errors much more pleasant.

Reviewed By: StanislavGlebik

Differential Revision: D22890940

fbshipit-source-id: 48f89031b4b5f9b15f69734d784969e2986b926d
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
b7f8a1b193 megarepotool: add bonsai merge
Summary:
An extremely thin wrapper around existing APIs: just a way to create merge commits from the command line.

This is needed to make the merge strategy work:

```
C
|
M3
| \
.  \
|   \
M2   \
| \   \
.  \   \
|   \   \
M1   \   \
| \   \   \
.  TM3 \   \
.  /    |  |
.  D3 (e7a8605e0d) TM2  |
.  | /    /
.  D2 (33140b117c)  TM1
.  |  /
.  D1 (733961456f)
|   |
|    \
|    DAG to merge
|
main DAG
```

When we're creating `M2` as a result of merge of `TM2` into the main DAG, some files are deleted in the `TM3` branch, but not deleted in the `TM2` branch. Executing merge by running `hg merge` causes these files to be absent in `M2`. To make Mercurial work, we would need to execute `hg revert` for each such file prior to `hg merge`. Bonsai merge semantics however just creates correct behavior for us. Let's therefore just expose a way to create bonsai merges via the `megarepotool`.

Reviewed By: StanislavGlebik

Differential Revision: D22890787

fbshipit-source-id: 1508b3ede36f9b7414dc4d9fe9730c37456e2ef9
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
f9e410d965 megarepotool: add pre-merge-delete CLI
Summary:
This adds a CLI for the functionality, added in the previous diff. In addition, this adds an integration test, which tests this deletion functionality.

The output of this tool is meant to be stored in a file. It simulates a simple DAG, and it should be fairly easy to automatically parse the "to-merge" commits out of this output. In theory, it could have been enough to just print the "to-merge" commits alone, but it felt like it may sometimes be convenient to quickly examine the delete commits.

Reviewed By: StanislavGlebik

Differential Revision: D22866930

fbshipit-source-id: 572b754225218d2889a3859bcb07900089b34e1c
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
1eb7cfe277 megarepolib: add pre-merge delete implementation
Summary:
This implements a new strategy of creating pre-merge delete commits.

As a reminder, the higher-level goal is to gradually merge two independent DAGs together. One of them is the main repo DAG, the other is an "import". It is assumed that the import DAG is already "moved", meaning that all files are at the right paths to be merged.

The strategy is as follows: create a stack of delete commits with gradually decreasing working copy size. Merge them into `master` in reverse order.

Reviewed By: StanislavGlebik

Differential Revision: D22864996

fbshipit-source-id: bfc60836553c656b52ca04fe5f88cdb1f15b2c18
2020-08-03 11:32:35 -07:00
Simon Farnsworth
a5e9b79d7d Return all errors in the event of a multiplexed put failure
Summary:
With upcoming write quorum work, it'll be interesting to know all the failures that prevent a put from succeeding, not just the most recent, as the most recent may be from a blobstore whose reliability is not yet established.

Store and return all errors, so that we can see exactly why a put failed

Reviewed By: ahornby

Differential Revision: D22896745

fbshipit-source-id: a3627a04a46052357066d64135f9bf806b27b974
2020-08-03 09:30:05 -07:00
Kostia Balytskyi
48aa00ed92 megarepolib: implement chunker from hint string
Summary:
"Chunking hint" is a string (expected to be in a file) of the following format:
```
prefix1, prefix2, prefix3
prefix4,
prefix5, prefix6
```

Each line represents a single chunk: if a path starts with any of the prefixes in the line, it belongs to the corresponding chunk. Prefixes are comma-separated. Any path that does not start with any prefix in the hint goes to an extra chunk.

This hint will be used in a new pre-merge-delete approach, to be introduced further in the stack.
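
A sketch of that parsing and classification logic (function names hypothetical, not megarepolib's API):

```rust
/// Parse a chunking hint: one chunk per line, comma-separated prefixes.
fn parse_hint(hint: &str) -> Vec<Vec<String>> {
    hint.lines()
        .map(|line| {
            line.split(',')
                .map(str::trim)
                .filter(|p| !p.is_empty())
                .map(String::from)
                .collect()
        })
        .filter(|chunk: &Vec<String>| !chunk.is_empty())
        .collect()
}

/// Return the chunk index for `path`; paths matching no prefix fall into
/// an extra chunk at index `chunks.len()`.
fn chunk_for(chunks: &[Vec<String>], path: &str) -> usize {
    chunks
        .iter()
        .position(|prefixes| prefixes.iter().any(|p| path.starts_with(p.as_str())))
        .unwrap_or(chunks.len())
}
```

For the example hint above, a path under `prefix4` lands in the second chunk, and any unmatched path lands in a fourth, extra chunk.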

Reviewed By: StanislavGlebik

Differential Revision: D22864999

fbshipit-source-id: bbc87dc14618c603205510dd40ee5c80fa81f4c3
2020-08-03 08:44:15 -07:00
Kostia Balytskyi
1825ed96d3 megarepolib: delete obsolete pre_merge_deletes impl
Summary:
We need to use a different type of pre-merge deletes, it seems, as the one proposed requires a huge number of commits. Namely, if we have `T` files in total in the working copy and we're happy to delete at most `D` files per commit, while merging at most `S` files per deletion stack:
```
#stacks = T/S
#delete_commits_in_stack = (T-X)/D
#delete_commits_total = T/S * (T-X)/D = (T^2 - TX)/SD ~ T^2/SD

T ~= 3*10^6

If D~=10^4 and X~=10^4:
#delete_commits_total ~= 9*10^12 / 10^8 = 9*10^4

If D~=10^5 and X~=10^5:
#delete_commits_total ~= 9*10^12 / 10^10 = 9*10^2
```

So either 90K or 900 delete commits. 90K is clearly too big. 900 may be tolerable, but it's still hard to manage and make sense of. What's more, there seems to be a way to produce fewer of these, see further in the stack.
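
The arithmetic above can be checked directly. Assuming, as the summary's examples imply, that S takes the same value as D and X in each scenario:

```rust
/// Total delete commits: (T/S) stacks, each containing (T - X)/D delete
/// commits. Integer sketch of the formula from the summary.
fn delete_commits_total(t: u64, s: u64, d: u64, x: u64) -> u64 {
    (t / s) * ((t - x) / d)
}
```

With T = 3,000,000 this gives 89,700 commits for S = D = X = 10^4 and 870 for S = D = X = 10^5, matching the ~9*10^4 and ~9*10^2 estimates.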

Reviewed By: StanislavGlebik

Differential Revision: D22864998

fbshipit-source-id: e615613a34e0dc0d598f3178dde751e9d8cde4da
2020-08-03 08:27:16 -07:00
Simon Farnsworth
a9b8793d2d Add a write-mostly blobstore mode for populating blobstores
Summary:
We're going to add an SQL blobstore to our existing multiplex, which won't have all the blobs initially.

In order to populate it safely, we want to have normal operations filling it with the latest data, and then backfill from Manifold; once we're confident all the data is in here, we can switch to normal mode, and never have an excessive number of reads of blobs that we know aren't in the new blobstore.

Reviewed By: krallin

Differential Revision: D22820501

fbshipit-source-id: 5f1c78ad94136b97ae3ac273a83792ab9ac591a9
2020-08-03 04:36:19 -07:00
Viet Hung Nguyen
578207d0dc mononoke/repo_import: add hg sync checker
Summary:
Related diff: D22816538 (3abc4312af)

In repo_import tool once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counter value related to hg_sync. If the counter value is larger or equal to the log id, we can move the bookmark to the next batch of commits. Otherwise, we sleep, retry fetching the mutable_counter value and compare the two again.
mutable_counters is an SQL table that can track bookmark update log instances with a counter.
This diff adds the functionality to extract the mutable_counters value for hg_sync.
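
The wait-and-retry loop described above, sketched with an injected fetch function standing in for the real mutable_counters query (names hypothetical):

```rust
/// Poll the hg_sync counter until it catches up with the largest
/// bookmarks_update_log id, giving up after `max_attempts` polls.
/// `fetch_counter` stands in for the real mutable_counters SQL query;
/// the real tool sleeps between attempts.
fn wait_for_hg_sync<F>(mut fetch_counter: F, max_log_id: i64, max_attempts: u32) -> bool
where
    F: FnMut() -> i64,
{
    for _ in 0..max_attempts {
        if fetch_counter() >= max_log_id {
            return true;
        }
        // std::thread::sleep(...) would go here in the real loop.
    }
    false
}
```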

======================
SQL query fix:
In the previous diff (D22816538 (3abc4312af)) we didn't cover the case where we might not get an ID, in which case we should return None. This diff fixes that error.

Reviewed By: StanislavGlebik

Differential Revision: D22864223

fbshipit-source-id: f3690263b4eebfe151e50b01a13b0193009e3bfa
2020-08-03 04:01:27 -07:00
Alex Hornby
3bd5ec74b0 mononoke: remove unused stats from walker state
Summary: The walker had a couple of unused stats fields in state.rs. Remove them.

Reviewed By: farnz

Differential Revision: D22863812

fbshipit-source-id: effc37abe29fafb51cb1421ff4962c5414b69be1
2020-08-03 01:39:39 -07:00
Jeremy Fitzhardinge
6a2846b1ca rust: mem::replace without using return value is just an assignment
Summary: 1.45 onwards warns about this.
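
A minimal illustration of the pattern the lint targets (my own example, not the code touched by this diff):

```rust
use std::mem;

/// Legitimate use: the old value is consumed, so mem::replace is needed.
fn take_buffer(slot: &mut String) -> String {
    mem::replace(slot, String::new())
}

/// When the old value is discarded, `mem::replace(slot, String::new());`
/// draws an unused-return-value warning on Rust 1.45+; plain assignment
/// does the same thing.
fn reset_buffer(slot: &mut String) {
    *slot = String::new();
}
```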

Reviewed By: dtolnay

Differential Revision: D22877852

fbshipit-source-id: 14286142593e84f1f996b05a9c061b4f6687d418
2020-07-31 18:38:35 -07:00
Alex Hornby
5f71745810 mononoke: fix flaky test test-walker-corpus.t
Summary:
This is expected to fix flakiness in test-walker-corpus.t

The problem was that if a FileContent node was reached via an Fsnode it did not have a path associated.  This is a race condition that I've not managed to reproduce locally, but I think it is highly likely to be the reason for the flaky failures on CI.

Reviewed By: ikostia

Differential Revision: D22866956

fbshipit-source-id: ef10d92a8a93f57c3bf94b3ba16a954bf255e907
2020-07-31 10:22:34 -07:00
Liubov Dmitrieva
cc2b5c04ca improve authentication handling
Summary:
There have been lots of issues with user experience related to authentication
and its help messages.

Just one example:
certs are configured to be used for authentication but are invalid; the `hg cloud auth`
command will provide a help message about the certs but then ask to copy and
paste a token from the interactive token-obtaining flow.

Another thing: if certs are configured to be used, it was hard to
set up a token for the Scm Daemon, which can still be on tokens even if cloud
sync uses certs.

Now it is possible with the `hg auth -t <token>` command.

Now everything should be cleaner and all the messages should be clearer as well.

The certs-related help message has also been improved.

Also, all tests except the main one were cleaned of authentication setup.
This is to simplify the tests.

Reviewed By: mitrandir77

Differential Revision: D22866731

fbshipit-source-id: 61dd4bffa6fcba39107be743fb155be0970c4266
2020-07-31 10:16:59 -07:00
Lukas Piatkowski
417d61f4b6 mononoke/mononoke_x_repo_sync_job: make mononoke_x_repo_sync_job and related public (#40)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/40

Those tools are being used in some integration tests, make them public so that the tests might pass

Reviewed By: ikostia

Differential Revision: D22844813

fbshipit-source-id: 7b7f379c31a5b630c6ed48215e2791319e1c48d9
2020-07-31 09:02:33 -07:00
Lukas Piatkowski
e78c6d58c3 mononoke/integration tests: use C locale by default (#41)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/41

As of D22098359 (7f1588131b) the default locale used by integration tests is en_US.UTF-8, but as the comment in the code mentions:
```
The en_US.UTF-8 locale doesn't behave the same on all systems and trying to run
commands like "sed" or "tr" on non-utf8 data will result in "Illegal byte
sequence" error.
That is why we are forcing the "C" locale.
```

Additionally I've changed the test-walker-throttle.t test to use "/bin/date" directly. Previously it was using "/usr/bin/date", but "/bin/date" is a more standard path, as it also works on macOS.

Reviewed By: krallin

Differential Revision: D22865007

fbshipit-source-id: afd1346e1753df84bcfc4cf88651813c06933f79
2020-07-31 09:02:33 -07:00
Lukas Piatkowski
203d186f68 mononoke/integration tests: remove test-gitimport-octopus.t from OSS tests
Summary: It currently fails for an unknown reason; will investigate later.

Reviewed By: mitrandir77, ikostia

Differential Revision: D22865324

fbshipit-source-id: c0513bfa2ce9f6baffebff472053e8a5d889c9ba
2020-07-31 08:02:46 -07:00
Stanislau Hlebik
cd2a3fcf32 mononoke: add allow_bookmark_update_delay
Summary:
Follow up from D22819791.
We want to use the bookmark update delay only in scs, so let's configure it this
way.

Reviewed By: krallin

Differential Revision: D22847143

fbshipit-source-id: b863d7fa4bf861ffe5d53a6a2d5ec44e7f60eb1a
2020-07-31 03:09:24 -07:00
Stanislau Hlebik
43ac2a1c62 mononoke: use WarmBookmarkCache in repo_client
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception in point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.

Reviewed By: krallin

Differential Revision: D22820879

fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
2020-07-31 03:09:24 -07:00
Alex Hornby
ecb58ff8d7 mononoke: add cmdlib argument to control cachelib zstd compression
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempt compression when putting to the cache if the object is larger than the cachelib max size.

To make the cache behaviour more testable, this change also adds an option to do an eager put to cache without the spawn. The default remains to do a lazy fire and forget put into the cache with tokio::spawn.

The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).

Reviewed By: StanislavGlebik

Differential Revision: D22797872

fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
2020-07-31 01:12:02 -07:00
Santiago Alfonso Muñoz Rodriguez
c32b31984f Resolve cmd line argument conflict on populate_healer
Summary: populate_healer would panic on launch because there were 2 arguments assigned to -d: debug and destination-blobstore-id

Reviewed By: StanislavGlebik

Differential Revision: D22843091

fbshipit-source-id: e300af85b4e9d4f757b4311f2b7d776f59c7527d
2020-07-31 00:17:43 -07:00
Jun Wu
b57b6f8705 changegroup: do not print 'adding changeset X' with --debug
Summary:
The debug print abuses the `linkmapper`. The Rust commit add logic does not
use `linkmapper`. So let's remove the debug message to be consistent with
the Rust logic.

Reviewed By: DurhamG

Differential Revision: D22657189

fbshipit-source-id: 2e92087dbb5bfce2f00711dcd62881aba64b0279
2020-07-30 20:32:35 -07:00
Jun Wu
26580d00af allow pulling with empty 'common' set
Summary:
The check does not practically work because the client sends `common=[null]`
if the common set is empty.

D22519582 changes the client-side logic to send `common=[]` instead of
`common=[null]` in such cases. Therefore remove the constraint to keep
tests passing. 13 tests depend on this change.

Reviewed By: StanislavGlebik

Differential Revision: D22612285

fbshipit-source-id: 48fbc94c6ab8112f0d7bae1e276f40c2edd47364
2020-07-30 20:00:41 -07:00
Arun Kulshreshtha
439dd2d495 gotham_ext: move client hostname lookup into gotham_ext
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.

Reviewed By: krallin

Differential Revision: D22835610

fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
2020-07-30 10:27:35 -07:00
Arun Kulshreshtha
d691e06abd tests: allow multiple curl error codes in test-lfs-server-https.t
Summary: When a TLS connection fails due to a missing client certificate, the `curl` command may fail with either code 35 or 56 depending on the TLS version used. With TLS v1.3, the error is explicitly reported as a missing client certificate, whereas in TLS v1.2, it is reported as a generic handshake failure. This is because TLS v1.3 defines an explicit [`certificate_required`](https://tools.ietf.org/html/rfc8446#section-4.4.2.4) alert, which is [not present](https://github.com/openssl/openssl/issues/6804) in earlier TLS versions.

Reviewed By: krallin

Differential Revision: D22834527

fbshipit-source-id: a15d6a169d35ece6ed5a54b37b8ca9bbc506b3da
2020-07-30 10:27:35 -07:00
Stanislau Hlebik
ffa578ed1f mononoke: change warm bookmark cache to store BookmarkKind
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

We'd like to use WarmBookmarkCache in repo client, and to do that we need to be
able to tell Publishing and PullDefault bookmarks apart. Let's teach
WarmBookmarksCache about it.

Reviewed By: krallin

Differential Revision: D22812478

fbshipit-source-id: 2642be5c06155f0d896eeb47867534e600bbc535
2020-07-30 07:28:44 -07:00
Stanislau Hlebik
445994e44a mononoke: add method for creating publishing bookmarks
Summary:
This method will be used in the next diff to add a test, but it might be more
useful later as well.

Note that the `update()` method in BookmarkTransaction already handles publishing bookmarks correctly

Reviewed By: farnz

Differential Revision: D22817143

fbshipit-source-id: 11cd7ba993c83b3c8bca778560af4a360f892b03
2020-07-30 07:28:43 -07:00
Stanislau Hlebik
8dcc48b90f mononoke: introduce SessionBookmarkCache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

The code for managing cached_publishing_bookmarks_maybe_stale was already a bit
tricky, and with WarmBookmarksCache introduction it would've gotten even worse.
Let's move this logic to a separate SessionBookmarkCache struct.

Reviewed By: krallin

Differential Revision: D22816708

fbshipit-source-id: 02a7e127ebc68504b8f1a7401beb063a031bc0f4
2020-07-30 07:28:43 -07:00
Lukas Piatkowski
9962321103 mononoke/regenerate_hg_filenodes: make regenerate_hg_filenodes public (#39)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/39

Reviewed By: krallin

Differential Revision: D22816308

fbshipit-source-id: e64b2b5f5b319814265fdb0129f2bce6b1a72a98
2020-07-30 06:50:54 -07:00
Lukas Piatkowski
4ccff9c2ef mononoke/megarepotool: make megarepotool public (#38)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/38

The tool is used in some integration tests, make it public so that the tests might pass

Reviewed By: ikostia

Differential Revision: D22815283

fbshipit-source-id: 76da92afb8f26f61ea4f3fb949044620a57cf5ed
2020-07-30 06:50:54 -07:00
Stanislau Hlebik
bca1052f78 mononoke: store publishing bookmarks in cache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

The problem with large changesets is deriving hg changesets for them. It might take
a significant amount of time, and that means that all the clients are stuck waiting on
listkeys() or heads() call waiting for derivation. WarmBookmarksCache can help here by returning bookmarks
for which hg changesets were already derived.

This is the second refactoring to introduce WarmBookmarksCache.
Now let's cache not only pull default, but also publishing bookmarks. There are two reasons to do it:
1) (Less important) It simplifies the code slightly
2) (More important) Without this change 'heads()' fetches all bookmarks directly from BlobRepo thus
bypassing any caches that we might have. So in order to make WarmBookmarksCache useful we need to avoid
doing that.

Reviewed By: farnz

Differential Revision: D22816707

fbshipit-source-id: 9593426796b5263344bd29fe5a92451770dabdc6
2020-07-30 03:35:02 -07:00
Stanislau Hlebik
6941d0cfe9 mononoke: do not store bytes in pull_default bookmarks cache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large commits.

This diff just does a small refactoring that makes introducing
WarmBookmarksCache easier. In particular, later in cached_pull_default_bookmarks_maybe_stale cache I'd like to store
not only PullDefault bookmarks, but also Publishing bookmarks so that both
listkeys() and heads() method could be served from this cache. In order to do
that we need to store not only bookmark name, but also bookmark kind (i.e. is
it Publishing or PullDefault).

To do that let's store the actual Bookmarks and hg changeset objects instead of
raw bytes.

Reviewed By: farnz

Differential Revision: D22816710

fbshipit-source-id: 6ec3af8fe365d767689e8f6552f9af24cbcd0cb9
2020-07-30 03:35:02 -07:00
Mateusz Kwapich
d1322c621d don't error out when path doesn't exist
Summary:
Most of our APIs throw an error when the path doesn't exist. I would like to
argue that's not the right choice for list_file_history.

Errors should only be returned in abnormal situations, and with the
`history_across_deletions` param there's no other easy way to check if the file
ever existed other than calling this API - so it's not abnormal to call
it with a path that doesn't exist in the repo.

Reviewed By: StanislavGlebik

Differential Revision: D22820263

fbshipit-source-id: 002bda2ef5ee9d6632259a333b7f3652cfb7aa6b
2020-07-30 03:25:01 -07:00
Viet Hung Nguyen
3abc4312af mononoke: add sql query to get max bookmark log id
Summary:
Added a new query function to get the largest log id from bookmarks_update_log.

In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check whether hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log and compare it with the mutable_counter value related to hg_sync. If the counter value is larger than or equal to the log id, we can move the bookmark to the next batch of commits.
Since this query wasn't implemented before, this diff adds this functionality.

Next step: add query for mutable_counter
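
The catch-up check described above can be sketched roughly as follows (an illustrative sketch only — the function name and arguments are assumptions for the example, not the actual repo_import API):

```rust
// Illustrative sketch: hg_sync has caught up once its mutable_counter value
// reaches (or passes) the largest id in bookmarks_update_log. Once this
// returns true, the bookmark can be moved to the next batch of commits.
fn hg_sync_has_caught_up(max_bookmark_log_id: i64, hg_sync_counter: i64) -> bool {
    hg_sync_counter >= max_bookmark_log_id
}

fn main() {
    // Counter lags behind the log: hg_sync hasn't seen the commits yet.
    assert!(!hg_sync_has_caught_up(6, 5));
    // Counter equal to the max log id: safe to move the bookmark.
    assert!(hg_sync_has_caught_up(5, 5));
}
```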

Reviewed By: krallin

Differential Revision: D22816538

fbshipit-source-id: daaa4e5159d561e698c6e1874dd8822546c699c7
2020-07-30 03:23:08 -07:00
Lukas Piatkowski
db2f711159 mononoke/hg_sync_job: make mononoke_hg_sync_job public (#37)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/37

mononoke_hg_sync_job is used in integration tests, make it public

Reviewed By: krallin

Differential Revision: D22795881

fbshipit-source-id: 7a32c8e8adf723a49922dbb9e7723ab01c011e60
2020-07-30 02:52:56 -07:00
Lukas Piatkowski
0b5ac21f79 mononoke/backsyncer_cmd: make backsyncer_cmd public (#36)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/36

This command is used in some integration tests, make it public.

Reviewed By: krallin

Differential Revision: D22792846

fbshipit-source-id: 39ac89b1a674ea63dc924cafa07107dbf8e5a098
2020-07-30 02:52:56 -07:00
Stanislau Hlebik
264a1493ca mononoke: fix a comment
Reviewed By: farnz

Differential Revision: D22816709

fbshipit-source-id: 7c338034bdfb835133eda12d23385fe432557868
2020-07-29 11:42:22 -07:00
Kostia Balytskyi
ff563aaf05 megarepolib: introduce stacked pre-merge deletes
Summary:
To gradually merge one repo into the other, we need to produce multiple slices of the working copy. The sum of these slices has to be equal to
the whole of the original repo's working copy. To create each of these slices all files but the ones in the slice need to be deleted from the working copy.
Before this diff, megarepolib would do this in a single delete commit. This, however, may be impractical, as it produces huge commits which we'd be unable
to process adequately. So this diff essentially introduces gradual deletion for each slice, and calls the sequence of delete commits for a slice a "deletion stack". This is how it looks (a copy from the docstring):

```
  M1
  . \
  . D11
  .  |
  . D12
  .   |
  M2   \
  . \   |
  . D21 |
  .  |  |
  . D22 |
  .   | |
  o    \|
  |     |
  o    PM
  ^     ^
  |      \
main DAG   merged repo's DAG
```
Where:
 - `M1`, `M2` - merge commits, each of which merges only a chunk
   of the merged repo's DAG
 - `PM` is a pre-merge master of the merged repo's DAG
 - `D11`, `D12`, `D21` and `D22` are commits, which delete
   a chunk of working copy each. Delete commits are organized
   into delete stacks, so that `D11` and `D12` progressively delete
   more and more files.
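
The stacking described above can be sketched as simple chunking of the files to be deleted (an illustrative sketch; the function name, signature and chunk size are assumptions for the example, not megarepolib's API):

```rust
// Illustrative sketch: each chunk becomes one delete commit in a slice's
// "deletion stack", so the stack progressively deletes more and more files
// instead of removing everything in one huge commit.
fn deletion_stack_chunks(files_to_delete: &[&str], chunk_size: usize) -> Vec<Vec<String>> {
    files_to_delete
        .chunks(chunk_size)
        .map(|chunk| chunk.iter().map(|f| f.to_string()).collect())
        .collect()
}

fn main() {
    // Three doomed files, two per delete commit -> a stack of two commits.
    let stacks = deletion_stack_chunks(&["a", "b", "c"], 2);
    assert_eq!(stacks.len(), 2);
}
```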

Reviewed By: StanislavGlebik

Differential Revision: D22778907

fbshipit-source-id: ad0bc31f5901727b6df32f7950053ecdde6f599c
2020-07-28 09:43:32 -07:00
Viet Hung Nguyen
f1ef619284 mononoke/repo_import: add phabricator lag checker
Summary:
Once we start moving the bookmark across the imported commits (D22598159), we need to check dependent systems to avoid overloading them while they parse the commits. In this diff we added the functionality to check Phabricator. We use an external service (jf graphql - find discussion here: https://fburl.com/nr1f19gs) to fetch commits from Phabricator. Each commit id starts with "r", followed by a call sign (e.g. FBS for fbsource) and the commit hash (https://fburl.com/qa/9pf0vtkk). If we try to fetch an invalid commit id (e.g. one without a call sign), we should receive an error. Otherwise, we should receive a JSON.
An imported commit should have the following query result: https://fburl.com/graphiql/20txxvsn - nodes has one result with the imported field true.
If the commit hasn't been recognised by Phabricator yet, the nodes array will be empty.
If the commit has been recognised, but not yet parsed, the imported field will be false.
If Phabricator hasn't parsed the batch yet, we try checking again after sleeping for a couple of seconds.
Once it has parsed the batch of commits, we move the bookmark to the next batch.
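
The three per-commit outcomes above can be sketched as a small predicate (illustrative only; `ImportStatus` is a stand-in for what the graphql query returns per commit, not the real client type):

```rust
// Hedged sketch of the batch check described above.
#[derive(PartialEq)]
enum ImportStatus {
    NotRecognised,       // empty `nodes` array: Phabricator hasn't seen the commit
    RecognisedNotParsed, // `imported` field is false
    Parsed,              // `imported` field is true
}

fn batch_is_parsed(statuses: &[ImportStatus]) -> bool {
    // Only move the bookmark once every commit in the batch is parsed;
    // otherwise sleep and poll Phabricator again.
    statuses.iter().all(|s| *s == ImportStatus::Parsed)
}

fn main() {
    assert!(batch_is_parsed(&[ImportStatus::Parsed, ImportStatus::Parsed]));
    assert!(!batch_is_parsed(&[ImportStatus::Parsed, ImportStatus::RecognisedNotParsed]));
}
```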

Reviewed By: krallin

Differential Revision: D22762800

fbshipit-source-id: 5c02262923524793f364743e3e1b3f46c921db8d
2020-07-28 08:09:21 -07:00
Lukas Piatkowski
22f90df1db mononoke/integration tests: use a combination of kill and wait to kill a process
Summary: On MacOS, if you kill a process without waiting for it to be killed, you will receive a warning on the terminal saying that the process was killed. To suppress that output, which was messing with the integration tests, use a combination of kill and wait (the custom "killandwait" bash function). It waits for the process to stop, which is probably what most integration tests would prefer to do.

Reviewed By: krallin

Differential Revision: D22790485

fbshipit-source-id: d2a08a5e617e692967f8bd566e48f5f9b50cb94d
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
9db04f2daa mononoke/integration tests: use "date" command directly rather than via path
Summary: Using "/usr/bin/date" rather than just "date" is very limiting; not all systems have common command line tools installed in the same place, so just use "date".

Reviewed By: krallin

Differential Revision: D22762186

fbshipit-source-id: 747da5a388932fb5b9f4c068014c01ee90a91f9b
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
ec9be535eb mononoke/integration tests: use LC_ALL=C locale
Summary: On MacOS the default localisation configuration (UTF-8) won't allow operations on arbitrary bytes of data via some commands, because not all sequences of bytes are valid UTF-8 characters. That is why, when handling arbitrary bytes, it is better to use the "C" locale, which can be achieved by setting the LC_ALL env variable to "C".

Reviewed By: krallin

Differential Revision: D22762189

fbshipit-source-id: aa917886c79fba5ea61ff7168767fc4b052a35a1
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
16182e626b mononoke/integration tests: use newer bash version on MacOS GitHub CI runs
Summary: Use brew on MacOS GitHub CI runs to update bash from 3.* to 5.*.

Reviewed By: krallin

Differential Revision: D22762195

fbshipit-source-id: b3a4c9df7f8ed667e88b28aacf7d87c6881eb775
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
88782c8f69 mononoke/integration tests: use GNU command lines to run tests on MacOS
Summary: MacOS uses FreeBSD version of command line tools. This diff uses brew to install the GNU tooling on GitHub CI and uses it to run the integration tests.

Reviewed By: krallin

Differential Revision: D22762198

fbshipit-source-id: 1f67674392bf6eceea9d2de02e929bb3f9f7cadd
2020-07-28 08:02:52 -07:00
Alex Hornby
c01ba6abfa mononoke: log route to step on unexpected errors from walker
Summary:
On unexpected errors like missing blobstore keys, the walker will now log the preceding node (source) and an interesting step to this node (not necessarily the immediately preceding one, e.g. the affected changeset).

Validate mode produces route information with interesting-step tracking enabled; scrub currently does not, to save time and memory. Blobstore errors in scrub mode can be reproduced in validate mode when the extra context from the graph route is needed.

Reviewed By: farnz

Differential Revision: D22600962

fbshipit-source-id: 27d46303a2f2c07219950c20cc7f1f78773163e5
2020-07-28 05:21:48 -07:00
Harvey Hunt
e5b249cefc mononoke: lfs_server: Use enforce_acl_check as a killswitch for ACL enforcement
Summary:
Now that we can configure ACL checking on a per-repo basis, use the
`enforce_acl_check` config option as a killswitch to quickly disable ACL
enforcement, if required.

Further, remove the `acl_check` config flag that was always set to True.

As part of this change I've refactored the integration test a little and
replaced the phrase "ACL check" with "ACL enforcement", as we always check the
ACL inside of the LFS server.

Reviewed By: krallin

Differential Revision: D22764510

fbshipit-source-id: 8e09c743a9cd78d54b1423fd2a5cfc9bf7383d7a
2020-07-28 04:57:01 -07:00
Lukas Piatkowski
d52ea235c7 mononoke/integration tests: sqlite - cast BLOB to TEXT before applying LIKE operation
Summary: Some versions of sqlite don't allow using the LIKE operation on BLOB data, so first cast it to TEXT. This test was failing on Linux runs on GitHub.

Reviewed By: krallin

Differential Revision: D22761041

fbshipit-source-id: 567d68050297c3a2ac781b252d3e9b21ea5b2201
2020-07-27 14:35:01 -07:00
Lukas Piatkowski
d5dd156cfb mononoke/integration tests: install missing jq command on MacOS
Reviewed By: krallin

Differential Revision: D22762192

fbshipit-source-id: 73f12e65c3ab63910638f16197e5629a7d6efb2c
2020-07-27 14:35:01 -07:00
Lukas Piatkowski
db06969e0d mononoke/integration tests: create exclusion list of integration tests
Summary: Have a comprehensive list of OSS tests that do not pass yet.

Reviewed By: krallin

Differential Revision: D22762196

fbshipit-source-id: 19ab920c4c143179db65a6d8ee32974db16c5e3d
2020-07-27 14:35:01 -07:00
Harvey Hunt
cce86abf14 mononoke: lfs_server: Enforce ACL checks on a per repo basis
Summary:
Update the LFS server to use the `enforce_lfs_acl_check` to enforce
ACL checks for specific repos and also reject clients with missing idents.

In the next diff, I will use the existing LFS server config's
`enforce_acl_check` flag as a killswitch.

Reviewed By: krallin

Differential Revision: D22762451

fbshipit-source-id: 61d26944127711f3503e04154e8c079ae75dc815
2020-07-27 11:04:59 -07:00
Stanislau Hlebik
97cc687069 mononoke: add an option to disable leases in backfill_derive_data
Summary:
Let's by default not take a lease so that derived_data_tailer can make progress even if all other services are failing to derive.

One note - we don't remove the lease completely; rather, we use another lease that's separate from the lease used by other Mononoke services. The motivation here is to make sure we don't derive unodes 4 times - blame, deleted_file_manifest and fastlog all want to derive unodes, and with no lease at all they would all derive the same data a few times. Deriving unodes multiple times seems undesirable, so I suggest using an InProcessLease instead of no lease.
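
To illustrate why even a process-local lease prevents the duplicate derivation, here is a toy lease (it only shares the spirit of Mononoke's InProcessLease, not its actual implementation or API):

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Toy lease: the first caller to claim a key wins; later callers get
// `false` and can wait for the winner instead of re-deriving the same data.
struct ToyLease {
    held: Mutex<HashSet<String>>,
}

impl ToyLease {
    fn new() -> Self {
        ToyLease { held: Mutex::new(HashSet::new()) }
    }

    fn try_claim(&self, key: &str) -> bool {
        // `insert` returns true only if the key was not already held.
        self.held.lock().unwrap().insert(key.to_string())
    }
}

fn main() {
    let lease = ToyLease::new();
    // blame wins the claim for unodes; fastlog and deleted_file_manifest
    // would see `false` and wait rather than derive the same data again.
    assert!(lease.try_claim("unodes"));
    assert!(!lease.try_claim("unodes"));
}
```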

Reviewed By: krallin

Differential Revision: D22761222

fbshipit-source-id: 9595705d955f3bb2fe7efd649814fc74f9f45d54
2020-07-27 07:13:30 -07:00
Mark Thomas
89bc34035b scuba: add log sequence numbers
Summary:
Add log sequence numbers to the scuba sample builder.  This provides an ordering
over the logs made by an individual instance of Mononoke, allowing them to be
sorted.
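
The idea can be sketched with a single process-wide counter (illustrative only; the real scuba sample builder API is not shown here):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Toy sketch: one process-wide atomic counter stamps every sample, giving
// a total order over the logs made by a single Mononoke instance.
static LOG_SEQ: AtomicU64 = AtomicU64::new(0);

fn next_log_seq() -> u64 {
    LOG_SEQ.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    // Sequence numbers are strictly increasing within the process,
    // so logs can later be sorted even if they arrive out of order.
    let a = next_log_seq();
    let b = next_log_seq();
    assert!(b > a);
}
```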

Reviewed By: krallin

Differential Revision: D22728880

fbshipit-source-id: 854bde51c7bfc469677ad08bb738e5097cb05ad5
2020-07-27 07:10:07 -07:00
Simon Farnsworth
a40a8f36b7 Asyncify MultiplexedBlobstore
Summary:
We have two deficiencies to correct in here; modernise the code without changing behaviour first to make it easier to later fix them.

Deficiency 1 is that we always call the `on_put` handler; we need a mode that doesn't do that unless a blobstore returns an error, for jobs not waiting on a human.

Deficiency 2 is that we accept a write after one blobstore accepts it; there's a risk of that being the only copy if a certain set of race conditions are met

Reviewed By: StanislavGlebik

Differential Revision: D22701961

fbshipit-source-id: 0990d3229153cec403717fcd4383abcdf7a52e58
2020-07-27 06:09:47 -07:00
Stanislau Hlebik
fd153acdef mononoke: make it possible to build sparse skiplist
Summary:
as in title.

Since we haven't tested it much yet I've added a note that this feature is
experimental

Reviewed By: krallin

Differential Revision: D22760648

fbshipit-source-id: 33f858b0021939dabbe1894b08bd495464ad0f63
2020-07-27 03:48:30 -07:00
Stanislau Hlebik
82c291010b mononoke: small refactoring of admin skiplist_subcommand
Summary:
Move changeset_fetcher building to a separate function, because
build_skiplist_index is already rather large and I'm going to make it larger in
the next diff

Reviewed By: krallin

Differential Revision: D22760556

fbshipit-source-id: 800baba052f46ed817f011f71dd28d40e98245fe
2020-07-27 03:48:30 -07:00
Lukas Piatkowski
006b80bf1b mononoke/integration tests: fix for test-security-whitelist.t - override allowed id
Reviewed By: krallin

Differential Revision: D22760347

fbshipit-source-id: 613bca3073e404b02c55c557a3835d0738c10102
2020-07-27 02:45:58 -07:00
Stanislau Hlebik
88975e359e mononoke: RFC skiplist with gaps
Summary:
Currently our skiplists store a skip edge for almost all public commits. This
is problematic for a few reasons:
1) It uses more memory
2) It increases the startup time
3) It makes startup flakier. We've noticed a few times that our backend storage
returns errors more often when we try to download large blobs.

Let's change the way we build skiplists. Let's not index every public changeset
we have, but rather index more selectively. See comments for more details.
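
One possible gap scheme looks like the following (purely illustrative — the actual selection logic lives in the skiplist builder and is only described in the code comments, not here):

```rust
// Purely illustrative: give a skip edge only to every `gap`-th commit
// (by generation number), instead of indexing every public changeset.
// Fewer indexed commits means less memory and a smaller blob to load
// at startup.
fn should_have_skip_edge(generation: u64, gap: u64) -> bool {
    generation % gap == 0
}

fn main() {
    // With gap = 5, only 1 in 5 commits carries a skip edge.
    let indexed = (0u64..100).filter(|g| should_have_skip_edge(*g, 5)).count();
    assert_eq!(indexed, 20);
}
```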

Reviewed By: farnz

Differential Revision: D22500300

fbshipit-source-id: 7e9c887595ba11da80233767dad4ec177d933f72
2020-07-27 01:33:57 -07:00
Kostia Balytskyi
24b4b02df6 megarepolib: impl create delete commits
Summary:
This adds `megarepolib` support for pre-merge "delete" commits creation.
Please see `create_sharded_delete_commits` docstring for explanation of what
these "delete" commits are.

This functionality is not used anywhere yet (I intend to use it from
`megarepotool`), so I've added what I think is reasonable test coverage.

Reviewed By: StanislavGlebik

Differential Revision: D22724946

fbshipit-source-id: a8144c47b92cb209bb1d0799f8df93450c3ef29f
2020-07-26 05:16:29 -07:00
Lukas Piatkowski
2c5cc232fc mononoke/x509 identity: add OSS parsing of x509 certificates (#32)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/32

This parsing uses the standard "subject name" field of a x509 certificate to create MononokeIdentity.

Reviewed By: farnz

Differential Revision: D22627150

fbshipit-source-id: 7f4bfc87dc2088bed44f95dd224ea8cdecc61886
2020-07-24 09:05:52 -07:00
Stanislau Hlebik
4ddf071f7e mononoke: update walker to visit content referenced by fsnodes
Summary: If fsnodes point to non-existent content we should be able to detect that.

Reviewed By: farnz

Differential Revision: D22723866

fbshipit-source-id: 31510aada5e21109b498a26e28e0f6f3b7358ec4
2020-07-24 09:03:47 -07:00
Stanislau Hlebik
b8e68c433c mononoke: fix help message
Reviewed By: krallin

Differential Revision: D22723876

fbshipit-source-id: 68e46dcd3fe3998cca49abf5d7f11068186341ea
2020-07-24 08:52:32 -07:00
Arun Kulshreshtha
2566915fcd edenapi_server: fix comment in errors.rs
Reviewed By: StanislavGlebik

Differential Revision: D22705442

fbshipit-source-id: efa51077372ec9381c56d47f240f54fec573bc3c
2020-07-24 00:56:30 -07:00
Stefan Filip
ca09a04945 regenerate projects
Summary: A few projects were out of sync between TARGETS and Cargo.toml.

Reviewed By: dtolnay

Differential Revision: D22704460

fbshipit-source-id: 3d809292d50cc42cfbc4973f7b26af38d931121f
2020-07-23 15:03:46 -07:00
Stanislau Hlebik
4e252cbf2e mononoke: add --limit to backfill_derived_data
Summary: It's nice to have this flag available

Reviewed By: krallin

Differential Revision: D22693732

fbshipit-source-id: 9d0d8f44cb0f5f7263a33e86e9c5b8a9927c0c85
2020-07-23 13:33:16 -07:00
Mateusz Kwapich
2bf5cf7ca1 add descendants_of argument to commit_path_history
Differential Revision: D22502592

fbshipit-source-id: 8e3acf00a6c4dc9b651551a6723b582d9bcaca39
2020-07-23 07:34:52 -07:00
Mateusz Kwapich
3b36868c5c add descendants_of arg to changeset_path history
Summary: This turned out to be quite complex.

Differential Revision: D22502591

fbshipit-source-id: 4bd9b90e9c88c234b84fbea2221036387a037ba3
2020-07-23 07:34:52 -07:00
Mateusz Kwapich
52863fa3e3 remove terminator argument
Summary: now the terminator argument is unused - we can get rid of it.

Differential Revision: D22502594

fbshipit-source-id: e8ecec01002421baee38be0c7e048d08068f2d74
2020-07-23 07:34:52 -07:00
Mateusz Kwapich
39c0b018ce migrate time filters to use visitor
Summary:
`until_timestamp` will benefit from checking each node - this will allow for
less filtering on the caller side.

Differential Revision: D22502595

fbshipit-source-id: 23c574b0a4eeb700cf7ea8c1ea29e3a6520097a9
2020-07-23 07:34:52 -07:00
Mateusz Kwapich
451c5e9827 introduce a Visitor trait
Summary:
This new trait is going to replace the `Terminator` argument to the fastlog
traversal function. Instead of deciding whether or not we should fetch a given
fastlog batch, this trait allows us to make decisions based on each visited changeset.
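
The shape of the idea can be sketched like this (names and signatures are illustrative assumptions, not the real fastlog trait):

```rust
// Illustrative sketch: a visitor consulted for every visited changeset,
// rather than a terminator consulted once per fastlog batch.
trait Visitor {
    // Return false to stop traversing past this changeset.
    fn visit(&mut self, changeset_timestamp: i64) -> bool;
}

// Example implementation: a time filter like `until_timestamp` can be
// expressed as stopping at changesets older than a cutoff.
struct UntilTimestamp {
    cutoff: i64,
}

impl Visitor for UntilTimestamp {
    fn visit(&mut self, changeset_timestamp: i64) -> bool {
        changeset_timestamp >= self.cutoff
    }
}

fn main() {
    let mut v = UntilTimestamp { cutoff: 100 };
    assert!(v.visit(150));  // keep traversing
    assert!(!v.visit(50));  // stop here
}
```

Per-changeset decisions like this allow less filtering on the caller side, which is exactly the benefit the later time-filter diff claims.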

Differential Revision: D22502590

fbshipit-source-id: 19f9218958604b2bcb68203c9646b3c9b433541d
2020-07-23 07:34:52 -07:00
Mateusz Kwapich
646e321d7c better public api for deleted files manifest
Summary:
The function for finding the commit where the file was deleted
in the fastlog module doesn't depend on fastlog at all.
It also seems generic enough to be a good public API for deleted files
manifests module.

Differential Revision: D22502596

fbshipit-source-id: 2e226bf14339da886668ee8e3402a08e8120266e
2020-07-23 07:34:51 -07:00
Mateusz Kwapich
b08005947a encapsulate visiting a node
Summary:
Let's centralize the logic that adds new nodes to the BFS queue during fastlog
traversal; this will allow me to hook into it in the next diffs.

Differential Revision: D22502593

fbshipit-source-id: 63f4e7adb3a7e11b4a2b2dcc65cab3bb4bf6f015
2020-07-23 07:34:51 -07:00
Mateusz Kwapich
41d11ca236 introduce find_merges feature
Summary:
This new skiplist feature allows finding merges between any two related
commits.

Reviewed By: StanislavGlebik

Differential Revision: D22457894

fbshipit-source-id: 203d43588040759b89a895395058a21c9b5ca43d
2020-07-23 07:34:51 -07:00
Mateusz Kwapich
091f06f47d save the visited nodes during skiplist traversal
Summary: I'm planning to use it in my next diff to power `find_merges` functionality.

Reviewed By: StanislavGlebik

Differential Revision: D22457898

fbshipit-source-id: 76c3f107fd8b5bbef96e978037be31efca0f9841
2020-07-23 07:34:51 -07:00
Mateusz Kwapich
3f8db5fda3 extract processing non-skippable nodes into separate function
Summary:
The `process_frontier` function is THE function that powers skiplist traversal
and it's quite complex. To make the core more readable I'm moving part of the
code into separate functions.

Differential Revision: D22457896

fbshipit-source-id: e3521855ae7ab889c21d7aff0204e27dc23cf906
2020-07-23 07:34:51 -07:00
Mateusz Kwapich
08462ea85c extract single skiplist traversal step to a separate
Summary:
The `process_frontier` function is THE function that powers skiplist traversal
and it's quite complex. To make the core more readable I'm moving parts of the
code into separate functions.

I'm also planning to use the single step function to simplify lowest common
ancestor algorithm later.

Differential Revision: D22457895

fbshipit-source-id: 1234118705ca6b1b61e09fdd7867ce4366045a28
2020-07-23 07:34:51 -07:00
Kostia Balytskyi
dde78cb83e backsyncer: move under commit_rewriting
Summary: Same as a previous diff. Let's keep the top-level dir tidy.

Reviewed By: krallin

Differential Revision: D22691638

fbshipit-source-id: 7f9a21f307efd9bbe37f515f475409c89b99cd31
2020-07-23 06:32:52 -07:00
Kostia Balytskyi
d99cae1ba5 megarepolib: move under commit_rewriting
Summary:
There seems to be too many things at the top level of `Mononoke` already.
Let's make sure all x-repo thingies live under the same directory.

Reviewed By: krallin

Differential Revision: D22691539

fbshipit-source-id: 19feeb6777309b9034f8620bd211041b61b08bfc
2020-07-23 06:32:52 -07:00
Alex Hornby
6a0b614a90 mononoke: fix flaky test-walker-validate.t
Summary: Fix flaky test-walker-validate.t.   There can be more than one route to the bad filenode, so wildcard the src and via fields when matching the output.

Reviewed By: StanislavGlebik

Differential Revision: D22664371

fbshipit-source-id: f4d880187ec2b557fb5f69ad546c2486d150b337
2020-07-22 11:05:48 -07:00
Arun Kulshreshtha
edb856a77e mononoke_api: use create_getpack_v2_blob in HgFileContext
Summary: Use `create_getpack_v2_blob` instead of `create_getpack_v1_blob` for fetching file content because the former also provides metadata, which is required to support LFS.

Reviewed By: quark-zju

Differential Revision: D22564950

fbshipit-source-id: 2835160a9dfd18b80cd13e4a5dbcf6f4ce2f4579
2020-07-22 10:37:14 -07:00
Kostia Balytskyi
55cb8dba22 admin: asyncify crossrepo subcommand
Summary: This makes crossrepo subcommand not import `BoxFuture`

Reviewed By: StanislavGlebik

Differential Revision: D22647197

fbshipit-source-id: ea2dd20039a8aaf96be0483cc25f3fad38d262f5
2020-07-22 09:17:43 -07:00
Stanislau Hlebik
aaafd7a707 mononoke: allow specifying list to redact in a file
Summary:
For large lists it's much more convenient to specify them in a file - we are
not limited by cmd line size limit.

Reviewed By: krallin

Differential Revision: D22595023

fbshipit-source-id: 93035208700f981453eaf98f84341a86f2f1c04d
2020-07-22 07:37:36 -07:00
Kostia Balytskyi
1e5a0dc4db admin: add crossrepo config subcommand
Summary: This is to be able to inspect `LiveCommitSyncConfig` from our admin tooling.

Reviewed By: StanislavGlebik

Differential Revision: D22497065

fbshipit-source-id: 3070890b7dc2a4075a5c15aca703494e33ee6530
2020-07-22 07:34:59 -07:00
Stanislau Hlebik
c6ff6a0216 mononoke: add a command to verify manifests
Summary:
We have three different types of manifests that store file type and content -
hg manifests, fsnodes and unodes.

Let's add a command that verifies that these manifests are consistent.

There's some copy-paste in the code when listing manifests (e.g. list_fsnodes,
list_unodes etc are quite similar). There might be a way to have less
copy-paste, but given that each of the functions have some small differences it
doesn't really seem worth it.

Reviewed By: krallin

Differential Revision: D22663631

fbshipit-source-id: 487be8611df218472cec1899f34367906794484b
2020-07-22 07:25:31 -07:00
Lukas Piatkowski
808a00ea73 mononoke/integration tests: provide localhost certs for OSS test runs (#30)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/30

Reviewed By: johansglock

Differential Revision: D22596741

fbshipit-source-id: e8cf5da09f5c8c812b78b44aac969f52e195684f
2020-07-22 03:29:19 -07:00
Kostia Balytskyi
dd4fcd7aee commit_validator: validate according to recorded sync config version
Summary: `commit_validator` cannot just use the latest `CommitSyncConfig` to validate how a commit was synced between large and small repos. Instead, when querying `synced_commit_mapping`, it needs to pay attention to the `version_name` used to create the mapping. Then it needs to query `LiveCommitSyncConfig` for a config of this version and use it as the validation basis.

Reviewed By: StanislavGlebik

Differential Revision: D22525606

fbshipit-source-id: 6c32063b18461d592d931316aec7fd041bcc1ae4
2020-07-21 10:19:44 -07:00
Egor Tkachenko
661d31bd21 mononoke: check that repo is locked during unbundle process
Summary:
Currently the repo lock is checked only once, at the beginning of the unbundle future. The unbundle process takes some time, and during that time the repo can be locked by someone.
We can reduce that possibility by creating an additional future which checks the repo lock in a loop, and polling both futures to see whichever finishes first.

Reviewed By: StanislavGlebik

Differential Revision: D22560907

fbshipit-source-id: 1cba492fa101dba988e07361e4048c6e9b778197
2020-07-21 09:41:38 -07:00
Kostia Balytskyi
be29e70289 admin: report version in crossrepo map
Summary: What the title says.

Reviewed By: farnz

Differential Revision: D22476423

fbshipit-source-id: 5d6781fc09e3f89c1d555787821770d6b131c9f2
2020-07-21 09:09:23 -07:00
Kostia Balytskyi
9d607c382b commit sync config versioning: expose version in RewrittenAs
Summary:
This is the next step in exposing the version used to sync commits in read
queries. The previous step was to query it from the DB; now let's also put it
into an enum payload. Further, I will add consumers of this in admin and
validation.

Note that ideally `RewrittenAs` should always have a version associated with
it, but:
- right now this is not true in the DB (needs backfilling)
- even when I backfill everything, I would not like to error out just at
  reading time; I would prefer the consumers to deal with the absence of a
  version in rewritten commits.

Therefore, I decided to use `Option` here.

Reviewed By: StanislavGlebik

Differential Revision: D22476166

fbshipit-source-id: 5bc27bb21b7e59c604755ef35aa5d3d2c3df527e
2020-07-21 09:09:23 -07:00
Kostia Balytskyi
fc61c74f23 live_commit_sync_config: make scs xrepo-lookup watch live changes
Summary:
When we change `CommitSyncConfig`, we want to not have to restart `scs` servers, and instead have them pick up the new config by using `LiveCommitSyncConfig`.

This diff turned out larger than I expected, mainly due to the need to introduce various things around `TestLiveCommitSyncConfig`:
- `TestLiveCommitSyncConfig`: the trait implementer to use in `mononoke_api::Repo`
- `TestLiveCommitSyncConfigSource`: the helper struct to keep around for new values injection (very similar to how our `ConfigStore` has an inner `ConfigSource`, which can also be `TestSource`, but here injected values can be `CommitSyncConfig` instead of JSON).
- all the places in integration tests, where `setup_configerator_configs` is now needed (I plan to start setting up configerator configs always in a separate diff, as it is cheap)

Here are the explanations for a few things I think may be not immediately obvious:
- I removed the `Clone` bound from `LiveCommitSyncConfig` trait, as `Clone` traits [cannot be made into trait objects](https://doc.rust-lang.org/book/ch17-02-trait-objects.html#object-safety-is-required-for-trait-objects)
- `TestLiveCommitSyncConfigSource` vs `TestLiveCommitSyncConfigSourceInner` discrepancy is to ensure nobody should instantiate `TestLiveCommitSyncConfigSourceInner` outside of `live_commit_sync_config/src`
- I am aware of the ugly discrepancy between the main `--mononoke-config-path`, which is used to load initial configuration and can be both a file-based and a configerator-based config; and `--local-configerator-path`, used to override config sources for `Tunables` and `LiveCommitSyncConfig`. Ideally these two should just be unified somehow, but that is a little out of scope of this diff (I've already added it to the dirt doc though).
- in `mononoke_api::Repo` there are methods `new_test_xrepo` and `new_test_common`, which changed from maybe accepting just a `CommitSyncConfig` to now maybe accepting both `CommitSyncConfig` and `LiveCommitSyncConfig`. It can be made a bit cleaner: I can just query `CommitSyncConfig` from `LiveCommitSyncConfig` in `new_test_common` and avoid having two args. I was too lazy to do this, lmk if you feel strongly about it.

Reviewed By: StanislavGlebik

Differential Revision: D22443623

fbshipit-source-id: 0d6bbda2223e77b89cc59863b703db5081bcd601
2020-07-21 09:09:23 -07:00
Viet Hung Nguyen
c5e880c239 mononoke/repo_import: Add bookmark moving functionality
Summary: After we derived the bonsaichangesets (D22455297), we want to move a bookmark in small increments to reveal the commits to the users (https://fburl.com/wiki/zp9hgd7z step 8). This diff adds this functionality to repo_import and automates this step.

Reviewed By: StanislavGlebik

Differential Revision: D22598159

fbshipit-source-id: 01db898f07a0b7be50c3f595e78931204f33bb46
2020-07-21 06:47:51 -07:00
Alex Hornby
2aaddc487e mononoke: add context to alias deserialize error handling
Summary: Add context to show the affected key if there are problems deserializing an alias.

Reviewed By: krallin

Differential Revision: D22629544

fbshipit-source-id: 1718d4187386e37038bb5c958db2659bd5b54cfd
2020-07-21 06:11:46 -07:00
Jun Wu
0562c1220f tests: enable template-new-builtin
Summary: This makes tests depend less on revision numbers.

Reviewed By: DurhamG

Differential Revision: D22468669

fbshipit-source-id: 74a06930faa3e6ee9d246ecc718c2a3740f57a54
2020-07-20 17:27:53 -07:00
Mateusz Kwapich
0a8aaa461c repo_stack_info: return queue when the limit is hit
Summary:
This allows clients to get more commits in a follow-up query; it also lets the
clients know that the limit was reached.

Reviewed By: markbt

Differential Revision: D22576875

fbshipit-source-id: 93de20b1033cd5d0cdf902a418d7b727b03d2d08
2020-07-20 10:39:52 -07:00
Mateusz Kwapich
3547ffd3a9 repo_stack_info: make the repo_stack_info results ordered
Summary:
We already traverse in useful order, let's preserve this order as it's natural
for clients to expect some kind of topological ordering.

Reviewed By: markbt

Differential Revision: D22576873

fbshipit-source-id: 32a6f0de1ba9cc473e57b5a69fde538dfe8a3d75
2020-07-20 10:39:52 -07:00
Mateusz Kwapich
a142790f41 repo_stack_info limit the limit
Summary: This protects us from abuse in the form of crazy big queries.

Reviewed By: markbt

Differential Revision: D22576876

fbshipit-source-id: 0e407d79ba367f1b42faa82e4757053e43001e50
2020-07-20 10:39:52 -07:00
Mateusz Kwapich
6bc6730839 repo_stack_info: limit the output size, not traversal depth.
Summary:
The customers need our queries to return predictable amounts of output,
while limiting the depth doesn't really achieve that: the output may
theoretically be exponential.
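
The change can be sketched as bounding the result count during traversal rather than cutting off at a depth (illustrative sketch only; names and signatures are assumptions, not the repo_stack_info code):

```rust
use std::collections::VecDeque;

// Illustrative sketch: traverse breadth-first but stop once the *output*
// reaches `limit`, instead of bounding traversal depth. A depth bound can
// still admit exponentially many results; an output bound cannot.
fn collect_limited(children: &dyn Fn(u32) -> Vec<u32>, root: u32, limit: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut queue = VecDeque::from([root]);
    while let Some(node) = queue.pop_front() {
        if out.len() == limit {
            break; // limit on results, not on depth
        }
        out.push(node);
        queue.extend(children(node));
    }
    out
}

fn main() {
    // Tiny stack: commit 0 has children 1 and 2; a limit of 2 stops early.
    let f = |n: u32| if n == 0 { vec![1u32, 2] } else { Vec::new() };
    assert_eq!(collect_limited(&f, 0, 2), vec![0, 1]);
}
```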

Reviewed By: markbt

Differential Revision: D22576874

fbshipit-source-id: 65c793b229889c0cf26af693537ed6dbc0d33ebd
2020-07-20 10:39:52 -07:00
Alex Hornby
5348068572 mononoke: walker: log changeset missing link nodes were reached via
Summary: Log the changeset via which missing link nodes were reached.

Reviewed By: farnz

Differential Revision: D20124059

fbshipit-source-id: 412a88245abb3b761c2363c33c28206ab749d572
2020-07-20 04:58:16 -07:00
Alex Hornby
d07a69f71e mononoke: walker: log changeset non-public commits were reached via
Summary: Log the changeset via which non-public commits were reached.

Reviewed By: krallin

Differential Revision: D20115443

fbshipit-source-id: b2bed03b42c26785f845e055b984fda8d5af63e9
2020-07-20 04:58:16 -07:00
Stanislau Hlebik
3ba14b7ff1 Back out "mononoke: use batch_derive method in derived data utils"
Reviewed By: ahornby

Differential Revision: D22623587

fbshipit-source-id: 52e56e5b3ceb170ea41c58020fcd986023728ea2
2020-07-20 03:18:55 -07:00
Jun Wu
eb4c007145 changelog: use Rust RevlogIndex for partialmatch
Summary:
I dropped the special case of wdir handling. With the hope that we will handle
the virtual commits differently eventually (ex. drop special cases, insert real
commits to Rust DAG but do not flush them to disk, support multiple wdir
virtual commits, null is no longer an ancestor of every commit).

`test-listkeyspatterns.t` is changed because `0` no longer resolves to `null`.

Reviewed By: DurhamG

Differential Revision: D22368836

fbshipit-source-id: 14b9914506ef59bb69363b602d646ec89ce0d89a
2020-07-17 22:23:04 -07:00
Stanislau Hlebik
9d18c46b1f remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:16:13 -07:00
Stanislau Hlebik
3665548bb0 remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:16:13 -07:00
Qinfan Wu
bf06c5782a Import libra-crypto
Summary: Will be useful and avoid copy-pasting some code.

Differential Revision: D22592284

fbshipit-source-id: c42df645042722ea26c13d0737cb349fc2e8fbc1
2020-07-17 13:55:04 -07:00
Thomas Orozco
8a244dc0de mononoke/derived_data: log perf counters to derived data Scuba
Summary:
Like it says in the title. This would be helpful to understand why a particular
derivation took a given amount of time. To avoid having other work that shares
this CoreContext resulting in biased counters, I set this up so that we start
new perf counters for derivation.

Reviewed By: farnz

Differential Revision: D22595473

fbshipit-source-id: de85d5108aabde23cf6587662f15f25aac0cd650
2020-07-17 04:32:12 -07:00
Arun Kulshreshtha
6849666105 edenapi_types: add metadata field to DataEntry
Summary:
Add a metadata field to `read_res` containing a `revisionstore::Metadata` struct (which contains the object size and flags). The main purpose of this is to support LFS, which is indicated via a metadata flag.

Although this change affects the `DataEntry` struct which is serialized over the wire, version skew between the client and server should not break things since the field will automatically be populated with a default value if it is missing in the serialized response, and ignored if the client was built with an earlier version of the code without this field.

In practice, version skew isn't really a concern since this isn't used in production yet.

Reviewed By: quark-zju

Differential Revision: D22544195

fbshipit-source-id: 0af5c0565c17bdd61be5d346df008c92c5854e08
2020-07-16 13:32:19 -07:00
Lukasz Piatkowski
0dd3c4e4bb add Mononoke integration tests CI (#26)
Summary:
This diff adds a minimal workflow for running integration tests for Mononoke. Currently only one test is run and it fails.

This also splits the regular Mononoke CI into separate files for Linux and Mac to match the current style in Eden repo.
There are also "scopeguard::defer" fixes here that somehow escaped the CI tests.
Some tweaks have been made to "integration_runner_real.py" to make it runnable outside FB context.
Lastly, the change from using `[[ -v ... ]]` to `[[ -n "${...:-}" ]]` in "library.sh" was made because the former is not supported by the default Bash version preinstalled on modern macOS.
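A small illustration of that portability point (a sketch, not the actual "library.sh" code; the `set -u` strict mode is an assumption):

```shell
#!/bin/bash
set -u  # treat unset variables as errors, which makes the portable form important

FOO="hello"

# Old form, bash >= 4.2 only (breaks on macOS's default bash 3.2):
#   if [[ -v FOO ]]; then ...
# Portable form; note it is not strictly equivalent, since -v also accepts
# variables that are set but empty:
if [[ -n "${FOO:-}" ]]; then
  echo "FOO is set"
fi
if [[ -n "${BAR:-}" ]]; then
  echo "BAR is set"
else
  echo "BAR is unset or empty"
fi
```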

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/26

Reviewed By: krallin

Differential Revision: D22541344

Pulled By: lukaspiatkowski

fbshipit-source-id: 5023d147823166a8754be852c29b1e7b0e6d9f5f
2020-07-16 12:16:10 -07:00
Simon Farnsworth
ae4b55ccaf Sort test output in test-cmd-manual-scrub.t
Reviewed By: krallin

Differential Revision: D22570814

fbshipit-source-id: bd1ef5defb2c401a3a6d19c4102c8904a211289d
2020-07-16 05:48:07 -07:00
Mark Thomas
0bd84ab898 scs_server: increase list_bookmarks limits (part 2)
Summary:
Increase the max limits for repo_list_bookmarks and commit_list_descendant_bookmarks
from 1,000 to 10,000.  The higher number is still reasonable for a single request,
and this reduces the number of round-trips when there are lots of bookmarks.

This updates the constants in the thrift definition.  The server was updated to accept
higher limits in a previous diff.

Reviewed By: mitrandir77

Differential Revision: D22524891

fbshipit-source-id: bc47c1e50af35f213ebcfbca1574669e79b2fe92
2020-07-15 08:15:06 -07:00
Thomas Orozco
ef6d7b48b4 mononoke/lfs_server: popularity: add ODS counters, timeouts and tests
Summary:
ODS counters are helpful to know if the feature is turned on or off without
requiring a traffic spike, so let's log them. Also, let's add timeouts in here,
so we know if things aren't working as expected (I did check in the Mononoke
LFS dataset — 10ms is a very conservative number; that's way beyond the p99 of
batch requests, which include potentially many counter checks).

To make this easier to iterate on, let's also add tests.

Reviewed By: StanislavGlebik

Differential Revision: D22545853

fbshipit-source-id: 02ea4484a4e4ba0dfd4a71030c129eb5c6bb1ec9
2020-07-15 03:39:23 -07:00
Jun Wu
f6d838bc0e extensions: enable lz4revlog by default
Summary: Some native code (ex. RevlogIndex) only knows the lz4 format.

Reviewed By: DurhamG

Differential Revision: D22368825

fbshipit-source-id: d33cee235e3aa4fbf2cfb441319e3c12728d8b5b
2020-07-14 14:33:43 -07:00
Viet Hung Nguyen
9fc9cf9d1c mononoke/repo_import: update packman binary path, move package building
Summary:
Fixed the path used to generate the binary, in order to be able to use repo_import as a command.
Changed the contbuild location for publishing the repo_import packman: mononoke contbuild uses the binary for testing, so it is cheaper to just build there.

Reviewed By: StanislavGlebik

Differential Revision: D22526542

fbshipit-source-id: 926f8ed08169c34833ccc5711ec3fa26c3784615
2020-07-14 10:19:20 -07:00
Mark Thomas
8abb5136d3 scs_server: increase list_bookmarks limits (part 1)
Summary:
Increase the max limits for repo_list_bookmarks and commit_list_descendant_bookmarks
from 1,000 to 10,000.  The higher number is still reasonable for a single request,
and this reduces the number of round-trips when there are lots of bookmarks.

This updates the server.  A later diff will increase the constant so that clients
can make use of it, but this diff must be landed and deployed before that change
can be made.

Reviewed By: mitrandir77

Differential Revision: D22524892

fbshipit-source-id: c216f4ba7fa60774990d87747c9d8ea9d551dc85
2020-07-14 05:48:08 -07:00
Viet Hung Nguyen
78e3864869 xdiff: renamed third-party xdiff functions
Summary:
Follow up on this diff: D22432330 (b7817ffbd8)

Renamed xdiff functions to avoid linking issues when using both libgit2-sys and xdiff.

Reviewed By: farnz

Differential Revision: D22511368

fbshipit-source-id: e4be20e3112a8e8829298d5748657e9bdbde8588
2020-07-14 03:46:04 -07:00
Stanislau Hlebik
23390ee238 mononoke: use bulkops in skiplist builder
Summary:
We already have a function to fetch all public changesets - let's use it
instead of re-implementing it.

The small caveat is that function in skiplist subcommand fetched all the
changesets (i.e. public and draft), so using bulkops function looks like a change in
behaviour. However it's actually the same - we index only public changesets for
skiplists anyway.

Reviewed By: krallin

Differential Revision: D22499940

fbshipit-source-id: ac8ad7d2b6ff0208e830a344877d7d2e93693abc
2020-07-13 15:17:35 -07:00
Thomas Orozco
2e99b4b7cd mononoke/lfs_server: blob popularity: skip consistent routing for hot blobs
Summary:
If a particular is blob is too popular, we can saturate a LFS host through
consistent routing, and possibly OOM the host as well.

Historically, we haven't had enough traffic to LFS to make this a problem, but
we're getting there now.

This diff adds support for reporting the popularity of a blob through SCS (not
Mononoke SCS — the counting one), and for using this popularity to identify when
we should stop consistently routing a given blob.

The idea is that if e.g. something was requested 300 times in the last 20
seconds, it'll take a second for all the hosts to have it in cache, so we might
as well distribute this load.

There are plenty of things we could do slightly better here, such as making the
interval configurable, or having something in-between "consistently route to a
single host" and "don't consistently route at all". That said, I don't think
those are necessary right now, so let's start simple and find out.
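A minimal sketch of the policy described above (names and the threshold value are illustrative, not Mononoke's actual API): route a blob to its consistent host unless its recent request count crosses the threshold, in which case let the load spread.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative threshold: requests per popularity interval.
const POPULARITY_THRESHOLD: u64 = 300;

// Pick the consistent host for a blob by hashing its key.
fn consistent_host(blob_key: &str, num_hosts: u64) -> u64 {
    let mut h = DefaultHasher::new();
    blob_key.hash(&mut h);
    h.finish() % num_hosts
}

/// Returns Some(host) to consistently route, or None to let the normal
/// (non-sticky) load balancing pick any host.
fn route(blob_key: &str, recent_requests: u64, num_hosts: u64) -> Option<u64> {
    if recent_requests > POPULARITY_THRESHOLD {
        None // hot blob: every host will have it in cache shortly anyway
    } else {
        Some(consistent_host(blob_key, num_hosts))
    }
}

fn main() {
    println!("{:?}", route("some/blob", 10, 16));
    println!("{:?}", route("some/blob", 1000, 16));
}
```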

Reviewed By: HarveyHunt

Differential Revision: D22503748

fbshipit-source-id: 48827bcfb7658ad22c88a8433359e29b0d56ad5a
2020-07-13 13:00:36 -07:00
Lukas Piatkowski
f54b8373e3 mononoke: fixup license headers
Reviewed By: farnz

Differential Revision: D22505759

fbshipit-source-id: 7d217e731a7655dc8272dbb157298965495fcf53
2020-07-13 11:02:33 -07:00
Mark Thomas
1945ef11b7 scs_server: compute descendant bookmarks in parallel
Summary:
Computing descendant bookmarks is done async-but-serially (i.e. in a single
tokio Task).  That means it is slow for repos with many bookmarks if you make a
request for an early commit.

This change makes it use multiple tokio Tasks, each one computing a batch of 100
bookmarks, giving true parallelism.
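A rough sketch of the batching shape, with std threads standing in for tokio tasks and a placeholder predicate for the per-bookmark check:

```rust
use std::thread;

const BATCH_SIZE: usize = 100;

// Placeholder for the real "is this bookmark a descendant" work; the
// "skip/" prefix is purely illustrative.
fn check_batch(batch: &[String]) -> Vec<String> {
    batch.iter().filter(|b| !b.starts_with("skip/")).cloned().collect()
}

// Split the bookmark list into batches of 100 and process each batch in
// its own task, instead of walking all bookmarks serially.
fn check_all(bookmarks: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = bookmarks
        .chunks(BATCH_SIZE)
        .map(|chunk| {
            let chunk = chunk.to_vec();
            thread::spawn(move || check_batch(&chunk))
        })
        .collect();
    handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
}

fn main() {
    let bookmarks: Vec<String> = (0..250).map(|i| format!("branch/{}", i)).collect();
    println!("{} bookmarks kept", check_all(bookmarks).len());
}
```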

Reviewed By: krallin

Differential Revision: D22461931

fbshipit-source-id: a8908f7c20173c61b83d69c9dc37a5275937e2dc
2020-07-13 06:22:37 -07:00
Viet Hung Nguyen
c315c56c05 mononoke/repo_import: add deriving data functionality
Summary: After we shifted the bonsaichangesets (D22307039 (f5db2fdcdc)), we want to derive all the data types available in the target repo. Previously, we used a script to specify what we want to derive (https://fburl.com/wiki/ivk76muf step 4). This diff adds this functionality to repo_import and automate this step.

Reviewed By: StanislavGlebik

Differential Revision: D22455297

fbshipit-source-id: b38ac68606687350ace57d68464e68ca8229f7a5
2020-07-13 03:28:33 -07:00
Lukas Piatkowski
a41db27baf mononoke/blobstore_healer: make it OSS buildable
Reviewed By: farnz

Differential Revision: D22460549

fbshipit-source-id: aa5327f5dae1008cee784d41e322034cd0bb5b61
2020-07-13 03:02:34 -07:00
Lukas Piatkowski
6b9637bbac mononoke/blobimport: make it OSS buildable
Reviewed By: krallin

Differential Revision: D22455491

fbshipit-source-id: 919ba0e4fc759ef25546eacf30200ff19cd89466
2020-07-13 03:02:34 -07:00
Stanislau Hlebik
8daaacbd77 mononoke: log repo name when bookmark is missing
Summary:
Logs showed that one of the repo is missing a bookmark, however it wasn't clear
which repo. Let's log it to make it clearer.

Reviewed By: HarveyHunt

Differential Revision: D22500793

fbshipit-source-id: c1d5fce66d7b2b119c7365d13511a7e9a6d6ed3f
2020-07-13 02:47:20 -07:00
Stanislau Hlebik
64740aafce mononoke: asyncify build_skiplist_index subcommand
Reviewed By: farnz

Differential Revision: D22480533

fbshipit-source-id: af6bf14998fe38c7dd6655a51addeb41fbc7aa3b
2020-07-12 03:21:20 -07:00
Simon Farnsworth
81e65f5bcc Fully asyncify the blobstore healer
Summary:
As part of modernising MultiplexedBlobstore, I want to fully asyncify the blobstore_sync_queue; that means I need this fully asyncified.

Fully asyncify everything but the bits that interact with blobstore_sync_queue; those have to wait for MultiplexedBlobstore to be asyncified

End goal is to reduce the number of healer overloads, by adding a mode of operation in which writes (e.g. from backfills or derived data) can avoid a sync queue write when all blobstores are working

Reviewed By: StanislavGlebik

Differential Revision: D22460059

fbshipit-source-id: 5792c4a8daf17ffe99a04d792792f568c40fde37
2020-07-11 05:41:36 -07:00
Simon Farnsworth
9287bfca2c Move blobstore healer tests to their own file
Summary: I'm about to asyncify the healer - move 2/3rds of the file content (tests) into their own file.

Reviewed By: ikostia

Differential Revision: D22460166

fbshipit-source-id: 18c0dde5f582c4c7006e3f023816ac457d38234b
2020-07-11 05:41:36 -07:00
Lukas Piatkowski
c5f79f3668 mononoke/benchmark_filestore: make it OSS buildable
Reviewed By: krallin

Differential Revision: D22475133

fbshipit-source-id: c14bf4f0811e8c2f1cf31416bf88f378caf50be3
2020-07-10 22:12:40 -07:00
Lukas Piatkowski
385ef6d938 mononoke/admin: make it OSS buildable
Reviewed By: krallin

Differential Revision: D22458187

fbshipit-source-id: 05d321bc1aded67fb2eca851b4b1ad4a8bd49d52
2020-07-10 22:12:40 -07:00
Simon Farnsworth
78847ff88c Make BlobstoreSyncQueue use new futures
Summary: Stage 1 of a migration - next step is to make all users of this trait use new futures, and then I can come back, add lifetimes and references, and leave it modernised

Reviewed By: StanislavGlebik

Differential Revision: D22460164

fbshipit-source-id: 94591183912c0b006b7bcd7388a3d7c296e60577
2020-07-10 06:43:13 -07:00
Mark Thomas
a51d164892 admin: increase type_length_limit
Reviewed By: ikostia

Differential Revision: D22476055

fbshipit-source-id: 1df7556a5cf774744b26f09e3ed681cceb30c617
2020-07-10 05:55:06 -07:00
Mark Thomas
2180ac866d dbbookmarks: share SelectBookmark query
Summary: Use `pub(crate)` visibility to share the `SelectBookmark` query between modules.

Reviewed By: StanislavGlebik

Differential Revision: D22464059

fbshipit-source-id: 269561f5ab936b730ce2052e50173134ce241ff8
2020-07-10 04:50:25 -07:00
Mark Thomas
fb5fdb9c15 bookmarks: remove repo_id from Bookmarks methods
Summary:
Remove the `repo_id` parameter from the `Bookmarks` trait methods.

The `repo_id` parameters was intended to allow a single `Bookmarks` implementation
to serve multiple repos.  In practise, however, each repo has its own config, which
results in a separate `Bookmarks` instance for each repo.  The `repo_id` parameter
complicates the API and provides no benefit.

To make this work, we switch to the `Builder` pattern for `SqlBookmarks`, which
allows us to inject the `repo_id` at construction time.  In fact nothing here
prevents us from adding back-end sharing later on, as these `SqlBookmarks` objects
are free to share data in their implementation.
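A minimal sketch of the builder approach described above (type and method names are illustrative): `repo_id` is supplied once at construction, so it disappears from the per-method API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct RepositoryId(i32);

// In the real crate this would hold SQL connections; elided here.
struct SqlBookmarksBuilder {}

struct SqlBookmarks {
    repo_id: RepositoryId,
}

impl SqlBookmarksBuilder {
    fn new() -> Self {
        SqlBookmarksBuilder {}
    }

    // Inject the repo_id at construction time.
    fn with_repo_id(self, repo_id: RepositoryId) -> SqlBookmarks {
        SqlBookmarks { repo_id }
    }
}

impl SqlBookmarks {
    // Queries can use self.repo_id without callers passing it in.
    fn get(&self, name: &str) -> String {
        format!(
            "SELECT ... WHERE repo_id = {} AND name = '{}'",
            self.repo_id.0, name
        )
    }
}

fn main() {
    let bookmarks = SqlBookmarksBuilder::new().with_repo_id(RepositoryId(7));
    println!("{}", bookmarks.get("master"));
}
```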

Reviewed By: StanislavGlebik

Differential Revision: D22437089

fbshipit-source-id: d20e08ce6313108b74912683c620d25d6bf7ca01
2020-07-10 04:50:25 -07:00
Mark Thomas
aed95ea96d dbbookmarks: split up into modules
Summary:
The dbbookmarks crate is getting too large for a single file.  Split it up into
a `store` module, which implements the bookmarks traits, and a `transaction`
module, which handles bookmark transactions.

Reviewed By: krallin

Differential Revision: D22437088

fbshipit-source-id: 629b62de151400cdbf56d502aef061df46c3da81
2020-07-10 04:50:25 -07:00
Mark Thomas
3afceb0e2c bookmarks: extract BundleReplayData from BookmarkUpdateReason
Summary:
Separate out the `BundleReplayData` from the `BookmarkUpdateReason` enum.  There's
no real need for this to be part of the reason, and removing it means we can
abstract away the remaining dependency on Mercurial changeset IDs from
the main bookmarks traits.

Reviewed By: mitrandir77, ikostia

Differential Revision: D22417659

fbshipit-source-id: c8e5af7ba57d10a90c86437b59c0d48e587e730e
2020-07-10 04:50:24 -07:00
Mark Thomas
fa4dce16f7 dbbookmarks: improve async transaction functions
Summary:
The dbbookmarks implementation still contains some remnants of old-style
futures combinators.  Remove these.

Reviewed By: krallin

Differential Revision: D22432175

fbshipit-source-id: 8d4419def4129112c2386b45e750970790020049
2020-07-10 04:50:24 -07:00
Simon Farnsworth
65e7404eba Command to manually scrub keys supplied on stdin
Summary: For populating the XDB blobstore, we'd like to copy data from Manifold - the easiest way to do that is to exploit MultiplexedBlobstore's scrub mode to copy data directly.

Reviewed By: krallin

Differential Revision: D22373838

fbshipit-source-id: 550a9c73e79059380337fa35ac94fe1134378196
2020-07-10 01:01:05 -07:00
Stanislau Hlebik
361f4e98a7 mononoke: use batch_derive method in derived data utils
Summary:
Previously backfill_batch_dangerous method was calling internal derive_impl() method
directly. That wasn't great (after all, we are calling a function whose name suggests it should only be called from inside the derived data crate) and this diff changes it so that we call the batch_derive() method instead.

This gives a few benefits:
1) We no longer call internal derive_impl function
2) It allows different types of derived data to override batching behaviour.
For example, we've already overriden it for fsnodes and next diff will override
it for blame as well.

To make it compatible with derive_impl() batch_derive() now accepts derive data mode and mapping

Reviewed By: krallin

Differential Revision: D22435044

fbshipit-source-id: a4d911606284676566583a94199195860ffe2ecf
2020-07-09 10:45:19 -07:00
Mark Thomas
5c95bf6be2 mutation_store: make separate requests by primordial and successor
Summary:
D22206317 (9a6ed4b6ca) added requesting of predecessor information for suspected primordials
by the successor ID.  This allows recovery of earlier predecessors when partial
data upload resulted in the history of a commit being extended backwards.

Unfortunately, while the individual requests are fast, the combined request
using `OR` in SQL ended up being very slow for some requests.

Separate out the requests at the application level, and aggregate the results
by concatenating them.  `collect_entries` already handles duplicates should any
arise.

Most of the time the successor query will very quickly return no rows, as
it only matters when history is extended backwards, which is expected to be
rare.
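A sketch of the split, with stand-in types (nothing here is the real mutation_store schema): run the two lookups separately, concatenate, and drop duplicates the way `collect_entries` would.

```rust
use std::collections::HashSet;

#[derive(Clone, Hash, PartialEq, Eq, Debug)]
struct MutationEntry {
    successor: String,
    predecessor: String,
}

// Stand-in for one fast, indexed SQL query.
fn query(db: &[MutationEntry], pred: impl Fn(&MutationEntry) -> bool) -> Vec<MutationEntry> {
    db.iter().filter(|&e| pred(e)).cloned().collect()
}

// Two separate requests instead of a single slow `... OR ...` query,
// aggregated by concatenation with deduplication.
fn fetch_entries(db: &[MutationEntry], ids: &[&str]) -> Vec<MutationEntry> {
    let by_primordial = query(db, |e| ids.contains(&e.predecessor.as_str()));
    let by_successor = query(db, |e| ids.contains(&e.successor.as_str()));
    let mut seen = HashSet::new();
    by_primordial
        .into_iter()
        .chain(by_successor)
        .filter(|e| seen.insert(e.clone()))
        .collect()
}

fn main() {
    let db = vec![
        MutationEntry { successor: "s1".into(), predecessor: "p1".into() },
        MutationEntry { successor: "s2".into(), predecessor: "p1".into() },
    ];
    println!("{:?}", fetch_entries(&db, &["p1", "s1"]));
}
```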

Reviewed By: ikostia

Differential Revision: D22456062

fbshipit-source-id: 1e6094b4ac1590a5824e9ae6ef48468766560188
2020-07-09 09:21:01 -07:00
Jun Wu
b80966f93c revlog: turn on head-based-commit-transaction for tests
Summary:
Bypass truncation-based transaction if narrow-heads is on.

The transaction abort still works logically because commit references stay
unchanged on abort.

Related EdenFS and Mononoke tests are updated. Mononoke tests probably
shouldn't rely on revlog / fncache implementation details in hg.

Reviewed By: DurhamG

Differential Revision: D22240186

fbshipit-source-id: f97efd60855467b52c9fb83e7c794ded269e9617
2020-07-08 14:33:58 -07:00
Thomas Orozco
ae917ba227 mononoke/virtually_sharded_blobstore: make sampling rate tunable
Summary: As it says in the title.

Reviewed By: farnz

Differential Revision: D22432526

fbshipit-source-id: 42726584689cbc2f5c9138b42b7bf77939921bdd
2020-07-08 09:07:19 -07:00
Kostia Balytskyi
75db021d70 live_commit_sync_config: make it into a trait
Summary:
The goal is to make it easier to implement unit tests, which depend on `LiveCommitSyncConfig`. Specifically, `scs` has a piece of code, which instantiates `mononoke_api::Repo` with a test version of `CommitSyncConfig`. To migrate it to `LiveCommitSyncConfig`, I need to be able to create a test version of that. It **is** possible now, but would require me to turn a supplied instance of `CommitSyncConfig` back into `json`, which is cumbersome. Using a `dyn LiveCommitSyncConfig` there, instead of a concrete struct seems like a good idea.

Note also that we are using this technique in many places: most (all?) of our DB tables are traits, which we then implement for SQL-specific structs.

Finally, this diff does not actually migrate all of the current users of `LiveCommitSyncConfig` (the struct) to be users of `LiveCommitSyncConfig` (the trait), and instead makes them use `CfgrLiveCommitSyncConfig` (the trait impl). The idea is that we can migrate bits to use traits when needed (for example, in an upcoming `scs` diff). When not needed, it's fine to use concrete structs. Again, this is already the case in a a few places: we sometimes use `SqlSyncedCommitMapping` struct directly, instead of `T: SyncedCommitMapping` or `dyn SyncedCommitMapping`.

Reviewed By: StanislavGlebik

Differential Revision: D22383859

fbshipit-source-id: 8657fa39b11101684c1baae9f26becad6f890302
2020-07-08 08:34:06 -07:00
Thomas Orozco
dd1aaf90fe mononoke/{hgproto,mercurial_bundles}: eliminate O(N^2) behavior in decoding
Summary:
This updates the AsyncRead implementations we use in hgproto and
mercurial_bundles to use a LimitedAsyncRead. The upshot of this change is that
we eliminate O(N^2) behavior when parsing the data we receive from clients.

See the earlier diff on this stack for more detail on where this happens, but
the bottom line is that Framed presents a full-size buffer that we zero out
every time we try to read data. With this change, the buffer we zero out is
comparable to the amount of data we are reading.

This matters in commit cloud because bundles might be really big, and a single
big bundle is enough to take an entire core for a spin for 20 minutes (and they
achieve nothing but time out in the end). That being said, it's also useful for
non-commit cloud bundles: we do occasionally receive big bundles (especially
for WWW codemods), and those will benefit from the exact same speedup.

One final thing I should mention: this is all in a busy CPU poll loop, and as I noted
in my earlier diff, the effect persists across our bundle receiving code. This means
it will sometimes result in not polling other futures we might have going.
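A synchronous sketch of the shape of the fix (using `std::io::Read` rather than the real async types): cap how much of the caller's buffer each read exposes, so the work per call is proportional to the data actually read, not to the full frame buffer.

```rust
use std::io::{self, Cursor, Read};

// Wraps a reader and only ever exposes `chunk` bytes of the caller's
// buffer per call, limiting per-read buffer handling.
struct LimitedRead<R> {
    inner: R,
    chunk: usize,
}

impl<R: Read> Read for LimitedRead<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(self.chunk);
        self.inner.read(&mut buf[..n])
    }
}

fn main() -> io::Result<()> {
    let data = vec![42u8; 1024];
    let mut reader = LimitedRead { inner: Cursor::new(data), chunk: 64 };
    let mut buf = vec![0u8; 1024]; // the decoder's full-size buffer
    let n = reader.read(&mut buf)?; // only 64 bytes are touched per call
    println!("read {} bytes", n);
    Ok(())
}
```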

Reviewed By: farnz

Differential Revision: D22432350

fbshipit-source-id: 33f1a035afb8cdae94c2ecb8e03204c394c67a55
2020-07-08 08:07:13 -07:00
Thomas Orozco
8c994e7682 mononoke/fastreplay: log replay success & failure counts to ODS
Summary:
I want to update the health check to stop averaging averages (like in
D22394014). To do this, I need those counters.

Reviewed By: ahornby

Differential Revision: D22410196

fbshipit-source-id: aa5cbfe6607be3b953887f1639e1de54baac7389
2020-07-07 06:41:23 -07:00
Stanislau Hlebik
886e34d17b mononoke: log size of fetched undesired files
Summary:
Just knowing the number of fetched undesired files doesn't give the full
picture. e.g. fetching lots of small files is better than fetching single
multi-Gb file.
So knowing the size of files is helpful

Reviewed By: krallin

Differential Revision: D22408400

fbshipit-source-id: 7653c1cdceccf50aeda9ce8a4880ee5178d4b107
2020-07-07 06:23:01 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Thomas Orozco
7e8c9174be mononoke/admin: add a filestore fetch subcommand
Summary:
Sometimes you want to fetch a file. Using curl and the LFS server works, but
this really should be part of Mononoke admin.

Reviewed By: ikostia

Differential Revision: D22397472

fbshipit-source-id: 17decf4aa2017a2c1be52605a254692f293d1bcd
2020-07-06 14:56:08 -07:00
Thomas Orozco
46def15c4f mononoke/admin: fix filestore store subcommand
Summary:
This got broken when we moved to Tokio 0.2. Let's fix it and add a test to make
sure it does not regress.

Reviewed By: ikostia

Differential Revision: D22396261

fbshipit-source-id: a8359aee33b4d6d840581f57f91af6c03125fd6a
2020-07-06 14:56:08 -07:00
Kostia Balytskyi
6d5b3ac1f2 live_commit_sync_config: add versions accessors
Summary:
This diff adds two new bits of functionality to `LiveCommitSyncConfig`:
- getting all possible versions of `CommitSyncConfig` for a given repo
- getting `CommitSyncConfig` for a repo by version name

These bits are meant to be used in:
- `commit_validator` and `bookmarks_validator`, which would
  need to run validation against a specific config version
- `mononoke_admin`, which would need to be able to query all versions,
  display the version used to sync two commits and so on

Reviewed By: StanislavGlebik

Differential Revision: D22235381

fbshipit-source-id: 42326fe853b588849bce0185b456a5365f3d8dff
2020-07-06 14:00:36 -07:00
Thomas Orozco
ce0af2d591 mononoke/virtually_sharded_blobstore: deduplicate puts based on data being put
Summary:
This updates the virtually_sharded_blobstore to deduplicate puts only if the
data being put is actually the data we have put in the past. This is done by
keeping track of the hash of things we've put in the presence cache.

This has 2 benefits:

- This is safer. We only dedupe puts we 100% know succeeded (because this
  particular instance was the one to attempt the put).
- This creates fewer surprises; notably it lets us overwrite data in the
  backing store (if we are writing something different).
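A simplified, synchronous sketch of this scheme (illustrative names, hashing with `DefaultHasher` as an assumption): the presence cache remembers a hash of what we put, and a put is skipped only when the new payload hashes to the same value.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

fn hash_of(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// Tracks hashes of data this instance has put.
struct Presence {
    put_hashes: HashMap<String, u64>,
}

impl Presence {
    fn new() -> Self {
        Presence { put_hashes: HashMap::new() }
    }

    /// Returns true if the underlying store must be written.
    fn needs_put(&mut self, key: &str, data: &[u8]) -> bool {
        let h = hash_of(data);
        if self.put_hashes.get(key) == Some(&h) {
            return false; // same data we already put ourselves: safe to dedupe
        }
        self.put_hashes.insert(key.to_string(), h);
        true // different (or first) data: write through, allowing overwrites
    }
}

fn main() {
    let mut p = Presence::new();
    println!("{}", p.needs_put("k", b"v1")); // true
    println!("{}", p.needs_put("k", b"v1")); // false (deduped)
    println!("{}", p.needs_put("k", b"v2")); // true (overwrite allowed)
}
```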

Reviewed By: StanislavGlebik

Differential Revision: D22392809

fbshipit-source-id: d76a49baa9a5749b0fb4865ee1fc1aa5016791bc
2020-07-06 12:10:46 -07:00
Thomas Orozco
19b31ead9d mononoke/virtually_sharded_blobstore: make race tests a little more forgiving
Summary:
Running those on my devserver, I noticed they can be a bit flaky. They are
racy on purpose, but let's relax them a bit.

We have a lot of margin here — our blobstore is rate limited at one request
every 10ms, and we need to do 100 requests (the goal is to show that they don't
all wait), so 100ms is fine to prove that they're not rate limited when sharing
the same data.

Reviewed By: StanislavGlebik

Differential Revision: D22392810

fbshipit-source-id: 2e3c9cdf19b0e4ab979dfc000fbfa8da864c4fd6
2020-07-06 12:10:46 -07:00
Kostia Balytskyi
f223ca6e6e synced commit mapping: expose version in get query
Summary:
When we look up how a commit was synced, we frequently need to know which version of `CommitSyncConfig` was used to sync it. Specifically, this is useful for admin tooling and commit validator, which I am planning to migrate to use versioned `CommitSyncConfig` in the near future.

Later I will also include this information into `RewrittenAs` variant of `CommitSyncOutcome`, so that we expose it to real users. I did not do it in this diff to keep it small and easy to review. And because the other part is not ready :P

Reviewed By: StanislavGlebik

Differential Revision: D22255785

fbshipit-source-id: 4312e9b75e2c5f92ba018ff9ed9149efd3e7b7bc
2020-07-06 11:23:31 -07:00
Mateusz Kwapich
7b3aa42459 fix the problem with ordering in into_response
Summary: When I implemented this method I didn't test that it preserves the order of the input changesets, and I noticed my mistake when I was testing the scmquery part.
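A sketch of one way to fix that kind of ordering bug (the function name and shapes are hypothetical): results may come back in any order from a batched lookup, so re-sort them by each id's position in the original request.

```rust
use std::collections::HashMap;

// Reorder `results` to match the order of `input_ids`; ids not present in
// the input sort to the end.
fn into_response_order(input_ids: &[&str], mut results: Vec<(String, u64)>) -> Vec<(String, u64)> {
    let pos: HashMap<&str, usize> =
        input_ids.iter().enumerate().map(|(i, id)| (*id, i)).collect();
    results.sort_by_key(|(id, _)| pos.get(id.as_str()).copied().unwrap_or(usize::MAX));
    results
}

fn main() {
    let input = ["c1", "c2", "c3"];
    let unordered = vec![("c3".into(), 3), ("c1".into(), 1), ("c2".into(), 2)];
    for (id, v) in into_response_order(&input, unordered) {
        println!("{} -> {}", id, v);
    }
}
```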

Reviewed By: StanislavGlebik

Differential Revision: D22374981

fbshipit-source-id: 4529f01370798377b27e4b6a706fc192a1ea928e
2020-07-06 08:32:03 -07:00
Mark Thomas
bcaaba1e9c add list-bookmarks command
Summary:
Add the `scsc list-bookmarks` command, which lists bookmarks in a repository.

If a commit id is also provided, `list-bookmarks` will be limited to bookmarks that
point to that commit or one of its descendants.

Reviewed By: mitrandir77

Differential Revision: D22361240

fbshipit-source-id: 17067ba47f9285b8137a567a70a87fadcaabec80
2020-07-06 07:01:24 -07:00
Thomas Orozco
07907b2b26 mononoke/virtually_sharded_blobstore: merge in the context_concurrency_blobstore
Summary:
There is inevitably interaction between caching, deduplication and rate
limiting:

- You don't want the rate limiting to be above caching (in the blobstore stack,
  that is), because you shouldn't rate limit cache hits (this is where we are
  today).
- You don't want the rate limiting to be below deduplication, because then you get
  priority inversion where a low-priority rate-limited request might hold the
  semaphore while a higher-priority, non rate limited request wants to do the
  same fetch (we could have moved rate limiting here prior to introducing
  deduplication, but I didn't do it earlier because I wanted to eventually
  introduce deduplication).

So, now that we have caching and deduplication in the same blobstore, let's
also incorporate rate limiting there!

Note that this also brings a potential motivation for moving Memcache into this
blobstore, in case we don't want rate limiting to apply to requests before they
go to the _actual_ blobstore (I did not do this in this diff).

The design here when accessing the blobstore is as follows:

- Get the semaphore
- Check if the data is in cache, if so release the semaphore and return the
  data.
- Otherwise, check if we are rate limited.

Then, if we are rate limited:

- Release the semaphore
- Wait for our turn
- Acquire the semaphore again
- Check the cache again (someone might have put the data we want while we were
  waiting).
    - If the data is there, then return our rate limit token.
    - If the data isn't there, then proceed to query the blobstore.

If we aren't rate limited, then we just proceed to query the blobstore.

There are a couple subtle aspects of this:

- If we have a "late" cache hit (i.e. after we waited for rate limiting), then
  we'll have waited but we won't need to query the blobstore.
    - This is important when a large number of requests from the same key
      arrive at the same time and get rate limited. If we don't do this second
      cache check or if we don't return the token, then we'll consume a rate
      limiting token for each request (instead of 1 for the first request).
- If a piece of data isn't cacheable, we should treat it like a cache hit with
  regard to semaphores (i.e. release early), but like a miss with regard to
  rate limits (i.e. wait).

Both of those are captured in the code by returning the `Ticket` on a
cache hit. We can then choose to either return the ticket on a cache hit, or wait
for it on a cache miss.

(all of this logic is captured in unit tests, we can remove any of the blocks
there in `Shards::acquire` and a test will fail)
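The decision sequence above can be sketched as a pure function (illustrative names only; the real code holds a shard semaphore and a real rate-limiter ticket): `cached` is the first cache check under the semaphore, and `cached_after_wait` is the re-check after waiting out the rate limiter.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    CacheHit,       // release the semaphore early; cache hits are never rate limited
    LateCacheHit,   // waited, data appeared meanwhile: hand the token back
    FetchBlobstore, // proceed to query the backing blobstore
}

fn decide(cached: bool, rate_limited: bool, cached_after_wait: bool) -> Outcome {
    if cached {
        return Outcome::CacheHit;
    }
    if rate_limited && cached_after_wait {
        return Outcome::LateCacheHit;
    }
    Outcome::FetchBlobstore
}

fn main() {
    println!("{:?}", decide(false, true, true));
}
```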

Reviewed By: farnz

Differential Revision: D22374606

fbshipit-source-id: c3a48805d3cdfed2a885bec8c47c173ee7ebfe2d
2020-07-06 04:38:31 -07:00
Thomas Orozco
6153dab328 mononoke/async_limiter: add support for cancelling access
Summary:
Sometimes we take a token then realize we don't want it. In this case, giving it back is convenient.

This adds this!

Reviewed By: farnz

Differential Revision: D22374607

fbshipit-source-id: ccf47e6c75c37d154704645c9e826f514d6f49f6
2020-07-06 04:38:31 -07:00
Kostia Balytskyi
b7cf1dcbdb x-repo sync job: use LiveCommitSyncConfig
Summary:
This is a mirror image of a diff, which made backsyncer use `LiveCommitSyncConfig`: we want to use configerator-based live configs, when we run in the continuous tailing mode.

As no-op iteration time used to be 10s and that's a bit wasteful for tests, this diff changes it to be configurable.

Finally, because of instantiating various additional `CommitSyncerArgs` structs, this diff globs out some of the `using repo` logs (which aren't very useful as test signals anyway, IMO).

Reviewed By: StanislavGlebik

Differential Revision: D22209205

fbshipit-source-id: fa46802418a431781593c41ee36f468dee9eefba
2020-07-03 13:36:18 -07:00
Stanislau Hlebik
2cfc23770c mononoke: use override_blame_filesize_limit option
Summary: This diff actually start to use the option

Reviewed By: krallin

Differential Revision: D22373943

fbshipit-source-id: fe23da9c3daa1f9f91a5ee5e368b33e0091aa9c1
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
06f2a420d1 mononoke: correctly return BlameError::Rejected
Summary:
Previously if a blame request was rejected (e.g. because a file was too large)
then we returned BlameError::Error.

This doesn't look correct, because there's BlameError::Rejected. This diff
makes it so that fetch_blame function returns BlameError::Rejected

Reviewed By: aslpavel

Differential Revision: D22373948

fbshipit-source-id: 4859809dc315b8fd66f94016c6bd5156cffd7cc2
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
2a732f2626 mononoke: pass BlobRepo in fetch_full_file_content
Summary:
In the next diffs we'll need to read override_blame_filesize_limit from derived
data config, and this config is stored in BlobRepo.

this diff makes a small refactoring to pass BlobRepo to fetch_full_file_content

Reviewed By: krallin

Differential Revision: D22373946

fbshipit-source-id: b209abce82c0279d41173b5b25f6761659a92f3d
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
b703f11685 mononoke: asyncify fetch_full_file_content
Summary: This will make adding blame file size limit override the next diffs easier

Reviewed By: krallin

Differential Revision: D22373945

fbshipit-source-id: 4857e43c5d80596340878753ea90bf31d7bb3367
2020-07-03 09:58:46 -07:00
Mateusz Kwapich
f21b459c99 remove dependency on bounded_traversal
Summary:
We're always yielding zero or one child during traversal, bounded traversal is
unnecessary here

Differential Revision: D22242148

fbshipit-source-id: b4c8a1279ef7bd15e9d0b3b2063683f45e30a97a
2020-07-03 08:02:25 -07:00
Mateusz Kwapich
7ff7c931a8 add option for limiting the log to descendants of single node
Summary:
Let's use new option in CLI. Unfortunately we can't easily accept commit ids in
named params so it has to be a postional one.

Differential Revision: D22234412

fbshipit-source-id: a9c27422fa65ae1c42cb1c243c7694507a957437
2020-07-03 08:02:25 -07:00
Thomas Orozco
de731a89fc mononoke/virtually_sharded_blobstore: log deduplicated puts
Summary:
If anything were to go wrong, we'd be happy to know which puts we ignored. So,
let's log them.

Reviewed By: farnz

Differential Revision: D22356714

fbshipit-source-id: 5687bf0fc426421c5f28b99a9004d87c97106695
2020-07-03 05:53:11 -07:00
Thomas Orozco
be1bac6c06 mononoke/virtually_sharded_blobstore: expose this in cmdlib
Summary:
Eventually, I plan to make this the default, but for now I'd like to make it
something we can choose to turn on or off as a cmd argument (so we can start
with the experimental tier and Fastreplay).

Note that this mixes volatile vs. non-volatile pools when accessing the pools
for cacheblob. In practice, those pools are actually volatile, it's just that
things don't break if you access them as non-volatile.

Reviewed By: farnz

Differential Revision: D22356537

fbshipit-source-id: 53071b6b21ca5727d422e10f685061c709114ae7
2020-07-03 05:53:11 -07:00
Thomas Orozco
c68100f46e mononoke/virtually_sharded_blobstore: spawn before taking semaphores
Summary:
I canaried this on Fastreplay, but unfortunately that showed that sometimes we
just deadlock, or get so slow we might as well be deadlocked (and it happens
pretty quickly, after ~20 minutes). I tried spawning all the `get()` futures,
and that fixes the problem (but it makes gettreepack noticeably slower), so
that suggests something somewhere is creating futures, polling them a little
bit, then never driving them to completion.

For better or worse, I'd experienced the exact same problem with the
ContextConcurrencyBlobstore (my initial attempt at QOS, which also used a
semaphore), so I was kinda expecting this to happen.

In a sense, this is nice because we've suspected there were things like that in
the codebase for a while (e.g. the occasional SQL timeout we see where it
looks like MySQL responds fast but we don't actually poll it until past the
timeout), and it gives us a somewhat convenient repro.

In another sense, it's annoying because it blocks this work :)

So, to work around the problem, for now, let's spawn futures to force the work
to complete when a semaphore is held. I originally had an unconditional spawn
here, but that is too expensive for the cache-hit code path and slows things
down (by about ~2x).

However, spawning only when we'll actually query the blobstore isn't as expensive,
and that seems to be fine (in fact it is a ~20% p99 perf improvement,
though the exact number depends on the number of shards we use for this, which I've had to tweak a bit).

https://pxl.cl/1c18H

I did find what I think is one potential instance of this problem in
`bounded_traversal_stream`, which is that we never try to poll `scheduled` to
completion. Instead, we just poll for the next ready future in our
FuturesUnordered, and if that turns out to be synchronous work then we'll just
re-enqueue more stuff (and sort of starve async work in this FuturesUnordered).

I tried updating bounded traversal to try a fairer implementation (which polls
everything), but that wasn't sufficient to make the problem go away, so I think
this is something we have to just accept for now (note that this actually has
some interesting perf impact in isolation: it's a free ~20% perf improvement on
p95+: https://pxl.cl/1c192)

see 976b6b92293a0912147c09aa222b2957873ef0df if you're curious

Reviewed By: farnz

Differential Revision: D22332478

fbshipit-source-id: 885b84cda1abc15c51fbc5dd34473e49338e13f4
2020-07-03 05:53:11 -07:00
Thomas Orozco
2082621d51 mononoke/virtually_sharded_blobstore: add ODS metrics
Summary: Those are useful to track.

Reviewed By: farnz

Differential Revision: D22332480

fbshipit-source-id: 43f5cd7121c4aa497d961015e7c16973615798d1
2020-07-03 05:53:10 -07:00
Thomas Orozco
1db62473f2 mononoke/virtually_sharded_blobstore: track perf counters
Summary: Like it says in the title. Those are useful!

Reviewed By: farnz

Differential Revision: D22332479

fbshipit-source-id: f9bddad75fcbed2593c675f9ba45965bd87f1575
2020-07-03 05:53:10 -07:00
Thomas Orozco
c297024a52 mononoke/virtually_sharded_blobstore: do not delay reads for uncacheable data
Summary:
The goal of this blobstore is to dedupe reads by waiting for them to finish and
hit cache instead (and also to dedupe writes, but that's not relevant here).

However, this is not a desirable feature if a blob cannot be stored in cache,
because then we're serializing accesses for no good reason. So, when that
happens, we store "this cannot be stored in cache", and we release reads
immediately.
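
A minimal sketch of the decision logic described above (the variant and function names are illustrative, not the real implementation, which is async and cache-backed):

```rust
// What the cache knows about a key.
#[derive(Clone, Copy)]
enum CacheEntry {
    Value,       // blob (or a pointer to it) is cached
    NotStorable, // blob is known to be uncacheable (e.g. too large)
}

// Should a reader wait behind an in-flight read of the same key,
// hoping to hit cache afterwards?
fn should_serialize_read(cached: Option<CacheEntry>) -> bool {
    match cached {
        // Cache hit: no blobstore read needed at all.
        Some(CacheEntry::Value) => false,
        // Known-uncacheable: waiting would serialize reads for nothing,
        // so release the read immediately.
        Some(CacheEntry::NotStorable) => false,
        // Unknown: dedupe with other readers of the same key.
        None => true,
    }
}

fn main() {
    assert!(should_serialize_read(None));
    assert!(!should_serialize_read(Some(CacheEntry::NotStorable)));
    assert!(!should_serialize_read(Some(CacheEntry::Value)));
}
```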

Reviewed By: farnz

Differential Revision: D22285269

fbshipit-source-id: be7f1c73dc36b6d58c5075172e5e3c5764eed894
2020-07-03 05:53:10 -07:00
Thomas Orozco
b9319a4d32 mononoke/virtually_sharded_blobstore: add a newtype for cache keys + a prefix
Summary:
I'm going to store things that aren't quite the exact blobs in here, so on the
off chance that we somehow have two caching blobstores (the old one and this
one) that use the same pools, we should avoid collisions by using a prefix.

And, since I'm going to use a prefix, I'm adding a newtype wrapper to not use
the prefixed key as the blobstore key by accident.
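
The newtype idea can be sketched like this (the prefix value and names are illustrative):

```rust
// Wrap prefixed cache keys in a newtype so they can't be confused
// with raw blobstore keys at compile time.
#[derive(Debug, PartialEq)]
struct CacheKey(String);

impl CacheKey {
    fn from_key(key: &str) -> Self {
        // A fixed prefix (hypothetical value) avoids collisions with any
        // other caching blobstore that might share the same cache pools.
        CacheKey(format!("vsb.{}", key))
    }
}

fn main() {
    let cache_key = CacheKey::from_key("repo0001.content.blake2.abc");
    assert_eq!(cache_key.0, "vsb.repo0001.content.blake2.abc");
    // A function taking `CacheKey` can no longer accidentally be handed
    // a raw blobstore key `&str`, and vice versa.
}
```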

Differential Revision: D22285271

fbshipit-source-id: e352ba107f205958fa33af829c8a46896c24027e
2020-07-03 05:53:10 -07:00
Thomas Orozco
bf3c2e19f0 mononoke/virtually_sharded_blobstore: a caching blobstore that deduplicates
Summary:
This introduces a caching blobstore that deduplicates reads and writes. The
underlying motivation is to improve performance for processes that might find
themselves inadvertently reading the same data concurrently from a bunch of
independent callsites (most of Mononoke), or writing the same bit of data over
and over again.

The latter is particularly useful for things like commit cloud backfilling in
WWW, where some logger commits include the same blob being written hundreds or
thousands of times, and cause us to overload the underlying Zippy shard in
Manifold. This is however a problem we've also encountered in the past in e.g.
the deleted files manifest and had to solve there. This blobstore is a little
different in the sense that it solves that problem for all writers.

This comes at the cost of writes being dropped if they're known to be
redundant, which prevents updates through this blobstore. This is desirable for
most of Mononoke, but not all (notably, for skiplist updates it's not great).

For now, I'm going to add this behind an opt-in flag, and later on I'm planning
to make it opt-out and turn it off there (I'm thinking to use the CoreContext
for this).
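
As a toy illustration of the write-deduplication half (the real blobstore is async, dedupes in-flight requests, and is backed by shared cache pools; names here are illustrative):

```rust
use std::collections::HashMap;

// Remember what was last written for each key and drop byte-identical
// re-writes before they reach the underlying store.
struct DedupingStore {
    seen: HashMap<String, Vec<u8>>,
    underlying_puts: usize,
}

impl DedupingStore {
    fn new() -> Self {
        DedupingStore { seen: HashMap::new(), underlying_puts: 0 }
    }

    fn put(&mut self, key: &str, value: Vec<u8>) {
        if self.seen.get(key) == Some(&value) {
            // Known-redundant write: skip the underlying store entirely.
            // Note this is what prevents updates through this blobstore.
            return;
        }
        self.underlying_puts += 1;
        self.seen.insert(key.to_string(), value);
    }
}

fn main() {
    let mut store = DedupingStore::new();
    for _ in 0..1000 {
        store.put("logger-blob", b"same bytes".to_vec());
    }
    // A thousand identical writes reach the backing store only once.
    assert_eq!(store.underlying_puts, 1);
}
```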

Reviewed By: farnz

Differential Revision: D22285270

fbshipit-source-id: 4e3502ab2da52a3a0e0e471cd9bc4c10b84a3cc5
2020-07-03 05:53:10 -07:00
Kostia Balytskyi
f210326656 blobstore_healer: log the speed with which queue rows are deleted
Summary: This allowed me to compare two alternative approaches to queue draining, and generally seems like a useful thing to do.

Reviewed By: krallin

Differential Revision: D22364733

fbshipit-source-id: b6c76295c85b4dec6f0bfd7107c30bb4e4a28942
2020-07-03 05:09:56 -07:00
Johan Schuijt-Li
2b69716461 push compat() down one level from main
Summary: Migrate to new-style futures

Reviewed By: ikostia

Differential Revision: D22365232

fbshipit-source-id: 08ddd50be1c34fe90a453f369cea2e45323b63db
2020-07-03 02:36:09 -07:00
Stanislau Hlebik
2d24ddf2e1 mononoke: add --all-types to backfill_derive_data single
Summary: It's useful to derive all enabled derived data at once

Reviewed By: krallin

Differential Revision: D22336338

fbshipit-source-id: 54bc27ab2c23c175913fc02e6bf05d18a54c249c
2020-07-03 00:20:58 -07:00
Stanislau Hlebik
2a54f281f2 mononoke: add an option to perform a stack move in megarepotool
Summary:
We've recently added an option to perform a stack move in megarepolib. A "stack
move" is a stack of commits that moves files according to a mover. Now let's
expose it in the megarepotool.

Reviewed By: ikostia

Differential Revision: D22312486

fbshipit-source-id: 878d4b2575ed2930bbbf0b9b35e51bb41393e622
2020-07-03 00:18:41 -07:00
Mark Thomas
dba11deb2d scs_server: implement commit_list_descendant_bookmarks
Summary:
Implement `commit_list_descendant_bookmarks` by iterating over all bookmarks and
checking if the target commit is an ancestor of the bookmark's current target.

Reviewed By: mitrandir77

Differential Revision: D22357988

fbshipit-source-id: e1b1d7387742ba7133370f52c4d36c0b1a77f4e3
2020-07-02 12:58:16 -07:00
Mark Thomas
309e4539ab mononoke_api: bypass cache for old bookmark location in move_bookmark
Summary:
Currently the `move_bookmark` API needs to get the old bookmark location in order
to move the bookmark.  We'll fix that in general later, but for now we need to
make sure the value we use doesn't come from an out-of-date cache (e.g. the
warm_bookmarks_cache), as it may prevent the move from working.

Reviewed By: krallin

Differential Revision: D22358467

fbshipit-source-id: 4d46a6be717644b24663318326fdcd81249481c9
2020-07-02 11:01:04 -07:00
Mark Thomas
6d5bce25c6 mononoke_api: implement pagination for all bookmarks
Summary:
Bookmark requests that are truncated because the requested limit is reached now return a `continue_after` value, containing the last bookmark that was processed.

Callers can make a subsequent request with the same parameters, but `after` set to the value received in `continue_after` to continue their request where it left off.
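
The caller-side loop can be sketched as follows (a standalone sketch over an in-memory list; the real API is a service request/response):

```rust
// Illustrative response shape: a page of bookmarks plus an optional
// continuation token naming the last bookmark processed.
struct Response {
    bookmarks: Vec<String>,
    continue_after: Option<String>,
}

fn list_page(all: &[&str], after: Option<&str>, limit: usize) -> Response {
    let page: Vec<String> = all
        .iter()
        .copied()
        .filter(|b| after.map_or(true, |a| *b > a))
        .take(limit)
        .map(|b| b.to_string())
        .collect();
    // A full page may be truncated, so hand back a continuation token.
    let continue_after = if page.len() == limit { page.last().cloned() } else { None };
    Response { bookmarks: page, continue_after }
}

fn main() {
    let all = ["a", "b", "c", "d", "e"];
    let mut after: Option<String> = None;
    let mut seen = Vec::new();
    loop {
        let resp = list_page(&all, after.as_deref(), 2);
        seen.extend(resp.bookmarks);
        match resp.continue_after {
            Some(next) => after = Some(next),
            None => break,
        }
    }
    assert_eq!(seen, vec!["a", "b", "c", "d", "e"]);
}
```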

Reviewed By: krallin

Differential Revision: D22338301

fbshipit-source-id: 81e398bee444e0960e65dc3b4cdbbe877aff926d
2020-07-02 07:53:12 -07:00
Mark Thomas
344c8edda4 add commit_list_descendant_bookmarks
Summary:
Add `commit_list_descendant_bookmarks` which will list all bookmarks that are
descendants of a particular commit.

We will also use this opportunity to complete the implementation of pagination
for regular bookmark listing, so add the appropriate fields to the
`repo_list_bookmarks` request and response structs.

Reviewed By: StanislavGlebik

Differential Revision: D22338300

fbshipit-source-id: defd019795c2a2ac9e5573d58de187c10848397f
2020-07-02 07:53:12 -07:00
Mark Thomas
3e4e59baef bookmarks: add 'pagination' filter to 'list'
Summary:
Add a new parameter, `pagination`, to the `list` method of the `Bookmarks` trait.

This restricts the returned bookmarks to those lexicographically after the
given bookmark name (exclusive).  This can be used to implement pagination:
callers can provide the last bookmark in the previous page to fetch the
next page of bookmarks.
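
The exclusive-after filter can be illustrated with a small sketch (not the actual `Bookmarks` trait signature):

```rust
// Keyset pagination over a sorted list of bookmark names: return names
// strictly after `after` (exclusive), up to `limit`.
fn list_after<'a>(sorted: &'a [&'a str], after: Option<&str>, limit: usize) -> Vec<&'a str> {
    sorted
        .iter()
        .copied()
        .filter(|name| match after {
            Some(a) => *name > a, // strictly after: exclusive bound
            None => true,
        })
        .take(limit)
        .collect()
}

fn main() {
    let names = ["main", "release", "stable"];
    // First page.
    assert_eq!(list_after(&names, None, 2), vec!["main", "release"]);
    // Next page continues after the last name of the previous page.
    assert_eq!(list_after(&names, Some("release"), 2), vec!["stable"]);
}
```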

Reviewed By: krallin

Differential Revision: D22333943

fbshipit-source-id: 686df545020d936095e29ae5fee24258511f4083
2020-07-02 07:53:12 -07:00
Mark Thomas
742eb6f829 bookmarks: rework Bookmarks traits
Summary:
Rework the bookmarks traits:

* Split out log functions into a separate `BookmarkUpdateLog` trait.  The cache doesn't care about these methods.

* Simplify `list` down to a single method with appropriate filtering parameters.  We want to add more filtering types, and adding more methods for each possible combination will be messier.

* The `Bookmarks` and `BookmarkUpdateLog` traits become `attributes` on `BlobRepo`, rather than being a named member.

Reorganise the bookmarks crate to separate out the bookmarks log and transactions into their own modules.

Reviewed By: krallin

Differential Revision: D22307781

fbshipit-source-id: 4fe514df8b7ef92ed3def80b21a16e196d916c64
2020-07-02 07:53:12 -07:00
Mark Thomas
64610d46c2 bookmarks: escape LIKE patterns used for bookmark prefixes
Summary:
The LIKE pattern used by bookmark prefixes needs to be escaped, otherwise
users looking for bookmarks containing `\`, `_` or `%` will get the
wrong results.
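
Escaping such a prefix might look like the following sketch (standalone and illustrative; depending on the database, the escape character may also need to be declared with an `ESCAPE` clause):

```rust
// Escape the characters that are special inside a SQL LIKE pattern:
// `\` (the escape character itself), `%` (any sequence), `_` (any char).
fn escape_like_pattern(prefix: &str) -> String {
    let mut escaped = String::with_capacity(prefix.len());
    for c in prefix.chars() {
        match c {
            '\\' | '%' | '_' => {
                escaped.push('\\');
                escaped.push(c);
            }
            _ => escaped.push(c),
        }
    }
    escaped
}

fn main() {
    // Without escaping, the `_` and `%` here would act as wildcards.
    assert_eq!(escape_like_pattern("release_1%"), "release\\_1\\%");
    assert_eq!(escape_like_pattern("plain"), "plain");
}
```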

Reviewed By: krallin

Differential Revision: D22336716

fbshipit-source-id: 99b0ad6097f096358e66042752e4d153359935be
2020-07-02 07:53:11 -07:00
Kostia Balytskyi
b134a2f5bb blobstore_healer: fix how replication lag is monitored
Summary: We were monitoring the wrong lag so far.

Reviewed By: farnz

Differential Revision: D22356455

fbshipit-source-id: abe41a4154c2a8d53befed4760e2e9544797c845
2020-07-02 06:18:35 -07:00
Stanislau Hlebik
f6d06a266a mononoke: check conflicts correctly when doing bulk adds in transaction
Summary:
The `bulk_add()` method was checking for conflicts correctly, i.e. it wouldn't fail
if we tried to insert the same mapping twice.
`bulk_add_git_mapping_in_transaction` wasn't doing this check, i.e. it would
fail.

This caused us a few problems and this diff fixes them - now
`bulk_add_git_mapping_in_transaction` would do the same checks as bulk_add was
doing previously.

There is another important change in behaviour: if we try to insert two entries,
where one of them has a conflict and the other doesn't, then previously we'd insert
the second entry. Now we don't insert either; arguably that's the preferred behaviour.
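
The intended semantics might be sketched like this (an in-memory stand-in; the real mapping lives in SQL and the check happens inside a transaction):

```rust
use std::collections::HashMap;

// Re-inserting an identical mapping is a no-op; a conflicting mapping
// for an existing key fails, and then *none* of the batch is inserted.
fn bulk_add(
    mapping: &mut HashMap<String, String>,
    entries: &[(String, String)],
) -> Result<(), String> {
    // Check the whole batch first, so that one conflict inserts nothing.
    for (git, bonsai) in entries {
        if let Some(existing) = mapping.get(git) {
            if existing != bonsai {
                return Err(format!("conflicting entry for {}", git));
            }
        }
    }
    for (git, bonsai) in entries {
        mapping.insert(git.clone(), bonsai.clone());
    }
    Ok(())
}

fn main() {
    let mut m = HashMap::new();
    let e = vec![("g1".to_string(), "b1".to_string())];
    assert!(bulk_add(&mut m, &e).is_ok());
    // Same mapping again: not an error.
    assert!(bulk_add(&mut m, &e).is_ok());
    // Conflicting mapping: error, and nothing else from the batch lands.
    let bad = vec![
        ("g1".to_string(), "b2".to_string()),
        ("g2".to_string(), "b2".to_string()),
    ];
    assert!(bulk_add(&mut m, &bad).is_err());
    assert!(!m.contains_key("g2"));
}
```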

Reviewed By: krallin

Differential Revision: D22332001

fbshipit-source-id: 86fff8c23c43eeca0fb36b01b10cdaa73b3ce4ab
2020-07-02 05:31:29 -07:00
Arun Kulshreshtha
632596e947 edenapi: make JSON representation of DataRequest into an object
Summary:
EdenAPI's `make_req` tool allows developers to create ad-hoc CBOR request payloads for debugging purposes (e.g., for use with `curl`). The tool generates requests from human-created JSON, which is particularly useful in Mercurial and Mononoke's integration tests.

Later in this stack, the use of this JSON format will be extended beyond just this one tool. As such, it is important that the representation be sufficiently extensible to accommodate future changes to the request structs. In the case of the JSON representation of `DataRequest`, this means changing from an array to a single-attribute object, so that additional fields can potentially be added in the future.

Reviewed By: quark-zju

Differential Revision: D22319314

fbshipit-source-id: 5931bc7ab01ca48ceab5ffd1c9177dd3035b643c
2020-07-01 23:03:00 -07:00
Stefan Filip
422c84b659 mononoke: monitor replication lag in segmented_changelog::IdMap
Summary:
The (re)construction process for the IdMap will generate millions of rows
to be inserted in our database. We want to throttle the inserts so that
the database doesn't topple over.

Reviewed By: ikostia

Differential Revision: D22104349

fbshipit-source-id: 73b7c2bab12ae0cd836080bcf1eb64586116e70f
2020-07-01 18:18:55 -07:00
Stefan Filip
a688181255 mononoke: add MyAdmin implementation for ReplicaLagMonitor
Summary:
Simple implementation that queries the MyAdmin service to fetch replication
lag.

Caching, like in sqlblob::facebook::myadmin, will probably come in a follow-up
change.

Reviewed By: StanislavGlebik

Differential Revision: D22104350

fbshipit-source-id: fbd90174d528ddae4045e957c343e6c213f70d26
2020-07-01 18:18:55 -07:00
Stefan Filip
bf61eb5c64 mononoke: add trait ReplicaLagMonitor
Summary:
ReplicaLagMonitor aims to generalize over different strategies for fetching
the replication lag in a SQL database. Querying a set of connections is one
such strategy.
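
A rough sketch of what such a trait might look like (names and signatures are illustrative; the real trait is async and reports per-replica results):

```rust
use std::time::Duration;

// Generalize over strategies for measuring replication lag.
trait ReplicaLagMonitor {
    fn get_max_replication_lag(&self) -> Duration;
}

// One strategy: take the worst lag across a fixed set of replicas.
struct FixedReplicas {
    lags: Vec<Duration>,
}

impl ReplicaLagMonitor for FixedReplicas {
    fn get_max_replication_lag(&self) -> Duration {
        self.lags.iter().copied().max().unwrap_or(Duration::ZERO)
    }
}

fn main() {
    let monitor = FixedReplicas {
        lags: vec![Duration::from_secs(1), Duration::from_secs(3)],
    };
    // Throttling logic would consult the worst replica before writing.
    assert_eq!(monitor.get_max_replication_lag(), Duration::from_secs(3));
}
```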

Reviewed By: ikostia

Differential Revision: D22104348

fbshipit-source-id: bbbeccb55a664e60b3c14ee17f404982d09f2b25
2020-07-01 18:18:55 -07:00
Viet Hung Nguyen
5f43a49743 mononoke/repo_import: add a packman config
Summary: A tool to automate repo imports (intern project: https://fburl.com/hlq6cyma)

Reviewed By: krallin

Differential Revision: D22334417

fbshipit-source-id: 21125a73afede5cf555fc66294d8b02c619c6bba
2020-07-01 11:26:31 -07:00
Mark Thomas
60c5a1722b sql_ext: make sqlite LIKE case sensitive
Summary:
SQLite's `LIKE` operator is case insensitive by default.  This doesn't match MySQL, and
also seems like a surprising default.  Set the pragma on every connection to make it
case sensitive.

Reviewed By: farnz

Differential Revision: D22332419

fbshipit-source-id: 4f503eeaa874e110c03c27300467ddc02dc9b365
2020-07-01 11:08:43 -07:00
Mark Thomas
4baaceef2d bookmarks: rename BookmarkHgKind to BookmarkKind
Summary:
Whether a bookmark is publishing or not is not specific to Mercurial - it also affects
whether a commit is draft, so it is interesting to the Bonsai world.

Rename `BookmarkHgKind` to `BookmarkKind` to make this clear.

Since negatives are more awkward to work with, rename `PublishingNotPullDefault` to
`Publishing` and `PullDefault` to `PullDefaultPublishing` to make it clearer that
pull-default bookmarks are also publishing.

We can't rename the database column, so that remains as `hg_kind`.
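
The renamed kinds can be sketched as follows (illustrative; the real enum lives in the bookmarks crate and also covers scratch bookmarks):

```rust
// After the rename: both kinds make commits reachable from them public.
enum BookmarkKind {
    Publishing,            // was: PublishingNotPullDefault
    PullDefaultPublishing, // was: PullDefault; pull-default implies publishing
}

fn is_publishing(kind: &BookmarkKind) -> bool {
    matches!(
        kind,
        BookmarkKind::Publishing | BookmarkKind::PullDefaultPublishing
    )
}

fn main() {
    assert!(is_publishing(&BookmarkKind::Publishing));
    assert!(is_publishing(&BookmarkKind::PullDefaultPublishing));
}
```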

Reviewed By: StanislavGlebik

Differential Revision: D22307782

fbshipit-source-id: 9e686a98cc5eaf9af722fa62fac5ffd4844967fd
2020-07-01 11:08:43 -07:00
Kostia Balytskyi
a40d9bb264 blobstore healer: improve incomplete batch identification logic
Summary:
The blobstore healer has logic that prevents it from doing busy work when the
queue is empty. This is implemented by checking whether the DB query
fetched the whole `LIMIT` of values. Or that is the idea, at least. In
practice, here's what happens:

1. DB query is a nested one: first it gets at most `LIMIT` distinct
`operation_key` entries, then it gets all rows with such entries. In practice
this almost always means `# of blobstores * LIMIT` rows, as we almost always
succeed writing to every blobstore
2. Once this query is done, the rows are grouped by the `blobstore_key`, and a
future is created for each such row (for simplicity, ignore that future may not
be created).
3. We then compare the number of created futures with `LIMIT` and report an
incomplete batch if the numbers are different.

This logic has a flaw: same `blobstore_key` may be written multiple times with
different `operation_key` values. One example of this: `GitSha1` keys for
identical contents. When this happens, grouping from step 2 above will produce
fewer than `LIMIT` groups, and we'll end up sleeping for nothing.

This is not a huge deal, but let's fix it anyway.

My fix also adds some strictly speaking unnecessary logging, but I found it
helpful during this investigation, so let's keep it.

The price of this change is the two `unique_by` calls, both of which
allocate a temporary hash set [1] of size `LIMIT * len(blobstore_key) * #
blobstores` (and another one with `operation_key`). For `LIMIT=100_000`
`len(blobstore_key)=255`, `# blobstores = 3` we have roughly 70 mb for the
larger one, which should be ok.

[1] https://docs.rs/itertools/0.9.0/itertools/trait.Itertools.html#method.unique
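
One way to express the fixed check (an illustrative sketch, not the actual healer code): compare the number of *distinct `operation_key` values* fetched against `LIMIT`, rather than the number of `blobstore_key` groups.

```rust
use std::collections::HashSet;

struct QueueRow {
    blobstore_key: String,
    operation_key: String,
}

// Counting blobstore-key groups undercounts when the same blobstore_key
// appears under several operation_keys; distinct operation_keys do not.
fn batch_is_full(rows: &[QueueRow], limit: usize) -> bool {
    let distinct_ops: HashSet<&str> =
        rows.iter().map(|r| r.operation_key.as_str()).collect();
    distinct_ops.len() >= limit
}

fn main() {
    // Two operations wrote the same blobstore key (e.g. identical
    // contents): one key group, but two distinct operations.
    let rows = vec![
        QueueRow { blobstore_key: "key1".into(), operation_key: "op1".into() },
        QueueRow { blobstore_key: "key1".into(), operation_key: "op2".into() },
    ];
    assert!(batch_is_full(&rows, 2));
    assert!(!batch_is_full(&rows, 3));
}
```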

Reviewed By: ahornby

Differential Revision: D22293204

fbshipit-source-id: bafb7817359e2c867cf33c319a886653b974d43f
2020-07-01 02:08:54 -07:00
Viet Hung Nguyen
f5db2fdcdc mononoke/repo_import: add rewrite_commit functionality
Summary:
Previous commit: D22233127 (fa1caa8c4e)

In this diff, I added rewrite commit path functionality using Mover https://fburl.com/diffusion/6rnf9q2f to repo_import.
Given a prefix (e.g. new_repo), we prepend the paths of the files extracted from the bonsai changesets given by gitimport (e.g. folder1/file1 => new_repo/folder1/file1). Previously, we did this manually when importing a git repo (https://www.internalfb.com/intern/wiki/Mercurial/Admin/ImportingRepos/) using the convert extension.

Reviewed By: StanislavGlebik

Differential Revision: D22307039

fbshipit-source-id: 322533e5d6cbaf5d7eec589c8cba0c1b9c79d7af
2020-06-30 11:39:35 -07:00
Simon Farnsworth
aa00466319 Sync hook_type change from D22282802
Summary: Also fix up the parser test that now fails with this change

Reviewed By: StanislavGlebik

Differential Revision: D22306340

fbshipit-source-id: 820aad48068471b03cbc1c42107c443bfa680607
2020-06-30 11:20:54 -07:00
Mateusz Kwapich
15256a91be add an option to limit commit_history to descendants of commit
Summary: This will be used for commits_between replacement

Differential Revision: D22234236

fbshipit-source-id: c0c8550d97a9e8b42034d605e24ff54251fbd13e
2020-06-30 08:09:30 -07:00
Mateusz Kwapich
66a810a68c new history format
Summary: Some SCMQuery queries need just a list of commit hashes instead of full coverage.

Reviewed By: markbt

Differential Revision: D22165006

fbshipit-source-id: 9eeeab72bc4c88ce040d9d2f1a7df555a11fb5ae
2020-06-30 08:09:30 -07:00
Mateusz Kwapich
398ab603c2 add into_response that would map commit ids across identity schemes
Summary: This way we can go from a list of changesets to the changeset ids that we're returning as an answer in a few queries.

Differential Revision: D22165005

fbshipit-source-id: 4da8ab2a89be0de34b2870044e44d35424be5510
2020-06-30 08:09:30 -07:00
Stanislau Hlebik
c43ea517b0 mononoke: move derive_data_for_csids to derived_data_utils
Summary: It can be useful in other places as well, not only in blobimport

Reviewed By: krallin

Differential Revision: D22307314

fbshipit-source-id: f7d8c91101edc2ed4f230f7ef6796e39fbea5117
2020-06-30 06:22:31 -07:00
Mark Thomas
160936b732 bookmarks: convert to new-style BoxFutures and BoxStreams
Summary: Convert the bookmarks traits to use new-style `BoxFuture<'static>` and `BoxStream<'static>`.  This is a step along the path to full `async`/`await`.

Reviewed By: farnz

Differential Revision: D22244489

fbshipit-source-id: b1bcb65a6d9e63bc963d9faf106db61cd507e452
2020-06-30 02:37:34 -07:00
Jun Wu
4b45a2a185 test-pushrebase: use modern configs
Summary:
Enable narrow-heads.

Changed log revset from `:` to `all()` to make the test compatible.

Reviewed By: krallin

Differential Revision: D22200495

fbshipit-source-id: 148a82e77c953b9e7dbed055ef464c318e56cafa
2020-06-29 13:00:07 -07:00
Jun Wu
4c1634b9fd test-commitcloud: use modern configs
Summary:
Enable narrow-heads and mutation. Disable obsmarker-related features.

Change phase manipulation to `debugmakepublic` which works with narrow-heads.

Reviewed By: krallin

Differential Revision: D22200511

fbshipit-source-id: 8dec050f137e6cc055015fe084eb4cc67faa1216
2020-06-29 13:00:07 -07:00
Jun Wu
42b145a65d test-walker-scrub-blobstore: use modern configs
Summary:
Enable narrow-heads.

The test output seems a bit unstable - sometimes I got 28 there. So I globbed
it out.

Reviewed By: krallin

Differential Revision: D22200497

fbshipit-source-id: f005381a341d88c0bcbb09150e7d1878df7a38f3
2020-06-29 13:00:07 -07:00
Jun Wu
9eb40deffe test-pushrebase-emit-obsmarkers: use modern configs
Summary:
Enable narrow-heads.

Change the revset `:` to `all()`. With narrow-heads, `:` selects all commits
including those that are not referred by visible heads. The `all()` revset
only selects commits reachable from visible heads.

Reviewed By: krallin

Differential Revision: D22200498

fbshipit-source-id: beb863d42069ae898e419a4a75b3a707c72ae1f9
2020-06-29 13:00:07 -07:00
Jun Wu
e0c9b2b87b test-sqlblob: use modern configs
Summary:
Enable remotenames, selectivepull, and narrow-heads. Use the new stream clone
code path.

Selectivepull makes a difference. `hg pull -r HASH` also pulls the selected
bookmarks so an extra `pull` was unnecessary. Change the clone command to use
`-U` to trigger the new clone code path.

Reviewed By: krallin

Differential Revision: D22200499

fbshipit-source-id: 764202098c7e8afdbb5e2ee83679da7570c08c90
2020-06-29 13:00:07 -07:00
Jun Wu
adcf846f2f test-reduction: use modern configs
Summary:
Enable remotenames and narrow-heads.

Local bookmarks are replaced by remote bookmarks, causing the test change.

Reviewed By: krallin

Differential Revision: D22200500

fbshipit-source-id: aeee528d1766e0642c12e78a6c1a50cadc8a579a
2020-06-29 13:00:07 -07:00
Jun Wu
d1da1d70c1 test-push-redirector-sync-job: use modern configs
Summary:
Enable remotenames and narrow-heads.

The commits become 'draft' because there are no remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200514

fbshipit-source-id: 04d0befa7c22756e936a28ffdcdf1305057cf062
2020-06-29 13:00:07 -07:00
Jun Wu
2875942761 test-infinitepush: use modern configs
Summary:
Enable remotenames and narrow-heads.

The test was migrated cleanly. The only change is that local bookmarks are
replaced by remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200510

fbshipit-source-id: f5b8cd2ed125e9fc4e5daac897851d91fef5693f
2020-06-29 13:00:07 -07:00
Jun Wu
6ecf255fcf test-infinitepush-mutation: use modern configs
Summary:
Enable remotenames and narrow-heads.

Local bookmarks are replaced by remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200503

fbshipit-source-id: 41ac4f4f606011dcaf6d0d9867b01fb77b9a79d8
2020-06-29 13:00:07 -07:00
Jun Wu
95cf4a2a39 test-infinitepush-hydrated: use modern configs
Summary:
Enable remotenames and narrow-heads.

Phase exchange is gone because of narrow-heads.
The remotenames extension was written suboptimally, so it issued a second
bookmarks request (which, hopefully, can be removed by having selective
pull everywhere and migrating pull to use the new API).

Reviewed By: krallin

Differential Revision: D22200506

fbshipit-source-id: c522bb9fc1396d813e0f1f380c4290445bab3db3
2020-06-29 13:00:07 -07:00
Jun Wu
ffde9f50e9 test-infinitepush-commits-disabled: use modern configs
Summary:
Enable remotenames and narrow-heads. The `master_bookmark` is no longer a local
bookmark in the client repo.

Reviewed By: krallin

Differential Revision: D22200513

fbshipit-source-id: bc3c1715ce21f45a35bc67148eb00e44944bea6e
2020-06-29 13:00:06 -07:00
Jun Wu
8ecb79a921 test-gettreepack-sparse-update: use modern configs
Summary: Enable remotenames and narrow-heads.

Reviewed By: krallin

Differential Revision: D22201083

fbshipit-source-id: 585dff69db9dd725c8fa1090d47c85b150f979da
2020-06-29 13:00:06 -07:00
Jun Wu
439c029007 test-gettreepack-designated-nodes: use modern configs
Summary:
Enable remotenames and narrow-heads. The server gets one more request from
remotenames.

Reviewed By: krallin

Differential Revision: D22200502

fbshipit-source-id: 26bc28b19438c7be4a19eae6be728c83b113f822
2020-06-29 13:00:06 -07:00
Jun Wu
93318c255b test-bookmark-hg-kind: use modern configs
Summary:
Enable remotenames and narrow-heads. The client gets remote bookmarks instead
of local bookmarks during clone and phases are decided by remote bookmarks.

Reviewed By: krallin

Differential Revision: D22200515

fbshipit-source-id: 12a9e892855b3a8f62f01758565de5f224c4942b
2020-06-29 13:00:06 -07:00
Jun Wu
8bde7d1316 tests: show remotenames in tglogpnr
Summary:
Change the template to show remote bookmarks, which will be more relevant once
we migrate to modern configs. Namely, phases will be decided by remote bookmarks.

The named branches logic was mostly removed from the code base. Therefore
drop the `{branches}` template.

Reviewed By: StanislavGlebik

Differential Revision: D22200512

fbshipit-source-id: 8eca3a71ff88b8614023f4920a448156fcd712d5
2020-06-29 13:00:06 -07:00
Jun Wu
dbd29b7d06 tests: turn on narrow-heads for some tests
Summary: With narrow-heads, the phase exchange step is skipped.

Reviewed By: krallin

Differential Revision: D22200504

fbshipit-source-id: 6ab366e7e68eb3b82f52acaa8f488747435e0ecf
2020-06-29 13:00:06 -07:00
Jun Wu
889beacdf1 tests: enable narrow-heads for Mononoke tests
Summary:
Most tests pass without changes. Some incompatible tests are added to the
special list.

Reviewed By: krallin

Differential Revision: D22200505

fbshipit-source-id: 091464bbc7c9c532fed9ef91f2c955d6e4f2df0b
2020-06-29 13:00:06 -07:00
Stanislau Hlebik
04ce32014d mononoke: log pushed commits to scribe
Summary: This is the final diff of the stack - it starts logging pushed commits to scribe

Reviewed By: farnz

Differential Revision: D22212755

fbshipit-source-id: ec09728408468acaeb1c214d43f930faac30899b
2020-06-29 12:15:22 -07:00
Stanislau Hlebik
6fd54d6b22 mononoke: don't fail if logging to scribe failed
Summary:
Failing push if we failed to log to scribe doesn't make a lot of sense. By that
time the ship has sailed - commit has already been pushed and by failing the
request we can't undo that. It will just create an annoyance by whoever is
pushing.

Instead let's log it to scuba

Reviewed By: farnz

Differential Revision: D22256687

fbshipit-source-id: 2428bbf1db4cef6fa80777ad65184fab1804fa9c
2020-06-29 12:15:22 -07:00
Stanislau Hlebik
8a137ae922 mononoke: add Scribe
Summary:
At the moment we can't test logging to scribe easily - we don't have a way to
mock it. Scribe is supposed to help with that.

It will let us configure all scribe logs to go to a directory on a
filesystem, similar to the way we configure scuba. The Scribe itself will
be stored in CoreContext

Reviewed By: farnz

Differential Revision: D22237730

fbshipit-source-id: 144340bcfb1babc3577026191428df48e30a0bb6
2020-06-29 12:15:22 -07:00
Jun Wu
4902a3300c tests: enable narrow-heads by default
Summary: Many tests are incompatible. But many are passing.

Reviewed By: kulshrax

Differential Revision: D22052475

fbshipit-source-id: 1f30ac2b0fe034175d5ae818ec2be098dbd5283d
2020-06-29 11:29:04 -07:00
Simon Farnsworth
7e9b8dd9e9 Remove last vestiges of Lua hooks from tests
Summary:
For Lua hooks, we needed to know whether to run the hook per file or per changeset. Rust hooks know this implicitly, as they're built into the server.

Stop having the tests set an unnecessary config

Reviewed By: krallin

Differential Revision: D22282799

fbshipit-source-id: c9f6f6325823d06d03341f04ecf7152999fcdbe7
2020-06-29 10:03:22 -07:00
Harvey Hunt
026710c2cd mononoke: Remove --config_path from server arguments
Summary:
D21642461 (46d2b44c0e) converted Mononoke server to use the
`--mononoke-config-path` common argument style to select a config path.

Now that this change has been running for a while, remove the extra logic in
the server that allowed it to accept both the deprecated `--config_path / -P`
and the new arg.

Reviewed By: ikostia

Differential Revision: D22257386

fbshipit-source-id: 7da4ed4e0039d3659f8872693fa4940c58bae844
2020-06-29 07:28:36 -07:00
Stanislau Hlebik
f55fa975a5 mononoke: fix unused imports
Reviewed By: krallin

Differential Revision: D22281331

fbshipit-source-id: 656ba6500193bd179d1e6cd1443de3e85d37c597
2020-06-29 03:22:11 -07:00
Kostia Balytskyi
fb3eea2b56 commit_validator: get rid of unneeded bookmark rewriting
Summary:
`get_entry_with_small_repo_mapings` is a function that turns a `CommitEntry`
struct into a `CommitEntryWithSmallReposMapped` struct - the idea being that this
function looks up hashes of commits into which the original commit from the
large repo got rewritten (in practice rewriting may have happened in the small
-> large direction, but it is not important for the purpose of this job). So it
establishes a mapping. Before this
diff, it actually established `Large<ChangesetId> ->
Option<(Small<ChangesetId>, Option<BookmarkName>)>` mapping, meaning that it
recorded into which bookmark the large bookmark was rewritten. This was useless
information (as evidenced by the fact that it was ignored by the
`prepare_entry` function, which turns `CommitEntryWithSmallReposMapped` into
`EntryPreparedForValidation`. It is useless because bookmarks are mutable and
it is impossible to do historic validation of the correctness of bookmark
renaming: bookmarks may have been correctly renamed when the commits were pushed,
but they may be incorrectly renamed now and vice-versa. To deal with bookmarks,
we have a separate job, `bookmarks_validator`.

So this diff stops recording this useless information. As a bonus, this will
make migration onto `LiveCommitSyncConfig` easier.

Reviewed By: StanislavGlebik

Differential Revision: D22235389

fbshipit-source-id: c02b3f104a8cbd1aaf76100aa0930efeac475d42
2020-06-29 01:48:52 -07:00
Kostia Balytskyi
b7dba9ff2f cross_repo_sync: expose get_commit_sync_outcome as a public fn
Summary: We need to be able to query `synced_commit_mapping` to understand which `version_name` was used to sync commits. That `version_name` will be needed to produce `CommitSyncConfig` by utilizing upcoming `LiveCommitSyncConfig` APIs. And `CommitSyncConfig` is needed to create `CommitSyncer`. So let's extract this fn out of `CommitSyncer`, as it's really an independent piece of functionality.

Reviewed By: farnz

Differential Revision: D22244952

fbshipit-source-id: 53e55139efd423174176720c8bf7e3ecc0dcb0d7
2020-06-27 04:42:54 -07:00
Kostia Balytskyi
c01294e8d6 backsyncer_cmd: use LiveCommitSyncConfig
Summary:
This diff migrates `backsyncer_cmd` (the thing that runs in the separate backsyncer job, as opposed to bakcsyncer, triggered from push-redirector) onto `LiveCommitSyncConfig`. Specifically, this means that on every iteration of the loop, which calls `backsync_latest` we reload `CommitSyncConfig` from configerator, build a new `CommitSyncer` from it, and then pass that `CommitSyncer` to `backsync_latest`.

One choice made here is to *not* create `CommitSyncer` on every iteration of the inner loop of `backsync_latest` and handle live configs outside. The reason for this is twofold:
- `backsync_latest` is called form `PushRedirector` methods, and `PushRedirector` is recreated on each `unbundle` using `LiveCommitSyncConfig`. That call provides an instance of `CommitSyncer` used to push-redirect a commit we want to backsync. It seems strictly incorrect to try and maybe use a different instance.
- because of some other consistency concerns (different jobs getting `CommitSyncConfig` updates at different times), any sync config change needs to go through the following loop:
  - lock the repo
  - land the change
  - wait some time, until all the possible queues (x-repo sync and backsync) are drained
  - unlock the repo
- this means that it's ok to have the config refreshed outside of `backsync_latest`
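The reload-per-iteration pattern described above can be sketched like this. Everything here is a hypothetical stand-in (`CommitSyncConfig`, `Backsyncer`, and driving the loop from a slice of configs to simulate the config changing between iterations); the real code reloads from configerator and does actual syncing.

```rust
// Minimal sketch: a fresh syncer is built from the freshly loaded config on
// every iteration, instead of caching one syncer for the whole loop.

#[derive(Clone, Debug, PartialEq)]
struct CommitSyncConfig {
    version: u64,
}

struct Backsyncer {
    config: CommitSyncConfig,
}

impl Backsyncer {
    fn backsync_latest(&self) -> u64 {
        // Real code would backsync commits; here we just report which
        // config version this syncer was built from.
        self.config.version
    }
}

fn run_loop(configs_over_time: &[CommitSyncConfig]) -> Vec<u64> {
    let mut versions_used = Vec::new();
    for cfg in configs_over_time {
        // Reload config, rebuild the syncer, then run one iteration.
        let syncer = Backsyncer { config: cfg.clone() };
        versions_used.push(syncer.backsync_latest());
    }
    versions_used
}
```

Because the syncer is rebuilt each time, a config change lands on the very next iteration, which is what makes the lock/land/drain/unlock procedure above safe.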

Reviewed By: farnz

Differential Revision: D22206992

fbshipit-source-id: 83206c3ebdcb2effad7b689597a4522f9fd8148a
2020-06-26 13:40:31 -07:00
Kostia Balytskyi
9a00efc973 cmdlib: expose test-instance and local-configerator-path
Summary:
Before this diff, only the main Mononoke server binary was able to use the fs-based
`ConfigStore`, which is pretty useful in integration tests.

Reviewed By: farnz

Differential Revision: D22256618

fbshipit-source-id: 493a064a279250d01469c9ff7f747585581caf51
2020-06-26 06:51:07 -07:00
Simon Farnsworth
7938a1957a Support BlobstoreWithLink in Sqlblob
Summary: We designed the schema to make this simple to implement - it's literally a metadata read and a metadata write.
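The "metadata read and a metadata write" idea can be illustrated with a toy in-memory model. This is not Sqlblob's real schema or API; it is a sketch of the design property: blob data is chunked, per-key metadata points at the chunks, so linking one key to another copies only the metadata row, never the data.

```rust
use std::collections::HashMap;

// Toy model of a chunked blobstore: `metadata` maps keys to chunk ids,
// `chunks` holds the actual bytes. Names and shapes are illustrative.
struct Sqlblob {
    metadata: HashMap<String, Vec<u64>>,
    chunks: HashMap<u64, Vec<u8>>,
}

impl Sqlblob {
    /// Link `link_key` to the data already stored under `existing_key`:
    /// literally a metadata read followed by a metadata write.
    fn link(&mut self, existing_key: &str, link_key: &str) -> Result<(), String> {
        let chunk_ids = self
            .metadata
            .get(existing_key)
            .cloned()
            .ok_or_else(|| format!("missing key: {}", existing_key))?;
        self.metadata.insert(link_key.to_string(), chunk_ids);
        Ok(())
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        let ids = self.metadata.get(key)?;
        let mut out = Vec::new();
        for id in ids {
            out.extend_from_slice(self.chunks.get(id)?);
        }
        Some(out)
    }
}
```

After `link`, both keys resolve to the same chunks, so the operation is cheap regardless of blob size.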

Reviewed By: ikostia

Differential Revision: D22233922

fbshipit-source-id: b392b4a3a23859c6106934f73ef60084cc4de62c
2020-06-26 03:54:42 -07:00
Simon Farnsworth
b1c85aaf4b Switch Blobstore to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch the remaining blobstore traits to new-style futures.

This just pushes the `.compat()` out to old-style futures, but it makes the move to non-'static lifetimes easier, as all the compile errors will relate to lifetime issues.
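A minimal sketch of what "new-style futures with a `'static` lifetime" looks like in a trait signature, assuming a hypothetical `Blobstore` shape (the real trait has more methods and different types). The `block_on` here is a deliberately tiny executor for futures that complete on first poll, included only so the sketch is self-contained; real callers use a proper runtime or `.compat()` into old-style futures.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// New-style boxed future over `std::future::Future` (the shape the
// `futures` crate's `BoxFuture` alias has).
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

trait Blobstore {
    // Still `'static` at this stage; shrinking this lifetime later is the
    // "move to non-'static lifetimes" the summary mentions.
    fn get(&self, key: String) -> BoxFuture<'static, Option<Vec<u8>>>;
}

struct MemBlob;

impl Blobstore for MemBlob {
    fn get(&self, key: String) -> BoxFuture<'static, Option<Vec<u8>>> {
        Box::pin(async move {
            if key == "present" { Some(b"value".to_vec()) } else { None }
        })
    }
}

// Tiny executor for immediately-ready futures (sketch only).
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop_waker() -> Waker {
        fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
    }
    let mut fut = Box::pin(fut);
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => panic!("future did not complete immediately"),
    }
}
```

With this shape, compile errors from later shrinking `'static` to a borrowed lifetime surface exactly where the summary predicts: at the lifetime annotations, not in future-style conversions.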

Reviewed By: krallin

Differential Revision: D22183228

fbshipit-source-id: 3fe3977f4469626f55cbf5636d17fff905039827
2020-06-26 03:54:42 -07:00
Kostia Balytskyi
ef87f564bc add newtype for CommitSyncConfigVersion
Summary:
This is to avoid passing `String` around. Will be useful in one of the next
diffs, where I add querying `LiveCommitSyncConfig` by versions.
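The newtype described above has roughly this shape. This is a sketch of the pattern, not the actual Mononoke definition (the real type lives in Mononoke's config code and may derive different traits); `lookup_config` is a hypothetical helper showing why the wrapper helps.

```rust
use std::fmt;

// Wrapping the version name in a dedicated type means a bare `String`
// can no longer be passed where a version is expected.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CommitSyncConfigVersion(pub String);

impl fmt::Display for CommitSyncConfigVersion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// Hypothetical consumer: the signature documents intent, and the compiler
// rejects calls with arbitrary strings.
fn lookup_config(version: &CommitSyncConfigVersion) -> bool {
    version.0 == "v1"
}
```

This is exactly the kind of API where "querying `LiveCommitSyncConfig` by versions" benefits: the query key is a distinct type rather than one more `String` parameter.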

Reviewed By: krallin

Differential Revision: D22243254

fbshipit-source-id: c3fa92b62ae32e06d7557ec486d211900ff3964f
2020-06-26 02:45:26 -07:00
Viet Hung Nguyen
fa1caa8c4e mononoke/repo_import: Add gitimport functionality and integration test
Summary: I have previously moved the gitimport functionality (D22159880 (2cf5388835)) into a separate library, since repo_import shares similar behaviours. In this diff, I set up repo_import to be able to call gitimport to get the commits and changes. (Next steps include using Mover to set the paths of the files in the commits given by gitimport.)

Reviewed By: StanislavGlebik

Differential Revision: D22233127

fbshipit-source-id: 4680c518943936f3e29d21c91a2bad60108e49dd
2020-06-25 19:54:38 -07:00
Simon Farnsworth
454de31134 Switch Loadable and Storable interfaces to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.

This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.

Reviewed By: krallin

Differential Revision: D22164315

fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
2020-06-25 08:45:37 -07:00