1577 Commits

Author SHA1 Message Date
Arun Kulshreshtha
7904099f13 edenapi_server: Add EdenAPI-specific columns to Scuba samples
Summary: Add a Scuba handler for the EdenAPI server, which will allow the server to log custom columns to Scuba via `ScubaMiddleware`. For now, the only application-specific columns are `repo` and `method`.

Reviewed By: krallin

Differential Revision: D23619437

fbshipit-source-id: f08aaf9c84657b4d92f1a1cfe5f0cb5ccd408e5e
2020-09-10 20:57:12 -07:00
Arun Kulshreshtha
cec871cc9f edenapi_server: log stats to ODS
Summary: Add `OdsMiddleware` to the EdenAPI server to log various aggregate request stats to ODS. This middleware is directly based on the `OdsMiddleware` from the LFS server (which unfortunately can't be easily generalized due to the way in which the `stats` crate works).

Reviewed By: krallin

Differential Revision: D23619075

fbshipit-source-id: d361c73d18e0d1cb57347fd24c43bdb68fb7819d
2020-09-10 20:57:12 -07:00
Arun Kulshreshtha
25af6e81d7 edenapi_server: add HandlerInfo
Summary:
Add a new `HandlerInfo` struct that can be inserted into the request's `State` by each handler to provide information about the handler to post-request middleware. (This includes things like which handler ran, which repo was queried, etc.)

A note on design: I did consider adding these fields to `RequestContext` (similar to how it's done in the LFS server), but that proved somewhat problematic in that it would require using interior mutability to avoid forcing each handler to borrow the `RequestContext` mutably (and therefore prevent it from borrowing other state data). The current approach seems preferable to adding a `Mutex` inside of `RequestContext`.
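
The design described above can be sketched roughly as follows. This is a minimal illustration, not the actual EdenAPI server code: the `State` type here is a hypothetical stand-in for a type-keyed request state, and the fields of `HandlerInfo`, `files_handler`, and `describe_request` are assumptions made for the example.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical stand-in for a type-keyed request state.
#[derive(Default)]
pub struct State {
    data: HashMap<TypeId, Box<dyn Any>>,
}

impl State {
    /// Store a value, keyed by its type (replacing any previous value of that type).
    pub fn put<T: Any>(&mut self, value: T) {
        self.data.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Borrow the stored value of type `T`, if one was inserted.
    pub fn try_borrow<T: Any>(&self) -> Option<&T> {
        self.data.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}

/// Illustrative fields only; the real struct may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct HandlerInfo {
    pub repo: String,
    pub method: String,
}

/// A handler records what it did by inserting a fresh value into the state,
/// so it never needs a mutable borrow of a shared context object.
pub fn files_handler(state: &mut State, repo: &str) {
    state.put(HandlerInfo {
        repo: repo.to_string(),
        method: "files".to_string(),
    });
}

/// Post-request middleware reads the info, if a handler provided it.
pub fn describe_request(state: &State) -> String {
    match state.try_borrow::<HandlerInfo>() {
        Some(info) => format!("repo={} method={}", info.repo, info.method),
        None => "no handler info".to_string(),
    }
}
```

Because the handler only inserts a new value, no `Mutex` or other interior mutability is needed in this sketch.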

Reviewed By: krallin

Differential Revision: D23619077

fbshipit-source-id: 911806b60126d41e2132d1ca62ba863c929d4dc9
2020-09-10 20:57:12 -07:00
Viet Hung Nguyen
4a6351c8a2 mononoke/repo_import: check dependent systems for small repo,
Summary: Once we start to move the bookmark for the large repo commits, small repo commits should also start to appear for the dependent systems (e.g. Phabricator) through back-syncing. This diff adds this functionality to see if the commits have been recognised by the tools.

Reviewed By: StanislavGlebik

Differential Revision: D23566994

fbshipit-source-id: 2f6f3b9099bb864fec6a488064abe1abe7f06813
2020-09-10 05:46:35 -07:00
Lukasz Piatkowski
1d2340782a mononoke/integration: exclude the most flaky tests (#55)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/55

Reviewed By: farnz

Differential Revision: D23622449

Pulled By: lukaspiatkowski

fbshipit-source-id: 79e1895f2c6191a2968d0cff226a38ba47188431
2020-09-10 03:26:09 -07:00
Simon Farnsworth
1f7d61a04f Teach mononoke_admin blobstore-fetch to save the raw contents to a file
Summary: Useful when looking into blobstore corruption - you can compare all the blobstore versions by manual fetches.

Reviewed By: krallin

Differential Revision: D23604436

fbshipit-source-id: 7b56947b0188536499514bae6615c6e81b9106c3
2020-09-10 02:29:48 -07:00
Simon Farnsworth
4754357f62 Asyncify more of blobstore_fetch admin command
Summary: Going to add more features, so simplify by asyncifying first

Reviewed By: krallin

Differential Revision: D23604437

fbshipit-source-id: 52b2b372e4d3fbf1d59168c6c11311d9edf4ff0f
2020-09-10 02:29:48 -07:00
Simon Farnsworth
89e30973ff Report write errors when scrubbing
Summary: When we're scrubbing blobstores, it's not actually a success state if a scrub fails to write. Report this back to the caller - no-one will usually be scrubbing unless they expect repair writes to succeed, and a failure is a sign that we need to investigate further

Reviewed By: mitrandir77

Differential Revision: D23601541

fbshipit-source-id: d328935af9999c944719a6b863d0c86b28c54f59
2020-09-10 02:29:47 -07:00
Lukasz Piatkowski
c044f1669a mononoke/integration tests: deal with bash issues on tests (#50)
Summary:
One test was fixed earlier by switching MacOS to use a modern version of bash; the other is fixed here by installing "nmap" and using the "ncat" it provides on both Linux and Mac.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/50

Reviewed By: krallin

Differential Revision: D23599695

Pulled By: lukaspiatkowski

fbshipit-source-id: e2736cee62e82d1e9da6eaf16ef0f2c65d3d8930
2020-09-10 01:56:54 -07:00
Arun Kulshreshtha
54641e8d1b edenapi_server: remove extraneous fields/methods from RequestContext
Summary: Remove unused fields and superfluous methods from `RequestContext`.

Reviewed By: singhsrb

Differential Revision: D23619076

fbshipit-source-id: 0fc42d6c29a8bb5c197d3559baa497a9e6e9c825
2020-09-09 22:43:31 -07:00
Durham Goode
f5a2347fbb py3: fix Mononoke Python 3 test failures
Summary:
Fixes a few issues with Mononoke tests in Python 3.

1. We need to use different APIs to account for the unicode vs bytes difference
for path hash encoding.
2. We need to set the language environment for tests that create utf8 file
paths.
3. We need the redaction message and marker to be bytes.  Oddly this test still
fails with jq CLI errors, but it makes it past the original error.

Reviewed By: quark-zju

Differential Revision: D23582976

fbshipit-source-id: 44959903aedc5dc9c492ec09a17b9c8e3bdf9457
2020-09-09 18:31:04 -07:00
Viet Hung Nguyen
0c84fb7a2b mononoke/mononoke_types: implement deserialize for typed_hashes
Summary:
typed_hash only implements serialize. Because of this, if we serialize a struct that contains e.g. changesetid(s), we can't deserialize it later. This diff adds a deserialize implementation for typed_hashes.
Implementation is similar to HgNodeHash's: https://fburl.com/diffusion/r3df5iga

Reviewed By: krallin

Differential Revision: D23598925

fbshipit-source-id: 4d48b75eb8a01028e6e2d9bcc1ae20051a97b7fb
2020-09-09 11:35:38 -07:00
Stanislau Hlebik
b5f1e53cd6 mononoke: use logical number of cpus in our runtime
Summary:
We are using an older version of tokio, which spawns as many threads as we have
physical cores instead of the number of logical cores. It was fixed in
https://github.com/tokio-rs/tokio/issues/2269 but we can't use it yet because
we are waiting for another fix to be released -
https://github.com/rust-lang/futures-rs/pull/2154.

For now let's hardcode it in mononoke

Reviewed By: krallin

Differential Revision: D23599140

fbshipit-source-id: 80685651a7a29ba8938d9aa59770f191f7c42b8b
2020-09-09 09:25:40 -07:00
Pavel Aslanov
f87db3eecf move existing changeset derivation logic to mercurial_derived_data
Summary:
This change moves the logic associated with Mercurial changeset derivation to the `mercurial_derived_data` crate.

NOTE: it is not converted to derived data infrastructure at this point, it is a preparation step to actually do this

Reviewed By: farnz

Differential Revision: D23573610

fbshipit-source-id: 6e8cbf7d53ab5dbd39d5bf5e06c3f0fc5a8305c8
2020-09-09 07:56:32 -07:00
David Tolnay
0cb8a052f5 Update formatter to rustfmt 2.0
Reviewed By: zertosh

Differential Revision: D23591021

fbshipit-source-id: e664aa2fdd3aaa457796a59080be6b94f604a112
2020-09-09 07:52:33 -07:00
Lukasz Piatkowski
c983dc96fe mononoke/integration tests: fix using private certs during Mac tests with hg (#53)
Summary:
The Mac integration test workflow already installs a modern curl that fixes https://github.com/curl/curl/issues/4801, but it does so after "hg" is built, so "hg" uses the system curl libraries, which fails when used with a certificate not present in keychain.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/53

Reviewed By: krallin

Differential Revision: D23597285

Pulled By: lukaspiatkowski

fbshipit-source-id: a7b8b6ae55ce338bfb9946a852cbb6b929e73203
2020-09-09 07:28:09 -07:00
Simon Farnsworth
9b9607b02e Have manual_scrub continue on errors, writing out files to let you retry instead
Summary:
There are blobs that fail to scrub and terminate the process early for a variety of reasons; when this is running as a background task, it'd be nice to get the remaining keys scrubbed, so that you don't have a large number of keys to fix up later.

Instead of simply outputting to stdout, write keys to one of three files in the format accepted on stdin:

1. Success; you can use `sort` and `comm -3` to remove these keys from the input data, thus ensuring that you can continue scrubbing.
2. Missing; you can look at these keys to determine which blobs are genuinely lost from all blobstores, and fix up.
3. Error; these will need running through scrub again to determine what's broken.
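
As a rough sketch of the three-file scheme above (not the actual `manual_scrub` code): each key is written, one per line, to the output that matches its outcome, and the loop keeps going after failures. `ScrubOutcome`, `ScrubOutputs`, and `scrub_all` are hypothetical names, and the one-key-per-line format is an assumption about the stdin format.

```rust
use std::io::{self, Write};

/// Hypothetical outcome of scrubbing a single key.
pub enum ScrubOutcome {
    Success,
    Missing,
    Error(String),
}

/// The three outputs; generic over `Write` so files or in-memory buffers both work.
pub struct ScrubOutputs<W: Write> {
    pub success: W,
    pub missing: W,
    pub error: W,
}

impl<W: Write> ScrubOutputs<W> {
    /// Write the key on its own line to the output matching its outcome.
    pub fn record(&mut self, key: &str, outcome: &ScrubOutcome) -> io::Result<()> {
        let out = match outcome {
            ScrubOutcome::Success => &mut self.success,
            ScrubOutcome::Missing => &mut self.missing,
            ScrubOutcome::Error(_) => &mut self.error,
        };
        writeln!(out, "{}", key)
    }
}

/// Scrub every key, continuing past failed keys instead of stopping at the first one.
/// Only an I/O error while writing the outputs aborts the loop.
pub fn scrub_all<W: Write>(
    keys: &[&str],
    scrub: impl Fn(&str) -> ScrubOutcome,
    outputs: &mut ScrubOutputs<W>,
) -> io::Result<()> {
    for &key in keys {
        let outcome = scrub(key);
        outputs.record(key, &outcome)?;
    }
    Ok(())
}
```

In real use the three writers would be files; the error file can then be fed back in as the input of a later run.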

Reviewed By: krallin

Differential Revision: D23574855

fbshipit-source-id: a613e93a38dc7c3465550963c3b1c757b7371a3b
2020-09-09 07:25:13 -07:00
Simon Farnsworth
aa2df38491 Improve errors on scrub failure
Summary:
With three blobstores in play, we have issues working out exactly what's wrong during a manual scrub. Make the error handling better:

1. Manual scrub adds the key as context for the failure.
2. Scrub error groups blobstores by content, so that you can see which blobstore is most likely to be wrong.
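
A minimal sketch of the two points above, under stated assumptions: blobstores are identified by a plain `u32`, blob contents are raw bytes, and `None` means the blob is missing from that store. The function names are hypothetical and not taken from the scrub code.

```rust
use std::collections::BTreeMap;

/// Group blobstore ids by the content each one returned for a key.
/// `None` stands for "blob missing in this store".
pub fn group_by_content(
    results: &[(u32, Option<Vec<u8>>)],
) -> BTreeMap<Option<Vec<u8>>, Vec<u32>> {
    let mut groups: BTreeMap<Option<Vec<u8>>, Vec<u32>> = BTreeMap::new();
    for (store_id, content) in results {
        groups.entry(content.clone()).or_default().push(*store_id);
    }
    groups
}

/// Build an error message that carries the key as context and lists the groups,
/// or return `None` when all blobstores agree.
pub fn describe_mismatch(key: &str, results: &[(u32, Option<Vec<u8>>)]) -> Option<String> {
    let groups = group_by_content(results);
    if groups.len() <= 1 {
        return None;
    }
    let mut parts = Vec::new();
    for (content, stores) in &groups {
        // Only the length is shown here; a real message might show a hash instead.
        let what = match content {
            Some(bytes) => format!("{} bytes", bytes.len()),
            None => "missing".to_string(),
        };
        parts.push(format!("{:?} -> {}", stores, what));
    }
    Some(format!("scrub mismatch for key {}: {}", key, parts.join("; ")))
}
```

With three blobstores, the store that ends up alone in its group is the most likely one to be wrong.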

Reviewed By: ahornby, krallin

Differential Revision: D23565906

fbshipit-source-id: a199e9f08c41b8e967d418bc4bc09cb586bbb94b
2020-09-09 07:25:13 -07:00
Harvey Hunt
06941b4fad mononoke: Don't sort bookmark names using SQL
Summary:
Sorting bookmark names can be expensive for the MySQL server. As we
don't rely on the ordering of bookmark names when requesting all bookmarks,
remove the sorting.

I've not modified the `Select.*After` queries as they are used for pagination,
which does rely on the order of bookmark names. Further, any queries for
bookmarks that have a limit other than `std::u64::MAX` will remain sorted.
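
The rule above could look something like this sketch. The table and column names are illustrative assumptions, and `select_bookmarks_query` is a hypothetical helper rather than the actual query code.

```rust
/// Hypothetical query builder: only ask the SQL server to sort when the caller
/// passed a real limit; "fetch everything" (`u64::MAX`) stays unsorted.
pub fn select_bookmarks_query(limit: u64) -> String {
    let mut query =
        String::from("SELECT name, changeset_id FROM bookmarks WHERE repo_id = ?");
    if limit != u64::MAX {
        // A limited query needs a stable order so the same rows come back each time.
        query.push_str(" ORDER BY name ASC LIMIT ");
        query.push_str(&limit.to_string());
    }
    query
}
```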

Reviewed By: ahornby

Differential Revision: D23574741

fbshipit-source-id: 79e07b64bb8bb34229c429bdf885c5144963f140
2020-09-09 07:08:26 -07:00
Lukasz Piatkowski
2b65fabc17 mononoke/integration tests: remove non-existent test-traffic-replay.t from exclusion list (#54)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/54

Reviewed By: ahornby

Differential Revision: D23597167

Pulled By: lukaspiatkowski

fbshipit-source-id: 1bc92ff32384a02ef019778a20c44634addadf25
2020-09-09 07:00:54 -07:00
Stanislau Hlebik
f0d44ef2aa mononoke: remove copy-paste when creating cs args factories
Reviewed By: krallin

Differential Revision: D23596215

fbshipit-source-id: b4f89ac56e033b0c976a001575f5862819f552a4
2020-09-09 05:45:30 -07:00
Lukasz Piatkowski
c9bbf63cab mononoke/integration tests: handle case-sensitive related tests (#49)
Summary:
The test-blobimport.t creates a few files that conflict on a case-insensitive file system, so make them differ by changing the number of underscores in one of the files.

test-pushrebase-block-casefolding.t directly tests a feature of case-sensitive file systems, so it cannot really be tested on MacOS.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/49

Reviewed By: farnz

Differential Revision: D23573165

Pulled By: lukaspiatkowski

fbshipit-source-id: fc16092d307005b6f0c8764c1ce80c81912c603b
2020-09-09 03:53:32 -07:00
Stanislau Hlebik
66fbdf72c7 mononoke: add sampling for redacted accesses
Summary:
Previously we were not logging a redacted access if the previous access was logged
less than MIN_REPORT_TIME_DIFFERENCE_NS ago. That doesn't work well with our
tests.

Let's instead add a sampling tunable.
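
One way such a tunable could behave is sketched below. This is an assumption for illustration only: the commit does not say how sampling is done, and this sketch uses a deterministic counter (log every N-th access) where a real implementation may well sample randomly. `Sampler` and `sampling_rate` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter-based sampler controlled by a tunable rate.
pub struct Sampler {
    sampling_rate: u64,
    seen: AtomicU64,
}

impl Sampler {
    pub fn new(sampling_rate: u64) -> Self {
        Sampler {
            sampling_rate,
            seen: AtomicU64::new(0),
        }
    }

    /// Returns true for every `sampling_rate`-th call, starting with the first.
    /// A rate of 1 logs everything (convenient in tests); a rate of 0 logs nothing.
    pub fn should_log(&self) -> bool {
        if self.sampling_rate == 0 {
            return false;
        }
        let n = self.seen.fetch_add(1, Ordering::Relaxed);
        n % self.sampling_rate == 0
    }
}
```

Unlike a time-based threshold, a rate of 1 makes every access visible, which is what a test needs.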

Reviewed By: krallin

Differential Revision: D23595067

fbshipit-source-id: 47f6152945d9fdc2796fd1e74804e8bcf7f34940
2020-09-09 02:51:41 -07:00
Pavel Aslanov
32e162c197 move function used by mercurial_derived_data into a separate crate
Summary: Move some of the functionality required for Mercurial changeset derivation into a separate crate. This is needed to convert Mercurial changesets to derived data without creating a circular dependency.

Reviewed By: StanislavGlebik

Differential Revision: D23566293

fbshipit-source-id: 9d30b4b3b7d8a922f72551aa5118c43104ef382c
2020-09-09 02:48:09 -07:00
Zeyi (Rice) Fan
26c8020522 explicitly specify features for tokio-util
Summary: This is needed in a later diff that requires the "codec" feature from `tokio-util`.

Reviewed By: dtolnay

Differential Revision: D23575630

fbshipit-source-id: e9cdf11b6ec05e5f2744da6b6efd8cb7bf08b212
2020-09-08 17:53:56 -07:00
Kostia Balytskyi
39d1cd8a47 synced_commit_mapping: add get which returns a vec
Summary:
This method is the future of synced-commit-mapping: there can be multiple query
results, and the business logic should decide whether that is acceptable,
rather than the mapping picking a random one.

In later diffs I will introduce the consumers for this method.

Reviewed By: mitrandir77

Differential Revision: D23574165

fbshipit-source-id: f256f82c9848f54e5096c6e50d42600bfd260081
2020-09-08 13:36:04 -07:00
Kostia Balytskyi
8e2b7754c4 synced_commit_mapping: rename get into get_one
Summary:
Another preparatory step for the actual mapping model fix. This just renames
the `get` method to `get_one` to emphasize its use case and to ease the search later.

At the end of this change, I expect there to be no use cases for `get_one` and expect it to be gone.

Reviewed By: mitrandir77

Differential Revision: D23574116

fbshipit-source-id: f5015329b15f3f08961006607d0f9bf10f499a88
2020-09-08 13:36:04 -07:00
Kostia Balytskyi
688309059b commit_rewriting: extract existing commit_sync_outcome into a file
Summary: This is just preparatory extraction to make further work more convenient.

Reviewed By: mitrandir77

Differential Revision: D23574077

fbshipit-source-id: 352ca8ac62bae4fd8fcb980da05c95ce477a414e
2020-09-08 13:36:04 -07:00
Thomas Orozco
2948993c38 remotefilelog: add killswitch for client certs
Summary:
See D23538897 for context. This adds a killswitch so we can rollout client
certs gradually through dynamicconfig.

Reviewed By: StanislavGlebik

Differential Revision: D23563905

fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
2020-09-08 10:39:07 -07:00
Thomas Orozco
d1c4772da3 remotefilelog: use client certs when connecting to LFS
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFS extension). This has a few upsides:

- It lets us understand who is connecting, which makes debugging easier;
- It lets us enforce ACLs.
- It lets us apply different rate limits to different use cases.

Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to add a killswitch for this nonetheless. I'll reach out to Durham to see if I
can use dynamic config for that.

Also, while I was in there, I cleaned up a few functions that were taking
ownership of things but didn't need it.

Reviewed By: DurhamG

Differential Revision: D23538897

fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
2020-09-08 10:39:07 -07:00
David Tolnay
be0786f14b Prepare for rustfmt 2.0
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).

This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.

 ---

*Why now?* **:** The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.

 ---

Reviewed By: StanislavGlebik

Differential Revision: D23568780

fbshipit-source-id: b4b4a0aa683d236e2fdeb5b96d723ac2d84b9faf
2020-09-08 07:33:16 -07:00
Stanislau Hlebik
bf8a8c4cc9 mononoke: try to fix the test-redaction.t
Summary:
This test fails on sandcastle because the last two lines are not showing up.
I have a hunch that the last two lines just weren't flushed, and this diff
attempts to fix it.

Reviewed By: krallin

Differential Revision: D23570321

fbshipit-source-id: fd7a3315582c313a05e9f46b404e811384bd2a50
2020-09-08 04:29:33 -07:00
Viet Hung Nguyen
065d80b947 mononoke/repo_import: add small change to sleep time
Summary: When we imported a repo (T71717570), we received a network connect error after sending a lot of queries to graphql. I am not sure whether the frequency of the queries caused it, but to be on the safe side, I increased the default sleep time between queries.

Reviewed By: krallin

Differential Revision: D23538886

fbshipit-source-id: 6a84f509e5e19f86880d3f8c6413f2f47e4a469b
2020-09-08 01:14:24 -07:00
Arun Kulshreshtha
8a26c3c960 edenapi_server: add Scuba logging
Summary: Add Scuba logging using `ScubaMiddleware` from `gotham_ext`. Each request will be logged to the Scuba dataset specified by the `--scuba-dataset` flag, as well as optionally to the log file specified by `--scuba-log-file`.

Reviewed By: sfilipco

Differential Revision: D23547668

fbshipit-source-id: e6cd88ad729a40cf45b63538f7481ee098ea12dc
2020-09-07 17:24:45 -07:00
Arun Kulshreshtha
a7a96e55eb lfs_server: tidy up middleware imports
Summary: Import middleware directly from `gotham_ext` rather than relying on reexports in the `middleware` module.

Reviewed By: farnz

Differential Revision: D23547320

fbshipit-source-id: e64a8acff55445a646b0a1b3b1e71cf6606c3d02
2020-09-07 17:24:45 -07:00
Arun Kulshreshtha
83c54b48f8 gotham_ext: move ScubaMiddleware into gotham_ext
Summary:
Move `ScubaMiddleware` out of the LFS server and into `gotham_ext`.

This change required splitting up the `ScubaKey` enum to separate generally useful column names (e.g., HTTP columns that would be applicable to any HTTP service) from LFS-specific columns. `ScubaMiddlewareState` has been modified to accept any type that implements `Into<String>` as a key, and the `ScubaKey` enum has been split up into `HttpScubaKey` (in `gotham_ext`) and `LfsScubaKey` (in `lfs_server`).

The middleware now takes a type parameter to specify a "handler" (implementing the new `ScubaHandler` trait) which allows the application to add application-specific Scuba columns in addition to the default columns. The application-specific columns will be added immediately prior to the sample being logged.
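
The shape of this design can be sketched as follows. The names `ScubaHandler`, `ScubaMiddleware`, and `HttpScubaKey` come from the summary above, but every signature, the `Sample` type, the enum variants, the column strings, and `RepoHandler` are assumptions made up for this illustration; the real `gotham_ext` API may look quite different.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a Scuba sample: column name -> value.
#[derive(Default, Debug)]
pub struct Sample {
    columns: BTreeMap<String, String>,
}

impl Sample {
    /// Accept any key type that converts into a `String`.
    pub fn add(&mut self, key: impl Into<String>, value: impl Into<String>) {
        self.columns.insert(key.into(), value.into());
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        self.columns.get(key).map(|v| v.as_str())
    }
}

/// Generic HTTP columns (variants and column names are illustrative).
pub enum HttpScubaKey {
    HttpMethod,
    HttpStatus,
}

impl From<HttpScubaKey> for String {
    fn from(key: HttpScubaKey) -> String {
        match key {
            HttpScubaKey::HttpMethod => "http_method".to_string(),
            HttpScubaKey::HttpStatus => "http_status".to_string(),
        }
    }
}

/// Hypothetical handler trait: the application adds its own columns.
pub trait ScubaHandler {
    fn add_columns(&self, sample: &mut Sample);
}

/// Middleware parameterized over the application's handler type.
pub struct ScubaMiddleware<H: ScubaHandler> {
    handler: H,
}

impl<H: ScubaHandler> ScubaMiddleware<H> {
    pub fn new(handler: H) -> Self {
        ScubaMiddleware { handler }
    }

    /// Fill in the default columns, then let the handler add its own
    /// just before the sample would be logged.
    pub fn build_sample(&self, method: &str, status: u16) -> Sample {
        let mut sample = Sample::default();
        sample.add(HttpScubaKey::HttpMethod, method);
        sample.add(HttpScubaKey::HttpStatus, status.to_string());
        self.handler.add_columns(&mut sample);
        sample
    }
}

/// Example application-specific handler (hypothetical).
pub struct RepoHandler {
    pub repo: String,
}

impl ScubaHandler for RepoHandler {
    fn add_columns(&self, sample: &mut Sample) {
        sample.add("repo", self.repo.clone());
    }
}
```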

Reviewed By: krallin

Differential Revision: D23458748

fbshipit-source-id: 3e99f3e0b5d3475a4f5ac9eaefade2eeff12c2fa
2020-09-07 17:24:45 -07:00
Mateusz Kwapich
3c10f1b9c5 add a way to query changed directories
Summary:
This diff is more complex than I wished for, as originally I didn't take
directories into account when designing the `commit_compare` method.

Reviewed By: StanislavGlebik

Differential Revision: D23541892

fbshipit-source-id: 0e2b2abf7b3c541529d9881e48a575239374040f
2020-09-07 11:58:31 -07:00
Mateusz Kwapich
49b98e206e add a way to diff the directories between commits
Summary: We need that to replace similar feature in SCMQuery

Reviewed By: StanislavGlebik

Differential Revision: D23541893

fbshipit-source-id: 3dd6357ea834337a81216e24cb132e23b01bc77d
2020-09-07 11:58:31 -07:00
Lukas Piatkowski
fbfb856191 mononoke/integration test: make test-traffic-replay.t private
Reviewed By: StanislavGlebik

Differential Revision: D23565712

fbshipit-source-id: 7cb2d4a6c107ff513522e7343ffd5a8eea25879c
2020-09-07 10:35:39 -07:00
Lukasz Piatkowski
52bc18a728 mononoke/integration tests: fix up integration tests using hooks (#48)
Summary:
Hooks have recently been made public. Remove the tests that were blocked by missing hooks from the exclusion list and fix them up.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/48

Reviewed By: farnz

Differential Revision: D23564883

Pulled By: lukaspiatkowski

fbshipit-source-id: 101dd093eb11003b8a4b4aa4c5ce242d9a9b9462
2020-09-07 08:42:39 -07:00
Jun Wu
89eb6520d2 scmutil: remove meaningfulparents
Summary:
The "meaningfulparents" concept is coupled with rev numbers.
Remove it. This changes default templates to not show parents, and `{parents}`
template to show parents.

Reviewed By: DurhamG

Differential Revision: D23408970

fbshipit-source-id: f1a8060122ee6655d9f64147b35a321af839266e
2020-09-05 15:06:44 -07:00
Lukasz Piatkowski
20b082ee6a mononoke/integration tests: blacklist 2 integration tests on OSS runs (#47)
Summary:
Those are new tests that use functionality not compatible yet with OSS.

Pull Request resolved: https://github.com/facebookexperimental/eden/pull/47

Reviewed By: chadaustin

Differential Revision: D23538921

Pulled By: lukaspiatkowski

fbshipit-source-id: c512a1b2359f9ff772d0e66d2e6a66f91e00f95c
2020-09-04 20:21:56 -07:00
Lukas Piatkowski
12c684afcd mononoke/hooks: make deny_files public
Reviewed By: aslpavel

Differential Revision: D23537799

fbshipit-source-id: 58c9568e30982f682b00faae42bc3a3f3595890f
2020-09-04 12:23:35 -07:00
Thomas Orozco
3ba2c2b429 mononoke/hg_sync: make it work on Mercurial Python 3
Summary:
A few things here:

- The heads must be bytes.
- The arguments to wireproto must be strings (we used to encode / decode them,
  but we shouldn't).
- The bookmark must be a string (otherwise it gets serialized as `"b\"foo\""`
  and then it deserializes to that instead of `foo`).

Reviewed By: StanislavGlebik

Differential Revision: D23499846

fbshipit-source-id: c8a657f24c161080c2d829eb214d17bc1c3d13ef
2020-09-04 11:56:44 -07:00
Thomas Orozco
747b355236 mononoke: make mononoke_hg_sync_job sendunbundlereplaybatch more debuggable
Summary:
Right now we get very little logging out of errors in here, which is making it
difficult to fix it on Py3 (where it currently is broken).

This diff doesn't fix anything, but at the very least, let's make the errors
better so we can make this easier to start debugging.

Reviewed By: ahornby

Differential Revision: D23499369

fbshipit-source-id: 7ee60b3f2a3be13f73b1f72dee062ca80cb8d8d9
2020-09-04 11:56:44 -07:00
Thomas Orozco
c8dd8ae4e3 mononoke: run tests using hg Python 3 as well
Summary:
The motivation for this is to surface potential regressions in hg Python 3 by
testing code paths that are exercised in Mononoke. The primary driver for this
was the set of regressions in the LFS extension that broke uploads, and for which
we have test coverage here in Mononoke.

To do this, I extracted the manifest generation (the manifest is the list of
binaries that the tests know about, which is passed to the hg test runner), and
moved it into its own function, then added a new target for the py3 tests.

Unfortunately, a number of tests are broken in Python 3 currently. We should
fix those. It looks like there are some errors in Mercurial when walking a
manifest with non-UTF-8 files, and the other problem is that the hg sync job is
in fact broken: https://fburl.com/testinfra/545af3p8.

Reviewed By: ahornby

Differential Revision: D23499370

fbshipit-source-id: 762764147f3b57b2493d017fb7e9d562a58d67ba
2020-09-04 11:56:44 -07:00
Stanislau Hlebik
7b323a4fd9 mononoke: add log-only mode in redaction
Summary:
Before redacting something it would be good to check that this file is not
accessed by anything. Having log-only mode would help with that.

Reviewed By: ikostia

Differential Revision: D23503666

fbshipit-source-id: ae492d4e0e6f2da792d36ee42a73f591e632dfa4
2020-09-04 07:37:15 -07:00
Stanislau Hlebik
0740f99f13 mononoke: allow logging censored scuba accesses to file
Summary:
In the next diff I'm going to add log-only mode to redaction, and it would be
good to have a way of testing it (i.e. testing that it actually logs accesses
to bad keys).

In this diff let's use a config option that allows logging censored scuba
accesses to file, and let's update redaction integration test to use it

Reviewed By: ikostia

Differential Revision: D23537797

fbshipit-source-id: 69af2f05b86bdc0ff6145979f211ddd4f43142d2
2020-09-04 07:37:14 -07:00
Thomas Orozco
f1e4f62e2d mononoke/fsnodes: expose FsnodeFile as the LeafId
Summary:
Fsnodes have a lot of data about files, but right now we can't access it
through a Fsnode lookup or a manifest walk, because the LeafId for a Fsnode is
just the content id and the file type.

This is a bit sad, because it means we e.g. cannot dump a manifest with file
sizes (D23471561 (179e4eb80e)).

Just changing the LeafId is easy, but that brings a new problem with Fsnode
derivation.

Indeed, deriving manifests normally expects us to have the "derive leaf"
function produce a LeafId (so we'd want to produce a `FsnodeFile`), but in
Fsnodes, this currently happens in deriving trees instead.

Unfortunately, we cannot easily just move the code that produces `FsnodeFile`
from the tree derivation to the leaf derivation, that is, do:

```
fn check_fsnode_leaf(
    leaf_info: LeafInfo<FsnodeFile, (ContentId, FileType)>,
) -> impl Future<Item = (Option<FsnodeSummary>, FsnodeFile), Error = Error>
```

Indeed, the performance of Fsnode derivation relies on all the leaves for a
given tree being derived together with the tree and its parents in context.

So, we'd need the ability for deriving a new leaf to return something different
from the actual leaf id. This means we want to return a `(ContentId,
FileType)`, even though our `LeafId` is a `FsnodeFile`.

To do this, this diff introduces a new `IntermediateLeafId` type in the
derivation. This represents the type of the leaf that is passed from deriving a
leaf to deriving a tree. We need to be able to turn a real `LeafId` into it,
because sometimes we don't re-derive leaves.

I think we could also refactor some of the code that passes a context here to
just do this through the `IntermediateLeafId`, but I didn't look into this too
much.

So, this diff does that, and uses it in Mononoke Admin so we can print file
sizes.

Reviewed By: StanislavGlebik

Differential Revision: D23497754

fbshipit-source-id: 2fc480be0b1e4d3d261da1d4d3dcd9c7b8501b9b
2020-09-04 06:30:18 -07:00
Mateusz Kwapich
f7be2eef14 tunable scuba sampling
Summary:
This allows us to sample the most popular method logs (`repo_list_hg_manifest` calls make up 90% of the samples in our scuba table) while still having full logging for other queries and errors.

The sampling can be easily disabled via a tunable. In case we get a lot of errors we can also start sampling the error requests with a simple configerator change.

Reviewed By: krallin

Differential Revision: D23507333

fbshipit-source-id: c7e34467d99410ec3de08cce2db275a55394effd
2020-09-04 06:26:35 -07:00
Viet Hung Nguyen
437a0e905b mononoke/repo_import: add deriving data types for multiple repos
Summary: Previously, we only supported deriving data types for the repo we import into. This diff expands on this and now we can do that for multiple repos (e.g. small repos we backsync commits to from large repo we import to).

Reviewed By: StanislavGlebik

Differential Revision: D23499953

fbshipit-source-id: 223209a6a2739eae93082cae4f04e53e0cba0c58
2020-09-04 05:39:21 -07:00
Stanislau Hlebik
11a45b6b60 mononoke: do not pass tasks to find_files_with_given_content_id_blobstore_keys
Summary:
In the next diff I'm going to add log_only mode for redaction.
In this diff I make a small refactoring that makes the next diff simpler:
find_files_with_given_content_id_blobstore_keys doesn't accept tasks anymore,
just content keys.

Reviewed By: aslpavel

Differential Revision: D23535829

fbshipit-source-id: 1dac37f5ea7038fc779ad51192a290fcc23e6556
2020-09-04 05:22:03 -07:00
Lukas Piatkowski
67a71d1f98 mononoke/hooks: make limit_commitsize and limit_filesize public
Reviewed By: aslpavel

Differential Revision: D23502908

fbshipit-source-id: 8b9070cfaa28af7b808d02548c0fb7c5d344550d
2020-09-04 04:23:05 -07:00
Lukas Piatkowski
462cb96cc2 mononoke/hooks: make no_questionable_filenames public
Reviewed By: aslpavel

Differential Revision: D23478259

fbshipit-source-id: 642948c2685690298a71fbe7177c4bd6a6e43f85
2020-09-04 04:23:05 -07:00
Lukas Piatkowski
eebdc0b896 mononoke/metaconfig: sync thrift changes from configerator for HookConfig
Summary: Use the new fields from RawHookConfig in HookConfig

Reviewed By: StanislavGlebik

Differential Revision: D23499766

fbshipit-source-id: 43e9d2dfdcfb0fa0dd4de6310ea0013db1b69474
2020-09-04 02:02:06 -07:00
Stefan Filip
3f0b08e46f segmented_changelog: add version field to IdMap
Summary:
The version is going to be used to seamlessly upgrade the IdMap. We can
generate the IdMap in a variety of ways. Naturally, algorithms for generating
the IdMap may change, so we want a mechanism for updating the shared IdMap.

A generated IdDag is going to require a specific IdMap version. To be more
precise, the IdDag is going to specify which version of IdMap it has to be
interpreted with.
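
A minimal sketch of that versioning idea, assuming a plain integer version and an in-memory map: the real `IdMap` and `IdDag` types are far richer, and `IdMapVersion`, `resolve`, and the field names here are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical version tag for an IdMap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IdMapVersion(pub u64);

/// Simplified IdMap: vertex number -> changeset id, tagged with a version.
pub struct IdMap {
    pub version: IdMapVersion,
    vertex_to_cs_id: HashMap<u64, String>,
}

impl IdMap {
    pub fn new(version: IdMapVersion) -> Self {
        IdMap {
            version,
            vertex_to_cs_id: HashMap::new(),
        }
    }

    pub fn insert(&mut self, vertex: u64, cs_id: &str) {
        self.vertex_to_cs_id.insert(vertex, cs_id.to_string());
    }

    pub fn get(&self, vertex: u64) -> Option<&str> {
        self.vertex_to_cs_id.get(&vertex).map(|s| s.as_str())
    }
}

/// Simplified IdDag: it only records which IdMap version it must be read with.
pub struct IdDag {
    pub idmap_version: IdMapVersion,
}

impl IdDag {
    /// Refuse to interpret vertices through an IdMap of a different version.
    pub fn resolve<'a>(&self, idmap: &'a IdMap, vertex: u64) -> Result<Option<&'a str>, String> {
        if idmap.version != self.idmap_version {
            return Err(format!(
                "IdDag requires IdMap version {}, got {}",
                self.idmap_version.0, idmap.version.0
            ));
        }
        Ok(idmap.get(vertex))
    }
}
```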

Reviewed By: quark-zju

Differential Revision: D23501158

fbshipit-source-id: 370e6d9f87c433645d2a6b3336b139bea456c1a0
2020-09-03 16:33:20 -07:00
Stefan Filip
58a4821fe3 segmented_changelog: add IdMap trait with SqlIdMap implementation
Summary:
Separate the operational bits of the IdMap from the core SegmentedChangelog
requirements.

I debated whether it makes sense to add repo_id to SqlIdMap. Given the current
architecture I don't see a reason not to do it. On the contrary, separating
the two objects felt convoluted.

Reviewed By: quark-zju

Differential Revision: D23501160

fbshipit-source-id: dab076ab65286d625d2b33476569da99c7b733d9
2020-09-03 16:33:20 -07:00
Stefan Filip
f3c353edbc segmented_changelog: change idmap module from file to directory
Summary:
Planning to add a trait for core idmap functionality (that's just translating
cs_id to vertex and back). The current IdMap will then be an implementation of
that trait.

Reviewed By: quark-zju

Differential Revision: D23501159

fbshipit-source-id: 34e3b26744e4b5465cd108cca362c38070317920
2020-09-03 16:33:20 -07:00
Stanislau Hlebik
4947e07cb7 mononoke: asyncify one function in redaction admin subcommand
Summary:
I'm going to change this function soon, so it's nice to asyncify it to make
next diffs simpler and also remove duplicated logic.

Also remove unnecessary `logger` parameter - we can always get logger from CoreContext

Reviewed By: krallin

Differential Revision: D23501634

fbshipit-source-id: 7ad2fc17167e4107481ceb230e0b7cb3e7f2549a
2020-09-03 12:22:24 -07:00
Mateusz Kwapich
20d096f5d5 add thrift metadata support
Summary: This closely replicates EscapeZero work in D23328638 and will allow us to issue requests to SCS using Thrift Fiddle (https://www.internalfb.com/thrift_fiddle).

Reviewed By: EscapeZero

Differential Revision: D23475864

fbshipit-source-id: fb286e3fcd6ea79704fa2e7e1ed9ab5595ff7b81
2020-09-03 12:18:18 -07:00
Arun Kulshreshtha
858a080502 gotham_ext: make StreamBody automatically delay post-request callbacks
Summary: Now that post-request callbacks are available in `gotham_ext`, we can make `StreamBody` use them directly instead of using an LFS-specific wrapper (previously required to access the LFS server's `RequestContext`). This also means that the EdenAPI server will get this behavior for free.

Reviewed By: krallin

Differential Revision: D23402969

fbshipit-source-id: 56ab710473f13e8983b136664af364af6884bd3f
2020-09-03 11:59:32 -07:00
Arun Kulshreshtha
5556a447d1 edenapi_server: use LogMiddleware
Summary: Add `LogMiddleware` to the EdenAPI server, which will print a log message whenever a request is received or has completed.

Reviewed By: DurhamG

Differential Revision: D23299902

fbshipit-source-id: f44ef1b01692f0e4f9b109917fcee89a84ca4208
2020-09-03 11:59:32 -07:00
Arun Kulshreshtha
96a6a3fcfb edenapi_server: use LoadMiddleware
Summary: Use `LoadMiddleware` to track the number of outstanding requests in the server.

Reviewed By: DurhamG

Differential Revision: D23298415

fbshipit-source-id: bdcdb0f657d8deac593d356c87ac0d8d3f39e322
2020-09-03 11:59:32 -07:00
Arun Kulshreshtha
7144363d2c gotham_ext: move LogMiddleware to gotham_ext
Summary: Now that `LogMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: DurhamG

Differential Revision: D23298412

fbshipit-source-id: d5288decba98c3dd4605b9a44e41eba0f47fee37
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
35d292e513 gotham_ext: move LoadMiddleware to gotham_ext
Summary: Now that `LoadMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: DurhamG

Differential Revision: D23298416

fbshipit-source-id: 5d29da492e39beb5621daf0570d9b3e657cbfc04
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
82c451fb9f lfs_server: use PostRequestMiddleware
Summary: This diff removes the post-request callback functionality from the LFS server's `RequestContext` and replaces it with the new `PostRequestMiddleware`. The middleware is directly based on `RequestContext`, so the underlying behavior is essentially the same as before.

Reviewed By: krallin

Differential Revision: D23298413

fbshipit-source-id: 1e58a40f6ce6d526456dbd9ae3a8efc85768bf04
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
3ad7fa8b6f gotham_ext: allow applications to dynamically configure PostRequestMiddleware
Summary: Make `PostRequestMiddleware` generic over a user-provided config struct which can be used to dynamically configure the behavior of post-request callback dispatching. Right now this is only used to support disabling hostname logging, but could be easily extended to cover more uses in the future.

Reviewed By: krallin

Differential Revision: D23495005

fbshipit-source-id: 3d59a8346f449775ec76d03c260d973d04fb90a9
2020-09-03 11:59:31 -07:00
Arun Kulshreshtha
cc0f2e4c40 gotham_ext: add PostRequestMiddleware
Summary: Add new middleware that allows HTTP handlers and other middleware to register callbacks that will be run once the current request completes. This is heavily based on the post-request callback functionality from the LFS server's `RequestContext`. The intention here is to expose this functionality in a manner that's independent of other, application-specific logic.
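
The callback registration described above can be sketched roughly as follows. This is a minimal illustration, not the actual `gotham_ext` API: `RequestInfo`, `PostRequestCallbacks`, `add` and `run` are hypothetical names, and the real middleware stores its state in Gotham's `State` rather than in a standalone struct.

```rust
use std::time::Duration;

/// Hypothetical summary of a finished request, passed to every callback.
pub struct RequestInfo {
    pub duration: Duration,
    pub bytes_sent: u64,
}

/// Hypothetical per-request registry of callbacks to run on completion.
#[derive(Default)]
pub struct PostRequestCallbacks {
    callbacks: Vec<Box<dyn FnOnce(&RequestInfo) + Send + 'static>>,
}

impl PostRequestCallbacks {
    /// Called by handlers or other middleware while the request is in flight.
    pub fn add<F>(&mut self, callback: F)
    where
        F: FnOnce(&RequestInfo) + Send + 'static,
    {
        self.callbacks.push(Box::new(callback));
    }

    /// Called once after the request completes. Consumes the registry so
    /// that each callback runs at most once; returns how many callbacks ran.
    pub fn run(self, info: &RequestInfo) -> usize {
        let count = self.callbacks.len();
        for callback in self.callbacks {
            callback(info);
        }
        count
    }
}
```

Taking `self` by value in `run` is one way to make "run exactly once per request" a compile-time property of the sketch.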

Reviewed By: krallin

Differential Revision: D23298419

fbshipit-source-id: e4b1534b02c35f685ce544de13e331947e187818
2020-09-03 11:59:31 -07:00
Thomas Orozco
d77cf89ead mononoke/admin: clean up unodes subcommand a bit
Summary:
I pattern matched off of this for the previous diff in this stack, and spotted
a bit of clean up that might make sense here:

- Using `.help()` for a subcommand overrides the whole help text. We meant to
  use `.about()` here. I fixed this in some copy-pasted code as well.
- Printing debug output alongside real output makes it harder to select the
  real output. I fixed this by logging debug output to stderr instead.

Reviewed By: StanislavGlebik

Differential Revision: D23471560

fbshipit-source-id: 7900cfe65613c48abd77faad6d6a45a7aa523b36
2020-09-03 09:32:06 -07:00
Thomas Orozco
179e4eb80e mononoke/admin: add a subcommand for dumping paths
Summary:
This adds a subcommand for dumping all the paths in a repository. This is
helpful when you have a Content ID, limited imagination and time on your hands,
and you'd like to turn those into a file path where that Content ID lives.

This uses fsnodes for the traversal because that's O(# directories) as opposed
to O(# files). I had an earlier implementation that used unodes, but that was
really slow.

Reviewed By: StanislavGlebik

Differential Revision: D23471561

fbshipit-source-id: 948bfd20939adf4de0fb1e4b2852ad4d12182f16
2020-09-03 09:32:06 -07:00
Viet Hung Nguyen
7c34b39ec8 mononoke/repo_import: add backsyncing to rewrite file paths, remove backup file
Summary:
add backsyncing to rewrite file paths:
After setting the variables for large repo (D23294833 (d6895d837d)), we try to import the git commits into large repo and rewrite the file paths.
Following this, repo import tool should back-sync the commits into small_repo.

next step: derive all the data types for both small and large repos. Currently, we only derive it for the large repo.

==============
remove backup file:
The backup file was a last-minute addition when trying to import a repo for the first time.
Removed it, because we shouldn't write to external files. Future plan is to include
better process recoverability across the whole tool and not just rewrite file paths functionality.

Reviewed By: StanislavGlebik

Differential Revision: D23452571

fbshipit-source-id: bda39694fa34788218be795319dbbfd014ba85ff
2020-09-03 06:43:08 -07:00
Stanislau Hlebik
a77d9f243a mononoke: parallelize operations in create_commit scs method
Reviewed By: krallin

Differential Revision: D23496535

fbshipit-source-id: 18f88abb9b85d38a93d2aa99c38edcf8190343c3
2020-09-03 04:12:35 -07:00
Lukas Piatkowski
a4af730541 mononoke/hooks: make no_bad_filenames public
Reviewed By: aslpavel

Differential Revision: D23474524

fbshipit-source-id: 5f7826346500b1acc7450791dd1e7806c4e623d6
2020-09-03 02:40:43 -07:00
Lukas Piatkowski
81d9338100 mononoke/hooks: make few generic hooks public
Summary: More hooks will come in next diffs.

Reviewed By: aslpavel

Differential Revision: D23449755

fbshipit-source-id: 451fdb7a759140f2d6df8f3a18493c700fa2b761
2020-09-03 02:40:43 -07:00
Stanislau Hlebik
29bbc0dc15 mononoke: check if content we are about to redact is not reachable
Summary:
That's one of the sev followups. Before redacting a file content let's check if
it exists in "main-bookmark" (which is by default master), and refuse to redact
if it actually exists.

If this check passes (i.e. the content we are about to redact is not reachable
from master) that doesn't mean that we are 100% safe. E.g. this content can be
in an ancestor of master, or in any other repo, or it can be added in the next
commit.

This check is a best-effort check to prevent shooting ourselves in the foot.

Reviewed By: aslpavel

Differential Revision: D23476278

fbshipit-source-id: 5a4cd10964a65b8503ba9a6391f17319f0ce37d8
2020-09-03 01:30:14 -07:00
Stefan Filip
da4c33c67a tests: add commit-location-to-hash integration test
Summary: Exercise location-to-hash functionality in edenapi.

Reviewed By: kulshrax

Differential Revision: D23456214

fbshipit-source-id: 2ab22eb045517a5927c2de502d8cfc9898daecef
2020-09-02 17:20:43 -07:00
Stefan Filip
932450fb15 handlers: update location-to-hash endpoint with count parameter
Summary:
To reduce the size over the wire in cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.

Reviewed By: kulshrax

Differential Revision: D23456216

fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
2020-09-02 17:20:42 -07:00
Stefan Filip
7122cdded7 types: rename Location to CommitLocation
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.

Reviewed By: kulshrax

Differential Revision: D23456221

fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
2020-09-02 17:20:42 -07:00
Stefan Filip
310b3616a6 blobrepo: instantiate segmented changelog as an attribute
Summary:
Segmented Changelog is a component that has multiple components of its own,
each of which can be configured in different ways. It seems that it already is
more complicated than other components in how it is set up and it will probably
evolve to have more knobs (caching comes to mind).

Right now we have 3 ways of instantiating SegmentedChangelog:
- Disabled, all requests return errors
- ReadOnly, requests to unprocessed commits return errors
- OnDemandUpdate, requests trigger commit processing when required
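
The three modes listed above could be modelled along these lines. This is a sketch under simplifying assumptions, not the Mononoke implementation: `Mode`, `LookupError` and `ToyChangelog` are hypothetical names, and commits are reduced to plain `u64` ids.

```rust
use std::collections::HashSet;

/// Hypothetical mirror of the three instantiation modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Disabled,
    ReadOnly,
    OnDemandUpdate,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    /// The changelog is disabled, so every request fails.
    Disabled,
    /// The commit has not been processed and the mode forbids processing it.
    NotProcessed(u64),
}

/// Toy changelog that only tracks which commit ids have been processed.
pub struct ToyChangelog {
    mode: Mode,
    processed: HashSet<u64>,
}

impl ToyChangelog {
    pub fn new(mode: Mode, processed: &[u64]) -> Self {
        Self {
            mode,
            processed: processed.iter().copied().collect(),
        }
    }

    pub fn lookup(&mut self, commit: u64) -> Result<u64, LookupError> {
        match self.mode {
            Mode::Disabled => Err(LookupError::Disabled),
            Mode::ReadOnly if self.processed.contains(&commit) => Ok(commit),
            Mode::ReadOnly => Err(LookupError::NotProcessed(commit)),
            Mode::OnDemandUpdate => {
                // Process the commit on first sight, then answer normally.
                self.processed.insert(commit);
                Ok(commit)
            }
        }
    }
}
```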

Reviewed By: aslpavel

Differential Revision: D23456217

fbshipit-source-id: a6016f05197abbc3722764fa8e9056190a767b36
2020-09-02 17:20:42 -07:00
Stefan Filip
b818a86631 config: add segmented changelog config parsing
Summary:
Parsing is done in the SegmentedChangelogConfig structure which will inform
how to construct the SegmentedChangelog in Mononoke.

Reviewed By: aslpavel

Differential Revision: D23456222

fbshipit-source-id: a7d5d81f4c166909164026e81af57f1c2ea32347
2020-09-02 17:20:42 -07:00
Stefan Filip
e57b1f9265 segmented_changelog: add on-demand updating dag implementation
Summary:
The Segmented Changelog must be built somewhere. One of the simplest deployments
involves the on-demand update of the graph. When a commit that wasn't yet
processed is encountered, we send it for processing along with all of its
ancestors.
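
As a rough illustration of "process the commit along with all of its ancestors", the sketch below assigns sequential ids in an order where every ancestor is numbered before its descendants. `OnDemandIdMap` and `ensure_processed` are hypothetical names, and the `parents` map stands in for whatever actually fetches changesets; none of this is taken from the real crate.

```rust
use std::collections::HashMap;

/// Hypothetical id map: commit hash -> sequential id.
#[derive(Default)]
pub struct OnDemandIdMap {
    ids: HashMap<String, u64>,
}

impl OnDemandIdMap {
    pub fn get(&self, commit: &str) -> Option<u64> {
        self.ids.get(commit).copied()
    }

    /// Make sure `commit` and all of its ancestors have ids, then return
    /// the id of `commit`. Already-processed commits are left untouched.
    pub fn ensure_processed(
        &mut self,
        commit: &str,
        parents: &HashMap<String, Vec<String>>,
    ) -> u64 {
        // Iterative post-order walk: a node is numbered only after all of
        // its parents have been numbered.
        let mut stack = vec![(commit.to_string(), false)];
        while let Some((node, parents_done)) = stack.pop() {
            if self.ids.contains_key(&node) {
                continue;
            }
            if parents_done {
                let next = self.ids.len() as u64;
                self.ids.insert(node, next);
            } else {
                stack.push((node.clone(), true));
                for parent in parents.get(&node).into_iter().flatten() {
                    if !self.ids.contains_key(parent) {
                        stack.push((parent.clone(), false));
                    }
                }
            }
        }
        self.ids[commit]
    }
}
```

A second call for an already-processed commit does no traversal work beyond the initial lookup, which is the property an on-demand deployment relies on.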

At this time not much attention was paid to the distinction of master commit
versus non-master commit. For now the expectation is that only commits from
master will exercise this code path. The current expectation is that clients
will only call location-to-hash using commits from master.
Let me know if there is an easy way to check if a commit is part of master.
Later changes will invest more in handling non-master commits.

Reviewed By: aslpavel

Differential Revision: D23456218

fbshipit-source-id: 28c70f589cdd13d08b83928c1968372b758c81ad
2020-09-02 17:20:42 -07:00
Stefan Filip
d50e09a41d segmented_changelog: add SegmentedChangelogBuilder
Summary:
This builder implements SqlConstruct and SqlConstructFromMetadataDatabaseConfig
to make handling the Sql connection for IdMap consistent with what happens in
Mononoke in general.

Reviewed By: aslpavel

Differential Revision: D23456219

fbshipit-source-id: 6998afbbfaf1e0690a40be6e706aca1a3b47829f
2020-09-02 17:20:42 -07:00
Stefan Filip
66706d77c5 segmented_changelog: add SegmentedChangelog trait
Summary:
The trait provides two methods for location to hash translation. The first
returns a single hash and is existing functionality. The second returns a
list of hashes and represents new functionality. This diff also adds this
functionality to the Dag structure which is currently the only real
implementation for SegmentedChangelog.
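
A trait with the two translation methods might look roughly like this. The sketch is illustrative only: `Location`, `LocationToHash` and `LinearHistory` are hypothetical names, the method signatures are assumptions, and history is simplified to a single linear chain stored oldest first.

```rust
/// Hypothetical location: `distance` parent steps back from `descendant`.
pub struct Location {
    pub descendant: String,
    pub distance: u64,
}

pub trait LocationToHash {
    /// Single-hash lookup, expressed in terms of the list variant.
    fn location_to_hash(&self, location: &Location) -> Option<String> {
        self.location_to_many_hashes(location, 1)
            .and_then(|mut hashes| hashes.pop())
    }

    /// Return `count` hashes: the commit at `location`, then its parents.
    fn location_to_many_hashes(&self, location: &Location, count: u64) -> Option<Vec<String>>;
}

/// Toy implementation over a linear history, oldest commit first.
pub struct LinearHistory(pub Vec<String>);

impl LocationToHash for LinearHistory {
    fn location_to_many_hashes(&self, location: &Location, count: u64) -> Option<Vec<String>> {
        let head = self.0.iter().position(|h| *h == location.descendant)?;
        // Index of the commit the location points at.
        let start = head.checked_sub(location.distance as usize)?;
        // Index of the oldest commit to return; None if the chain is too short.
        let last = (start + 1).checked_sub(count as usize)?;
        Some(self.0[last..start + 1].iter().rev().cloned().collect())
    }
}
```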

Reviewed By: aslpavel

Differential Revision: D23456215

fbshipit-source-id: 0c2ca91672cf23129342c585f98446c0ebbdf7ef
2020-09-02 17:20:41 -07:00
Stefan Filip
10b233f180 blobrepo: move ChangesetFetcher to attributes
Summary:
I am planning to add Segmented Changelog to attributes.

I am writing an integration test for an EdenApi endpoint that depends on
Segmented Changelog and I would like to set it up to update on demand. When a
request comes in for a commit that we haven't parsed for Segmented Changelog we
want to update the structure on demand. This means that we probably need to
fetch commits. This means that we want to pass the ChangesetFetcher to Segmented
Changelog when it is built. Since Segmented Changelog fits well as an attribute
we want the ChangesetFetcher as an attribute.

I wonder how much thought has been given to attributes behaving as a dependency
injector in the `guice` sense.

Reviewed By: aslpavel

Differential Revision: D23428201

fbshipit-source-id: 7003c018ba806fd657dd8f071e0e83d35058b10f
2020-09-02 17:20:41 -07:00
Kostia Balytskyi
6e8cbd31b1 megarepotool: add gradual-merge-progress subcommand
Summary:
This is to be able to automatically report progress: how many merges have been
done already.

Note: this intentionally uses the same logic as regular `gradual-merge`, so that we always report correct numbers.

Reviewed By: StanislavGlebik

Differential Revision: D23478448

fbshipit-source-id: 3deb081ab99ad34dbdac1057682096b8faebca41
2020-09-02 12:18:31 -07:00
Thomas Orozco
b8e197fdb4 mononoke/lfs_server: allow enabling rate limits probabilistically
Summary:
If we exceed a rate limit, we probably don't want to just drop 100% of traffic.
This would create a sawtooth pattern where we allow a bunch of traffic, update
our counters, drop a bunch of traffic, update our counters again, allow a bunch
of traffic, etc.

To fix this, let's make limits probabilistic. This lets us say "beyond X GB/s,
drop Y% of traffic", which is closer to a sane rate limit.
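
The "beyond X, drop Y% of traffic" rule can be sketched as below. The names `ProbabilisticLimit` and `should_drop` are hypothetical, and the random sample is passed in by the caller so that the example stays dependency-free and deterministic; a real server would draw it from a random number generator.

```rust
/// Hypothetical limit: once the observed rate exceeds `threshold`,
/// drop roughly `drop_probability` (0.0 to 1.0) of requests.
pub struct ProbabilisticLimit {
    pub threshold: f64,
    pub drop_probability: f64,
}

impl ProbabilisticLimit {
    /// `roll` is assumed to be sampled uniformly from [0, 1) per request.
    pub fn should_drop(&self, observed_rate: f64, roll: f64) -> bool {
        observed_rate > self.threshold && roll < self.drop_probability
    }
}
```

With `drop_probability` below 1.0, traffic above the threshold is thinned rather than cut off entirely, which is what avoids the sawtooth pattern.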

It might also make sense to eventually change this to use ratelim. Initially,
we didn't do this because we needed our rate limiting decisions to be local to
a single host (because different hosts served different traffic), but now that
we spread the load for popular blobs across the whole tier, we should be able
to just delegate to ratelim.

For now, however, let's finish this bit of functionality so we can turn it
on.

The corresponding Configerator change is here: D23472683

Reviewed By: aslpavel

Differential Revision: D23472945

fbshipit-source-id: f7d985fded3cdbbcea3bc8cef405224ff5426a25
2020-09-02 11:02:18 -07:00
Stanislau Hlebik
cdf96a20dd mononoke: asyncify redaction_add
Summary: Will change it in the next diff, so let's asyncify it now.

Reviewed By: aslpavel

Differential Revision: D23475332

fbshipit-source-id: f25fb7dc16f99cb140df9374f435e071401c2b90
2020-09-02 09:28:48 -07:00
Alex Hornby
b22599c500 mononoke: memo the hash values of interned paths in the walker
Summary: Memo the hash values of interned paths in the walker. The interner calls the hash function inside a lock that gets heavily contended, so this reduces the time the lock is held.
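
One way to memoize a hash so that it is not recomputed under a contended lock is to store it next to the value and replay it from the `Hash` impl. The wrapper below is a sketch of that idea with a hypothetical name (`MemoHashed`); it is not the walker's actual type.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper that computes the hash of `value` once, up front.
pub struct MemoHashed<T> {
    value: T,
    hash: u64,
}

impl<T: Hash> MemoHashed<T> {
    pub fn new(value: T) -> Self {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        Self {
            hash: hasher.finish(),
            value,
        }
    }

    pub fn value(&self) -> &T {
        &self.value
    }

    pub fn memoized_hash(&self) -> u64 {
        self.hash
    }
}

impl<T> Hash for MemoHashed<T> {
    /// Feed only the precomputed hash, so hashing costs the same
    /// regardless of how long the wrapped path is.
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

impl<T: PartialEq> PartialEq for MemoHashed<T> {
    fn eq(&self, other: &Self) -> bool {
        // Cheap hash comparison first, full comparison only on a match.
        self.hash == other.hash && self.value == other.value
    }
}

impl<T: Eq> Eq for MemoHashed<T> {}
```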

Reviewed By: farnz

Differential Revision: D23075260

fbshipit-source-id: 3ee50e3ce56106eadd17dc7d737ba95282640051
2020-09-02 05:52:33 -07:00
Alex Hornby
46cc110012 mononoke: switch walker from arc-intern to internment
Summary: Switch the walker from arc-intern::ArcIntern to internment::ArcIntern as internment does not need to acquire its map's locks on every drop.

Reviewed By: farnz

Differential Revision: D23075265

fbshipit-source-id: 6dd241aed850ec0fd3c8a4e68dda06053ec0b424
2020-09-02 05:52:33 -07:00
Kostia Balytskyi
d49406d847 repo_client: get rid of unneeded perf counters
Summary:
These two perf counters proved to be not very convenient to evaluate the
volume of undesired file fetches. Let's get rid of them. Specifically, they are
not convenient, because they accumulate values and it's hard to aggregate over
them.

Note that I don't do the same for tree fetches, as there's no better way of
estimating those now.

Reviewed By: mitrandir77

Differential Revision: D23452913

fbshipit-source-id: 08f8dd25eece495f986dc912a302ab3109662478
2020-09-02 05:02:46 -07:00
Kostia Balytskyi
e7ddc6cc13 undesired fetches: regex-based reporting
Summary:
We want to be able to report on more than just one prefix. Instead, let's add regex-based reporting. To make deployment easier, let's keep both options for now and later just remove the prefix-based one.

Note: this diff also changes how a situation with absent `undesired_path_prefix_to_log` is treated. Previously, if `undesired_path_prefix_to_log` is absent, but `"undesired_path_repo_name_to_log": "fbsource"`, it would report every path. Now it won't report any, which I think is a saner behavior. If we do ever want to report every path, we can just add `.*` as a regex.

Reviewed By: StanislavGlebik

Differential Revision: D23447800

fbshipit-source-id: 059109b44256f5703843625b7ab725a243a13056
2020-09-01 12:01:00 -07:00
Viet Hung Nguyen
2c1d4a49ad mononoke/repo_import: change logic of file paths rewriting with multiple movers
Summary:
This diff modifies how we rewrite file paths when we import into a repo by allowing the tool to apply multiple movers.

Motivation:
When we try to import into a small repo that pushredirects to a large repo, we have decided to import into the large repo first, then backsync to the small repo. To do that, we have to set a couple of flags related to importing into the large repo (see: D23294833 (d6895d837d)): bookmarks and import destination path.  Previously, we fixed the destination path in large repo by applying the small_to_large repo syncer's mover on the destination path in small repo. e.g:
if small_to_large repo syncer mover = {
default_action = prepend(**large_dir**)
map = [...]},
then **destination_path** in small repo becomes **large_dir/destination_path** in large repo.
After this, we prepended the imported files with the new prefix with another mover: prepend(**large_dir/dest_path**)
a -> large_dir/dest_path/a
Consequently, all directories and files under **destination_path** would get imported under **large_dir/destination_path** in large repo with this logic.
However, it's possible that with push-redirections, some directories would get remapped to a different place in large repo, e.g.
small_to_large syncer mover = {
default_action = prepend(**large_dir**)
map = [
dest_path/b -> random_dir/b
]},
but with the current repo_import implementation dest_path/b would get prepended to large_dir/dest_path/b.
To avoid this, we apply multiple movers on the imported files. e.g.
1. we prepend all files with dest_path:
    mover = {
    default_action: prepend(**dest_path**)
    map={}} =>
    a -> dest_path/a
    b -> dest_path/b
2. we remap the files using the small_to_large repo syncer mover:
    mover = {
 default_action: prepend(**large_dir**)
 map =
 {dest_path/b -> random_dir/b}} =>
   dest_path/a -> large_dir/dest_path/a
   dest_path/b -> random_dir/b
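
The two-step rewrite above can be sketched as a chain of path-mapping closures. This is an illustration only: `Mover` here is a simplified `&str -> String` function, and `prepend`, `remap_or` and `apply_movers` are hypothetical helpers rather than the tool's real types.

```rust
/// Simplified mover: maps one path to another.
pub type Mover = Box<dyn Fn(&str) -> String>;

/// Mover whose only action is to prepend `prefix` to every path.
pub fn prepend(prefix: &str) -> Mover {
    let prefix = prefix.to_string();
    Box::new(move |path: &str| format!("{}/{}", prefix, path))
}

/// Mover with an explicit prefix map and a default action for
/// paths that match none of the map entries.
pub fn remap_or(map: Vec<(String, String)>, default_action: Mover) -> Mover {
    Box::new(move |path: &str| {
        for (from, to) in &map {
            if path == from.as_str() {
                return to.clone();
            }
            if let Some(rest) = path.strip_prefix(from.as_str()) {
                // Only match on whole path elements, not partial names.
                if rest.starts_with('/') {
                    return format!("{}{}", to, rest);
                }
            }
        }
        default_action(path)
    })
}

/// Apply each mover in order, feeding the output of one into the next.
pub fn apply_movers(movers: &[Mover], path: &str) -> String {
    movers
        .iter()
        .fold(path.to_string(), |current, mover| mover(current.as_str()))
}
```

Chaining `prepend("dest_path")` with a `remap_or` built from the small_to_large map reproduces the example: `a` ends up at `large_dir/dest_path/a`, while `b` ends up at `random_dir/b`.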

Reviewed By: StanislavGlebik

Differential Revision: D23371244

fbshipit-source-id: 0bf4193b24d73c79ed00dfb38e2b0538388d1c0f
2020-09-01 09:26:07 -07:00
Pavel Aslanov
fffcf5b966 utility to keep streaming clone data warm
Summary: This is streaming clone warmup binary as per https://fb.quip.com/hfuBAdYnzr9M

Reviewed By: StanislavGlebik

Differential Revision: D23347029

fbshipit-source-id: f187a2f3529a7eae5998bab199228bfbe6057e6e
2020-09-01 07:13:33 -07:00
David Tolnay
75c2118e01 Remove crate_root from Rust dependency info
Reviewed By: danobi

Differential Revision: D23430948

fbshipit-source-id: c4b374021325fc247121ceecd0e82a0291aa75d6
2020-08-31 14:43:24 -07:00
Arun Kulshreshtha
767570d298 lfs_server: remove PerfCounters from post-request callback signature
Summary:
`PerfCounters` was the only application-specific type exposed as a parameter to the post-request callbacks, and it was only being used in one place. To facilitate making the post-request callback functionality more general, this diff makes the callback in question capture the `CoreContext` in its environment, thereby giving it access to the `PerfCounters` without requiring it to be passed as an argument.

This should not change the behavior since regardless of how the callback obtains a reference, it will still refer to the same underlying `PerfCounters` from the request's `CoreContext`.

Reviewed By: DurhamG

Differential Revision: D23298417

fbshipit-source-id: 898f14e5b35b827e98eaf1731db436261baa43bb
2020-08-27 14:15:25 -07:00
Durham Goode
fe56f44ca0 treemanifest: prevent fetching nullid
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.

Reviewed By: sfilipco

Differential Revision: D23332359

fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
2020-08-27 09:59:40 -07:00
Viet Hung Nguyen
d6895d837d mononoke/repo_import: generate repo import settings for push-redirected repo
Summary:
Once we discover that the (small) repo we import into push-redirects (D23158826 (d3f3cffe13)) to a large repo,
we want to import into the large repo first, then backsync into the small one (see previous diff summary).
The aim of this diff is to setup the variables (e.g. bookmarks) needed for importing into
the large repo first before backsyncing the commits into the small repo.

Next step: add functionalities to control how we backsync from large repo to the small repo

Reviewed By: StanislavGlebik

Differential Revision: D23294833

fbshipit-source-id: 019d84498fae4772051520754991cb59ea33dbf4
2020-08-27 02:38:26 -07:00
Stefan Filip
902bdfd46a tests: set --noproxy localhost for all sslcurl calls
Summary:
Without the `--noproxy localhost` flag curl will obey the `https_proxy` env
variable but will not respect the `no_proxy` env variable or `curlrc`.
This means that tests running in a shell with `https_proxy` will likely fail.
The failures may vary in aspect based on what logic is running at the time.

Reviewed By: kulshrax

Differential Revision: D23360744

fbshipit-source-id: 0383a141e848bd2257438697e699f727d79dd5d2
2020-08-26 18:40:52 -07:00
Arun Kulshreshtha
9f68c673f3 gotham_ext: move TimerMiddleware into gotham_ext
Summary: Now that `TimerMiddleware` no longer depends on `RequestContext`, it can be moved into `gotham_ext`.

Reviewed By: farnz

Differential Revision: D23298414

fbshipit-source-id: 058cb67c9294b28ec7aec03a45da9588e97facc5
2020-08-26 16:04:31 -07:00
Arun Kulshreshtha
825016043f lfs_server: decouple TimerMiddleware from RequestContext
Summary: Previously, the LFS server's `TimerMiddleware` needed to be used in conjunction with `RequestContext`, as its purpose was to simply call a method on the `RequestContext` to record the elapsed time. This diff moves tracking of the elapsed time into `TimerMiddleware` itself (via Gotham's `State`), allowing the middleware to be used on its own.

Reviewed By: farnz

Differential Revision: D23298418

fbshipit-source-id: 8077d40edec0936d95317ac11d86bbcd33a3bf04
2020-08-26 16:04:31 -07:00
Stanislau Hlebik
7cfbf99de4 mononoke: add admin command to find rejected blames
Summary:
We might need to rebackfill blame for configerator (see
https://fburl.com/hfylxmag). It's good to have a command that shows how many
files with rejected blame we have.

Reviewed By: farnz

Differential Revision: D23267648

fbshipit-source-id: 33e658b53391285461890bda3a94b391e6063c12
2020-08-26 11:54:08 -07:00
Arun Kulshreshtha
c08f67f004 gotham_ext: move middleware.rs to middleware/mod.rs
Summary: Move `middleware.rs` to `middleware/mod.rs` for consistency with the prevailing module file structure in the Mononoke codebase.

Reviewed By: sfilipco

Differential Revision: D23298420

fbshipit-source-id: 4f88d046a2c6ca1be2e3e315c9eea17845c6b8b3
2020-08-25 17:33:42 -07:00
Arun Kulshreshtha
374eb7d1bf edenapi_server: remove extract_identities function
Summary: This function used to be longer before AclChecker was replaced with PermissionsChecker. Now the function is a one-liner, so it doesn't make sense to keep it as a separate function.

Reviewed By: sfilipco

Differential Revision: D23304899

fbshipit-source-id: 23e8c4b2334cdbff21ca336aecedf6ba6c466f99
2020-08-25 17:18:49 -07:00
Mark Thomas
dee916ec4c bookmarks_movement: handle services with all or no permitted paths
Summary:
If a service is configured with no permitted paths, ensure we deny any writes
that might affect any path.  This is not hugely useful, and probably means a
configuration error, but it's the safe choice.

In a similar vein, if a service is permitted to modify any path, there's not
much point in checking all the commits, so skip the path checks to save some
time.

Reviewed By: StanislavGlebik

Differential Revision: D23316392

fbshipit-source-id: 3d9bf034ce496540ddc4468b7128657e446059c6
2020-08-25 09:14:09 -07:00
Mark Thomas
f15976637e bookmarks_movement: restrict bookmarks that are marked allow_only_external_sync
Reviewed By: StanislavGlebik

Differential Revision: D23294907

fbshipit-source-id: ed89e5fd841e7d516b5d259c1f5de4e9f8f40ee3
2020-08-25 09:14:09 -07:00
Mark Thomas
eed7df1c52 bookmarks_movement: add integration test for bookmark modifications
Summary:
This tests creating, moving and deleting bookmarks using the source control
service, making sure that hooks and service write restrictions are applied
appropriately.

Reviewed By: StanislavGlebik

Differential Revision: D23287999

fbshipit-source-id: bd7e66ec3668400a617f496611e4f24f33f8083e
2020-08-25 09:14:09 -07:00
Mark Thomas
f474c09d90 scs_server: implement repo_create_bookmark and repo_delete_bookmark
Summary: Implement these new thrift methods by calling the corresponding mononoke_api method.

Reviewed By: StanislavGlebik

Differential Revision: D23288002

fbshipit-source-id: 2abf1144fe524f695984a7aa472308b8bf067d45
2020-08-25 09:14:09 -07:00
Mark Thomas
4747346e82 mononoke_api: add create and delete bookmark methods
Summary: Add methods to create and delete bookmarks.

Reviewed By: StanislavGlebik

Differential Revision: D23288003

fbshipit-source-id: 5fca60254f00966478270e1a4447cc6a1b5a438e
2020-08-25 09:14:09 -07:00
Mark Thomas
c3070381b3 bookmarks_movement: implement service write path restrictions
Summary:
Use `PrefixTrie` to ensure that all service writes are to paths that are permitted
for the service.

By default, no paths are permitted.  The service can be configured to allow all
paths by configuring the empty path as a permitted prefix.

Reviewed By: StanislavGlebik

Differential Revision: D23287997

fbshipit-source-id: 2b7a0df655084385f73551602d6107411d6aad2f
2020-08-25 09:14:09 -07:00
Mark Thomas
d7dcd5c4c3 mononoke_types: add PrefixTrie for testing path prefixes
Summary:
Add `PrefixTrie`, which is a collection of path prefixes for testing against.

The trie is initially populated with a set of path prefixes.  Once populated,
it can be tested against using other paths.  These tests will return true if
the trie contains a prefix of that path.
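
A prefix trie of this kind can be sketched as follows. This is a minimal stand-in, not the `mononoke_types` implementation: paths are plain `/`-separated strings, and the method names `add` and `contains_prefix_of` are assumptions.

```rust
use std::collections::HashMap;

/// Toy trie keyed by path elements.
#[derive(Default)]
pub struct PrefixTrie {
    /// True if the path ending at this node was added as a prefix.
    is_prefix: bool,
    children: HashMap<String, PrefixTrie>,
}

impl PrefixTrie {
    pub fn new() -> Self {
        Self::default()
    }

    /// Add a path prefix. The empty string marks the root, which
    /// makes the trie match every path.
    pub fn add(&mut self, prefix: &str) {
        let mut node = self;
        for element in prefix.split('/').filter(|e| !e.is_empty()) {
            node = node.children.entry(element.to_string()).or_default();
        }
        node.is_prefix = true;
    }

    /// Returns true if any added prefix is a prefix of `path`.
    pub fn contains_prefix_of(&self, path: &str) -> bool {
        let mut node = self;
        if node.is_prefix {
            return true;
        }
        for element in path.split('/').filter(|e| !e.is_empty()) {
            match node.children.get(element) {
                Some(child) => node = child,
                None => return false,
            }
            if node.is_prefix {
                return true;
            }
        }
        false
    }
}
```

Matching on whole path elements means `a/b` is a prefix of `a/b/c.txt` but not of `a/bc`.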

Reviewed By: StanislavGlebik

Differential Revision: D23288127

fbshipit-source-id: 6096a9abc8e3a1bf5a8309123a46d321d9795f77
2020-08-25 09:14:08 -07:00
Mark Thomas
a11e0052ac bookmarks_movement: implement service write bookmark restrictions
Summary:
Move handling of service write bookmark restrictions into the `bookmarks_movement` crate.

This moves `check_bookmark_modification_permitted` from `mononoke_api` onto
`SourceControlServiceParams`, where it can be called from `bookmarks_movement`.

Reviewed By: StanislavGlebik

Differential Revision: D23288000

fbshipit-source-id: e346231b183ce1533ab03130fd2ddab709176fcd
2020-08-25 09:14:08 -07:00
Mark Thomas
111bec050d bookmarks_movement: move hook running to a restrictions enum
Summary:
Bookmark movement for service write will use different restrictions than hooks.
Move hook running to be controlled by an enum in preparation for adding service
write restrictions.

Reviewed By: StanislavGlebik

Differential Revision: D23287998

fbshipit-source-id: 30670d4d6666c341885b57a3f41246e52db541a2
2020-08-25 09:14:08 -07:00
Mark Thomas
0606b32fbe mononoke_api: use bookmarks_movement for repo_move_bookmark
Summary: Use bookmarks_movement to implement the bookmark move in repo_move_bookmark.

Reviewed By: StanislavGlebik

Differential Revision: D23222562

fbshipit-source-id: 31249411d9521823f90248f459eb34ed4e2faea5
2020-08-25 09:14:08 -07:00
Mark Thomas
bd24c15579 repo_client: fix fast-forward failure falsehood
Summary:
The error message for fast-forward failure is wrong.  The correct way to allow
non-fast-forward moves is with the NON_FAST_FORWARD pushvar.

Reviewed By: StanislavGlebik

Differential Revision: D23243542

fbshipit-source-id: 554cdee078cd712f17441bd10bd7968b0674bbfe
2020-08-25 09:14:08 -07:00
Mark Thomas
61d45865de bookmarks_movement: prepare for running hooks on additional changesets
Summary:
When bookmarks are moved or created, work out what additional changesets
should have the hooks run on them.  This may apply to plain pushes,
force pushrebases, or bookmark only pushrebases.

At first, this will run in logging-only mode where we will count how many
changesets would have hooks run on them (up to a tunable limit).  We can
enable running of hooks with a tunable killswitch later on.

Reviewed By: StanislavGlebik

Differential Revision: D23194240

fbshipit-source-id: 8031fdc1634168308c7fe2ad3c22ae4389a04711
2020-08-25 09:14:08 -07:00
Mark Thomas
a2bbb7e259 metaconfig_parser: parse hooks_skip_ancestors_of
Differential Revision: D23222560

fbshipit-source-id: 928f35be98682298f2891fefe82c3ed4f6e63097
2020-08-25 09:14:08 -07:00
Mark Thomas
889e84f8d5 bookmarks_movement: move hook running into bookmarks_movement
Summary:
Move the running of hooks from `repo_client` into `bookmarks_movement`.

For pushrebase and plain push we still only run hooks on the new commits the client has sent.
Bookmark-only pushrebases, or moves where some commits were already known, do not run
the hooks on the omitted changesets.  That will be addressed next.

The push-redirector currently runs hooks in the large repo.  Since hook running has now been moved
to later on, they will automatically be run on the large repo, and instead the push-redirector runs them on
the small repo, to ensure they are run on both.

There's some additional complication with translating hook rejections in the push-redirector.  Since a
bookmark-only push can result in hook rejections for commits that are not translated, we fall back to
using the large-repo commit hash in those scenarios.

Reviewed By: StanislavGlebik

Differential Revision: D23077551

fbshipit-source-id: 07f66a96eaca4df08fc534e335e6d9f6b028730d
2020-08-25 09:14:07 -07:00
Mark Thomas
8790a793b9 mononoke_api: add HookManager
Summary: We will shortly need a `HookManager` in the write methods of the source control service.  Add one to `mononoke_api::Repo`

Reviewed By: StanislavGlebik

Differential Revision: D23077552

fbshipit-source-id: e1eed3661fe26a839e50ac4d884f4fadf793dbbb
2020-08-25 09:14:07 -07:00
Mateusz Kwapich
37192dc0e1 add a way to exclude a commit and its ancestor from commit history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999792

fbshipit-source-id: 56e5ec68469cb9c154a5c3045ded969253270b94
2020-08-25 03:48:49 -07:00
Mateusz Kwapich
42cc5431a4 add a way to exclude a commit and its ancestor from commit history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999793

fbshipit-source-id: 94e53adf5458e0bc1ebceffb3b548b7fc021218a
2020-08-25 03:48:49 -07:00
Mateusz Kwapich
1b00df7887 add a way to exclude a commit and its ancestor from path history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999141

fbshipit-source-id: e2e4177e56db85f65930b67a9e927a5c93b652df
2020-08-24 13:03:05 -07:00
Mateusz Kwapich
a3f8760fbc add a way to exclude a commit and its ancestor from path history
Summary: We need this functionality for scmquery replacement.

Reviewed By: krallin

Differential Revision: D22999142

fbshipit-source-id: 04cea361ea6270626e7ff77255e3dc75875ece97
2020-08-24 13:03:04 -07:00
Mateusz Kwapich
e7daab0dfb change the path history options to struct
Summary:
Rust doesn't have named arguments, and with positional arguments it's hard to
keep track of all of them if there are many. I'm planning to add one more, so
let's switch to a struct.
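
The options-struct pattern in general looks something like the sketch below. The struct name, its fields and the `path_history` function are all hypothetical and chosen only to illustrate the pattern; they are not the options this diff introduces.

```rust
/// Hypothetical options struct replacing a list of positional arguments.
#[derive(Debug, Clone, Default)]
pub struct PathHistoryOptions {
    /// Only return entries at or before this timestamp.
    pub until_timestamp: Option<i64>,
    /// Return at most this many entries.
    pub limit: Option<usize>,
}

/// Toy history query over a list of commit timestamps.
pub fn path_history(timestamps: &[i64], opts: &PathHistoryOptions) -> Vec<i64> {
    timestamps
        .iter()
        .copied()
        .filter(|ts| opts.until_timestamp.map_or(true, |until| *ts <= until))
        .take(opts.limit.unwrap_or(usize::MAX))
        .collect()
}
```

Callers name only the fields they care about and fill the rest with `..Default::default()`, so adding another option later does not touch existing call sites.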

Reviewed By: krallin

Differential Revision: D22999143

fbshipit-source-id: 54dade05f860b41d18bebb52317586015a893919
2020-08-24 13:03:04 -07:00
Egor Tkachenko
7fd2f22cc0 Fix bug with zero hash manifest
Summary:
If the imported commit has a manifest id with all zeros (empty commit), the Blobimport job can't find it in the blobstore and returns error D23266254.
Add an early return when the manifest_id is NULL_HASH.

Reviewed By: StanislavGlebik

Differential Revision: D23266254

fbshipit-source-id: b8a3c47edfdfdc9d8cc8ea032fb96e27a04ef911
2020-08-24 07:34:29 -07:00
Viet Hung Nguyen
a31922f40d mononoke/repo_import: removed callsign command argument for repo_import
Summary:
Related commits: D23214677 (dcb565409d), D23213192

In the previous commits we added phabricator callsigns to the repo configs.
Since we can extract the callsigns from them, we don't need the callsign
flag for repo_import tool. This diff removes the flag and uses the config variable.

Reviewed By: StanislavGlebik

Differential Revision: D23240398

fbshipit-source-id: d8b853d37e21be97af42e9f50658b9f471f8fc48
2020-08-21 13:00:45 -07:00
Mark Thomas
ae96aceb4a mononoke_api: remove a use of old futures
Summary: The `map_err` call can be done with the new future from `compat()`.

Reviewed By: StanislavGlebik

Differential Revision: D23239251

fbshipit-source-id: c80609ae0a975bc54253784e002a07a048651aa3
2020-08-21 13:00:45 -07:00
Mark Thomas
2a747529ad mononoke_api: add tests for resolve_bookmark and list_bookmarks
Summary:
Add tests for basic functionality of `resolve_bookmark` and `list_bookmarks`,
ensuring that they correctly go through the warm bookmarks cache.

`list_bookmarks` was still using old-style streams, so upgrade it to new streams.

Differential Revision: D23239250

fbshipit-source-id: f78abae2d382263be76c34f1488249677134a74d
2020-08-21 13:00:45 -07:00
Mark Thomas
160008e35a mononoke_api: resolve_bookmark should still check the db on cache miss
Summary:
If the warm bookmarks cache doesn't contain the bookmark we are looking for,
this might just be because it's a scratch bookmark, which aren't included in
that cache.

Always request the bookmark from the backing db if the cache misses.
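
The lookup order can be sketched as follows. The `Bookmarks` struct and its two maps are stand-ins invented for this example (the real cache and database types are not shown in this log), and changeset ids are reduced to `u64`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the two bookmark sources.
pub struct Bookmarks {
    /// Warm cache: assumed to hold only non-scratch bookmarks.
    pub warm_cache: HashMap<String, u64>,
    /// Backing database: assumed to hold every bookmark, scratch included.
    pub db: HashMap<String, u64>,
}

impl Bookmarks {
    /// Tries the cache first and falls back to the database on a miss.
    pub fn resolve_bookmark(&self, name: &str) -> Option<u64> {
        self.warm_cache
            .get(name)
            .or_else(|| self.db.get(name))
            .copied()
    }
}
```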

Reviewed By: StanislavGlebik

Differential Revision: D23238009

fbshipit-source-id: c8843f1974ba14f148e30ba78a38eb710e7383b6
2020-08-21 13:00:45 -07:00
Stanislau Hlebik
1c6c8663b2 mononoke: try to print warning about expensive getbundle earlier
Summary:
We already had logic that prints a warning if we are about to run an expensive
getbundle. However, this logic prints the warning after we've already fetched
1M commits, so the user has to wait a long time to get this message.

However in some cases we can give this warning very quickly. For example, if
the lowest "heads" generation number is >1M commits away from highest "common"
generation number, then we can print the warning right away.
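
A rough sketch of that quick check, using plain `u64` generation numbers. The function name, the constant, and the treatment of empty inputs are assumptions made for this example.

```rust
/// Distance (in generations) beyond which a getbundle is treated as expensive.
pub const EXPENSIVE_DISTANCE: u64 = 1_000_000;

/// Returns true when the lowest "heads" generation is more than
/// `EXPENSIVE_DISTANCE` above the highest "common" generation.
/// Assumption: with no heads there is nothing to warn about, and with no
/// common commits the highest common generation is taken as 0.
pub fn should_warn_early(head_gens: &[u64], common_gens: &[u64]) -> bool {
    let lowest_head = match head_gens.iter().min() {
        Some(g) => *g,
        None => return false,
    };
    let highest_common = common_gens.iter().max().copied().unwrap_or(0);
    lowest_head.saturating_sub(highest_common) > EXPENSIVE_DISTANCE
}
```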

Differential Revision: D23213482

fbshipit-source-id: 67e2399ca958703129cf3c22d82ce48cbbdcd2d1
2020-08-21 13:00:45 -07:00
Stanislau Hlebik
666182b451 mononoke: add one more function to create DifferenceOfUnionsOfAncestorsNodeStream
Summary:
In the next diff I'd like to compute generation number first, and then call
DifferenceOfUnionsOfAncestorsNodeStream. To avoid refetching these numbers
again let's create a function that accepts a vector of (ChangesetId,
Generation) pairs.

While here I also made the order more consistent: now we have "hashes"
parameters always in front of "excludes"

Differential Revision: D23212883

fbshipit-source-id: 11e0a1494126f84b36e3e33e65071449db5840d2
2020-08-19 07:30:11 -07:00
Lukasz Piatkowski
411a03ee58 mononoke/integration tests: fix tests after moving dummyssh (#43)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/43

Reviewed By: StanislavGlebik

Differential Revision: D23211512

Pulled By: lukaspiatkowski

fbshipit-source-id: 5c831c0d3dfd595647138f98968b91c1660c0856
2020-08-19 04:09:54 -07:00
Stanislau Hlebik
30150b244e mononoke: log correct file size for undesired file fetches
Summary:
Previously, for undesired fetches of LFS files we were logging 0. Let's log the
real size instead.

Reviewed By: ikostia

Differential Revision: D23209754

fbshipit-source-id: 7a893b257a89332a5169ab2072ecf48ae94b91e0
2020-08-19 02:41:19 -07:00
Viet Hung Nguyen
e1b088d315 mononoke/repo_import: moved tests
Summary: Minor refactor: moved the tests from main to their own file, because main was getting too large and the code was hard to navigate.

Reviewed By: StanislavGlebik

Differential Revision: D23188766

fbshipit-source-id: a2b2e32c77587f95c07a0bb02a4957e3671dd2c6
2020-08-18 09:09:14 -07:00
Johan Schuijt-Li
4e0660a94c asyncify connection accepting
Summary:
This largely moves connection accepting from old style bytes, futures and tokio
to updated versions, while keeping some parts at old bytes/futures in order to
remain compatible with the rest of the Mononoke codebase.

The division lies at `Stdio`, which maintains old channels, streams and futures,
while the socket handling, connection accepting and wire encoding are updated.

With the updated futures, we now wait for the forwarding stream to have
succeeded before considering a connection fully handled.

Other notable changes:
 - futures_ext now has a mini codec Decoder instead of relying on NetstringDecoder,
   which has been updated to use bytes 0.5
 - hgcli has been modified to use updated NetstringDecoder
 - netstring now requires the updated bytes 0.5 crate
 - the part of connection_acceptor that was handling repo/security logic is now part of repo_handler (as it should have been), connection_acceptor now only handles networking and framing
 - tests now verify that the shutdown handler is triggered

Reviewed By: krallin

Differential Revision: D22526867

fbshipit-source-id: 34e43af4a0c8b84de0000f2093d7fffd3fb0e20d
2020-08-18 09:09:14 -07:00
Mark Thomas
526aac3aa2 scribe_commit_queue: include received_timestamp
Summary:
Subscribers to the commit tailer categories would like to know when Mononoke
received the commit.

Reviewed By: StanislavGlebik

Differential Revision: D23162447

fbshipit-source-id: 747214f1964a643f59c491aa08cdbd5c8fe331c8
2020-08-17 13:13:12 -07:00
Mark Thomas
23dee0c931 mononoke_api: add freshness parameter to resolve_bookmark
Summary:
Allow callers of `resolve_bookmark` to specify whether they'd like the most recent value of
the bookmark, rather than one which may be stale.

Use this in the repo_move_bookmark test to avoid flakiness caused by the test code racing against
the warm bookmark cache.

Reviewed By: StanislavGlebik

Differential Revision: D23151427

fbshipit-source-id: 4b8358be1cf103479ccc23a41b2505776543ee49
2020-08-17 09:09:07 -07:00
Mark Thomas
053abf7919 hook_manager_factory: extract construction of the hook manager
Summary:
Extract construction of the hook manager to its own crate, so that we can re-use it.

Eventually the hook manager will become a repo attribute and will be constructed by
the repo attribute factory, but for now it needs its own factory method.

Differential Revision: D23129407

fbshipit-source-id: 302fde4d1ae38c6f61032a32c880018ebf84dee2
2020-08-17 09:09:07 -07:00
Mark Thomas
563137e6f7 hooks: create HookRejection struct
Summary:
Convert hook rejections from a tuple to a named struct.  This will be used in
the bookmarks_movement public interface.

Reviewed By: krallin

Differential Revision: D23077550

fbshipit-source-id: a35476817660c38b8df879ba603b927a7e39be21
2020-08-17 09:09:07 -07:00
Viet Hung Nguyen
d3f3cffe13 mononoke/repo_import: check if repo pushredirects
Summary: Some repos are push-redirected repos: pushes go to another repo, but are then synced into this repo. Because of this, when we import a repo into a smaller repo that push-redirects to a large repo, we need to make sure we don't break the large repo with the imported code, since merges, pushes, imports etc. are redirected to the large repo. For now, in order to avoid breaking the large repo, we added a simple check that returns an error if the small repo push-redirects to the large one.

Reviewed By: ikostia

Differential Revision: D23158826

fbshipit-source-id: f722790441d641f67293e78c5d1ea5d1102bbb9b
2020-08-17 06:13:21 -07:00
Mark Thomas
2117806dc7 tests: convert dummyssh and get_free_socket to python3
Summary: Convert dummyssh and get_free_socket to full python3 binaries.

Reviewed By: johansglock

Differential Revision: D23105490

fbshipit-source-id: 6c39c32ba0728cde108b42245acece1d7828ac7c
2020-08-17 02:42:14 -07:00
Arun Kulshreshtha
523013c808 cmdlib: remove extra comment slashes
Reviewed By: quark-zju

Differential Revision: D23111025

fbshipit-source-id: b8606c322439c41097d739df59b551a8432e7fe4
2020-08-14 18:04:26 -07:00
Alex Hornby
213edc10ce mononoke: limit queue peeking from scrub blobstore when one store has missing value
Summary: When running manual scrub for a large repo with one empty store, we are doing one peek per key.  For keys that have existed for some time this is unnecessary as we know the key should exist and slows down the scrub.

Reviewed By: farnz

Differential Revision: D23054582

fbshipit-source-id: d2222350157ca37aa31b7792214af4446129c692
2020-08-14 02:37:45 -07:00
Mark Thomas
dba3ef35ca repo_client: extract common calls to bookmarks_movement
Summary:
Extract the calls to bookmarks_movement to separate functions to avoid duplication and
make the post-resolve action functions easier to read.

Reviewed By: StanislavGlebik

Differential Revision: D23057045

fbshipit-source-id: c6b5a8cdb2399e89c174c3df844529d4b5309edf
2020-08-14 02:28:56 -07:00
Mark Thomas
c529e6a527 bookmarks_movement: refactor bookmark movement for pushrebase
Summary: Refactor control of movement of non-scratch bookmarks through pushrebase.

Reviewed By: krallin

Differential Revision: D22920694

fbshipit-source-id: 347777045b4995b69973118781511686cf34bdba
2020-08-14 02:28:55 -07:00
Mark Thomas
a16b88d1c5 pushrebase: remove OntoBookmarkParams and clean up interface
Summary:
Some parts of the `pushrebase` public interface will be re-exported from `bookmarks_movement`.

Clean these up in preparation:

* Remove `OntoBookmarkParams` as it is now a simple wrapper around `BookmarkName` that
  prevents us from using a reference.

* Make the bundle replay data `Option<&T>` rather than `&Option<T>`, allowing us to
  use the former when available.  The latter can be readily converted with `.as_ref()`.

* Rename `SuccessResult` to `Outcome` and `ErrorKind` to `InternalError`.
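
The `Option<&T>` versus `&Option<T>` point can be illustrated with a small sketch. `ReplayData` and `bundle_id_or_zero` are hypothetical names used only here; `Option::as_ref` is the standard library method.

```rust
/// Hypothetical stand-in for bundle replay data.
#[derive(Debug)]
pub struct ReplayData {
    pub bundle_id: u64,
}

/// Taking `Option<&T>` lets callers pass either a borrowed value they
/// already hold or a borrowed view of an owned `Option<T>`.
pub fn bundle_id_or_zero(replay: Option<&ReplayData>) -> u64 {
    replay.map_or(0, |r| r.bundle_id)
}
```

A caller holding an owned `Option<ReplayData>` passes `owned.as_ref()`, while a caller holding only a `&ReplayData` passes `Some(r)`; with a `&Option<T>` parameter the second caller would have to build an owned `Option` first.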

Reviewed By: krallin

Differential Revision: D23055580

fbshipit-source-id: 1208a934f979a9d5eb73310fb8711b1291393ecf
2020-08-14 02:28:55 -07:00
Mark Thomas
c59c2979d2 bookmarks_movement: refactor bookmark movement for force-pushrebase
Summary:
Refactor control of movement of non-scratch bookmarks through force-pushrebase
or bookmark-only pushrebase.  These are equivalent to ordinary pushes, and so
can use the same code path for moving the bookmarks.

This has the side-effect of enabling some patterns that were previously not
possible, like populating git mappings with a force-pushrebase.

Reviewed By: ikostia

Differential Revision: D22844828

fbshipit-source-id: 4ef71fa4cef69cc2f1d124837631e8304644ca06
2020-08-14 02:28:54 -07:00
Mark Thomas
279c3dcd8f bookmarks_movement: refactor bookmark movement for push
Summary: Refactor control of movement of non-scratch bookmarks through plain pushes.

Reviewed By: krallin

Differential Revision: D22844829

fbshipit-source-id: 2f1a89e1d0f69880f74b7bc135144bfb305a918e
2020-08-14 02:28:54 -07:00
Mark Thomas
e0bdff5188 bookmarks_movement: refactor scratch bookmark movement
Summary:
Refactor control of movement of scratch bookmarks to a new `bookmarks_movement` crate
that will contain all bookmark movement controls.

Reviewed By: krallin

Differential Revision: D22844830

fbshipit-source-id: 56d25ad45a9328eaa079c13466b4b802f033d1dd
2020-08-14 02:28:53 -07:00
Alex Hornby
61046e3adb rust: point internment crate to internment master
Summary: Update internment to point at its latest master branch commit.  Upstream has merged my PR to use DashMap inside internment, but they haven't cut a new crates release yet.

Reviewed By: jsgf, krallin

Differential Revision: D23075070

fbshipit-source-id: 8f4ec0e3ddbefd672c3040fb174d1cf5f6c1a94a
2020-08-14 02:15:24 -07:00
Alex Hornby
d59dd787c5 mononoke: make blobstore ctime a bit easier to use
Summary: Ctime is an Option<i64>, so rather than as_ctime()/into_ctime() use the fact that it's fairly small and Copy to just .ctime()

Reviewed By: krallin

Differential Revision: D23081739

fbshipit-source-id: be62912eca02e5c29d7473d6f386d98df11000dd
2020-08-14 02:09:46 -07:00
Stanislau Hlebik
f68531f81c mononoke: add a flag to disable short history fetching
Summary: Let's use a new flag to enable/disable short history for getpack requests

Reviewed By: krallin

Differential Revision: D23080200

fbshipit-source-id: 7aa0be6ded0601fa4d31d4b9ff8792a4f8d91b19
2020-08-13 10:37:40 -07:00
Meyer Jacobs
b9ce375f36 edenapi: Split DataEntry into FileEntry and TreeEntry
Summary:
The primary change is in `eden/scm/lib/edenapi/types`:
* Split `DataEntry` into `FileEntry` and `TreeEntry`.
* Split `DataError` into `FileError` and `TreeError`. Remove `Redacted` error variant from `TreeError` and `MaybeHybridManifest` error variant from `FileError`.
* Split `DataRequest`, `DataResponse` into appropriate File and Tree types.
* Refactor `data.rs` into `file.rs` and `tree.rs`.
* Lift `InvalidHgId` error, used by both File and Tree, into `lib.rs`.
* Bugfix: change `MaybeHybridManifest` to be returned only for hash mismatches with empty paths, to match documented behavior.

Most of the remaining changes are straightforward fallout of this split. Notable changes include:
* `eden/scm/lib/edenapi/tools/read_res`: I've split the "data" commands into "file" and "tree", but I've left the identical arguments sharing the same argument structs. These can be refactored later if / when they diverge.
* `eden/scm/lib/types/src/hgid.rs`: Moved `compute_hgid` from `eden/scm/lib/edenapi/types/src/data.rs` to a new `from_content` constructor on the `HgId` struct.
* `eden/scm/lib/revisionstore/src/datastore.rs`: Split `add_entry` method on `HgIdMutableDeltaStore` trait into `add_file` and `add_tree` methods.
*  `eden/scm/lib/revisionstore/src/edenapi`
    * `mod.rs`: Split `prefetch` method on `EdenApiStoreKind` into `prefetch_files` and `prefetch_trees`, which are given a default implementation that fails with `unimplemented!`.
    * `data.rs`: Replace blanket trait implementations for `EdenApiDataStore<T>` with specific implementations for `EdenApiDataStore<File>` and `EdenApiDataStore<Tree>` which call the appropriate fetch and add functions.
    * `data.rs` `test_get_*`: Replace dummy hashes with real hashes. These tests were only passing due to the hash mismatches (incorrectly) being considered `MaybeHybridManifest` errors, and allowed to pass.

Reviewed By: kulshrax

Differential Revision: D22958373

fbshipit-source-id: 788baaad4d9be20686d527f819a7342678740bc3
2020-08-13 10:01:40 -07:00
Stanislau Hlebik
96a9528149 mononoke: use VecDeque in blame_range_split_at
Summary:
We had an accidentally quadratic behaviour in our blame implementation.
blame_range_split_at copied the right part of the range over and over again.
This diff fixes it by using VecDeque instead
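
The general idea, sketched outside of the blame code: removing items from the front of a `Vec` shifts everything that remains, so doing it repeatedly is quadratic, whereas a `VecDeque` can give up its front in time proportional to the number of items removed. `take_front` is a hypothetical helper, not the actual `blame_range_split_at`.

```rust
use std::collections::VecDeque;

/// Removes and returns up to `n` items from the front of the queue.
/// The remaining items are not copied or shifted.
pub fn take_front(queue: &mut VecDeque<u32>, n: usize) -> Vec<u32> {
    let n = n.min(queue.len());
    queue.drain(..n).collect()
}
```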

Reviewed By: aslpavel

Differential Revision: D23102690

fbshipit-source-id: 951dd6383c48206fdc92757a47690f8e826a737b
2020-08-13 08:32:41 -07:00
Viet Hung Nguyen
126a661d8c mononoke/repo_import: add commit push functionality
Summary:
After creating the merge commit (D23028163 (f267bec3f7)) from the imported commit head and the destination bookmark's head, we need to push the commit onto that bookmark. This diff adds the push functionality to repo_import tool.
Note: GlobalrevPushrebaseHook is a hook to assign globalrevs to commits to keep the order of the commits

Reviewed By: StanislavGlebik

Differential Revision: D23072966

fbshipit-source-id: ff815467ed0f96de86da3de9a628fd45743eb167
2020-08-13 00:43:26 -07:00
Stanislau Hlebik
e308419b58 RFC mononoke: limit number of filenodes get_all_filenodes_maybe_stale
Summary:
In a repository with files with large histories we run into a lot of SqlTimeout
errors while fetching file history to serve getpack calls. However fetching the
whole file history is not really necessary - client knows how to work with
partial history i.e. if client misses some portion of history then it would
just fetch it on demand.

This diff adds a way to limit how many entries are fetched; if more entries than the limit would be fetched then we return FilenodeRangeResult::TooBig. The downside of this diff is that we'd have to do more sequential database
queries.
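
One way such a limit can work is sketched below. The `RangeResult` enum and `fetch_with_limit` function are simplified, hypothetical stand-ins (the real `FilenodeRangeResult` is not shown in this log), and rows are reduced to `u64`.

```rust
/// Simplified stand-in for a "present or too big" result type.
#[derive(Debug, PartialEq)]
pub enum RangeResult<T> {
    Present(T),
    TooBig,
}

/// Collects rows up to `limit`. One extra row is requested so that
/// "exactly `limit` rows" can be told apart from "more than `limit` rows".
pub fn fetch_with_limit<I>(rows: I, limit: Option<usize>) -> RangeResult<Vec<u64>>
where
    I: Iterator<Item = u64>,
{
    match limit {
        None => RangeResult::Present(rows.collect()),
        Some(limit) => {
            let fetched: Vec<u64> = rows.take(limit.saturating_add(1)).collect();
            if fetched.len() > limit {
                RangeResult::TooBig
            } else {
                RangeResult::Present(fetched)
            }
        }
    }
}
```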

Reviewed By: krallin

Differential Revision: D23025249

fbshipit-source-id: ebed9d6df6f8f40e658bc4b83123c75f78e70d93
2020-08-12 14:33:43 -07:00
Stanislau Hlebik
5008ac3932 mononoke: start using warm bookmark cache blobimport
Summary: Finally let's start using warm bookmark cache for blobimport

Reviewed By: krallin

Differential Revision: D23057327

fbshipit-source-id: fc454bf827f476919d0bfed7691b8b29d79bd876
2020-08-12 12:03:19 -07:00
Stanislau Hlebik
297773719e mononoke: add new warmer that tracks which commit has been blobimported
Summary:
See D23053788 for motivation. Let's add a new warmer that checks
mutable_counters to understand which commit has been imported already.

Reviewed By: krallin

Differential Revision: D23053991

fbshipit-source-id: 3651aed8836a791675dd8d7bcc145fd32e56a13f
2020-08-12 08:50:35 -07:00
Stanislau Hlebik
2767b28825 mononoke: blobimport record highest imported generation number
Reviewed By: krallin

Differential Revision: D23053788

fbshipit-source-id: 615a4f4064a56d6e45818f85f002267d4bf08c95
2020-08-12 08:50:35 -07:00
Alex Hornby
5631157574 mononoke: use sorted_vector_map when parsing hg manifest blob
Summary:
Use sorted_vector_map when parsing hg manifest blob, as blobs are usually stored sorted, which can result in high cost of BTree insertion when traversing large repos.

Also uses the size_hint() from the parsing Split to save reallocations during insert.
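
To illustrate why a sorted vector suits already-sorted input, here is a toy map written for this note. It is not the `sorted_vector_map` crate's API: keys that arrive in order are appended, and only out-of-order keys pay for a binary search and a shifting insert.

```rust
/// Toy sorted-vector map (illustrative only).
#[derive(Debug, Default)]
pub struct SortedVecMap {
    entries: Vec<(String, u64)>,
}

impl SortedVecMap {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Inserts or overwrites `key`. Sorted input always takes the push path.
    pub fn insert(&mut self, key: String, value: u64) {
        let append = match self.entries.last() {
            Some((last, _)) => last.as_str() < key.as_str(),
            None => true,
        };
        if append {
            self.entries.push((key, value));
            return;
        }
        let pos = self
            .entries
            .binary_search_by(|(k, _)| k.as_str().cmp(key.as_str()));
        match pos {
            Ok(i) => self.entries[i].1 = value,
            Err(i) => self.entries.insert(i, (key, value)),
        }
    }

    pub fn get(&self, key: &str) -> Option<u64> {
        self.entries
            .binary_search_by(|(k, _)| k.as_str().cmp(key))
            .ok()
            .map(|i| self.entries[i].1)
    }
}
```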

Reviewed By: markbt

Differential Revision: D22975883

fbshipit-source-id: 1faff754f03d7b2c20ebb741fec4f97b310852f9
2020-08-12 02:51:17 -07:00
Alex Hornby
7766cac6a1 mononoke: add task spawning to manual_scrub
Summary: Add task spawning to manual_scrub to increase throughput

Reviewed By: farnz

Differential Revision: D23055811

fbshipit-source-id: 1e3d1f0e5b5fc2f2935aa367ae2e749c867d2d62
2020-08-12 01:31:03 -07:00
Alex Hornby
e18cf0210f mononoke: use the background session mode from manual scrub
Summary: There are no users waiting on manual scrub, so set it to use the background session mode.

Reviewed By: krallin

Differential Revision: D23054581

fbshipit-source-id: 985bcadbaf17d2a8c92fdec811ecb239cbca7b37
2020-08-12 01:31:03 -07:00
Stanislau Hlebik
593b16d485 mononoke: add WarmBookmarksCacheBuilder
Summary:
Let's split logic from WarmBookmarksCache into a separate builder. This builder
will configure which warmers we'd like to use.

This will make it easier to introduce a new warmer later in the stack

Reviewed By: krallin

Differential Revision: D23053785

fbshipit-source-id: 32acc9da98d32624ca0dc00277910443f3d86f66
2020-08-11 15:37:36 -07:00
Stanislau Hlebik
8a3d0dca74 mononoke: check if hg changesets should be warmed
Summary:
Previously we were unconditionally adding hg changesets, but that's a bit
strange and there's no reason to do it. Let's do the same check we do for other
derived data types. Note that there should be no change in behaviour - all our
repos have "hgchangesets" derived data type enabled.

Reviewed By: krallin

Differential Revision: D23053786

fbshipit-source-id: 0b3ea99f649bc89ea9b216f368fee11fa25e153f
2020-08-11 15:37:35 -07:00
Stanislau Hlebik
a2d997d7a1 mononoke: renamed is_derived to is_warm
Summary: I want to add a new warmer in the next diffs which won't do any deriving.

Reviewed By: krallin

Differential Revision: D23053787

fbshipit-source-id: 4c7febb60ab7e835302db746c670d656bd9d1989
2020-08-11 15:37:35 -07:00
Stanislau Hlebik
f5e5286a87 mononoke: fix minor typo
Summary: makes grepping easier

Reviewed By: krallin

Differential Revision: D23053784

fbshipit-source-id: f8ebddfb0d99000ec3ad9d068c8abfe929bf7a5d
2020-08-11 11:15:59 -07:00
Viet Hung Nguyen
f267bec3f7 mononoke/repo_import: add merge functionality
Summary:
Once we have revealed the commits to the user (D22864223 (578207d0dc), D22762800 (f1ef619284)), we need to merge the imported branch into the destination branch (specified by dest-bookmark). To do this, we extract the latest commit of the destination branch, then compare the two commits to check whether we have merge conflicts. If we have merge conflicts, we inform the user, so they can resolve them. Otherwise, we create a new bonsai having the two commits as parents.

Next step: pushrebase the merge commit

Minor refactor: moved app setup to a separate file for better readability.

Reviewed By: StanislavGlebik

Differential Revision: D23028163

fbshipit-source-id: 7f3e2a67dc089e6bbacbe71b5e4ef5f6eed2a9e1
2020-08-11 03:26:57 -07:00
Alex Hornby
74f2e7affc mononoke: add context to blobstore_sync_queue get error handling
Summary: Add context to show the affected key if there are problems peeking a key.

Reviewed By: farnz

Differential Revision: D23003001

fbshipit-source-id: b46b7626257f49d6f11e80a561820e4b37a5d3b0
2020-08-11 02:52:44 -07:00
Alex Hornby
0a8c81c668 mononoke: walker state, check for visited before insert
Summary:
Now that the previous diff has pre-computed the hash value using EagerHashMemo, it's less expensive to try a read-lock-only get() first before committing to a write-lock-acquiring insert().

The combination of this and the previous diff moved WalkState::visit from dominating the CPU profile to not (the path interning dominates now).
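
The check-before-insert pattern can be sketched with the standard library alone. This uses a single `RwLock<HashSet>` rather than DashMap's sharded locks, and `Visited` is a hypothetical type, so it shows the shape of the idea rather than the walker's code.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical visited-set guarded by one read/write lock.
pub struct Visited {
    seen: RwLock<HashSet<u64>>,
}

impl Visited {
    pub fn new() -> Self {
        Visited { seen: RwLock::new(HashSet::new()) }
    }

    /// Returns true only on the first visit of `node`.
    pub fn visit(&self, node: u64) -> bool {
        {
            // Cheap path: readers do not block each other.
            let seen = self.seen.read().unwrap();
            if seen.contains(&node) {
                return false;
            }
        } // read guard dropped here, before the write lock is taken
        // Another thread may have inserted in between; `insert` reports that.
        self.seen.write().unwrap().insert(node)
    }
}
```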

Reviewed By: krallin

Differential Revision: D22975881

fbshipit-source-id: 90b2be83282ee2095c517c0d4f13536ddadf6267
2020-08-11 02:52:43 -07:00
Alex Hornby
22add277f9 mononoke: update walker state to use eager hash memo
Summary:
DashMap takes the hash of its keys multiple times, once outside the lock, and then once or twice inside the lock depending on whether the key is present in the shard.

Pre-computing the hash value using EagerHashMemo means it's done only once and, more importantly, outside the lock.

To use EagerHashMemo one needs to supply the BuildHasher, so it's added as a struct member and the record method is made a member function.

Reviewed By: farnz

Differential Revision: D22975878

fbshipit-source-id: c2ca362fdfe31e5dca329e6200029207427cd9a1
2020-08-11 02:52:43 -07:00
Stefan Filip
2825193931 edenapi: add /commit/revlog_data endpoint
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
need to have all commits locally.

Reviewed By: krallin

Differential Revision: D22979458

fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
2020-08-11 01:54:14 -07:00
Simon Farnsworth
3086b241c6 Give the blobstore healer a way to cope with temporary falls in MySQL capacity
Summary:
The query we use to select blobs to heal is naturally expensive, due to the use of a subquery. This means that finding the perfect queue limit is hard, and we depend on task restarts to handle brief overload of MySQL.

Give us a fast fall in batch size (halve on each failure), and slow climb back (10% climb on each success), and a random delay after each failure before retrying.
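
The batch-size policy reads as halve on failure, grow by 10% on success. A small sketch under assumed bounds (a floor of 1 and a ceiling at the starting size, neither of which is stated in the summary), with the random retry delay left out:

```rust
/// Hypothetical batch-size controller; bounds are assumptions.
pub struct BatchSizer {
    current: usize,
    min: usize,
    max: usize,
}

impl BatchSizer {
    pub fn new(max: usize) -> Self {
        BatchSizer { current: max, min: 1, max }
    }

    pub fn current(&self) -> usize {
        self.current
    }

    /// Fast fall: halve the batch size, never going below the floor.
    pub fn on_failure(&mut self) {
        self.current = (self.current / 2).max(self.min);
    }

    /// Slow climb: grow by 10% (at least 1), capped at the ceiling.
    pub fn on_success(&mut self) {
        let step = (self.current / 10).max(1);
        self.current = (self.current + step).min(self.max);
    }
}
```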

Reviewed By: StanislavGlebik

Differential Revision: D23028518

fbshipit-source-id: f2909fe792280f81d604be99fabb8b714c1e6999
2020-08-10 15:24:13 -07:00
Stanislau Hlebik
9787a2c33a mononoke: add admin command to return filenodes for path
Summary: It's useful to debug filenodes

Reviewed By: krallin

Differential Revision: D23028528

fbshipit-source-id: 500fe2ad62a8e07498f46801c0c1523d1656ceeb
2020-08-10 11:13:01 -07:00
Stanislau Hlebik
bdd494b2ce mononoke: fix filenodes cache key
Summary:
`is_tree` wasn't part of the cache key, and that means we could have returned
incorrect history if we had a file and a directory with the same name.

This diff fixes it.
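
The effect of the fix can be shown with a toy key type. `HistoryCacheKey` is a hypothetical name; the real cache key is a string format not shown in this log.

```rust
/// Hypothetical cache key: the path alone is ambiguous when a file and a
/// directory share a name, so `is_tree` is part of the key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct HistoryCacheKey {
    pub path: String,
    pub is_tree: bool,
}

/// Convenience constructor for the sketch.
pub fn key_for(path: &str, is_tree: bool) -> HistoryCacheKey {
    HistoryCacheKey { path: path.to_string(), is_tree }
}
```

With this key, a `HashMap<HistoryCacheKey, _>` keeps separate entries for the file `foo` and the directory `foo`.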

Reviewed By: krallin

Differential Revision: D23028527

fbshipit-source-id: 98a3b2028fa62231dfb570a76fb836374ce1eed0
2020-08-10 07:13:35 -07:00
Stanislau Hlebik
21e232ddaf mononoke: add init_tunables in fastreplay
Summary:
I noticed that fastreplay doesn't init tunables, and that means that it doesn't
get the updates, and more importantly it doesn't use default values of
tunables.

That doesn't look expected (but lmk if I'm wrong!)

Reviewed By: krallin

Differential Revision: D23027311

fbshipit-source-id: ee43d02457d2240ebeb1530c672cb3847bc3afd4
2020-08-10 03:55:41 -07:00
Alex Hornby
02b9979b21 rust: vendor dashmap 3.11.9
Summary: This has my into_key() PR https://github.com/xacrimon/dashmap/pull/91 merged so the patch pointing to my fork is also removed.

Reviewed By: farnz

Differential Revision: D22896911

fbshipit-source-id: 188d438ce2aa20cfb3c466a62227d1cd27625f74
2020-08-10 03:19:33 -07:00
Alex Hornby
a7ff2a0c34 rust: vendor ahash 0.4.4
Summary:
Vendor ahash 0.4.4. In tests I haven't found this update significant for mononoke walker performance, but we might as well be current now that I've tried it.

I have found that wrapping ahash in a memoizing hasher helps, but that is for another diff.

Reviewed By: farnz

Differential Revision: D22864635

fbshipit-source-id: 5019259273ae3bd2df95cdd18adceed895baf8f2
2020-08-07 05:34:01 -07:00
Alex Hornby
e0c6e249fe mononoke: add a non-thrift header to packblob so we can vary thrift protocol in future
Summary: Add a non-thrift header to packblob so we can vary thrift protocol in future.

Reviewed By: farnz

Differential Revision: D22953758

fbshipit-source-id: a114a350105e75cbe57f6c824295d863c723f32f
2020-08-07 03:43:56 -07:00
Stanislau Hlebik
be3c46e10d mononoke: add --find-latest-imported-rev-only mode to blobimport
Reviewed By: ikostia

Differential Revision: D22975677

fbshipit-source-id: d4322901a84b8d76ccdffab17421f32c8e7510eb
2020-08-06 13:08:50 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error type not interested
by `dag` itself. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Mark Thomas
b56a1b5b2c scs_server: add repo_list_hg_manifest
Summary:
To allow EdenFS to get aux manifest data from Mononoke without needing to derive fsnodes, provide
a mechanism to list a manifest using the hg manifest id that returns the size and content hashes
of each of the files.

NOTE: this is temporary until the EdenAPI server is fully online and serving this data.

Reviewed By: krallin

Differential Revision: D22975967

fbshipit-source-id: 0a25da6d74534d42fc3b5f38ba3b72107b209681
2020-08-06 11:28:11 -07:00
Stanislau Hlebik
3bb6bddce8 mononoke: remove expect
Summary: Let's return normal error instead

Reviewed By: krallin

Differential Revision: D22976148

fbshipit-source-id: fd89dfa1949d4b5e3354aab7d93ca40d779a18ec
2020-08-06 08:00:33 -07:00
Stanislau Hlebik
37747550ef mononoke: open RevlogRepo once
Summary: Previously it was opened twice, even though there was no reason to do it.

Reviewed By: krallin

Differential Revision: D22976149

fbshipit-source-id: 426858da4548f1eaffe1d989e5424937af2583a5
2020-08-06 08:00:33 -07:00
Alex Hornby
e29db47009 mononoke: factor out walker hasher settings, take explicit ahash dependency
Summary:
Factor out the walker's state internals to BuildStateHasher and StateMap.

This change keeps the defaults the same using DashMap and ahash::RandomState and uses the same ahash version that DashMap defaults to internally.

This is in preparation for the next diff, where the ahash dependency is updated to 0.4.4. I thought it was clearer not to combine the refactoring and the update of the hasher in the same diff.

Reviewed By: ikostia

Differential Revision: D22851585

fbshipit-source-id: 84fa0dc73ff9d32f88ad390243903812a4a48406
2020-08-06 06:27:22 -07:00
Alex Hornby
f07e0be8e3 mononoke: only emit NodeData from walker if required
Summary:
Only emit NodeData from walker if required to save some memory.  Each of the walks can now specify which NodeData it is interested in observing in the output stream.

We still need to emit Some as part of the Option<NodeData> in the output stream as it is used in things like the final count of loaded objects. Rather than stream over Option<Option<NodeData>> we instead add a NodeData::NotRequired variant.
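
A toy version of that design choice, with a one-variant payload standing in for the walker's real `NodeData` (the `FileContent` variant and both helper functions are assumptions made for this sketch):

```rust
/// Simplified stand-in for the walker's node payload.
#[derive(Debug, PartialEq)]
pub enum NodeData {
    FileContent(Vec<u8>),
    /// The node was loaded, but the caller did not ask for its data.
    NotRequired,
}

/// Emits `Some` for every loaded node, dropping the payload when not wanted.
pub fn emit(loaded: Vec<u8>, required: bool) -> Option<NodeData> {
    if required {
        Some(NodeData::FileContent(loaded))
    } else {
        Some(NodeData::NotRequired)
    }
}

/// Counting loaded objects still works, since `NotRequired` is a `Some`.
pub fn count_loaded(stream: &[Option<NodeData>]) -> usize {
    stream.iter().filter(|item| item.is_some()).count()
}
```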

Reviewed By: markbt

Differential Revision: D22849831

fbshipit-source-id: ef212103ac2deb9d66b017b8febe233eb53c9ed3
2020-08-06 06:27:22 -07:00
Stanislau Hlebik
c0347c6baf mononoke: refactor verify_working_copy slightly
Summary:
Extract verify_working_copy_inner function, which lets directly specify
source/target repo, hash and movers. It can be useful to verify equivalence of
two commits even if they are not in commit equivalence mapping.

Reviewed By: krallin

Differential Revision: D22950840

fbshipit-source-id: ab30be7190e29db3343b846b48333d7c7339d043
2020-08-06 05:51:37 -07:00
Simon Farnsworth
0c3fe9b20f Fully asyncify blobstore sync queue
Summary: Move it from `'static` BoxFutures to async_trait and lifetimes

Reviewed By: markbt

Differential Revision: D22927171

fbshipit-source-id: 637a983fa6fa91d4cd1e73d822340cb08647c57d
2020-08-05 15:41:15 -07:00
David Tolnay
014f40209b Back out "rust: 1.45.2 update"
Summary:
This is a backout of D22912569 (34760b5164), which is breaking opt-clang-thinlto builds on platform007 (S206790).

Original commit changeset: 5ffdc48adb1f

Reviewed By: aaronabramov

Differential Revision: D22956288

fbshipit-source-id: 45940c288d6f10dfe5457d295c405b84314e6b21
2020-08-05 13:28:13 -07:00
Viet Hung Nguyen
f2ee103884 mononoke/repo_import: add more meaningful print outs and save hashes
Summary:
Added more logs when running the binary to be able to track the progress more easily.
Saved bonsai hashes into a file. In case we fail at deriving data types, we can still try to derive them manually with the saved hashes and avoid running the whole tool again.

Reviewed By: StanislavGlebik

Differential Revision: D22943309

fbshipit-source-id: e03a74207d76823f6a2a3d92a1e31929a39f39a5
2020-08-05 12:46:14 -07:00
Mark Thomas
cbd105a73e hook_tailer: reduce default concurrency to 20
Summary:
Large commits and many hooks can mean checking 100 commits at a time overloads
the system.  Reduce the default concurrency to something more reasonable.

While we're here, let's use the proper mechanism for default values in clap.

Reviewed By: ikostia

Differential Revision: D22945597

fbshipit-source-id: 0f0a086c3b74bec614ada44a66409c8d2b91fe69
2020-08-05 10:34:05 -07:00
Mark Thomas
e12728305c hook_tailer: make command line arguments consistent
Summary:
Argument names should be `snake_case`.  Long options should be `--kebab-case`.

Retain the old long options as aliases for compatibility.

Reviewed By: HarveyHunt

Differential Revision: D22945600

fbshipit-source-id: a290b3dc4d9908eb61b2f597f101b4abaf3a1c13
2020-08-05 10:34:05 -07:00
Mark Thomas
b2b895353f hook_tailer: add --exclude-merges to skip merge commits
Summary: Add `--exclude-merges` which will skip merge commits.

Reviewed By: HarveyHunt

Differential Revision: D22945598

fbshipit-source-id: 3c20cf049bbe15a975671e8792259b460356804a
2020-08-05 10:34:05 -07:00
Mark Thomas
57626bec98 hook_tailer: add --log-interval to log every N commits
Summary:
Add `--log-interval` to log every N commits, so that it can be seen to be
making progress in the logs.

The default is set to 500, which logs about once every 10 seconds on my devserver.

Reviewed By: HarveyHunt

Differential Revision: D22945599

fbshipit-source-id: 7fc09b907793ea637289c9018958013d979d6809
2020-08-05 10:34:05 -07:00
Simon Farnsworth
99247529d5 Wishlist priority connections should use background mode
Summary: Commitcloud fillers use wishlist priority because we want them to wait their turn behind other users; let's also stop them from flooding the blobstore healer queue by making them background priority.

Reviewed By: ahornby

Differential Revision: D22867338

fbshipit-source-id: 5d16438ea185b580f3537e3c4895a545483eca7a
2020-08-05 06:35:46 -07:00
Simon Farnsworth
aa94fb9581 Add a multiplex mode that doesn't update the sync queue
Summary:
Backfillers and other housekeeping processes can run so far ahead of the blobstore sync queue that we can't empty it from the healer task as fast as the backfillers can fill it.

Work around this by providing a new mode that background tasks can use to avoid filling the queue if all the blobstores are writing successfully. This has a side-effect of slowing background tasks to the speed of the slowest blobstore, instead of allowing them to run ahead at the speed of the fastest blobstore and relying on the healer ensuring that all blobs are present.

Future diffs will add this mode to appropriate tasks

Reviewed By: ikostia

Differential Revision: D22866818

fbshipit-source-id: a8762528bb3f6f11c0ec63e4a3c8dac08d0b4d8e
2020-08-05 06:35:46 -07:00
Stanislau Hlebik
f13067b0da mononoke: add manual_commit_sync to megarepotool
Summary:
This operation is useful immediately after a small repo is merged into a large repo.
See example below

```
  B' <- manually synced commit from small repo (in small repo it is commit B)
  |
  BM <- "big merge"
 /  \
...  O <- big move commit i.e. commit that moves small repo files in correct location
     |
     A <- commit that was copied from small repo. It is identical between small and large repos.
```

Immediately after a small repo is merged into a large one, we need to record that commit B and all of
its ancestors from the small repo need to be based on top of the "big merge" commit in the large repo rather than on top of
commit A.
The function below can be used to achieve exactly that.

Reviewed By: ikostia

Differential Revision: D22943294

fbshipit-source-id: 33638a6e2ebae13a71abd0469363ce63fb6b014f
2020-08-05 05:55:15 -07:00
Simon Farnsworth
33c2a0c846 Update auto_impl to 0.4
Summary: We were using a git snapshot of auto_impl from somewhere between 0.3 and 0.4; 0.4 fixes a bug around Self: 'lifetime constraints on methods that blocks work I'm doing in Mononoke, so update.

Reviewed By: dtolnay

Differential Revision: D22922790

fbshipit-source-id: 7bb68589a1d187393e7de52635096acaf6e48b7e
2020-08-04 18:12:45 -07:00
Kostia Balytskyi
c8e3c27a65 megarepo: test invisible merge e2e
Reviewed By: StanislavGlebik

Differential Revision: D22924237

fbshipit-source-id: ba13d610c26c1b0be4f4afa75de93568359457c6
2020-08-04 12:21:13 -07:00
Stefan Filip
7392392a33 server: add commit/location_to_hash path
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.

This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and response should have.

Reviewed By: kulshrax

Differential Revision: D22702016

fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
2020-08-04 11:22:39 -07:00
Stefan Filip
2f3e569120 mononoke_api: add segmented changelog location to hash translation
Summary:
This functionality is going to be used in EdenApi. The translation is required
to unblock removing the changelog from the local copy of the repositories.
However the functionality is not going to be turned on in production just yet.

Reviewed By: kulshrax

Differential Revision: D22869062

fbshipit-source-id: 03a5a4ccc01dddf06ef3fb3a4266d2bfeaaa8bd2
2020-08-04 11:22:39 -07:00
Stefan Filip
4261013101 metaconfig: add segmented changelog config
Summary:
To start, the only configuration available is whether the functionality provided
by this component is available at all. By default the component
is going to be disabled for all repositories. We will enable it first for
bootstrapped repositories and, after additional tooling is added, for production
repositories.

Reviewed By: kulshrax

Differential Revision: D22869061

fbshipit-source-id: fbaed88f2f45e064c0ae1bc7762931bd780c8038
2020-08-04 11:22:39 -07:00
Santiago Alfonso Muñoz Rodriguez
007dc93916 Enumeration API for BlobStore keys
Summary:
- Enumerate API now provided via trait BlobstoreKeySource
- Implementation for Fileblob and ManifoldBlob
- Modified populate_healer to use new api
- Modified fixrepocontents to use new api

Reviewed By: ahornby

Differential Revision: D22763274

fbshipit-source-id: 8ee4503912bf40d4ac525114289a75d409ef3790
2020-08-04 06:54:18 -07:00
Alex Hornby
f7210430d9 mononoke: check whether to emit an edge earlier from the walker, remaining types
Summary: Update all the remaining steps in the walker to use the new early checks, so as to prune unnecessary edges earlier in the walk.

Reviewed By: farnz

Differential Revision: D22847412

fbshipit-source-id: 78c499a1870f97df7b641ee828fb8ec58303ebef
2020-08-04 06:47:38 -07:00
Alex Hornby
5fb309a7b2 mononoke: check whether to emit an edge from the walker earlier
Summary:
Check whether to emit an edge from the walker earlier to reduce vec allocation of unnecessary edges that would immediately be dropped in WalkVistor::visit.

The VisitOne trait is introduced as a simpler API to the Visitor that can be used to check if one edge needs to be visited, and the Checker struct in walk.rs is a helper around it that will only call the VisitOne API if necessary. Checker also takes on responsibility for respecting keep_edge_paths when returning paths, so that parameter has been removed for migrated steps.

To keep the diff size reasonable, this change has all the necessary Checker/VisitOne changes but only converts hg_manifest_step, with the remainder of the steps converted in the next diff in the stack. The TODOs marking unmigrated types as always-emit types are to be removed as part of converting the remaining steps.

Reviewed By: farnz

Differential Revision: D22864136

fbshipit-source-id: 431c3637634c6a02ab08662261b10815ea6ce293
2020-08-04 04:30:49 -07:00
Stanislau Hlebik
fe60eeff85 mononoke: megarepotool support for gradual merge
Summary:
This tool can be used in tandem with the pre_merge_delete tool to merge one large
repository into another in a controlled manner - the size of the working copy
will be increased gradually.

Reviewed By: ikostia

Differential Revision: D22894575

fbshipit-source-id: 0055d3e080c05f870cfd0026174365813b0eb253
2020-08-04 02:53:15 -07:00
Simon Farnsworth
f7e8931a56 Add a minimum successful writes count for MultiplexedBlobstore
Summary:
There are two reasons to want a write quorum:

1. One or more blobstores in the multiplex are experimental, and we don't want to accept a write unless the write is in a stable blobstore.
2. To reduce the risk of data loss if one blobstore loses data at a bad time.

Make it possible
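
A minimal Python sketch of a write quorum, under the assumption that each blobstore is just a callable that stores a blob or raises. `put_with_quorum`, `QuorumNotReached` and `minimum_successful_writes` are hypothetical names, not the MultiplexedBlobstore API:

```python
from typing import Callable, Dict

# Hypothetical blobstore handle: a callable that stores one blob or raises.
Put = Callable[[str, bytes], None]


class QuorumNotReached(Exception):
    """Raised when too few blobstores accepted the write."""


def put_with_quorum(
    blobstores: Dict[str, Put],
    key: str,
    value: bytes,
    minimum_successful_writes: int,
) -> int:
    """Accept the write only if enough blobstores stored the blob."""
    successes = 0
    for put in blobstores.values():
        try:
            put(key, value)
            successes += 1
        except Exception:
            # A failed store simply does not count towards the quorum.
            pass
    if successes < minimum_successful_writes:
        raise QuorumNotReached(
            f"{successes} of {len(blobstores)} writes succeeded for {key}, "
            f"need {minimum_successful_writes}"
        )
    return successes
```

For example, with three blobstores of which one is experimental, a minimum of 2 guarantees that at least one stable blobstore holds every accepted write.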

Reviewed By: krallin

Differential Revision: D22850261

fbshipit-source-id: ed87d71c909053867ea8b1e3a5467f3224663f6a
2020-08-04 02:45:38 -07:00
Jeremy Fitzhardinge
34760b5164 rust: 1.45.2 update
Summary: A couple of features stabilized, so drop their `#![feature(...)]` lines.

Reviewed By: eugeneoden, dtolnay

Differential Revision: D22912569

fbshipit-source-id: 5ffdc48adb1f57a1b845b1b611f34b8a7ceff216
2020-08-03 19:29:17 -07:00
Kostia Balytskyi
6824787241 library.sh: add absolute config paths everywhere
Summary:
In several places in `library.sh` we had `--mononoke-config-path
mononoke-config`. This ensured that we could not run such commands from
non-`$TESTTMP` directories. Let's fix that.

Reviewed By: StanislavGlebik

Differential Revision: D22901668

fbshipit-source-id: 657bce27ce6aee8a88efb550adc2ee5169d103fa
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
fe487f9e8b push_redirector: add contexts
Summary: The more contexts the better. Makes debugging errors much more pleasant.

Reviewed By: StanislavGlebik

Differential Revision: D22890940

fbshipit-source-id: 48f89031b4b5f9b15f69734d784969e2986b926d
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
b7f8a1b193 megarepotool: add bonsai merge
Summary:
An extremely thin wrapper around existing APIs: just a way to create merge commits from the command line.

This is needed to make the merge strategy work:

```
C
|
M3
| \
.  \
|   \
M2   \
| \   \
.  \   \
|   \   \
M1   \   \
| \   \   \
.  TM3 \   \
.  /    |  |
.  D3 TM2  |
.  | /    /
.  D2  TM1
.  |  /
.  D1
|   |
|    \
|    DAG to merge
|
main DAG
```

When we're creating `M2` as a result of merging `TM2` into the main DAG, some files are deleted in the `TM3` branch, but not deleted in the `TM2` branch. Executing the merge by running `hg merge` causes these files to be absent in `M2`. To make Mercurial work, we would need to execute `hg revert` for each such file prior to `hg merge`. Bonsai merge semantics, however, give us the correct behavior directly. Let's therefore just expose a way to create bonsai merges via the `megarepotool`.

Reviewed By: StanislavGlebik

Differential Revision: D22890787

fbshipit-source-id: 1508b3ede36f9b7414dc4d9fe9730c37456e2ef9
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
f9e410d965 megarepotool: add pre-merge-delete CLI
Summary:
This adds a CLI for the functionality added in the previous diff. In addition, this adds an integration test, which tests this deletion functionality.

The output of this tool is meant to be stored in a file. It simulates a simple DAG, and it should be fairly easy to automatically parse the "to-merge" commits out of this output. In theory, it could have been enough to just print the "to-merge" commits alone, but it felt like sometimes it may be convenient to quickly examine the delete commits.

Reviewed By: StanislavGlebik

Differential Revision: D22866930

fbshipit-source-id: 572b754225218d2889a3859bcb07900089b34e1c
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
1eb7cfe277 megarepolib: add pre-merge delete implementation
Summary:
This implements a new strategy of creating pre-merge delete commits.

As a reminder, the higher-level goal is to gradually merge two independent DAGs together. One of them is the main repo DAG, the other is an "import". It is assumed that the import DAG is already "moved", meaning that all files are at the right paths to be merged.

The strategy is as follows: create a stack of delete commits with gradually decreasing working copy size. Merge them into `master` in reverse order.
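
A rough Python sketch of that strategy, assuming the working copy is already split into chunks of paths. The function names are hypothetical and commits are modelled only as the set of files left in the working copy:

```python
from typing import FrozenSet, Iterable, List


def delete_stack(
    all_files: Iterable[str], chunks: List[List[str]]
) -> List[FrozenSet[str]]:
    """Return the working copy left after each delete commit in the stack."""
    remaining = set(all_files)
    stack = []
    for chunk in chunks:
        # Each delete commit removes one more chunk, so the working copy shrinks.
        remaining -= set(chunk)
        stack.append(frozenset(remaining))
    return stack


def merge_order(stack: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Merge the smallest working copy first, so the target grows gradually."""
    return list(reversed(stack))
```

With files `a`, `b`, `c`, `d` and chunks `[a]`, `[b]`, `[c]`, the stack leaves `{b, c, d}`, then `{c, d}`, then `{d}`, and the merges happen in the opposite order.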

Reviewed By: StanislavGlebik

Differential Revision: D22864996

fbshipit-source-id: bfc60836553c656b52ca04fe5f88cdb1f15b2c18
2020-08-03 11:32:35 -07:00
Simon Farnsworth
a5e9b79d7d Return all errors in the event of a multiplexed put failure
Summary:
With upcoming write quorum work, it'll be interesting to know all the failures that prevent a put from succeeding, not just the most recent, as the most recent may be from a blobstore whose reliability is not yet established.

Store and return all errors, so that we can see exactly why a put failed

Reviewed By: ahornby

Differential Revision: D22896745

fbshipit-source-id: a3627a04a46052357066d64135f9bf806b27b974
2020-08-03 09:30:05 -07:00
Kostia Balytskyi
48aa00ed92 megarepolib: implement chunker from hint string
Summary:
"Chunking hint" is a string (expected to be in a file) of the following format:
```
prefix1, prefix2, prefix3
prefix4,
prefix5, prefix6
```

Each line represents a single chunk: if a path starts with any of the prefixes in the line, it should belong to the corresponding chunk. Prefixes are comma-separated. Any path that does not start with any prefix in the hint goes to an extra chunk.
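
The format above could be parsed and applied roughly like this. This is an illustrative Python sketch, not the megarepolib chunker: the function names are hypothetical, prefixes are matched as plain strings, and the first matching line wins (both are assumptions):

```python
from typing import Iterable, List


def parse_chunking_hint(hint: str) -> List[List[str]]:
    """Turn the hint text into one list of prefixes per non-empty line."""
    chunks = []
    for line in hint.splitlines():
        # A trailing comma (as in "prefix4,") yields an empty item; drop it.
        prefixes = [p.strip() for p in line.split(",") if p.strip()]
        if prefixes:
            chunks.append(prefixes)
    return chunks


def chunk_paths(paths: Iterable[str], hint: str) -> List[List[str]]:
    """Assign each path to the chunk of the first line with a matching prefix."""
    prefix_chunks = parse_chunking_hint(hint)
    # One chunk per hint line, plus an extra chunk for unmatched paths.
    result: List[List[str]] = [[] for _ in range(len(prefix_chunks) + 1)]
    for path in paths:
        for index, prefixes in enumerate(prefix_chunks):
            if any(path.startswith(prefix) for prefix in prefixes):
                result[index].append(path)
                break
        else:
            result[-1].append(path)
    return result
```

With the three-line hint shown above, a path under `prefix6` lands in the third chunk and a path matching no prefix lands in the fourth, extra chunk.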

This hint will be used in a new pre-merge-delete approach, to be introduced further in the stack.

Reviewed By: StanislavGlebik

Differential Revision: D22864999

fbshipit-source-id: bbc87dc14618c603205510dd40ee5c80fa81f4c3
2020-08-03 08:44:15 -07:00
Kostia Balytskyi
1825ed96d3 megarepolib: delete obsolete pre_merge_deletes impl
Summary:
We need to use a different type of pre-merge deletes, it seems, as the one proposed requires a huge number of commits. Namely, if we have `T` files in total in the working copy and we're happy to delete at most `D` files per commit, while merging at most `S` files per deletion stack:
```
#stacks = T/S
#delete_commits_in_stack = (T-S)/D
#delete_commits_total = T/S * (T-S)/D = (T^2 - TS)/SD ~ T^2/SD

T ~= 3*10^6

If D~=10^4 and S~=10^4:
#delete_commits_total ~= 9*10^12 / 10^8 = 9*10^4

If D~=10^5 and S~=10^5:
#delete_commits_total ~= 9*10^12 / 10^10 = 9*10^2
```
```

So either 90K or 900 delete commits. 90K is clearly too big. 900 may be tolerable, but it's still hard to manage and make sense of. What's more, there seems to be a way to produce fewer of these, see further in the stack.
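
The estimate can be checked with a few lines of Python. This assumes that the per-stack size in the formulas is the same quantity as `S` (files merged per deletion stack); the function name is made up for illustration:

```python
from typing import Tuple


def delete_commit_counts(
    total_files: int, files_per_stack: int, deletes_per_commit: int
) -> Tuple[int, int, int]:
    """Return (#stacks, #delete commits per stack, #delete commits in total)."""
    stacks = total_files // files_per_stack
    # Each stack deletes everything except the slice it is going to merge.
    commits_per_stack = (total_files - files_per_stack) // deletes_per_commit
    return stacks, commits_per_stack, stacks * commits_per_stack


print(delete_commit_counts(3_000_000, 10_000, 10_000))    # (300, 299, 89700)
print(delete_commit_counts(3_000_000, 100_000, 100_000))  # (30, 29, 870)
```

The exact counts, 89700 and 870, match the rounded 9*10^4 and 9*10^2 above.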

Reviewed By: StanislavGlebik

Differential Revision: D22864998

fbshipit-source-id: e615613a34e0dc0d598f3178dde751e9d8cde4da
2020-08-03 08:27:16 -07:00
Simon Farnsworth
a9b8793d2d Add a write-mostly blobstore mode for populating blobstores
Summary:
We're going to add an SQL blobstore to our existing multiplex, which won't have all the blobs initially.

In order to populate it safely, we want to have normal operations filling it with the latest data, and then backfill from Manifold; once we're confident all the data is in here, we can switch to normal mode, and never have an excessive number of reads of blobs that we know aren't in the new blobstore.

Reviewed By: krallin

Differential Revision: D22820501

fbshipit-source-id: 5f1c78ad94136b97ae3ac273a83792ab9ac591a9
2020-08-03 04:36:19 -07:00
Viet Hung Nguyen
578207d0dc mononoke/repo_import: add hg sync checker
Summary:
Related diff: D22816538 (3abc4312af)

In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counter value related to hg_sync. If the counter value is greater than or equal to the log id, we can move the bookmark to the next batch of commits. Otherwise, we sleep, retry fetching the mutable_counter value and compare the two again.
mutable_counters is an sql table that can track bookmarks log update instances with a counter.
This diff adds the functionality to extract the mutable_counters value for hg_sync.

======================
SQL query fix:
In the previous diff (D22816538 (3abc4312af)) we didn't cover the case where we might not get an ID which should return None. This diff fixes this error.
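
The check-sleep-retry loop described above could look roughly like this in Python. It is a sketch, not the repo_import code: the two getters stand in for the SQL queries, and treating a missing log id as "nothing to wait for" is an assumption:

```python
import time
from typing import Callable, Optional


def wait_for_hg_sync(
    get_max_log_id: Callable[[], Optional[int]],
    get_hg_sync_counter: Callable[[], Optional[int]],
    sleep_secs: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until hg_sync has caught up; return how many checks were needed."""
    largest_log_id = get_max_log_id()
    if largest_log_id is None:
        # No rows in the update log: assumed to mean there is nothing to wait for.
        return 0
    attempts = 1
    while True:
        counter = get_hg_sync_counter()
        if counter is not None and counter >= largest_log_id:
            return attempts
        # hg_sync is still behind: sleep, then fetch the counter again.
        sleep(sleep_secs)
        attempts += 1
```

Passing `sleep` in as a parameter is only there to keep the sketch testable without real delays.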

Reviewed By: StanislavGlebik

Differential Revision: D22864223

fbshipit-source-id: f3690263b4eebfe151e50b01a13b0193009e3bfa
2020-08-03 04:01:27 -07:00
Alex Hornby
3bd5ec74b0 mononoke: remove unused stats from walker state
Summary: The walker had a couple of unused stats fields in state.rs. Remove them.

Reviewed By: farnz

Differential Revision: D22863812

fbshipit-source-id: effc37abe29fafb51cb1421ff4962c5414b69be1
2020-08-03 01:39:39 -07:00
Jeremy Fitzhardinge
6a2846b1ca rust: mem::replace without using return value is just an assignment
Summary: 1.45 onwards warns about this.

Reviewed By: dtolnay

Differential Revision: D22877852

fbshipit-source-id: 14286142593e84f1f996b05a9c061b4f6687d418
2020-07-31 18:38:35 -07:00
Alex Hornby
5f71745810 mononoke: fix flaky test test-walker-corpus.t
Summary:
This is expected to fix flakiness in test-walker-corpus.t

The problem was that if a FileContent node was reached via an Fsnode it did not have a path associated.  This is a race condition that I've not managed to reproduce locally, but I think is highly likely to be the reason for flaky failure on CI

Reviewed By: ikostia

Differential Revision: D22866956

fbshipit-source-id: ef10d92a8a93f57c3bf94b3ba16a954bf255e907
2020-07-31 10:22:34 -07:00
Liubov Dmitrieva
cc2b5c04ca improve authentication handling
Summary:
There have been lots of issues with user experience related to authentication
and its help messages.

Just one of them:
certs are configured to be used for authentication and they are invalid, but the `hg cloud auth`
command will provide a help message about the certs and then ask to copy and
paste a token from the code about interactive token obtaining.

Another issue: if certs are configured to be used, it was hard to
set up a token for Scm Daemon, which can still be on tokens even if cloud
sync uses certs.

Now it is possible with the `hg auth -t <token>` command.

Now it should be cleaner, and all the messages should be clearer as well.

Also certs related help message has been improved.

Also all tests were cleaned up from the authentication except for the main
test. This is to simplify the tests.

Reviewed By: mitrandir77

Differential Revision: D22866731

fbshipit-source-id: 61dd4bffa6fcba39107be743fb155be0970c4266
2020-07-31 10:16:59 -07:00
Lukas Piatkowski
417d61f4b6 mononoke/mononoke_x_repo_sync_job: make mononoke_x_repo_sync_job and related public (#40)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/40

Those tools are being used in some integration tests, make them public so that the tests might pass

Reviewed By: ikostia

Differential Revision: D22844813

fbshipit-source-id: 7b7f379c31a5b630c6ed48215e2791319e1c48d9
2020-07-31 09:02:33 -07:00
Lukas Piatkowski
e78c6d58c3 mononoke/integration tests: use C locale by default (#41)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/41

As of D22098359 (7f1588131b) the default locale used by integration tests is en_US.UTF-8, but as the comment in code mentions:
```
The en_US.UTF-8 locale doesn't behave the same on all systems and trying to run
commands like "sed" or "tr" on non-utf8 data will result in "Illegal byte
sequence" error.
That is why we are forcing the "C" locale.
```

Additionally I've changed the test-walker-throttle.t test to use "/bin/date" directly. Previously it was using "/usr/bin/date", but the "/bin/date" is a more standard path as it works on MacOS.

Reviewed By: krallin

Differential Revision: D22865007

fbshipit-source-id: afd1346e1753df84bcfc4cf88651813c06933f79
2020-07-31 09:02:33 -07:00
Lukas Piatkowski
203d186f68 mononoke/integration tests: remove test-gitimport-octopus.t from OSS tests
Summary: It fails now, unknown reason, will work on it later

Reviewed By: mitrandir77, ikostia

Differential Revision: D22865324

fbshipit-source-id: c0513bfa2ce9f6baffebff472053e8a5d889c9ba
2020-07-31 08:02:46 -07:00
Stanislau Hlebik
cd2a3fcf32 mononoke: add allow_bookmark_update_delay
Summary:
Follow up from D22819791.
We want to use bookmark update delay only in scs, so let's configure it this
way

Reviewed By: krallin

Differential Revision: D22847143

fbshipit-source-id: b863d7fa4bf861ffe5d53a6a2d5ec44e7f60eb1a
2020-07-31 03:09:24 -07:00
Stanislau Hlebik
43ac2a1c62 mononoke: use WarmBookmarkCache in repo_client
Summary:
This is the (almost) final diff to introduce WarmBookmarksCache in repo_client.
A lot of this code is to pass through the config value, but a few things I'd
like to point out:
1) Warm bookmark cache is enabled from config, but it can be killswitched using
a tunable.
2) WarmBookmarksCache in scs derives all derived data, but for repo_client I
decided to derive just hg changeset. The main motivation is to not change the
current behaviour, and to make mononoke server more resilient to failures in
other derived data types.
3) Note that WarmBookmarksCache doesn't obsolete SessionBookmarksCache that was
introduced earlier, but rather it complements it. If WarmBookmarksCache is
enabled, then SessionBookmarksCache reads the bookmarks from it and not from
db.
4) There's one exception in point #3 - if we just did a push then we read
bookmarks from db rather than from bookmarks cache (see
update_publishing_bookmarks_after_push() method). This is done intentionally -
after push is finished we want to return the latest updated bookmarks to the
client (because the client has just moved a bookmark after all!).
I'd argue that the current code is a bit sketchy already - it doesn't read from
master but from replica, which means we could still see outdated bookmarks.
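
Points 3 and 4 can be illustrated with a small Python sketch. Only the names `SessionBookmarksCache` and `update_publishing_bookmarks_after_push` come from the text above; the constructor arguments and method bodies are assumptions made for illustration, not the repo_client code:

```python
from typing import Callable, Dict, Optional

# Hypothetical loader type: returns a mapping of bookmark name to changeset id.
Loader = Callable[[], Dict[str, str]]


class SessionBookmarksCache:
    def __init__(self, read_from_db: Loader, warm_cache: Optional[Loader] = None):
        self._read_from_db = read_from_db
        self._warm_cache = warm_cache  # None means the warm cache is disabled
        self._cached: Optional[Dict[str, str]] = None

    def get_publishing_bookmarks(self) -> Dict[str, str]:
        """Serve bookmarks from the warm cache if enabled, else from the db."""
        if self._cached is None:
            source = self._warm_cache or self._read_from_db
            self._cached = dict(source())
        return self._cached

    def update_publishing_bookmarks_after_push(self) -> Dict[str, str]:
        """After a push, bypass the warm cache so the client sees its own move."""
        self._cached = dict(self._read_from_db())
        return self._cached
```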

Reviewed By: krallin

Differential Revision: D22820879

fbshipit-source-id: 64a0aa0311edf17ad4cb548993d1d841aa320958
2020-07-31 03:09:24 -07:00
Alex Hornby
ecb58ff8d7 mononoke: add cmdlib argument to control cachelib zstd compression
Summary:
Add a cmdlib argument to control cachelib zstd compression. The default behaviour is unchanged, in that the CachelibBlobstore will attempt compression when putting to the cache if the object is larger than the cachelib max size.

To make the cache behaviour more testable, this change also adds an option to do an eager put to cache without the spawn. The default remains to do a lazy fire and forget put into the cache with tokio::spawn.

The motivation for the change is that when running the walker the compression putting to cachelib can dominate CPU usage for part of the walk, so it's best to turn it off and let those items be uncached as the walker is unlikely to visit them again (it only revisits items that were not fully derived).

Reviewed By: StanislavGlebik

Differential Revision: D22797872

fbshipit-source-id: d05f63811e78597bf3874d7fd0e139b9268cf35d
2020-07-31 01:12:02 -07:00
Santiago Alfonso Muñoz Rodriguez
c32b31984f Resolve cmd line argument conflict on populate_healer
Summary: populate_healer would panic on launch because there were 2 arguments assigned to -d: debug and destination-blobstore-id

Reviewed By: StanislavGlebik

Differential Revision: D22843091

fbshipit-source-id: e300af85b4e9d4f757b4311f2b7d776f59c7527d
2020-07-31 00:17:43 -07:00
Jun Wu
b57b6f8705 changegroup: do not print 'adding changeset X' with --debug
Summary:
The debug print abuses the `linkmapper`. The Rust commit add logic does not
use `linkmapper`. So let's remove the debug message to be consistent with
the Rust logic.

Reviewed By: DurhamG

Differential Revision: D22657189

fbshipit-source-id: 2e92087dbb5bfce2f00711dcd62881aba64b0279
2020-07-30 20:32:35 -07:00
Jun Wu
26580d00af allow pulling with empty 'common' set
Summary:
The check does not practically work because the client sends `common=[null]`
if the common set is empty.

D22519582 changes the client-side logic to send `common=[]` instead of
`common=[null]` in such cases. Therefore remove the constraint to keep
tests passing. 13 tests depend on this change.

Reviewed By: StanislavGlebik

Differential Revision: D22612285

fbshipit-source-id: 48fbc94c6ab8112f0d7bae1e276f40c2edd47364
2020-07-30 20:00:41 -07:00
Arun Kulshreshtha
439dd2d495 gotham_ext: move client hostname lookup into gotham_ext
Summary: Move client hostname reverse DNS lookup from inside of the LFS server's `RequestContext` to an async method on `ClientIdentity`, allowing it to be used elsewhere. The behavior of `RequestContext::dispatch_post_request` should remain unchanged.

Reviewed By: krallin

Differential Revision: D22835610

fbshipit-source-id: 15c1183f64324f216bd639630396c9c6f19bcaaa
2020-07-30 10:27:35 -07:00
Arun Kulshreshtha
d691e06abd tests: allow multiple curl error codes in test-lfs-server-https.t
Summary: When a TLS connection fails due to a missing client certificate, the `curl` command may fail with either code 35 or 56 depending on the TLS version used. With TLS v1.3, the error is explicitly reported as a missing client certificate, whereas in TLS v1.2, it is reported as a generic handshake failure. This is because TLS v1.3 defines an explicit [`certificate_required`](https://tools.ietf.org/html/rfc8446#section-4.4.2.4) alert, which is [not present](https://github.com/openssl/openssl/issues/6804) in earlier TLS versions.

Reviewed By: krallin

Differential Revision: D22834527

fbshipit-source-id: a15d6a169d35ece6ed5a54b37b8ca9bbc506b3da
2020-07-30 10:27:35 -07:00
Stanislau Hlebik
ffa578ed1f mononoke: change warm bookmark cache to store BookmarkKind
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

We'd like to use WarmBookmarkCache in repo client, and to do that we need to be
able to tell Publishing and PullDefault bookmarks apart. Let's teach
WarmBookmarksCache about it.

Reviewed By: krallin

Differential Revision: D22812478

fbshipit-source-id: 2642be5c06155f0d896eeb47867534e600bbc535
2020-07-30 07:28:44 -07:00
Stanislau Hlebik
445994e44a mononoke: add method for creating publishing bookmarks
Summary:
This method will be used in the next diff to add a test, but it might be more
useful later as well.

Note that `update()` method in BookmarkTransaction already handles publishing bookmarks correctly

Reviewed By: farnz

Differential Revision: D22817143

fbshipit-source-id: 11cd7ba993c83b3c8bca778560af4a360f892b03
2020-07-30 07:28:43 -07:00
Stanislau Hlebik
8dcc48b90f mononoke: introduce SessionBookmarkCache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

The code for managing cached_publishing_bookmarks_maybe_stale was already a bit
tricky, and with WarmBookmarksCache introduction it would've gotten even worse.
Let's move this logic to a separate SessionBookmarkCache struct.

Reviewed By: krallin

Differential Revision: D22816708

fbshipit-source-id: 02a7e127ebc68504b8f1a7401beb063a031bc0f4
2020-07-30 07:28:43 -07:00
Lukas Piatkowski
9962321103 mononoke/regenerate_hg_filenodes: make regenerate_hg_filenodes public (#39)
Summary: Pull Request resolved: https://github.com/facebookexperimental/eden/pull/39

Reviewed By: krallin

Differential Revision: D22816308

fbshipit-source-id: e64b2b5f5b319814265fdb0129f2bce6b1a72a98
2020-07-30 06:50:54 -07:00
Lukas Piatkowski
4ccff9c2ef mononoke/megarepotool: make megarepotool public (#38)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/38

The tool is used in some integration tests, make it public so that the tests might pass

Reviewed By: ikostia

Differential Revision: D22815283

fbshipit-source-id: 76da92afb8f26f61ea4f3fb949044620a57cf5ed
2020-07-30 06:50:54 -07:00
Stanislau Hlebik
bca1052f78 mononoke: store publishing bookmarks in cache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.

The problem with large changesets is deriving hg changesets for them. It might take
a significant amount of time, and that means that all the clients are stuck waiting on
listkeys() or heads() call waiting for derivation. WarmBookmarksCache can help here by returning bookmarks
for which hg changesets were already derived.

This is the second refactoring to introduce WarmBookmarksCache.
Now let's cache not only pull default, but also publishing bookmarks. There are two reasons to do it:
1) (Less important) It simplifies the code slightly
2) (More important) Without this change 'heads()' fetches all bookmarks directly from BlobRepo thus
bypassing any caches that we might have. So in order to make WarmBookmarksCache useful we need to avoid
doing that.

Reviewed By: farnz

Differential Revision: D22816707

fbshipit-source-id: 9593426796b5263344bd29fe5a92451770dabdc6
2020-07-30 03:35:02 -07:00
Stanislau Hlebik
6941d0cfe9 mononoke: do not store bytes in pull_default bookmarks cache
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large commits.

This diff just does a small refactoring that makes introducing
WarmBookmarksCache easier. In particular, later in cached_pull_default_bookmarks_maybe_stale cache I'd like to store
not only PullDefault bookmarks, but also Publishing bookmarks so that both
listkeys() and heads() method could be served from this cache. In order to do
that we need to store not only bookmark name, but also bookmark kind (i.e. is
it Publishing or PullDefault).

To do that let's store the actual Bookmarks and hg changeset objects instead of
raw bytes.

Reviewed By: farnz

Differential Revision: D22816710

fbshipit-source-id: 6ec3af8fe365d767689e8f6552f9af24cbcd0cb9
2020-07-30 03:35:02 -07:00
Mateusz Kwapich
d1322c621d don't error out when path doesn't exist
Summary:
Most of our APIs throw an error when the path doesn't exist. I would like to
argue that's not the right choice for list_file_history.

Errors should only be returned in abnormal situations, and with the
`history_across_deletions` param there's no other easy way to check if the file
ever existed other than calling this API - so it's not abnormal to call
it with a path that doesn't exist in the repo.

Reviewed By: StanislavGlebik

Differential Revision: D22820263

fbshipit-source-id: 002bda2ef5ee9d6632259a333b7f3652cfb7aa6b
2020-07-30 03:25:01 -07:00
Viet Hung Nguyen
3abc4312af mononoke: add sql query to get max bookmark log id
Summary:
Added a new query function to get the largest log id from bookmarks_update_log.

In the repo_import tool, once we move a bookmark to reveal commits to users, we want to check if hg_sync has received the commits. To do this, we extract the largest log id from bookmarks_update_log to compare it with the mutable_counter value related to hg_sync. If the counter value is greater than or equal to the log id, we can move the bookmark to the next batch of commits.
Since this query wasn't implemented before, this diff adds this functionality.

Next step: add query for mutable_counter

Reviewed By: krallin

Differential Revision: D22816538

fbshipit-source-id: daaa4e5159d561e698c6e1874dd8822546c699c7
2020-07-30 03:23:08 -07:00
Lukas Piatkowski
db2f711159 mononoke/hg_sync_job: make mononoke_hg_sync_job public (#37)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/37

mononoke_hg_sync_job is used in integration tests, make it public

Reviewed By: krallin

Differential Revision: D22795881

fbshipit-source-id: 7a32c8e8adf723a49922dbb9e7723ab01c011e60
2020-07-30 02:52:56 -07:00
Lukas Piatkowski
0b5ac21f79 mononoke/backsyncer_cmd: make backsyncer_cmd public (#36)
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/36

This command is used in some integration tests, make it public.

Reviewed By: krallin

Differential Revision: D22792846

fbshipit-source-id: 39ac89b1a674ea63dc924cafa07107dbf8e5a098
2020-07-30 02:52:56 -07:00
Stanislau Hlebik
264a1493ca mononoke: fix a comment
Reviewed By: farnz

Differential Revision: D22816709

fbshipit-source-id: 7c338034bdfb835133eda12d23385fe432557868
2020-07-29 11:42:22 -07:00
Kostia Balytskyi
ff563aaf05 megarepolib: introduce stacked pre-merge deletes
Summary:
To gradually merge one repo into the other, we need to produce multiple slices of the working copy. The sum of these slices has to be equal to
the whole of the original repo's working copy. To create each of these slices all files but the ones in the slice need to be deleted from the working copy.
Before this diff, megarepolib would do this in a single delete commit. This however may be impractical, as it will produce huge commits, which we'll be unable
to process adequately. So this diff essentially introduces gradual deletion for each slice, and calls each slice "deletion stack". This is how it looks (a copy from the docstring):

```
  M1
  . \
  . D11
  .  |
  . D12
  .   |
  M2   \
  . \   |
  . D21 |
  .  |  |
  . D22 |
  .   | |
  o    \|
  |     |
  o    PM
  ^     ^
  |      \
main DAG   merged repo's DAG
```
Where:
 - `M1`, `M2` - merge commits, each of which merges only a chunk
   of the merged repo's DAG
 - `PM` is a pre-merge master of the merged repo's DAG
 - `D11`, `D12`, `D21` and `D22` are commits, which delete
   a chunk of working copy each. Delete commits are organized
   into delete stacks, so that `D11` and `D12` progressively delete
   more and more files.

Reviewed By: StanislavGlebik

Differential Revision: D22778907

fbshipit-source-id: ad0bc31f5901727b6df32f7950053ecdde6f599c
2020-07-28 09:43:32 -07:00
Viet Hung Nguyen
f1ef619284 mononoke/repo_import: add phabricator lag checker
Summary:
Once we start moving the bookmark across the imported commits (D22598159 (c5e880c239)), we need to check dependent systems so that we don't overload them while they parse the commits. In this diff we added the functionality to check Phabricator. We use an external service (jf graphql - find discussion here: https://fburl.com/nr1f19gs) to fetch commits from Phabricator. Each commit id starts with "r", followed by a call sign (e.g. FBS for fbsource) and the commit hash (https://fburl.com/qa/9pf0vtkk). If we try to fetch an invalid commit id (e.g. one without a call sign), we should receive an error. Otherwise, we should receive a JSON response.
An imported commit should have the following query result: https://fburl.com/graphiql/20txxvsn - `nodes` has one result with the `imported` field set to true.
If the commit hasn't been recognised by Phabricator yet, the `nodes` array will be empty.
If the commit has been recognised, but not yet parsed, the `imported` field will be false.
If Phabricator hasn't parsed the whole batch yet, we check it again after sleeping for a couple of seconds.
Once it has parsed the batch of commits, we move the bookmark to the next batch.
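
The check described above could look roughly like the sketch below. It is an illustration in Python, not the repo_import code: the top-level position of `nodes` in the response, the injected `fetch` callable, and the `delay_secs` and `max_attempts` parameters are assumptions; only the `nodes` array and the `imported` field come from the description.

```python
import time
from typing import Callable, Iterable, Mapping


def phabricator_commit_id(call_sign: str, commit_hash: str) -> str:
    """Build a commit id: "r", then the call sign, then the commit hash."""
    return f"r{call_sign}{commit_hash}"


def classify(response: Mapping) -> str:
    """Map a query result onto the three states from the description."""
    nodes = response.get("nodes", [])
    if not nodes:
        return "unrecognised"  # Phabricator has not seen the commit yet
    if nodes[0].get("imported"):
        return "imported"  # recognised and parsed
    return "not_parsed"  # recognised, but not parsed yet


def wait_until_parsed(
    commit_ids: Iterable[str],
    fetch: Callable[[str], Mapping],  # hypothetical stand-in for the GraphQL call
    sleep: Callable[[float], None] = time.sleep,
    delay_secs: float = 2.0,
    max_attempts: int = 30,
) -> bool:
    """Poll until every commit in the batch is imported, or give up."""
    pending = list(commit_ids)
    for _ in range(max_attempts):
        pending = [c for c in pending if classify(fetch(c)) != "imported"]
        if not pending:
            return True  # safe to move the bookmark to the next batch
        sleep(delay_secs)
    return False
```

Passing `fetch` and `sleep` in as parameters keeps the polling logic testable without a real Phabricator endpoint.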

Reviewed By: krallin

Differential Revision: D22762800

fbshipit-source-id: 5c02262923524793f364743e3e1b3f46c921db8d
2020-07-28 08:09:21 -07:00
Lukas Piatkowski
22f90df1db mononoke/integration tests: use a combination of kill and wait to kill a process
Summary: On MacOS, if you kill a process without waiting for it to exit, you will receive a warning on the terminal saying that the process was killed. To suppress that output, which is messing with the integration tests, use a combination of kill and wait (the custom "killandwait" bash function). It waits for the process to stop, which is probably what most integration tests would prefer to do anyway.
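
The actual helper is a bash function whose body is not shown here. As a rough Python analogue of the same kill-then-wait idea (the function name, the timeout and the fallback to a hard kill are assumptions for illustration only):

```python
import subprocess
import sys


def kill_and_wait(proc: subprocess.Popen, timeout_secs: float = 10.0) -> int:
    """Ask the process to terminate, then block until it has actually exited."""
    proc.terminate()
    try:
        return proc.wait(timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        # Assumed fallback: escalate to a hard kill if the process ignores
        # the termination request, then wait for it to be reaped.
        proc.kill()
        return proc.wait()


# Start a long-running child process and stop it deterministically.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = kill_and_wait(child)
```

Waiting after the kill means the caller only continues once the process is gone, so later test steps do not race with a process that is still shutting down.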

Reviewed By: krallin

Differential Revision: D22790485

fbshipit-source-id: d2a08a5e617e692967f8bd566e48f5f9b50cb94d
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
9db04f2daa mononoke/integration tests: use "date" command directly rather than via path
Summary: Using "/usr/bin/date" rather than just "date" is very limiting: not all systems have common command line tools installed in the same place. Just use "date".

Reviewed By: krallin

Differential Revision: D22762186

fbshipit-source-id: 747da5a388932fb5b9f4c068014c01ee90a91f9b
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
ec9be535eb mononoke/integration tests: use LC_ALL=C locale
Summary: On MacOS the default localisation configuration (UTF-8) won't allow some commands to operate on arbitrary bytes of data, because not every sequence of bytes is valid UTF-8. That is why, when handling arbitrary bytes, it is better to use the "C" locale, which can be achieved by setting the LC_ALL env variable to "C".
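
As an illustration of the same idea outside the test scripts, a command can be run with `LC_ALL=C` from Python as sketched below. The helper name and the use of `tr` are assumptions for the example, and `tr` is expected to be on `PATH`.

```python
import os
import subprocess
from typing import Sequence


def run_in_c_locale(argv: Sequence[str], data: bytes) -> bytes:
    """Run a command with LC_ALL=C so that it treats its input as plain bytes."""
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(argv, input=data, env=env, stdout=subprocess.PIPE, check=True)
    return result.stdout


# b"\xff" is not valid UTF-8, but it passes through untouched in the "C" locale.
output = run_in_c_locale(["tr", "a", "b"], b"\xffa\n")
```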

Reviewed By: krallin

Differential Revision: D22762189

fbshipit-source-id: aa917886c79fba5ea61ff7168767fc4b052a35a1
2020-07-28 08:02:52 -07:00
Lukas Piatkowski
16182e626b mononoke/integration tests: use newer bash version on MacOS GitHub CI runs
Summary: Use brew on MacOS GitHub CI runs to update bash from 3.* to 5.*.

Reviewed By: krallin

Differential Revision: D22762195

fbshipit-source-id: b3a4c9df7f8ed667e88b28aacf7d87c6881eb775
2020-07-28 08:02:52 -07:00