
183 Commits

Author SHA1 Message Date
Stanislau Hlebik
97cc687069 mononoke: add an option to disable leases in backfill_derive_data
Summary:
Let's by default not take a lease so that derived_data_tailer can make progress even if all other services are failing to derive.

One note - we don't remove the lease completely, but rather we use another lease that's separate from the lease used by other mononoke services. The motivation here is to make sure we don't derive unodes 4 times - blame, deleted_file_manifest and fastlog all want to derive unodes, and with no lease at all they would just all derive the same data a few times. Deriving unodes a few times seems undesirable, so I suggest using an InProcessLease instead of no lease.
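
The idea of a lease that only deduplicates work inside one process can be sketched as follows. This is a minimal illustration under assumptions, not the actual `InProcessLease` implementation: the type name `InProcessLeaseSketch`, its methods and the key format are all hypothetical.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical stand-in for an in-process lease: it only tracks keys
/// held by the current process and never talks to a shared service.
struct InProcessLeaseSketch {
    held: Mutex<HashSet<String>>,
}

impl InProcessLeaseSketch {
    fn new() -> Self {
        InProcessLeaseSketch {
            held: Mutex::new(HashSet::new()),
        }
    }

    /// Returns true if the caller now holds the lease for `key`,
    /// false if another task in this process already holds it.
    fn try_acquire(&self, key: &str) -> bool {
        self.held.lock().unwrap().insert(key.to_string())
    }

    /// Gives the lease back so that a later caller can take it.
    fn release(&self, key: &str) {
        self.held.lock().unwrap().remove(key);
    }
}
```

With something like this, blame, deleted_file_manifest and fastlog running in the same process would each call `try_acquire` with the same unodes key, and only the first caller would do the derivation while the others wait for its result.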

Reviewed By: krallin

Differential Revision: D22761222

fbshipit-source-id: 9595705d955f3bb2fe7efd649814fc74f9f45d54
2020-07-27 07:13:30 -07:00
Stanislau Hlebik
fd153acdef mononoke: make it possible to build sparse skiplist
Summary:
as in title.

Since we haven't tested it much yet, I've added a note that this feature is
experimental.

Reviewed By: krallin

Differential Revision: D22760648

fbshipit-source-id: 33f858b0021939dabbe1894b08bd495464ad0f63
2020-07-27 03:48:30 -07:00
Stanislau Hlebik
82c291010b mononoke: small refactoring of admin skiplist_subcommand
Summary:
Move changeset_fetcher building to a separate function, because
build_skiplist_index is already rather large and I'm going to make it larger in
the next diff

Reviewed By: krallin

Differential Revision: D22760556

fbshipit-source-id: 800baba052f46ed817f011f71dd28d40e98245fe
2020-07-27 03:48:30 -07:00
Stanislau Hlebik
4e252cbf2e mononoke: add --limit to backfill_derived_data
Summary: It's nice to have this flag available

Reviewed By: krallin

Differential Revision: D22693732

fbshipit-source-id: 9d0d8f44cb0f5f7263a33e86e9c5b8a9927c0c85
2020-07-23 13:33:16 -07:00
Kostia Balytskyi
55cb8dba22 admin: asyncify crossrepo subcommand
Summary: This makes crossrepo subcommand not import `BoxFuture`

Reviewed By: StanislavGlebik

Differential Revision: D22647197

fbshipit-source-id: ea2dd20039a8aaf96be0483cc25f3fad38d262f5
2020-07-22 09:17:43 -07:00
Stanislau Hlebik
aaafd7a707 mononoke: allow specifying list to redact in a file
Summary:
For large lists it's much more convenient to specify them in a file - we are
not limited by cmd line size limit.

Reviewed By: krallin

Differential Revision: D22595023

fbshipit-source-id: 93035208700f981453eaf98f84341a86f2f1c04d
2020-07-22 07:37:36 -07:00
Kostia Balytskyi
1e5a0dc4db admin: add crossrepo config subcommand
Summary: This is to be able to inspect `LiveCommitSyncConfig` from our admin tooling.

Reviewed By: StanislavGlebik

Differential Revision: D22497065

fbshipit-source-id: 3070890b7dc2a4075a5c15aca703494e33ee6530
2020-07-22 07:34:59 -07:00
Stanislau Hlebik
c6ff6a0216 mononoke: add a command to verify manifests
Summary:
We have three different types of manifests that store file type and content -
hg manifests, fsnodes and unodes.

Let's add a command that verifies that these manifests are consistent.

There's some copy-paste in the code when listing manifests (e.g. list_fsnodes,
list_unodes etc. are quite similar). There might be a way to have less
copy-paste, but given that each of the functions has some small differences it
doesn't really seem worth it.

Reviewed By: krallin

Differential Revision: D22663631

fbshipit-source-id: 487be8611df218472cec1899f34367906794484b
2020-07-22 07:25:31 -07:00
Kostia Balytskyi
be29e70289 admin: report version in crossrepo map
Summary: What the title says.

Reviewed By: farnz

Differential Revision: D22476423

fbshipit-source-id: 5d6781fc09e3f89c1d555787821770d6b131c9f2
2020-07-21 09:09:23 -07:00
Stanislau Hlebik
9d18c46b1f remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:16:13 -07:00
Stanislau Hlebik
3665548bb0 remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:16:13 -07:00
Stanislau Hlebik
23390ee238 mononoke: use bulkops in skiplist builder
Summary:
We already have a function to fetch all public changesets - let's use it
instead of re-implementing it.

The small caveat is that function in skiplist subcommand fetched all the
changesets (i.e. public and draft), so using bulkops function looks like a change in
behaviour. However it's actually the same - we index only public changesets for
skiplists anyway.

Reviewed By: krallin

Differential Revision: D22499940

fbshipit-source-id: ac8ad7d2b6ff0208e830a344877d7d2e93693abc
2020-07-13 15:17:35 -07:00
Lukas Piatkowski
a41db27baf mononoke/blobstore_healer: make it OSS buildable
Reviewed By: farnz

Differential Revision: D22460549

fbshipit-source-id: aa5327f5dae1008cee784d41e322034cd0bb5b61
2020-07-13 03:02:34 -07:00
Lukas Piatkowski
6b9637bbac mononoke/blobimport: make it OSS buildable
Reviewed By: krallin

Differential Revision: D22455491

fbshipit-source-id: 919ba0e4fc759ef25546eacf30200ff19cd89466
2020-07-13 03:02:34 -07:00
Stanislau Hlebik
64740aafce mononoke: asyncify build_skiplist_index subcommand
Reviewed By: farnz

Differential Revision: D22480533

fbshipit-source-id: af6bf14998fe38c7dd6655a51addeb41fbc7aa3b
2020-07-12 03:21:20 -07:00
Simon Farnsworth
81e65f5bcc Fully asyncify the blobstore healer
Summary:
As part of modernising MultiplexedBlobstore, I want to fully asyncify the blobstore_sync_queue; that means I need this fully asyncified.

Fully asyncify everything but the bits that interact with blobstore_sync_queue; those have to wait for MultiplexedBlobstore to be asyncified

End goal is to reduce the number of healer overloads, by adding a mode of operation in which writes (e.g. from backfills or derived data) can avoid a sync queue write when all blobstores are working

Reviewed By: StanislavGlebik

Differential Revision: D22460059

fbshipit-source-id: 5792c4a8daf17ffe99a04d792792f568c40fde37
2020-07-11 05:41:36 -07:00
Simon Farnsworth
9287bfca2c Move blobstore healer tests to their own file
Summary: I'm about to asyncify the healer - move 2/3rds of the file content (tests) into their own file.

Reviewed By: ikostia

Differential Revision: D22460166

fbshipit-source-id: 18c0dde5f582c4c7006e3f023816ac457d38234b
2020-07-11 05:41:36 -07:00
Lukas Piatkowski
c5f79f3668 mononoke/benchmark_filestore: make it OSS buildable
Reviewed By: krallin

Differential Revision: D22475133

fbshipit-source-id: c14bf4f0811e8c2f1cf31416bf88f378caf50be3
2020-07-10 22:12:40 -07:00
Simon Farnsworth
78847ff88c Make BlobstoreSyncQueue use new futures
Summary: Stage 1 of a migration - next step is to make all users of this trait use new futures, and then I can come back, add lifetimes and references, and leave it modernised

Reviewed By: StanislavGlebik

Differential Revision: D22460164

fbshipit-source-id: 94591183912c0b006b7bcd7388a3d7c296e60577
2020-07-10 06:43:13 -07:00
Mark Thomas
a51d164892 admin: increase type_length_limit
Reviewed By: ikostia

Differential Revision: D22476055

fbshipit-source-id: 1df7556a5cf774744b26f09e3ed681cceb30c617
2020-07-10 05:55:06 -07:00
Mark Thomas
fb5fdb9c15 bookmarks: remove repo_id from Bookmarks methods
Summary:
Remove the `repo_id` parameter from the `Bookmarks` trait methods.

The `repo_id` parameter was intended to allow a single `Bookmarks` implementation
to serve multiple repos.  In practice, however, each repo has its own config, which
results in a separate `Bookmarks` instance for each repo.  The `repo_id` parameter
complicates the API and provides no benefit.

To make this work, we switch to the `Builder` pattern for `SqlBookmarks`, which
allows us to inject the `repo_id` at construction time.  In fact nothing here
prevents us from adding back-end sharing later on, as these `SqlBookmarks` objects
are free to share data in their implementation.
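
A rough sketch of what injecting `repo_id` at construction time looks like. The types below are simplified stand-ins written for illustration; the constructor, the `connection` field and the `describe_get` method are assumptions and do not reflect the real `SqlBookmarks` API.

```rust
/// Simplified stand-in for a repository identifier.
#[derive(Clone, Copy, Debug, PartialEq)]
struct RepositoryId(i32);

/// Hypothetical builder: holds the shared connection details only.
struct SqlBookmarksBuilder {
    connection: String,
}

/// Hypothetical per-repo object: the repo is fixed once it is built,
/// so its methods no longer need a `repo_id` parameter.
struct SqlBookmarks {
    connection: String,
    repo_id: RepositoryId,
}

impl SqlBookmarksBuilder {
    fn new(connection: &str) -> Self {
        SqlBookmarksBuilder {
            connection: connection.to_string(),
        }
    }

    /// Consumes the builder and binds the result to one repo.
    fn with_repo_id(self, repo_id: RepositoryId) -> SqlBookmarks {
        SqlBookmarks {
            connection: self.connection,
            repo_id,
        }
    }
}

impl SqlBookmarks {
    fn repo_id(&self) -> RepositoryId {
        self.repo_id
    }

    /// Describes a lookup; note that the caller passes no `repo_id`.
    fn describe_get(&self, name: &str) -> String {
        format!("repo {} via {}: get {}", self.repo_id.0, self.connection, name)
    }
}
```

For example, `SqlBookmarksBuilder::new("sqlite::memory").with_repo_id(RepositoryId(7))` yields an object whose every call is implicitly scoped to repo 7.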

Reviewed By: StanislavGlebik

Differential Revision: D22437089

fbshipit-source-id: d20e08ce6313108b74912683c620d25d6bf7ca01
2020-07-10 04:50:25 -07:00
Mark Thomas
3afceb0e2c bookmarks: extract BundleReplayData from BookmarkUpdateReason
Summary:
Separate out the `BundleReplayData` from the `BookmarkUpdateReason` enum.  There's
no real need for this to be part of the reason, and removing it means we can
abstract away the remaining dependency on Mercurial changeset IDs from
the main bookmarks traits.

Reviewed By: mitrandir77, ikostia

Differential Revision: D22417659

fbshipit-source-id: c8e5af7ba57d10a90c86437b59c0d48e587e730e
2020-07-10 04:50:24 -07:00
Simon Farnsworth
65e7404eba Command to manually scrub keys supplied on stdin
Summary: For populating the XDB blobstore, we'd like to copy data from Manifold - the easiest way to do that is to exploit MultiplexedBlobstore's scrub mode to copy data directly.

Reviewed By: krallin

Differential Revision: D22373838

fbshipit-source-id: 550a9c73e79059380337fa35ac94fe1134378196
2020-07-10 01:01:05 -07:00
Arun Kulshreshtha
5f0181f48c Regenerate all Cargo.tomls after upgrade to futures 0.3.5
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.

Reviewed By: dtolnay

Differential Revision: D22403809

fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
2020-07-06 20:49:43 -07:00
Thomas Orozco
7e8c9174be mononoke/admin: add a filestore fetch subcommand
Summary:
Sometimes you want to fetch a file. Using curl and the LFS server works, but
this really should be part of Mononoke admin.

Reviewed By: ikostia

Differential Revision: D22397472

fbshipit-source-id: 17decf4aa2017a2c1be52605a254692f293d1bcd
2020-07-06 14:56:08 -07:00
Thomas Orozco
46def15c4f mononoke/admin: fix filestore store subcommand
Summary:
This got broken when we moved to Tokio 0.2. Let's fix it and add a test to make
sure it does not regress.

Reviewed By: ikostia

Differential Revision: D22396261

fbshipit-source-id: a8359aee33b4d6d840581f57f91af6c03125fd6a
2020-07-06 14:56:08 -07:00
Kostia Balytskyi
f223ca6e6e synced commit mapping: expose version in get query
Summary:
When we look up how a commit was synced, we frequently need to know which version of `CommitSyncConfig` was used to sync it. Specifically, this is useful for admin tooling and commit validator, which I am planning to migrate to use versioned `CommitSyncConfig` in the near future.

Later I will also include this information in the `RewrittenAs` variant of `CommitSyncOutcome`, so that we expose it to real users. I did not do it in this diff to keep it small and easy to review. And because the other part is not ready :P

Reviewed By: StanislavGlebik

Differential Revision: D22255785

fbshipit-source-id: 4312e9b75e2c5f92ba018ff9ed9149efd3e7b7bc
2020-07-06 11:23:31 -07:00
Stanislau Hlebik
2a732f2626 mononoke: pass BlobRepo in fetch_full_file_content
Summary:
In the next diffs we'll need to read override_blame_filesize_limit from derived
data config, and this config is stored in BlobRepo.

This diff makes a small refactoring to pass BlobRepo to fetch_full_file_content.

Reviewed By: krallin

Differential Revision: D22373946

fbshipit-source-id: b209abce82c0279d41173b5b25f6761659a92f3d
2020-07-03 09:58:46 -07:00
Stanislau Hlebik
b703f11685 mononoke: asyncify fetch_full_file_content
Summary: This will make adding the blame file size limit override in the next diffs easier

Reviewed By: krallin

Differential Revision: D22373945

fbshipit-source-id: 4857e43c5d80596340878753ea90bf31d7bb3367
2020-07-03 09:58:46 -07:00
Thomas Orozco
be1bac6c06 mononoke/virtually_sharded_blobstore: expose this in cmdlib
Summary:
Eventually, I plan to make this the default, but for now I'd like to make it
something we can choose to turn on or off as a cmd argument (so we can start
with the experimental tier and Fastreplay).

Note that this mixes volatile vs. non-volatile pools when accessing the pools
for cacheblob. In practice, those pools are actually volatile, it's just that
things don't break if you access them as non-volatile.

Reviewed By: farnz

Differential Revision: D22356537

fbshipit-source-id: 53071b6b21ca5727d422e10f685061c709114ae7
2020-07-03 05:53:11 -07:00
Kostia Balytskyi
f210326656 blobstore_healer: log the speed with which queue rows are deleted
Summary: This allowed me to compare two alternative approaches to queue draining, and generally seems like a useful thing to do.

Reviewed By: krallin

Differential Revision: D22364733

fbshipit-source-id: b6c76295c85b4dec6f0bfd7107c30bb4e4a28942
2020-07-03 05:09:56 -07:00
Stanislau Hlebik
2d24ddf2e1 mononoke: add --all-types to backfill_derive_data single
Summary: It's useful to derive all enabled derived data at once

Reviewed By: krallin

Differential Revision: D22336338

fbshipit-source-id: 54bc27ab2c23c175913fc02e6bf05d18a54c249c
2020-07-03 00:20:58 -07:00
Mark Thomas
3e4e59baef bookmarks: add 'pagination' filter to 'list'
Summary:
Add a new parameter, `pagination`, to the `list` method of the `Bookmarks` trait.

This restricts the returned bookmarks to those lexicographically after the
given bookmark name (exclusive).  This can be used to implement pagination:
callers can provide the last bookmark in the previous page to fetch the
next page of bookmarks.
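
The pagination scheme described above can be sketched as follows. This is an illustrative stand-alone function over a sorted slice of names, not the `Bookmarks::list` method itself; `list_page` and its parameters are hypothetical.

```rust
/// Returns up to `limit` names that sort strictly after `after`.
/// `sorted` must already be in lexicographic order; `after = None`
/// requests the first page.
fn list_page<'a>(sorted: &'a [&'a str], after: Option<&str>, limit: usize) -> Vec<&'a str> {
    sorted
        .iter()
        .copied()
        // Exclusive bound: the name passed as `after` is not returned again.
        .filter(|name| after.map_or(true, |a| *name > a))
        .take(limit)
        .collect()
}
```

A caller would fetch the first page with `list_page(&names, None, 2)`, then pass the last name of that page as `after` to get the next one, and stop once an empty page comes back.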

Reviewed By: krallin

Differential Revision: D22333943

fbshipit-source-id: 686df545020d936095e29ae5fee24258511f4083
2020-07-02 07:53:12 -07:00
Mark Thomas
742eb6f829 bookmarks: rework Bookmarks traits
Summary:
Rework the bookmarks traits:

* Split out log functions into a separate `BookmarkUpdateLog` trait.  The cache doesn't care about these methods.

* Simplify `list` down to a single method with appropriate filtering parameters.  We want to add more filtering types, and adding more methods for each possible combination will be messier.

* The `Bookmarks` and `BookmarkUpdateLog` traits become `attributes` on `BlobRepo`, rather than being a named member.

Reorganise the bookmarks crate to separate out the bookmarks log and transactions into their own modules.

Reviewed By: krallin

Differential Revision: D22307781

fbshipit-source-id: 4fe514df8b7ef92ed3def80b21a16e196d916c64
2020-07-02 07:53:12 -07:00
Kostia Balytskyi
b134a2f5bb blobstore_healer: fix how replication lag is monitored
Summary: We were monitoring the wrong lag so far.

Reviewed By: farnz

Differential Revision: D22356455

fbshipit-source-id: abe41a4154c2a8d53befed4760e2e9544797c845
2020-07-02 06:18:35 -07:00
Stefan Filip
bf61eb5c64 mononoke: add trait ReplicaLagMonitor
Summary:
ReplicaLagMonitor is aimed to generalize over different strategies of fetching
the replication lag in a SQL database. Querying a set of connections is one
such strategy.

Reviewed By: ikostia

Differential Revision: D22104348

fbshipit-source-id: bbbeccb55a664e60b3c14ee17f404982d09f2b25
2020-07-01 18:18:55 -07:00
Kostia Balytskyi
a40d9bb264 blobstore healer: improve incomplete batch identification logic
Summary:
Blobstore healer has logic that prevents it from doing busy work when the
queue is empty. This is implemented by checking whether the DB query
fetched the whole `LIMIT` of values. Or that is the idea, at least. In
practice, here's what happens:

1. DB query is a nested one: first it gets at most `LIMIT` distinct
`operation_key` entries, then it gets all rows with such entries. In practice
this almost always means `# of blobstores * LIMIT` rows, as we almost always
succeed writing to every blobstore
2. Once this query is done, the rows are grouped by the `blobstore_key`, and a
future is created for each such row (for simplicity, ignore that future may not
be created).
3. We then compare the number of created futures with `LIMIT` and report an
incomplete batch if the numbers are different.

This logic has a flaw: the same `blobstore_key` may be written multiple times with
different `operation_key` values. One example of this: `GitSha1` keys for
identical contents. When this happens, grouping from step 2 above will produce
fewer than `LIMIT` groups, and we'll end up sleeping for nothing.
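
The flaw can be reproduced with a small sketch. The `Row` struct, the function names and the sample keys below are made up for illustration, and the second function is only one plausible way to count against `LIMIT`, not necessarily how the actual fix is written.

```rust
use std::collections::HashSet;

/// One queue row; the field names mirror the description above,
/// but this is not the real table schema.
struct Row {
    operation_key: &'static str,
    blobstore_key: &'static str,
}

/// Old heuristic: group by `blobstore_key` and compare the number of
/// groups with `limit`. Fewer groups is treated as an incomplete batch.
fn looks_incomplete_by_blobstore_key(rows: &[Row], limit: usize) -> bool {
    let groups: HashSet<&str> = rows.iter().map(|r| r.blobstore_key).collect();
    groups.len() < limit
}

/// Alternative check: count distinct `operation_key` values, since that
/// is what the `LIMIT` in the inner query applies to.
fn looks_incomplete_by_operation_key(rows: &[Row], limit: usize) -> bool {
    let ops: HashSet<&str> = rows.iter().map(|r| r.operation_key).collect();
    ops.len() < limit
}

/// Three operations, two of which wrote the same `blobstore_key`.
fn sample_rows() -> Vec<Row> {
    vec![
        Row { operation_key: "op1", blobstore_key: "gitsha1.aaaa" },
        Row { operation_key: "op2", blobstore_key: "gitsha1.aaaa" },
        Row { operation_key: "op3", blobstore_key: "content.bbbb" },
    ]
}
```

With `limit = 3`, the sample rows form only two `blobstore_key` groups, so the old heuristic reports an incomplete batch even though the query returned a full `LIMIT` of operations.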

This is not a huge deal, but let's fix it anyway.

My fix also adds some strictly speaking unnecessary logging, but I found it
helpful during this investigation, so let's keep it.

The price of this change is two `unique_by` calls, each of which
allocates a temporary hash set [1] of size `LIMIT * len(blobstore_key) * #
blobstores` (and another one with `operation_key`). For `LIMIT=100_000`,
`len(blobstore_key)=255`, `# blobstores = 3` we have roughly 70 MB for the
larger one, which should be ok.

[1] https://docs.rs/itertools/0.9.0/itertools/trait.Itertools.html#method.unique

Reviewed By: ahornby

Differential Revision: D22293204

fbshipit-source-id: bafb7817359e2c867cf33c319a886653b974d43f
2020-07-01 02:08:54 -07:00
Mark Thomas
160936b732 bookmarks: convert to new-style BoxFutures and BoxStreams
Summary: Convert the bookmarks traits to use new-style `BoxFuture<'static>` and `BoxStream<'static>`.  This is a step along the path to full `async`/`await`.

Reviewed By: farnz

Differential Revision: D22244489

fbshipit-source-id: b1bcb65a6d9e63bc963d9faf106db61cd507e452
2020-06-30 02:37:34 -07:00
Simon Farnsworth
b1c85aaf4b Switch Blobstore to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch the remaining blobstore traits to new-style futures.

This just pushes the `.compat()` out to old-style futures, but it makes the move to non-'static lifetimes easier, as all the compile errors will relate to lifetime issues.

Reviewed By: krallin

Differential Revision: D22183228

fbshipit-source-id: 3fe3977f4469626f55cbf5636d17fff905039827
2020-06-26 03:54:42 -07:00
Kostia Balytskyi
ef87f564bc add newtype for CommitSyncConfigVersion
Summary:
This is to avoid passing `String` around. Will be useful in one of the next
diffs, where I add querying `LiveCommitSyncConfig` by versions.

Reviewed By: krallin

Differential Revision: D22243254

fbshipit-source-id: c3fa92b62ae32e06d7557ec486d211900ff3964f
2020-06-26 02:45:26 -07:00
Simon Farnsworth
454de31134 Switch Loadable and Storable interfaces to new-style futures
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.

This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.

Reviewed By: krallin

Differential Revision: D22164315

fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
2020-06-25 08:45:37 -07:00
Thomas Orozco
edf93f8676 mononoke/blobstore_healer: limit concurrency of healing
Summary: Let's not heal 10000 blobs in parallel, that's a little too much data.

Reviewed By: farnz

Differential Revision: D22186543

fbshipit-source-id: 939fb5bc83b283090e979ac5fe3efc96191826d3
2020-06-23 09:00:29 -07:00
Pavel Aslanov
d13768d768 move DangerousOverride into a separate crate blobrepo_override
Summary: DangerousOverride is moved into a separate crate. Not only is it usually not needed, but it was also introducing dependencies on the mercurial crate.

Reviewed By: StanislavGlebik

Differential Revision: D22115015

fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
2020-06-22 07:29:19 -07:00
Pavel Aslanov
ea79e79538 move all mercurial content generation logic to blobrepo_hg
Summary: Move all mercurial changeset generation logic to `blobrepo_hg`. This preliminary step is required to decouple BlobRepo from mercurial, and in later stages it will be moved to derived data infra once blobrepo is free of mercurial.

Reviewed By: StanislavGlebik

Differential Revision: D22089677

fbshipit-source-id: bca28dedda499f80899e729e4142e373d8bec0b8
2020-06-22 07:29:19 -07:00
Pavel Aslanov
905c8b213e move Filenodes to BlobRepo::attributes
Summary: Move `Filenodes` to `BlobRepo::attributes` as it is mercurial specific.

Reviewed By: ikostia

Differential Revision: D21662418

fbshipit-source-id: 87648a3e6fd7382437424df3ee60e1e582b6b958
2020-06-22 07:29:19 -07:00
Pavel Aslanov
a1f5e45a5a BlobRepoHg extension trait.
Summary: This diff introduces the `BlobRepoHg` extension trait for the `BlobRepo` object, which contains mercurial specific methods that were previously part of `BlobRepo`. This diff also starts moving some of the methods from BlobRepo to BlobRepoHg.

Reviewed By: ikostia

Differential Revision: D21659867

fbshipit-source-id: 1af992915a776f6f6e49b03e4156151741b2fca2
2020-06-22 07:29:19 -07:00
Jeremy Fitzhardinge
c530d32056 rust: clean up some warnings
Summary: Prep for 1.44 but also general cleanups.

Reviewed By: dtolnay

Differential Revision: D22024428

fbshipit-source-id: 8e1d39a1e78289129b38554674d3dbf80681f4c3
2020-06-15 16:50:40 -07:00
Viet Hung Nguyen
d622ecb06a mononoke/bonsai_git_mapping: add CoreContext to BonsaiGitMapping functions
Summary: Log sql accesses in bonsai_git_mapping: added (core: &CoreContext) arguments to BonsaiGitMapping trait functions. Increment counters in functions where we have sql requests (e.g. SQLReadsReplica). Updated tests covering BonsaiGitMapping with a mock context object.

Reviewed By: StanislavGlebik

Differential Revision: D22043831

fbshipit-source-id: d05b07e262a8b7494d2ae46d5660d1c0695619ae
2020-06-15 11:23:23 -07:00
Mateusz Kwapich
0071d1803c migrate add_node to async/await syntax
Reviewed By: StanislavGlebik

Differential Revision: D21951752

fbshipit-source-id: 7de23875e51196018227729966701a4980a7c89a
2020-06-15 08:14:35 -07:00
Stefan Filip
f586f46ca5 mononoke: move replication_lag utilities from blobstore_healer to sql_ext
Summary: I would like to use these utilities when building segmented changelog.

Reviewed By: krallin

Differential Revision: D21876432

fbshipit-source-id: 9022627e224bfcb155b47d696371d24e538e6f39
2020-06-11 12:38:50 -07:00