14 Commits

Author SHA1 Message Date
Stanislau Hlebik
ffd60c01b0 blobimport: do not use batch_derive
Summary:
batch_derive() is a dangerous function to use. I'd love to delete it but this
function is very useful for backfilling, so unfortunately I can't.

The problem arises when one tries to backfill blame and unodes simultaneously
(or just derive blame, which in turn derives unodes). While batch_derive()
tries to be careful with inserting "outer" derived data's mappings (i.e. blame
mapping), it doesn't do it for inner derived data mappings (i.e. unodes). So we
might end up in a situation where we insert the unodes mapping before we have
inserted all the manifests for it. If derivation then fails midway, we are left
with corruption.

Let's not use it in blobimport. It will make derivation slower, but I'd
rather make it slower than incorrect.
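
The ordering problem described above can be illustrated with a toy model. This is only a sketch and not Mononoke code: `Store`, `write_blobs`, `derive_blobs_first` and `derive_mapping_first` are hypothetical names, and the "crash" is simulated with a `fail_after` parameter. It shows why inserting a mapping before all of its blobs are written can leave a dangling mapping entry if derivation fails midway.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-in for a blobstore plus a derived-data mapping table (hypothetical).
#[derive(Default)]
struct Store {
    /// Keys of blobs (e.g. manifests) that were actually written.
    blobs: HashSet<String>,
    /// Changeset ids that are recorded as "derived".
    mapping: HashSet<u64>,
}

impl Store {
    /// Every mapped changeset must have all of its required blobs present.
    fn is_consistent(&self, required: &HashMap<u64, Vec<String>>) -> bool {
        self.mapping.iter().all(|cs| {
            required
                .get(cs)
                .map_or(false, |keys| keys.iter().all(|k| self.blobs.contains(k)))
        })
    }
}

/// Writes blobs one by one; `fail_after` simulates a crash before the write at that index.
fn write_blobs(store: &mut Store, keys: &[String], fail_after: Option<usize>) -> Result<(), String> {
    for (i, key) in keys.iter().enumerate() {
        if fail_after == Some(i) {
            return Err(format!("simulated crash before writing {}", key));
        }
        store.blobs.insert(key.clone());
    }
    Ok(())
}

/// Safe order: the mapping is inserted only after every blob has been written.
fn derive_blobs_first(store: &mut Store, cs_id: u64, keys: &[String], fail_after: Option<usize>) -> Result<(), String> {
    write_blobs(store, keys, fail_after)?;
    store.mapping.insert(cs_id);
    Ok(())
}

/// Unsafe order: the mapping is inserted first, so a crash leaves it dangling.
fn derive_mapping_first(store: &mut Store, cs_id: u64, keys: &[String], fail_after: Option<usize>) -> Result<(), String> {
    store.mapping.insert(cs_id);
    write_blobs(store, keys, fail_after)
}
```

With a crash simulated partway through the writes, the blobs-first variant leaves no mapping behind, while the mapping-first variant leaves a mapping that points at missing blobs.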

Reviewed By: farnz

Differential Revision: D21905619

fbshipit-source-id: c0227df195a8cf4482b2452ca928acbc5750b3e5
2020-06-10 19:29:31 -07:00
Stefan Filip
aaac7bb066 mononoke: move fetch_all_public_changesets to the bulkops crate
Summary:
I want to reuse the functionality provided by `fetch_all_public_changesets`
in building Segmented Changelog. To share the code I am adding a new crate
intended to store utilities in dealing with bulk fetches.

Reviewed By: krallin

Differential Revision: D21471477

fbshipit-source-id: 609907c95b438504d3a0dee64ab5a8b8b3ab3f24
2020-05-13 16:53:16 -07:00
Stanislau Hlebik
5f8ab2526c mononoke: make sure commit is regenerated when backfill_derived_data single is
Summary:
subcommand_single calls `derived_data_utils.regenerate(vec![cs_id])` with the
intention that derived data for this commit will be regenerated. However
previously it didn't work because DerivedDataUtils::derive() was ignoring
regenerate parameter. This diff fixes it.

Reviewed By: krallin

Differential Revision: D21527344

fbshipit-source-id: 56d93135071a7f3789262b7a9d9ad84a0896c895
2020-05-13 03:27:46 -07:00
Stanislau Hlebik
50b71ac322 mononoke: log the oldest underived ancestor
Summary:
This diff logs the delay in deriving data. In particular it logs how much time
has passed since the oldest underived commit was created.

Note that this code makes an assumption about monotonic dates - for repos with
pushrebase that should be the case.
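
As a rough illustration of the logged value, the delay can be thought of as "now minus the creation time of the oldest underived commit". The sketch below is not the actual implementation: `Commit` and `oldest_underived_delay_secs` are hypothetical names, and timestamps are plain seconds.

```rust
/// Hypothetical, simplified view of a commit for this sketch.
struct Commit {
    /// Commit creation time, in seconds since the epoch.
    created_at_secs: i64,
    /// Whether derived data already exists for this commit.
    derived: bool,
}

/// Returns how long ago the oldest underived commit was created,
/// or `None` if everything is already derived.
///
/// Assumes commit dates are monotonic, so the smallest timestamp among
/// underived commits belongs to the oldest underived ancestor.
fn oldest_underived_delay_secs(commits: &[Commit], now_secs: i64) -> Option<i64> {
    commits
        .iter()
        .filter(|c| !c.derived)
        .map(|c| c.created_at_secs)
        .min()
        .map(|oldest| now_secs - oldest)
}
```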

Reviewed By: krallin

Differential Revision: D21427265

fbshipit-source-id: bfddf594467dfd2424f711f895275fb54a4e1c60
2020-05-08 07:47:19 -07:00
Stanislau Hlebik
503d4003af mononoke: simplify subcommand_tail
Summary:
Two things will be simplified:
1) Do not pass sqlbookmarks, we can always get them from blobrep
2) Instead of passing repo per derived data type let's just always pass
unredacted repo

Add a very simple unittest

Differential Revision: D21426885

fbshipit-source-id: 712ef23340466438bf34a086517f7ba33d4eabed
2020-05-08 07:47:18 -07:00
Stanislau Hlebik
864a9bc991 mononoke: remove pending_heads
Summary: The alarm was already removed in D21425313

Reviewed By: krallin

Differential Revision: D21425971

fbshipit-source-id: d043e1393e497bdf29f28d224d7e710b6beaa8f8
2020-05-06 07:55:04 -07:00
Mistral Orhan Jean-Pierre Contrastin
5fe820dd06 Expose ctime from Blobstore::get() in mononoke
Summary:
- Change get return value for `Blobstore` from `BlobstoreBytes` to `BlobstoreGetData` which includes `ctime` metadata
- Update the call sites and tests broken due to this change
- Change `ScrubHandler::on_repair` to accept metadata and log ctime
- `Fileblob` and `Manifoldblob` attach the ctime metadata
- Tests for fileblob in `mononoke:blobstore-test` and integration test `test-walker-scrub-blobstore.t`
- Make cachelib based caching use `BlobstoreGetData`
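
The shape of the change listed above can be sketched with a toy in-memory store whose `get` returns the bytes together with a creation time. This is an assumption-laden illustration, not the real `Blobstore` trait: `GetData`, `MemBlobstore`, and their fields are hypothetical, and the real `BlobstoreGetData` layout may differ.

```rust
use std::collections::HashMap;

/// Hypothetical return type: blob bytes plus optional creation-time metadata.
#[derive(Clone, Debug, PartialEq)]
struct GetData {
    bytes: Vec<u8>,
    /// Creation time in seconds; `None` if the backend cannot provide it.
    ctime: Option<i64>,
}

/// Toy in-memory blobstore that records a ctime on every put.
#[derive(Default)]
struct MemBlobstore {
    data: HashMap<String, GetData>,
}

impl MemBlobstore {
    /// Stores the bytes and attaches the supplied timestamp as ctime.
    fn put(&mut self, key: &str, bytes: Vec<u8>, now_secs: i64) {
        let value = GetData { bytes, ctime: Some(now_secs) };
        self.data.insert(key.to_string(), value);
    }

    /// Returns bytes together with metadata instead of bare bytes.
    fn get(&self, key: &str) -> Option<GetData> {
        self.data.get(key).cloned()
    }
}
```

Callers that only need the payload can read the `bytes` field, while a scrub-style handler can additionally log `ctime`.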

Reviewed By: ahornby

Differential Revision: D21094023

fbshipit-source-id: dc597e888eac2098c0e50d06e80ee180b4f3e069
2020-05-06 00:55:07 -07:00
Stanislau Hlebik
6914d544d9 mononoke: read list of derived data to derive from the config
Summary:
Currently we need to specify which derived data we need to derive, however they
are already specified in the configerator configs. Let's just read it from
there.

That means that we no longer need to update tw spec to add new derived data types - we'll just need to add them to configerator and restart the backfiller.

Reviewed By: krallin

Differential Revision: D21378640

fbshipit-source-id: f97c3f0b8bb6dbd23d5a50f479ecfccbebd33897
2020-05-04 04:52:26 -07:00
Stanislau Hlebik
b28b879846 mononoke: small refactoring before introducing Cleaner for unodes
Summary:
In the next diffs I'd like to introduce cleaner for unodes. This diff just
moves a bunch of code around to make reviewing next diffs easier

Reviewed By: krallin

Differential Revision: D21226921

fbshipit-source-id: c9f9b37bf9b11f36f8fc070dfa293fd8e6025338
2020-04-24 10:52:58 -07:00
Stanislau Hlebik
403347ee10 mononoke: add dry-run mode for backfilling fsnodes
Summary:
This diff adds a special dry-run mode of backfilling (for now only fsnodes are
supported). It does so by keeping all derived data in memory (i.e. nothing is
written to blobstore) and periodically cleaning entries that can no longer
be referenced.

This mode can be useful to e.g. estimate size of derived data before actually
running the derivation.

Note that it requires --readonly-storage in order to make sure that we don't
accidentally write anything to e.g. mysql.
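
The dry-run idea described above can be sketched as an in-memory store that counts bytes and periodically drops entries that are no longer referenced. This is a minimal sketch under stated assumptions, not the actual backfiller code: `DryRunStore`, `put`, and `clean` are hypothetical names, and the set of live keys is supplied by the caller.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory store used instead of a real blobstore in dry-run mode.
#[derive(Default)]
struct DryRunStore {
    /// Blobs currently held in memory.
    blobs: HashMap<String, Vec<u8>>,
    /// Sum of the sizes of all values passed to `put`, kept as a size estimate.
    total_bytes: usize,
}

impl DryRunStore {
    /// Keeps the blob in memory only and adds its size to the running total.
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.total_bytes += value.len();
        self.blobs.insert(key.to_string(), value);
    }

    /// Drops every entry whose key is not in `live`; returns how many were removed.
    fn clean(&mut self, live: &HashSet<String>) -> usize {
        let before = self.blobs.len();
        self.blobs.retain(|key, _| live.contains(key));
        before - self.blobs.len()
    }
}
```

Cleaning bounds memory use, while `total_bytes` keeps growing and so can still serve as an estimate of how much data a real derivation would write.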

Reviewed By: ahornby

Differential Revision: D21088989

fbshipit-source-id: aeb299d5dd90a7da1e06a6be0b6d64b814bc7bde
2020-04-24 04:05:53 -07:00
Stanislau Hlebik
2a5cdfec02 mononoke: split warmup from backfill_derived_data
Summary: File is getting too large - let's split it

Reviewed By: farnz

Differential Revision: D21180807

fbshipit-source-id: 43f0af8e17ed9354a575b8f4dac6a9fe888e8b6f
2020-04-23 00:16:30 -07:00
Stanislau Hlebik
d34a940ab5 mononoke: explicitly enable derived data type
Summary: See comments for more details

Reviewed By: krallin

Differential Revision: D21088800

fbshipit-source-id: b4c187b5d4d476602e69d26d71d3fe1252fd78e6
2020-04-20 06:57:40 -07:00
Stanislau Hlebik
1e908ed410 mononoke: change stream::iter into a for loop
Summary: I think it's more readable this way

Reviewed By: krallin

Differential Revision: D21088598

fbshipit-source-id: 1608c250701ae6870094f0f61c0c2ce4e2c12ebf
2020-04-20 06:57:40 -07:00
Stanislau Hlebik
2d56b1d530 mononoke: move backfill derived data to a separate directory
Summary:
In the next few diffs I'm going to add more functionality to backfill derived
data. The file has grown quite big already, so I'd rather put this new
functionality in a separate file. This diff does the first step - it just moves
a file to a separate directory.

Reviewed By: farnz

Differential Revision: D21087813

fbshipit-source-id: 4a8e3eac4b8d478aa4ceca6bb55fa0d2973068ba
2020-04-20 06:57:39 -07:00