Commit Graph

1211 Commits

Author SHA1 Message Date
Alex Hornby
74f2e7affc mononoke: add context to blobstore_sync_queue get error handling
Summary: Add context to show the affected key if there are problems peeking a key.

Reviewed By: farnz

Differential Revision: D23003001

fbshipit-source-id: b46b7626257f49d6f11e80a561820e4b37a5d3b0
2020-08-11 02:52:44 -07:00
Alex Hornby
0a8c81c668 mononoke: walker state, check for visited before insert
Summary:
Now that the previous diff has pre-computed the hash value using EagerHashMemo, it's less expensive to try a read-lock-only get() first before committing to a write-lock-acquiring insert().

The combination of this and the previous diff moved WalkState::visit from dominating the CPU profile to not dominating it (path interning dominates now).
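
A minimal sketch of the check-before-insert idea, assuming a plain `RwLock<HashSet<u64>>` in place of the walker's DashMap-based state. The `Visited` type and its `record` method are hypothetical names for illustration, not the walker's actual API:

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical stand-in for the walker's visited set.
/// Assumption: keys are reduced to u64 for brevity.
pub struct Visited {
    seen: RwLock<HashSet<u64>>,
}

impl Visited {
    pub fn new() -> Self {
        Visited {
            seen: RwLock::new(HashSet::new()),
        }
    }

    /// Returns true if `key` had not been recorded before this call.
    pub fn record(&self, key: u64) -> bool {
        // Fast path: a shared read lock is enough to answer "already seen".
        // The read guard is a temporary and is released before the block runs.
        if self.seen.read().unwrap().contains(&key) {
            return false;
        }
        // Slow path: take the exclusive lock. `insert` re-checks membership,
        // so a writer that raced in between the two locks is still handled.
        self.seen.write().unwrap().insert(key)
    }
}
```

The read-only probe pays off when most lookups hit keys that are already present, since those calls never contend for the write lock.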

Reviewed By: krallin

Differential Revision: D22975881

fbshipit-source-id: 90b2be83282ee2095c517c0d4f13536ddadf6267
2020-08-11 02:52:43 -07:00
Alex Hornby
22add277f9 mononoke: update walker state to use eager hash memo
Summary:
DashMap takes the hash of its keys multiple times: once outside the lock, and then once or twice inside the lock depending on whether the key is present in the shard.

Pre-computing the hash value using EagerHashMemo means it's done only once and, more importantly, outside the lock.

To use EagerHashMemo one needs to supply the BuildHasher, so it's added as a struct member and the record method is made a member function.
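
The memoization idea can be sketched with the standard library alone. `HashMemo` below is a hypothetical wrapper written for illustration; it is not the real EagerHashMemo API, and the standard `RandomState` stands in for whatever `BuildHasher` the walker supplies:

```rust
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical memo wrapper: hashes the wrapped value once, at construction.
pub struct HashMemo<T> {
    value: T,
    hash: u64,
}

impl<T: Hash> HashMemo<T> {
    /// The caller supplies the BuildHasher, so the expensive hash is
    /// computed here, before any map lock is taken.
    pub fn new<S: BuildHasher>(value: T, build: &S) -> Self {
        let mut hasher = build.build_hasher();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        HashMemo { value, hash }
    }

    pub fn memoized_hash(&self) -> u64 {
        self.hash
    }
}

impl<T> Hash for HashMemo<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Replay the stored u64 instead of walking the value again.
        state.write_u64(self.hash);
    }
}

impl<T: PartialEq> PartialEq for HashMemo<T> {
    fn eq(&self, other: &Self) -> bool {
        self.value == other.value
    }
}

impl<T: Eq> Eq for HashMemo<T> {}
```

Any later hashing of the wrapper, including inside a map's lock, then costs a single `write_u64` rather than a full pass over the key.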

Reviewed By: farnz

Differential Revision: D22975878

fbshipit-source-id: c2ca362fdfe31e5dca329e6200029207427cd9a1
2020-08-11 02:52:43 -07:00
Stefan Filip
2825193931 edenapi: add /commit/revlog_data endpoint
Summary:
Matches the `getcommitdata` SSH endpoint.
This is going to be used to remove the requirement that client repositories
need to have all commits locally.

Reviewed By: krallin

Differential Revision: D22979458

fbshipit-source-id: 75d7265daf4e51d3b32d76aeac12207f553f8f61
2020-08-11 01:54:14 -07:00
Simon Farnsworth
3086b241c6 Give the blobstore healer a way to cope with temporary falls in MySQL capacity
Summary:
The query we use to select blobs to heal is naturally expensive, due to the use of a subquery. This means that finding the perfect queue limit is hard, and we depend on task restarts to handle brief overload of MySQL.

Give us a fast fall in batch size (halve on each failure), and slow climb back (10% climb on each success), and a random delay after each failure before retrying.
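
A rough sketch of the fall-fast, climb-slow policy described above. The `BatchSize` type, its bounds, and the method names are assumptions made for illustration; the random retry delay is left out because the standard library has no random number generator:

```rust
/// Hypothetical batch-size controller for the healer's selection query.
pub struct BatchSize {
    current: usize,
    min: usize,
    max: usize,
}

impl BatchSize {
    /// Starts at the configured maximum (treated as at least 1).
    pub fn new(max: usize) -> Self {
        let max = max.max(1);
        BatchSize {
            current: max,
            min: 1,
            max,
        }
    }

    pub fn current(&self) -> usize {
        self.current
    }

    /// Fast fall: halve after a failed query, never going below `min`.
    pub fn on_failure(&mut self) {
        self.current = (self.current / 2).max(self.min);
    }

    /// Slow climb: grow by 10% (at least 1) after a success, capped at `max`.
    pub fn on_success(&mut self) {
        let step = (self.current / 10).max(1);
        self.current = (self.current + step).min(self.max);
    }
}
```

Starting from 1000, two failures bring the batch size to 250, and the next success raises it to 275.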

Reviewed By: StanislavGlebik

Differential Revision: D23028518

fbshipit-source-id: f2909fe792280f81d604be99fabb8b714c1e6999
2020-08-10 15:24:13 -07:00
Stanislau Hlebik
9787a2c33a mononoke: add admin command to return filenodes for path
Summary: It's useful to debug filenodes

Reviewed By: krallin

Differential Revision: D23028528

fbshipit-source-id: 500fe2ad62a8e07498f46801c0c1523d1656ceeb
2020-08-10 11:13:01 -07:00
Stanislau Hlebik
bdd494b2ce mononoke: fix filenodes cache key
Summary:
`is_tree` wasn't part of the cache key, which means we could have returned
incorrect history if we had a file and a directory with the same name.

This diff fixes it.
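
To illustrate the kind of collision involved, here is a small sketch of a cache whose key carries the `is_tree` flag. `HistoryCacheKey` and `HistoryCache` are hypothetical types, not the actual filenodes cache:

```rust
use std::collections::HashMap;

/// Hypothetical cache key: the path alone is ambiguous, so the key also
/// records whether the entry is a tree (directory) or a file.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct HistoryCacheKey {
    pub path: String,
    pub is_tree: bool,
}

/// Hypothetical in-memory history cache, keyed by path and kind.
pub struct HistoryCache {
    entries: HashMap<HistoryCacheKey, Vec<String>>,
}

impl HistoryCache {
    pub fn new() -> Self {
        HistoryCache {
            entries: HashMap::new(),
        }
    }

    pub fn put(&mut self, path: &str, is_tree: bool, history: Vec<String>) {
        let key = HistoryCacheKey {
            path: path.to_string(),
            is_tree,
        };
        self.entries.insert(key, history);
    }

    pub fn get(&self, path: &str, is_tree: bool) -> Option<&Vec<String>> {
        let key = HistoryCacheKey {
            path: path.to_string(),
            is_tree,
        };
        self.entries.get(&key)
    }
}
```

With the flag in the key, a file `foo` and a directory `foo` occupy separate entries, so neither lookup can return the other's history.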

Reviewed By: krallin

Differential Revision: D23028527

fbshipit-source-id: 98a3b2028fa62231dfb570a76fb836374ce1eed0
2020-08-10 07:13:35 -07:00
Stanislau Hlebik
21e232ddaf mononoke: add init_tunables in fastreplay
Summary:
I noticed that fastreplay doesn't init tunables, and that means that it doesn't
get the updates, and more importantly it doesn't use default values of
tunables.

That doesn't look expected (but lmk if I'm wrong!)

Reviewed By: krallin

Differential Revision: D23027311

fbshipit-source-id: ee43d02457d2240ebeb1530c672cb3847bc3afd4
2020-08-10 03:55:41 -07:00
Alex Hornby
02b9979b21 rust: vendor dashmap 3.11.9
Summary: This has my into_key() PR https://github.com/xacrimon/dashmap/pull/91 merged so the patch pointing to my fork is also removed.

Reviewed By: farnz

Differential Revision: D22896911

fbshipit-source-id: 188d438ce2aa20cfb3c466a62227d1cd27625f74
2020-08-10 03:19:33 -07:00
Alex Hornby
a7ff2a0c34 rust: vendor ahash 0.4.4
Summary:
Vendor ahash 0.4.4. In tests I haven't found this update significant for mononoke walker performance, but we might as well be current now that I've tried it.

I have found that wrapping ahash in a memoizing hasher helps, but that is for another diff.

Reviewed By: farnz

Differential Revision: D22864635

fbshipit-source-id: 5019259273ae3bd2df95cdd18adceed895baf8f2
2020-08-07 05:34:01 -07:00
Alex Hornby
e0c6e249fe mononoke: add a non-thrift header to packblob so we can vary thrift protocol in future
Summary: Add a non-thrift header to packblob so we can vary thrift protocol in future.

Reviewed By: farnz

Differential Revision: D22953758

fbshipit-source-id: a114a350105e75cbe57f6c824295d863c723f32f
2020-08-07 03:43:56 -07:00
Stanislau Hlebik
be3c46e10d mononoke: add --find-latest-imported-rev-only mode to blobimport
Reviewed By: ikostia

Differential Revision: D22975677

fbshipit-source-id: d4322901a84b8d76ccdffab17421f32c8e7510eb
2020-08-06 13:08:50 -07:00
Jun Wu
6fd7a2e582 dag: use concrete error types
Summary:
This is more complex than previous libraries, mainly because `dag` defines APIs
(traits) used by other code, which might raise error types that `dag` itself is
not interested in. `BackendError::Other(anyhow::Error)` is currently used to
capture types that do not fit in `dag`'s predefined error types.
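
The catch-all pattern can be sketched as follows. This is only an illustration: the `Io` variant, the `wrap_foreign` helper, and the `DagResult` alias are hypothetical, and a boxed `std::error::Error` stands in for `anyhow::Error` to keep the example free of external crates:

```rust
use std::error::Error;
use std::fmt;

/// Sketch of a concrete error enum with a catch-all variant.
#[derive(Debug)]
pub enum BackendError {
    /// A failure the library itself understands (hypothetical variant).
    Io(std::io::Error),
    /// Anything raised by caller-supplied trait implementations.
    Other(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for BackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BackendError::Io(e) => write!(f, "io error: {}", e),
            BackendError::Other(e) => write!(f, "backend error: {}", e),
        }
    }
}

impl Error for BackendError {}

/// Hypothetical alias in the spirit of `dag::Result`.
pub type DagResult<T> = std::result::Result<T, BackendError>;

/// Wraps an error type the library has no dedicated variant for.
pub fn wrap_foreign<E: Error + Send + Sync + 'static>(e: E) -> BackendError {
    BackendError::Other(Box::new(e))
}
```

Callers implementing the library's traits can funnel their own error types through the catch-all variant, while the library keeps matching on the variants it defines.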

Reviewed By: sfilipco

Differential Revision: D22883865

fbshipit-source-id: 3699e14775f335620eec28faa9a05c3cc750e1d1
2020-08-06 12:31:57 -07:00
Jun Wu
8d0f48c4da dag: rename some anyhow::Result to dag::Result
Summary:
Prefix some `Result` with `dag::Result`. Since `dag::Result` is just
`anyhow::Result` for now, this does not change anything but makes
it more compatible with upcoming changes.

Reviewed By: sfilipco

Differential Revision: D22883864

fbshipit-source-id: 95a26897ed026f1bb8000b7caddeb461dcaad0e7
2020-08-06 12:31:57 -07:00
Mark Thomas
b56a1b5b2c scs_server: add repo_list_hg_manifest
Summary:
To allow EdenFS to get aux manifest data from Mononoke without needing to derive fsnodes, provide
a mechanism to list a manifest using the hg manifest id that returns the size and content hashes
of each of the files.

NOTE: this is temporary until the EdenAPI server is fully online and serving this data.

Reviewed By: krallin

Differential Revision: D22975967

fbshipit-source-id: 0a25da6d74534d42fc3b5f38ba3b72107b209681
2020-08-06 11:28:11 -07:00
Stanislau Hlebik
3bb6bddce8 mononoke: remove expect
Summary: Let's return normal error instead

Reviewed By: krallin

Differential Revision: D22976148

fbshipit-source-id: fd89dfa1949d4b5e3354aab7d93ca40d779a18ec
2020-08-06 08:00:33 -07:00
Stanislau Hlebik
37747550ef mononoke: open RevlogRepo once
Summary: Previously it was opened twice, even though there was no reason to do it.

Reviewed By: krallin

Differential Revision: D22976149

fbshipit-source-id: 426858da4548f1eaffe1d989e5424937af2583a5
2020-08-06 08:00:33 -07:00
Alex Hornby
e29db47009 mononoke: factor out walker hasher settings, take explicit ahash dependency
Summary:
Factor out the walker's state internals to BuildStateHasher and StateMap.

This change keeps the defaults the same using DashMap and ahash::RandomState and uses the same ahash version that DashMap defaults to internally.

This is in preparation for the next diff, where the ahash dependency is updated to 0.4.4. It seemed clearer not to combine the refactoring and the update of the hasher in the same diff.

Reviewed By: ikostia

Differential Revision: D22851585

fbshipit-source-id: 84fa0dc73ff9d32f88ad390243903812a4a48406
2020-08-06 06:27:22 -07:00
Alex Hornby
f07e0be8e3 mononoke: only emit NodeData from walker if required
Summary:
Only emit NodeData from the walker if required, to save some memory. Each of the walks can now specify which NodeData it is interested in observing in the output stream.

We still need to emit Some as part of the Option<NodeData> in the output stream, as it is used in things like the final count of loaded objects. Rather than stream over Option<Option<NodeData>>, we instead add a NodeData::NotRequired variant.
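
A small sketch of the sentinel-variant approach, under the assumption of a single payload variant. `FileContent`, `emit`, and `count_loaded` are hypothetical names used only to show the shape of the idea:

```rust
/// Sketch of the node payload. Only `NotRequired` is named in the summary
/// above; `FileContent` is a hypothetical payload-carrying variant.
#[derive(Debug, PartialEq)]
pub enum NodeData {
    FileContent(Vec<u8>),
    /// The node was loaded, but this walk did not ask for its data.
    NotRequired,
}

/// Emits the payload only when the walk asked for it. The result is always
/// `Some`, because the node was loaded either way.
pub fn emit(wanted: bool, payload: Vec<u8>) -> Option<NodeData> {
    Some(if wanted {
        NodeData::FileContent(payload)
    } else {
        // Drop the payload early to save memory.
        NodeData::NotRequired
    })
}

/// Counts loaded objects: every `Some` counts, including `NotRequired`.
pub fn count_loaded(stream: &[Option<NodeData>]) -> usize {
    stream.iter().filter(|n| n.is_some()).count()
}
```

The extra variant keeps the stream type a single `Option` while still letting consumers distinguish "loaded but not requested" from "not loaded".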

Reviewed By: markbt

Differential Revision: D22849831

fbshipit-source-id: ef212103ac2deb9d66b017b8febe233eb53c9ed3
2020-08-06 06:27:22 -07:00
Stanislau Hlebik
c0347c6baf mononoke: refactor verify_working_copy slightly
Summary:
Extract the verify_working_copy_inner function, which lets callers directly specify
the source/target repo, hash and movers. It can be useful to verify the equivalence of
two commits even if they are not in the commit equivalence mapping.

Reviewed By: krallin

Differential Revision: D22950840

fbshipit-source-id: ab30be7190e29db3343b846b48333d7c7339d043
2020-08-06 05:51:37 -07:00
Simon Farnsworth
0c3fe9b20f Fully asyncify blobstore sync queue
Summary: Move it from `'static` BoxFutures to async_trait and lifetimes

Reviewed By: markbt

Differential Revision: D22927171

fbshipit-source-id: 637a983fa6fa91d4cd1e73d822340cb08647c57d
2020-08-05 15:41:15 -07:00
David Tolnay
014f40209b Back out "rust: 1.45.2 update"
Summary:
This is a backout of D22912569 (34760b5164), which is breaking opt-clang-thinlto builds on platform007 (S206790).

Original commit changeset: 5ffdc48adb1f

Reviewed By: aaronabramov

Differential Revision: D22956288

fbshipit-source-id: 45940c288d6f10dfe5457d295c405b84314e6b21
2020-08-05 13:28:13 -07:00
Viet Hung Nguyen
f2ee103884 mononoke/repo_import: add more meaningful print outs and save hashes
Summary:
Added more logs when running the binary to be able to track the progress more easily.
Saved bonsai hashes into a file. In case we fail at deriving data types, we can still try to derive them manually with the saved hashes and avoid running the whole tool again.

Reviewed By: StanislavGlebik

Differential Revision: D22943309

fbshipit-source-id: e03a74207d76823f6a2a3d92a1e31929a39f39a5
2020-08-05 12:46:14 -07:00
Mark Thomas
cbd105a73e hook_tailer: reduce default concurrency to 20
Summary:
Large commits and many hooks can mean that checking 100 commits at a time overloads
the system. Reduce the default concurrency to something more reasonable.

While we're here, let's use the proper mechanism for default values in clap.

Reviewed By: ikostia

Differential Revision: D22945597

fbshipit-source-id: 0f0a086c3b74bec614ada44a66409c8d2b91fe69
2020-08-05 10:34:05 -07:00
Mark Thomas
e12728305c hook_tailer: make command line arguments consistent
Summary:
Argument names should be `snake_case`.  Long options should be `--kebab-case`.

Retain the old long options as aliases for compatibility.

Reviewed By: HarveyHunt

Differential Revision: D22945600

fbshipit-source-id: a290b3dc4d9908eb61b2f597f101b4abaf3a1c13
2020-08-05 10:34:05 -07:00
Mark Thomas
b2b895353f hook_tailer: add --exclude-merges to skip merge commits
Summary: Add `--exclude-merges` which will skip merge commits.

Reviewed By: HarveyHunt

Differential Revision: D22945598

fbshipit-source-id: 3c20cf049bbe15a975671e8792259b460356804a
2020-08-05 10:34:05 -07:00
Mark Thomas
57626bec98 hook_tailer: add --log-interval to log every N commits
Summary:
Add `--log-interval` to log every N commits, so that it can be seen to be
making progress in the logs.

The default is set to 500, which logs about once every 10 seconds on my devserver.

Reviewed By: HarveyHunt

Differential Revision: D22945599

fbshipit-source-id: 7fc09b907793ea637289c9018958013d979d6809
2020-08-05 10:34:05 -07:00
Simon Farnsworth
99247529d5 Wishlist priority connections should use background mode
Summary: Commitcloud fillers use wishlist priority because we want them to wait their turn behind other users; let's also stop them from flooding the blobstore healer queue by making them background priority.

Reviewed By: ahornby

Differential Revision: D22867338

fbshipit-source-id: 5d16438ea185b580f3537e3c4895a545483eca7a
2020-08-05 06:35:46 -07:00
Simon Farnsworth
aa94fb9581 Add a multiplex mode that doesn't update the sync queue
Summary:
Backfillers and other housekeeping processes can run so far ahead of the blobstore sync queue that we can't empty it from the healer task as fast as the backfillers can fill it.

Work around this by providing a new mode that background tasks can use to avoid filling the queue if all the blobstores are writing successfully. This has a side-effect of slowing background tasks to the speed of the slowest blobstore, instead of allowing them to run ahead at the speed of the fastest blobstore and relying on the healer ensuring that all blobs are present.

Future diffs will add this mode to appropriate tasks

Reviewed By: ikostia

Differential Revision: D22866818

fbshipit-source-id: a8762528bb3f6f11c0ec63e4a3c8dac08d0b4d8e
2020-08-05 06:35:46 -07:00
Stanislau Hlebik
f13067b0da mononoke: add manual_commit_sync to megarepotool
Summary:
This operation is useful immediately after a small repo is merged into a large repo.
See example below

```
  B' <- manually synced commit from small repo (in small repo it is commit B)
  |
  BM <- "big merge"
 /  \
...  O <- big move commit i.e. commit that moves small repo files in correct location
     |
     A <- commit that was copied from small repo. It is identical between small and large repos.
```

Immediately after a small repo is merged into a large one, we need to record that commit B and all of
its ancestors from the small repo need to be based on top of the "big merge" commit in the large repo rather than on top of
commit A.
The function below can be used to achieve exactly that.

Reviewed By: ikostia

Differential Revision: D22943294

fbshipit-source-id: 33638a6e2ebae13a71abd0469363ce63fb6b014f
2020-08-05 05:55:15 -07:00
Simon Farnsworth
33c2a0c846 Update auto_impl to 0.4
Summary: We were using a git snapshot of auto_impl from somewhere between 0.3 and 0.4; 0.4 fixes a bug around Self: 'lifetime constraints on methods that blocks work I'm doing in Mononoke, so update.

Reviewed By: dtolnay

Differential Revision: D22922790

fbshipit-source-id: 7bb68589a1d187393e7de52635096acaf6e48b7e
2020-08-04 18:12:45 -07:00
Kostia Balytskyi
c8e3c27a65 megarepo: test invisible merge e2e
Reviewed By: StanislavGlebik

Differential Revision: D22924237

fbshipit-source-id: ba13d610c26c1b0be4f4afa75de93568359457c6
2020-08-04 12:21:13 -07:00
Stefan Filip
7392392a33 server: add commit/location_to_hash path
Summary:
Eden api endpoint for segmented changelog. It translates a path in the
graph to the hash corresponding to that commit that the path lands on.
It is expected that paths point to unique commits.

This change looks to go through the plumbing of getting the request from
the edenapi side through mononoke internals and to the segmented changelog
crate. The request used is an example. Follow up changes will look more at
what shape the request and response should have.

Reviewed By: kulshrax

Differential Revision: D22702016

fbshipit-source-id: 9615a0571f31a8819acd2b4dc548f49e36f44ab2
2020-08-04 11:22:39 -07:00
Stefan Filip
2f3e569120 mononoke_api: add segmented changelog location to hash translation
Summary:
This functionality is going to be used in EdenApi. The translation is required
to unblock removing the changelog from the local copy of the repositories.
However the functionality is not going to be turned on in production just yet.

Reviewed By: kulshrax

Differential Revision: D22869062

fbshipit-source-id: 03a5a4ccc01dddf06ef3fb3a4266d2bfeaaa8bd2
2020-08-04 11:22:39 -07:00
Stefan Filip
4261013101 metaconfig: add segmented changelog config
Summary:
To start, the only configuration available is whether the functionality provided
by this component is available in any shape or form. By default the component
is going to be disabled for all repositories. We will enable it first for
bootstrapped repositories and, after additional tooling is added, for production
repositories.

Reviewed By: kulshrax

Differential Revision: D22869061

fbshipit-source-id: fbaed88f2f45e064c0ae1bc7762931bd780c8038
2020-08-04 11:22:39 -07:00
Santiago Alfonso Muñoz Rodriguez
007dc93916 Enumeration API for BlobStore keys
Summary:
- Enumerate API now provided via trait BlobstoreKeySource
- Implementation for Fileblob and ManifoldBlob
- Modified populate_healer to use new api
- Modified fixrepocontents to use new api

Reviewed By: ahornby

Differential Revision: D22763274

fbshipit-source-id: 8ee4503912bf40d4ac525114289a75d409ef3790
2020-08-04 06:54:18 -07:00
Alex Hornby
f7210430d9 mononoke: check whether to emit an edge earlier from the walker, remaining types
Summary: Update all the remaining steps in the walker to use the new early checks, so as to prune unnecessary edges earlier in the walk.

Reviewed By: farnz

Differential Revision: D22847412

fbshipit-source-id: 78c499a1870f97df7b641ee828fb8ec58303ebef
2020-08-04 06:47:38 -07:00
Alex Hornby
5fb309a7b2 mononoke: check whether to emit an edge from the walker earlier
Summary:
Check whether to emit an edge from the walker earlier to reduce vec allocation of unnecessary edges that would immediately be dropped in WalkVistor::visit.

The VisitOne trait is introduced as a simpler API to the Visitor that can be used to check if one edge needs to be visited, and the Checker struct in walk.rs is a helper around it that will only call the VisitOne API if necessary. Checker also takes on responsibility for respecting keep_edge_paths when returning paths, so that parameter has been removed for migrated steps.

To keep the diff size reasonable, this change has all the necessary Checker/VisitOne changes but only converts hg_manifest_step, with the remainder of the steps converted in the next diff in the stack. The todos marking unmigrated types as always-emit types are to be removed as part of converting the remaining steps.

Reviewed By: farnz

Differential Revision: D22864136

fbshipit-source-id: 431c3637634c6a02ab08662261b10815ea6ce293
2020-08-04 04:30:49 -07:00
Stanislau Hlebik
fe60eeff85 mononoke: megarepotool support for gradual merge
Summary:
This tool can be used in tandem with the pre_merge_delete tool to merge one large
repository into another in a controlled manner - the size of the working copy
will be increased gradually.

Reviewed By: ikostia

Differential Revision: D22894575

fbshipit-source-id: 0055d3e080c05f870cfd0026174365813b0eb253
2020-08-04 02:53:15 -07:00
Simon Farnsworth
f7e8931a56 Add a minimum successful writes count for MultiplexedBlobstore
Summary:
There are two reasons to want a write quorum:

1. One or more blobstores in the multiplex are experimental, and we don't want to accept a write unless the write is in a stable blobstore.
2. To reduce the risk of data loss if one blobstore loses data at a bad time.

Make it possible
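
As a rough illustration of the quorum check, assuming each blobstore's put outcome is available as a `Result<(), String>`. The function and parameter names are hypothetical, not the MultiplexedBlobstore API:

```rust
/// Decides whether a multiplexed put can be reported as successful.
/// `results` holds one outcome per blobstore in the multiplex.
pub fn quorum_met(results: &[Result<(), String>], minimum_successful_writes: usize) -> bool {
    let successes = results.iter().filter(|r| r.is_ok()).count();
    successes >= minimum_successful_writes
}
```

With exactly one experimental blobstore in the multiplex, a minimum of 2 guarantees that every accepted write landed in at least one stable blobstore.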

Reviewed By: krallin

Differential Revision: D22850261

fbshipit-source-id: ed87d71c909053867ea8b1e3a5467f3224663f6a
2020-08-04 02:45:38 -07:00
Jeremy Fitzhardinge
34760b5164 rust: 1.45.2 update
Summary: A couple of features stabilized, so drop their `#![feature(...)]` lines.

Reviewed By: eugeneoden, dtolnay

Differential Revision: D22912569

fbshipit-source-id: 5ffdc48adb1f57a1b845b1b611f34b8a7ceff216
2020-08-03 19:29:17 -07:00
Kostia Balytskyi
6824787241 library.sh: add absolute config paths everywhere
Summary:
In several places in `library.sh` we had `--mononoke-config-path
mononoke-config`. This ensured that we could not run such commands from
non-`$TESTTMP` directories. Let's fix that.

Reviewed By: StanislavGlebik

Differential Revision: D22901668

fbshipit-source-id: 657bce27ce6aee8a88efb550adc2ee5169d103fa
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
fe487f9e8b push_redirector: add contexts
Summary: The more contexts the better. Makes debugging errors much more pleasant.

Reviewed By: StanislavGlebik

Differential Revision: D22890940

fbshipit-source-id: 48f89031b4b5f9b15f69734d784969e2986b926d
2020-08-03 13:00:23 -07:00
Kostia Balytskyi
b7f8a1b193 megarepotool: add bonsai merge
Summary:
An extremely thin wrapper around existing APIs: just a way to create merge commits from the command line.

This is needed to make the merge strategy work:

```
C
|
M3
| \
.  \
|   \
M2   \
| \   \
.  \   \
|   \   \
M1   \   \
| \   \   \
.  TM3 \   \
.  /    |  |
.  D3 (e7a8605e0d) TM2  |
.  | /    /
.  D2 (33140b117c)  TM1
.  |  /
.  D1 (733961456f)
|   |
|    \
|    DAG to merge
|
main DAG
```

When we're creating `M2` as a result of merging `TM2` into the main DAG, some files are deleted in the `TM3` branch, but not deleted in the `TM2` branch. Executing the merge by running `hg merge` causes these files to be absent in `M2`. To make Mercurial work, we would need to execute `hg revert` for each such file prior to `hg merge`. Bonsai merge semantics, however, just produce the correct behavior for us. Let's therefore just expose a way to create bonsai merges via the `megarepotool`.

Reviewed By: StanislavGlebik

Differential Revision: D22890787

fbshipit-source-id: 1508b3ede36f9b7414dc4d9fe9730c37456e2ef9
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
f9e410d965 megarepotool: add pre-merge-delete CLI
Summary:
This adds a CLI for the functionality added in the previous diff. In addition, this adds an integration test, which tests this deletion functionality.

The output of this tool is meant to be stored in a file. It simulates a simple DAG, and it should be fairly easy to automatically parse the "to-merge" commits out of this output. In theory, it could have been enough to just print the "to-merge" commits alone, but it felt like sometimes it may be convenient to quickly examine the delete commits.

Reviewed By: StanislavGlebik

Differential Revision: D22866930

fbshipit-source-id: 572b754225218d2889a3859bcb07900089b34e1c
2020-08-03 11:32:35 -07:00
Kostia Balytskyi
1eb7cfe277 megarepolib: add pre-merge delete implementation
Summary:
This implements a new strategy of creating pre-merge delete commits.

As a reminder, the higher-level goal is to gradually merge two independent DAGs together. One of them is the main repo DAG, the other is an "import". It is assumed that the import DAG is already "moved", meaning that all files are at the right paths to be merged.

The strategy is as follows: create a stack of delete commits with gradually decreasing working copy size. Merge them into `master` in reverse order.

Reviewed By: StanislavGlebik

Differential Revision: D22864996

fbshipit-source-id: bfc60836553c656b52ca04fe5f88cdb1f15b2c18
2020-08-03 11:32:35 -07:00
Simon Farnsworth
a5e9b79d7d Return all errors in the event of a multiplexed put failure
Summary:
With upcoming write quorum work, it'll be interesting to know all the failures that prevent a put from succeeding, not just the most recent, as the most recent may be from a blobstore whose reliability is not yet established.

Store and return all errors, so that we can see exactly why a put failed
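
One way to picture this is an error type that carries every per-blobstore failure. The sketch below is an assumption-laden illustration: `PutFailed` and `collect_put_errors` are hypothetical names, blobstores are identified by plain strings, and a put is assumed to succeed when at least one blobstore accepts it:

```rust
use std::fmt;

/// Hypothetical aggregate error: one (blobstore id, message) pair per failure.
#[derive(Debug)]
pub struct PutFailed {
    pub errors: Vec<(String, String)>,
}

impl fmt::Display for PutFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "put failed in {} blobstore(s)", self.errors.len())?;
        for (id, msg) in &self.errors {
            write!(f, "; {}: {}", id, msg)?;
        }
        Ok(())
    }
}

impl std::error::Error for PutFailed {}

/// Keeps every failure rather than only the most recent one.
pub fn collect_put_errors(results: Vec<(String, Result<(), String>)>) -> Result<(), PutFailed> {
    let mut any_ok = false;
    let mut errors = Vec::new();
    for (id, result) in results {
        match result {
            Ok(()) => any_ok = true,
            Err(msg) => errors.push((id, msg)),
        }
    }
    if any_ok {
        Ok(())
    } else {
        Err(PutFailed { errors })
    }
}
```

Reporting the full list makes it possible to tell a failure in an unproven blobstore apart from a failure in an established one.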

Reviewed By: ahornby

Differential Revision: D22896745

fbshipit-source-id: a3627a04a46052357066d64135f9bf806b27b974
2020-08-03 09:30:05 -07:00
Kostia Balytskyi
48aa00ed92 megarepolib: implement chunker from hint string
Summary:
"Chunking hint" is a string (expected to be in a file) of the following format:
```
prefix1, prefix2, prefix3
prefix4,
prefix5, prefix6
```

Each line represents a single chunk: if a path starts with any of the prefixes in the line, it should belong to the corresponding chunk. Prefixes are comma-separated. Any path that does not start with any prefix in the hint goes to an extra chunk.

This hint will be used in a new pre-merge-delete approach, to be introduced further in the stack.
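
The hint format above can be parsed along these lines. This is a sketch under stated assumptions rather than the megarepolib implementation: `parse_hint` and `chunk_index` are hypothetical helpers, prefixes are matched as plain string prefixes, and when several lines match, the first one wins:

```rust
/// Parses a chunking hint into one list of prefixes per non-empty line.
pub fn parse_hint(hint: &str) -> Vec<Vec<String>> {
    hint.lines()
        .map(|line| {
            line.split(',')
                .map(str::trim)
                // A trailing comma yields an empty piece; skip it.
                .filter(|p| !p.is_empty())
                .map(String::from)
                .collect::<Vec<String>>()
        })
        .filter(|prefixes| !prefixes.is_empty())
        .collect()
}

/// Returns the index of the chunk a path belongs to. Paths that match no
/// prefix go to the extra chunk, whose index is `chunks.len()`.
pub fn chunk_index(chunks: &[Vec<String>], path: &str) -> usize {
    chunks
        .iter()
        .position(|prefixes| prefixes.iter().any(|p| path.starts_with(p.as_str())))
        .unwrap_or(chunks.len())
}
```

For the three-line hint shown above, a path under `prefix4` maps to chunk 1, and a path matching none of the six prefixes maps to the extra chunk at index 3.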

Reviewed By: StanislavGlebik

Differential Revision: D22864999

fbshipit-source-id: bbc87dc14618c603205510dd40ee5c80fa81f4c3
2020-08-03 08:44:15 -07:00
Kostia Balytskyi
1825ed96d3 megarepolib: delete obsolete pre_merge_deletes impl
Summary:
We need to use a different type of pre-merge deletes, it seems, as the one proposed requires a huge number of commits. Namely, if we have `T` files in total in the working copy and we're happy to delete at most `D` files per commit, while merging at most `S` files per deletion stack:
```
#stacks = T/S
#delete_commits_in_stack = (T-X)/D
#delete_commits_total = T/S * (T-X)/D = (T^2 - TX)/SD ~ T^2/SD

T ~= 3*10^6

If D~=10^4 and X~=10^4:
#delete_commits_total ~= 9*10^12 / 10^8 = 9*10^4

If D~=10^5 and X~=10^5:
#delete_commits_total ~= 9*10^12 / 10^10 = 9*10^2
```

So either 90K or 900 delete commits. 90K is clearly too big. 900 may be tolerable, but it's still hard to manage and make sense of. What's more, there seems to be a way to produce fewer of these, see further in the stack.
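
The estimate can be reproduced with a few lines of integer arithmetic. The sketch assumes that `X` in the formula denotes the same quantity as `S`, which the message does not state explicitly, and that `S` and `D` are non-zero; the function name is hypothetical:

```rust
/// Estimated number of delete commits: (T / S) stacks, each holding
/// (T - S) / D delete commits. Uses integer division throughout.
/// Assumption: the `X` in the formula above is taken to equal `S`.
pub fn delete_commits_total(t: u64, s: u64, d: u64) -> u64 {
    let stacks = t / s;
    let delete_commits_in_stack = (t - s) / d;
    stacks * delete_commits_in_stack
}
```

With `T = 3_000_000`, the exact products are 89,700 for `S = D = 10_000` and 870 for `S = D = 100_000`, in line with the rounded figures of 90K and 900.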

Reviewed By: StanislavGlebik

Differential Revision: D22864998

fbshipit-source-id: e615613a34e0dc0d598f3178dde751e9d8cde4da
2020-08-03 08:27:16 -07:00
Simon Farnsworth
a9b8793d2d Add a write-mostly blobstore mode for populating blobstores
Summary:
We're going to add an SQL blobstore to our existing multiplex, which won't have all the blobs initially.

In order to populate it safely, we want to have normal operations filling it with the latest data, and then backfill from Manifold; once we're confident all the data is in here, we can switch to normal mode, and never have an excessive number of reads of blobs that we know aren't in the new blobstore.

Reviewed By: krallin

Differential Revision: D22820501

fbshipit-source-id: 5f1c78ad94136b97ae3ac273a83792ab9ac591a9
2020-08-03 04:36:19 -07:00