Summary:
This new crate is part of the new telemetry / logging effort. Its input is
tracing data, and its output is aggregated NoSQL table content.
This diff is only the start, setting up the direction.
Reviewed By: DurhamG
Differential Revision: D19797702
fbshipit-source-id: bdf34461c05b5eae5e59652bc82d8ee1857dbf1e
Summary:
The schema for `xdb.mononoke_blobstore_sync_queue : blobstore_sync_queue` hasn't been altered yet because AOSC [wasn't working](https://fb.workplace.com/groups/mysql.users/permalink/4991036464278262/).
This will allow the blobstore healer to make decisions about how many blobs it should copy at once. The goal is to eliminate the OOMs that were regularly seen by giving the blobstore healer a memory budget.
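A minimal sketch of how a memory budget could bound the number of blobs copied at once, assuming each queue entry carries a blob size. The function name `batch_by_budget` and the `(key, size)` tuple shape are hypothetical and not taken from the healer itself.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_by_budget(
    blobs: Iterable[Tuple[str, int]], budget_bytes: int
) -> Iterator[List[str]]:
    """Yield batches of blob keys whose total size stays within budget_bytes.

    A blob larger than the whole budget is yielded on its own rather than dropped.
    """
    batch: List[str] = []
    used = 0
    for key, size in blobs:
        # Close the current batch if adding this blob would exceed the budget.
        if batch and used + size > budget_bytes:
            yield batch
            batch, used = [], 0
        batch.append(key)
        used += size
    if batch:
        yield batch
```

With a budget of 8 bytes, the entries `("a", 4), ("b", 4), ("c", 10), ("d", 1)` would be split into three batches, with the oversized blob `c` copied alone.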
Reviewed By: krallin
Differential Revision: D25925522
fbshipit-source-id: d3714dbadc74274a4c4a0e66fa732b84bef89227
Summary: The prompt suddenly stopped appearing for me. Flush the stream to make sure that it's printed out.
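The general pattern, sketched in Python for illustration (the helper name `show_prompt` is hypothetical, and the actual change may be in another language): a prompt without a trailing newline can sit in the stream's buffer, so it is flushed explicitly before waiting for input.

```python
import sys
from typing import Optional, TextIO


def show_prompt(prompt: str, stream: Optional[TextIO] = None) -> None:
    """Write a prompt and flush so it is visible before input is read."""
    out = stream if stream is not None else sys.stdout
    out.write(prompt)
    # Without a trailing newline, buffered streams may hold the text back.
    out.flush()
```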
Reviewed By: HarveyHunt
Differential Revision: D25956018
fbshipit-source-id: 83419037fa6ce672e203385b71f1403a738d0c90
Summary: It's easier to distinguish the parameters when constructing it directly rather than via a new() method.
Differential Revision: D26017347
fbshipit-source-id: a020db1133de727f217f67a05953059122e3623a
Summary: It's only called from inside MononokeAppBuilder, so move it inside to avoid passing all the struct members as params.
Differential Revision: D25976405
fbshipit-source-id: e95d7b8f5f4474f0289d29bb7bb0a8b0780112e0
Summary: This doesn't need to be in metaconfig anymore, so move it to multiplexedblob.
Reviewed By: krallin
Differential Revision: D25928061
fbshipit-source-id: 8aa6ce6aafa16f84730cf388ebf7eab6d5bf2c53
Summary:
We have a few tricky optimizations, so it's better to have more logging than
less.
Reviewed By: HarveyHunt
Differential Revision: D25995937
fbshipit-source-id: b5502708125b70f3d656be3dc1120176f5c76ce8
Summary:
The test added in the previous diff showed that hg filenodes weren't being deferred between chunks in the expected way.
This is because we can't tell if an hg filenode is in a given chunk until it is loaded. This is similar to unodes, but the linked changeset in this case is an HgChangesetId rather than the bonsai ChangesetId, so this change introduces an hg_to_bcs mapping in the walker state. The mapping is used to look up whether the filenode's linked HgChangesetId is in the chunk; if it is not, the edge is deferred.
Reviewed By: krallin
Differential Revision: D25742276
fbshipit-source-id: 1f92452d012aab5b9fdf29f43fc05ebc043b2c7a
Summary: Add a test for hg filenode chunking showing it's not deferring any edges to the second chunk, which is a problem. The fix is in the next diff.
Reviewed By: krallin
Differential Revision: D25742278
fbshipit-source-id: bafd59cef01c3153eb1beadccb6026d456454d6b
Summary:
Add test for walking hg non-filenode data in chunks. Expect some deferred edges to next chunk as parents point into history.
This is done by deferring hg_changeset_via_bonsai_step if the bonsai is outside the range of the chunk.
Reviewed By: krallin
Differential Revision: D25742288
fbshipit-source-id: 385c9261151d10f7a7029f86ec10470226fc993c
Summary: Add test for walking fsnodes in chunks. Fsnodes don't point into history, so not expecting any edges to be deferred between chunks.
Reviewed By: krallin
Differential Revision: D25742291
fbshipit-source-id: dfacbffd1640713df0bc80e9306210860f9a932c
Summary: Add test for walking skeleton manifest in chunks. These manifests don't point into history, so no edges expected to be deferred between chunks
Reviewed By: krallin
Differential Revision: D25742290
fbshipit-source-id: 0bee980940d3f023392a518174aae0352d5eebda
Summary:
Add test to walk deleted manifests in chunks. No deferring is expected as these manifests don't point into history.
The test showed that chunking was missing handling for this manifest type, so this diff fixes it.
Reviewed By: krallin
Differential Revision: D25742285
fbshipit-source-id: 5411f904510f9b4fd9028c7d0dde6c652a784796
Summary: Add test for walking blame in chunks, check that edges crossing chunk boundary are deferred.
Reviewed By: StanislavGlebik
Differential Revision: D25742296
fbshipit-source-id: 163b07df57ebb1c745ee0577f58a45660e6cd18d
Summary: Add test for scrubbing fastlog in chunks, make sure edges get deferred between chunks.
Reviewed By: StanislavGlebik
Differential Revision: D25742277
fbshipit-source-id: 46fd47cfc776783b713717df0ab86bae1b0873fe
Summary:
For unodes we can't determine before loading them whether they fall within the current chunk, as the linked changeset id is not visible until the step is executed.
This change adds the ability to defer an already executing step and uses it for unodes, deferring a unode if its linked changeset is not in the chunk being processed.
Deferred edges are stored in the walker state and are checked on each chunk so that any deferred edges can be run.
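The deferral described above can be sketched roughly as follows. This is an illustration only: `WalkerState`, `step`, and `take_deferred` are hypothetical names, changeset ids are modelled as plain integers, and a chunk is modelled as a Python `range`.

```python
from collections import defaultdict
from typing import Dict, List


class WalkerState:
    """Hypothetical walker state that parks edges whose linked changeset
    lies outside the chunk currently being processed."""

    def __init__(self) -> None:
        # linked changeset id -> edges waiting for the chunk that covers it
        self.deferred: Dict[int, List[str]] = defaultdict(list)

    def step(self, edge: str, linked_cs: int, chunk: range) -> bool:
        """Return True if the edge was processed, False if it was deferred.

        The linked changeset is only known once the node is loaded, so the
        decision happens inside the step rather than before it.
        """
        if linked_cs not in chunk:
            self.deferred[linked_cs].append(edge)
            return False
        return True

    def take_deferred(self, chunk: range) -> List[str]:
        """Remove and return the deferred edges that this chunk covers."""
        ready = sorted(cs for cs in self.deferred if cs in chunk)
        edges: List[str] = []
        for cs in ready:
            edges.extend(self.deferred.pop(cs))
        return edges
```

At the start of each chunk, the walker would call `take_deferred` with the new chunk and enqueue whatever comes back alongside that chunk's roots.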
Reviewed By: StanislavGlebik
Differential Revision: D25742280
fbshipit-source-id: 8a0e7d96b8bf10889bf5e83fe4bee829a1a5cb4c
Summary: Add an enum for the walker step output in preparation for adding a Deferred variant to it in next diff
Reviewed By: StanislavGlebik
Differential Revision: D25742293
fbshipit-source-id: 6aabacb1cd39d16f4d36998908048fd2a10eba4d
Summary: Allow scrubbing of ChangesetInfo in chunks of public commit ids
Reviewed By: markbt
Differential Revision: D25742286
fbshipit-source-id: a5e2faed16eb60c5b7054261a74595a945e68c15
Summary:
For large repos it is desirable to walk them in chunks, as a prerequisite for being able to clear state caches between chunks to reduce memory usage, and to checkpoint between chunks so that an interrupted scrub can resume.
Chunks are fetched from the repo bounds of changeset id in newest-first order, which means that we scrub the newest data first. Any edges discovered from the walk that point outside the chunk are deferred until the later chunk that covers them.
This change adds chunking and tests it for core bonsai data; following diffs add it for other types.
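As a rough illustration of newest-first chunking over an id range, assuming changeset ids are integers and the repo bounds form a half-open interval (the helper `chunk_bounds` is hypothetical, not the walker's API):

```python
from typing import Iterator, Tuple


def chunk_bounds(lower: int, upper: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) id ranges covering [lower, upper), newest first."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    end = upper
    while end > lower:
        start = max(lower, end - chunk_size)
        yield (start, end)
        end = start
```

For bounds `[0, 10)` and a chunk size of 4, this yields `(6, 10)`, `(2, 6)`, `(0, 2)`, so the most recent ids are walked first and the last chunk may be shorter.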
Reviewed By: StanislavGlebik
Differential Revision: D25742295
fbshipit-source-id: b989abdf2ca367cf9b10f45d9f932eba55ee6dae
Summary: New command line args to allow scrubbing a repo in chunks of N changesets. Used in a later diff.
Reviewed By: StanislavGlebik
Differential Revision: D25742282
fbshipit-source-id: 4bcf74d26f8c2863c6e96f25eca69e01f9c2c0d5
Summary:
The main thing this change does is make sure pending roots to visit are represented in the difference between Walked and Children. Children is the sum of all child nodes discovered, both visited and unvisited. Walked is a measure of number of nodes visited. Children-Walked is used as a measure of queue depth remaining to be processed.
When not chunking this is a minor issue, as usually just one bookmark root node is not counted in Children. When chunking, however, not counting the roots means the chunk of several hundred thousand roots is not visible as waiting to be processed.
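A small sketch of the accounting described above, with hypothetical names (`Progress`, `add_roots`, `queue_depth`). The point is that roots are added to Children up front so that Children - Walked reflects them.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    children: int = 0  # all nodes discovered, visited or not
    walked: int = 0    # nodes visited so far

    def add_roots(self, count: int) -> None:
        # Count pending roots as children so they show up in the queue depth.
        self.children += count

    def add_children(self, count: int) -> None:
        self.children += count

    def visit(self) -> None:
        self.walked += 1

    @property
    def queue_depth(self) -> int:
        return self.children - self.walked
```

With a chunk of 100000 roots, the queue depth starts at 100000 instead of 0, which is what makes the pending work visible in progress logging.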
Reviewed By: StanislavGlebik
Differential Revision: D25852526
fbshipit-source-id: df5f21a37be152f0baee40d33fd7dfb7aaa763de
Summary:
It's no longer true - we're doing metalog commit in transaction.py, not lock
release. Also rename the function to clarify.
Reviewed By: DurhamG
Differential Revision: D25984806
fbshipit-source-id: b17a3f635210be7855341fc8a47fed6411599164
Summary:
This setup is more extendable than the TracingData focused approach. We can
more easily add new functionality using the Subscriber list.
The approach taken here to introduce the new collector tries to maintain
existing functionality. We can then move various logic to their own
Subscribers.
Reviewed By: quark-zju
Differential Revision: D25988580
fbshipit-source-id: 045cd355dbd499109e554a29a1439c2d490b7c40
Summary:
Dot `.` is the common separator for the metrics aggregator that we use.
This adds some form of consistency.
Reviewed By: DurhamG
Differential Revision: D25968398
fbshipit-source-id: 194d2f33fe477fe5d768a9cd8f9f46f56445e3e8
Summary:
We see a number of HTTP errors when working with darkisilon S3: https://fburl.com/scuba/mononoke_blobstore_trace/lt1itidt
While we are still investigating the root cause, most of them seem to be the result of opening too many connections to the host. The long-term solution is to put a reverse proxy in between, with a fixed number of persistent connections to isilon. However, we may still see some errors, so to increase reliability let's add retries with exponential backoff.
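A generic sketch of retrying with exponential backoff, for illustration only. The function name, the default attempt count and base delay, and the choice to retry on `OSError` are all assumptions; the actual blobstore code and its retry policy are not shown in this message.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.1,
    retryable: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op, retrying on retryable errors with delays of base_delay * 2**n."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except retryable:
            # Give up after the final attempt and surface the last error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. Adding random jitter to the delay is a common refinement when many clients retry against the same host.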
Reviewed By: krallin
Differential Revision: D25995114
fbshipit-source-id: b19e92933416f0bee20c2fa3235052ee1aa15c89
Summary: If progress is logged less than one millisecond apart it gives a divide by zero. Fix it.
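One way to guard against this, sketched under the assumption that the rate is computed from an elapsed time in whole milliseconds (the function name is hypothetical and the actual fix may differ):

```python
def rate_per_second(count: int, elapsed_ms: int) -> float:
    """Items per second, treating sub-millisecond intervals as one millisecond."""
    # max() keeps the divisor at 1 or more, so a zero elapsed time cannot divide by zero.
    return count * 1000 / max(elapsed_ms, 1)
```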
Reviewed By: farnz
Differential Revision: D25997768
fbshipit-source-id: 65dcba2dc7a789540a8e4fce6aeca0ee9668895d
Summary:
Yesterday I landed a diff (D25950531 (79c34c5094)) which allows changing a mapping by
landing a commit with a special commit extra. The implementation was done
inside backsyncer.
However, this was incorrect for a few reasons, the main reason being backsyncer
is not the only thing that rewrites commits between repos (x-repo commit lookup
is another thing for example).
So we need to push this implementation down to CommitSyncer level so that
everyone rewrites commits in the same way.
This diff fixes this issue, and moves all the logic of figuring out correct
mapping version down to CommitSyncer level. Doing so allowed me to also
simplify backsyncer to use normal sync_commit() method to backsync commits.
Another important note is about how we handle commits with no parents. We allow
syncing them only if there's an unambiguous version that we can pick. In case
of any ambiguity we don't sync them and return an error, which means that e.g.
merging a new repo and simultaneously changing the mapping is not possible.
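The rule for commits with no parents can be illustrated as follows. This is a sketch only: `pick_version`, `AmbiguousVersionError`, and the idea of passing in a list of candidate version names are assumptions for the example, not the CommitSyncer API.

```python
from typing import Iterable


class AmbiguousVersionError(Exception):
    """Raised when there is not exactly one candidate mapping version."""


def pick_version(candidates: Iterable[str]) -> str:
    """Return the single mapping version all candidates agree on, or raise."""
    unique = set(candidates)
    if len(unique) != 1:
        raise AmbiguousVersionError(
            f"expected exactly one mapping version, got {sorted(unique)}"
        )
    return unique.pop()
```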
Reviewed By: ikostia
Differential Revision: D25975842
fbshipit-source-id: a87fee545ac1305832ac905337610e7b87884477
Summary:
This is the function that will be responsible for figuring out which version to
use for remapping a commit. We'll extend it in the next diff.
Differential Revision: D25973951
fbshipit-source-id: 42b02608b4b88a216f6ea895943c49573fb29171
Summary:
We do the same check a little bit below, so let's remove the first one, since
it's unnecessary.
Differential Revision: D25973248
fbshipit-source-id: ab88128906b938c5ee57e23a261a7fc997e0ef72
Summary:
D25883922 (c1c9a9c585) does a refactor which messes up the headers, which will break
proxygen. Correct this and add a test to make sure we don't repeat this
mistake.
Reviewed By: krallin
Differential Revision: D25983271
fbshipit-source-id: 4994a4992fd6df7c62c5d91970f76165f848cc08
Summary: Allows us to use new APIs in libbpf
Reviewed By: anakryiko
Differential Revision: D25933787
fbshipit-source-id: f0988caae351760b814eba74f6f716db51f728bd
Summary:
Some linkers (lld being one) use fallocate() or posix_fallocate() on
the output file before writing its contents. EdenFS would return
ENOSYS or ENOTSUP so glibc would fall back and write a single byte to
every 512 byte block, which is terribly slow and generates a bunch of
fake traffic in the Watchman journal.
This diff implements basic support for FUSE_FALLOCATE, avoiding this
slow emulation.
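For context, this is roughly what a tool does when it preallocates its output file, shown through Python's `os.posix_fallocate` wrapper. The helper name `preallocate` is hypothetical, and the `ftruncate` fallback is an addition for the example so it also works where preallocation is unavailable.

```python
import os


def preallocate(path: str, size: int) -> None:
    """Reserve `size` bytes for the file at `path` before writing its contents."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # On Linux, glibc issues fallocate(2) and emulates it by writing
            # into every block if the filesystem reports it as unsupported.
            os.posix_fallocate(fd, 0, size)
        except (AttributeError, OSError):
            # Fallback for this sketch: extend the file without reserving blocks.
            if os.fstat(fd).st_size < size:
                os.ftruncate(fd, size)
    finally:
        os.close(fd)
```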
Reviewed By: xavierd
Differential Revision: D25934694
fbshipit-source-id: c6c90ea2b517d4dbedce29d9a4340870c8c177c3
Summary:
A lot of users have been having trouble getting their eden repos recloned
recently. To make this process simpler, I add the reclone process to the clone
script.
In this script I check if a user has multiple repos depending on the same
backing store. This allows me to warn them that they might lose changes from
these other repos. This diff threads along the backing store to the
`eden list` result for that check.
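The check can be sketched like this, assuming the list result has been reduced to a mapping from checkout path to backing repo path. That shape and the function name are assumptions for the example; the real `eden list` output format is not shown here.

```python
from typing import Dict, List


def checkouts_sharing_backing_repo(checkouts: Dict[str, str], target: str) -> List[str]:
    """Return the other checkouts that use the same backing repo as `target`.

    `checkouts` maps checkout path -> backing repo path (hypothetical shape).
    """
    backing = checkouts[target]
    return sorted(
        path
        for path, repo in checkouts.items()
        if repo == backing and path != target
    )
```

A non-empty result is the case where the script would warn that recloning may lose changes from those other checkouts.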
Reviewed By: chadaustin
Differential Revision: D25078423
fbshipit-source-id: 9ceb1f9acc4ec170cbb12d4b0b3b7d51987f88e3
Summary:
Both options have basically the same value.
This is my next step for resolving mismatches between CC dynamic config and the current configuration and generally improving CC configuration.
Reviewed By: DurhamG
Differential Revision: D25973556
fbshipit-source-id: aae21efcd5174ed58efcb9e5d8c85831d35777ea