Summary:
Right now we're only logging hooks that outright fail, which isn't great. Let's
log rejections as well.
Reviewed By: johansglock
Differential Revision: D21522804
fbshipit-source-id: 6bfc6b12394099b04faa9d23f164b436935f9fb3
Summary:
Make the repo path in Option<WrappedPath> available in the stream output, in preparation for using it in the corpus dumper to write to disk.
The path is an Option, as not all nodes can have an associated file system path (e.g. BonsaiChangeset).
The headline changes are in sampling.rs and sizing.rs. The progress.rs change slightly generalises to allow any type convertible to NodeType as the main walk identifier in the output stream.
Some refactors done as part of this
* NodeSamplingHandler is renamed to WalkSampleMapping to reflect that this is what it stores.
* WalkSampleMapping generic parameters are extended to take both a key and a sample type
* NodeSamplingHandler::start_node() is moved to a new SampleTrigger::map_keys() type. This is so that SamplingWalkVisitor doesn't need the full WalkSampleMapping generic parameters.
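A minimal sketch of the shape this refactor describes, with all signatures assumed rather than taken from the walker's real code: a sample store generic over both a key type and a sample type, plus a separate trigger trait that maps walk steps to keys without needing the mapping's full generics.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical sketch: a generic sample store keyed by an arbitrary key type,
// holding an arbitrary sample type (the two generic parameters the commit adds).
struct WalkSampleMapping<K, S> {
    inner: Mutex<HashMap<K, S>>,
}

impl<K: std::hash::Hash + Eq, S> WalkSampleMapping<K, S> {
    fn new() -> Self {
        WalkSampleMapping { inner: Mutex::new(HashMap::new()) }
    }
    fn record(&self, key: K, sample: S) {
        self.inner.lock().unwrap().insert(key, sample);
    }
    fn len(&self) -> usize {
        self.inner.lock().unwrap().len()
    }
}

// Hypothetical stand-in for the SampleTrigger::map_keys() role: deciding which
// key a walk step maps to, without naming the sample type at all.
trait SampleTrigger<K> {
    fn map_key(&self, step: &str) -> K;
}

struct ByName;
impl SampleTrigger<String> for ByName {
    fn map_key(&self, step: &str) -> String {
        step.to_string()
    }
}

fn main() {
    let mapping: WalkSampleMapping<String, u64> = WalkSampleMapping::new();
    mapping.record(ByName.map_key("BonsaiChangeset"), 42);
    println!("{} sample(s) stored", mapping.len());
}
```

This mirrors the stated motivation: a visitor only needs the key-mapping half, not the full `WalkSampleMapping<K, S>` parameters.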
Reviewed By: krallin
Differential Revision: D20835662
fbshipit-source-id: 58db622dc63d7f869a092739d1187a34b77219f6
Summary: Make sampling blobstore handlers fallible in preparation for corpus dumper so we can know if writes to disk/directory creations failed.
Reviewed By: farnz
Differential Revision: D21168632
fbshipit-source-id: d25123435e8f54c75aaabfc72f5fa653e5cf573d
Summary:
Not all node types can have an associated path.
Reset the tracked path to None if the route is taking us through a node type that can't have a repo path.
Reviewed By: krallin
Differential Revision: D21228372
fbshipit-source-id: 2b1e291f09232500adce79c630d428f09cd2d2cc
Summary:
Add new --sample-offset argument so that in combination with the existing --sample-rate the whole repo can be sampled in slices
For --sample-rate=N, this allows us to scrub or corpus dump 1/Nth of the repo at a time, which is particularly useful for corpus dumping on machines with limited disk.
Also factored out the sampling args construction as 3 of the 4 walk variants use them (only validate does not)
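The slicing arithmetic can be sketched as follows; `in_slice` and the hashing scheme are illustrative assumptions, not the walker's actual sampling code. A node falls in the slice when its stable hash modulo --sample-rate equals --sample-offset, so offsets 0..N partition the repo.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// A node is in the current slice when its hash, taken modulo the sample rate,
// equals the sample offset. Distinct offsets give disjoint slices.
fn in_slice<T: Hash>(node: &T, sample_rate: u64, sample_offset: u64) -> bool {
    let mut h = DefaultHasher::new();
    node.hash(&mut h);
    h.finish() % sample_rate == sample_offset
}

fn main() {
    let nodes = ["a", "b", "c", "d", "e", "f"];
    // Walking with every offset 0..rate covers each node exactly once.
    let rate = 3;
    let covered: usize = (0..rate)
        .map(|off| nodes.iter().filter(|n| in_slice(n, rate, off)).count())
        .sum();
    println!("covered {} of {} nodes across {} slices", covered, nodes.len(), rate);
}
```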
Reviewed By: krallin
Differential Revision: D21158486
fbshipit-source-id: 94f98ceb71c22e0e9d368a563cdb04225b6fc459
Summary: use ArcIntern for WrappedPath to reduce walker memory usage for paths
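`ArcIntern` comes from the `internment` crate; a hand-rolled, std-only stand-in below shows why interning saves memory when the same path occurs many times in a walk: every distinct path is allocated once and all holders share it.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

// Hand-rolled stand-in for internment's ArcIntern: identical paths share one
// allocation, so a walk holding millions of repeated paths stores each
// distinct path only once.
struct Interner {
    pool: Mutex<HashSet<Arc<str>>>,
}

impl Interner {
    fn new() -> Self {
        Interner { pool: Mutex::new(HashSet::new()) }
    }
    fn intern(&self, s: &str) -> Arc<str> {
        let mut pool = self.pool.lock().unwrap();
        if let Some(existing) = pool.get(s) {
            // Already interned: hand out another reference to the same bytes.
            return Arc::clone(existing);
        }
        let arc: Arc<str> = Arc::from(s);
        pool.insert(Arc::clone(&arc));
        arc
    }
}

fn main() {
    let interner = Interner::new();
    let a = interner.intern("fbcode/eden/mononoke/walker");
    let b = interner.intern("fbcode/eden/mononoke/walker");
    // Both handles point at the same allocation.
    println!("shared: {}", Arc::ptr_eq(&a, &b));
}
```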
Reviewed By: farnz
Differential Revision: D21230828
fbshipit-source-id: 525bac5a14b205659e177e03bd83bf06d1444617
Summary:
How is this Dag structure going to be used? This is probably the interesting
question for this diff.
On one side the structure could maintain a view of all the repositories and
manage the DAGs for all repositories in a central place. On the other side the
`Dag` is just an instance of a Changelog and Mononoke manages repositories that
each have a `Dag`. I went with the former pattern as it seems to me to be more
in line with the general architecture of Mononoke.
We can see the Dag being another part of the BlobRepo in the future. We will
want to avoid depending on the BlobRepo for actual functionality to avoid
cyclic dependencies. Currently the BlobRepo is used in construction for
convenience but that will have to change in the future.
Reviewed By: StanislavGlebik
Differential Revision: D21418367
fbshipit-source-id: 7c133eac0f38084615c2b9ba1466de626d2ffcbe
Summary: This removes .compat() from edenapi_server/main.rs. The actual removal probably could be done with less code, but in addition to removing compat(), I made most of the blocking code async.
Reviewed By: kulshrax, farnz
Differential Revision: D21426641
fbshipit-source-id: 1b3de4dc0b24d06faeb73de2e8658f0629d9491d
Summary:
Add a simple `/repos` endpoint that returns the list of repos available in a JSON response.
While the handler itself is quite simple, this diff establishes the general pattern by which new handlers will be added to the server.
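The response shape can be sketched as below; the real handler goes through the server's JSON serialization rather than hand-building strings, and the function name is invented for illustration.

```rust
// Minimal sketch of the /repos response shape: a JSON object holding the list
// of repo names. Assumes repo names need no JSON escaping; a real handler
// would use proper serialization.
fn repos_response(repos: &[&str]) -> String {
    let items: Vec<String> = repos.iter().map(|r| format!("\"{}\"", r)).collect();
    format!("{{\"repos\":[{}]}}", items.join(","))
}

fn main() {
    println!("{}", repos_response(&["fbsource", "www"]));
}
```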
Reviewed By: krallin
Differential Revision: D21330778
fbshipit-source-id: 77f57c969c34c8c1f7c94979fac383ec442a1e14
Summary:
When I refactored MPath to limit path length to 255 throughout, I had to change
the logic in this hook because it couldn't represent problematic paths anymore.
Unfortunately, I didn't realize that this would break in cases where the file
fits in 254 or 255 characters with one of the less compact encodings (but no
longer fits in 255 once you add `.i`), yet does fit in 255 characters with `.i`
added using one of the more compact encodings.
This resulted in the hook unnecessarily rejecting things that could have been
represented with a more compact encoding.
This fixes that, but doing so also requires no longer requiring an MPath in
fsencode (and instead allowing any slice of bytes), which is basically the bulk
of the changes here.
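A toy model of the corrected check, with all names and the candidate-encoding representation invented for illustration: a path is representable if any candidate encoding still fits the 255-byte store limit once the `.i` suffix is appended, so the hook must not reject after testing only one encoding.

```rust
// Toy model, not the real fsencode: Mercurial appends ".i" to a data file's
// encoded path, and several encodings of differing length may exist for the
// same path. A path should only be rejected when *no* candidate encoding fits
// once the suffix is added.
const MAX_STORE_PATH: usize = 255;

fn fits(encoded: &[u8], suffix: &[u8]) -> bool {
    encoded.len() + suffix.len() <= MAX_STORE_PATH
}

fn representable(candidate_encodings: &[Vec<u8>]) -> bool {
    candidate_encodings.iter().any(|e| fits(e, b".i"))
}

fn main() {
    // A 254-byte encoding no longer fits once ".i" is added, but a more
    // compact 200-byte encoding of the same path still does.
    let long = vec![b'a'; 254];
    let compact = vec![b'a'; 200];
    println!("{}", representable(&[long, compact]));
}
```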
Reviewed By: StanislavGlebik
Differential Revision: D21462205
fbshipit-source-id: d4fe6129b379675e842bff5b20bd776cb39157b2
Summary:
This diff logs the delay in deriving data. In particular, it logs how much time
has passed since an underived commit was created.
Note that this code assumes monotonic commit dates - for repos using pushrebase
that should be the case.
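The metric itself is just a subtraction; a sketch with assumed names, relying on the monotonic-dates assumption so the oldest underived commit bounds the whole backlog:

```rust
// Illustrative only: delay = now - creation time of the oldest underived
// commit, both as unix timestamps. saturating_sub guards against clock skew
// producing a negative delay.
fn derive_delay_secs(oldest_underived_ts: u64, now_ts: u64) -> u64 {
    now_ts.saturating_sub(oldest_underived_ts)
}

fn main() {
    println!("{}s behind", derive_delay_secs(1_588_000_000, 1_588_000_600));
}
```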
Reviewed By: krallin
Differential Revision: D21427265
fbshipit-source-id: bfddf594467dfd2424f711f895275fb54a4e1c60
Summary:
Two things will be simplified:
1) Do not pass sqlbookmarks, we can always get them from blobrepo
2) Instead of passing repo per derived data type let's just always pass
unredacted repo
Add a very simple unittest
Differential Revision: D21426885
fbshipit-source-id: 712ef23340466438bf34a086517f7ba33d4eabed
Summary: Small refactoring that will make the next diffs easier
Differential Revision: D21426166
fbshipit-source-id: f3c3ae00794046828eaf3c0912dbabc233c97e77
Summary:
The transformation is pretty direct. I didn't add additional functionality
to the IdMap and I did not update the construction algorithm yet. The querying
methods on IdMap were updated to async, and then there are the SQL interaction
details.
In follow up changes I want to update the construction algorithm and add support
for multiple repositories.
I am not happy with the names of the columns or naming in general in this code.
Open to suggestions. One idea could be matching the client nomenclature as much
as possible.
Reviewed By: StanislavGlebik
Differential Revision: D20929576
fbshipit-source-id: 12104892faa69f37c141e8baf54d5fb24fc5df6b
Summary: This also unblocks the macOS Mononoke builds, so they are re-enabled
Reviewed By: farnz
Differential Revision: D21455422
fbshipit-source-id: 4eae10785db5b93b1167f580a1c887ee4c8a96a2
Summary: What it says in the title. I'd like to set up alarms on this.
Reviewed By: farnz
Differential Revision: D21450584
fbshipit-source-id: 539299407cea84c67ff14b30184e8df4282415f8
Summary:
If a bundle comes from the commit cloud forward filler, we need to ignore
and not record it.
To do so, we need to start paying attention to stream-level params for the
first time.
Reviewed By: krallin
Differential Revision: D21427620
fbshipit-source-id: 9ee417cd2a5f2f5bb6ec342cd63071c6ca822475
Summary:
We want to be able to record all the bundles Mononoke processes, to be later
replayed by Mercurial.
Reviewed By: krallin
Differential Revision: D21427622
fbshipit-source-id: b88e10e03d07dae35369286fe31022f36a1ee5cf
Summary: To make it easier to navigate the codebase the oss-only code will be from now on stored in a separate module, similarly to how the fbcode-only code is stored.
Reviewed By: markbt
Differential Revision: D21429060
fbshipit-source-id: aa7e80961de2897dae31bd0ec83488c683633b7a
Summary:
Something was up yesterday with the warm bookmarks cache. It started failing on
some hosts, and went out of sync for > 1 hour on some hosts. The logs reported
a lot of failures, but without any context they weren't super useful:
P130438422.
This adds a bit more logging. If this happens again, we'll be able to better
understand what happened.
Reviewed By: StanislavGlebik
Differential Revision: D21447043
fbshipit-source-id: 67a3924c4486991df5e4d38a995ff8054c145cf9
Summary: This is an sqlite equivalent of what exists in xdb now.
Reviewed By: krallin
Differential Revision: D21427621
fbshipit-source-id: 7024fbf7a8773c4465d2e6ee327aadeaf87cb213
Summary:
There was a bug in scrub blobstore that caused failures while traversing the
fsnodes.
If all blobstores return None, then we need to return None rather than failing
as we do now. The situation we ran into was:
1) fsnodes is not derived, all blobstore return None
2) Previously it returned the error, which later checked in
https://fburl.com/diffusion/mhhhnkxv - this check makes sure there's no entry
with the same key on the queue. However by that time fsnodes might already be
derived and someone else might insert a new entry in the blobstore and in the
queue. This would return an error to the client.
The fix here is to not fail if all blobstores returned None
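The fix's shape, with illustrative types rather than the real blobstore traits: only the all-`None` case is a clean miss, while a partial miss is still a scrub condition.

```rust
// Illustrative sketch: per-blobstore get results for a single key. If every
// store returned None, the key simply isn't written yet - a valid answer, not
// a scrub error. A partial miss is where scrub would repair/report.
fn scrub_get<V: Clone>(results: &[Option<V>]) -> Result<Option<V>, String> {
    let present: Vec<&V> = results.iter().flatten().collect();
    if present.is_empty() {
        // All stores agree the key is absent: return None instead of failing.
        return Ok(None);
    }
    if present.len() < results.len() {
        return Err(format!(
            "{} of {} blobstores missing the value",
            results.len() - present.len(),
            results.len()
        ));
    }
    Ok(Some(present[0].clone()))
}

fn main() {
    // Underived fsnodes: every store returns None, and that is a valid answer.
    println!("{:?}", scrub_get::<u32>(&[None, None, None]));
}
```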
Reviewed By: ahornby
Differential Revision: D21405418
fbshipit-source-id: 21fe130ce65a0087c408a5014e5b108c7ce8fe6c
Summary: Cover as much of the remaining code as possible with `Cargo.toml`s; for the rest, create an exclusion list in the autocargo config.
Reviewed By: krallin
Differential Revision: D21383620
fbshipit-source-id: 64cc78a38ce0ec482966f32a2963ab4939e20eba
Summary: Covering repo_listener and microwave plus some final touches, and we have a buildable Mononoke binary.
Reviewed By: krallin
Differential Revision: D21379008
fbshipit-source-id: cca3fbb53b90ce6d2c3f3ced7717404d6b04dd51
Summary:
There are few related changes included in this diff:
- backsyncer is made public
- stubs for SessionContext::is_quicksand and scuba_ext::ScribeClientImplementation
- mononoke/hgproto is made buildable
Reviewed By: krallin
Differential Revision: D21330608
fbshipit-source-id: bf0a3c6f930cbbab28508e680a8ed7a0f10031e5
Summary:
- Change the get return value for `Blobstore` from `BlobstoreBytes` to `BlobstoreGetData`, which includes `ctime` metadata
- Update the call sites and tests broken due to this change
- Change `ScrubHandler::on_repair` to accept metadata and log ctime
- `Fileblob` and `Manifoldblob` attach the ctime metadata
- Tests for fileblob in `mononoke:blobstore-test` and integration test `test-walker-scrub-blobstore.t`
- Make cachelib based caching use `BlobstoreGetData`
Reviewed By: ahornby
Differential Revision: D21094023
fbshipit-source-id: dc597e888eac2098c0e50d06e80ee180b4f3e069
Summary:
When we log to Scuba, we need to truncate the `msg` field if it's too long, or
we might be missing log entries in Scuba. I put this behind a tunable so we can, well,
tune it.
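A sketch of byte-budget truncation that respects UTF-8 boundaries; the budget would come from the tunable, and the function name is invented.

```rust
// Truncate a log field to a byte budget without splitting a UTF-8 character.
// Slicing a &str at a non-boundary index would panic, so we walk back to the
// nearest char boundary at or below the budget.
fn truncate_msg(msg: &str, max_bytes: usize) -> &str {
    if msg.len() <= max_bytes {
        return msg;
    }
    let mut end = max_bytes;
    while !msg.is_char_boundary(end) {
        end -= 1;
    }
    &msg[..end]
}

fn main() {
    println!("{}", truncate_msg("héllo world", 6));
}
```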
Reviewed By: farnz
Differential Revision: D21405959
fbshipit-source-id: 08f0d3491d1a9728b0ca9221436dee2e8f1a17eb
Summary:
This test is flaky right now, but it's not clear why. I'm also unable to repro.
Let's add more logging.
Reviewed By: StanislavGlebik
Differential Revision: D21405284
fbshipit-source-id: 3ce5768066091de61e62339286410a6223d251d5
Summary: Making a trait out of TimeWindowCounter will help with providing different implementations of load limiting for OSS and FB.
Reviewed By: krallin
Differential Revision: D21329265
fbshipit-source-id: 7f317f8e9118493f3dcbadb0519eaff565cbd882
Summary:
This is helpful. Also, while in there, I removed an error that wasn't used at
all.
Reviewed By: StanislavGlebik
Differential Revision: D21399489
fbshipit-source-id: 0e5ef20b842afa9ffc0bb8530c48eb48339c558e
Summary:
We have a number of error enums that wrap an existing error but fail to
register the underlying error as a `#[source]`. This results in truncated
context chains when we print the error. This fixes that. It also removes a
bunch of manual `From` implementations that can be provided by thiserror's
`#[from]`.
This also required updating the `Display` implementations for those errors. I've
opted for not printing the underlying error, since the context chain will
include it. This does mean that if we print one of those errors without the
context chain (i.e. `{}` as opposed to `{:#}` or `{:?}`), then we'll lose a
bit of context. That said, this should be OK, as we really shouldn't ever be
doing this, because we'd be missing the rest of the chain anyway.
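Hand-rolled on std for illustration, this is roughly what thiserror's `#[source]`/`#[from]` attributes generate, and why the `Display` implementation can stay short: a chain-aware printer recovers the underlying error via `Error::source()`. The error type and messages here are invented.

```rust
use std::error::Error;
use std::fmt;

// Hand-rolled version of what `#[source]`/`#[from]` generate: the wrapper
// exposes the underlying error through `Error::source()`, so a chain printer
// can recover the full context.
#[derive(Debug)]
struct BookmarkError {
    source: std::io::Error,
}

impl fmt::Display for BookmarkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Display does not repeat the inner error; the chain includes it.
        write!(f, "failed to update bookmark")
    }
}

impl Error for BookmarkError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Walk the chain the way a `{:#}`-style formatter would.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cur = err.source();
    while let Some(e) = cur {
        out.push_str(": ");
        out.push_str(&e.to_string());
        cur = e.source();
    }
    out
}

fn main() {
    let err = BookmarkError {
        source: std::io::Error::new(std::io::ErrorKind::Other, "connection reset"),
    };
    println!("{}", render_chain(&err));
}
```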
Reviewed By: StanislavGlebik
Differential Revision: D21399490
fbshipit-source-id: a970a7ef0a9404e51ea3b59d783ceb7bf33f7328
Summary:
This removes our own (Mononoke's) implementation of failure chains and replaces
it with Anyhow. Failure chains don't appear to be used anywhere besides
Mononoke.
The historical motivation for failure chains was to make context introspectable
back when we were using Failure. However, we're not using Failure anymore, and
Anyhow does that out of the box with its `context` method, which you can
downcast to the original error or any of the context instances:
https://docs.rs/anyhow/1.0.28/anyhow/trait.Context.html#effect-on-downcasting
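A std-only sketch of that downcasting behaviour: as long as each context wrapper exposes its inner error via `source()`, the original error stays reachable anywhere in the chain. anyhow's `context`/`downcast_ref` do this for you; the types here are invented.

```rust
use std::error::Error;
use std::fmt;

// Invented leaf error standing in for "the original error".
#[derive(Debug, PartialEq)]
struct BlobMissing(String);

impl fmt::Display for BlobMissing {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "blob {} missing", self.0)
    }
}
impl Error for BlobMissing {}

// Invented context wrapper standing in for anyhow's context layers.
#[derive(Debug)]
struct WithContext {
    msg: &'static str,
    source: BlobMissing,
}
impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.msg)
    }
}
impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Find the original error anywhere in the chain, like anyhow's downcast_ref.
fn find_root<'a>(err: &'a (dyn Error + 'static)) -> Option<&'a BlobMissing> {
    let mut cur: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = cur {
        if let Some(b) = e.downcast_ref::<BlobMissing>() {
            return Some(b);
        }
        cur = e.source();
    }
    None
}

fn main() {
    let err = WithContext {
        msg: "while deriving fsnodes",
        source: BlobMissing("abc".to_string()),
    };
    println!("{:?}", find_root(&err));
}
```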
Reviewed By: StanislavGlebik
Differential Revision: D21384015
fbshipit-source-id: 1dc08b4b38edf8f9a2c69a1e1572d385c7063dbe
Summary:
I'm going to send a diff to get rid of failure chains, and the LFS Server
actually uses that quite a bit. Let's make sure we don't affect the error
rendering there.
Reviewed By: StanislavGlebik
Differential Revision: D21383032
fbshipit-source-id: e0ec9c88760e7fd48d39fa1570efd1870a9ef532
Summary:
Looks like this broke yesterday. There was a Reindeer update yesterday IIRC, so
I'm guessing that's the cause. In any case, this is easy to fix forward.
Reviewed By: farnz
Differential Revision: D21399830
fbshipit-source-id: 5cf33411e089a8c675a8b3fdf7b6ae5ae267058d
Summary:
All the changes we need are now in stable, so use the stable crates.io version.
I also had to do coordinated updates of git2 and rustsec to make sure they're
all using the same version of libgit2-sys. This had a couple of little API changes which affected our code:
- mononoke gitimport (krallin)
- linttool (zertosh) (BTW the old code had some very dubious lifetime stuff - a signature of the form `fn foo<'a>(&self) -> Thing<'a>` never makes any sense - output lifetimes should always be derived from the params)
Similarly, toml-rs needed to be updated because there's now a hard dependency on 0.5.6.
msdkland[rust_cargo]
msdkland[rust_reindeer]
Reviewed By: dtolnay
Differential Revision: D21311180
fbshipit-source-id: 82083c8f2bb8523e70cbe99dc0a630c4bc67a505
Summary:
This updates repo_client to log when hooks finish, and how many were rejected,
if any. This required a bit of refactoring to avoid iterating twice over
whether hooks were rejected (instead just filter-mapping outcomes to
rejections), but it's probably for the better since it removes a bit of
unnecessary cloning (notably of the hook name).
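The refactor's shape in miniature, with invented types rather than repo_client's real ones: filter-mapping outcomes to rejections yields both the count and the details in one pass, and moving the hook name out of the outcome avoids cloning it.

```rust
// Invented stand-in for the hook outcome type.
enum HookOutcome {
    Accepted,
    Rejected { hook_name: String, reason: String },
}

// One pass over the outcomes: keep only the rejections. An empty result means
// all hooks passed; a non-empty one carries everything needed for logging.
fn rejections(outcomes: Vec<HookOutcome>) -> Vec<(String, String)> {
    outcomes
        .into_iter()
        .filter_map(|o| match o {
            HookOutcome::Accepted => None,
            // Moving out of the outcome avoids cloning the hook name.
            HookOutcome::Rejected { hook_name, reason } => Some((hook_name, reason)),
        })
        .collect()
}

fn main() {
    let outcomes = vec![
        HookOutcome::Accepted,
        HookOutcome::Rejected {
            hook_name: "limit_filesize".to_string(),
            reason: "file too large".to_string(),
        },
    ];
    println!("{} hook(s) rejected", rejections(outcomes).len());
}
```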
Reviewed By: farnz
Differential Revision: D21379690
fbshipit-source-id: 53c8368d3871620ec61db76dc35b47dd17276ac4
Summary:
This adds support for running Gitimport with `--readonly-storage`. The way we
do this is by masking the various storages we use (blobstore, changesets,
bonsai).
Reviewed By: markbt
Differential Revision: D21347939
fbshipit-source-id: 68084ba0d812dc200776c761afdfe41bab9a6d82
Summary:
The original gitimport wasn't really designed for concurrency, since it did
commits one by one. With this update, we can now derive Bonsais from multiple
commits in parallel, and use multiple threads to communicate with the Git
repository (which is actually somewhat expensive when that's all we do).
We also store Bonsais iteratively. There is a bit of extra work that could be
done also here by saving Bonsais asynchronously to the Blobstore, and inserting
a single batch in Changesets once we're finished.
Reviewed By: farnz
Differential Revision: D21347941
fbshipit-source-id: e0ea86bf4d164599df1370844d3f0301d1031801
Summary:
This adds support for deriving commits within a range in gitimport, which gets
us one step closer to resumable gitimport. The primary goal of this is to
evaluate whether using Gitimport for Configerator might be suitable.
Differential Revision: D21347942
fbshipit-source-id: aa3177466e389ceb675328999ccf836f29912698
Summary:
This adds some basic functionality for deriving hg manifests in gitimport. I'd
like to add this to do some correctness testing on importing Git manifests from
Configerator.
Differential Revision: D21347940
fbshipit-source-id: 6f819fa8a62b3088fb163138fc23910b8f2ff3ce
Summary:
- Use the same case consistently
- Log even when pushrebase fails
Reviewed By: farnz
Differential Revision: D21378033
fbshipit-source-id: 062e986151086476db9100e3d9c71aa702661032
Summary:
Currently we need to specify which derived data types to derive; however, they
are already specified in the configerator configs. Let's just read them from
there.
That means that we no longer need to update tw spec to add new derived data types - we'll just need to add them to configerator and restart the backfiller.
Reviewed By: krallin
Differential Revision: D21378640
fbshipit-source-id: f97c3f0b8bb6dbd23d5a50f479ecfccbebd33897
Summary: Making a trait out of LoadLimiter will help with providing different implementations of load limiting for OSS and FB.
Reviewed By: farnz
Differential Revision: D21302819
fbshipit-source-id: 1b982a367aa7126ca5d7772e4a2406dabbe9e13b