Summary: It's useful to be able to copy multiple dirs at once
Reviewed By: markbt
Differential Revision: D29358375
fbshipit-source-id: f1cc351195cc2c19de36a1b6936b598e314848c3
Summary:
Previously only conversion between bonsai and hg was supported. Let's add git
as well.
Obviously you can use `scsc lookup`, but mononoke_admin can be useful for repos
that are not on scs yet.
Reviewed By: farnz
Differential Revision: D29360793
fbshipit-source-id: eb2b71eab192b3456ba3d580f7eb8c4a85b2fd1d
Summary:
Extend metaconfig to include configuration for the ephemeral blobstore.
An ephemeral blobstore is optional: repos without an ephemeral blobstore cannot
store ephemeral commits or snapshots.
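A rough sketch of the shape this might take (field names are hypothetical, not the actual metaconfig schema):
```
// Hypothetical stand-in for the existing blobstore configuration type.
struct BlobConfig;

struct RepoConfig {
    // ... existing fields elided ...
    /// Optional: a repo without an ephemeral blobstore cannot store
    /// ephemeral commits or snapshots.
    ephemeral_blobstore: Option<EphemeralBlobstoreConfig>,
}

struct EphemeralBlobstoreConfig {
    /// Where ephemeral blobs are written.
    blobstore: BlobConfig,
}
```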
Reviewed By: StanislavGlebik
Differential Revision: D29067719
fbshipit-source-id: fe7d42173d5c34a937c99c72f4b2bd08af503889
Summary:
Pull in a patch which fixes writing out an incorrect entsize for the
`SHT_GNU_versym` section:
ddbae72082
Reviewed By: igorsugak
Differential Revision: D29248208
fbshipit-source-id: 90bbaa179df79e817e3eaa846ecfef5c1236073a
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3
Instead of using `HashMap<String, RedactedMetadata>` everywhere, let's use an `Arc<RedactedBlobs>` object from which we can instead borrow a map. The borrow function is async because it will need to be when we're fetching from configerator, as it may need to rebuild the redaction data.
Wrapping it in `Arc` also lets us re-use the same data across repos; I believe right now it's cloned everywhere.
In later diffs I'll use this enum to add a new way to fetch configs.
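A minimal sketch of the shape this takes (hypothetical signatures; the real types differ in detail):
```
use std::collections::HashMap;

struct RedactedMetadata {
    task: String,
    log_only: bool,
}

// The enum leaves room for a configerator-backed variant later.
enum RedactedBlobs {
    FromSql(HashMap<String, RedactedMetadata>),
}

impl RedactedBlobs {
    // Async so that a future config-backed variant can rebuild the
    // redaction data before lending out the map.
    async fn redacted_blobs(&self) -> &HashMap<String, RedactedMetadata> {
        match self {
            RedactedBlobs::FromSql(map) => map,
        }
    }
}
```
Callers then share one `Arc<RedactedBlobs>` across repos instead of cloning the map.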
Reviewed By: markbt
Differential Revision: D28935506
fbshipit-source-id: befa96810ee7ebb9487f99f9e769a945981b58ed
Summary:
We're doing imports for AOSP megarepo work, and want a tool to quickly check that our imports are what we expect.
Use libgit2 and a simple LFS parser to read git SHA-256 entries, and FSNodes to get the Mononoke entries to match.
Reviewed By: StanislavGlebik
Differential Revision: D29169743
fbshipit-source-id: 1ef1e2c780b8742c7fa5f15f9ee01bc0481a6543
Summary: Update versions for several of the crates we depend on.
Reviewed By: danobi
Differential Revision: D29165283
fbshipit-source-id: baaa9fa106b7dad000f93d2eefa95867ac46e5a1
Summary:
Like it says in the title. Let's allow specifying an oncall here since that
oncall will be tasked with retroactive review of the commit.
Reviewed By: StanislavGlebik
Differential Revision: D29162534
fbshipit-source-id: 9ed3ac43c38a1120bb16a2f5b5218fdbf80e0d47
Summary:
Manifold enumeration ranges are inclusive. Update the documentation of the
options that ultimately feed into them to say so.
To avoid future confusion, use Rust's inclusive ranges to initialize these, and
remove the exclusive range option.
The fileblob implementation was actually performing exclusive checks at both
ends, rather than inclusive ones. Correct this by implementing `RangeBounds`
and using `range.contains` instead.
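A minimal sketch of the fix pattern (illustrative, not the actual fileblob code):
```
use std::ops::{RangeBounds, RangeInclusive};

// RangeBounds::contains is inclusive at both ends for RangeInclusive,
// unlike the previous hand-rolled exclusive checks.
fn in_enumeration_range<R: RangeBounds<u64>>(range: &R, id: u64) -> bool {
    range.contains(&id)
}

fn main() {
    let range: RangeInclusive<u64> = 10..=20;
    assert!(in_enumeration_range(&range, 10)); // inclusive start
    assert!(in_enumeration_range(&range, 20)); // inclusive end
    assert!(!in_enumeration_range(&range, 21));
}
```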
Reviewed By: liubov-dmitrieva
Differential Revision: D28224481
fbshipit-source-id: 7244588271d7754d6c6820790cbd76574b296d7b
Summary: revert the zstd crates back to previous version
Reviewed By: johansglock
Differential Revision: D29038514
fbshipit-source-id: 3cbc31203052034bca428441d5514557311b86ae
Summary: I'm going to touch these files soon, and found this easy improvement to make first.
Reviewed By: StanislavGlebik
Differential Revision: D28993265
fbshipit-source-id: 89fe9ac52f6d99e1b5ce7cb24d949e048226436a
Summary:
Using the previous diff, this diff will now build `InnerRepo` instead of `BlobRepo`, as a way to build `Skiplist` without having to do it "manually" by calling `fetch_skiplist_index`.
See D28877887 for the goal.
Reviewed By: StanislavGlebik
Differential Revision: D28880352
fbshipit-source-id: 73e05864a0c0ffaba454a27ea521b4d3dc6ee78b
Summary:
BlobRepo is already easily cloneable (it actually holds an `Arc` inside).
This should also make the code a little prettier in places where we previously needed `(*blob_repo).clone()` instead of just `blob_repo.clone()`.
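A tiny sketch of why this is cheap (field names are illustrative):
```
use std::sync::Arc;

struct Blobstore;

// BlobRepo's fields are Arcs internally, so Clone only bumps refcounts.
#[derive(Clone)]
struct BlobRepo {
    blobstore: Arc<Blobstore>,
}

fn main() {
    // Before: APIs passed Arc<BlobRepo> around, so getting an owned
    // BlobRepo needed `(*blob_repo).clone()`.
    let blob_repo = Arc::new(BlobRepo { blobstore: Arc::new(Blobstore) });
    let _owned: BlobRepo = (*blob_repo).clone();

    // After: pass BlobRepo itself; `blob_repo.clone()` is just as cheap.
    let repo = BlobRepo { blobstore: Arc::new(Blobstore) };
    let _copy = repo.clone();
}
```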
Reviewed By: StanislavGlebik
Differential Revision: D28930384
fbshipit-source-id: 59f95d10576a3f71808d0d26d36358421673351e
Summary:
The important change in this diff is in this file: `eden/mononoke/cmdlib/src/args/mod.rs`
In this diff I change that file's repo-building functions to be able to build both `BlobRepo` and `InnerRepo` (added in D28748221 (e4b6fd3751)). In fact, they are now able to build any facet container that can be built by the `RepoFactory` factory, so each binary can specify its own subset of needed "attributes" and only build those ones (see the sketch below).
For now, they're all still using BlobRepo; this diff is only a refactor that enables easily changing the repo attributes you need.
The rest of the diff is mostly giving hints to the compiler: in several places it couldn't infer that it should use `BlobRepo` directly, so I had to add type hints.
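A hand-rolled sketch of the facet-container idea (all names here are illustrative stand-ins for the macro-generated RepoFactory code):
```
use std::sync::Arc;

// Hypothetical stand-ins for real facets.
struct Blobstore;
struct SkiplistIndex;

// Each binary declares a container with only the facets it needs.
struct BlobRepoLike {
    blobstore: Arc<Blobstore>,
}

struct InnerRepoLike {
    blobstore: Arc<Blobstore>,
    skiplist: Arc<SkiplistIndex>,
}

struct RepoFactoryLike;

impl RepoFactoryLike {
    fn blobstore(&self) -> Arc<Blobstore> {
        Arc::new(Blobstore)
    }

    fn skiplist(&self) -> Arc<SkiplistIndex> {
        Arc::new(SkiplistIndex)
    }

    // The factory can build any container whose facets it knows how to
    // construct, so binaries only pay for the attributes they ask for.
    fn build_blob_repo(&self) -> BlobRepoLike {
        BlobRepoLike { blobstore: self.blobstore() }
    }

    fn build_inner_repo(&self) -> InnerRepoLike {
        InnerRepoLike {
            blobstore: self.blobstore(),
            skiplist: self.skiplist(),
        }
    }
}
```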
## High level goal
This is part of the blobrepo refactoring effort.
I am also doing this in order to:
1. Make sure every place that builds `SkiplistIndex` uses `RepoFactory` for that.
2. Then add a `BlobstoreGetOps` trait for blobstores, and use the factory to feed it to skiplist index, so it can query the blobstore while skipping cache. (see [this thread](https://www.internalfb.com/diff/D28681737 (850a1a41b7)?dst_version_fbid=283910610084973&transaction_fbid=106742464866346))
Reviewed By: StanislavGlebik
Differential Revision: D28877887
fbshipit-source-id: b5e0093449aac734591a19d915b6459b1779360a
Summary: Update to the latest version. This includes a patch to the async-compression crate from [my PR updating it](https://github.com/Nemo157/async-compression/pull/125), which I will remove once the crate is released.
Reviewed By: mitrandir77
Differential Revision: D28897019
fbshipit-source-id: 07c72f2880e7f8b85097837d084178c6625e77be
Summary: It can be useful to understand how many ancestors are not derived yet.
Reviewed By: Croohand
Differential Revision: D28902194
fbshipit-source-id: 87c11b3e35ba7f67122990318ff07408c47d4d6c
Summary:
Recently we had an issue with the `connectivity-lab` repo where 3 keys (P416141335) had different values because of parent ordering (P416094337).
The walker can detect differences between keys in the multiplex's inner blobstores and repair them; however, it has no notion of copying keys (there is no concept of a source and a target). We have a copy_blobstore_keys tool, which is used for restoring keys from backups, and with a small modification it can handle copying between inner blobstores.
Reviewed By: StanislavGlebik
Differential Revision: D28707364
fbshipit-source-id: 3d5a4f39999623023539b9159fa7310d430f0ee4
Summary:
D28679419 has landed and the chronos job has been updated, which AFAIK is the only place that used this.
This now removes the `--sparse` argument.
Reviewed By: HarveyHunt
Differential Revision: D28796187
fbshipit-source-id: 277f0b8be4e72aa00b058e09a0044841c067c58f
Summary: They are not being used anymore AFAIK, so it's best to get rid of dead code.
Reviewed By: farnz
Differential Revision: D28677902
fbshipit-source-id: a2de5a1ca3908128b57afc07227a46bba0e45b1a
Summary: I'm going to reuse this for AOSP import logic speedups, and I do not want my low QPS limit overridden by a higher QPS limit set for backfilling. Push the rate limiter out.
Reviewed By: StanislavGlebik
Differential Revision: D28638180
fbshipit-source-id: ef3a783d4b1993614a146f534337f719958a1f36
Summary:
Extend the `blame_v2` format to include metadata about the location in the
parent commit that a blamed line replaces. This can be used to implement
accurate "skip past this change" in clients.
Most ranges only need the range of lines that the original blame range
replaces. For ranges that are inserts, the parent range is of zero length and
the offset indicates the line that the range was inserted before.
For renames, we must include the path of the file before the rename, so that
the file can be found in the parent.
For merge commits, if the file is present in more than one parent, then lines
that are introduced in the merge commit itself have multiple possibilities for
the parent range. We select and record the first parent that contains the file
as the provider of the parent range for these lines. This favours the p1
history of the file, but allows "skip past this change" to work when files
are merged in.
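A rough sketch of the per-range metadata described above (field names are illustrative; this is not the actual blame_v2 encoding):
```
struct BlameRangeParent {
    /// For merge commits, the first parent that contains the file.
    parent_index: usize,
    /// Path of the file in the parent; differs from the current path
    /// when the file was renamed.
    path_in_parent: String,
    /// First line of the replaced range in the parent. For insertions
    /// the range is zero-length and this is the line the new range was
    /// inserted before.
    parent_start: u32,
    /// Number of lines replaced in the parent (zero for insertions).
    parent_length: u32,
}
```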
Reviewed By: farnz
Differential Revision: D28546768
fbshipit-source-id: 2af1e95a0d27fb25aeea51682177fbac2c41b029
Summary:
Like it says in the title. The API has changed a little bit between Bytes 0.5 and 1.x,
but the concepts are basically the same, so we just need to change the
callsites that were calling `bytes()` and have them ask for `chunk()` instead.
This diff attempts to be as small as it can (and it's already quite big). I
didn't attempt to update *everything*: I only updated whatever was needed to
keep `common/rust/tools/scripts/check_all.sh` passing.
However, there are a few changes that fall out of this. I'll outline them here:
## `BufExt`
One little caveat is the `copy_to_bytes` we had on `BufExt`. This was
introduced into Bytes 1.x (under that name), but we can't use it here directly.
The reason we can't is that the instance we have is a `Cursor<Bytes>`, which
receives an implementation of `copy_to_bytes` via:
```
impl<T: AsRef<[u8]>> Buf for std::io::Cursor<T>
```
This means that implementation isn't capable of using the optimized
`Bytes::copy_to_bytes`, which doesn't do a copy at all. So, instead, we need
to use a dedicated method on `Cursor<Bytes>`: `copy_or_reuse_bytes`.
## Calls to `Buf::to_bytes()`
This method is gone in Bytes 1.x, and replaced by the idiom
`x.copy_to_bytes(x.remaining())`, so I updated callsites of `to_bytes()`
accordingly.
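A small sketch of that idiom against the Bytes 1.x API (`drain_to_bytes` is a made-up helper):
```
use bytes::{Buf, Bytes};

fn drain_to_bytes(mut buf: impl Buf) -> Bytes {
    // Bytes 0.5: buf.to_bytes()
    // Bytes 1.x equivalent:
    buf.copy_to_bytes(buf.remaining())
}

fn main() {
    let out = drain_to_bytes(&b"hello"[..]);
    assert_eq!(out, Bytes::from_static(b"hello"));
}
```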
## `fbthrift_ext`
This set of crates provides transports for Thrift calls that rely on Tokio 0.2
for I/O. Unfortunately, Tokio 0.2 uses Bytes 0.5, so that doesn't work well.
For now, I included a copy here (there was only one required, when reading from
the socket). This can be removed if we update the whole `fbthrift_ext` stack to
Bytes 1.x. fanzeyi had been wanting to update this to Tokio 1.x, but was blocked on `thrift/lib/rust` using Bytes 0.5, and confirmed that the overhead of a copy here is fine (besides, this code can now be updated to Tokio 1.x to remove the copy).
## Crates using both Bytes 0.5 & Bytes 1.x
This was mostly the case in Mononoke. That's no coincidence: this is why I'm
working on this. There, I had to make changes that consist of removing copies
between Bytes 0.5 and Bytes 1.x.
## Misuse of `Buf::bytes()`
Some places use `bytes()` when they probably mean to use `copy_to_bytes()`. For
now, I updated those to use `chunk()`, which keeps the behavior the same but
keeps the code buggy. I filed T91156115 to track fixing those (in all
likelihood I will file tasks for the relevant teams).
Reviewed By: dtolnay
Differential Revision: D28537964
fbshipit-source-id: ca42a614036bc3cb08b21a572166c4add72520ad
Summary: Change from a bool to an enum in preparation for adding a third value later in the stack.
Reviewed By: farnz
Differential Revision: D28472922
fbshipit-source-id: 5c25e7e03f1d29eff0455282e43e359efaf9b942
Summary:
Currently, we take a `&mut (dyn Buf + Send)` as input when writing to Manifold.
This has a couple of downsides:
- It's not very ergonomic. From the API it's not obvious what mutations are
going to be done on your `Buf` exactly (it's going to be consumed entirely),
and it often results in code that's a little clumsy (see what I had to change
here), where we make calls like `write(&key, &mut mydata.clone())`.
- It limits what we can do with it. If you already have an `IOBufShared` on
hand, you shouldn't need to copy it in order to pass it to Manifold, but
currently the Manifold client code does have to copy it because all it sees
is a `dyn Buf`.
This diff updates the client to take an `IOBufShared` directly, which has plenty
of convenient `From<...>` conversions (and some efficient ones like
`From<Bytes>`, which doesn't copy the `Bytes` at all).
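A minimal sketch of the call-site change (the real Manifold client and `IOBufShared` are stubbed out here with hypothetical stand-ins):
```
use bytes::Bytes;

// Stub: a stand-in for the real shared-IOBuf type.
struct IOBufShared(Bytes);

impl From<Bytes> for IOBufShared {
    // Bytes is reference-counted, so this conversion copies nothing.
    fn from(b: Bytes) -> Self {
        IOBufShared(b)
    }
}

struct ManifoldClient;

impl ManifoldClient {
    // New-style API: takes the buffer by value instead of `&mut dyn Buf`.
    fn write(&self, _key: &str, _data: IOBufShared) {}
}

fn main() {
    let client = ManifoldClient;
    let payload = Bytes::from_static(b"hello");
    // No more `write(&key, &mut payload.clone())`.
    client.write("my/key", IOBufShared::from(payload));
}
```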
Reviewed By: ahornby, Imxset21
Differential Revision: D28535539
fbshipit-source-id: bba1b963a96350ad57cc0fbfcc31f7e1eb36c317
Summary:
Paying the setup and teardown overhead of multiple processes seems silly when we can pack in parallel in a single process.
Make it possible to run multiple packing runs from a single packer process.
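A minimal sketch of the idea using plain threads (the real packer's scheduling will differ):
```
use std::thread;

// Hypothetical stand-in for one packing run over a batch of keys.
fn run_pack(run_id: usize) {
    println!("packing run {} done", run_id);
}

fn main() {
    // Several packing runs in one process: one thread each, with no
    // per-process setup/teardown cost.
    let handles: Vec<_> = (0..4)
        .map(|i| thread::spawn(move || run_pack(i)))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```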
Reviewed By: ahornby
Differential Revision: D28508527
fbshipit-source-id: eab07d028db46d62731f06effbde2f5bc5579000
Summary: The new and new_with_ttl constructors were causing duplication. Combine them.
Reviewed By: farnz
Differential Revision: D28471162
fbshipit-source-id: c7095a9a337d0ccbf5cd15ac3650cd5b361aaebf
Summary:
manual_scrub was using the ordered form of buffered so that the checkpoint was written correctly.
This diff switches to buffered_unordered, which can give better throughput. To make that safe, checkpointing uses a tracker to know which keys have completed, so it can save the latest done key that has no preceding keys still executing.
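A minimal sketch of the tracker idea, assuming keys are issued in order but complete out of order (all names made up):
```
use std::collections::BTreeMap;

struct CheckpointTracker {
    next_seq: u64,               // sequence number for the next key issued
    frontier: u64,               // every seq below this has completed
    done: BTreeMap<u64, String>, // completed keys waiting on predecessors
    checkpoint: Option<String>,  // latest key that is safe to checkpoint
}

impl CheckpointTracker {
    fn issue(&mut self) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        seq
    }

    fn complete(&mut self, seq: u64, key: String) {
        self.done.insert(seq, key);
        // Advance over any contiguous run of completed keys; the last
        // one reached has no preceding keys still executing.
        while let Some(key) = self.done.remove(&self.frontier) {
            self.checkpoint = Some(key);
            self.frontier += 1;
        }
    }
}

fn main() {
    let mut t = CheckpointTracker {
        next_seq: 0,
        frontier: 0,
        done: BTreeMap::new(),
        checkpoint: None,
    };
    let a = t.issue();
    let b = t.issue();
    t.complete(b, "key-b".to_string());
    assert_eq!(t.checkpoint, None); // key-a is still executing
    t.complete(a, "key-a".to_string());
    assert_eq!(t.checkpoint, Some("key-b".to_string()));
}
```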
Reviewed By: farnz
Differential Revision: D28438371
fbshipit-source-id: 274aa371a0c33d37d0dc7779b04daec2b5e1bc15
Summary: This is useful to get an idea of what the scrub is doing
Reviewed By: farnz
Differential Revision: D28417785
fbshipit-source-id: 1421e0aae13f43371d4c0d066c08aee80b17e9c0
Summary: Write-mostly stores are often in the process of being populated. Add an option to control whether scrub errors are raised for missing values in write-mostly stores.
Differential Revision: D28393689
fbshipit-source-id: dfc371dcc3b591beadead82608a747958b53f580
Summary: Log the blobstore stack being used for the scrub
Reviewed By: farnz
Differential Revision: D28408340
fbshipit-source-id: 2299f7f7397f48d70b9a8295f0aa28c89bbf5809
Summary: Log the blobstore id as part of sampled pack info. This allows running the walker pack info logging directly against a multiplex rather than invoking it for one component at a time.
Reviewed By: farnz
Differential Revision: D28264093
fbshipit-source-id: 0502175200190527b7cc1cf3c48b8154c8b27c90
Summary:
These are undermaintained, and need an update for oncall support. Start by moving to CXX, which makes maintenance easier.
In the process, I've fixed a couple of oddities in the API that were either due to the age of the code, or due to misunderstandings propagating through bindgen that CXX blocks, and fixed up the users of those APIs.
Reviewed By: dtolnay
Differential Revision: D28264737
fbshipit-source-id: d18c3fc5bfce280bd69ea2a5205242607ef23f28
Summary:
Because cachelib is not initialised at this point, it returns `None` unconditionally.
I'm refactoring the cachelib bindings so that this returns an error; take it out completely for now, leaving room to add it back in if caching is useful here.
Reviewed By: sfilipco
Differential Revision: D28286986
fbshipit-source-id: cd9f43425a9ae8f0eef6fd32b8cd0615db9af5f6
Summary: This wants to use Scuba so it needs this.
Reviewed By: StanislavGlebik
Differential Revision: D28282511
fbshipit-source-id: 6d3a2b6316084f7e16f5a2f92cfae1d101a9c2d3
Summary: This update makes it so that we don't log versions to scuba from tests.
Reviewed By: krallin
Differential Revision: D27449808
fbshipit-source-id: 9c79e83fbfdf3d9a02c2cfc8b6a8255edb4241fe
Summary:
This is going to enable the background update in SegmentedChangelog to log
entries to Scuba.
The scuba sample builder is not fundamentally different from other elements of
the environment. It is used slightly differently to, for example, Logger,
because it has to be cloned in all places that want to log rows, but otherwise
it has the same characteristics.
Reviewed By: krallin
Differential Revision: D28210008
fbshipit-source-id: 68468868d13f29dddf21095bd7526cb4ff690786
Summary:
Add `--blame-v2` to `mononoke_admin blame compute`. This can be used to compute
blames in the new format and validate that they are correct.
Reviewed By: mitrandir77
Differential Revision: D28183160
fbshipit-source-id: f698a77c109bfce05aeb66cd405c6f20bf158801
Summary:
Refactoring CoreContext construction to express purpose. We will have
a constructor for request processing and one for background processing.
Bulk processing is another category that has its own constructor
already. Renaming it to make it more prominent.
Reviewed By: krallin
Differential Revision: D28210006
fbshipit-source-id: 2bb74d97e2f3588aa539e58c3d6dd6842f898121
Summary: The upstream crate has landed my PR for zstd 1.4.9 support and made a release, so we can remove this patch now.
Reviewed By: ikostia
Differential Revision: D28221163
fbshipit-source-id: b95a6bee4f0c8d11f495dc17b2737c9ac9142b36
Summary: The manual scrub success file can be very large and have a fairly high IO rate. When checkpointing keys to a file it's not really needed, so make it optional.
Reviewed By: farnz
Differential Revision: D28199084
fbshipit-source-id: 83d946f7ab8dc6f5f17f94b6a1c3818d9af7b0b0
Summary:
Right now, if your batch size is 1K we prefetch 1K unodes in parallel. This
tends to result in e.g. timeouts, as various things get starved for CPU.
Let's stop doing that.
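A hedged sketch of the general fix pattern, capping concurrency with a small constant instead of the batch size (the constant and helpers are illustrative, not the actual derivation code):
```
use futures::stream::{self, StreamExt, TryStreamExt};

const PREFETCH_CONCURRENCY: usize = 100; // decoupled from batch size

// Hypothetical stand-in for fetching one unode.
async fn fetch_unode(id: u64) -> Result<u64, ()> {
    Ok(id)
}

async fn prefetch_unodes(ids: Vec<u64>) -> Result<Vec<u64>, ()> {
    stream::iter(ids)
        .map(fetch_unode)
        .buffered(PREFETCH_CONCURRENCY) // not `ids.len()`
        .try_collect()
        .await
}
```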
Reviewed By: StanislavGlebik
Differential Revision: D28183685
fbshipit-source-id: d8353ae8e36921a485b982a1043b81f443258098
Summary:
Our batch sizes are a bit crazy here and are causing the backfiller to OOM if there
actually are that many commits to derive. Lower them, a lot.
Reviewed By: StanislavGlebik
Differential Revision: D28183686
fbshipit-source-id: 54b546c4507f65c34a264df283516b5d62408a66
Summary: The packer was adding the repo prefix as part of the pack key, which meant that the same content in different repos had different binary forms. This change fixes the prefix handling.
Reviewed By: farnz
Differential Revision: D28119422
fbshipit-source-id: 338e17885abd8cfca12d5bb399244039dbf22e63