Summary:
This diff changes failure_ext::Error from reexporting failure::Error to anyhow::Error instead.
Failure's error type requires all errors to have an implementation of failure's own failure::Fail trait in order for cause chains and backtraces to work. The necessary methods for this functionality have made their way into the standard library error trait, so modern error libraries build directly on std::error::Error rather than something like failure::Fail. Once we are no longer tied to failure 0.1's Fail trait, different parts of the codebase will be free to use any std::error::Error-based libraries they like while still working nicely together.
Sadly this diff is not the end of the remove-failure stack; it is near the middle. This diff is the point where we've made failure look as much as possible like anyhow and made anyhow look as much like failure via failure_ext, such that we can atomically cut over with a relatively small diff. That means some of the code here isn't necessarily idiomatic. Later diffs will follow up on removing failurisms like the bail_err / bail_msg / ensure_err / ensure_msg distinction which should no longer be necessary.
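To illustrate the point about cause chains now living in the standard library trait, here is a minimal sketch using only `std`. The `ParseError` and `ConfigError` types and the `chain_messages` helper are hypothetical names made up for this example; they are not types from this codebase.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical low-level error, for illustration only.
#[derive(Debug)]
struct ParseError;

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid syntax")
    }
}

impl Error for ParseError {}

// Hypothetical higher-level error that records its cause.
#[derive(Debug)]
struct ConfigError {
    cause: ParseError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to load config")
    }
}

impl Error for ConfigError {
    // `source` is the standard-library hook for cause chains.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

// Walk the chain from the outermost error down to the root cause.
fn chain_messages(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = Vec::new();
    let mut current = Some(err);
    while let Some(e) = current {
        messages.push(e.to_string());
        current = e.source();
    }
    messages
}
```

Any library that builds on `std::error::Error` can walk such a chain without knowing about `failure::Fail`.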
Reviewed By: Imxset21
Differential Revision: D18577341
fbshipit-source-id: 5689c3c9ddeaa79a123e710831986cf4656a5205
Summary: This is intended to be used by both the ApiServer and the SCS.
Reviewed By: StanislavGlebik
Differential Revision: D18475673
fbshipit-source-id: 5aafabadff4ed2fd2fa3b33f26ae30f56096a757
Summary:
As part of the LFS Server ACL checking, we want to be able to specify an
ACL to check for a repo. Add a new field to the repo config that allows for this.
Reviewed By: mitrandir77
Differential Revision: D18452654
fbshipit-source-id: 23a6f0a8d8a827705d69601c7c940cbb0da792e2
Summary:
This diff replaces code of the form:
```
use failure::Fail;
#[derive(Fail, Debug)]
pub enum ErrorKind {
    #[fail(display = "something failed {} times", _0)]
    Failed(usize),
}
```
with:
```
use thiserror::Error;
#[derive(Error, Debug)]
pub enum ErrorKind {
    #[error("something failed {0} times")]
    Failed(usize),
}
```
The former emits an implementation of failure 0.1's `Fail` trait while the latter emits an impl of `std::error::Error`. Failure provides a blanket impl of `Fail` for any type that implements `Error`, so these `Error` impls are strictly more general. Each of these error types will continue to have exactly the same `Fail` impl that it did before this change, but now also has the appropriate `std::error::Error` impl which sets us up for dropping our various dependencies on `Fail` throughout the codebase.
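For reference, a hand-written `std`-only version of such an error type might look roughly like the sketch below. This is not the literal expansion of the `thiserror` derive, which may differ in detail; the `always_fails` function is a hypothetical helper added to show the conversion into a boxed `std::error::Error`.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub enum ErrorKind {
    Failed(usize),
}

// Hand-written Display impl carrying the same message as the attribute.
impl fmt::Display for ErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ErrorKind::Failed(n) => write!(f, "something failed {} times", n),
        }
    }
}

// The std trait needs no required methods once Debug and Display exist.
impl Error for ErrorKind {}

// Hypothetical helper: any `std::error::Error` type converts into a
// boxed trait object through the standard `From` impl.
pub fn always_fails() -> Result<(), Box<dyn Error>> {
    Err(ErrorKind::Failed(3).into())
}
```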
Reviewed By: Imxset21
Differential Revision: D18523700
fbshipit-source-id: 0e43b10d5dfa79820663212391ecbf4aeaac2d41
Summary:
The file hook cache stores the result of running a hook on a specific
file. It allows the result of expensive hooks to be reused. However, there are
some limitations that mean the cache is practically useless. The hook cache
will be populated when a push happens to the server. The only case in which the
cached result is used is if someone pushes a file that exactly matches a
previous push to the exact same server (which hasn't been restarted). In
practice, this is very unlikely to happen.
In order to include the hook cache, the code for file and changeset hooks is
quite different. Removing the cache allows for some unification (such as a
single run_hook function) and simplification. I've also added logging to
the changeset hooks, as previously we only printed debug output for the file
hooks.
Further, it allows the removal of asyncmemo in a few places.
Reviewed By: StanislavGlebik
Differential Revision: D17932986
fbshipit-source-id: df8220aea7511a00aeb6b9de615e15d657bf4602
Summary:
In the multi-directional-sync world it is not enough to have a single direction
specified for all small repos, so let's move it to the small repo config level.
This prompts some changes in how we do verification:
- no overlapping prefixes are allowed among all small repos with small-to-large
sync direction (when small repos tail into the large repo, we cannot deny
a pushrebase because of the path conflicts (since the small repo has already
accepted the commit in the past), so let's prohibit path conflicts in the first place)
- no overlapping prefixes are allowed between *all* small-to-large repos and
*each* large-to-small repo (same reason, we cannot reject tailed commits,
so we need to protect ourselves from path conflicts)
- overlapping *is* allowed between large-to-small repos, but only when used
prefixes are identical, not when one prefix is a prefix of another prefix.
To be clear, I am not certain that we cannot accommodate the latter case,
but I just want to avoid additional complexity so far. Overlapped prefixes
represent the "locked across repos" directories: ones that can be changed
from each of the small repos as part of a push redirector pushrebase.
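The three rules above can be sketched as a pairwise check. This is only an illustration under stated assumptions: the `Direction` and `SmallRepo` types and the function names are hypothetical, prefixes are assumed to be non-empty `/`-separated paths without trailing slashes, and the real verification code may be structured differently.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Direction {
    SmallToLarge,
    LargeToSmall,
}

// Hypothetical stand-in for a small repo's sync config.
struct SmallRepo {
    direction: Direction,
    prefixes: Vec<&'static str>,
}

// True if `a` equals `b` or is a path-component prefix of `b`.
fn is_prefix_of(a: &str, b: &str) -> bool {
    let mut b_parts = b.split('/');
    a.split('/').all(|part| b_parts.next() == Some(part))
}

fn overlaps(a: &str, b: &str) -> bool {
    is_prefix_of(a, b) || is_prefix_of(b, a)
}

// Overlap is only tolerated between two large-to-small repos,
// and only when the overlapping prefixes are identical.
fn pair_is_allowed(a: &SmallRepo, b: &SmallRepo) -> bool {
    let both_large_to_small =
        a.direction == Direction::LargeToSmall && b.direction == Direction::LargeToSmall;
    a.prefixes.iter().all(|pa| {
        b.prefixes
            .iter()
            .all(|pb| !overlaps(pa, pb) || (both_large_to_small && pa == pb))
    })
}
```

A full verifier would run `pair_is_allowed` over every pair of small repos and reject the config on the first failure.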
Reviewed By: StanislavGlebik
Differential Revision: D17930877
fbshipit-source-id: 97f0ab8f5975f9635e84716851fe42c8cc8800f5
Summary:
While we have the `RepositoryId` dedicated type, the `RepoConfig` struct used
`i32` for its `repoid` field, so in a lot of places that referenced or used
this config we continued to use `i32`. This is sad, so let's fix this.
This diff changes `RepoConfig` and everything that depends on it.
Ideally we would also just remove the `.id()` method of the `RepositoryId`
struct, so that our clients have to use the struct, but in a few places it
is deserialized into thrift/SQL types, so unless we move those transformations
into the same file (which also seems bad and not future-proof), we cannot get
rid of the method.
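A minimal sketch of the newtype pattern discussed here, for readers unfamiliar with it. Only the names `RepositoryId`, `RepoConfig`, `repoid` and `.id()` come from this diff; the constructor, the derives and the field layout are assumptions made for the example.

```rust
// Newtype wrapper: a repo id can no longer be confused with an arbitrary i32.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct RepositoryId(i32);

impl RepositoryId {
    // Assumed constructor, for illustration.
    pub fn new(id: i32) -> Self {
        RepositoryId(id)
    }

    // Escape hatch kept for serialization boundaries (thrift/SQL).
    pub fn id(&self) -> i32 {
        self.0
    }
}

// Reduced to the one field relevant here.
pub struct RepoConfig {
    pub repoid: RepositoryId,
}
```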
Reviewed By: farnz
Differential Revision: D17928574
fbshipit-source-id: 3d9355272cfcd20af787edd6417cc529be640356
Summary:
This updates the repo client and hook tailer to use the text only store.
This does represent a slight behavior change in hooks for all repos (towards allowing more things) since we no longer run things like conflict marker checks on files that are too big. That seems like a reasonable thing to do, though.
Reviewed By: StanislavGlebik
Differential Revision: D17930266
fbshipit-source-id: 0abdf2382ec6b45558002c6aeed46da0acb840ea
Summary: I'm about to add a `WireprotoLogging` struct, so the existing one needs to be renamed.
Reviewed By: ikostia
Differential Revision: D17878970
fbshipit-source-id: 2966f5cb8c8d2399e1691ef1c710dd9362a976ee
Summary:
This diff updates all license headers to use the new text and style.
Also, a few internal files were missing the header, but now they have it.
`fbcode/common/rust/netstring/` had the internal header, but now it has
GPLV2PLUS - since that goes to Mononoke's Github too.
Differential Revision: D17881539
fbshipit-source-id: b70d2ee41d2019fc7c2fe458627f0f7c01978186
Summary:
This diff achieves two things:
- configs are verified from this point of view when read, not just when
compiled
- we have a way to get config by repoid if we have `RepoConfigs`
Reviewed By: farnz
Differential Revision: D17460214
fbshipit-source-id: cedce280d6f8209c2f4e1a4ff0d53780242bac30
Summary:
`movers` are functions that we use to shift file paths when syncing commits.
These functions should be automatically buildable from repo-sync configs
for both small-to-large and large-to-small sync directions.
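As a rough sketch of the idea, a mover can be modelled as a boxed closure built from a prefix map, with the reverse direction obtained by inverting the map. The `Mover` alias, `prefix_mover` and `invert` below are hypothetical names, and the sketch assumes string prefixes that end in `/` (or are empty); the real config-driven builders may look quite different.

```rust
// A mover maps a source path to a destination path, or `None` when the
// path has no equivalent on the other side.
type Mover = Box<dyn Fn(&str) -> Option<String>>;

// Build a mover that rewrites the first matching prefix.
fn prefix_mover(map: Vec<(String, String)>) -> Mover {
    Box::new(move |path: &str| {
        map.iter().find_map(|(from, to)| {
            path.strip_prefix(from.as_str())
                .map(|rest| format!("{}{}", to, rest))
        })
    })
}

// Swap each (from, to) pair to get the map for the opposite direction.
fn invert(map: Vec<(String, String)>) -> Vec<(String, String)> {
    map.into_iter().map(|(from, to)| (to, from)).collect()
}
```

With a single entry mapping the empty prefix to `small/`, the forward mover places every file under `small/`, and the inverted mover strips that prefix again while returning `None` for paths outside it.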
Reviewed By: farnz
Differential Revision: D17395844
fbshipit-source-id: 25ec9b06ba5908d8c125702a712b3cf782ccffca
Summary: I think these are left over from pre-2018 code where they may have been necessary. In 2018 edition, import paths in `use` always begin with a crate name or `crate`/`super`/`self`, so `use $ident;` always refers to a crate. Since extern crates are always in scope in every module, `use $ident` does nothing.
Reviewed By: Imxset21
Differential Revision: D17290473
fbshipit-source-id: 23d86e5d0dcd5c2d4e53c7a36b4267101dd4b45c
Summary:
Commit sync will operate based on the following idea:
- there's one "large" and potentially many "small" repos
- there are two possible sync directions: large-to-small and small-to-large
- when syncing a small repo into a large repo, it is allowed to change paths
of each individual file, but not to drop files
- large repo prepends a predefined prefix to every bookmark name from the small repo, except for the bookmarks specified in the `common_pushrebase_bookmarks` list, which are the bookmarks that can be advanced by any small repo
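The bookmark rule in the last bullet can be sketched as follows. The function name and signature are hypothetical; only `common_pushrebase_bookmarks` is taken from the description above.

```rust
// Compute the large-repo name for a bookmark coming from a small repo.
fn large_repo_bookmark(
    name: &str,
    prefix: &str,
    common_pushrebase_bookmarks: &[&str],
) -> String {
    if common_pushrebase_bookmarks.contains(&name) {
        // Shared bookmarks keep their name in the large repo.
        name.to_string()
    } else {
        // Everything else is namespaced under the small repo's prefix.
        format!("{}{}", prefix, name)
    }
}
```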
Reviewed By: krallin
Differential Revision: D17258451
fbshipit-source-id: 6cdaccd0374250f6bbdcbc9a280da89ccd7dff97
Summary:
This is the mechanical part of the rename. It does not change any commit
messages in tests, nor the scuba table name/config setting. Those are more
complex.
Reviewed By: krallin
Differential Revision: D16890120
fbshipit-source-id: 966c0066f5e959631995a1abcc7123549f7495b6
Summary: This updates our repo config to allow passing through Filestore params. This will be useful to conditionally enable Filestore chunking for new repos.
Reviewed By: HarveyHunt
Differential Revision: D16580700
fbshipit-source-id: b624bb524f0a939f9ce11f9c2983d49f91df855a
Summary:
In earlier diffs in this stack, I updated the callsites that reference XDB tiers to use concrete &str types (which is what they were receiving until now ... but it wasn't spelled out as-is).
In this diff, I'm updating them to use owned `String` instead, which lets us hoist up `to_string()` and `clone()` calls in the stack, rather than pass down references only to copy them later on.
This allows us to skip some unnecessary copies. It turns out we were doing quite a few "turn this String into a reference, pass it down the stack, then turn it back into a String".
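A small before/after sketch of the pattern. The `Connection` type and both function names are hypothetical and only serve to show where the copy happens.

```rust
// Hypothetical holder of a tier name.
struct Connection {
    tier: String,
}

// Before: the callee receives a reference and has to copy it, even when
// the caller already owned a String it no longer needs.
fn connect_by_ref(tier: &str) -> Connection {
    Connection {
        tier: tier.to_string(),
    }
}

// After: the callee takes ownership, so the caller decides whether a
// clone is needed at all.
fn connect_owned(tier: String) -> Connection {
    Connection { tier }
}
```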
Reviewed By: farnz
Differential Revision: D16260372
fbshipit-source-id: faec402a575833f6555130cccdc04e79ddb8cfef
Summary:
Report to Scuba whenever someone tries to access a blobstore which is blacklisted. Scuba reporting is done for any `get` or `put` method call.
Because of the possible overload (given the high number of requests Mononoke receives, and that CensoredBlobstore performs the verification before we add the caching layer for blobstores), I chose to report at most one bad request per second. If multiple requests to blacklisted blobstores are made in less than one second, only the first request is reported. Not reporting all of them is not ideal, but performance-wise it is the best solution.
NOTE: I also wrote an implementation using `RwLock` (instead of the current `AtomicI64`), but atomic variables should be faster than locks, so I gave up on that idea.
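For illustration, one way to get "at most one report per second" out of an `AtomicI64` is a compare-and-swap on the timestamp of the last report. This is a sketch under assumptions, not the code in this diff: `ReportThrottle` and `should_report` are hypothetical names, and the current time is passed in as whole seconds to keep the example deterministic.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Hypothetical throttle: remembers the second of the last report.
pub struct ReportThrottle {
    last_report_secs: AtomicI64,
}

impl ReportThrottle {
    pub fn new() -> Self {
        ReportThrottle {
            last_report_secs: AtomicI64::new(i64::MIN),
        }
    }

    // Returns true for at most one caller per distinct second.
    pub fn should_report(&self, now_secs: i64) -> bool {
        let last = self.last_report_secs.load(Ordering::Relaxed);
        if now_secs <= last {
            // Someone already reported during this second.
            return false;
        }
        // Only the thread that wins the swap gets to report.
        self.last_report_secs
            .compare_exchange(last, now_secs, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
    }
}
```

Callers that lose the race simply skip reporting, so no thread ever blocks on the hot `get`/`put` path.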
Reviewed By: ikostia, StanislavGlebik
Differential Revision: D16108456
fbshipit-source-id: 9e5338c50a1c7d15f823a2b8af177ffdb99e399f
Summary:
This adds the ability to provide an infinitepush namespace configuration without actually allowing infinite pushes server side. This is useful while Mercurial is the write master for Infinite Push commits, for two reasons:
- It lets us enable the infinitepush namespace, which will allow the sync to proceed between Mercurial and Mononoke, and also prevents users from making regular pushes into the infinitepush namespace.
- It lets us prevent users from sending commit cloud backups to Mononoke (we had an instance of this reported in the Source Control @ FB group).
Note that since we are routing backfills through the shadow tier, I've left infinitepush enabled there.
Reviewed By: StanislavGlebik
Differential Revision: D16071684
fbshipit-source-id: 21e26f892214e40d94358074a9166a8541b43e88
Summary:
`MultiplexedBlobstore` can hide errors up until we suddenly lose availability of the wrong blobstore. Introduce an opt-in `ScrubBlobstore`, which functions as a `MultiplexedBlobstore` but checks that the combination of blobstores and healer queue should result in no data loss.
Use this new blobstore in the blobrepo checker, so that we can be confident that data is safe.
Later, this blobstore should trigger the healer to fix "obvious" problems.
Reviewed By: krallin
Differential Revision: D15353422
fbshipit-source-id: 83bb73261f8ae291285890324473f5fc078a4a87
Summary:
Added an option to control for which repositories censoring should be
enabled or disabled. The option is added in `server.toml` as `censoring` and it
is set to true or false. If `censoring` is not specified, then the default
option is set to true (censoring is enabled).
By disabling `censoring`, the verification of whether the key is blacklisted is
omitted, therefore all the files are fetchable.
Reviewed By: ikostia
Differential Revision: D16029509
fbshipit-source-id: e9822c917fbcec3b3683d0e3619d0ef340a44926
Summary:
Add config option to set the load limiter category, to be used by the
LoadLimiter library.
Reviewed By: krallin
Differential Revision: D15628073
fbshipit-source-id: 8df22badeb2b255e44b4675f5b6701c63c00d0c8
Summary:
This is the final step in making sure we have control over whether
non-pushrebase pushes are supported by a given repo.
Reviewed By: krallin
Differential Revision: D15522276
fbshipit-source-id: 7e3228f7f0836f3dcd0b1a3b2500545342af1c5e
Summary: This is the first step towards per-repo control of whether pushes are allowed.
Reviewed By: StanislavGlebik
Differential Revision: D15519959
fbshipit-source-id: a0bb96bd995af7df0cef225c73d559f309cfe592
Summary:
This adds a sanity check that limits the count of matches in `list_all_bookmarks_with_prefix`.
If we find more matches than the limit, then an error will be returned (right now, we don't have support for e.g. offsets in this functionality, so the only alternative approach is for the caller to retry with a more specific pattern).
The underlying goal is to ensure that we don't trivially expose Mononoke to accidental denial of service when a client lists `*` and we end up querying literally all bookmarks.
I picked a fairly conservative limit here (500,000), which is > 5 times the number of bookmarks we currently have (we can load what we have right now successfully... but it's pretty slow).
Note that listing pull default bookmarks is not affected by this limit: this limit is only used when our query includes scratch bookmarks.
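The shape of the check is roughly the following. This in-memory sketch is only meant to show the "error instead of truncation" behavior; `list_with_limit` and `ListError` are hypothetical names, and the real `list_all_bookmarks_with_prefix` is backed by a database query rather than a slice.

```rust
#[derive(Debug, PartialEq)]
enum ListError {
    TooManyMatches { limit: usize },
}

// Return all names matching `prefix`, or an error once more than
// `limit` matches are seen (rather than silently truncating).
fn list_with_limit<'a>(
    bookmarks: &[&'a str],
    prefix: &str,
    limit: usize,
) -> Result<Vec<&'a str>, ListError> {
    let mut matches = Vec::new();
    for name in bookmarks.iter().filter(|n| n.starts_with(prefix)) {
        if matches.len() == limit {
            // One match too many: stop scanning and fail.
            return Err(ListError::TooManyMatches { limit });
        }
        matches.push(*name);
    }
    Ok(matches)
}
```

On error, the caller's only option is to retry with a more specific pattern, as described above.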
Reviewed By: StanislavGlebik
Differential Revision: D15413620
fbshipit-source-id: 1030204010d78a53372049ff282470cdc8187820
Summary:
This updates our receive path for B2xInfinitepush to create new scratch bookmarks.
Those scratch bookmarks will:
- Be non-publishing.
- Be non-pull-default.
- Not be replicated to Mercurial (there is no entry in the update log).
I added a sanity check on infinite pushes to validate that bookmarks fall within a given namespace (which is represented as a Regexp in configuration). We'll want to determine whether this is a good mechanism and what the regexp for this should be prior to landing (I'm also considering adding a soft-block mode that would just ignore the push instead of blocking it).
This ensures that someone cannot accidentally perform an infinitepush onto master by tweaking their client-side configuration.
---
Note that, as of this diff, we do not support the B2xInfinitepushBookmarks part (i.e. backup bookmarks). We might do that separately later, but if we do, it won't be through scratch Bookmarks (we have too many backup bookmarks for this to work)
Reviewed By: StanislavGlebik
Differential Revision: D15364677
fbshipit-source-id: 23e67d4c3138716c791bb8050459698f8b721277
Summary:
As part of adding support for infinitepush in Mononoke, we'll include additional server-side metadata on Bookmarks (specifically, whether they are publishing and pull-default).
However, we do use the name `Bookmark` right now to just reference a Bookmark name. This patch updates all references to `Bookmark` to `BookmarkName` in order to free up `Bookmark`.
Reviewed By: StanislavGlebik
Differential Revision: D15364674
fbshipit-source-id: 126142e24e4361c19d1a6e20daa28bc793fb8686
Summary:
There was a request about importing a GitHub repo into fbsource. While pushing
it to Mononoke with pushrebase disabled, the sync job broke because it can only
handle pushrebase pushes.
Before this diff, pushrebase had a repo-level config about whether dates need
to be rewritten. We definitely want "master" to have date rewriting turned on,
but not the imported commits. This diff adds logic to turn off date rewriting
for bookmarks by using the `rewrite_dates` config, to address the repo import
requirement.
Reviewed By: StanislavGlebik
Differential Revision: D15291030
fbshipit-source-id: 8dcf8359d7de9ac33f0af6f9ab3bcbac424323e4
Summary:
Some tests were failing because their syntax wasn't updated, not
because of the thing they're testing for. Add a check for the error string as
well.
Reviewed By: StanislavGlebik
Differential Revision: D15280521
fbshipit-source-id: 81402fae6854811a8e386ee4d7f37139f0489035
Summary:
This change has two goals:
- Put storage configuration that's common to multiple repos in a common place,
rather than replicating it in each server.toml
- Allow tools that only operate on the blobstore level (like blobstore healing)
to be configured directly in terms of the blobstore, rather than indirectly
by using a representative repo config.
This change makes several changes to repo configuration:
1. There's a separate common/storage.toml which defines named storage
configurations (ie, a combination of a blobstore and metadata DB)
2. server.toml files can also define local storage configurations (mostly
useful for testing)
3. server.toml files now reference which storage they're using with
`storage_config = "name"`.
4. Configuration of multiplex blobstores is now explicit. Previously if a
server.toml defined multiple blobstores, it was assumed that it was a
multiplex. Now storage configuration only accepts a single blobstore config,
but that config can be explicitly a multiplexed blobstore, which has the
sub-blobstores defined within it, in the `components` field. (This is
recursive, so it could be nested, but I'm not sure if this has much value in
practice.)
5. Makes configuration parsing more strict - unknown fields will be treated as
an error rather than ignored. This helps flag problems in refactoring/updating
configs.
I've updated all the configs to the new format, both production and in
integration tests. Please review to make sure I haven't broken anything.
Reviewed By: StanislavGlebik
Differential Revision: D15065423
fbshipit-source-id: b7ce58e46e91877f4e15518c014496fb826fe03c
Summary:
This migrates the internal structures representing the repo and storage config,
while retaining the existing config file format.
The `RepoType` type has been replaced by `BlobConfig`, an enum containing all
the config information for all the supported blobstores. In addition there's
the `StorageConfig` type which includes `BlobConfig`, and also
`MetadataDBConfig` for the local or remote SQL database for metadata.
Reviewed By: StanislavGlebik
Differential Revision: D15065421
fbshipit-source-id: 47636074fceb6a7e35524f667376a5bb05bd8612
Summary:
We don't need Option<bool> or Option<Vec<T>> - in the former case, the
bool is always treated as having a default value if not present, and in the
latter, None is equivalent to Some(vec![]), so just use an empty vector for
absence.
Reviewed By: lukaspiatkowski
Differential Revision: D15051895
fbshipit-source-id: 0ac6f2e6b13357bf6e30dbfa25c7fdebd208e505
Summary:
This updates our configuration to allow using a different tier name for sharded filenodes.
One thing I'd like to call out is that we currently use the DB tier name in the keys generated by `CachingFilenodes`. Updating the tier name will therefore result in us dropping all our caches. Is this acceptable? If not, should we just continue using the old tier name?
Reviewed By: jsgf, StanislavGlebik
Differential Revision: D15243112
fbshipit-source-id: 3bfdcefcc823768f2964b4733e570e9cef57cebc
Summary: Use the existing library function to read a file into a `Vec<u8>`.
Reviewed By: aslpavel
Differential Revision: D15051894
fbshipit-source-id: 853b31450556c0a2e74a09fa06e7814ac68b1052
Summary:
The config will be used to whitelist connections with certain identities and
blacklist everything else.
Differential Revision: D15150921
fbshipit-source-id: e4090072ea6ba9714575fb8104d9f45e92c6fefb
Summary:
Disallow unknown fields. They're generally the result of mis-editing
a file and putting the config in the wrong place, or some incomplete refactor.
Reviewed By: StanislavGlebik
Differential Revision: D15168963
fbshipit-source-id: a9c9658378cda4866e44daf6e2c6bfbdfcdb9f84
Summary: Now it is possible to configure and enable/disable the bookmark cache from configs.
Reviewed By: StanislavGlebik
Differential Revision: D14952840
fbshipit-source-id: 3080f7ca4639da00d413a949547705ad480772f7
Summary: Added obsmarkers to pushrebase output. This allows the client to hide commits that were rebased server-side, and check out the rebased commit.
Reviewed By: StanislavGlebik
Differential Revision: D14932842
fbshipit-source-id: f215791e86e32e6420e8bd6cd2bc25c251a7dba0