Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The code for managing cached_publishing_bookmarks_maybe_stale was already a bit
tricky, and with WarmBookmarksCache introduction it would've gotten even worse.
Let's move this logic to a separate SessionBookmarkCache struct.
Reviewed By: krallin
Differential Revision: D22816708
fbshipit-source-id: 02a7e127ebc68504b8f1a7401beb063a031bc0f4
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large
commits.
The problem with large changesets is deriving hg changesets for them. It might take
a significant amount of time, and that means that all the clients are stuck in
listkeys() or heads() calls, waiting for derivation. WarmBookmarksCache can help here by returning bookmarks
for which hg changesets were already derived.
This is the second refactoring to introduce WarmBookmarksCache.
Now let's cache not only pull default, but also publishing bookmarks. There are two reasons to do it:
1) (Less important) It simplifies the code slightly
2) (More important) Without this change 'heads()' fetches all bookmarks directly from BlobRepo thus
bypassing any caches that we might have. So in order to make WarmBookmarksCache useful we need to avoid
doing that.
Reviewed By: farnz
Differential Revision: D22816707
fbshipit-source-id: 9593426796b5263344bd29fe5a92451770dabdc6
Summary:
The overall goal of this stack is to add WarmBookmarksCache support to
repo_client to make Mononoke more resilient to lands of very large commits.
This diff just does a small refactoring that makes introducing
WarmBookmarksCache easier. In particular, later in cached_pull_default_bookmarks_maybe_stale cache I'd like to store
not only PullDefault bookmarks, but also Publishing bookmarks so that both
listkeys() and heads() methods could be served from this cache. In order to do
that we need to store not only bookmark name, but also bookmark kind (i.e. is
it Publishing or PullDefault).
To do that let's store the actual Bookmarks and hg changeset objects instead of
raw bytes.
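A minimal sketch of the idea, not the actual repo_client code: the type names `BookmarkKind`, `Bookmark`, `HgChangesetId` and `BookmarkCache`, and the `u64` changeset id, are placeholders assumed for illustration. It shows why keeping the kind next to the name lets one cache answer both a kind-filtered query and an "everything" query.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real Mononoke types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum BookmarkKind {
    PullDefault,
    Publishing,
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Bookmark {
    name: String,
    kind: BookmarkKind,
}

// A u64 is used instead of a real 20-byte hash to keep the sketch short.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct HgChangesetId(u64);

#[derive(Default)]
struct BookmarkCache {
    entries: HashMap<Bookmark, HgChangesetId>,
}

impl BookmarkCache {
    fn insert(&mut self, name: &str, kind: BookmarkKind, cs: HgChangesetId) {
        let bookmark = Bookmark { name: name.to_string(), kind };
        self.entries.insert(bookmark, cs);
    }

    // Bookmarks of a single kind, sorted by name.
    fn of_kind(&self, kind: BookmarkKind) -> Vec<(String, HgChangesetId)> {
        let mut out: Vec<_> = self
            .entries
            .iter()
            .filter(|(b, _)| b.kind == kind)
            .map(|(b, cs)| (b.name.clone(), *cs))
            .collect();
        out.sort();
        out
    }

    // Every cached changeset regardless of kind, sorted and deduplicated.
    fn all_changesets(&self) -> Vec<HgChangesetId> {
        let mut out: Vec<_> = self.entries.values().copied().collect();
        out.sort();
        out.dedup();
        out
    }
}
```

With raw bytes as keys the kind would be lost, so the second query could not be answered from the same cache.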
Reviewed By: farnz
Differential Revision: D22816710
fbshipit-source-id: 6ec3af8fe365d767689e8f6552f9af24cbcd0cb9
Summary: Same as a previous diff. Let's keep the top-level dir tidy.
Reviewed By: krallin
Differential Revision: D22691638
fbshipit-source-id: 7f9a21f307efd9bbe37f515f475409c89b99cd31
Summary:
Currently the repo lock is checked only once, at the beginning of the unbundle future. The unbundle process takes some time, and during that time the repo can be locked by someone.
We can reduce that possibility by creating an additional future that checks the repo lock in a loop, and polling both futures until one of them finishes.
Reviewed By: StanislavGlebik
Differential Revision: D22560907
fbshipit-source-id: 1cba492fa101dba988e07361e4048c6e9b778197
Summary:
This is the next step in exposing the version used to sync commits in read
queries. The previous step was to query this from DB, now let's also put it
into an enum payload. Further, I will add consumers of this in admin and
validation.
Note that ideally, `RewrittenAs` should always have a version associated with
it, but:
- right now this is not true in the DB (needs backfilling)
- even when I backfill everything, I would not like to error out just at
reading time, I would prefer the consumers to deal with the absence of a
version in rewritten commits.
Therefore, I decided to use `Option` here.
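A rough sketch of what an optional version in the payload means for consumers. Only `RewrittenAs` is taken from the text above; the enum name `CommitSyncOutcome`, the `NotSyncCandidate` variant, the `u64` changeset id and `CommitSyncConfigVersion` are assumptions made for the example.

```rust
// Hypothetical version type; the real one may look different.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CommitSyncConfigVersion(String);

// Hypothetical outcome enum with an optional version in the payload.
#[derive(Clone, Debug, PartialEq, Eq)]
enum CommitSyncOutcome {
    NotSyncCandidate,
    RewrittenAs(u64, Option<CommitSyncConfigVersion>),
}

// A consumer decides what a missing version means; reading never fails.
fn describe(outcome: &CommitSyncOutcome) -> String {
    match outcome {
        CommitSyncOutcome::NotSyncCandidate => "not a sync candidate".to_string(),
        CommitSyncOutcome::RewrittenAs(id, Some(version)) => {
            format!("rewritten as {} (version {})", id, version.0)
        }
        CommitSyncOutcome::RewrittenAs(id, None) => {
            format!("rewritten as {} (version unknown)", id)
        }
    }
}
```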
Reviewed By: StanislavGlebik
Differential Revision: D22476166
fbshipit-source-id: 5bc27bb21b7e59c604755ef35aa5d3d2c3df527e
Summary:
Separate out the `BundleReplayData` from the `BookmarkUpdateReason` enum. There's
no real need for this to be part of the reason, and removing it means we can
abstract away the remaining dependency on Mercurial changeset IDs from
the main bookmarks traits.
Reviewed By: mitrandir77, ikostia
Differential Revision: D22417659
fbshipit-source-id: c8e5af7ba57d10a90c86437b59c0d48e587e730e
Summary:
The goal is to make it easier to implement unit tests, which depend on `LiveCommitSyncConfig`. Specifically, `scs` has a piece of code, which instantiates `mononoke_api::Repo` with a test version of `CommitSyncConfig`. To migrate it to `LiveCommitSyncConfig`, I need to be able to create a test version of that. It **is** possible now, but would require me to turn a supplied instance of `CommitSyncConfig` back into `json`, which is cumbersome. Using a `dyn LiveCommitSyncConfig` there, instead of a concrete struct seems like a good idea.
Note also that we are using this technique in many places: most (all?) of our DB tables are traits, which we then implement for SQL-specific structs.
Finally, this diff does not actually migrate all of the current users of `LiveCommitSyncConfig` (the struct) to be users of `LiveCommitSyncConfig` (the trait), and instead makes them use `CfgrLiveCommitSyncConfig` (the trait impl). The idea is that we can migrate bits to use traits when needed (for example, in an upcoming `scs` diff). When not needed, it's fine to use concrete structs. Again, this is already the case in a few places: we sometimes use `SqlSyncedCommitMapping` struct directly, instead of `T: SyncedCommitMapping` or `dyn SyncedCommitMapping`.
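To illustrate the trait-plus-test-impl pattern described above, here is a small sketch. The method name `get_current_commit_sync_config`, the `i32` repo id, the shape of `CommitSyncConfig` and the `TestLiveCommitSyncConfig` struct are all assumptions, not the real API.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified config.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CommitSyncConfig {
    version_name: String,
}

// The trait that callers depend on (method name is an assumption).
trait LiveCommitSyncConfig {
    fn get_current_commit_sync_config(&self, repo_id: i32) -> Option<CommitSyncConfig>;
}

// Test implementation built directly from in-memory configs,
// with no round trip through JSON.
struct TestLiveCommitSyncConfig {
    configs: HashMap<i32, CommitSyncConfig>,
}

impl TestLiveCommitSyncConfig {
    fn new(configs: Vec<(i32, CommitSyncConfig)>) -> Self {
        Self { configs: configs.into_iter().collect() }
    }
}

impl LiveCommitSyncConfig for TestLiveCommitSyncConfig {
    fn get_current_commit_sync_config(&self, repo_id: i32) -> Option<CommitSyncConfig> {
        self.configs.get(&repo_id).cloned()
    }
}

// Code under test only sees `dyn LiveCommitSyncConfig`.
fn current_version(live: &dyn LiveCommitSyncConfig, repo_id: i32) -> Option<String> {
    live.get_current_commit_sync_config(repo_id)
        .map(|config| config.version_name)
}
```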
Reviewed By: StanislavGlebik
Differential Revision: D22383859
fbshipit-source-id: 8657fa39b11101684c1baae9f26becad6f890302
Summary:
Just knowing the number of fetched undesired files doesn't give the full
picture. E.g. fetching lots of small files is better than fetching a single
multi-GB file.
So knowing the size of the files is helpful.
Reviewed By: krallin
Differential Revision: D22408400
fbshipit-source-id: 7653c1cdceccf50aeda9ce8a4880ee5178d4b107
Summary: D22381744 updated the version of `futures` in third-party/rust to 0.3.5, but did not regenerate the autocargo-managed Cargo.toml files in the repo. Although this is a semver-compatible change (and therefore should not break anything), it means that affected projects would see changes to all of their Cargo.toml files the next time they ran `cargo autocargo`.
Reviewed By: dtolnay
Differential Revision: D22403809
fbshipit-source-id: eb1fdbaf69c99549309da0f67c9bebcb69c1131b
Summary:
Rework the bookmarks traits:
* Split out log functions into a separate `BookmarkUpdateLog` trait. The cache doesn't care about these methods.
* Simplify `list` down to a single method with appropriate filtering parameters. We want to add more filtering types, and adding more methods for each possible combination will be messier.
* The `Bookmarks` and `BookmarkUpdateLog` traits become `attributes` on `BlobRepo`, rather than being a named member.
Reorganise the bookmarks crate to separate out the bookmarks log and transactions into their own modules.
Reviewed By: krallin
Differential Revision: D22307781
fbshipit-source-id: 4fe514df8b7ef92ed3def80b21a16e196d916c64
Summary:
`bulk_add()` method was checking for conflicts correctly i.e. it wouldn't fail
if we try to insert the same mapping twice.
`bulk_add_git_mapping_in_transaction` wasn't doing this check i.e. it would
fail.
This caused us a few problems and this diff fixes them - now
`bulk_add_git_mapping_in_transaction` would do the same checks as bulk_add was
doing previously.
There is another important change in behaviour: if we try to insert two entries and one of them
has a conflict while the other doesn't, then previously we'd insert the second entry.
Now we don't insert any; arguably that's the preferred behaviour.
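The intended semantics can be sketched on an in-memory map. This is an illustration only: the real code works inside a SQL transaction, and the `HashMap<String, u64>` store, the `Conflict` type and this `bulk_add` signature are assumptions. Re-inserting an identical mapping is a no-op, and a single conflicting entry rejects the whole batch.

```rust
use std::collections::HashMap;

// Hypothetical error type naming the conflicting key.
#[derive(Debug, PartialEq, Eq)]
struct Conflict {
    git_sha1: String,
}

// Toy all-or-nothing insert of git sha1 -> changeset id mappings.
// Returns the number of newly added mappings.
fn bulk_add(
    store: &mut HashMap<String, u64>,
    entries: &[(String, u64)],
) -> Result<usize, Conflict> {
    // Validate the whole batch first; nothing is written until it passes.
    let mut pending: HashMap<&str, u64> = HashMap::new();
    for (git_sha1, bcs_id) in entries {
        let existing = store
            .get(git_sha1.as_str())
            .copied()
            .or_else(|| pending.get(git_sha1.as_str()).copied());
        match existing {
            // Same key mapped to a different value: reject everything.
            Some(value) if value != *bcs_id => {
                return Err(Conflict { git_sha1: git_sha1.clone() });
            }
            // Identical mapping already known: nothing to do.
            Some(_) => {}
            None => {
                pending.insert(git_sha1.as_str(), *bcs_id);
            }
        }
    }
    let added = pending.len();
    for (git_sha1, bcs_id) in pending {
        store.insert(git_sha1.to_string(), bcs_id);
    }
    Ok(added)
}
```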
Reviewed By: krallin
Differential Revision: D22332001
fbshipit-source-id: 86fff8c23c43eeca0fb36b01b10cdaa73b3ce4ab
Summary: Convert the bookmarks traits to use new-style `BoxFuture<'static>` and `BoxStream<'static>`. This is a step along the path to full `async`/`await`.
Reviewed By: farnz
Differential Revision: D22244489
fbshipit-source-id: b1bcb65a6d9e63bc963d9faf106db61cd507e452
Summary: This is the final diff of the stack - it starts logging pushed commits to scribe
Reviewed By: farnz
Differential Revision: D22212755
fbshipit-source-id: ec09728408468acaeb1c214d43f930faac30899b
Summary:
Failing push if we failed to log to scribe doesn't make a lot of sense. By that
time the ship has sailed - commit has already been pushed and by failing the
request we can't undo that. It will just create an annoyance for whoever is
pushing.
Instead let's log it to scuba.
Reviewed By: farnz
Differential Revision: D22256687
fbshipit-source-id: 2428bbf1db4cef6fa80777ad65184fab1804fa9c
Summary:
At the moment we can't test logging to scribe easily - we don't have a way to
mock it. Scribe are supposed to help with that.
They will let us configure all scribe logs to go to a directory on a
filesystem similar to the way we configure scuba. The Scribe itself will
be stored in CoreContext
Reviewed By: farnz
Differential Revision: D22237730
fbshipit-source-id: 144340bcfb1babc3577026191428df48e30a0bb6
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch the remaining blobstore traits to new-style futures.
This just pushes the `.compat()` out to old-style futures, but it makes the move to non-'static lifetimes easier, as all the compile errors will relate to lifetime issues.
Reviewed By: krallin
Differential Revision: D22183228
fbshipit-source-id: 3fe3977f4469626f55cbf5636d17fff905039827
Summary:
Eventually, we want everything to be `async`/`await`; as a stepping stone in that direction, switch some of the blobstore interfaces to new-style `BoxFuture` with a `'static` lifetime.
This does not enable any fixes at this point, but does mean that `.compat()` moves to the places that need old-style futures instead of new. It also means that the work needed to make the transition fully complete is changed from a full conversion to new futures, to simply changing the lifetimes involved and fixing the resulting compile failures.
Reviewed By: krallin
Differential Revision: D22164315
fbshipit-source-id: dc655c36db4711d84d42d1e81b76e5dddd16f59d
Summary:
Push supported multiple bookmarks in theory, but in practice we never used it.
Since we want to start logging pushed commits in the next diffs, we need to decide what to do with
bookmarks. At the moment we can log only a single bookmark to scribe, so
let's just allow a single bookmark push.
Reviewed By: farnz
Differential Revision: D22212674
fbshipit-source-id: 8191ee26337445ce2ef43adf1a6ded3e3832cc97
Summary:
In the next diffs it will be passed to unbundle processing so that we can use
scribe category to log pushed commits
Reviewed By: krallin
Differential Revision: D22212616
fbshipit-source-id: 17552bda11f102041a043f810125dc381e478611
Summary: That was like 50% of the point of this change, and somehow I forgot to do it.
Reviewed By: farnz
Differential Revision: D22231923
fbshipit-source-id: 4a4daaeaa844acd219680907c0b5a5fdacdf535c
Summary:
This fn is not used anywhere except tests, and its only difference from
`backsync_all_latest` is in the fact that it accepts a limit. So let's rename
`backsync_all_latest` into `backsync_latest` and make it accept a limit arg.
I decided to use a custom enum instead of `Option` so that people don't have to
open fn definition to understand what `BacksyncLimit::Limit(2)` or
`BacksyncLimit::NoLimit` mean.
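A short sketch of why the enum reads better at call sites than `Option`. The helper `entries_to_backsync` is hypothetical and only selects items from a slice; the real `backsync_latest` does considerably more.

```rust
// The two variants named above; the u64 payload type is an assumption.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BacksyncLimit {
    NoLimit,
    Limit(u64),
}

// Hypothetical helper: pick which log entries a backsync run would process.
fn entries_to_backsync<T: Clone>(entries: &[T], limit: BacksyncLimit) -> Vec<T> {
    match limit {
        BacksyncLimit::NoLimit => entries.to_vec(),
        BacksyncLimit::Limit(n) => entries.iter().take(n as usize).cloned().collect(),
    }
}
```

A call such as `entries_to_backsync(&log, BacksyncLimit::Limit(2))` is self-explanatory, whereas `Some(2)` or `None` would force the reader to look up the parameter.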
Reviewed By: StanislavGlebik
Differential Revision: D22187118
fbshipit-source-id: 6bd97bd6e6f3776e46c6031f775739ca6788ec8c
Summary:
This diff enables `unbundle` flow to start creating `push_redirector` structs from hot-reloaded `CommitSyncConfig` (by using the `LiveCommitSyncConfig` struct).
Using `LiveCommitSyncConfig` unfortunately means that we need to make sure those tests, which don't use standard fixtures, need to have both the `.toml` and the `.json` commit sync configs present, which is a little verbose. But it's not too horrible.
Reviewed By: StanislavGlebik
Differential Revision: D21962960
fbshipit-source-id: d355210b5dac50d1b3ad277f99af5bab56c9b62e
Summary:
Due to the Thrift design of "include" statements in fbcode, the thrift structures have to be contained in folders that are identical to the folder layout inside fbcode.
This diff changes the folder layout in Cargo.toml files and in fbcode_builder; there will be a next diff that changes this for ShipIt as well.
Reviewed By: ikostia
Differential Revision: D22208707
fbshipit-source-id: 65f9cafed2f0fcc8398a3887dfa622de9e139f68
Summary:
If a commit changes modes (i.e. executable, symlink or regular) of a lot of files but
doesn't change their content then we don't need to put these filenodes to the
generated bundle. Mercurial stores mode in manifest, so changing the mode
doesn't change the filenode.
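As a simplified illustration, assuming a manifest is a flat list of entries and a filenode is a plain `u64` (both are assumptions, as are the names `FileMode`, `ManifestEntry` and `filenodes_to_bundle`), the filtering could look roughly like this:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FileMode {
    Regular,
    Executable,
    Symlink,
}

// Hypothetical manifest entry: the mode lives here, not in the filenode.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ManifestEntry {
    path: String,
    filenode: u64,
    mode: FileMode,
}

// Return only the filenodes that are new compared to the parent manifest.
// An entry whose mode changed but whose filenode did not is skipped.
fn filenodes_to_bundle(old: &[ManifestEntry], new: &[ManifestEntry]) -> Vec<u64> {
    let old_by_path: HashMap<&str, u64> = old
        .iter()
        .map(|entry| (entry.path.as_str(), entry.filenode))
        .collect();
    new.iter()
        .filter(|entry| old_by_path.get(entry.path.as_str()) != Some(&entry.filenode))
        .map(|entry| entry.filenode)
        .collect()
}
```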
Reviewed By: ikostia
Differential Revision: D22206736
fbshipit-source-id: f64ee8a34281cd207c92653b927bf9109ccbe1b4
Summary: `HgPhase` type is redundant and was adding dependency on mercurial in phases crate.
Reviewed By: farnz
Differential Revision: D22162716
fbshipit-source-id: 1c21841d34897d0072ff6fe5e4ac89adddeb3c68
Summary: DangerousOverride is moved into a separate crate. Not only is it usually not needed, but it was also introducing dependencies on the mercurial crate.
Reviewed By: StanislavGlebik
Differential Revision: D22115015
fbshipit-source-id: c9646896f906ea54d11aa83a8fbd8490a5b115ea
Summary: Move all mercurial changeset generation logic to `blobrepo_hg`. This preliminary step is required to decouple BlobRepo from mercurial, and in later stages it will be moved to derived data infra once blobrepo is free of mercurial.
Reviewed By: StanislavGlebik
Differential Revision: D22089677
fbshipit-source-id: bca28dedda499f80899e729e4142e373d8bec0b8
Summary: move HgMutationStore to attributes, and all related methods to BlobRepoHg
Reviewed By: StanislavGlebik
Differential Revision: D22089657
fbshipit-source-id: 8fe87418ccb8a7ad43828758844bdbd73dc0573d
Summary: This diff introduces the `BlobRepoHg` extension trait for the `BlobRepo` object, which contains mercurial-specific methods that were previously part of `BlobRepo`. This diff also starts moving some of the methods from BlobRepo to BlobRepoHg.
Reviewed By: ikostia
Differential Revision: D21659867
fbshipit-source-id: 1af992915a776f6f6e49b03e4156151741b2fca2
Summary: Megarepo is simplified if we can avoid copying hooks everywhere - run megarepo hooks as well as small repo hooks during pushredirection.
Reviewed By: StanislavGlebik
Differential Revision: D20652331
fbshipit-source-id: f42216797b9061db10b50c1440253de1f56d6b85
Summary:
Remove unused dependencies for Rust targets.
This failed to remove the dependencies in eden/scm/edenscmnative/bindings
because of the extra macro layer.
Manual edits (named_deps) and misc output in P133451794
Reviewed By: dtolnay
Differential Revision: D22083498
fbshipit-source-id: 170bbaf3c6d767e52e86152d0f34bf6daa198283
Summary:
The goal of the stack is to support hot reloading of `CommitSyncConfig`s everywhere: in `push_redirector`, `backsyncer`, `x-repo sync job` and so forth.
This diff in particular is a refactoring of how we instantiate the `PushRedirector` struct for the `unbundle` flow. Previously the struct would be instantiated when `RepoHandler` struct was built and would later be reused by `RepoClient`. Now we want to instantiate `PushRedirector` before we start processing the `unbundle` request, so that we can request the newest `CommitSyncConfig`. Note that this diff does not introduce the hot reload itself, it just lays the groundwork: instantiation of `PushRedirector` at request start.
To achieve this goal, `RepoClient` now contains a somewhat modified `PushRedirectorArgs` struct, whose goal is to own the unchanging stuff, needed to create a full `PushRedirector`.
Here are a few explicit non-goals for this hot reloading:
- the overall decision whether the repo is part of any `CommitSyncConfig` or not is still made at `RepoHandler` creation time. What this means is that if `CommitSyncConfig` is changed to have another small repo and Mononoke servers happen to know about that repo, it would not automatically pick up the fact that the repo should be a part of `CommitSyncConfig`
- same for removal (stopping push redirector is already possible via a different hot-reloaded config)
- changing anything about a large/small relationship is likely to be very complicated under the best circumstances of everything being down, let alone during a hot reload. This means that target repo cannot be changed via this mechanism.
Essentially, the goal is only to be able to do a live change of how paths change between repos.
Reviewed By: StanislavGlebik
Differential Revision: D21904799
fbshipit-source-id: e40e6a9c39f4f03a436bd974f3cba26c690c5f27
Summary:
Add logging of infinitepush (draft) commits to a separate scribe category.
The logging will also include the username and hostname of the pusher. Since
this code is shared with the public commits scribe logging, that logging will
also gain this information.
Reviewed By: farnz
Differential Revision: D21742656
fbshipit-source-id: bdbfd14db9e8aae190c634ac4bfff35b3f62bbe4
Summary:
Add the `move_bookmark` method to `mononoke_api`.
This attempts to re-use code from `repo_client`. This code isn't really designed to be called from this API, so the fit is very poor. In a future diff we will fix this up.
The `repo_client` code for force-moving a bookmark does not support running hooks for that bookmark. For now we will prevent API users from moving bookmarks that have hooks configured. This will also be addressed in a future diff.
Reviewed By: krallin
Differential Revision: D21904979
fbshipit-source-id: 42bf840489e5b04f463c69c752bcaa5174630c21
Summary:
Some of the hg changeset structures use `comments` as the name for commit
message. But this is confusing - let's rename
Reviewed By: HarveyHunt
Differential Revision: D21974571
fbshipit-source-id: e7c5c5ad8db9b2f1343abe9101fc56a6d4287548
Summary: Let's log the name as well - it will help with investigation.
Reviewed By: farnz
Differential Revision: D21906595
fbshipit-source-id: 51eb49354017c17ba3304f0a66c95dfc3c695e6a
Summary:
Let's return FilenodeResult from get_all_filenodes_maybe_stale and change
callers to deal with that.
The change is straightforward with the exception of `file_history.rs`.
get_all_filenodes_maybe_stale() is used here to prefetch a lot of filenodes in one
go. This diff changes it to return an empty vec in case filenodes are disabled.
Unfortunately this is not a great solution - since prefetched files are empty
get_file_history_using_prefetched() falls back to fetching filenodes
sequentially from the blobstore. That might be too slow, and the next diffs in
the stack will address this problem.
Reviewed By: krallin
Differential Revision: D21881082
fbshipit-source-id: a86dfd48a92182381ab56994f6b0f4b14651ea14
Summary:
I observed that for whatever reason our setting of `use_try_shorthand = true` in rustfmt.toml was causing entire functions to not get processed by rustfmt. Even files that contain neither `try` nor `?`. Remove it and reformat fbsource.
Documentation of that config:
- https://github.com/rust-lang/rustfmt/blob/master/Configurations.md#use_try_shorthand
We don't particularly care about the value anymore because nobody writes `r#try!(...)` in 2018 edition code.
Minimized:
```
fn f() {
g(
)
// ...
.h
}
```
This function gets formatted only if use_try_shorthand is not set.
The bug is fixed in the rustfmt 2.0 release candidate.
Reviewed By: jsgf
Differential Revision: D21878162
fbshipit-source-id: b028673c0eb703984d24bf0d2983453fc2a8c212
Summary:
See D21765065 for more context. TL;DR is that we want to control
lfs rollout from client side to make sure we don't put lfs pointers in the
shared memcache
Reviewed By: xavierd
Differential Revision: D21822159
fbshipit-source-id: daea6078d95eb4e9c040d353a20bcdf1b6ae07b1
Summary:
The motivation for the whole stack:
At the moment if mysql is down then Mononoke is down as well, both for writes
and for reads. However we can relatively easily improve the situation.
During hg update client sends getpack() requests to fetch files, and currently
for each file fetch we also fetch file's linknode. However hg client knows how
to deal with null linknodes [1], so when mysql is unavailable we can disable
filenode fetching completely and just return null linknodes. So the goal of this stack is to
add a knob (i.e. a tunable) that can turn filenode fetches on and off, and make
sure the rest of the code deals nicely with this situation.
Now, about this diff. In order to force callers to deal with the fact that
filenodes might be unavailable I suggest adding a special type of result, which (in
later diffs) will be returned by every filenodes method.
This diff just introduces the FilenodeResult and converts BlobRepo filenode
methods to return it. The reason why I converted BlobRepo methods first
is to limit the scope of changes but at the same time show what the callers' code will look
like after FilenodeResult is introduced, and get people's thoughts on whether
it's reasonable or not.
Another important change I'd like to introduce in the next diffs is modifying FilenodesOnlyPublic
derived data to return success if filenodes knob is off. If we don't do that
then any attempt to derive filenodes might fail which in turn would lead to the
same problem we have right now - people won't be able to do hg update/hg
pull/etc if mysql is down.
[1] null linknodes might make some client side operations slower (e.g. hg rebase/log/blame),
so we should use it only in sev-like situations
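To give a feel for what callers might look like, here is a small sketch. `FilenodeResult` is the name from the summary, but its variants (`Present`, `Disabled`), the 20-byte linknode type and both helper functions are assumptions for illustration only.

```rust
// 20 zero bytes stand in for the null linknode.
const NULL_LINKNODE: [u8; 20] = [0u8; 20];

// Assumed shape of the result type: either the data, or "filenodes are off".
#[derive(Debug, PartialEq, Eq)]
enum FilenodeResult<T> {
    Present(T),
    Disabled,
}

// Hypothetical fetch that consults the knob before touching storage.
fn fetch_linknode(filenodes_enabled: bool, stored: [u8; 20]) -> FilenodeResult<[u8; 20]> {
    if filenodes_enabled {
        FilenodeResult::Present(stored)
    } else {
        FilenodeResult::Disabled
    }
}

// A getpack-style caller must handle both cases explicitly;
// when filenodes are disabled it falls back to the null linknode.
fn linknode_or_null(result: FilenodeResult<[u8; 20]>) -> [u8; 20] {
    match result {
        FilenodeResult::Present(linknode) => linknode,
        FilenodeResult::Disabled => NULL_LINKNODE,
    }
}
```

Because the disabled case is a variant rather than an error, the compiler forces every caller to decide what to do with it.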
Reviewed By: krallin
Differential Revision: D21787848
fbshipit-source-id: ad48d5556e995af09295fa43118ff8e3c2b0e53e
Summary:
Follow up from D21596758 - current logic of pushrebasing a merge is very
complicated. To prevent repo corruptions let's do a very simple validation -
generate hg changeset from a rebased bonsai changeset.
Given that generating hg changeset is an expensive operation let's do it only
after the first rebase attempt - otherwise we might risk constantly losing the
pushrebase race.
Differential Revision: D21659187
fbshipit-source-id: f43a855cf0fbdbd11a40d3ec38283af344cde5e6
Summary: This is to bring it into sync with the `forwardfillerqueue` types.
Reviewed By: markbt
Differential Revision: D21660012
fbshipit-source-id: 5148023478c175cd49707d88251701a08fcbe0ce
Summary:
There are a few paths through `resolve` that return early with a failure, and thus never record what happened.
Make a record the moment we enter `resolve` - then, we can use `count` type ODS charts to determine the failure rate deterministically, and alarm if the failure rate is too high
Reviewed By: ahornby
Differential Revision: D21647575
fbshipit-source-id: 667787ec000a8cd8e715563df10dbb84832fefa1
Summary: First diff in the stack that removes getfiles since it's no longer needed.
Reviewed By: farnz
Differential Revision: D21623156
fbshipit-source-id: 44f310ec4e4f34845cc5bf1738f1a8ece14e6694
Summary:
Currently we record them only during pushrebase. Let's record during push as
well.
To simplify things a little bit let's allow only a very simple push case:
1) Single bookmark.
2) All pushed commits should be reachable by this bookmark.
Reviewed By: krallin
Differential Revision: D21451337
fbshipit-source-id: bf2f1e6025ac116fb8096824b7c4c6440d073874
Summary:
Let's add an option to log how many files and trees were fetched in a
particular repo that start with a prefix.
Reviewed By: farnz
Differential Revision: D21617347
fbshipit-source-id: a57f74eadf32781e6c024e18da252c98af21996d
Summary:
This adds support for periodically logging that a command is in progress in
Mononoke. The underlying motivation is to make sure that if something is taking
a long time, we can still show some feedback to the user (and log to Scuba). We
might want to log this every 30 seconds.
That said, this is more of an RFC at this stage. I'm thinking it might make
sense to log to Scuba more often and to users less often. It might make sense
to also restrict this to specific commands, such as unbundle:
https://fburl.com/scuba/mononoke_test_perf/atik5959
Reviewed By: StanislavGlebik
Differential Revision: D21549862
fbshipit-source-id: 1d02c5c926abc7e491ac5b8ae0244b5f4620c93e
Summary: Same as the previous diff, but for commands that return a stream.
Reviewed By: StanislavGlebik
Differential Revision: D21549864
fbshipit-source-id: ba8c14db34a651cd4ddbc1c8b9ad382c08cc775d
Summary:
This doesn't do anything on its own, but it's refactoring I need for later in
this stack. It wraps all our commands in a command_future call that gives us an
opportunity to wrap the future being returned. We used to use `start_command`
to get the context, so this just replaces that.
Reviewed By: StanislavGlebik
Differential Revision: D21549863
fbshipit-source-id: 0e613bb1db876d27d662fd6c993d7b7d954b5f2b
Summary:
The `support_bundle2_listkeys` flag controls at runtime whether we support
`listkeys` in bundle2. Since this was added before tunables were available,
it uses a value in the mutable counters SQL store.
We could migrate this to tunables, but in practice we have never disabled it,
so let's just make it the default.
Reviewed By: krallin
Differential Revision: D21546246
fbshipit-source-id: 066a375693757ea841ecf0fddb0cc91dc144fd6f
Summary:
When the client pulls draft commits, include mutation information in the bundle
response.
Reviewed By: farnz
Differential Revision: D20871339
fbshipit-source-id: a89a50426fbd8f9ec08bbe43f16fd0e4e3424e0b
Summary:
Advertise support for `b2x:infinitepushmutation`. When the client sends us
mutation information, store it in the mutation store.
Reviewed By: mitrandir77
Differential Revision: D20871340
fbshipit-source-id: ab0b3a20f43a7d97b3c51dcc10035bf7115579af
Summary: This also unblocks the MacOS Mononoke builds, so enabling them back
Reviewed By: farnz
Differential Revision: D21455422
fbshipit-source-id: 4eae10785db5b93b1167f580a1c887ee4c8a96a2
Summary:
If a bundle comes from the commit cloud forward filler, we need to ignore
and not record it.
To do so, we need to start paying attention to stream-level params for the
first time.
Reviewed By: krallin
Differential Revision: D21427620
fbshipit-source-id: 9ee417cd2a5f2f5bb6ec342cd63071c6ca822475
Summary:
We want to be able to record all the bundles Mononoke processes to be later
replayed by Mercurial.
Reviewed By: krallin
Differential Revision: D21427622
fbshipit-source-id: b88e10e03d07dae35369286fe31022f36a1ee5cf
Summary: To make it easier to navigate the codebase the oss-only code will be from now on stored in a separate module, similarly to how the fbcode-only code is stored.
Reviewed By: markbt
Differential Revision: D21429060
fbshipit-source-id: aa7e80961de2897dae31bd0ec83488c683633b7a
Summary: This is an sqlite equivalent of what exists in xdb now.
Reviewed By: krallin
Differential Revision: D21427621
fbshipit-source-id: 7024fbf7a8773c4465d2e6ee327aadeaf87cb213
Summary:
There are a few related changes included in this diff:
- backsyncer is made public
- stubs for SessionContext::is_quicksand and scuba_ext::ScribeClientImplementation
- mononoke/hgproto is made buildable
Reviewed By: krallin
Differential Revision: D21330608
fbshipit-source-id: bf0a3c6f930cbbab28508e680a8ed7a0f10031e5
Summary:
- Change get return value for `Blobstore` from `BlobstoreBytes` to `BlobstoreGetData` which include `ctime` metadata
- Update the call sites and tests broken due to this change
- Change `ScrubHandler::on_repair` to accept metadata and log ctime
- `Fileblob` and `Manifoldblob` attach the ctime metadata
- Tests for fileblob in `mononoke:blobstore-test` and integration test `test-walker-scrub-blobstore.t`
- Make cachelib based caching use `BlobstoreGetData`
Reviewed By: ahornby
Differential Revision: D21094023
fbshipit-source-id: dc597e888eac2098c0e50d06e80ee180b4f3e069
Summary: Making a trait out of TimeWindowCounter will help with providing different implementations of load limiting for OSS and FB.
Reviewed By: krallin
Differential Revision: D21329265
fbshipit-source-id: 7f317f8e9118493f3dcbadb0519eaff565cbd882
Summary:
This updates repo_client to log when hooks finished, and how many were rejected,
if any. This required a bit of refactoring to avoid iterating twice over
whether hooks are rejected or not (and instead just filter-maps outcomes to a
rejection), but it's probably for the better since it removes a bit of
un-necessary cloning (notably of the hook name).
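The single-pass shape can be sketched as follows. The `HookOutcome` and `Rejection` types here are simplified placeholders, not the real hook types; the point is that `filter_map` over owned outcomes moves the hook name into the rejection instead of cloning it.

```rust
// Hypothetical, simplified hook outcome.
#[derive(Clone, Debug, PartialEq, Eq)]
enum HookOutcome {
    Accepted { hook_name: String },
    Rejected { hook_name: String, reason: String },
}

#[derive(Debug, PartialEq, Eq)]
struct Rejection {
    hook_name: String,
    reason: String,
}

// One pass over the outcomes: accepted ones are dropped, rejected ones
// are converted, and the strings are moved rather than cloned.
fn rejections(outcomes: Vec<HookOutcome>) -> Vec<Rejection> {
    outcomes
        .into_iter()
        .filter_map(|outcome| match outcome {
            HookOutcome::Rejected { hook_name, reason } => {
                Some(Rejection { hook_name, reason })
            }
            HookOutcome::Accepted { .. } => None,
        })
        .collect()
}
```

The number of rejected hooks to log is then just the length of the returned vector.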
Reviewed By: farnz
Differential Revision: D21379690
fbshipit-source-id: 53c8368d3871620ec61db76dc35b47dd17276ac4
Summary:
- Use the same case consistently
- Log even when pushrebase fails
Reviewed By: farnz
Differential Revision: D21378033
fbshipit-source-id: 062e986151086476db9100e3d9c71aa702661032
Summary: Making a trait out of LoadLimiter will help with providing different implementations of load limiting for OSS and FB.
Reviewed By: farnz
Differential Revision: D21302819
fbshipit-source-id: 1b982a367aa7126ca5d7772e4a2406dabbe9e13b
Summary: I'm about to add a new parameter to `scribe_commit_queue`; first asyncify it to modernise
Reviewed By: krallin
Differential Revision: D21288044
fbshipit-source-id: d1a4bb052b3c055383dd9d9df5fe36d61b14bdfe
Summary:
Correctly identify infinitepush without bookmarks as infinitepush instead of plain push.
Current behavior would sometimes pass `infinitepush` bundles through the `push` pipeline. Interestingly, this does not result in any user-visible effects at the moment. However, in the future we may want to diverge these pipelines:
- maybe we want to disable `push`, but enable `infinitepush`
- maybe there will be performance optimizations, applicable only to infinitepush
In any case, the fact that things worked so far is a consequence of a historical accident, and we may not want to keep it this way. Let's have correct identification.
Reviewed By: StanislavGlebik
Differential Revision: D18934696
fbshipit-source-id: 69650ca2a83a83e2e491f60398a4e03fe8d6b5fe
Summary: This is useful during investigations. ODS only works for stats.
Reviewed By: krallin
Differential Revision: D21144414
fbshipit-source-id: 0fbb95a79c324d270c8d6dc4770d7729c7b23694
Summary:
In getbundle, we compute the set of new draft commit ids. This is used to
include tree and file data in the bundle when draft commits are fully hydrated,
and will also be used to compute the set of mutation information we will
return.
Currently this calculation only computes the non-common draft heads. It
excludes all of the ancestors, which should be included. This is because it
re-uses the prepare_phases code, which doesn't quite do what we want.
Instead, separate out these calculations into two functions:
* `find_new_draft_commits_and_public_roots` finds the draft heads
and their ancestors that are not in the common set, as well as the
public roots the draft commits are based on.
* `find_phase_heads` finds and generates phase head information for
the public heads, draft heads, and the nearest public ancestors of the
draft heads.
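A toy version of the first calculation, over an in-memory commit graph. The function name is borrowed from the list above, but the signature, the `u32` commit ids and the traversal details are assumptions; the real implementation works against phases and changeset storage.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

// Walk back from the draft heads. Public commits are recorded as roots and
// not traversed further; commits in the common set are skipped; everything
// else is a new draft commit. Returns (new draft commits, public roots).
fn find_new_draft_commits_and_public_roots(
    parents: &HashMap<u32, Vec<u32>>,
    public: &HashSet<u32>,
    common: &HashSet<u32>,
    draft_heads: &[u32],
) -> (BTreeSet<u32>, BTreeSet<u32>) {
    let mut drafts = BTreeSet::new();
    let mut public_roots = BTreeSet::new();
    let mut queue: VecDeque<u32> = draft_heads.iter().copied().collect();
    while let Some(cs) = queue.pop_front() {
        if public.contains(&cs) {
            public_roots.insert(cs);
            continue;
        }
        // Skip commits the client already has, and ones already visited.
        if common.contains(&cs) || !drafts.insert(cs) {
            continue;
        }
        if let Some(ps) = parents.get(&cs) {
            queue.extend(ps.iter().copied());
        }
    }
    (drafts, public_roots)
}
```

Unlike a heads-only computation, the returned draft set includes the ancestors of the heads, which is what the tree, file and mutation data need.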
Reviewed By: StanislavGlebik
Differential Revision: D20871337
fbshipit-source-id: 2f5804253b8b4f16b649d737f158fce2a5102002
Summary:
We should use the HgsqlName to check the repo lock, because that's the one
Mercurial uses in the repo lock there.
Reviewed By: farnz
Differential Revision: D20943177
fbshipit-source-id: 047be6cb31da3ee006c9bedc3de21d655a4c2677
Summary: Not in use any more - all hooks are now Bonsai form - so remove it.
Reviewed By: krallin
Differential Revision: D20891164
fbshipit-source-id: b92f169a0ec3a4832f8e9ec8dc9696ce81f7edb3
Summary: This allows the client to do proper feature detection.
Reviewed By: krallin
Differential Revision: D20910379
fbshipit-source-id: c7b9d4073e94518835b39809caf8b068f70cbc2f
Summary:
The Mercurial SHA1 is defined as:
sorted([p1, p2]) + content
The client wants to be able to verify the commit hashes returned by
getcommitdata. Therefore, also write the sorted parents so the client can
calculate the SHA1 easily without fetching SHA1s of parents. This is
useful because we also want to make commit SHA1s lazy on client-side.
I also changed the NULL behavior so the server does not return
content for the NULL commit, as it will fail the SHA1 check.
The server expects the client to already know how to handle
the NULL special case.
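A minimal sketch of the hash input described above, assuming 20-byte node ids and a missing parent represented by the all-zero id. The names `Node`, `NULL_ID` and `sha1_preimage` are hypothetical; the SHA1 step itself is left out to keep the sketch free of external crates:

```rust
/// A 20-byte Mercurial node id.
type Node = [u8; 20];

/// The null id: twenty zero bytes, used when a parent is absent.
const NULL_ID: Node = [0u8; 20];

/// Builds the bytes that would be fed to SHA1: sorted([p1, p2]) + content.
fn sha1_preimage(p1: &Node, p2: &Node, content: &[u8]) -> Vec<u8> {
    // Arrays compare lexicographically, which matches byte-wise sorting.
    let (lo, hi) = if p1 <= p2 { (p1, p2) } else { (p2, p1) };
    let mut buf = Vec::with_capacity(40 + content.len());
    buf.extend_from_slice(lo);
    buf.extend_from_slice(hi);
    buf.extend_from_slice(content);
    buf
}
```

Because the parents are sorted before hashing, a client that receives them already sorted can verify the hash without knowing which one is p1.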
Reviewed By: krallin
Differential Revision: D20910380
fbshipit-source-id: 4a9fb8ef705e93c759443b915dfa67d03edaf047
Summary:
If a blob is redacted, we shouldn't crash in batch. Instead, we should return
that the blob exists, and let the download path return to the client the
information that the blob is redacted. This diff does that.
Reviewed By: HarveyHunt
Differential Revision: D20897247
fbshipit-source-id: 3f305dfd9de4ac6a749a9eaedce101f594284d16
Summary: Running on Mercurial hooks isn't scalable long term - move the consumers of hooks to run on both forms for a transition period
Reviewed By: krallin
Differential Revision: D20879136
fbshipit-source-id: 4630cafaebbf6a26aa6ba92bd8d53794a1d1c058
Summary: We want all hooks to run against the Bonsai form, not a Mercurial form. Create a second form of hooks (currently not used) which acts on Bonsai hooks. Later diffs in the stack will move us over to Bonsai only, and remove support for Mercurial changeset derived hooks
Reviewed By: krallin
Differential Revision: D20604846
fbshipit-source-id: 61eece8bc4ec5dcc262059c19a434d5966a8d550
Summary: As it says in the title!
Reviewed By: HarveyHunt
Differential Revision: D20869828
fbshipit-source-id: df7728ce548739ef2dadad1629817fb56c166b66
Summary:
We use the logged arguments directly for wireproto replay, and then we replay
this directly in traffic replay, but just joining a list with `,` doesn't
actually work for directories:
- We need trailing commas
- We need wireproto encoding
This does that. It also clarifies that this encoding is for debug purposes by
updating function names, and relaxes a bunch of types (since hgproto uses
bytes_old).
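A rough sketch of the two requirements, not the actual hgproto code. The escaping table is an assumption modelled on Mercurial's batch-argument escaping (`:` to `:c`, `,` to `:o`, `;` to `:s`, `=` to `:e`), and both function names are hypothetical:

```rust
/// Escapes one argument. Assumed to mirror Mercurial's batch-argument escaping.
fn escape(arg: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(arg.len());
    for &b in arg {
        match b {
            b':' => out.extend_from_slice(b":c"),
            b',' => out.extend_from_slice(b":o"),
            b';' => out.extend_from_slice(b":s"),
            b'=' => out.extend_from_slice(b":e"),
            _ => out.push(b),
        }
    }
    out
}

/// Encodes directories for debug logging: each entry is escaped and
/// followed by a comma, instead of joining the raw entries with `,`.
fn debug_format_directories(dirs: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for dir in dirs {
        out.extend(escape(dir));
        out.push(b',');
    }
    out
}
```

With a trailing comma, a single empty directory encodes as `,` rather than as an empty string, so it stays distinguishable from an empty list.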
Reviewed By: StanislavGlebik
Differential Revision: D20868630
fbshipit-source-id: 3b805c83505aefecd639d4d2375e0aa9e3c73ab9
Summary:
Combined with the unbundle resolver stats, we will be able to say which
percentage of pushrebases fails, for example.
Reviewed By: StanislavGlebik
Differential Revision: D20818224
fbshipit-source-id: 70888b1cb90ffae8b11984bb024ec1db0e0542f7
Summary:
We need this to be able to monitor how frequently we get pushes vs
infinitepushes, etc. A further diff will add a similar reporting to
`processing.rs`, so that we can compute a percentage of successful pushes to
all pushes, for example.
Reviewed By: StanislavGlebik
Differential Revision: D20818225
fbshipit-source-id: 7945dc285560d1357bdc6aef8e5fe50b61622254
Summary:
Migrate the configuration of sql data managers from the old configuration using `sql_ext::SqlConstructors` to the new configuration using `sql_construct::SqlConstruct`.
In the old configuration, sharded filenodes were included in the configuration of remote databases, even when that made no sense:
```
[storage.db.remote]
db_address = "main_database"
sharded_filenodes = { shard_map = "sharded_database", shard_num = 100 }
[storage.blobstore.multiplexed]
queue_db = { remote = {
db_address = "queue_database",
sharded_filenodes = { shard_map = "valid_config_but_meaningless", shard_num = 100 }
} }
```
This change separates out:
* **DatabaseConfig**, which describes a single local or remote connection to a database, used in configuration like the queue database.
* **MetadataDatabaseConfig**, which describes the multiple databases used for repo metadata.
**MetadataDatabaseConfig** is either:
* **Local**, which is a local sqlite database, the same as for **DatabaseConfig**; or
* **Remote**, which contains:
* `primary`, the database used for main metadata.
* `filenodes`, the database used for filenodes, which may be sharded or unsharded.
More fields can be added to **RemoteMetadataDatabaseConfig** when we want to add new databases.
New configuration looks like:
```
[storage.metadata.remote]
primary = { db_address = "main_database" }
filenodes = { sharded = { shard_map = "sharded_database", shard_num = 100 } }
[storage.blobstore.multiplexed]
queue_db = { remote = { db_address = "queue_database" } }
```
The `sql_construct` crate facilitates this by providing the following traits:
* **SqlConstruct** defines the basic rules for construction, and allows construction based on a local sqlite database.
* **SqlShardedConstruct** defines the basic rules for construction based on sharded databases.
* **FbSqlConstruct** and **FbShardedSqlConstruct** allow construction based on unsharded and sharded remote databases on Facebook infra.
* **SqlConstructFromDatabaseConfig** allows construction based on the database defined in **DatabaseConfig**.
* **SqlConstructFromMetadataDatabaseConfig** allows construction based on the appropriate database defined in **MetadataDatabaseConfig**.
* **SqlShardableConstructFromMetadataDatabaseConfig** allows construction based on the appropriate shardable databases defined in **MetadataDatabaseConfig**.
Sql database managers should implement:
* **SqlConstruct** in order to define how to construct an unsharded instance from a single set of `SqlConnections`.
* **SqlShardedConstruct**, if they are shardable, in order to define how to construct a sharded instance.
* If the database is part of the repository metadata database config, either of:
* **SqlConstructFromMetadataDatabaseConfig** if they are not shardable. By default they will use the primary metadata database, but this can be overridden by implementing `remote_database_config`.
* **SqlShardableConstructFromMetadataDatabaseConfig** if they are shardable. They must implement `remote_database_config` to specify where to get the sharded or unsharded configuration from.
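The shape of the new configuration can be sketched as plain Rust types. This is an illustration only: the variant and field layouts, the `RemoteDatabaseConfig` and `ShardableRemoteDatabaseConfig` names, and the `filenodes_shard_count` helper are assumptions derived from the description and the TOML example above, not the real metaconfig definitions:

```rust
/// Address of a single remote database (hypothetical helper type).
#[derive(Debug, Clone, PartialEq)]
struct RemoteDatabaseConfig {
    db_address: String,
}

/// A single local or remote connection, e.g. for the queue database.
#[derive(Debug, Clone, PartialEq)]
enum DatabaseConfig {
    Local { path: String },
    Remote(RemoteDatabaseConfig),
}

/// A remote database that may be sharded or unsharded (hypothetical name).
#[derive(Debug, Clone, PartialEq)]
enum ShardableRemoteDatabaseConfig {
    Unsharded(RemoteDatabaseConfig),
    Sharded { shard_map: String, shard_num: usize },
}

/// The remote databases used for repo metadata.
#[derive(Debug, Clone, PartialEq)]
struct RemoteMetadataDatabaseConfig {
    primary: RemoteDatabaseConfig,
    filenodes: ShardableRemoteDatabaseConfig,
}

/// Either one local sqlite database or a set of remote databases.
#[derive(Debug, Clone, PartialEq)]
enum MetadataDatabaseConfig {
    Local { path: String },
    Remote(RemoteMetadataDatabaseConfig),
}

/// Returns the number of filenodes shards, if filenodes are sharded.
fn filenodes_shard_count(config: &MetadataDatabaseConfig) -> Option<usize> {
    match config {
        MetadataDatabaseConfig::Local { .. } => None,
        MetadataDatabaseConfig::Remote(remote) => match &remote.filenodes {
            ShardableRemoteDatabaseConfig::Unsharded(_) => None,
            ShardableRemoteDatabaseConfig::Sharded { shard_num, .. } => Some(*shard_num),
        },
    }
}
```

With this split, sharding can only be expressed where it is meaningful: a plain `DatabaseConfig` such as the queue database has no place to put a `sharded_filenodes` entry.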
Reviewed By: StanislavGlebik
Differential Revision: D20734883
fbshipit-source-id: bb2f4cb3806edad2bbd54a47558a164e3190c5d1
Summary:
For the initial rollout of lfs on fbsource we want to roll out just for our
team using the rollout_smc_tier option. This diff adds support for that in
Mononoke.
It spawns a future that periodically updates the list of enabled hosts in the smc tier.
I had a slight concern about listing all the available services and storing
them in memory - what if the smc tier has too many services? I decided to go ahead
with that because
1) [Smc antipatterns](https://fburl.com/wiki/ox43ni3a) wiki page doesn't seem
to list it as a concern.
2) We are unlikely to use it for a large tier - most likely we'll use it just for
hg-dev, which contains < 100 hosts.
Reviewed By: krallin
Differential Revision: D20789751
fbshipit-source-id: d35323e49530df6983e159e2ed5bce205cc5666d
Summary:
We are currently having problems with streaming clone:
```
$ hg --config 'extensions.fsmonitor=!' clone --shallow -U --config 'ui.ssh=ssh -oControlMaster=no' --configfile /etc/mercurial/repo-specific/fbsource.rc 'ssh://hg.vip.facebook.com//data/scm/fbsource?force_mononoke' "$(pwd)/fbsource-clone-test"
remote: server: https://fburl.com/mononoke
remote: session: vJ3qkiQIm9FT7mCp
connected to twshared11499.02.cln2.facebook.com
streaming all changes
2 files to transfer, 5.42 GB of data
abort: unexpected response from remote server:
'\x00\x01B?AB\x00\x00\x00\x00\x02U\x00\x00\x02\xc7\x00b\xf0\xd5\x00b\xf0\xd5\x00b\xf0\xd4\xff\xff\xff\xff\xa8z\xc7W\xd0&\xab\xb2\xf1{\xbfq\xac<\xaf6W\x06q\x81\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?C\x97\x00\x00\x00\x00\x053\x00\x00\x06\xce\x00b\xf0\xd6\x00b\xf0\xd6\x00b\xf0\xd5\xff\xff\xff\xff\xa3I\x19+\xe2\x0f\xae\xd2\x95\x14\x8a\xde\x19\x18\xf0\x8cUQu\xf1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?H\xca\x00\x00\x00\x00\x02\xe4\x00\x00\x03\x9e\x00b\xf0\xd7\x00b\xf0\xd7\x00b\xf0\xd6\xff\xff\xff\xffx\xd6}\x12nt\xb9\xbc(\x83\xfb\xfa\xcc\xc1o?\xde\xcc\x06L\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?K\xae\x00\x00\x00\x00\x02j\x00\x00\x02\xb5\x00b\xf0\xd8\x00b\xf0\xd8\x00b\xf0\xd7\xff\xff\xff\xff\x04"\xfcw6\'M\xba\xf1f\xdb\x02\xbeE\x93:\xc8\x17\x88P\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01B?N\x18\x00\x00\x00\x00\x03\xbb\x00\x00\x04\xb8\x00b\xf0\xd9\x00b\xf0\xd9\x00b\xf0\xd8\xff\xff\xff\xff\xb9\x15*p/\xa4*\x00\x9dZw\x01B\x87L\x8f\x08\x11\x89\xe0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0000changelog.d\x005406413267\n'
```
Debugging showed that we are sending more data than expected. This diff is meant to give us a better error the next time the `streaming_changelog_chunks` table is corrupted.
Reviewed By: StanislavGlebik
Differential Revision: D20763738
fbshipit-source-id: 6f6fa9f9a29909e044d9ba42fe84916ddcb62e8f
Summary:
As suggested in D20680173, we can reduce the overall need to copy things by
storing refs in the resolver.
Reviewed By: krallin
Differential Revision: D20696588
fbshipit-source-id: 9456e2e208cfef6faed57fc52ca59fafdccfc68c
Summary:
See bottom diff of this stack for overview.
This diff in particular asyncifies the `upload_changeset` fn. Apart from that,
it also makes sure it can accept `&RevlogChangeset` instead of
`RevlogChangeset`, which helps us to get rid of cloning.
Reviewed By: krallin
Differential Revision: D20693932
fbshipit-source-id: b0e5e1604cbfb6f6b6e269c85a79208115325734
Summary: Same as the bottom diff of this stack, but for another file.
Reviewed By: krallin
Differential Revision: D20693934
fbshipit-source-id: 4c2d12bf9d9ab272898a7830ece6d9f563adb8fb
Summary:
This diff focuses on the following:
- replaces clones with references, both when this decreases the total sum of
clones, and when it causes the only clone to be on the boundary with the
compat code. This way, when those boundaries are pushed further, we only need
to fix one place in the resolver
- removes a weird wrapping of a closure into an `Arc` and just calls
`upload_changesets` directly instead
- in cases when `BundleResolver` methods take `ctx` as an argument removes it
and makes those methods use the one stored in the struct
Reviewed By: StanislavGlebik
Differential Revision: D20680173
fbshipit-source-id: c397c4ade57a07cbbc9206fa8a44f4225426778c
Summary:
This updates unbundle_replay to account for pushrebase hooks, notably to assign
globalrevs.
To do so, I've extracted the creation of pushrebase hooks in repo_client and
reused it in unbundle_replay. I also had to update unbundle_replay to no longer
use `args::get_repo` since that doesn't give us access to the config (which we
need to know what pushrebase hooks to enable).
Reviewed By: ikostia
Differential Revision: D20622723
fbshipit-source-id: c74068c920822ac9d25e86289a28eeb0568768fc
Summary:
This adds an unbundle_replay Rust binary. Conceptually, this is similar to the
old unbundle replay Python script we used to have, but there are a few
important differences:
- It runs fully in-process, as opposed to pushing to a Mononoke host.
- It will validate that the pushrebase being produced is consistent with what
is expected before moving the bookmark.
- It can find sources to replay from the bookmarks update log (which is
convenient for testing).
Basically, this is to writes and to the old unbundle replay mechanism what
Fastreplay is to reads and to the traffic replay script.
There is still a bit of work to do here, notably:
- Make it possible to run this in a loop to ingest updates iteratively.
- Run hooks.
- Log to Scuba!
- Add the necessary hooks (notably globalrevs)
- Set up pushrebase flags.
I would also like to see if we can disable the presence cache here, which would
let us also use this as a framework for benchmarking work on push performance,
if / when we need that.
Reviewed By: StanislavGlebik
Differential Revision: D20603306
fbshipit-source-id: 187c228832fc81bdd30f3288021bba12f5aca69c
Summary:
A few of our tasks failed on startup and most likely it was during warmup
though we are not sure (see attached task).
Let's add more logging.
Reviewed By: farnz
Differential Revision: D20698273
fbshipit-source-id: 4facd21a94d2917103e417a014b820c893da4718
Summary:
See bottom diff of the stack for overview.
This diff asyncifies the resolver's entrypoint: `resolve` fn, and provides a compatibility shim `resolve_compat` to call from old-style code. Alternatively, we could just do `async move { resolve(...).await }.boxed().compat()` at a callsite. This did not seem too important to do one way or the other, let me know if you feel strongly about it.
Reviewed By: StanislavGlebik
Differential Revision: D20634406
fbshipit-source-id: ee3c73a7a2c65c333e95194bd90ca7330b225011
Summary: See bottom diff of the stack for overview.
Reviewed By: StanislavGlebik
Differential Revision: D20633227
fbshipit-source-id: 16b3f3a764a75261da0585c9724a17853e865681
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `resolve_push`.
Reviewed By: StanislavGlebik
Differential Revision: D20620100
fbshipit-source-id: 1109933e388d9485f42c63638621a7b9227f157f
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `resolve_bookmark_only_pushrebase`.
Reviewed By: StanislavGlebik
Differential Revision: D20610253
fbshipit-source-id: 2a79ac9e8bdca18401ed95d98a0e1b3e92fee4fe
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `return_with_rest_of_bundle`.
Reviewed By: StanislavGlebik
Differential Revision: D20605807
fbshipit-source-id: 0df7d18d06720ff166cdc3e9b981932a819cb0aa
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `next_item`
Reviewed By: StanislavGlebik
Differential Revision: D20605817
fbshipit-source-id: 99fc0100736f0a9307448c6f2ead91da81a531cb
Summary:
See the bottom diff of the stack for overview.
This diff asyncifies two fns: `maybe_save_full_content_bundle2` and `is_next_part_pushkey`
Reviewed By: StanislavGlebik
Differential Revision: D20605814
fbshipit-source-id: 1daea901e7620638fa9b8d0b69c18b0ff4e967da
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `maybe_resolve_commonheads` fn.
Reviewed By: StanislavGlebik
Differential Revision: D20605808
fbshipit-source-id: 6b628a4c66970e468839732db8ffb11c961be591
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `maybe_resolve_pushvars` fn.
Reviewed By: StanislavGlebik
Differential Revision: D20605812
fbshipit-source-id: e68f9d878a294a9980e53b104aa1035c0d47ae65
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `maybe_resolve_changegroup`.
Reviewed By: StanislavGlebik
Differential Revision: D20605810
fbshipit-source-id: fbf5e9d93b355dfa23a3a7657edb96b033535f9d
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `ensure_stream_finished` fn.
Reviewed By: StanislavGlebik
Differential Revision: D20605805
fbshipit-source-id: 853a6d5d1afeee0f8f841eec51302fb3ceb701c7
Summary:
See bottom diff of the stack for motivation.
This diff asyncifies `resolver_multiple_parts` and `maybe_resolve_pushkey` functions.
Reviewed By: StanislavGlebik
Differential Revision: D20605815
fbshipit-source-id: 90768f4495632ec83b79ca9fcf982b0ec5c277cf
Summary:
See bottom diff of the stack for overview.
This diff asyncifies `resolve_b2xtreegroup2`.
Reviewed By: StanislavGlebik
Differential Revision: D20605806
fbshipit-source-id: 2e667d19f2014d051d25a74353e9ebd2e6a93c72
Summary:
See bottom diff of the stack for the overview.
This particular diff asyncifies the `maybe_resolve_infinitepush_bookmarks` fn.
Reviewed By: StanislavGlebik
Differential Revision: D20605816
fbshipit-source-id: 11b6e9c5dd7423bcc4ecc988efd581a3e970ccdc
Summary:
See the bottom diff of the stack for the overview.
This diff specifically migrates the `upload_changesets` function.
Reviewed By: StanislavGlebik
Differential Revision: D20605809
fbshipit-source-id: 36a11a72fb828d494bd18c7737e2682cb3b7cb9a
Summary:
See D20605813 for the overview of the stack.
This diff migrates a few leaf functions.
Reviewed By: krallin
Differential Revision: D20605811
fbshipit-source-id: 2f5d5e5fba3a00afd61a4eb58c505658ac82943a
Summary:
A wider goal of this stack is to migrate `repo_client/unbundle/src/resolver.rs` to async/await and new futures.
The approach is as follows:
- rename old futures upon import into `OldFuture`, `OldBoxFuture`, etc. so that it is easily visible where we use what [this diff]
- implement a bunch of mechanical conversions of continuation-passing-style code to the imperative async/await based code without worrying about clones-vs-references or excessive boxification. Keep individual diffs as small and as mechanical as possible, so that it is easier to review.
- once the `resolve` fn is migrated, introduce `resolve_compat`, which is used from `repo_client/client/src/mod.rs`
- then go through the codebase and see where we can remove clones of resolver/ctx and excessive `boxed()/boxify()` if any are left
Note: `Bundle2` is an `OldStream` and I postpone its migration till the high-level structure of the `resolver.rs` is migrated, since the main value is in allowing imperative-style code in the file
Reviewed By: krallin
Differential Revision: D20605813
fbshipit-source-id: 32255d7b3573f87f74a496e6e40b842e553242a7
Summary:
Run buck build -c rust.clippy=true eden/mononoke/:mononoke#check and fix some
of the reported warnings manually. I wasn't able to get rustfix to work - will
try to see what's wrong and run it.
The suggestions look non-controversial.
Reviewed By: krallin
Differential Revision: D20520123
fbshipit-source-id: 25d4eb493f2363c5aa77bdb3876da4378483f6cb
Summary: 'new' is not very explicit with the fact that things are not refreshed.
Reviewed By: dtolnay
Differential Revision: D20356129
fbshipit-source-id: ff4a8c6fe4c34e93729c902e4b41afbe3c9deca1
Summary:
Now that Arun is about to roll this out to the team, we should get some more
logging in place server side. This updates the designated nodes handling code
to report whether it was enabled (and log prior to the request as well).
Reviewed By: HarveyHunt
Differential Revision: D20514429
fbshipit-source-id: 76ce62a296fe27310af75c884a3efebc5f210a8a
Summary: We had hooks logic scattered around the place - move it all into the hooks crate, so that it's easier to refactor to use Bonsai changesets instead of hg.
Reviewed By: StanislavGlebik
Differential Revision: D20198725
fbshipit-source-id: fb8bdc2cdbd1714c7181a5a0562c1dacce9fcc7d
Summary: Migrate hooks to new futures and thus modern tokio. In the process, replace Lua hooks with Rust hooks, and add fixes for the few cases where Lua was too restrictive about what could be done.
Reviewed By: StanislavGlebik
Differential Revision: D20165425
fbshipit-source-id: 7bdc6820144f2fdaed653a34ff7c998913007ca2
Summary:
D20444137 added a new use of `lfs_threshold`, and D20441264 removed this
variable. These two diffs landed close to the same time without ever being
tested with both diffs together.
Reviewed By: StanislavGlebik
Differential Revision: D20484843
fbshipit-source-id: fd0f0837142cdb641892005a64fd14272da7d2b7
Summary:
Update the `getpack` code to calculate how many files (and their total
size) would be served over LFS.
NOTE: The columns have `Possible` in their names as we might not have LFS
enabled, in which case we aren't actually fetching this many blobs from an LFS
server.
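A minimal sketch of the counting, assuming each file's size is known and that a file would go over LFS when its size is at or above the threshold. The function name and the `>=` comparison are assumptions, not the actual `getpack` code:

```rust
/// Returns (number of files, total bytes) that would possibly be served
/// over LFS. Assumption: a file qualifies when size >= lfs_threshold.
fn possible_lfs_stats(file_sizes: &[u64], lfs_threshold: u64) -> (u64, u64) {
    let mut count = 0u64;
    let mut total = 0u64;
    for &size in file_sizes {
        if size >= lfs_threshold {
            count += 1;
            total += size;
        }
    }
    (count, total)
}
```

The result is computed whether or not LFS is enabled, which is why it only describes what could be served over LFS.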
Reviewed By: farnz
Differential Revision: D20444137
fbshipit-source-id: 85506d8c468cfdc470684dd216567f1848c43d08
Summary:
Allow gradual rollout of lfs. A lot of the details are covered in D20441254, so
I won't repeat them here. I'll only mention that, in order for fastreplay to
correctly calculate percentages, this diff starts logging client_hostname for
fastreplay.
Reviewed By: ikostia
Differential Revision: D20441264
fbshipit-source-id: e272176f68879f6c545784609799d21daedec5eb
Summary: For now it's identical to LfsParams - it will change in the next diff
Reviewed By: krallin
Differential Revision: D20442530
fbshipit-source-id: 8434610373bb9aefe16702207448283b34676ca2
Summary: Since they are static and immutable, fbwhoami/fbwhatami could be simple structs with public fields.
Reviewed By: eugeneoden
Differential Revision: D20299423
fbshipit-source-id: 492f49c2b3003760517bfc5be06ace07fabbc6b9
Summary: I'm about to introduce one more usecase of it so let's rename it first.
Reviewed By: farnz
Differential Revision: D20393776
fbshipit-source-id: d74146fa212cdc4989a18c2cbd28307f58994759
Summary:
The input for getcommitdata is a list of HgChangesetIds. For every entry, the
endpoint retrieves the commit, formats it like the revlog then returns it back
to the client.
The format for the returned entries is:
```
Hash " " Length LF
RevlogContent LF
```
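A small sketch of how one entry in this format could be serialized, assuming `Hash` is the hex node id and `Length` is the decimal byte length of `RevlogContent` (both are readings of the format line above, and `format_entry` is a hypothetical name):

```rust
/// Serializes one entry as: Hash " " Length LF RevlogContent LF.
/// Assumes `hash_hex` is the hex node id and Length is decimal bytes.
fn format_entry(hash_hex: &str, revlog_content: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(hash_hex.len() + revlog_content.len() + 16);
    out.extend_from_slice(hash_hex.as_bytes());
    out.push(b' ');
    out.extend_from_slice(revlog_content.len().to_string().as_bytes());
    out.push(b'\n');
    out.extend_from_slice(revlog_content);
    out.push(b'\n');
    out
}
```

The explicit length lets a reader take exactly that many bytes of content even when the content itself contains newlines.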
Looking for recommendations for how to structure the code better.
Looking for recommendations on implementation requirements:
metrics, throttling, verifying hash for returned bytes.
Reviewed By: krallin
Differential Revision: D20376665
fbshipit-source-id: 5d9eb0d581fd2b352cf3ce44f4777ad45076c8f4
Summary:
This will allow the hg client to do tree fetching like we do in the API Server,
but through the SSH protocol, i.e. by passing a series of manifest ids and
their paths, without recursion on the server side through gettreepack.
Reviewed By: StanislavGlebik
Differential Revision: D20307442
fbshipit-source-id: a6dca03622becdebf41b264381fdd5837a7d4292
Summary:
A lot of callsites want to know repo name. Currently they need to pass it from
the place where repo was initialized, and that's quite awkward, and in some
places even impossible (i.e. in derived data, where I want to log reponame).
This diff adds reponame in BlobRepo
Reviewed By: krallin
Differential Revision: D20363065
fbshipit-source-id: 5e2eb611fb9d58f8f78638574fdcb32234e5ca0d
Summary:
The goal of the whole stack is quite simple (add reponame field to BlobRepo), but
this stack also tries to make it easier to initialize BlobRepo.
To do that BlobrepoBuilder was added. It now accepts RepoConfig instead of 6
different fields from RepoConfig - that makes it easier to pass a field from
config into BlobRepo. It also allows customizing BlobRepo. Currently it's used
just to add redaction override, but later we can extend it for other use cases
as well, with the hope that we'll be able to remove a bunch of repo-creation
functions from cmdlib.
Because of BlobrepoBuilder we no longer need open_blobrepo function. Later we
might consider removing open_blobrepo_given_datasources as well.
Note that this diff *adds* a few new clones. I don't consider it a big
problem, though I'm curious to hear your thoughts, folks.
Note that another option for the implementation would be to take a reference to objects
instead of taking them by value. I briefly looked into how they are used, and a lot of them are passed to the
objects that actually take ownership of what's inside these config fields. I.e. Blobstore essentially takes ownership
of BlobstoreOptions, because it needs to store manifold bucket name.
Same for scuba_censored_table, filestore_params, bookmarks_cache_ttl etc. So unless I'm missing anything, we can
either pass them as references and then we'll have to copy them, or we can
just pass a value from BlobrepoBuilder directly.
Reviewed By: krallin
Differential Revision: D20312567
fbshipit-source-id: 14634f5e14f103b110482557254f084da1c725e1
Summary:
Note that compared to many other asyncifying efforts, this one actually adds
one more clone instead of removing them. This is the clone of a logger field.
That shouldn't matter much because it can be cleaned up later and because this
function will be called once per repo.
Reviewed By: krallin
Differential Revision: D20311122
fbshipit-source-id: ace2a108790b1423f8525d08bdea9dc3a2e3c37c
Summary:
This updates the store_bytes method to chunk incoming data instead of uploading
it as-is. This is unfortunately a bit hacky (but so was the previous
implementation), since it means we have to hash the data before it has gone
through the Filestore's preparation.
That said, one of the invariants of the filestore is that chunk size shouldn't
affect the Content ID (and there is fairly extensive test coverage for this),
so, notionally, this does work.
Performance-wise, it does mean we are hashing the object twice. That actually
was the case before as well anyway (since obtaining the ContentId for FileContents
would clone them then hash them).
The upshot of this change is that large files uploaded through unbundle will
actually be chunked (whereas before, they wouldn't be).
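To illustrate the invariant that chunk size shouldn't affect the content id, here is a toy sketch. FNV-1a stands in for the real content hash, and `store_chunked` is a hypothetical stand-in for the upload path, not the Filestore API:

```rust
/// FNV-1a offset basis and prime (64-bit).
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Feeds `bytes` into a running FNV-1a state. Stand-in for the real hash.
fn fnv1a_update(mut state: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        state ^= b as u64;
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Splits `data` into chunks of at most `chunk_size` bytes and returns the
/// chunks together with an id computed over the whole byte stream.
fn store_chunked(data: &[u8], chunk_size: usize) -> (Vec<Vec<u8>>, u64) {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut chunks = Vec::new();
    let mut id = FNV_OFFSET;
    for chunk in data.chunks(chunk_size) {
        // Hashing chunk by chunk gives the same id as hashing the whole input.
        id = fnv1a_update(id, chunk);
        chunks.push(chunk.to_vec());
    }
    (chunks, id)
}
```

Since the id depends only on the concatenated bytes, hashing before or after chunking gives the same answer in this sketch.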
Long-term, we should try and delete this method, as it is quite unsavory to
begin with. But, for now, we don't really have a choice since our content
upload path does rely on its existence.
Reviewed By: StanislavGlebik
Differential Revision: D20281937
fbshipit-source-id: 78d584b2f9eea6996dd1d4acbbadc10c9049a408
Summary: separate out the Facebook-specific pieces of the sql_ext crate
Reviewed By: ahornby
Differential Revision: D20218219
fbshipit-source-id: e933c7402b31fcd5c4af78d5e70adafd67e91ecd
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
In targets that depend on both 0.1 and 0.2 tokio, this codemod renames the 0.1 dependency to be exposed as tokio_old::. This is in preparation for flipping the 0.2 dependencies from tokio_preview:: to plain tokio::.
This is the tokio version of what D20168958 did for futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs,
rdeps(%Ss, fbsource//third-party/rust:tokio-old, 1)
intersect
rdeps(%Ss, //common/rust/renamed:tokio-preview, 1)
)" \
| xargs sed -i 's,\btokio::,tokio_old::,'
```
Reviewed By: k21
Differential Revision: D20235404
fbshipit-source-id: cfb2689a584ad0d73f16d98d8587fb9c44661465
Summary:
This removes the Extend implementation for FileBytes, which was incorrect (it
discarded existing data!). I had introduced this as a backwards compatibility
shim when doing the Bytes 0.4 to Bytes 0.5 migration :/
We don't really need this shim, considering:
- The only place that really matters that uses this is the remotefilelog crate,
where we have a content id, and where we should use `filestore::fetch_concat`
instead.
- The other places are tests (or close to abandonware...), which can do their
own folding.
Longer term, I'd like to remove the whole `Content` stream in hg entries, so
those callsites can use the filestore methods, which a) have test coverage
(unlike ad-hoc folds, which don't always do), and b) are more efficient since
they know how large the destination buffer needs to be ahead of time, and don't
need to re-allocate.
To make sure this fixes the bug, I also introduced tests for the remotefilelog
crate. As expected, the chunked variant fails without this fix.
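For illustration, a fold that appends each chunk into a buffer sized ahead of time might look like the sketch below. It uses plain `Vec<u8>` rather than `FileBytes`, and `concat_chunks` is a hypothetical helper, not `filestore::fetch_concat`:

```rust
/// Concatenates chunks into one buffer. `total_size` is a capacity hint so
/// the buffer does not need to re-allocate while chunks are appended.
fn concat_chunks<I>(chunks: I, total_size: usize) -> Vec<u8>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut buf = Vec::with_capacity(total_size);
    for chunk in chunks {
        // Append to what is already there; never replace the existing data.
        buf.extend_from_slice(&chunk);
    }
    buf
}
```

A multi-chunk input is the case that exposes a fold which drops earlier data, since a single chunk round-trips either way.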
Reviewed By: mitrandir77
Differential Revision: D20248978
fbshipit-source-id: 1b554d3e595eb867b6b6cf4204d31f27dd90a111
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/
This codemod replaces *all* dependencies on `//common/rust/renamed:futures-preview` with `fbsource//third-party/rust:futures-preview` and their uses in Rust code from `futures_preview::` to `futures::`.
This does not introduce any collisions with `futures::` meaning 0.1 futures because D20168958 previously renamed all of those to `futures_old::` in crates that depend on *both* 0.1 and 0.3 futures.
Codemod performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:futures-preview, 1))" \
| xargs sed -i 's,\bfutures_preview::,futures::,'
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:futures-preview,fbsource//third-party/rust:futures-preview,'
```
Reviewed By: k21
Differential Revision: D20213432
fbshipit-source-id: 07ee643d350c5817cda1f43684d55084f8ac68a6
Summary:
In targets that depend on *both* 0.1 and 0.3 futures, this codemod renames the 0.1 dependency to be exposed as futures_old::. This is in preparation for flipping the 0.3 dependencies from futures_preview:: to plain futures::.
rs changes performed by:
```
rg \
--files-with-matches \
--type-add buck:TARGETS \
--type buck \
--glob '!/experimental' \
--regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
-x \
buck query "labels(srcs,
rdeps(%Ss, fbsource//third-party/rust:futures-old, 1)
intersect
rdeps(%Ss, //common/rust/renamed:futures-preview, 1)
)" \
| xargs sed -i 's/\bfutures::/futures_old::/'
```
Reviewed By: jsgf
Differential Revision: D20168958
fbshipit-source-id: d2c099f9170c427e542975bc22fd96138a7725b0
Summary:
In case this starts to cause problems, let's have a way to correlate those
problems with some exported metrics.
Reviewed By: StanislavGlebik
Differential Revision: D20158822
fbshipit-source-id: 6ac9e25861dbedaecdf04fd92bda835ae66535eb
Summary:
## Wider goal
See D20068839
## This diff
This diff actually implements the conditional hydration of `getbundle`
responses, as described in the D20068839.
Note that as well as implementing support for hydrated `getbundle` responses, this diff also implements support for changegroup v3 and lfs in such responses, which is needed if we are to do this kind of stuff in an LFS-enabled repository.
Reviewed By: StanislavGlebik
Differential Revision: D20068838
fbshipit-source-id: fbdd3f8f5fb7cd2cb60473a94094553a1d4b4d2f
Summary:
The former implementation would eagerly query Memcache when fetching history
(due to how old futures work) for files in getpack, but the new one does not.
This means the new one loses out on a lot of buffering, which the old one used
to do.
This diff emulates the old behavior by eagerly querying filenodes in getpack,
which improves performance on a very big getpack (32K files) by about 3x, and
makes it 30% faster than the old code, instead of > 2x slower.
Note that I'm not certain we really want to do this kind of aggressive
buffering in getpack long term, but for now, I'd like to keep this unchanged.
Reviewed By: StanislavGlebik
Differential Revision: D19905398
fbshipit-source-id: 49f9a2cd505a98123fd1dabb835e8e378d45c930
Summary:
The Bytes 0.5 update left us in a somewhat undesirable position where every
access to our blobstore incurs an extra copy whenever we fetch data out of our
cache (by turning it from Bytes 0.5 into Bytes 0.4). We also have quite a few
places where we convert in one direction then immediately into the other.
Internally, we can start using Bytes 0.5 now. For example, this is useful when
pulling data out of our blobstore and deserializing as Thrift (or conversely,
when serializing and putting it into our blobstore).
However, when we interface with Tokio (i.e. decoders & encoders), we still have
to use Bytes 0.4. So, when needed, we convert our Bytes 0.5 to 0.4 there.
The tradeoff idea is that we deal with more bytes internally than we end up
sending to clients, so doing the Bytes conversion closer to the point of
sending data to clients means fewer copies.
We can also start removing those once we migrate to Tokio 0.2 (and newer
versions of Hyper for HTTP services).
Changes that were required:
- You can't extend new bytes (because that implicitly copies). You need to use
BytesMut instead, which I did where that was necessary (I also added calls in
the Filestore to do that efficiently).
- You can't create bytes from a `&'a [u8]`, unless `'a` is `'static`. You need
to use `copy_from_slice` instead.
- `slice_to` and `slice_from` have been replaced by a `slice()` function that
takes ranges.
Reviewed By: StanislavGlebik
Differential Revision: D20121350
fbshipit-source-id: eb31af2051fd8c9d31c69b502e2f6f1ce2190cb1
Summary:
## Wider goal
See D20068839
## This diff
Asyncifying only signatures allows us to independently work on function bodies, without touching the callsites later in the diff.
Reviewed By: StanislavGlebik
Differential Revision: D20097804
fbshipit-source-id: f1391a055947c7802f719bc99b9eae71a4ac39cd
Summary:
## Wider goal
See D20068839
## This diff
This file contains a mix of old and new-style futures. It even has futures
whose items are themselves composed of futures. To be able to convert one of
the levels and not the other, we need to deal with the confusion.
Let's have old things have `Old` in the name.
Reviewed By: StanislavGlebik
Differential Revision: D20097803
fbshipit-source-id: fedb3669ef34a8328ec389a30ff2c512ab363818
Summary:
## Wider goal
We want the flexibility to return hydrated responses for `getbundle` wireproto
requests for draft commits. This means that the responses will contain not
only the commit data (as they do now), but also trees and files.
For context, when an "unhydrated" response is returned for the `getbundle`
request for a draft commit, we expect one of two things to happen later
in the e2e scenario:
- either `hg` client would immediately make another wireproto request
(`gettreepack`, `getpackv1`) within the same client `hg` command execution
- or a subsequent `hg update` call will cause another wireproto request
In any case, another request is needed before the pulled commit can be used.
This request can hit a different server, sometimes it can even be Mercurial
instead of Mononoke. Specifically, it can be Mercurial instead of Mononoke if the
`fallback` path markers are configured incorrectly. In that case we have a
problem, as Mercurial is incapable of serving `gettreepack` or `getpackv1` for
infinitepush commits.
One way to deal with this is to always have correct path markers, which is
prone to human mistakes. Another way is to guarantee that Mononoke returns
everything in the original `getbundle` request. We don't want to do this for
public commits, as `pull`s of public commits typically fetch thousands of those
commits and never care about tree or file data for all but one of them. Draft
commits are different however, as they are usually exactly what the client
intends to use, so hydrating those is fine. Still, we want this behavior to
be gated behind a config flag.
## This diff
A lot of the needed code is already implemented in the hg-sync job, bundle
generating variant. So prior to implementing the actual behavior described
above, let's move the relevant bits to `getbundle_response`. Later we can comb
them up a bit (asyncify) and use them to implement the needed behavior.
Reviewed By: StanislavGlebik
Differential Revision: D20068839
fbshipit-source-id: 0ab63d57b2d167401b7ee8864fe7760f5f65f8ec
Summary:
During S196197, the lease expired and we kept rederiving the same derived data over and over again for a big commit.
This diff adds lease renewal, which should help with this problem.
Reviewed By: HarveyHunt
Differential Revision: D20093323
fbshipit-source-id: d139abf6659722f47ea40d9b2f279daa03623ff4
Summary:
Let's populate the bonsai<->git mapping on pushrebase of the commits that are
coming from git. Because this is a pushrebase hook, accurate mappings are
available as soon as the bonsai commit is available.
Corresponding configerator change: D19951607
Reviewed By: krallin
Differential Revision: D19949472
fbshipit-source-id: b957cbcdd0f14450ceb090539814952db9872576
Summary:
This adds some basic logging for input size for Gettreepack and Getpack. This
might make it easier to understand "poison pill" requests that take out the
host before it has a chance to finish the request.
Reviewed By: StanislavGlebik
Differential Revision: D19974661
fbshipit-source-id: deae13428ae2d1857872185de2b6c0a8bcaf3334
Summary:
Currently if derivation of a particular derived data type is disabled, but a
client makes a request that requires that derived data type, we will fail with
an internal error.
This is not ideal, as internal errors should indicate something is wrong, but
in this case Mononoke is behaving correctly as configured.
Convert these errors to a new `DeriveError` type, and plumb this back up to
the SCS server. The SCS server converts these to a new `RequestError`
variant: `NOT_AVAILABLE`.
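As a rough illustration of the conversion described above, here is a minimal sketch. The enum variants, the `derive` and `handle_request` helpers, and the message text are all hypothetical stand-ins, not the actual Mononoke or SCS definitions.

```rust
/// Hypothetical stand-in for the new error type; the variants are assumptions.
#[derive(Debug, PartialEq)]
enum DeriveError {
    /// Derivation of this data type is disabled by configuration.
    Disabled(String),
    /// Anything else that went wrong while deriving.
    Internal(String),
}

/// Hypothetical stand-in for the error returned to SCS clients.
#[derive(Debug, PartialEq)]
enum RequestError {
    NotAvailable(String),
    InternalError(String),
}

impl From<DeriveError> for RequestError {
    fn from(err: DeriveError) -> Self {
        match err {
            // A disabled type is a configuration outcome, not a server fault.
            DeriveError::Disabled(ty) => {
                RequestError::NotAvailable(format!("derived data type {} is disabled", ty))
            }
            DeriveError::Internal(msg) => RequestError::InternalError(msg),
        }
    }
}

/// Pretend derivation: succeeds only for types listed in `enabled`.
fn derive(data_type: &str, enabled: &[&str]) -> Result<(), DeriveError> {
    if enabled.contains(&data_type) {
        Ok(())
    } else {
        Err(DeriveError::Disabled(data_type.to_string()))
    }
}

/// Request handler: `?` converts `DeriveError` into `RequestError` via `From`.
fn handle_request(data_type: &str, enabled: &[&str]) -> Result<(), RequestError> {
    derive(data_type, enabled)?;
    Ok(())
}
```

With this shape, a request for a disabled type surfaces as `NotAvailable` rather than as an internal error, while genuine failures still map to `InternalError`.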
Reviewed By: krallin
Differential Revision: D19943548
fbshipit-source-id: 964ad0aec3ab294e4bce789e6f38de224bed54fa
Summary:
This allows code that is being exercised under async_unit to call into code
that expects a Tokio 0.2 environment (e.g. 0.2 timers).
Unfortunately, this requires turning off LSAN for the async_unit tests, since
it looks like LSAN and Tokio 0.2 don't work very well together, resulting in
LSAN reporting leaked memory for some TLS structures that were initialized by
tokio-preview (regardless of whether the Runtime is being dropped):
https://fb.workplace.com/groups/rust.language/permalink/3249964938385432/
Considering async_unit is effectively only used in Mononoke, and Mononoke
already turns off LSAN in tests for precisely this reason ... it's probably
reasonable to do the same here.
The main body of changes here is also about updating the majority of our
tests to stop calling wait() and use this new async unit everywhere. This is
effectively a pretty big batch conversion of all of our tests to use async fns
instead of the former approaches. I've also updated a substantial number of
utility functions to be async fns.
A few notable changes here:
- Some pushrebase tests were pretty flaky — the race they look for isn't
deterministic. I added some actual waiting (using pushrebase hooks) to make
it more deterministic. This is kinda copy pasted from the globalrev hook
(where I had introduced this first), but this will do for now.
- The multiplexblob tests don't work at all with new futures, because they call
`poll()` all over the place. I've updated them to new futures, which required
a bit of reworking.
- I took out a couple tests in async unit that were broken anyway.
Reviewed By: StanislavGlebik
Differential Revision: D19902539
fbshipit-source-id: 352b4a531ef5fa855114c1dd8bb4d70ed967dd55
Summary: The load_limiter was extracted from server/context into its own crate and the server/context itself was refactored into multiple modules, one of which contains facebook-specific code.
Reviewed By: StanislavGlebik
Differential Revision: D19902972
fbshipit-source-id: d577492b4fe01ccfe11b3e092e0521b190516268
Summary: remove the need to pass mapping to `::derive` method
Reviewed By: StanislavGlebik
Differential Revision: D19856560
fbshipit-source-id: 219af827ea7e077a4c3e678a85c51dc0e3822d79
Summary:
This will allow us to distinguish `getbundle` for a normal `pull` from the one
for infinitepush pull.
Reviewed By: StanislavGlebik
Differential Revision: D19833206
fbshipit-source-id: 86534320fbb4d60bac04d458a0953701201cba87
Summary:
This commit manually synchronizes the internal move of
fbcode/scm/mononoke under fbcode/eden/mononoke which couldn't be
performed by ShipIt automatically.
Reviewed By: StanislavGlebik
Differential Revision: D19722832
fbshipit-source-id: 52fbc8bc42a8940b39872dfb8b00ce9c0f6b0800
Summary:
See D19787960 for more details on why we need to do this.
This diff just adds a struct in BlobRepo.
Reviewed By: HarveyHunt
Differential Revision: D19788395
fbshipit-source-id: d609638432db3061f17aaa6272315f0c2efe9328
Summary:
Follow up from D19718839 - let's add a function that will safely sync a commit
from one repo to another. Other functions that sync a commit are prefixed with
`unsafe`.
Reviewed By: krallin
Differential Revision: D19769762
fbshipit-source-id: 844da3e2c1cc39ef3cd86d282d275d860be55f44
Summary:
Fetching things from MySQL sequentially in a buffered fashion is a bad
practice, since we might end up saturating the underlying MySQL pool with a lot
of requests. Doing so will result in other queries being delayed as they wait
behind our batch of queries, which results in higher dispatch latency.
Instead, let's make fewer, bigger queries. Also, while we're in here, let's
update blobrepo to have an up-to-date comment.
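To illustrate the "fewer, bigger queries" idea, here is a minimal sketch that groups keys into batches and builds one query per batch instead of one query per key. The table name, column names, and helper functions are hypothetical and are not taken from the Mononoke schema or code.

```rust
/// Split `keys` into batches of at most `batch_size` entries.
fn batch_keys(keys: &[u64], batch_size: usize) -> Vec<Vec<u64>> {
    assert!(batch_size > 0, "batch_size must be positive");
    keys.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}

/// Build one `IN (...)` query per batch (hypothetical table and columns).
fn build_queries(keys: &[u64], batch_size: usize) -> Vec<String> {
    batch_keys(keys, batch_size)
        .iter()
        .map(|batch| {
            let ids: Vec<String> = batch.iter().map(|k| k.to_string()).collect();
            format!(
                "SELECT id, value FROM mapping WHERE id IN ({})",
                ids.join(", ")
            )
        })
        .collect()
}
```

For five keys and a batch size of two, this yields three queries rather than five, so fewer requests queue up on the connection pool.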
Reviewed By: StanislavGlebik
Differential Revision: D19766788
fbshipit-source-id: 318ec4778ca259b210d431fc2add8b327bfce99a
Summary:
Suggestions come in the error message as it is currently implemented in
Mercurial code. Format of suggestions also stays the same.
We give the hash, time, author and the title.
All suggestions are ordered (most recent go first).
We don't show them if there are too many.
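The ordering and cutoff rules can be sketched roughly as follows. The `Suggestion` struct, the `limit` parameter, and the one-line output format are assumptions for illustration only; the real format follows the Mercurial implementation and is not reproduced here.

```rust
/// Hypothetical record for one suggested commit.
#[derive(Debug, Clone, PartialEq)]
struct Suggestion {
    hash: String,
    time: i64, // seconds since the epoch (assumed representation)
    author: String,
    title: String,
}

/// Return the suggestion text, most recent first, or `None` when there is
/// nothing to show or there are more candidates than `limit`.
fn format_suggestions(mut candidates: Vec<Suggestion>, limit: usize) -> Option<String> {
    if candidates.is_empty() || candidates.len() > limit {
        return None;
    }
    // Sort by time, descending, so the most recent commit comes first.
    candidates.sort_by(|a, b| b.time.cmp(&a.time));
    let lines: Vec<String> = candidates
        .iter()
        .map(|s| format!("{} {} {} {}", s.hash, s.time, s.author, s.title))
        .collect();
    Some(lines.join("\n"))
}
```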
Reviewed By: krallin
Differential Revision: D19732053
fbshipit-source-id: b94154cbc5a4f440a0053fc3fac2bca2ae0b7119