Summary: Just a small cleanup before the next diff
Reviewed By: HarveyHunt
Differential Revision: D16986263
fbshipit-source-id: 85f064ab5b7dc1bfdf0596b9ca1c001851e967db
Summary:
This is the mechanical part of the rename; it does not change any commit messages in
tests, nor the scuba table name/config setting. Those are more
complex.
Reviewed By: krallin
Differential Revision: D16890120
fbshipit-source-id: 966c0066f5e959631995a1abcc7123549f7495b6
Summary:
We now have two Manifest traits and two Entry traits, which is very confusing.
The old Manifest trait (and the related EmptyManifest struct and Entry trait) are
Mercurial-specific, while the new Manifest trait is generic over all kinds of
manifests (unodes, Mercurial, etc.).
Ideally we would have only the new Manifest trait, but that would require quite a
big refactoring. In the short term, let's at least rename Manifest to HgManifest and
Entry to HgEntry to make the distinction clearer.
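A minimal sketch of the naming scheme after the rename. The trait shapes and implementing types here are hypothetical, purely to illustrate how the Hg prefix disambiguates the two traits:

```rust
// Hypothetical sketch: after the rename, the Mercurial-specific trait carries
// an Hg prefix, while the unprefixed Manifest name belongs to the generic trait.

/// Generic manifest trait: implemented for unodes, Mercurial manifests, etc.
trait Manifest {
    fn kind(&self) -> &'static str;
}

/// Mercurial-specific manifest trait (previously just `Manifest`).
trait HgManifest {
    fn hg_node_hex(&self) -> String;
}

struct UnodeManifest;
impl Manifest for UnodeManifest {
    fn kind(&self) -> &'static str { "unode" }
}

struct RevlogManifest;
impl Manifest for RevlogManifest {
    fn kind(&self) -> &'static str { "hg" }
}
// The same concrete type can implement both traits without name clashes.
impl HgManifest for RevlogManifest {
    fn hg_node_hex(&self) -> String { "abcdef0123456789".to_string() }
}

fn main() {
    let m = RevlogManifest;
    let u = UnodeManifest;
    println!("{} {} {}", u.kind(), m.kind(), m.hg_node_hex());
}
```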
Reviewed By: krallin
Differential Revision: D16886032
fbshipit-source-id: 80e3999fae3e89347448579caf7aafec56705500
Summary:
This updates Mononoke to support LFS metadata when serving data over getpackv2.
However, in doing so, I've also refactored the various ways in which we currently access file data to serve it to clients or to process client uploads (when we need to compute deltas). The motivation to do that is that we've had several issues recently where some protocols knew about some functionality, and others didn't. Notably, redaction and LFS were supported in getfiles, but neither of them were supported in getpack or eden_get_data.
This patch refactors all those callsites away from blobrepo and instead through repo_client/remotefilelog, which provides an internal common method to fetch a filenode and return its metadata and bytes (prepare_blob), plus separate protocol-specific implementations:
- getpackv1: metadata + file content (this is basically the existing fetch_raw_filenode_bytes function)
- getpackv2: metadata + file content + getpackv2 metadata
- getfiles: just file content, with file history tied into its response
- eden_get_data: uses getpackv1
Here are a few changes worth noting as you review this:
- The getfiles method used to get its filenode from get_maybe_draft_filenode, but all it needed was the copy info. However, the updated method gets its filenode from the envelope (which also has this data). This should be equivalent.
- I haven't been able to remove fetch_raw_filenode_bytes yet because there's a callsite that still uses it and it's not entirely clear to me whether this is used and why. I'll look into it, but for now I left it unchanged.
- I've used the Mercurial implementation of getpack metadata here. This feels like the better approach so we can reuse some of the code, but historically I don't think we've depended on many Mercurial crates. Let me know if there's a reason not to do that.
Finally, there are a couple of things to be aware of as you review:
- I removed some more `Arc<BlobRepo>` in places where it made it more difficult to call the new remotefilelog methods.
- I updated the implementation to get copy metadata out of a file envelope to not require copying the metadata into a mercurial::file::File only to immediately discard it.
- I cleaned up an LFS integration test a little bit. There are few functional changes there, but it makes tests a little easier to work with.
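The common-fetch-plus-protocol-wrappers shape can be sketched as below. This is an illustration only: the real prepare_blob is async and fetches from the blobstore, and the struct and field names here are hypothetical (the `\x01\n` framing mirrors Mercurial's filelog metadata convention):

```rust
// Hypothetical sketch of a common fetch method with protocol-specific wrappers.

struct PreparedBlob {
    copy_metadata: Option<String>, // e.g. "copy: old/path\n"
    content: Vec<u8>,
}

/// Common method: fetch a filenode and return its metadata and bytes.
fn prepare_blob(content: &[u8], copy_from: Option<&str>) -> PreparedBlob {
    PreparedBlob {
        copy_metadata: copy_from.map(|p| format!("copy: {}\n", p)),
        content: content.to_vec(),
    }
}

/// getpackv1-style: metadata and file content serialized into one buffer.
fn getpackv1_blob(b: &PreparedBlob) -> Vec<u8> {
    let mut out = Vec::new();
    if let Some(meta) = &b.copy_metadata {
        // \x01\n ... \x01\n is Mercurial's inline metadata framing.
        out.extend_from_slice(b"\x01\n");
        out.extend_from_slice(meta.as_bytes());
        out.extend_from_slice(b"\x01\n");
    }
    out.extend_from_slice(&b.content);
    out
}

/// getfiles-style: just the file content; history is tied in elsewhere.
fn getfiles_blob(b: &PreparedBlob) -> Vec<u8> {
    b.content.clone()
}

fn main() {
    let b = prepare_blob(b"hello", Some("old/path"));
    // The v1 response carries the copy metadata on top of the raw content.
    assert!(getpackv1_blob(&b).len() > getfiles_blob(&b).len());
    println!("ok");
}
```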
Reviewed By: farnz
Differential Revision: D16784413
fbshipit-source-id: 5c045d001472fb338a009044ede1e22ccd34dc55
Summary:
Start moving mercurial related stuff to `mercurial` directory:
- rename `mercurial` to `mercurial_revlog` and moved to `/mercurial/revlog`
- move `mercurial_types` to `/mercurial/types`
- move `mercurial_bundles` to `/mercurial/bundles`
Reviewed By: farnz
Differential Revision: D16783728
fbshipit-source-id: 79cf1757bb7cc84a6273a4a3c486242b1ef4cd00
Summary: We need this to not break clients with `remotefilelog.fetchpacks=True`.
Reviewed By: krallin
Differential Revision: D16735900
fbshipit-source-id: a993610c2e95cfde95a8be1e31eaad825f95dc15
Summary:
I plan to use it for `getpackv1`/`getpackv2`/edenapi handling as well.
Note: there might be a better place to put this code; given that it will be used for edenapi, I am open to suggestions.
Reviewed By: krallin
Differential Revision: D16735901
fbshipit-source-id: 6a71d80bb9223cd18f080163dcb20146c91498bf
Summary: Not used anywhere and is not obvious why it should be public.
Reviewed By: ahornby
Differential Revision: D16730490
fbshipit-source-id: 8f24b871a7c1f5cca00ea8418fe4e7e94a48e2f1
Summary:
NOTE: This isn't 100% complete yet. I have a little more work to do around the aliasverify binary, but I think it'll make sense to rework this a little bit with the Filestore anyway.
This patch incorporates the Filestore throughout Mononoke. At this time, what this means is:
- Blobrepo methods return streams of `FileBytes`.
- Various callsites that need access to `FileBytes` call `concat2` on those streams.
This also eliminates the Sha256 aliasing code that we had written for LFS and replaces it with a Filestore-based implementation.
However, note that this does _not_ change how files submitted through `unbundle` are written to blobstores right now. Indeed, those contents are passed into the Filestore through `store_bytes`, which doesn't do chunking. This is intentional since it lets us use LFS uploads as a testbed for chunked storage before turning it on for everything else (also, chunking those requires further refactoring of content uploads, since right now they don't expect the `ContentId` to come back through a Future).
The goal of doing it this way is to make the transition simpler. In other words, this diff doesn't change anything functionally; it just updates the underlying API we use to access files. This is also important for a smooth release: if we had new servers that started chunking things while old servers tried to read them, things would be bad. Doing it this way ensures that doesn't happen.
This means that streaming is there, but it's not being leveraged just yet. I'm planning to do so in a separate diff, starting with the LFS read and write endpoints.
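A rough std-only analogy of the new access pattern. In the real code, blobrepo methods return a futures `Stream` of `FileBytes` and callers use `concat2`; here a plain iterator of chunks stands in for the stream:

```rust
// Std-only analogy: an iterator of byte chunks stands in for Stream<FileBytes>.

/// Stand-in for a streaming fetch: yields the file in chunks.
fn fetch_file_chunks() -> impl Iterator<Item = Vec<u8>> {
    vec![b"hello ".to_vec(), b"wor".to_vec(), b"ld".to_vec()].into_iter()
}

/// Callers that need the whole file collapse the stream, mirroring concat2().
fn concat_chunks(chunks: impl Iterator<Item = Vec<u8>>) -> Vec<u8> {
    chunks.flatten().collect()
}

fn main() {
    let whole = concat_chunks(fetch_file_chunks());
    assert_eq!(whole, b"hello world");
    println!("{}", String::from_utf8(whole).unwrap());
}
```

The point of the refactor is that callers who can consume chunks incrementally no longer have to materialize the whole file; only the `concat2` callsites do.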
Reviewed By: farnz
Differential Revision: D16440671
fbshipit-source-id: 02ae23783f38da895ee3052252fa6023b4a51979
Summary:
It's currently challenging to compare client logging to server logging, because we just don't have enough information to correlate logs.
Add in two pieces of data from the client, if available:
1. Client correlator, which lets us compare to client performance logging.
2. Short hg command (`pull`, `update`, `rebase`, but not `update --clean master` or `rebase --source foo --destination master`). This lets us quickly look for problematic commands with a suitable query.
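The "short command" truncation amounts to keeping only the command name and dropping all arguments and flags. A hypothetical helper (the real logging path is not shown in this diff) could look like:

```rust
/// Hypothetical sketch: reduce a full hg command line to its short form,
/// i.e. keep only the command name and drop all arguments/flags.
fn short_hg_command(full: &str) -> &str {
    full.split_whitespace().next().unwrap_or("")
}

fn main() {
    assert_eq!(short_hg_command("update --clean master"), "update");
    assert_eq!(short_hg_command("rebase --source foo --destination master"), "rebase");
    assert_eq!(short_hg_command("pull"), "pull");
    println!("ok");
}
```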
Reviewed By: StanislavGlebik
Differential Revision: D16499809
fbshipit-source-id: c17a348c6f950bd207df5296531e5d39124e3a79
Summary: Will be used in the canaries
Reviewed By: HarveyHunt
Differential Revision: D16515265
fbshipit-source-id: 7ed21dce971531899bc8983ec04841aa6d707e09
Summary:
Any get request for a blacklisted file returns a magic string (not an error anymore), which replaces the actual content of the file. The magic string is a sequence of 256 random characters.
Put still returns an error if there is an attempt to change a blacklisted file, but that's the normal behavior for this type of request.
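The asymmetric get/put behavior can be sketched with an in-memory stand-in (the store type, magic-string value, and error type here are all hypothetical):

```rust
use std::collections::{HashMap, HashSet};

// Stand-in for the 256-character magic string that replaces blacklisted content.
const MAGIC: &str = "<256 random characters would go here>";

struct BlacklistingStore {
    inner: HashMap<String, String>,
    blacklist: HashSet<String>,
}

impl BlacklistingStore {
    /// Get never errors for blacklisted keys: it returns the magic string
    /// in place of the real content.
    fn get(&self, key: &str) -> Option<String> {
        if self.blacklist.contains(key) {
            return Some(MAGIC.to_string());
        }
        self.inner.get(key).cloned()
    }

    /// Put still refuses to modify a blacklisted key.
    fn put(&mut self, key: &str, value: &str) -> Result<(), String> {
        if self.blacklist.contains(key) {
            return Err(format!("key {} is blacklisted", key));
        }
        self.inner.insert(key.to_string(), value.to_string());
        Ok(())
    }
}

fn main() {
    let mut store = BlacklistingStore {
        inner: HashMap::new(),
        blacklist: ["secret".to_string()].into_iter().collect(),
    };
    store.put("ok", "contents").unwrap();
    assert_eq!(store.get("ok").as_deref(), Some("contents"));
    assert_eq!(store.get("secret").as_deref(), Some(MAGIC)); // magic string, not an error
    assert!(store.put("secret", "new").is_err()); // writes still fail
    println!("ok");
}
```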
Reviewed By: ikostia
Differential Revision: D16333820
fbshipit-source-id: 54e0189ebe2a71ea365979950d44dbb3fd1e5dc7
Summary: Adding throttling to gettreepack requests. Note that it also fixes ods tree counting as it was wrong previously - it was counting blocks of bytes, which made no sense at all. One block of bytes could contain a single tree, multiple trees or even part of a tree.
Reviewed By: krallin
Differential Revision: D16328588
fbshipit-source-id: e5f1869e08cac19f5a662b9a8e3a87e002a5f25c
Summary:
In earlier diffs in this stack, I updated the callsites that reference XDB tiers to use concrete &str types (which is what they were receiving until now ... but it wasn't spelled out as-is).
In this diff, I'm updating them to use owned `String` instead, which lets us hoist up `to_string()` and `clone()` calls in the stack, rather than passing references down only to copy them later on.
This allows us to skip some unnecessary copies. It turns out we were doing quite a few rounds of "turn this String into a reference, pass it down the stack, then turn it back into a String".
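The before/after shape, sketched with hypothetical function names:

```rust
// Before: taking &str forces a copy at the bottom of the stack.
fn connect_by_ref(tier: &str) -> String {
    // ...eventually something needs an owned String anyway:
    tier.to_string()
}

// After: taking String moves ownership in, so any clone happens (visibly)
// at the top of the stack instead of being hidden at the bottom.
fn connect_by_value(tier: String) -> String {
    tier // no copy here: ownership was moved in
}

fn main() {
    let tier = String::from("xdb.mononoke_production");
    let a = connect_by_ref(&tier); // hidden to_string() at the bottom
    let b = connect_by_value(tier); // the move consumes tier; zero copies
    assert_eq!(a, b);
    println!("ok");
}
```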
Reviewed By: farnz
Differential Revision: D16260372
fbshipit-source-id: faec402a575833f6555130cccdc04e79ddb8cfef
Summary:
We're currently not emitting any copy metadata in LFS in response to getfiles. As it happens, this appears to be fine (clients still get the right metadata), because gettreepack includes file copy data, and so the client isn't bothered by the fact that the LFS pointers we return don't include it.
That said, it's probably a good idea to clean this up and include copy data. After all, we have it, and we should return to the client whatever they sent us.
Reviewed By: farnz
Differential Revision: D16282793
fbshipit-source-id: 6671f87de2acae2c5da1c754510bbb61de5768d3
Summary:
Instantiating a new DB connection may require remote calls to be made to e.g. Hipster to allocate a new certificate (this is only the case when connecting to MySQL).
Currently, our bindings to our underlying DB locator make a blocking call to pretend that this operation is synchronous: https://fburl.com/ytmljxkb
This isn't ideal, because this call might actually take time, and we might also occasionally want to retry it (we've had issues in our MySQL tests with acquiring certificates that retrying should resolve). Running this synchronously makes doing so inefficient.
This patch doesn't update that, but it fixes everything on the Rust side of things to stop expecting connections to return a `Result` (and to start expecting a Future instead).
In a follow up diff, I'll work on making the changes in common/rust/sql to start returning a Future here.
Reviewed By: StanislavGlebik
Differential Revision: D16221857
fbshipit-source-id: 263f9237ff9394477c65e455de91b19a9de24a20
Summary:
Before this diff, the `RepoBlobstore` type alias (and the newly-added
`RepoBlobstoreArgs` struct) lived in `blobrepo/blob_changeset` crate, which is
obviously the most correct place for them to be located. All would be fine, had
these things been used locally only. But `RepoBlobstore` is a reasonably
widely-used type alias across our codebase and importing it from
`blob_changeset` seems weird. Let's move it into a dedicated crate.
Reviewed By: StanislavGlebik
Differential Revision: D16174126
fbshipit-source-id: b83e345adcfe567e4a67c8a1621f3a789fab63c6
Summary:
This diff sets two Rust lints to warn in fbcode:
```
[rust]
warn_lints = bare_trait_objects, ellipsis_inclusive_range_patterns
```
and fixes occurrences of those warnings within common/rust, hg, and mononoke.
Both of these lints are set to warn by default starting with rustc 1.37. Enabling them early avoids writing even more new code that needs to be fixed when we pull in 1.37 in six weeks.
Upstream tracking issue: https://github.com/rust-lang/rust/issues/54910
Reviewed By: Imxset21
Differential Revision: D16200291
fbshipit-source-id: aca11a7a944e9fa95f94e226b52f6f053b97ec74
Summary:
Previously we used "getpack" name for both getpackv1 and getpackv2.
There were two problems with that:
1) Traffic replay doesn't support getpackv2
2) Traffic replay needs an exact name - either "getpackv1" or "getpackv2"
This diff fixes both of the problems.
Note that it'd be worth adding integration tests for traffic replay. I'll do it
in the next diffs
Reviewed By: krallin
Differential Revision: D16200293
fbshipit-source-id: 8fdb3032bcf64a10bde27bd9750e0356d889a480
Summary:
We have problems with expensive gettreepack requests coming from quicksand.
I'll add load shedding, but before I do this I need to find what the limits for
load shedding are.
Let's use ods for that
Reviewed By: ikostia
Differential Revision: D16182208
fbshipit-source-id: 70367c876fb392f368583442ad1e45db593f9e0b
Summary:
The sync job is used to replay pushes from Mononoke onto Mercurial. Its code is in https://fburl.com/dt1hkf4g. For more about the sync job, see https://fburl.com/wiki/sd2i6rch.
The goal of the sync job is not only to replicate pushes from Mononoke onto Mercurial, but also to verify that the push was correct, i.e. that a push on Mononoke produced the same hash as on Mercurial.
The problem is that we lock and unlock in two completely different codepaths: locking is done by an external script called the "failure handler" (https://fburl.com/7txm3jxf), while unlocking is part of the sync job.
Ideally we'd like to lock and unlock in the sync job itself and remove "failure handler" entirely.
Reviewed By: StanislavGlebik
Differential Revision: D16107759
fbshipit-source-id: a418f8d0f48fa6db82476be72a91adbc03b66168
Summary: Gettreepack requests can be slow if there's no base manifest. Instead of trying to track every manifest we've passed down to the client (which is CPU and memory expensive, as we end up tracking every `(path, manifesthash)` pair in the output), simply use the first manifest we were asked for as the base manifest in this case
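The base-manifest fallback can be sketched as follows (hypothetical function, for illustration of the choice only):

```rust
/// Hypothetical sketch of the base-manifest choice in gettreepack: if the
/// client supplied no base manifest, use the first manifest we were asked
/// for as the base, instead of tracking every (path, manifesthash) pair
/// already sent to the client.
fn choose_base(client_bases: &[&str], requested: &[&str]) -> Option<String> {
    client_bases
        .first()
        .or_else(|| requested.first())
        .map(|s| s.to_string())
}

fn main() {
    // Normal case: the client's own base wins.
    assert_eq!(choose_base(&["base1"], &["m1", "m2"]).as_deref(), Some("base1"));
    // No base from the client: fall back to the first requested manifest.
    assert_eq!(choose_base(&[], &["m1", "m2"]).as_deref(), Some("m1"));
    println!("ok");
}
```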
Reviewed By: StanislavGlebik
Differential Revision: D16120603
fbshipit-source-id: 19119e481dfad37c2539f6f02f6ffccb13bc9895
Summary:
This adds the ability to provide an infinitepush namespace configuration without actually allowing infinite pushes server side. This is useful while Mercurial is the write master for Infinite Push commits, for two reasons:
- It lets us enable the infinitepush namespace, which will allow the sync to proceed between Mercurial and Mononoke, and also prevents users from making regular pushes into the infinitepush namespace.
- It lets us prevent users from sending commit cloud backups to Mononoke (we had an instance of this reported in the Source Control @ FB group).
Note that since we are routing backfills through the shadow tier, I've left infinitepush enabled there.
Reviewed By: StanislavGlebik
Differential Revision: D16071684
fbshipit-source-id: 21e26f892214e40d94358074a9166a8541b43e88
Summary:
CensoredBlob was placed between Blobstore and PrefixBlobstore. I moved CensoredBlob, so that now it is a wrapper for PrefixBlobstore. This means the key is compared before appending the `repoid` to the key.
By moving CensoredBlob on top of PrefixBlobstore, we get better isolation from the existing blobstores. This way, CensoredBlob does not interact with the underlying layers, and future changes to those layers will most probably not impact CensoredBlob's implementation.
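The new layering can be sketched like this (the store types are stand-ins): the censorship check sees the un-prefixed key, and only then does the prefix layer append the `repoid`:

```rust
use std::collections::HashSet;

/// Stand-in for PrefixBlobstore: prepends the repoid to every key.
struct PrefixStore {
    prefix: String,
}

impl PrefixStore {
    fn get(&self, key: &str) -> String {
        format!("fetched {}{}", self.prefix, key)
    }
}

/// Stand-in for CensoredBlob, now layered *above* the prefix layer, so the
/// censorship check compares the raw key before the repoid is appended.
struct CensoredStore {
    censored: HashSet<String>,
    inner: PrefixStore,
}

impl CensoredStore {
    fn get(&self, key: &str) -> Result<String, String> {
        if self.censored.contains(key) {
            return Err(format!("{} is censored", key));
        }
        Ok(self.inner.get(key))
    }
}

fn main() {
    let store = CensoredStore {
        censored: ["bad/key".to_string()].into_iter().collect(),
        inner: PrefixStore { prefix: "repo1234.".to_string() },
    };
    // Un-censored keys pass through and get prefixed below:
    assert_eq!(store.get("ok/key").unwrap(), "fetched repo1234.ok/key");
    // Censored keys are rejected before the prefix layer ever sees them:
    assert!(store.get("bad/key").is_err());
    println!("ok");
}
```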
Reviewed By: ikostia
Differential Revision: D15900610
fbshipit-source-id: 391594355d766f43638f3152b56d4e9acf49af32
Summary:
Unfortunately, the getpackv1 protocol doesn't support LFS blobs, which is
blocking the deployment of remotefilelog.fetchpacks to ovrsource clients.
The easiest way to get there was to simply add a getpackv2 API that is similar
in every way to getpackv1, but with the addition of a metadata field. While
full support for this was added to Mercurial, the Mononoke support is the
absolute minimum required, as Mononoke doesn't support LFS.
I'm expecting that EdenAPI will completely remove the need for getpackv2, and
therefore this code should be fairly short-lived.
Reviewed By: farnz
Differential Revision: D15954031
fbshipit-source-id: 465ac13ed8987191ccf9a7cec198d913143aaf13
Summary:
In the next diffs it will be used inside blobrepo to correctly generate hg
filenodes. This diff just moves the functions to a common place.
There are a few cleanups that can be done - see next diff.
Reviewed By: aslpavel
Differential Revision: D15896733
fbshipit-source-id: 19b58e4eb73fd4897d7c8ab28c0dcd8786124f2a
Summary:
New name better reflects what this function does - it might return cached
version of filenodes that might be out of date.
Reviewed By: aslpavel
Differential Revision: D15896734
fbshipit-source-id: caf4f1d3a9a29889327c3373ac886687ec916903
Summary:
This updates the mononoke server code to support booting without myrouter. This required 2 changes:
- There were a few callsites where we didn't handle not having a myrouter port.
- In our function that waits for myrouter, we were failing if we had no myrouter port, but that's not desirable: if we don't have a myrouter port, we simply don't need to wait.
Arguably, this isn't 100% complete yet. Notably, RepoReadWriteStatus still requires myrouter. I'm planning to create a bootcamp task for this, since it's not blocking my work adding integration tests, but it would be nice to have.
Speaking of further refactoring, it'd be nice if we supported a `SqlConstructors::with_xdb` function that did this matching for us, so we didn't have to duplicate it all over the place. I'm also planning to bootcamp this.
Reviewed By: farnz
Differential Revision: D15855431
fbshipit-source-id: 96187d887c467abd48ac605c28b826d8bf09862b
Summary:
- more tracing for potentially large pieces of work
- removed some unnecessary tracing
Reviewed By: StanislavGlebik
Differential Revision: D15851576
fbshipit-source-id: 6686c00da56176cad43f72d1671e08eb8141110f
Summary:
By saving bookmarks when the client calls `heads` (and thus is starting discovery), we can fix the race that stopped us supporting `listkeys` as a bundle2 part.
This saves about 1 second per no-op pull on my devvm - I'm hoping for more improvement on Sandcastle.
Reviewed By: StanislavGlebik
Differential Revision: D15625211
fbshipit-source-id: 47e59848dff56fcf9d893ee3b3c329d69883a57e
Summary:
This is just a boilerplate, giving `resolve` access to the config option,
which controls whether we should allow pure pushes. This option will be
enabled by default and we'll disable it explicitly for repos where it makes
no sense.
Reviewed By: StanislavGlebik
Differential Revision: D15520450
fbshipit-source-id: b1bb913c14a6aac6aa68ed8b8a7b4ff270da1688
Summary:
This adds a sanity check that limits the count of matches in `list_all_bookmarks_with_prefix`.
If we find more matches than the limit, then an error will be returned (right now, we don't have support for e.g. offsets in this functionality, so the only alternative approach is for the caller to retry with a more specific pattern).
The underlying goal is to ensure that we don't trivially expose Mononoke to accidental denial of service when a list lists `*` and we end up querying literally all bookmarks.
I picked a fairly conservative limit here (500,000), which is more than 5 times the number of bookmarks we currently have (we can load what we have right now successfully... but it's pretty slow).
Note that listing pull default bookmarks is not affected by this limit: this limit is only used when our query includes scratch bookmarks.
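The sanity check amounts to bailing out as soon as the match count exceeds the limit. A sketch with hypothetical names (the real query runs in SQL, not over an in-memory list):

```rust
/// The conservative limit from this diff.
const BOOKMARK_LIST_LIMIT: usize = 500_000;

/// Hypothetical sketch of the check in list_all_bookmarks_with_prefix:
/// if more bookmarks match than the limit, return an error so the caller
/// can retry with a more specific pattern.
fn list_with_limit<'a>(
    all: &'a [String],
    prefix: &str,
    limit: usize,
) -> Result<Vec<&'a String>, String> {
    let mut out = Vec::new();
    for b in all.iter().filter(|b| b.starts_with(prefix)) {
        if out.len() >= limit {
            return Err(format!(
                "more than {} bookmarks match prefix {:?}; use a more specific pattern",
                limit, prefix
            ));
        }
        out.push(b);
    }
    Ok(out)
}

fn main() {
    let bookmarks: Vec<String> = (0..10).map(|i| format!("scratch/{}", i)).collect();
    assert_eq!(list_with_limit(&bookmarks, "scratch/", 100).unwrap().len(), 10);
    assert!(list_with_limit(&bookmarks, "scratch/", 5).is_err());
    let _ = BOOKMARK_LIST_LIMIT;
    println!("ok");
}
```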
Reviewed By: StanislavGlebik
Differential Revision: D15413620
fbshipit-source-id: 1030204010d78a53372049ff282470cdc8187820
Summary:
This updates our receive path for B2xInfinitepush to create new scratch bookmarks.
Those scratch bookmarks will:
- Be non-publishing.
- Be non-pull-default.
- Not be replicated to Mercurial (there is no entry in the update log).
I added a sanity check on infinite pushes to validate that bookmarks fall within a given namespace (which is represented as a Regexp in configuration). We'll want to determine whether this is a good mechanism and what the regexp for this should be prior to landing (I'm also considering adding a soft-block mode that would just ignore the push instead of blocking it).
This ensures that someone cannot accidentally perform an infinitepush onto master by tweaking their client-side configuration.
---
Note that, as of this diff, we do not support the B2xInfinitepushBookmarks part (i.e. backup bookmarks). We might do that separately later, but if we do, it won't be through scratch Bookmarks (we have too many backup bookmarks for this to work)
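The namespace sanity check could look roughly like the sketch below. The real configuration holds a regexp; a simple prefix stands in for it here, and all names are hypothetical:

```rust
/// Hypothetical sketch of the infinitepush namespace check. The real config
/// represents the namespace as a regexp; a prefix stands in for it here.
struct InfinitepushNamespace {
    prefix: String,
}

impl InfinitepushNamespace {
    fn matches(&self, bookmark: &str) -> bool {
        bookmark.starts_with(&self.prefix)
    }
}

/// An infinitepush bookmark must fall inside the namespace, so a client
/// tweaking its config cannot accidentally infinitepush onto master.
fn check_infinitepush(ns: &InfinitepushNamespace, bookmark: &str) -> Result<(), String> {
    if ns.matches(bookmark) {
        Ok(())
    } else {
        Err(format!("bookmark {:?} is outside the infinitepush namespace", bookmark))
    }
}

fn main() {
    let ns = InfinitepushNamespace { prefix: "scratch/".to_string() };
    assert!(check_infinitepush(&ns, "scratch/user/feature").is_ok());
    assert!(check_infinitepush(&ns, "master").is_err());
    println!("ok");
}
```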
Reviewed By: StanislavGlebik
Differential Revision: D15364677
fbshipit-source-id: 23e67d4c3138716c791bb8050459698f8b721277
Summary:
This adds support for recording server-side whether a given bookmark is publishing and / or pull-default. This is a change on the way towards supporting Infinite Push in Mononoke. This diff will require schema changes on `xdb.mononoke_production`.
There isn't a whole lot of new user-facing functionality in this particular diff. For starters, nobody can edit this flag on bookmarks, and pushes that create a new bookmark will always result in setting a bookmark as publishing AND pull_default.
What this change does introduce, however, is the notion of `BookmarkHgKind`, which represents the behavior of a bookmark as far as Mercurial operations are concerned.
There are 3 such kinds, which are relevant in different parts of the codebase:
- PullDefault: useful when we want to respond to listkeys queries.
- Publishing: useful when we want to identify commit Phases.
- All: useful when we want to respond to listkeyspatterns.
Note that only the first two groups are cached in CachedBookmarks.
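The three kinds and their caching behavior can be sketched as (the variant names come from the diff; the method is hypothetical, for illustration only):

```rust
/// Sketch of BookmarkHgKind, one variant per Mercurial-facing behavior.
#[derive(Debug)]
enum BookmarkHgKind {
    PullDefault, // used to respond to listkeys queries
    Publishing,  // used to identify commit Phases
    All,         // used to respond to listkeyspatterns
}

impl BookmarkHgKind {
    /// Only the first two groups are served out of CachedBookmarks.
    fn is_cached(&self) -> bool {
        match self {
            BookmarkHgKind::PullDefault | BookmarkHgKind::Publishing => true,
            BookmarkHgKind::All => false,
        }
    }
}

fn main() {
    assert!(BookmarkHgKind::PullDefault.is_cached());
    assert!(BookmarkHgKind::Publishing.is_cached());
    assert!(!BookmarkHgKind::All.is_cached());
    println!("ok");
}
```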
---
There are a few things going on in this diff (which logically have to happen together):
- First, I updated the `Bookmarks` trait and its various implementations to expose new methods to select Bookmarks based on their hg kind. There's one method per hg kind, and all the methods use a `Freshness` parameter to determine whether to hit cache or not.
- Second, I refactored the various bookmark-access methods in blobrepo. A few of them were duplicative of each other, and some were unused, which became a bigger problem now that we had more (for publishing, pull default, etc.). We are now down to just one method that doesn't hit cache (which is used only by the blobimport script and some tests — perhaps this could be changed?).
- Third, I updated the call sites for the methods that were updated in step 2 to use the proper method for their use case.
---
Here's a summary of where each method is used (I'm only listing stuff that isn't unit tests):
- `get_bonsai_heads_maybe_stale`:
- `SqlPhases`'s `get_public`
- `build_skiplist_index`
- `get_bonsai_publishing_bookmarks_most_recent`:
- Blobimport (perhaps we can update this to a `maybe_stale` read?)
- `get_pull_default_bookmarks_maybe_stale`
- `listkeys` implementations
- `get_publishing_bookmarks_maybe_stale`
- API Server's `get_branches`
- `get_bookmarks_by_prefix_maybe_stale`:
- `listkeyspatterns` implementation
---
As an aside, note that a lot of the code changes in this diff are actually in CachedBookmarks' tests: I had to update those to minimize sleeps. It was fine to sleep in the earlier tests, but I introduced new quickcheck-based tests, and sleeping in those isn't ideal.
Reviewed By: StanislavGlebik
Differential Revision: D15298598
fbshipit-source-id: 4f0fe80ea98dea8af0c8289db4f6f4e4070af622
Summary: Add an endpoint to the API server that provides functionality similar to the `gettreepack` wire protocol command.
Reviewed By: fanzeyi
Differential Revision: D15492734
fbshipit-source-id: 7d0f113f0d33c68d5bfba5781269a92f0d5a66e8
Summary: We need to access the stream of `Entries` for a given `gettreepack` call from the API server. Currently, these entries are only available in the Mercurial wire protocol format from `RepoClient`. This diff splits out the entry fetching code into its own function, which can later be called from the API server.
Reviewed By: xavierd
Differential Revision: D15483702
fbshipit-source-id: e3050b6a0504f97aa28a2c9adbbdfb0f613f3030
Summary: It will be used in the sync job.
Reviewed By: HarveyHunt
Differential Revision: D15449661
fbshipit-source-id: eace10b5f5622cec3d54c011767c35f49fce5960
Summary:
As part of adding support for infinitepush in Mononoke, we'll include additional server-side metadata on Bookmarks (specifically, whether they are publishing and pull-default).
However, we do use the name `Bookmark` right now to just reference a Bookmark name. This patch updates all reference to `Bookmark` to `BookmarkName` in order to free up `Bookmark`.
Reviewed By: StanislavGlebik
Differential Revision: D15364674
fbshipit-source-id: 126142e24e4361c19d1a6e20daa28bc793fb8686
Summary:
`Phases` currently has a very ugly API, which is a constant source of confusion. I've made the following changes:
- only return/cache public phases
- do not require `public_heads`, and always request them from `BlobRepo::get_bonsai_heads_maybe_stale`
- consolidated `HintPhases` and `SqlPhases`
- removed `Hint` from type names, as it does not carry any meaning
- fixed all affected callsites
Reviewed By: StanislavGlebik
Differential Revision: D15344092
fbshipit-source-id: 848245a58a4e34e481706dbcea23450f3c43810b
Summary: This updates Mononoke's repo_read_write_status to fetch the reason from the database. The "Repo is locked in DB" default message is used as a fallback if the reason is NULL.
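The fallback behavior amounts to an `unwrap_or_else` on the nullable DB column (sketch only; the real types differ):

```rust
/// Hypothetical sketch: the DB reason column is nullable, so it maps to an
/// Option<String>; a NULL reason falls back to the old default message.
fn lock_reason(db_reason: Option<String>) -> String {
    db_reason.unwrap_or_else(|| "Repo is locked in DB".to_string())
}

fn main() {
    assert_eq!(
        lock_reason(Some("Locked for maintenance".to_string())),
        "Locked for maintenance"
    );
    assert_eq!(lock_reason(None), "Repo is locked in DB");
    println!("ok");
}
```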
Reviewed By: HarveyHunt
Differential Revision: D15164791
fbshipit-source-id: f4cb68c28db1db996c7ef1a309b737cb781659d1
Summary: Seems like 15 mins is not enough, let's bump it
Reviewed By: farnz
Differential Revision: D15154597
fbshipit-source-id: 78b8a43bbc95845719245f71ac85fd22336f3ed6
Summary: Added obsmarkers to pushrebase output. This allows the client to hide commits that were rebased server-side, and check out the rebased commit.
Reviewed By: StanislavGlebik
Differential Revision: D14932842
fbshipit-source-id: f215791e86e32e6420e8bd6cd2bc25c251a7dba0
Summary:
Will be used to coordinate hg sync job and blobimport - only one should run at
the same time.
Reviewed By: HarveyHunt
Differential Revision: D14799003
fbshipit-source-id: abbf06350114f1d756e288d467236e3a5b7b2f01