Summary:
See bottom diff for more context on the problem we are trying to solve.
This is a preparation for commit retries.
Reviewed By: krallin
Differential Revision: D15855498
fbshipit-source-id: 6675cc2d698a985bc2ee7d54da00173ba192bc6b
Summary:
See bottom diff for more context on the problem we are trying to solve.
This function allows more complex decision-making logic than just
`attempt < retry_num` for deciding whether to retry. It also does not
introduce any delay, since delays are not needed in the envisioned use-case.
NB: `#[allow(dead_code)]` will be removed in a later diff. This is just for
the sake of diff separation.
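NB 3: purely illustrative sketch of the shape described above (names and signatures are hypothetical, not the actual Mononoke API):

```rust
// A retry helper driven by a decision closure instead of a plain
// `attempt < retry_num` check, and with no delay between attempts.
fn retry_with<T, E>(
    mut task: impl FnMut(usize) -> Result<T, E>,
    mut should_retry: impl FnMut(&E, usize) -> bool,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match task(attempt) {
            Ok(value) => return Ok(value),
            Err(err) => {
                if should_retry(&err, attempt) {
                    // Retry immediately: no sleep, per the use-case above.
                    attempt += 1;
                } else {
                    return Err(err);
                }
            }
        }
    }
}
```

A caller can then encode arbitrary decision logic (error kind, attempt count, etc.) in the `should_retry` closure.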
NB 2: I think we should extract all of the different `retry` implementations into a separate crate, but I don't want to do it as part of this task.
Reviewed By: krallin
Differential Revision: D15855496
fbshipit-source-id: a47ccdb507634b202feecfa7f6076decd9451f18
Summary:
This enum represents errors which prompt us for different actions:
- `LogicError` means that retry should happen somewhere in the pushrebase code
- `Other` means infrastructure error
- `RetryableError` means that the transaction should be retried right away
Next diff will add retries.
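As an illustrative sketch of the three-way split (variant payloads here are placeholder strings, not the real pushrebase error types):

```rust
// Each variant prompts a different action, as described above.
#[derive(Debug)]
enum PushrebaseError {
    LogicError(String),     // retry somewhere in the pushrebase code
    RetryableError(String), // retry the transaction right away
    Other(String),          // infrastructure error: surface to the caller
}

// Only `RetryableError` should trigger an immediate transaction retry.
fn should_retry_transaction(err: &PushrebaseError) -> bool {
    matches!(err, PushrebaseError::RetryableError(_))
}
```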
Reviewed By: krallin
Differential Revision: D15834461
fbshipit-source-id: 4e2f65b1b1fd8669c29dd22c2920cf88273adb3a
Summary:
This is the first step towards not skipping bundles due to race conditions
in our queue logic.
Below is our hypothesis of why this might happen and how we are trying to fix
it.
```
T | Mononoke Server 1 | Mononoke Server 2 | Sync job
---------------------------------------------------------------------
1 | start transaction | |
---------------------------------------------------------------------
2 | | start transaction |
---------------------------------------------------------------------
3 | mysql decides auto | |
| incremented key 20 | |
---------------------------------------------------------------------
4 | | mysql decides auto |
| | incremented key 21 |
---------------------------------------------------------------------
5 | | INSERT |
---------------------------------------------------------------------
6 | | COMMIT |
---------------------------------------------------------------------
7 | | | read the
| | | latest key (21)
---------------------------------------------------------------------
8 | INSERT | | update the latest
| | | key counter
---------------------------------------------------------------------
9 | COMMIT | |
---------------------------------------------------------------------
```
Note that in the above scenario key 20 will never be read by the sync job, and
therefore the bundle with this key will never be replayed.
To fix this, and also to provide a total order of all bookmark moves, we
will not rely on the `AUTO_INCREMENT` mysql feature, but rather set the new
id ourselves, using a previously computed `MAX(id)` query. Note that it is
fine to insert a predetermined value into an `AUTO_INCREMENT` field.
Clearly, since `SELECT MAX(id)` and `INSERT INTO ...` are two different
queries, the behavior introduced in this diff is racy: some other bookmark
move might already have used the `MAX(id)` key we computed before our insert
succeeds. There are two aspects of this:
- How dangerous is it? In that case the transaction commit fails and we fall
back to the pushrebase retry logic. We cannot introduce data inconsistency
this way.
- What performance hit does it introduce? We do not know; maybe it causes many
more retries to happen. To address this potential hit, the following diff will
make sure that when we have reason to believe the failure is caused by the
mentioned race, we only retry the database transaction, not the entire
pushrebase.
Eventually winning this race (potentially after a few retries) is exactly what
we need to ensure a total order of all bookmark moves.
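As a toy model of the retry shape this implies (the MySQL table is stood in for by an in-memory set, and all names are hypothetical, not the real sync-job code):

```rust
use std::collections::BTreeSet;

// Models the read-then-insert sequence: compute MAX(id) + 1, then try to
// claim it; a concurrent writer may take the id first, forcing a retry.
fn insert_with_explicit_id(
    table: &mut BTreeSet<u64>,
    mut racer: impl FnMut(&mut BTreeSet<u64>),
) -> u64 {
    loop {
        // SELECT MAX(id): the id we intend to claim next.
        let candidate = table.iter().next_back().copied().unwrap_or(0) + 1;
        // A concurrent bookmark move may land between our read and insert.
        racer(table);
        // INSERT: `insert` returns false if the id is already taken, in
        // which case we retry the whole read-then-insert sequence.
        if table.insert(candidate) {
            return candidate;
        }
    }
}
```

Losing the race costs one extra round of the loop; it cannot produce two rows with the same id.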
NB: I will not land this diff before introducing the next one
Reviewed By: krallin
Differential Revision: D15821895
fbshipit-source-id: cfb8180d1f43caa9282f06d7dd6a55fd856b71b8
Summary: `HgEntryId` is much more useful in a typed form (an enum of `(FileType, HgFileNodeId)` and `HgManifestId`). Most of the time we know which type an entry should contain, and the typed form makes it harder to make an error; all other use cases which require just a hash should use `HgNodeHash` instead. This diff includes the minimal changes necessary to make this work. Some of the cases which do a sequence of `Entry::get_hash().into_nodehash() -> HgManifestId::new() | HgFileNodeId::new()` are left for future diffs.
Reviewed By: farnz
Differential Revision: D15866081
fbshipit-source-id: 5be9ecc30dbfd0a49ae6c5d084cdfe2dac351dac
Summary:
We used to only invalidate the peer process when we explicitly fail the sync,
not when we timeout. Among other things, this means that if we talk to an
overloaded/bad server, we'll spend *all* of our retries timing out on it,
which will cause the queue to grow large. Let's invalidate the process
more aggressively.
Reviewed By: farnz
Differential Revision: D15877934
fbshipit-source-id: ed7e3d37e56392fed7fd9a58cb6cb7632bd9a694
Summary:
Apiserver was failing to start up after D15762598, in which we started to
read censored blobs from an xdb table. The problem was that in apiserver we
didn't wait for myrouter to start up before reading censored blobs from the
table.
This diff fixes that.
Reviewed By: krallin
Differential Revision: D15873817
fbshipit-source-id: 98550289a3c135c5f4a472206c6654e818ed54f4
Summary:
This adds support for running Mononoke integration tests using MySQL.
This is useful to ensure at least a nominal amount of testing happens on MySQL code. After all, MySQL is what we actually use in production. Notably, this feels like something we should do before deploying automatically to prod.
Currently, this requires whitelisting tests for MySQL support. I whitelisted two tests to act as a proof of concept.
Reviewed By: farnz
Differential Revision: D15855430
fbshipit-source-id: c554e2dbc6a5ae21df670a60edb713e2945de10a
Summary:
This updates the mononoke server code to support booting without myrouter. This required 2 changes:
- There were a few callsites where we didn't handle not having a myrouter port.
- In our function that waits for myrouter, we were failing if we had no myrouter port, but that's not desirable: if we don't have a myrouter port, we simply don't need to wait.
Arguably, this isn't 100% complete yet. Notably, RepoReadWriteStatus still requires myrouter. I'm planning to create a bootcamp task for this since it's not blocking my work adding integration tests, but it would be nice to have.
Speaking of further refactoring, it'd be nice if we supported a `SqlConstructors::with_xdb` function that did this matching for us, so we didn't have to duplicate it all over the place. I'm also planning to bootcamp this.
Reviewed By: farnz
Differential Revision: D15855431
fbshipit-source-id: 96187d887c467abd48ac605c28b826d8bf09862b
Summary:
Apparently, this isn't used anywhere. However, having our production DB tier in the code is a little scary (we wouldn't want things accidentally talking to it).
This removes that.
Reviewed By: farnz
Differential Revision: D15855429
fbshipit-source-id: 8fad14716a88a61d8a1948a5a2d265e6734cea35
Summary: It looks like this test is a little flaky. I'm seeing tests fail with a different error message from the hg side of things. I suspect this happens because something is being fetched concurrently, and we return the first error we find.
Reviewed By: StanislavGlebik
Differential Revision: D15874528
fbshipit-source-id: 49f99cdb142e31dbb20465eccafddfc84fee2e7f
Summary:
- more tracing for potentially large pieces of work
- removed some unnecessary tracing
Reviewed By: StanislavGlebik
Differential Revision: D15851576
fbshipit-source-id: 6686c00da56176cad43f72d1671e08eb8141110f
Summary:
Use `application/cbor` rather than `application/octet-stream` as the
MIME type for streaming responses containing CBOR records. In practice, this
doesn't appear to make any difference on the client side, but it's still nice
to provide the correct MIME type.
Reviewed By: quark-zju
Differential Revision: D15838482
fbshipit-source-id: b92f0d42d4696ac3392468d53f7138055d40eb5b
Summary:
This add a `mononoke_exec` helper script that does two things:
- Fetches the latest config if you don't have it locally yet and turns that into a `--mononoke-config-path` argument.
- Turns a repo name into a `--repo-id` argument.
This is helpful to run commands that require both arguments (notably: `mononoke_admin`) without having to spend a majority of your time getting the args right.
Reviewed By: farnz
Differential Revision: D15821793
fbshipit-source-id: a3d9d6c9b9395c3e3bc367fd1e1626eeb5cac969
Summary: Use the SqlCensoredContentStore service to populate the hashmap of blacklisted blobs at startup. Every get and put call on a censoredblob instance will check the hash key against the existing keys in the hashmap. If the hash key is censored, it will return a descriptive error; otherwise, it will follow the normal behavior.
Reviewed By: ikostia
Differential Revision: D15762598
fbshipit-source-id: 03a2d19f06934e8207f5294140c38a80b67a3c3d
Summary:
Previously, all of the handlers for Eden API operations would buffer an in-progress response in memory, serialize it, and send it in one go, resulting in high latency for responses. However, the underlying Mononoke APIs used by these handlers are streaming APIs, so it is possible to send the response incrementally as the data for each entry is received.
This diff adds an optional `stream` query parameter to all of the Eden API endpoints. When set to `true`, the API server will stream the response as a series of independently serialized CBOR responses rather than as one big CBOR response. Since this is a breaking format change, the default behavior remains as before to avoid breaking existing clients. Once this is deployed, we'll migrate clients to use the streaming functionality, and then eventually remove the gating.
Reviewed By: xavierd
Differential Revision: D15784634
fbshipit-source-id: c49dfc75498dfb16dbe5fc2b7a669b282a2bfc1b
Summary:
By saving bookmarks when the client calls `heads` (and thus is starting discovery), we can fix the race that stopped us supporting `listkeys` as a bundle2 part.
This saves about 1 second per no-op pull on my devvm - I'm hoping for more improvement on Sandcastle.
Reviewed By: StanislavGlebik
Differential Revision: D15625211
fbshipit-source-id: 47e59848dff56fcf9d893ee3b3c329d69883a57e
Summary:
1. add function myrouter_ready to common/sql_ext/src/lib.rs;
2.refactor main.rs and repo_handlers.rs to use the new function
Reviewed By: ikostia
Differential Revision: D15623501
fbshipit-source-id: 7b9d6c5fd7c33845148dfacefbcf1bf3c6afaa5d
Summary:
Implemented a service which retrieves the censored blobs from the
database. I am working on writing the integration test for the service before using it in production (the next diff will contain the integration test).
In addition, I've changed the unit tests for get and put methods (they now verify the type of error returned, together with the task identifiers).
Reviewed By: krallin
Differential Revision: D15577325
fbshipit-source-id: 0ffe1b77f41aa9ca0efe686348ea885714a6cf25
Summary: This adds a command in Mononoke Admin to list all bookmarks.
Reviewed By: ikostia
Differential Revision: D15737660
fbshipit-source-id: 5d4e649adf09c5f601b149a87515707f3d88b203
Summary:
Mercurial has complicated logic for finding filenode parents, and this diff
implements one of the tricky cases.
If a file existed and it was copied over (with `hg cp --force`), then Mononoke
set both p1 and copy information, whereas Mercurial just sets copy information.
This diff implements the same logic in Mononoke (note that for now it's only
for non-merge cases).
Reviewed By: krallin
Differential Revision: D15737250
fbshipit-source-id: 0b2117097801a7137ccc7382477452a0b7bf4504
Summary: The API server's Eden API endpoints (which all return CBOR responses) are usually accessed from behind a VIP, making it difficult to know which actual API server host served the request. To improve debuggability, let's add the (non-standard, but common) "x-served-by" header to CBOR responses containing the hostname of the API server host.
Reviewed By: quark-zju
Differential Revision: D15717893
fbshipit-source-id: 626d752ecb647d4e883e9127a3c549f1e40f355e
Summary: We want to have just one entry point to Mercurial, namely the Rust binary. Getting rid of the `hg` Python script means that we finally can do things that only exist in Rust and not in Python.
Reviewed By: simpkins
Differential Revision: D13186374
fbshipit-source-id: f3c8cfe4beb7bf764172a8af04fd25202eca9af2
Summary:
A large portion of Mononoke health/performance is dominated by files being kept
in memory. There is a direct link between the traffic we're sending out to
clients and how much memory we have in use. Therefore this is a potentially
effective counter to limit against.
Reviewed By: krallin
Differential Revision: D15664331
fbshipit-source-id: f7ffaa9604d51d337bb05237dee5d8a3e43ce7ad
Summary:
In order to be able to use load limiting, our extension needs to be initialized
to receive the proper configuration.
Reviewed By: krallin
Differential Revision: D15664332
fbshipit-source-id: 987adf9183f53cc0d6b847ad782ff31851503721
Summary:
Add config option to set the load limiter category, to be used by the
LoadLimiter library.
Reviewed By: krallin
Differential Revision: D15628073
fbshipit-source-id: 8df22badeb2b255e44b4675f5b6701c63c00d0c8
Summary:
It was failing and disabled, this diff fixes it.
Note that the test is quite heavy and we have had problems with it failing on
sandcastle before. If this attempt at fixing it doesn't work, then it might be
worth removing the test completely, because it doesn't give us a lot of
benefit.
Reviewed By: farnz
Differential Revision: D15657387
fbshipit-source-id: 3e0ed588861cbadbaab2178826b13e1ba201fbc7
Summary:
We don't control what `get_from_db` will do when asked to fetch no data. We can only hope it'll do the smart thing and not hit the DB nor increment any monitoring counters.
This can be problematic, because it can result in confusing data in ODS. See T45198435 for a recent example of this.
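The shape of the guard, sketched with a stand-in closure (this is not the real cachelib/Mononoke API, just an illustration):

```rust
use std::collections::{HashMap, HashSet};

// Skip the DB call entirely when there is nothing to fetch, instead of
// hoping `get_from_db` handles an empty key set sensibly.
fn fetch<F>(keys: &HashSet<String>, get_from_db: F) -> HashMap<String, String>
where
    F: FnOnce(&HashSet<String>) -> HashMap<String, String>,
{
    if keys.is_empty() {
        // No DB hit, no monitoring counters incremented.
        return HashMap::new();
    }
    get_from_db(keys)
}
```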
Reviewed By: StanislavGlebik
Differential Revision: D15620424
fbshipit-source-id: 629c2eaad00d4977b0598c26e1f2a2ca64a1d66e
Summary:
It's all in the title.
See D15620424 for context.
Reviewed By: StanislavGlebik
Differential Revision: D15620545
fbshipit-source-id: 218d4a97d0ac2df4b9c6cef44f18d7191ea7d47c
Summary: As requested in the previous diff, this adds stats for requests to the phases SQL table, so it will be easier to debug in the future.
Reviewed By: farnz
Differential Revision: D15601643
fbshipit-source-id: a65509ad64d0a0d4948434f771274e4e17d7f133
Summary: Bulk fetch all public heads in `mark_reachable_as_public`, and remove all public heads from the traversal. This avoids the initial burst of requests, since most bookmarks never move.
Reviewed By: krallin
Differential Revision: D15600915
fbshipit-source-id: b6a5faf16d5c5465be53a575c4cc966ed8cdd875
Summary: This diff makes the API server log to Scuba the client certificate used for every request.
Reviewed By: StanislavGlebik
Differential Revision: D15572329
fbshipit-source-id: 695d5fefdae083a36fea27e806aa797c657cba3b
Summary:
This is the final step in making sure we have control over whether
non-pushrebase pushes are supported by a given repo.
Reviewed By: krallin
Differential Revision: D15522276
fbshipit-source-id: 7e3228f7f0836f3dcd0b1a3b2500545342af1c5e
Summary:
This is just a boilerplate, giving `resolve` access to the config option,
which controls whether we should allow pure pushes. This option will be
enabled by default and we'll disable it explicitly for repos where it makes
no sense.
Reviewed By: StanislavGlebik
Differential Revision: D15520450
fbshipit-source-id: b1bb913c14a6aac6aa68ed8b8a7b4ff270da1688
Summary: This is the first step towards per-repo control of whether pushes are allowed.
Reviewed By: StanislavGlebik
Differential Revision: D15519959
fbshipit-source-id: a0bb96bd995af7df0cef225c73d559f309cfe592
Summary:
I find that this is a common pattern:
```
// an unnecessary block, created to reduce the scope of the lock guard
let output = {
let mut value = mutex_value.lock().expect("lock poisoned");
... // do some stuff here
};
```
This extension simplifies this pattern to
```
let output = mutex_value.with(|value| /* do some stuff here */);
```
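One way the extension could be implemented is a trait with a blanket impl for `Mutex<T>` (a sketch of the idea, not the actual extension's code):

```rust
use std::sync::Mutex;

trait MutexExt<T> {
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R;
}

impl<T> MutexExt<T> for Mutex<T> {
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // The guard is dropped when this function returns, so the lock is
        // held for exactly the duration of the closure body.
        let mut guard = self.lock().expect("lock poisoned");
        f(&mut guard)
    }
}
```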
Reviewed By: Imxset21, farnz
Differential Revision: D15577135
fbshipit-source-id: 6b22b20dda79e532ff5ec8ce75cda8b1c1404368
Summary:
We accidentally were not verifying the bookmark namespace (with regard to infinitepush) when performing a pushrebase.
This is a problem, because it means if a pushrebase was performed into the infinitepush namespace, then the push would be allowed. This could happen by accident (the earlier diff in this stack fixes a Mercurial bug that did this), or simply if the end-user changes their infinitepush branchpattern.
This patch fixes the bug, and extracts the "do basic checks for whether this bookmark can move" logic into a single function to minimize the potential for this validation logic diverging again between pushrebase and push.
Reviewed By: ikostia
Differential Revision: D15576198
fbshipit-source-id: 24cf9999a7370503e5e0173e34185d9aa57903f7
Summary:
CachedBookmarksTransaction drops the cache when a transaction is
committed, even if the transaction doesn't modify any bookmarks.
Keep track of whether bookmarks have been modified and only purge the bookmark
cache if that is the case.
Reviewed By: krallin
Differential Revision: D15560228
fbshipit-source-id: f311a17414f2762397084b0b3fdcb7efae551be7
Summary:
Add a hashset of keys against which to check whether the current key
is blacklisted.
Reviewed By: StanislavGlebik
Differential Revision: D15516498
fbshipit-source-id: ed7bdc0b824810b0cbcfb9474ff7c3983d8c4759
Summary: We have the same pattern 3 times, and I have at least two more instances of this pattern to add. Factor it out.
Reviewed By: StanislavGlebik
Differential Revision: D15202663
fbshipit-source-id: 9c0de8ccef71964e65389e0bf0f2b0fc383f7c1d
Summary:
This adds a sanity check that limits the count of matches in `list_all_bookmarks_with_prefix`.
If we find more matches than the limit, then an error will be returned (right now, we don't have support for e.g. offsets in this functionality, so the only alternative approach is for the caller to retry with a more specific pattern).
The underlying goal is to ensure that we don't trivially expose Mononoke to accidental denial of service when a caller lists `*` and we end up querying literally all bookmarks.
I picked a fairly conservative limit here (500,000), which is more than 5 times the number of bookmarks we currently have (we can load what we have right now successfully... but it's pretty slow).
Note that listing pull default bookmarks is not affected by this limit: this limit is only used when our query includes scratch bookmarks.
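The check amounts to bailing out once more than `limit` matches stream in, rather than buffering an unbounded result. A sketch, with illustrative names (not the actual bookmarks code):

```rust
// Collect at most `limit` matches; return an error (rather than an
// unbounded result) if the pattern matches more than that.
fn take_up_to_limit(
    matches: impl Iterator<Item = String>,
    limit: usize,
) -> Result<Vec<String>, String> {
    let mut out = Vec::new();
    for m in matches {
        if out.len() == limit {
            return Err(format!(
                "too many matches (> {}); try a more specific pattern",
                limit
            ));
        }
        out.push(m);
    }
    Ok(out)
}
```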
Reviewed By: StanislavGlebik
Differential Revision: D15413620
fbshipit-source-id: 1030204010d78a53372049ff282470cdc8187820
Summary:
This updates our receive path for B2xInfinitepush to create new scratch bookmarks.
Those scratch bookmarks will:
- Be non-publishing.
- Be non-pull-default.
- Not be replicated to Mercurial (there is no entry in the update log).
I added a sanity check on infinite pushes to validate that bookmarks fall within a given namespace (which is represented as a Regexp in configuration). We'll want to determine whether this is a good mechanism and what the regexp for this should be prior to landing (I'm also considering adding a soft-block mode that would just ignore the push instead of blocking it).
This ensures that someone cannot accidentally perform an infinitepush onto master by tweaking their client-side configuration.
---
Note that, as of this diff, we do not support the B2xInfinitepushBookmarks part (i.e. backup bookmarks). We might do that separately later, but if we do, it won't be through scratch Bookmarks (we have too many backup bookmarks for this to work)
Reviewed By: StanislavGlebik
Differential Revision: D15364677
fbshipit-source-id: 23e67d4c3138716c791bb8050459698f8b721277
Summary:
This adds support for recording server-side whether a given bookmark is publishing and / or pull-default. This is a change on the way towards supporting Infinite Push in Mononoke. This diff will require schema changes on `xdb.mononoke_production`.
There isn't a whole lot of new user-facing functionality in this particular diff. For starters, nobody can edit this flag on bookmarks, and pushes that create a new bookmark will always result in setting a bookmark as publishing AND pull_default.
What this change does introduce, however, is the notion of `BookmarkHgKind`, which represents the behavior of this bookmark as far as Mercurial operations are concerned.
There are 3 such kinds, which are relevant in different parts of the codebase:
- PullDefault - this is useful when we want to respond to listkeys queries.
- Publishing - this is useful when we want to identify commit Phases.
- All - this is useful when we want to respond to listkeyspatterns.
Note that only the first two groups are cached in CachedBookmarks.
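Schematically, the three kinds and the caching rule look like this (the real type lives in the bookmarks code; this is only an illustration):

```rust
enum BookmarkHgKind {
    PullDefault, // answers `listkeys` queries
    Publishing,  // identifies commit phases
    All,         // answers `listkeyspatterns`
}

// Only PullDefault and Publishing are cached in CachedBookmarks.
fn is_cached(kind: BookmarkHgKind) -> bool {
    !matches!(kind, BookmarkHgKind::All)
}
```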
---
There are a few things going on in this diff (which logically have to happen together):
- First, I updated the `Bookmarks` trait and its various implementations to expose new methods to select Bookmarks based on their hg kind. There's one method per hg kind, and all the methods use a `Freshness` parameter to determine whether to hit cache or not.
- Second, I refactored the various bookmark-access methods in blobrepo. A few of them were duplicative of each other, and some were unused, which became a bigger problem now that we had more (for publishing, pull default, etc.). We are now down to just one method that doesn't hit cache (which is used only by the blobimport script and some tests — perhaps this could be changed?).
- Third, I updated the call sites for the methods that were updated in step 2 to use the proper method for their use case.
---
Here's a summary of where each method is used (I'm only listing stuff that isn't unit tests):
- `get_bonsai_heads_maybe_stale`:
- `SqlPhases`'s `get_public`
- `build_skiplist_index`
- `get_bonsai_publishing_bookmarks_most_recent`:
- Blobimport (perhaps we can update this to a `maybe_stale` read?)
- `get_pull_default_bookmarks_maybe_stale`
- `listkeys` implementations
- `get_publishing_bookmarks_maybe_stale`
- API Server's `get_branches`
- `get_bookmarks_by_prefix_maybe_stale`:
- `listkeyspatterns` implementation
---
As an aside, note that a lot of the code changes in this diff are actually in CachedBookmarks' tests: I had to update those to minimize sleeps. It was fine to sleep in the earlier tests, but I introduced new quickcheck-based tests, and sleeping in those isn't ideal.
Reviewed By: StanislavGlebik
Differential Revision: D15298598
fbshipit-source-id: 4f0fe80ea98dea8af0c8289db4f6f4e4070af622
Summary: This help text was wrong. This fixes it.
Reviewed By: HarveyHunt
Differential Revision: D15555767
fbshipit-source-id: 5202d46a8527c9e44c5588d52f7425d3efb38898
Summary:
This adds functionality in the sync job to capture error output from the file where we ask hg to write its output (see bottom diff in this stack for how that part works).
This ensures the "sync failed" error we were logging to Scuba will now include additional helpful detail, such as why the sync failed!
Reviewed By: ikostia
Differential Revision: D15468649
fbshipit-source-id: 18573e016d3c8b380145f94c2ddeca0061fe9fd1