Summary:
This wires up the bookmarks replay logic by connecting the `process_replay_stream` function to an implementation that actually makes bookmark moves through the blobrepo.
This also slightly refactors `process_replay_stream`:
- Use specific errors instead of a generic `failure_ext` `Error`.
- Add logging + Scuba reporting (and report all moves we attempted to make for a given bookmark, as well as their outcome).
- Support passing in a closure instead of a static function, since we need a closure to get a `BlobRepo` reference into the bookmark replay function.
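The closure change above can be sketched roughly as follows. This is a minimal illustration with stand-in types, not the real Mononoke API: a generic `F: FnMut(...)` bound lets the caller capture a `BlobRepo` handle in a closure, which a plain `fn` item could not do.

```rust
// Minimal sketch with stand-in types; the real process_replay_stream and
// BlobRepo are more involved.
struct BlobRepo;

impl BlobRepo {
    fn move_bookmark(&self, _name: &str, _to: &str) -> Result<(), String> {
        Ok(()) // pretend the move through the blobrepo succeeded
    }
}

// Generic over F (instead of taking a `fn(&str, &str) -> ...` pointer), so
// callers can pass a closure that captures environment such as a repo handle.
fn process_replay_stream<F>(entries: Vec<(String, String)>, mut replay: F) -> usize
where
    F: FnMut(&str, &str) -> Result<(), String>,
{
    entries
        .iter()
        .filter(|(name, to)| replay(name, to).is_ok())
        .count()
}

fn main() {
    let repo = BlobRepo;
    // The closure borrows `repo`; a static function could not do this.
    let ok = process_replay_stream(
        vec![("master".to_string(), "abc123".to_string())],
        |name, to| repo.move_bookmark(name, to),
    );
    println!("{} bookmark moves succeeded", ok);
}
```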
Reviewed By: StanislavGlebik
Differential Revision: D15902312
fbshipit-source-id: d6d57ba880066677748137d684e10796fb8633d5
Summary:
Currently, we implicitly expect that caching is enabled if we're dealing with a remote repository, which means cachelib must always be initialized in that case, and that is ... slow.
This can be problematic in two cases:
- In tests: it makes MySQL tests unbearably slow, and a little more flaky because we end up using so much CPU. With this patch, MySQL tests remain slower than SQLite tests, but by a factor of < 2, which is a pretty substantial improvement.
- Running trivial administrative commands (e.g. through `mononoke_admin`), notably using a dev build (which is right now unbearably slow). With this patch, such a trivial command is about 6x faster:
```
[torozco@devvm4998.lla1 ~/fbcode] time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD --skip-caching bookmarks list --kind publishing
Jun 21 08:57:36.658 INFO using repo "instagram-server" repoid RepositoryId(2102)
master c96ac4654e4d2da45a9597af859adeac9dba3d7ca964cb42e5c96bc153f185e3 2c5713ad27262b91bf1dfaf21b3cf34fe3926c8d
real 0m5.299s
user 0m5.097s
sys 0m0.699s
[torozco@devvm4998.lla1 ~/fbcode] time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
I0621 08:57:59.299988 1181997 CacheAllocator-inl.h:3123] Started worker 'PoolRebalancer'
Jun 21 08:57:59.328 INFO using repo "instagram-server" repoid RepositoryId(2102)
master c96ac4654e4d2da45a9597af859adeac9dba3d7ca964cb42e5c96bc153f185e3 2c5713ad27262b91bf1dfaf21b3cf34fe3926c8d
real 0m28.620s
user 0m27.680s
sys 0m2.466s
```
This is also nice because it means the MySQL tests won't talk to Memcache anymore.
---
Note: in this refactor, I made `Caching` an enum so it can't accidentally be swapped with some other boolean.
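The idea in a minimal sketch (stand-in function; the real Mononoke definition may differ): a dedicated enum reads unambiguously at call sites, and the compiler rejects accidentally passing some other boolean in its place.

```rust
// Using an enum instead of `bool` means `open_repo(true, false)` cannot
// compile: a boolean cannot be passed where a `Caching` is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Caching {
    Enabled,
    Disabled,
}

// Illustrative stand-in for repo-opening logic.
fn open_repo(remote: bool, caching: Caching) -> String {
    match caching {
        Caching::Enabled => format!("repo(remote={}, cachelib on)", remote),
        Caching::Disabled => format!("repo(remote={}, cachelib off)", remote),
    }
}

fn main() {
    println!("{}", open_repo(true, Caching::Disabled));
}
```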
---
Finally, it also uses up quite a bit less RAM (we no longer need 2GB of RAM to output one line of bookmarks — although we're still using quite a bit!):
```
[torozco@devvm4998.lla1 ~/fbcode] env time buck-out/gen/scm/mononoke/admin#binary/admin --skip-caching --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
Jun 21 09:18:36.074 INFO using repo "instagram-server" repoid RepositoryId(2102)
master abdd2f78dafeaa8d4b96897955a63844b31324f9d89176b3a62088d0e2ae2b22 1702392d14bf7a332bf081518cb1ea3c83a13c39
5.08user 0.68system 0:05.28elapsed 109%CPU (0avgtext+0avgdata 728024maxresident)k
6776inputs+0outputs (8major+115477minor)pagefaults 0swaps
[torozco@devvm4998.lla1 ~/fbcode] env time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
I0621 09:19:01.385933 1244489 CacheAllocator-inl.h:3123] Started worker 'PoolRebalancer'
Jun 21 09:19:01.412 INFO using repo "instagram-server" repoid RepositoryId(2102)
master abdd2f78dafeaa8d4b96897955a63844b31324f9d89176b3a62088d0e2ae2b22 1702392d14bf7a332bf081518cb1ea3c83a13c39
26.96user 2.27system 0:27.93elapsed 104%CPU (0avgtext+0avgdata 2317716maxresident)k
11416inputs+5384outputs (17major+605118minor)pagefaults 0swaps
```
Reviewed By: farnz
Differential Revision: D15941644
fbshipit-source-id: 0df4a74ccd0220a786ebf0e883e1a9b8aab0a647
Summary:
This adds support for running Mononoke integration tests using MySQL.
This is useful to ensure at least a nominal amount of testing happens on MySQL code. After all, MySQL is what we actually use in production. Notably, this feels like something we should do before deploying automatically to prod.
Currently, this requires whitelisting tests for MySQL support. I whitelisted two tests to act as a proof of concept.
Reviewed By: farnz
Differential Revision: D15855430
fbshipit-source-id: c554e2dbc6a5ae21df670a60edb713e2945de10a
Summary:
Mercurial has complicated logic for finding filenode parents, and this diff
implements one of those tricky cases.
If a file existed and was copied over (with `hg cp --force`), then Mononoke
set both p1 and copy information, whereas Mercurial sets just the copy information.
This diff implements the same logic in Mononoke (note that for now it's only
for non-merge cases).
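The rule this diff implements can be sketched as follows, with stand-in types (the real filenode computation is more involved): when copy information is present for a copied-over file, record only the copy source and drop p1, matching Mercurial.

```rust
// Stand-in for the relevant slice of filenode metadata.
struct ParentsAndCopy {
    p1: Option<u64>,
    copy_from: Option<String>,
}

// Non-merge case only, as in the diff: if the file carries copy info
// (e.g. from `hg cp --force` over an existing file), Mercurial records just
// the copy source, so we drop p1 to match.
fn filenode_parents(existing_p1: Option<u64>, copy_from: Option<String>) -> ParentsAndCopy {
    match copy_from {
        Some(src) => ParentsAndCopy { p1: None, copy_from: Some(src) },
        // No copy involved: keep the existing parent.
        None => ParentsAndCopy { p1: existing_p1, copy_from: None },
    }
}

fn main() {
    let over = filenode_parents(Some(7), Some("src/file".to_string()));
    assert!(over.p1.is_none() && over.copy_from.is_some());
    println!("copied-over file records only copy info");
}
```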
Reviewed By: krallin
Differential Revision: D15737250
fbshipit-source-id: 0b2117097801a7137ccc7382477452a0b7bf4504
Summary:
It was failing and had been disabled; this diff fixes it.
Note that the test is quite heavy and we have had problems with it failing on
sandcastle before. If this attempt at fixing it doesn't work, then it might be
worth removing the test completely, because it doesn't give us a lot of
benefit.
Reviewed By: farnz
Differential Revision: D15657387
fbshipit-source-id: 3e0ed588861cbadbaab2178826b13e1ba201fbc7
Summary:
This is the final step in making sure we have control over whether
non-pushrebase pushes are supported by a given repo.
Reviewed By: krallin
Differential Revision: D15522276
fbshipit-source-id: 7e3228f7f0836f3dcd0b1a3b2500545342af1c5e
Summary:
This adds a sanity check that limits the number of matches in `list_all_bookmarks_with_prefix`.
If we find more matches than the limit, then an error is returned (right now, we don't have support for e.g. offsets in this functionality, so the only alternative approach is for the caller to retry with a more specific pattern).
The underlying goal is to ensure that we don't trivially expose Mononoke to accidental denial of service when a caller lists `*` and we end up querying literally all bookmarks.
I picked a fairly conservative limit here (500,000), which is > 5 times the number of bookmarks we currently have (we can load what we have right now successfully... but it's pretty slow).
Note that listing pull default bookmarks is not affected by this limit: this limit is only used when our query includes scratch bookmarks.
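The shape of the check, sketched over an in-memory list (the real code queries the bookmarks store; function and constant names here are illustrative):

```rust
// Limit value matches the one described above.
const MAX_MATCHES: usize = 500_000;

fn list_all_bookmarks_with_prefix(
    all: &[String],
    prefix: &str,
) -> Result<Vec<String>, String> {
    let matches: Vec<String> = all
        .iter()
        .filter(|b| b.starts_with(prefix))
        .take(MAX_MATCHES + 1) // fetch one extra row to detect overflow
        .cloned()
        .collect();
    if matches.len() > MAX_MATCHES {
        // No offset support: the caller must retry with a narrower pattern.
        return Err(format!("too many bookmarks match prefix {:?}", prefix));
    }
    Ok(matches)
}

fn main() {
    let bookmarks = vec!["scratch/a".to_string(), "master".to_string()];
    println!("{:?}", list_all_bookmarks_with_prefix(&bookmarks, "scratch/"));
}
```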
Reviewed By: StanislavGlebik
Differential Revision: D15413620
fbshipit-source-id: 1030204010d78a53372049ff282470cdc8187820
Summary:
This updates our receive path for B2xInfinitepush to create new scratch bookmarks.
Those scratch bookmarks will:
- Be non-publishing.
- Be non-pull-default.
- Not be replicated to Mercurial (there is no entry in the update log).
I added a sanity check on infinite pushes to validate that bookmarks fall within a given namespace (represented as a regexp in configuration). We'll want to determine whether this is a good mechanism, and what the regexp should be, prior to landing (I'm also considering adding a soft-block mode that would just ignore the push instead of blocking it).
This ensures that someone cannot accidentally perform an infinitepush onto master by tweaking their client-side configuration.
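A sketch of that check (illustrative names; the production check matches a configured regexp, but a prefix test stands in here to keep the sketch dependency-free):

```rust
// Stand-in for the configured-regexp namespace test.
fn in_namespace(bookmark: &str, namespace_prefix: &str) -> bool {
    bookmark.starts_with(namespace_prefix)
}

// A scratch bookmark outside the configured namespace rejects the push.
fn validate_infinitepush(bookmark: &str, namespace_prefix: &str) -> Result<(), String> {
    if in_namespace(bookmark, namespace_prefix) {
        Ok(())
    } else {
        Err(format!(
            "invalid scratch bookmark: {} (not in namespace {}*)",
            bookmark, namespace_prefix
        ))
    }
}

fn main() {
    assert!(validate_infinitepush("scratch/user/feature", "scratch/").is_ok());
    // A tweaked client-side config cannot accidentally infinitepush master:
    assert!(validate_infinitepush("master", "scratch/").is_err());
    println!("namespace checks passed");
}
```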
---
Note that, as of this diff, we do not support the B2xInfinitepushBookmarks part (i.e. backup bookmarks). We might do that separately later, but if we do, it won't be through scratch bookmarks (we have too many backup bookmarks for this to work).
Reviewed By: StanislavGlebik
Differential Revision: D15364677
fbshipit-source-id: 23e67d4c3138716c791bb8050459698f8b721277
Summary:
This adds a new test target that will use Mononoke as the server.
Initially we only support one test, but we can add to this list in the future.
Reviewed By: quark-zju
Differential Revision: D15382714
fbshipit-source-id: ad9b3bd35ffefc01239ff05f9f65829fb7a94555
Summary:
Previously the cert directory was assumed to be inside $TESTDIR. This
doesn't work with the Mercurial tests, since they run inside the actual test
directory instead of in a buck-out/gen created directory.
To allow the Mercurial tests to use these same certs, let's allow customizing
the location via HGTEST_CERTDIR. In a future diff the Mercurial tests will set
this when running with Mononoke integration.
Reviewed By: StanislavGlebik
Differential Revision: D15381789
fbshipit-source-id: e1be3d1cc17f5622d1cc277a377d628d5eb45160
Summary:
Previously it always set up a repo named 'repo'. For Mercurial
integration tests we'll want the ability to name the repository other things, so
let's refactor the configuration logic into a more generic function.
Reviewed By: quark-zju
Differential Revision: D15354044
fbshipit-source-id: e222e1389ca700af712af79062aae4977e125dc4
Summary:
In some cases, the sync binary gets killed after it has synced the bundle,
but before it has updated the counter in the database. This causes any
subsequent sync to fail with the "conflicting changes" error. One way to
mitigate this problem is to check, on failure, whether the bookmark on the
server is already pointing to the expected location, and if it is, pretend
that the sync succeeded.
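The mitigation, sketched with the surrounding machinery stubbed out (names are illustrative, not the real sync-job API):

```rust
// On failure, consider the sync successful anyway if the server-side
// bookmark already points at the location this bundle targets.
fn reconcile_sync_result(
    push_result: Result<(), String>,
    server_bookmark_points_to: &str,
    expected_location: &str,
) -> Result<(), String> {
    match push_result {
        Ok(()) => Ok(()),
        // A previous run may have been killed after syncing the bundle but
        // before bumping the counter; the bundle was in fact applied.
        Err(_) if server_bookmark_points_to == expected_location => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    let r = reconcile_sync_result(Err("conflicting changes".into()), "abc", "abc");
    assert!(r.is_ok());
    println!("sync reconciled");
}
```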
Reviewed By: StanislavGlebik
Differential Revision: D15460956
fbshipit-source-id: 55d2df3545f8a34ebc61eb6da493b8d244629089
Summary:
We're trying to make it possible to run the Mercurial tests with the
Mononoke server. The session id output makes it difficult to produce the same
output as the Mercurial server, so let's add a flag that allows us to disable
it.
Reviewed By: krallin
Differential Revision: D15462727
fbshipit-source-id: 63e4a7f31ddc41cd5ae767712f7ee5fbcdffda56
Summary:
We want to migrate the tests to run using treemanifest. As part of
that, we want to first transition to using treemanifest without actually
changing the hash, so we can check that the tests still work first, then update
the hashes second.
This diff adds the flatcompat mode and enables it by default. A future diff will
start enabling treemanifest for existing tests.
Reviewed By: quark-zju
Differential Revision: D15030252
fbshipit-source-id: 06c82be749282d62f1d9cfb43246695c427f8165
Summary: Can be used to verify the multiplexer.
Reviewed By: aslpavel
Differential Revision: D15264133
fbshipit-source-id: cb9044b6b51e099b61e751925367c71fd506332e
Summary:
This change has two goals:
- Put storage configuration that's common to multiple repos in a common place,
rather than replicating it in each server.toml.
- Allow tools that operate only at the blobstore level (like blobstore healing)
to be configured directly in terms of the blobstore, rather than indirectly
via a representative repo config.
This change makes several changes to repo configuration:
1. There's a separate common/storage.toml which defines named storage
configurations (i.e. a combination of a blobstore and metadata DB)
2. server.toml files can also define local storage configurations (mostly
useful for testing)
3. server.toml files now reference which storage they're using with
`storage_config = "name"`.
4. Configuration of multiplex blobstores is now explicit. Previously if a
server.toml defined multiple blobstores, it was assumed that it was a
multiplex. Now storage configuration only accepts a single blobstore config,
but that config can be explicitly a multiplexed blobstore, which has the
sub-blobstores defined within it, in the `components` field. (This is
recursive, so it could be nested, but I'm not sure if this has much value in
practice.)
5. Makes configuration parsing more strict - unknown fields will be treated as
an error rather than ignored. This helps flag problems in refactoring/updating
configs.
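The new layout, roughly (a hedged sketch only: the field and table names below are illustrative and may not match the real schema):

```toml
# common/storage.toml: a named storage configuration shared across repos.
# Multiplexing is explicit: a single blobstore config whose sub-blobstores
# live in the `components` field.
[main_storage.db]
db_address = "xdb.some_tier"

[main_storage.blobstore.multiplexed]
components = [
    { blobstore_type = "manifold", manifold_bucket = "bucket_a" },
    { blobstore_type = "manifold", manifold_bucket = "bucket_b" },
]

# In server.toml, repos then just reference the named storage:
#   storage_config = "main_storage"
```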
I've updated all the configs to the new format, both production and in
integration tests. Please review to make sure I haven't broken anything.
Reviewed By: StanislavGlebik
Differential Revision: D15065423
fbshipit-source-id: b7ce58e46e91877f4e15518c014496fb826fe03c
Summary:
Seems redundant to require callers of open_ssl to also pass a
(mostly) identical string.
Also make open_ssl special-case filenodes with sharding (though filenodes
aren't currently opened through it).
Reviewed By: StanislavGlebik
Differential Revision: D15157834
fbshipit-source-id: 0df45307f17bdb2c021673b3153606031008bee2
Summary:
This will help us understand which servers behave poorly and whether there's
any clustering.
Reviewed By: StanislavGlebik
Differential Revision: D15229101
fbshipit-source-id: 1aae9196702ed6fb791b5265c3bdfe90e7e24ae4
Summary:
The config will be used to whitelist connections with certain identities and
blacklist everything else.
Differential Revision: D15150921
fbshipit-source-id: e4090072ea6ba9714575fb8104d9f45e92c6fefb
Summary:
At the moment, the verify integrity script blocks master commits that do not pass
the check. Non-master commits are not blocked even if they don't pass the check
(for example, if they don't have a valid reviewer), but verify_integrity hook
still logs them to scuba.
Mononoke's verify_integrity was enabled only on master, meaning that the hook
wouldn't run on non-master commits at all, and we wouldn't log non-master commits to scuba.
This diff fixes it.
Reviewed By: farnz
Differential Revision: D15146725
fbshipit-source-id: ab432fb2eccae0fbcc10755f5c8447964c490285
Summary:
This introduces prefetching in the hg sync job, which lets us split two logically distinct steps, and buffer between them:
- Fetching bundle data from Manifold so it can be replayed in Mercurial (we call this "preparing"). This can happen out of order.
- Submitting the changes to Mercurial. This must happen sequentially.
The upshot is that we can fetch new data from Manifold while we wait for Mercurial to apply it, which should increase our throughput.
---
This required a bit of refactoring of the main loop. The reason is that, currently, we heavily rely on closures to capture the `log_entry` so we can pass it to Scuba, etc.
That doesn't work if we want buffering, because there isn't a single scope that represents the processing of a given entry.
Instead, every step of the stream has its own scope, so we need to pass the `log_entry` through the stream itself. We also need to track the log entry in errors (e.g. to report them to Scuba).
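A dependency-free sketch of the two-stage pipeline (the real job uses buffered futures, but a bounded channel shows the same shape): "prepare" runs ahead on its own thread, the channel is the buffer, and a single consumer submits sequentially, carrying each entry's id (the stand-in for `log_entry`) through the stream.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn run_pipeline(n: u64) -> Vec<u64> {
    // Buffer up to 4 prepared bundles ahead of the sequential submit step.
    let (tx, rx) = sync_channel::<(u64, String)>(4);

    let preparer = thread::spawn(move || {
        for id in 1..=n {
            // Fetching from Manifold would happen here; it can overlap with
            // the consumer applying earlier entries.
            tx.send((id, format!("bundle-{}", id))).expect("receiver alive");
        }
    });

    // Submission to Mercurial must stay sequential: one consumer, in order.
    let applied: Vec<u64> = rx.iter().map(|(id, _bundle)| id).collect();
    preparer.join().unwrap();
    applied
}

fn main() {
    println!("applied entries: {:?}", run_pipeline(5));
}
```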
Reviewed By: ikostia
Differential Revision: D15146682
fbshipit-source-id: f83284571f6ca90cb621c621c5165f7fbafd81b5
Summary: Enabling it was a mistake, as we can't enable it on our real hg servers.
Reviewed By: farnz
Differential Revision: D15122167
fbshipit-source-id: 5e28968354b7487bd9c965d8b3f23245f6bd265a
Summary: This adds a command to mononoke admin to verify the consistency of the remaining bundles to sync -- i.e. whether they are all blobimports or all non-blobimports.
Reviewed By: ikostia
Differential Revision: D15097935
fbshipit-source-id: a0df221c38e84897213edf232972ba420977e9d3
Summary:
Stanislau and Kostia mentioned that adding a few more commands for managing the HG sync job in the Mononoke Admin tool would be useful — in particular those commands should interact with the list of log entries and the last sync.
Notably, they were:
- Add support for skipping ahead any blobimport log entries.
- Add a verify command to check that all entries looking forward are consistent — i.e. they're either all blobimport or all non-blobimport.
However, as it stands, it's hard to test those in integration tests, because one can't easily generate stub data to exercise on the admin tool.
Notably, inserting data through the `sqlite3` database command line results in inconsistent values for the `reason` column that don't compare equal to those we query from the Rust side of things.
We could work around this by updating the Rust code to e.g. use LIKE instead of equality checks, but then we're updating the production code exclusively for the purpose of serving tests.
So, instead, this patch introduces a lightweight command line tool for use in tests that basically just inserts data into the bookmarks database, but from the Rust side of things.
Reviewed By: StanislavGlebik
Differential Revision: D15081758
fbshipit-source-id: 574739d2e19e6ce59d729f1aee73c368612fb92b
Summary: Now it is possible to configure and enable/disable the bookmark cache from configs.
Reviewed By: StanislavGlebik
Differential Revision: D14952840
fbshipit-source-id: 3080f7ca4639da00d413a949547705ad480772f7
Summary:
In order to do more complete tests around authentication, we need to include
the fb identity in our test certificates.
Reviewed By: StanislavGlebik
Differential Revision: D15046017
fbshipit-source-id: 3f3cd450425944a2970c6f02e7eb92a878076a05
Summary: Added obsmarkers to pushrebase output. This allows the client to hide commits that were rebased server-side, and check out the rebased commit.
Reviewed By: StanislavGlebik
Differential Revision: D14932842
fbshipit-source-id: f215791e86e32e6420e8bd6cd2bc25c251a7dba0
Summary:
Sometimes we have one-off failures that a retry can fix. Let's retry 3 times
before giving up.
Note that it'll "retry" 3 times even in the case of UnexpectedBookmarkMove,
and retrying in that case is useless. However, it's not a big problem.
Reviewed By: ikostia
Differential Revision: D14796293
fbshipit-source-id: a145213cf74e9d436212690042f187cee56e26cd
Summary: For now, we want to forbid pushrebases when the root node is a p2 of its parents, since Mercurial swaps the parents after pushrebase, which causes an inconsistency.
Reviewed By: StanislavGlebik
Differential Revision: D14642177
fbshipit-source-id: f8f6e9565c53958e8cff5df6f4d006ddfe5a69c0
Summary: In Mononoke we want to be able to block merge commits on a per-repo basis.
Reviewed By: aslpavel
Differential Revision: D14455502
fbshipit-source-id: 400e85834c20df811674405bc0c391860cf677dd
Summary:
This is a hook in Mercurial; in Mononoke it will be part of the implementation. By default, all non-fast-forward pushes are blocked, except when using the NON_FAST_FORWARD pushvar (--non-forward-move is also needed to circumvent client-side restrictions). Additionally, certain bookmarks (e.g. master) shouldn't be movable in a non-fast-forward manner at all. This can be done by setting the block_non_fast_forward field in the config.
Pushrebase can only move the bookmark that is actually being pushrebased, so we do not need to check whether that move is fast-forward (it always is).
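The policy distilled into a sketch (an illustrative helper, not the real hook code): a move is allowed if it is fast-forward; otherwise it needs the NON_FAST_FORWARD pushvar, and even that is refused for bookmarks configured with block_non_fast_forward.

```rust
fn is_bookmark_move_allowed(
    fast_forward: bool,
    non_ff_pushvar_set: bool,
    block_non_fast_forward: bool,
) -> bool {
    if fast_forward {
        // Fast-forward moves are always fine.
        return true;
    }
    if block_non_fast_forward {
        // e.g. master: never movable in a non-fast-forward manner.
        return false;
    }
    // Otherwise the pushvar decides.
    non_ff_pushvar_set
}

fn main() {
    assert!(is_bookmark_move_allowed(true, false, true));
    assert!(!is_bookmark_move_allowed(false, true, true));
    assert!(is_bookmark_move_allowed(false, true, false));
    println!("policy checks passed");
}
```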
Reviewed By: StanislavGlebik
Differential Revision: D14405696
fbshipit-source-id: 782b49c26a753918418e02c06dcfab76e3394dc1
Summary: This is needed to run the scmadmin tool and lock the repo.
Reviewed By: StanislavGlebik
Differential Revision: D14497608
fbshipit-source-id: 5865b90375db29a17d462044ca4cdb87242a8209
Summary:
Previously, we always restarted replaying from the beginning. Starting from this
diff, we actually record the latest replayed id in the mutable_counters table.
A special `--loop-forever` flag was added. It should be set in production, but for tests we don't want the binary to run forever.
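The resume logic, roughly (names are illustrative): read the latest replayed id from mutable_counters and start just after it, falling back to the CLI-provided start id when no counter has been recorded yet.

```rust
// recorded_counter: the value read from the mutable_counters table, if any.
// cli_start_id: the fallback passed on the command line (e.g. --start-id).
fn next_start_id(recorded_counter: Option<u64>, cli_start_id: u64) -> u64 {
    match recorded_counter {
        Some(latest_replayed) => latest_replayed + 1,
        None => cli_start_id,
    }
}

fn main() {
    println!("resuming from id {}", next_start_id(Some(41), 0));
}
```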
Reviewed By: ikostia
Differential Revision: D14226938
fbshipit-source-id: cbfde15a506ee94b0fa72015d9dcfd550f5b8ca3
Summary:
Small refactoring before the next diffs.
Before
mononoke_hg_sync ... --mode sync-once ssh://PATH --start-id 0
Now
mononoke_hg_sync ... ssh://PATH sync-once --start-id 0
Reviewed By: HarveyHunt
Differential Revision: D14226940
fbshipit-source-id: 1b6db6f194aecb2f4bdb7bbd9e846aaba180e098
Summary:
Pretty big cleanup. The biggest part is simplifying the comparison logic.
Pushrebase replay compared replayed pushrebase commits with their hg
counterparts. Because we previously didn't log enough data from hg servers, we
had to jump through a lot of hoops to find out which commits should be compared
to which.
Now we are logging `ordered_added_revs`, i.e. which commits Mercurial produced
after the hg pushrebase, which is exactly what we need for the comparison. So a lot
of code can be simplified.
That does mean, though, that we won't be able to replay some of the recorded
pushrebases, because they don't have `ordered_added_revs`. At the moment we have
~300K pushrebases without `ordered_added_revs` and ~100K with. I think this
change is worth it given how much simpler the code is, and I'd argue 100K is a
pretty big number. For the remaining 300K pushrebases, we can later manually
fill in the `ordered_added_revs` field if we consider it necessary.
After this diff pushrebase replay won't use hg repo, and so we can run it on
normal twshared jobs instead of on hg servers.
A few smaller changes:
1) Note that for now this diff makes pushrebase replay single-threaded, while
previously we ran comparisons in parallel. That will be fixed in the next diffs.
2) Pushrebase replay now always compares commits, i.e. the `--compare-commits`
option was removed.
Also change the repo id in the tw spec to 20.
Reviewed By: HarveyHunt
Differential Revision: D14122963
fbshipit-source-id: 0f8da7cffb13899f11143a01d1a301fdf8ea7f00
Summary:
First stab at a job that will keep hg in sync with Mononoke when Mononoke
becomes a source of truth.
Reviewed By: ikostia
Differential Revision: D14018269
fbshipit-source-id: f88c5eba8bf5482f2f162b7807ca8e41a3b4291d
Summary: This adds a metaconfig option to preserve push/pushrebase bundles in the blobstore.
Reviewed By: StanislavGlebik
Differential Revision: D14020299
fbshipit-source-id: 94304d69e0ac5d81232f058c6d94eec61eb0020a
Summary: Passing input on the command line is restricted by the maximum length of a command; it is much better to pass a file descriptor, so that one can pass very long inputs inside bash functions.
Reviewed By: StanislavGlebik
Differential Revision: D13964484
fbshipit-source-id: da010ecfcf05da8c5860c8b5ee0860a8aeda0502
Summary:
There were lots of small stupid mistakes with pushrebase replay in the past.
Since pushrebase replay will stay with us for quite some time,
let's add a test to prevent these regressions in the future.
In tests we'll use filesystem+sqlite instead of everstore + xdb to fetch
requests to replay.
Reviewed By: HarveyHunt
Differential Revision: D13940390
fbshipit-source-id: a4398d1c8c22bf16a85d6391a1f5665ce4b73eb1
Summary:
I already made them the same (but as copies) in a different diff. As we
discussed in the war room, symlinks are a better solution.
Also removed an unused import and some functions that were copied from
Mercurial. These are no longer needed as we now share tinit.sh with
general-purpose functions.
The only Mononoke-specific file left is dummyssh.py.
I think later we could pack them into a separate buck target or something and share them in a better way than symlinks, but for now this is the easiest solution.
Reviewed By: DurhamG
Differential Revision: D13881960
fbshipit-source-id: 36f425d6f0ddbae2c9d083de35d2779669dc01e7
Summary: This test covers a corner case (partially public stacks).
Reviewed By: StanislavGlebik
Differential Revision: D13750852
fbshipit-source-id: cd1a5a84dfb62951cb37f1fbdd6c510d825adb41
Summary:
Mercurial has a hack to determine if a file was renamed: if p1 is None, then
copy metadata is checked. Note that this hack exists purely to make finding
renames faster, and we don't need it in Mononoke. So let's just read the copy metadata.
This diff also removes the `maybe_copied()` method and unused code like `Symlink`.
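The contrast, sketched with stand-in types (the real filenode structures are richer): Mercurial's shortcut consults copy metadata only when p1 is absent, while Mononoke can simply read the metadata.

```rust
// Stand-in for the relevant filenode fields.
struct Filenode {
    p1: Option<u64>,
    copy_from: Option<String>,
}

// Mercurial-style speed shortcut for finding renames.
fn hg_maybe_copied(f: &Filenode) -> bool {
    f.p1.is_none() && f.copy_from.is_some()
}

// Mononoke after this diff: no shortcut, just read the copy metadata.
fn mononoke_copied(f: &Filenode) -> bool {
    f.copy_from.is_some()
}

fn main() {
    let renamed_over = Filenode { p1: Some(1), copy_from: Some("old/path".into()) };
    // The shortcut misses copies that also have a p1; reading the metadata
    // directly does not.
    assert!(!hg_maybe_copied(&renamed_over));
    assert!(mononoke_copied(&renamed_over));
    println!("copy checks passed");
}
```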
Reviewed By: farnz
Differential Revision: D12826409
fbshipit-source-id: 53792218cb61fcba96144765790278d17eecdbb1
Summary:
Currently, some tests are failing intermittently due to a timeout. This timeout
is caused by the integration tests making an external call to AclChecker, which
is intermittently very slow.
For our integration tests, we disable this call, since it's not part of what
the test suite exercises, to stop our tests from being flaky.
Reviewed By: StanislavGlebik
Differential Revision: D13580144
fbshipit-source-id: 0c26bb14dd222b888ca2638319071f4d99eab6df