Summary: We're not using this anymore.
Reviewed By: ikostia
Differential Revision: D18744426
fbshipit-source-id: f6a1be21624200ff3193baec5fb7953a325cf061
Summary:
A couple of LFS server tests are flaky because the expected output does not
appear in the scuba log file. The affected tests run curl to send a command
to the LFS server. A likely explanation for the missing output is that curl completes so
quickly that the async write to the log file hasn't happened by the time the test
checks the contents of the log file.
Fix this by adding a test helper function that waits until a file is non-empty.
Further, update the scuba logging code to seek to the end of the file before
writing, so that it doesn't create sparse files if the log file is truncated.
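The waiting helper can be sketched as follows. This is a minimal Python illustration of the idea only; the actual helper lives in the shell-based integration test library, and the function name and parameters here are invented:

```python
import os
import time


def wait_for_nonempty(path, timeout=10.0, interval=0.1):
    """Poll until `path` exists and is non-empty, or give up after `timeout` seconds.

    Returns True once the file has content, False on timeout. This lets a test
    tolerate an asynchronous log write that lands shortly after the command exits.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path) and os.path.getsize(path) > 0:
            return True
        time.sleep(interval)
    return False
```

A test would call this right after running curl and before asserting on the log contents, turning an ordering race into a bounded wait.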
Reviewed By: krallin
Differential Revision: D18687831
fbshipit-source-id: 46b93306a3ac694064ab5cd4acd5eff25ab320e3
Summary: When we import to megarepo, we want large repo mappings set up, so that everything runs as expected. Add a new option to blobimport to handle this, plus supporting code.
Reviewed By: krallin
Differential Revision: D18614778
fbshipit-source-id: 807668c9ce4dbefa21618828b187ff2fe27c7787
Summary:
StanislavGlebik reported that some of the LFS tests were occasionally flaking out. This
is because we asynchronously resolve the hostname of the connecting client
before logging, and if the client is making requests fast enough we might
occasionally end up with logs out of order.
This fixes the problem by adding a test-friendly log mode that logs
synchronously and doesn't wait for hostnames. This is also nicer when writing
tests since it means we can just let the integration runner copy output and
don't need any globs.
Reviewed By: HarveyHunt
Differential Revision: D18592900
fbshipit-source-id: 1bafd684d8fc710dfb09cb8453bb16de774ad02d
Summary:
Now that the LFS Server can correctly verify identities, AclChecker can be
used to ensure that only authorised users can interact with the LFS server.
If the acl_check config option is set to False, then this change is a NOP.
If the hipster_acl config option is set, then a new AclChecker will be created and
queried after each connection is established. If it isn't set, then all clients that
present identities will be allowed to connect - if a client doesn't provide an identity,
it will be rejected.
NOTE: Acl checking is disabled by default.
Reviewed By: krallin
Differential Revision: D18452653
fbshipit-source-id: 631b34f70ee074c0a7d53c0c4fef4ea27bf13a2c
Summary:
Previously, we were reading forwarded client identities from the
HTTP header. We should only do this if we have verified that it is a trusted
proxy giving us this header.
Update the LFS server to pass the identities from the X509 cert to the client
identity middleware: if the client is the trusted proxy, we parse the HTTP header;
otherwise we take the identities from the X509 cert.
Reviewed By: krallin
Differential Revision: D18452652
fbshipit-source-id: c5d06bd07ee84864413a59b5e391ef9fd00819a0
Summary: D18478452 broke 2 mode/opt tests. One of them modifies the bookmarks table directly; in the other, a separate binary modifies the bookmarks. This makes the tests nondeterministic. Let's disable the bookmarks cache for those 2 tests.
Reviewed By: farnz
Differential Revision: D18501660
fbshipit-source-id: d4f625dbdf2f8b110eb6196761e655187407abf6
Summary: I added more integration tests to cover all options for getting commits. Also added tests to check output when asking about a globalrevs repo with or without a globalrev. Modified list_repos to make it deterministic.
Reviewed By: markbt, HarveyHunt
Differential Revision: D18448770
fbshipit-source-id: 8662c3a0d1676813def5dd9f2b17200ca1c52040
Summary:
Handling diamond merges correctly in megarepo is a hard problem. For now I'd
like to add this half-manual tool that can sync a merge commit into a megarepo
should we encounter one again. This tool is a hack until we fully support merge
commits in megarepo.
Notes:
1) The tool is best-effort, not production quality. It might not handle all
edge cases. It might require tweaking and should be used with care (e.g. run
mononoke_repo crossrepo verify-wc). That said, I'd like to land it -
previously it took me > 4 hours to sync a diamond merge. I'd like the next one
to take less time, and even this hacky tool should help.
2) A diff below in the stack changes the blobsync crate to not upload a blob if it
already exists. This is necessary for the tool to work. Currently `upload_commit`
copies all blobs from the source repo, but the merge commit the tool creates can contain
entries from multiple source repos - trying to copy all of them from a single source repo
will fail!
Reviewed By: farnz
Differential Revision: D18373457
fbshipit-source-id: 7cdb042b3a335cdc0807d0cf98533f9aec937fd0
Summary: This diff adds basic happy-path pushrebase tests for the push redirector. In other words, it covers a situation where there's a single repository that is push-redirected into a large repo and only serves pushrebase pushes.
Reviewed By: StanislavGlebik
Differential Revision: D18421133
fbshipit-source-id: c58af0c3c8fa767660f5e864554cc4a91cd0402c
Summary: This is to allow for cross-table transactions in integration tests.
Reviewed By: StanislavGlebik
Differential Revision: D18322477
fbshipit-source-id: 7bcc3ece8e2e75af65f008054c9461f77da43bc7
Summary: Mercurial-style tests for integration testing of the new service and its CLI
Reviewed By: krallin
Differential Revision: D18303172
fbshipit-source-id: 93e4c9cb92a39b3dfb36cd014296f35197b8fd8e
Summary:
During integration testing it's helpful to be able to configure
the config fetch interval. Add a new command line argument to allow this.
The value will default to the previous value of 5 seconds if the argument isn't
provided.
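A minimal sketch of such an argument, assuming an argparse-style CLI. The real flag name and parsing library in Mononoke differ (it is a Rust binary); this Python fragment only illustrates the "optional argument with a 5-second default" shape:

```python
import argparse


def make_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag name; the default of 5 seconds matches the previous
    # hard-coded config fetch interval described above.
    parser.add_argument(
        "--config-fetch-interval",
        type=int,
        default=5,
        help="Interval (seconds) between config fetches",
    )
    return parser
```

Tests can then pass a much shorter interval to exercise hot-reloading quickly.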
Reviewed By: krallin
Differential Revision: D18298192
fbshipit-source-id: 5812c84153e54a7775f1be094cf3e3f3376e35e4
Summary:
This adds support in the Mononoke LFS Server for reloading configurations in
real time, and notably for tracking Configerator configurations.
Since we don't have async Rust bindings to Configerator (yet), this spawns a
thread that periodically checks Configerator for updates (which is probably
good enough).
This also adds support for hot-reloading configurations from files, which I've
used for integration tests here.
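The polling approach can be sketched like this. This is a hedged Python illustration only; the actual implementation is a Rust thread polling Configerator, and every name below is invented:

```python
import json
import os
import threading


def watch_config(path, on_update, interval=1.0, stop_event=None):
    """Spawn a daemon thread that polls `path` and calls `on_update(parsed)`
    whenever the file's mtime changes. Returns an event that stops the loop."""
    stop_event = stop_event or threading.Event()
    last_mtime = None

    def loop():
        nonlocal last_mtime
        while not stop_event.is_set():
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                mtime = None  # file may not exist yet
            if mtime is not None and mtime != last_mtime:
                last_mtime = mtime
                with open(path) as f:
                    on_update(json.load(f))
            stop_event.wait(interval)

    threading.Thread(target=loop, daemon=True).start()
    return stop_event
```

Periodic polling trades a little latency for simplicity, which - as the summary notes - is probably good enough until async bindings exist.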
Reviewed By: HarveyHunt
Differential Revision: D18270434
fbshipit-source-id: 57527cd9f7388144022cb509fc2d695dc4e1c1df
Summary: This should let the test runner kill it, which is nice.
Reviewed By: HarveyHunt
Differential Revision: D18324298
fbshipit-source-id: a2eb795264ac1a84ced515b1d5d1615369a2f991
Summary:
We grew our Scuba logging sort of organically in the LFS Server, and it's starting to show. Notably, we have several places setting the same keys (on purpose, but that's brittle), and some inconsistency as to whether we're looking at the content length of a request or the size of a response.
This diff fixes that by:
- Making sure we differentiate content lengths and response sizes,
- Moving all Scuba keys into an enum that has documentation,
- Adding an integration test for Scuba logging (by logging to a file),
- Adding sane defaults for various fields (content length, notably),
- For downloads, capturing the volume of data we actually sent.
This will cause a bit of churn in our Scuba tables, but this is probably the right time to do it. Notably, things that look at `response_size` and `upload_size` should now probably start looking at `response_bytes_sent` and `request_bytes_received` instead. I'll update our derived columns to look at those.
This also updates our ODS logging a little bit, but we don't have any alarms looking at content size yet.
Reviewed By: HarveyHunt
Differential Revision: D18303711
fbshipit-source-id: f7e955c872242dc49c379f24230a151eb9e25fda
Summary:
If the SQL DB is being written to by Mononoke, it'll be locked, which means
reading the count there will fail. In that case, we should retry instead of
reporting a count of `""`. jsgf ran into this earlier this week.
This does that.
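The retry idea can be sketched as follows, assuming an SQLite-backed test database (which locks the whole file during writes). The function and parameter names are invented for illustration:

```python
import sqlite3
import time


def read_count_with_retry(db_path, query, attempts=5, delay=0.1):
    """Run a COUNT query, retrying if the database is temporarily locked.

    sqlite3 raises OperationalError ("database is locked") while a writer
    holds the lock; backing off and retrying avoids reporting an empty count.
    """
    for attempt in range(attempts):
        try:
            with sqlite3.connect(db_path) as conn:
                return conn.execute(query).fetchone()[0]
        except sqlite3.OperationalError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```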
Reviewed By: StanislavGlebik
Differential Revision: D18224545
fbshipit-source-id: e4aa8077276e67dddb732e00f5cf9ba3613feb11
Summary: When a new repository is created, we want to make sure that Mercurial has the ability to write to it, without having to go through a blobimport step first. This modifies the existing test for bookmarks to not use blobimport, and therefore enforce this feature.
Reviewed By: StanislavGlebik
Differential Revision: D18202830
fbshipit-source-id: 319cb4ba230a57bcedabd2990ee0b98c72a4de4e
Summary:
The file hook cache stores the result of running a hook on a specific
file. It allows the result of expensive hooks to be reused. However, there are
some limitations that mean the cache is practically useless. The hook cache
will be populated when a push happens to the server. The only case in which the
cached result is used is if someone pushes a file that exactly matches a
previous push to the exact same server (which hasn't been restarted). In
practice, this is very unlikely to happen.
To accommodate the hook cache, the code for file and changeset hooks is
quite different. Removing the cache allows for some unification (such as a
single run_hook function) and simplification. I've also added logging to
the changeset hooks, as previously we only printed debug output for the file
hooks.
Further, it allows the removal of asyncmemo in a few places.
Reviewed By: StanislavGlebik
Differential Revision: D17932986
fbshipit-source-id: df8220aea7511a00aeb6b9de615e15d657bf4602
Summary:
This diff updates all license headers to use the new text and style.
Also, a few internal files were missing the header, but now they have it.
`fbcode/common/rust/netstring/` had the internal header, but now it has
GPLV2PLUS, since that code goes to Mononoke's GitHub too.
Differential Revision: D17881539
fbshipit-source-id: b70d2ee41d2019fc7c2fe458627f0f7c01978186
Summary:
This silences logging from various cpp libraries we use (but not our own
logging) in tests. This should remove some flakiness in our tests coming from
those libraries.
Reviewed By: StanislavGlebik
Differential Revision: D17475237
fbshipit-source-id: ecee69b543d1b431d1da883f67fbc30915697e13
Summary:
This is nice when running locally to know when we're done with setting up
cachelib and such.
Reviewed By: farnz
Differential Revision: D17569829
fbshipit-source-id: adfe5944991c8842a459df8606d9d81c2dcd02de
Summary:
After this diff, the sync job [1] starts to keep track of bookmarks on the hg server.
There are two reasons for that:
1) Some pushes just move a bookmark i.e. they don't push any new commits.
However, without knowing the hg server state, it's impossible to tell a normal push
from a bookmark-only push and generate correct bundles for it (note that actual
support for bookmark-only pushes will be added in the next diffs; this diff
just tracks hg server bookmarks).
2) There are force pushes that move a bookmark to a completely new place AND
push new commits. Without knowing the state of the hg server, the sync job might try
to push too many commits.
To track the bookmarks, they are fetched from the hg server on sync job startup,
and later updated as the sync job progresses. To fetch bookmarks from the hg server,
a separate listserverbookmarks extension is used.
If bundles are prepared together (see --bundle-prefetch option), then we also need a bookmark overlay,
i.e. bookmarks that were modified by the previous bundles in the same batch.
Note that combine-bundles is removed since it hasn't been used at all, including tw specs.
[1] It only keeps track if the job needs to regenerate bundles
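The bookmark overlay described above can be sketched in Python (illustrative only; the sync job is Rust and all names here are invented). The server's bookmark state is overlaid with the moves made by earlier bundles in the same batch:

```python
def effective_bookmarks(server_bookmarks, pending_bundles):
    """Overlay the hg server's bookmarks with moves from earlier bundles in a batch.

    `server_bookmarks` maps bookmark name -> commit hash as fetched from the server.
    `pending_bundles` is a list of per-bundle move maps; a target of None means
    the bundle deletes that bookmark.
    """
    bookmarks = dict(server_bookmarks)
    for moves in pending_bundles:
        for name, target in moves.items():
            if target is None:
                bookmarks.pop(name, None)  # bundle deletes the bookmark
            else:
                bookmarks[name] = target
    return bookmarks
```

Each new bundle in the batch is then generated against this overlaid view rather than the (stale) server state.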
Reviewed By: krallin
Differential Revision: D17525983
fbshipit-source-id: 528949ad4ad57ae51ad68fced9caf7256a057ba3
Summary:
This updates the lfs_server to publish client_hostname to Scuba. It also
unifies the various mechanisms we had for getting data about the client into a
single piece of middleware, which ensures we get a consistent view of e.g.
client IPs in Scuba and stderr logging.
Reviewed By: HarveyHunt
Differential Revision: D17570749
fbshipit-source-id: e5b62abf440d5d09e78c1f51632444200768126c
Summary:
When I made the LFS server not wait for upstream if it can reply immediately, I
also made the tests for its proxy functionality a bit racy, since it no longer
always connects to upstream immediately. This fixes that.
While I'm in there, I also:
- Added an integration test for the "skip upstream" functionality.
- Made our stderr logging log empty responses.
- Made our invalid URI error reporting a little more comprehensive.
Reviewed By: HarveyHunt
Differential Revision: D17547889
fbshipit-source-id: 47f150136ef91a7f6334bb09f95782357b72f01a
Summary:
This is the very first and very limited implementation of a new mode of the
sync job. This is the mode where bundles are generated by the sync job instead
of reusing the bundles pushed by the client.
At the moment it syncs only a single commit, new functionality has to be added
later.
A few notes:
1) We need to manually generate the replycaps part. Mononoke essentially ignores it
(with a few small exceptions), but Mercurial actually pays attention to it and
generates a different response.
At the moment, reply capabilities are hard-coded in the sync job, and that might be
problematic if the Mercurial client changes replycaps but the sync job hasn't been
updated.
There are a few protections against this - we can make replycaps dynamic (for
example, specify it in the tw spec) and we can also rely on integration tests
to catch regressions.
2) create_filenode_entries_stream has a complicated logic for renames - please
have a look at it to double check my understanding!
Reviewed By: krallin
Differential Revision: D17477983
fbshipit-source-id: fdb4584e768bdc5637868468a035c81f1584f7fe
Summary:
Rather than using hardcoded path maps and ad-hoc movers, let's use the
logic which creates moves from config.
Reviewed By: farnz
Differential Revision: D17424678
fbshipit-source-id: 64a0a0b1c7332661408444a6d81f5931ed680c3c
Summary:
This wires up the stdlog crate with our slog output. The upshot is that we can
now run binaries with `RUST_LOG` set and expect it to work.
This is nice because many crates use stdlog (e.g. Tokio, Hyper), so this is
convenient to get access to their logging. For example, if you run with
`RUST_LOG=gotham=info,hyper=debug`, then you get debug logs from Hyper and info
logs from Gotham.
The way this works is by registering a stdlog logger that uses the env_logger's
filter (the one that "invented" `RUST_LOG`) to filter logs, and routes them to
slog if they pass. Note that the slog Logger used there doesn't do any
filtering, since we already do it before sending logs there.
One thing to keep in mind is that we should only register the stdlog global
logger once. I've renamed `get_logger` to `init_logging` to make this clearer.
This behavior is similar to what we do with `init_cachelib`. I've updated
callsites accordingly.
Note that we explicitly tell the stdlog framework to ignore anything that we
won't consider for logging. If you don't set `RUST_LOG`, then the default
logging level is `Error`, which means that anything below error that is sent to
stdlog won't even reach our filtering logic (the stdlog macros for logging
check the global level before actually logging), so this is cheap unless
you do set `RUST_LOG`.
As part of this, I've also updated all our binaries (and therefore, tests) to
use glog for logging. We had been meaning to do this, and it was convenient to
do it here because the other logger factory we were using didn't make it easy
to get a Drain without putting it in a Logger.
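The `RUST_LOG`-style filtering can be illustrated with a simplified sketch. Note this is an approximation: env_logger's real matching is by module-path prefix and supports richer syntax, and all names below are invented:

```python
# Log level ordering, lowest to highest severity.
LEVELS = {"trace": 0, "debug": 1, "info": 2, "warn": 3, "error": 4}


def parse_directives(spec):
    """Parse a RUST_LOG-style spec like 'gotham=info,hyper=debug' into {target: level}.

    A bare level (e.g. 'debug') sets the default for unmatched targets.
    """
    directives = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            target, level = part.split("=", 1)
            directives[target] = LEVELS[level.lower()]
        else:
            directives[""] = LEVELS[part.lower()]
    return directives


def enabled(directives, target, level, default=LEVELS["error"]):
    """Would a record for `target` at `level` pass the filter?

    Unmatched targets fall back to the default level (Error, per the summary).
    """
    threshold = directives.get(target, directives.get("", default))
    return LEVELS[level] >= threshold
```

So with `RUST_LOG=gotham=info,hyper=debug`, Hyper's debug records pass while Gotham's do not, and everything else is filtered below error.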
Reviewed By: ahornby
Differential Revision: D17314200
fbshipit-source-id: 19b5e8edc3bbe7ba02ccec4f1852dc3587373fff
Summary:
Commit sync will operate based on the following idea:
- there's one "large" and potentially many "small" repos
- there are two possible sync directions: large-to-small and small-to-large
- when syncing a small repo into a large repo, it is allowed to change paths
of each individual file, but not to drop files
- the large repo prepends a predefined prefix to every bookmark name from the small repo, except for the bookmarks specified in the `common_pushrebase_bookmarks` list, which refers to bookmarks that can be advanced by any small repo
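The bookmark-renaming rule in the last bullet can be sketched as follows (illustrative Python with invented names; the real logic is Rust and config-driven):

```python
def rename_bookmark_small_to_large(name, prefix, common_pushrebase_bookmarks):
    """Map a small-repo bookmark name into the large repo.

    Bookmarks in `common_pushrebase_bookmarks` are shared across repos and keep
    their name; all others get the small repo's predefined prefix.
    """
    if name in common_pushrebase_bookmarks:
        return name
    return prefix + name
```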
Reviewed By: krallin
Differential Revision: D17258451
fbshipit-source-id: 6cdaccd0374250f6bbdcbc9a280da89ccd7dff97
Summary:
This adds support in the LFS Server for running without an upstream. This is
useful for tests, because it lets us chain LFS Servers and have one acting as a
proxy and one acting as a backend.
This also highlighted a bug in my implementation (we were expecting a download
action from upstream if the data didn't need to be uploaded there), which I
fixed here.
For the time being, we won't use this in production.
Reviewed By: HarveyHunt
Differential Revision: D17263039
fbshipit-source-id: 7cba550054e5f052a4b8953ebe0195907919aade
Summary: Currently, in the mononoke_admin tool we have to use repo-id to identify a repository. Sometimes that can be inconvenient. Changed it so we can use either repo-id or reponame.
Reviewed By: StanislavGlebik
Differential Revision: D17202962
fbshipit-source-id: d33ad55f53c839afc70e42c26501ecd4421e32c0
Summary: This adds a command that allows importing a set of LFS blobs into Mononoke by streaming them out of another storage location (e.g. Dewey).
Reviewed By: HarveyHunt
Differential Revision: D16917806
fbshipit-source-id: 4917d56e11a187c89e00c23a32c6e791b351f8ef
Summary:
This is the mechanical part of the rename; it does not change any commit messages in
tests, nor the scuba table name/config setting. Those are more
complex.
Reviewed By: krallin
Differential Revision: D16890120
fbshipit-source-id: 966c0066f5e959631995a1abcc7123549f7495b6
Summary:
Create a new binary that can be used to rechunk file content using the filestore.
The binary accepts multiple filenodes, which it will then rechunk using the filestore
config provided to it.
Reviewed By: krallin
Differential Revision: D16802701
fbshipit-source-id: d7c05729f5072ff2925bbc90cdd89fcfed56bba2
Summary:
The network blackhole is causing the API server to occasionally hang while serving requests, which has broken some LFS tests. This appears to have happened in the last month or so, but unfortunately, I haven't been able to root-cause why it is happening.
From what I can tell, we have an hg client that tries an upload to the API Server, and uploads everything... and then the API server just hangs. If I kill the hg client, the API server responds with a 400 (so it's not completely stuck), but otherwise it seems like the API server is waiting for something to happen on the client side, and the client isn't sending it.
As far as I can tell, the API Server isn't actually trying to make outbound requests (strace does report that it has a Scribe client that's trying to connect, but Scuba logging isn't enabled, and this client is just trying to connect, not send anything), but something about the blackhole is causing this hg - API server interaction to fail.
In the meantime, this diff disables the blackhole for those tests that definitely don't work when it's enabled ...
Reviewed By: HarveyHunt
Differential Revision: D16599929
fbshipit-source-id: c6d77c5428e206cd41d5466e20405264622158ab
Summary:
This config option should be enabled in tests. We can use it to not run prod
services in non-test environments.
It will be used in the next diff to not load configerator in tests.
Reviewed By: krallin
Differential Revision: D16221512
fbshipit-source-id: 7e0ba9c1d46b652a06e0e767de5df78d1671951a
Summary:
The sync job is used to replay pushes from Mononoke onto Mercurial. Its code is in https://fburl.com/dt1hkf4g. For more about the sync job, see https://fburl.com/wiki/sd2i6rch.
The goal of the sync job is not only to replicate pushes from Mononoke onto Mercurial, but also to verify that each push was correct, i.e. that the push on Mononoke produced the same hash as on Mercurial.
The problem is that we lock and unlock in two completely different codepaths - locking is done by an external script called the "failure handler" (https://fburl.com/7txm3jxf), while unlocking is part of the sync job.
Ideally we'd like to lock and unlock in the sync job itself and remove the "failure handler" entirely.
Reviewed By: StanislavGlebik
Differential Revision: D16107759
fbshipit-source-id: a418f8d0f48fa6db82476be72a91adbc03b66168
Summary:
This adds the ability to provide an infinitepush namespace configuration without actually allowing infinite pushes server side. This is useful while Mercurial is the write master for Infinite Push commits, for two reasons:
- It lets us enable the infinitepush namespace, which will allow the sync to proceed between Mercurial and Mononoke, and also prevents users from making regular pushes into the infinitepush namespace.
- It lets us prevent users from sending commit cloud backups to Mononoke (we had an instance of this reported in the Source Control @ FB group).
Note that since we are routing backfills through the shadow tier, I've left infinitepush enabled there.
Reviewed By: StanislavGlebik
Differential Revision: D16071684
fbshipit-source-id: 21e26f892214e40d94358074a9166a8541b43e88
Summary:
Added an option to control whether censoring is enabled or disabled for each
repository. The option is added in `server.toml` as `censoring` and is set to
true or false. If `censoring` is not specified, it defaults to true (censoring
is enabled).
When `censoring` is disabled, the check for whether a key is blacklisted is
skipped, therefore all files are fetchable.
Reviewed By: ikostia
Differential Revision: D16029509
fbshipit-source-id: e9822c917fbcec3b3683d0e3619d0ef340a44926
Summary: This will make it easier to use the same bookmarks filling binary to backfill scratch bookmarks from Mercurial before we start actually processing new scratch bookmarks as they come in (see D16028731 for the plan I'm going to follow).
Reviewed By: farnz
Differential Revision: D16028817
fbshipit-source-id: 195e5bf746284e34c70ae2cbd2b9270fbc0c02c7
Summary:
Our test framework as it stands right now is a light passthrough to the hg `run-tests.py` test framework, which attempts to place all the files it needs to run (including tests) into a `python_binary`, then runs the hg test runner from that directory.
It heavily relies on how Buck works to offer functionality:
- It expects that all the sources it registers for its master binary will all be in the same directory when it builds
- It expects that the sources will be symlinks to the real files so that `--interactive` can work.
This has a few problems:
- It doesn't work in `mode/opt`. The archive that gets built in `mode/opt` doesn't actually have all the sources we registered, so it's impossible to run tests.
- To add a new test, you must rebuild everything. We don't do that very often, but it'd be nice if we didn't have to.
- Iterating on the runner itself is painful, because as far as Buck is concerned, it depends on the entire world. This means that every change to the runner has to scan a lot more stuff than necessary. There's some functionality I'd like to get into the runner (like reporting test timings) that hasn't been easy to add as a result.
This diff attempts to solve these problems by separating concerns a little more:
- The runner is now just a simple `python_binary`, so it's easier to make changes to it.
- The runner now provides the logic of working from local files when needed (this means you can add a new test and it'll work immediately),
- All the binaries we need are dependencies of the integration test target, not the runner's. However, to make it possible to run the runner incrementally while iterating on something, there's a manifest target that points at all the various paths the runner needs to work. This will also help integrate the test runner with other build frameworks if necessary (e.g. for open-sourcing).
- We have separate targets for various assets we need to run the tests (e.g. the hg test framework).
- The runner now controls whether to use the network blackhole. This was necessary because the network blackhole breaks PAR archives (because tmp is no longer owned by the right owner, because we use a user namespace). We should be able to bring this back at some point if we want to by using a proper chroot for opt tests.
I included a README to explain this new design as well.
There are some things that could yet stand to be improved here (notably, I think we should put assets and tests in different directories for the sake of clarity), but so far I've been aiming at providing a 1-1 translation of the old system into the new one. I am planning to make further improvements in followup diffs.
Reviewed By: farnz
Differential Revision: D15921732
fbshipit-source-id: 09052591c419acf97f7e360b1e88ef1f412da6e5