Summary:
Change the multiplexed blobstore logging to log the blobstore key.
Also, fix the get() method to log errors to Scuba (which it wasn't doing before!).
Further, refactor the code to remove env parsing for tupperware information and also switch
to using ScubaSampleBuilder. This means that the common server information can
be stored once and then cloned each time we log, rather than iterating through
a vec of server information each time we want to log.
Additionally, using ScubaSampleBuilder means that we don't need to pass in an Option<ScubaClient>,
cleaning up the code a little bit.
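The store-once, clone-per-log pattern described above can be sketched with plain std types; this is a hypothetical stand-in, not the real ScubaSampleBuilder API:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for ScubaSampleBuilder: common server fields are
// stored once, and each log call clones the builder and adds per-event data.
#[derive(Clone)]
struct SampleBuilder {
    fields: HashMap<String, String>,
}

impl SampleBuilder {
    fn new() -> Self {
        SampleBuilder { fields: HashMap::new() }
    }

    fn add(mut self, key: &str, value: &str) -> Self {
        self.fields.insert(key.to_string(), value.to_string());
        self
    }
}

fn main() {
    // Set up the common server information once.
    let base = SampleBuilder::new()
        .add("server", "lfs001")
        .add("region", "eu");

    // Each log event clones the base instead of rebuilding the common fields.
    let sample = base.clone().add("blobstore_key", "content.blake2.abc123");
    assert_eq!(sample.fields.len(), 3);
    assert_eq!(base.fields.len(), 2); // base is unchanged
}
```

Because the builder is always present (with possibly empty fields), callers no longer need to thread an `Option<ScubaClient>` through every function.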
Reviewed By: StanislavGlebik
Differential Revision: D17368779
fbshipit-source-id: 0896962cdbd37912fc6f23a5e541e10cea90fa0e
Summary:
This diff moves initFacebook calls that used to happen just before FFI calls to instead happen at the beginning of main.
The basic assumption of initFacebook is that it happens at the beginning of main before there are additional threads. It must be allowed to modify process-global state like env vars or gflags without the possibility of a data race from other code concurrently reading those things. As such, the previous approach of calling initFacebook through `*fbinit::FACEBOOK` near FFI calls was prone to race conditions.
The new approach is based on attribute macros added in D17245802.
---
The primary remaining situations that still require `*fbinit::FACEBOOK` are when we don't directly control the function arguments surrounding the call to C++, such as in lazy_static:
```
lazy_static! {
    static ref S: Ty = {
        let _ = *fbinit::FACEBOOK;
        /* call C++ */
    };
}
```
and quickcheck:
```
quickcheck! {
    fn f(/* args that impl Arbitrary */) {
        let _ = *fbinit::FACEBOOK;
        /* call C++ */
    }
}
```
I will revisit these in a separate diff. They are a small fraction of total uses of fbinit.
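The underlying constraint, run process-global init exactly once and before other threads exist, can be illustrated with std's own primitives; this is a stdlib analogue, not the fbinit implementation:

```rust
use std::sync::{Once, atomic::{AtomicBool, Ordering}};

static INIT: Once = Once::new();
static INITIALIZED: AtomicBool = AtomicBool::new(false);

// Stand-in for process-global setup such as mutating env vars or gflags.
fn init_process_globals() {
    INIT.call_once(|| {
        INITIALIZED.store(true, Ordering::SeqCst);
    });
}

fn main() {
    // Calling init at the top of main, before spawning any threads, avoids
    // data races on process-global state; repeated calls are no-ops.
    init_process_globals();
    init_process_globals();
    assert!(INITIALIZED.load(Ordering::SeqCst));
}
```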
Reviewed By: Imxset21
Differential Revision: D17328504
fbshipit-source-id: f80edb763e7f42b3216552dd32f1ea0e6cc8fd12
Summary: Add a `single` subcommand which regenerates the specified derived data for the provided changeset. This is useful for debugging purposes.
Reviewed By: StanislavGlebik
Differential Revision: D17318968
fbshipit-source-id: 3d3a551991b0628a05335addedd7d5b315fd45d2
Summary: I modified the statistics_collector tool so that it now calculates both the number of files and the total file size in the repo.
Reviewed By: krallin
Differential Revision: D17342512
fbshipit-source-id: 94217d8b61c2a7350f1793a2ef33f84d600bbb54
Summary:
Generating a lot of fastlog batches for a single commit turned out to be a cpu
intensive operation. To make sure more cpu threads are used, let's use the
spawn_future() function, which puts each future in a separate Tokio task and
in turn lets it be scheduled on a different cpu.
There are some concerns that it might cause problems when we derive data in
production, i.e. the whole host will be unavailable for a few seconds because it
uses all cpus. My hope is that it will affect only very few hosts, because of
the memcache leases that we use in the derived data implementation.
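A rough stdlib-only analogue of the spawn_future() idea, where each unit of CPU-bound work gets its own task so it can run on another core; the real code uses Tokio tasks rather than OS threads, and the workload here is made up:

```rust
use std::thread;

// Each chunk of work is spawned as its own task so multiple cores can be
// used, mirroring how spawn_future() puts each future on a separate task.
fn sum_in_parallel(chunks: Vec<Vec<u64>>) -> u64 {
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| thread::spawn(move || chunk.iter().sum::<u64>()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let total = sum_in_parallel(vec![vec![1, 2, 3], vec![4, 5], vec![6]]);
    assert_eq!(total, 21);
}
```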
Reviewed By: krallin
Differential Revision: D17364581
fbshipit-source-id: 0281de9c42f93d793f6caccbca7e8809056782ec
Summary:
Timing a request is going to be important for multiple logging
middlewares, so let's move it into its own middleware and place it
in the pipeline.
Further, create a new middleware/ directory and move middlewares into there.
Reviewed By: krallin
Differential Revision: D17346276
fbshipit-source-id: f84c6c06d76e95c11aab18c3a24200a67429bebf
Summary:
As the filestore can cheaply calculate file size, update the Scuba logging to
also log that. Further, set the content length header when responding to download requests.
Reviewed By: krallin
Differential Revision: D17319666
fbshipit-source-id: 858372316930c384f19b89e2b69b08faaf656237
Summary: Add lease_type to memcache lease stats so that it's clearer what types of leases are driving lease waits.
Reviewed By: HarveyHunt
Differential Revision: D17344425
fbshipit-source-id: ea1ae44319428bc1705f502ad9fa1b2b0c44bf96
Summary:
It's quite useful for testing to be able to keep cachelib caching but disable
memcache caching. This diff adds that option.
Note that in this diff cachelib caching is used only for the blobstore; we don't use any caching for filenodes, changesets, etc.
Reviewed By: HarveyHunt
Differential Revision: D17342171
fbshipit-source-id: 65458170560ea6913b3249a4118404dcc47e507d
Summary:
Previously, rebuilding the skiplist required fetching all the commits from mysql
and then rebuilding the index from scratch. That was quite slow (> 16 mins to
finish). Instead, let's try to read the key to check whether it already has the
data, and prepopulate the skiplist with it.
Reviewed By: krallin
Differential Revision: D17343950
fbshipit-source-id: e8a446b94af61dbbd224d853f7dd8dd41510549d
Summary:
This wires up the stdlog crate with our slog output. The upshot is that we can
now run binaries with `RUST_LOG` set and expect it to work.
This is nice because many crates use stdlog (e.g. Tokio, Hyper), so this is
convenient to get access to their logging. For example, if you run with
`RUST_LOG=gotham=info,hyper=debug`, then you get debug logs from Hyper and info
logs from Gotham.
The way this works is by registering a stdlog logger that uses the env_logger's
filter (the one that "invented" `RUST_LOG`) to filter logs, and routes them to
slog if they pass. Note that the slog Logger used there doesn't do any
filtering, since we already do it before sending logs there.
One thing to keep in mind is that we should only register the stdlog global
logger once. I've renamed `get_logger` to `init_logging` to make this clearer.
This behavior is similar to what we do with `init_cachelib`. I've updated
callsites accordingly.
Note that we explicitly tell the stdlog framework to ignore anything that we
won't consider for logging. If you don't set `RUST_LOG`, then the default
logging level is `Error`, which means that anything below error that is sent to
stdlog won't even reach our filtering logic (the stdlog logging macros check
the global level before actually logging), so this is cheap unless you do set
`RUST_LOG`.
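The `RUST_LOG` directive format used above (e.g. `gotham=info,hyper=debug`) can be parsed in a few lines of std code; this sketch is only an illustration, not env_logger's actual parser:

```rust
use std::collections::HashMap;

// Parse a RUST_LOG-style filter string like "gotham=info,hyper=debug" into
// a per-target level map. A bare token (no '=') maps to "trace" here.
fn parse_filter(spec: &str) -> HashMap<String, String> {
    spec.split(',')
        .filter_map(|directive| {
            let mut parts = directive.splitn(2, '=');
            let target = parts.next()?.trim();
            let level = parts.next().unwrap_or("trace").trim();
            if target.is_empty() {
                None
            } else {
                Some((target.to_string(), level.to_string()))
            }
        })
        .collect()
}

fn main() {
    let filter = parse_filter("gotham=info,hyper=debug");
    assert_eq!(filter["gotham"], "info");
    assert_eq!(filter["hyper"], "debug");
}
```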
As part of this, I've also updated all our binaries (and therefore, tests) to
use glog for logging. We had been meaning to do this, and it was convenient to
do it here because the other logger factory we were using didn't make it easy
to get a Drain without putting it in a Logger.
Reviewed By: ahornby
Differential Revision: D17314200
fbshipit-source-id: 19b5e8edc3bbe7ba02ccec4f1852dc3587373fff
Summary:
This updates the LFS server to route hg client correlators to Scuba. This will
help in troubleshooting issues should any arise.
Reviewed By: HarveyHunt
Differential Revision: D17319280
fbshipit-source-id: d4323925a425203f53aba184d5854dd674462da6
Summary:
This adds support for logging client identities to Scuba. This is useful to
know who's connecting to the LFS Server.
Reviewed By: StanislavGlebik
Differential Revision: D17318696
fbshipit-source-id: adba75e4133e54af7eef5183a245a3934527db05
Summary:
This updates our error reporting to actually log the chained causes we bothered
to put on the errors. It also adds a few more chained causes.
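Logging chained causes can be sketched using only std::error::Error's source() chain; the error type and messages here are hypothetical, not the LFS server's own:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type that carries an optional chained cause.
#[derive(Debug)]
struct WrappedError {
    msg: String,
    cause: Option<Box<dyn Error>>,
}

impl fmt::Display for WrappedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.msg)
    }
}

impl Error for WrappedError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.cause.as_deref()
    }
}

// Collect the top-level message plus every chained cause, the way an
// error reporter would log them.
fn describe(err: &dyn Error) -> Vec<String> {
    let mut out = vec![err.to_string()];
    let mut source = err.source();
    while let Some(cause) = source {
        out.push(cause.to_string());
        source = cause.source();
    }
    out
}

fn main() {
    let inner = WrappedError { msg: "blobstore fetch failed".into(), cause: None };
    let outer = WrappedError { msg: "upload rejected".into(), cause: Some(Box::new(inner)) };
    assert_eq!(describe(&outer), vec!["upload rejected", "blobstore fetch failed"]);
}
```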
Reviewed By: StanislavGlebik
Differential Revision: D17315689
fbshipit-source-id: d3b83f73fa06b56b863e23f2f76e78f699af8e36
Summary:
Fetching commits takes a long time. I've added a subcommand that saves them to
a file, which the "backfill" command can later read. Note that another option
would be to add indices to the db to make commit fetching faster; however, this
diff was simpler to do.
I also fixed a panic which was because of division by zero.
Reviewed By: aslpavel
Differential Revision: D17341640
fbshipit-source-id: b0335ebf8799cd48884c19fa8a0ee8023eb751af
Summary:
`extern crate` is usually no longer needed in 2018 edition of Rust. This diff removes `extern crate` lines from fbcode where possible, replacing #[macro_use] with individual import of macros.
Before:
```
#[macro_use]
extern crate futures_ext;
extern crate serde_json;
```
After:
```
use futures_ext::try_boxfuture;
```
Reviewed By: Imxset21
Differential Revision: D17313537
fbshipit-source-id: 70462a2c161375017b77fa44aba166884ad2fdc3
Summary:
Simple implementation of the `file_history` using batched Unodes. It only contains 2 additional parameters: `skip` and `limit`.
The current implementation is very simple: it queries the batched history only once and returns a result based on skip, limit and the first history batch.
Reviewed By: StanislavGlebik
Differential Revision: D17262719
fbshipit-source-id: 054944ec6d1ea0c75d879d33798e20720b35ae1a
Summary:
Add a new function to the filestore that allows for a file to be fetched,
along with its size. This will be used in the LFS server to record response sizes
and set the content length header.
Reviewed By: krallin
Differential Revision: D17319665
fbshipit-source-id: 039ef9c1eea59d19c54c5378229515ac1aeedcca
Summary:
Add per-request logging to the LFS server, so that it's easier to debug LFS server issues. In
a future diff, I will also log response size.
Reviewed By: krallin
Differential Revision: D17314566
fbshipit-source-id: d40dc23a55bc56f6f768c9c0119553d03ea568c5
Summary:
Our tests were dead when D17244837 landed, but that diff removed loose files and
broke our tests because they were trying to corrupt them. Let's fix this by
corrupting packs instead.
Reviewed By: StanislavGlebik
Differential Revision: D17314430
fbshipit-source-id: 708f24b902ba1f94836f48989a32b696d2f6e52a
Summary: Add a header so we can tell a response is from Mononoke LFS.
Reviewed By: krallin, StanislavGlebik
Differential Revision: D17288640
fbshipit-source-id: 73dc4b29c2b865f8f5407636de4235819ca8ffdb
Summary:
Pushing really long linear histories might cause a stack overflow:
{P109510712}
The problem is that we have a chain of futures as long as the number of
commits we are pushing, and polling them might result in a very long stacktrace.
To fix it, let's create commits in batches.
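The batching itself can be sketched as follows (batch size and element type are hypothetical):

```rust
// Split work into fixed-size batches that are processed one after another,
// instead of building one deeply nested future chain over the whole history.
fn process_in_batches<T: Clone>(items: &[T], batch_size: usize) -> Vec<Vec<T>> {
    items.chunks(batch_size).map(|c| c.to_vec()).collect()
}

fn main() {
    let commits: Vec<u32> = (0..10).collect();
    let batches = process_in_batches(&commits, 4);
    assert_eq!(batches.len(), 3); // 4 + 4 + 2
    assert_eq!(batches[2], vec![8, 9]);
}
```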
Reviewed By: krallin
Differential Revision: D17257875
fbshipit-source-id: 4c19fcd17fe5e7f4e6718080ec4180c59c8a6c6f
Summary: First version of a binary that counts the number of files in a specific repo and prints it.
Reviewed By: StanislavGlebik
Differential Revision: D17285812
fbshipit-source-id: 30bc9e2c11ee75fcfb8d94610bd4e320a56dafc7
Summary:
We need this for `megarepo_test` blobimport, which will prefix `fbsource` and
`ovrsource` bookmarks.
Reviewed By: krallin
Differential Revision: D17286275
fbshipit-source-id: 40bb3f97b0a06dcd636f891d9b32c7ef9b55a0fc
Summary:
Now that I've updated the `queries!` macro to allow a `>list` along with other
arguments in a query, we can use it in the dbbookmarks update query that was
checking for valid `UPDATE`s.
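The kind of statement this enables, a query mixing scalar arguments with a `>list` argument, can be approximated by building the placeholder list by hand; this is a sketch of the SQL shape, not the `queries!` macro's output:

```rust
// Build "UPDATE bookmarks SET value = ? WHERE name IN (?, ?, ?)"-style SQL
// for one scalar argument plus a list argument of the given length.
fn build_update(list_len: usize) -> String {
    let placeholders = vec!["?"; list_len].join(", ");
    format!(
        "UPDATE bookmarks SET value = ? WHERE name IN ({})",
        placeholders
    )
}

fn main() {
    assert_eq!(
        build_update(3),
        "UPDATE bookmarks SET value = ? WHERE name IN (?, ?, ?)"
    );
}
```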
Reviewed By: farnz
Differential Revision: D17288575
fbshipit-source-id: 051e03bd711bf26e43ec79509051ffff68316db7
Summary:
We don't need to create double-indirection when accepting `>list` arguments.
This tends to force callers into creating new Vecs of references here and
there.
Since I was in here fixing D17286200, I figured I might as well do this too.
Reviewed By: farnz
Differential Revision: D17286608
fbshipit-source-id: 994f7d6da309b16b4e613d05faeaa3ae70ae70ab
Summary:
Since ConfigeratorAPI::new requires initFacebook to have been called, this diff gives it a `Facebook` parameter to require that callers prove they have called #[fbinit::main] as described in D17245802.
This diff turned out to be not too invasive in my opinion. It turns out there are usually pretty few layers between main and ConfigeratorAPI::new. Threading along a `Facebook` object beats the experience of debugging an error that looks like P109696408 after forgetting to use fbinit.
Reviewed By: Imxset21
Differential Revision: D17277841
fbshipit-source-id: 2fe3096ebcac58bb123149906e7e5d9d9e2da685
Summary: I think these are left over from pre-2018 code where they may have been necessary. In 2018 edition, import paths in `use` always begin with a crate name or `crate`/`super`/`self`, so `use $ident;` always refers to a crate. Since extern crates are always in scope in every module, `use $ident` does nothing.
Reviewed By: Imxset21
Differential Revision: D17290473
fbshipit-source-id: 23d86e5d0dcd5c2d4e53c7a36b4267101dd4b45c
Summary:
Previously, the LFS server's internal functions returned HandlerResponses,
which included HTTP status codes. The bail_http macros were used to implement
this, but they made it difficult to pass state to them for logging, and they
blocked the use of the ? operator, which is a nice benefit of async / await
syntax.
Refactor these functions so that they don't return such Gotham-specific types,
and remove the relevant http macros.
Reviewed By: krallin
Differential Revision: D17286238
fbshipit-source-id: 05ff791d4761b0f742d22a2966d5ecc5968728ba
Summary:
That's something we've wanted to do for a while: for each request, track how many
requests it sends to our storages (i.e. manifold and xdb). That might make perf
debugging easier.
There's a concern that it might increase cpu usage, and I'll run a canary to
check whether that's the case.
Reviewed By: krallin
Differential Revision: D17091115
fbshipit-source-id: 27fea314241d883ced72d88d39f2e188716a1b9a
Summary:
Extract the test utils from derive_unode_manifest to their own crate so that
they can be re-used in future tests.
Reviewed By: StanislavGlebik
Differential Revision: D17282411
fbshipit-source-id: 50410cffe8a912bd07283bc6ac4e97e28663d854
Summary:
Commit sync will operate based on the following idea:
- there's one "large" and potentially many "small" repos
- there are two possible sync directions: large-to-small and small-to-large
- when syncing a small repo into a large repo, it is allowed to change paths
of each individual file, but not to drop files
- the large repo prepends a predefined prefix to every bookmark name from the small repo, except for the bookmarks specified in the `common_pushrebase_bookmarks` list, which refers to the bookmarks that can be advanced by any small repo
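The bookmark-renaming rule in the last bullet can be sketched as follows; the prefix and bookmark names are hypothetical:

```rust
use std::collections::HashSet;

// Prepend the large-repo prefix to a small-repo bookmark name, unless the
// bookmark is in the shared common_pushrebase_bookmarks set.
fn rename_bookmark(name: &str, prefix: &str, common: &HashSet<&str>) -> String {
    if common.contains(name) {
        name.to_string()
    } else {
        format!("{}{}", prefix, name)
    }
}

fn main() {
    let common: HashSet<&str> = ["master"].iter().copied().collect();
    assert_eq!(rename_bookmark("master", "fbsource/", &common), "master");
    assert_eq!(rename_bookmark("feature-x", "fbsource/", &common), "fbsource/feature-x");
}
```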
Reviewed By: krallin
Differential Revision: D17258451
fbshipit-source-id: 6cdaccd0374250f6bbdcbc9a280da89ccd7dff97
Summary: This removes the LFS implementation from the apiserver. We haven't used it, and we weren't going to.
Reviewed By: StanislavGlebik
Differential Revision: D17263036
fbshipit-source-id: cad3e8a7d54abef06ff1bc1cb5e73c55342743b6
Summary:
This updates all the existing integration tests that talked to the API Server to talk to the LFS Server instead.
Note that some tests have changed a little bit. That's because unlike the old implementation, the new one doesn't ask clients to re-upload the things it already has.
Reviewed By: HarveyHunt
Differential Revision: D17263038
fbshipit-source-id: b3ef9af6f416458354d49b0ab9e99e588f8edd26
Summary:
This adds support in the LFS Server for running without an upstream. This is
useful for tests, because it lets us chain LFS Servers and have one act as a
proxy and one act as a backend.
This also highlighted a bug in my implementation (we were expecting a download
action from upstream if the data didn't need to be uploaded there), which I
fixed here.
For the time being, we won't use this in production.
Reviewed By: HarveyHunt
Differential Revision: D17263039
fbshipit-source-id: 7cba550054e5f052a4b8953ebe0195907919aade
Summary:
This adds a logging middleware that logs requests to stdlog (and therefore to slog through slog-stdlog).
It's convenient for tests, and it's also a pretty standard thing for HTTP servers.
Reviewed By: StanislavGlebik
Differential Revision: D17263040
fbshipit-source-id: 992c5e46ba9ae5b829001a26536b827130aa813c
Summary: It looks like I forgot a few prior to landing the LFS server. This removes a few extra `Fallible`s I still had laying around.
Reviewed By: StanislavGlebik
Differential Revision: D17263037
fbshipit-source-id: 0e39e04daf6fc1336a7a35aaafd243dc88c8836e
Summary:
Previous implementation of `create_merged_list` didn't always produce entries
in BFS order (see a test case for example).
Reviewed By: farnz
Differential Revision: D17281943
fbshipit-source-id: ec93b815e0b8f528952e05a47df767b679e41aad
Summary:
We want to tell apart user errors (i.e. incorrect input) from server errors (i.e.
the blobstore is down). Let's log a special field for that.
Reviewed By: farnz
Differential Revision: D17229670
fbshipit-source-id: f08c96f5e5d9e1adcf6919970a7aaf7c0d4cd985
Summary:
Finally support merges. Whenever we have a merge a completely new batch is
created.
The one thing that surprised me was that ParentOffsets can actually be
negative. For example:
```
    o
   / \
  o   o
  |   |
  o   |
  |   |
  o   |   <--- ParentOffset of this commit can be negative!
   \ /
    o
```
It happens because of the BFS order of traversal: a parent can be visited before
the child. Note that negative offsets shouldn't cause problems.
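A small BFS over a DAG shaped like the one above shows how a parent can get a smaller visit index than one of its children, which is what makes the offset negative; the node names are hypothetical:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// BFS from the merge commit; a parent reached earlier via a shorter branch
// ends up with a smaller index than a child reached later.
fn bfs_order(parents: &HashMap<&str, Vec<&str>>, start: &str) -> Vec<String> {
    let mut order = Vec::new();
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue = VecDeque::new();
    queue.push_back(start);
    seen.insert(start);
    while let Some(node) = queue.pop_front() {
        order.push(node.to_string());
        for &p in parents.get(node).into_iter().flatten() {
            if seen.insert(p) {
                queue.push_back(p);
            }
        }
    }
    order
}

fn main() {
    // Merge M has parents A (long branch A -> B -> C -> root) and
    // D (short branch D -> root), like the diagram.
    let mut parents: HashMap<&str, Vec<&str>> = HashMap::new();
    parents.insert("M", vec!["A", "D"]);
    parents.insert("A", vec!["B"]);
    parents.insert("B", vec!["C"]);
    parents.insert("C", vec!["root"]);
    parents.insert("D", vec!["root"]);

    let order = bfs_order(&parents, "M");
    let pos = |name: &str| order.iter().position(|n| n == name).unwrap() as i64;
    // root is reached through D before C is reached through B, so the
    // offset from C to its parent root is negative.
    assert!(pos("root") - pos("C") < 0);
}
```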
Reviewed By: aslpavel
Differential Revision: D17183355
fbshipit-source-id: b5165ffef7212ce220dd338079db9e26a3030f58
Summary:
Implement the `commit_is_ancestor_of` call, which returns whether this commit
is an ancestor of some other commit.
Reviewed By: krallin
Differential Revision: D17183595
fbshipit-source-id: be7826e778c48dd86f116d7fbcaabe18bdffab44
Summary: Implement the `commit_info` call, which returns the metadata for the commit.
Reviewed By: krallin
Differential Revision: D17183596
fbshipit-source-id: 500029b7c4b4705fd937a894d15c14d911129b3c
Summary:
Implement the `commit_lookup` call, which looks up commits to see if they exist,
and maps between commit identity schemes.
Reviewed By: krallin
Differential Revision: D17183597
fbshipit-source-id: 3d21c9b0804ce3bbd576543716ce9647d7d1d7e2
Summary:
Implement the `repo_list_bookmarks` call, which lists bookmarks.
Listing scratch bookmarks requires the user to provide a prefix to match and a
limit for the number of bookmarks to fetch. There is currently no provision
for paging.
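The prefix-plus-limit listing can be sketched over a sorted std map; this is illustrative only, not the Mononoke implementation, and the bookmark names are made up:

```rust
use std::collections::BTreeMap;

// List bookmarks matching a prefix, capped at `limit`; no paging.
fn list_bookmarks(all: &BTreeMap<String, String>, prefix: &str, limit: usize) -> Vec<String> {
    all.range(prefix.to_string()..)
        .take_while(|(name, _)| name.starts_with(prefix))
        .take(limit)
        .map(|(name, _)| name.clone())
        .collect()
}

fn main() {
    let mut all = BTreeMap::new();
    all.insert("scratch/a".to_string(), "aaaa".to_string());
    all.insert("scratch/b".to_string(), "bbbb".to_string());
    all.insert("master".to_string(), "cccc".to_string());

    assert_eq!(list_bookmarks(&all, "scratch/", 10), vec!["scratch/a", "scratch/b"]);
    assert_eq!(list_bookmarks(&all, "scratch/", 1).len(), 1);
}
```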
Reviewed By: krallin
Differential Revision: D17157497
fbshipit-source-id: 247f02299f40a9e9142c6ca838fca1d1de874382