Summary: It will be used in the next diffs
Reviewed By: krallin
Differential Revision: D17498798
fbshipit-source-id: c71319a21aa586208871555f2055c81afb021b52
Summary:
The extra_context field is populated with certain performance counters.
However, if there is nothing to log, we currently log a serialised empty HashMap. In that
case, don't log anything at all.
Reviewed By: krallin
Differential Revision: D17419339
fbshipit-source-id: 9ede283b3ee20c59c09765412fa9c932a12cd913
Summary:
This diff moves initFacebook calls that used to happen just before FFI calls to instead happen at the beginning of main.
The basic assumption of initFacebook is that it happens at the beginning of main before there are additional threads. It must be allowed to modify process-global state like env vars or gflags without the possibility of a data race from other code concurrently reading those things. As such, the previous approach of calling initFacebook through `*fbinit::FACEBOOK` near FFI calls was prone to race conditions.
The new approach is based on attribute macros added in D17245802.
---
The primary remaining situations that still require `*fbinit::FACEBOOK` are when we don't directly control the function arguments surrounding the call to C++, such as in lazy_static:
```
lazy_static! {
    static ref S: Ty = {
        let _ = *fbinit::FACEBOOK;
        /* call C++ */
    };
}
```
and quickcheck:
```
quickcheck! {
    fn f(/* args that impl Arbitrary */) {
        let _ = *fbinit::FACEBOOK;
        /* call C++ */
    }
}
```
I will revisit these in a separate diff. They are a small fraction of total uses of fbinit.
Reviewed By: Imxset21
Differential Revision: D17328504
fbshipit-source-id: f80edb763e7f42b3216552dd32f1ea0e6cc8fd12
Summary: Add a `single` subcommand which regenerates the specified derived data for a provided changeset. This is useful for debugging purposes.
Reviewed By: StanislavGlebik
Differential Revision: D17318968
fbshipit-source-id: 3d3a551991b0628a05335addedd7d5b315fd45d2
Summary:
This wires up the stdlog crate with our slog output. The upshot is that we can
now run binaries with `RUST_LOG` set and expect it to work.
This is nice because many crates use stdlog (e.g. Tokio, Hyper), so this is
convenient to get access to their logging. For example, if you run with
`RUST_LOG=gotham=info,hyper=debug`, then you get debug logs from Hyper and info
logs from Gotham.
The way this works is by registering a stdlog logger that uses the env_logger's
filter (the one that "invented" `RUST_LOG`) to filter logs, and routes them to
slog if they pass. Note that the slog Logger used there doesn't do any
filtering, since we already do it before sending logs there.
One thing to keep in mind is that we should only register the stdlog global
logger once. I've renamed `get_logger` to `init_logging` to make this clearer.
This behavior is similar to what we do with `init_cachelib`. I've updated
callsites accordingly.
Note that we explicitly tell the stdlog framework to ignore anything that we
won't consider for logging. If you don't set `RUST_LOG`, then the default
logging level is `Error`, which means that anything below error that is sent to
stdlog won't even get to our filtering logic (the stdlog macros for logging
check for the global level before actually logging), so this is cheap unless
you do set `RUST_LOG`.
As part of this, I've also updated all our binaries (and therefore, tests) to
use glog for logging. We had been meaning to do this, and it was convenient to
do it here because the other logger factory we were using didn't make it easy
to get a Drain without putting it in a Logger.
Reviewed By: ahornby
Differential Revision: D17314200
fbshipit-source-id: 19b5e8edc3bbe7ba02ccec4f1852dc3587373fff
Summary:
`extern crate` is usually no longer needed in 2018 edition of Rust. This diff removes `extern crate` lines from fbcode where possible, replacing #[macro_use] with individual import of macros.
Before:
```
#[macro_use]
extern crate futures_ext;
extern crate serde_json;
```
After:
```
use futures_ext::try_boxfuture;
```
Reviewed By: Imxset21
Differential Revision: D17313537
fbshipit-source-id: 70462a2c161375017b77fa44aba166884ad2fdc3
Summary:
Since ConfigeratorAPI::new requires initFacebook to have been called, this diff gives it a `Facebook` parameter to require that callers prove they have called #[fbinit::main] as described in D17245802.
This diff turned out to be not too invasive, in my opinion. It turns out there are usually pretty few layers between main and ConfigeratorAPI::new. Threading along a `Facebook` object beats the experience of debugging an error like P109696408 when you forget to use fbinit.
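The "prove you've initialized" idea can be sketched in plain Rust as a zero-sized token that only the init path can construct, so any function taking it by value is statically known to run after initialization. Names below are illustrative, not the real fbinit API:

```rust
// Sketch of the "proof token" pattern. In a real crate, the private field
// keeps callers in other modules from constructing the token themselves.
#[derive(Clone, Copy)]
pub struct Facebook(());

// The only way to obtain a `Facebook` token: the init entry point
// (in the real code, the #[fbinit::main] attribute provides this).
pub fn perform_init() -> Facebook {
    // ... process-global setup would happen here, once, at the top of main ...
    Facebook(())
}

// An API that requires initialization simply takes the token by value.
pub fn configerator_new(_fb: Facebook, config_path: &str) -> String {
    format!("client for {}", config_path)
}

fn main() {
    let fb = perform_init(); // without this, configerator_new cannot be called
    let client = configerator_new(fb, "some/config");
    assert_eq!(client, "client for some/config");
}
```

The compile-time guarantee replaces a runtime crash with a type error at the call site.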
Reviewed By: Imxset21
Differential Revision: D17277841
fbshipit-source-id: 2fe3096ebcac58bb123149906e7e5d9d9e2da685
Summary: I think these are left over from pre-2018 code where they may have been necessary. In 2018 edition, import paths in `use` always begin with a crate name or `crate`/`super`/`self`, so `use $ident;` always refers to a crate. Since extern crates are always in scope in every module, `use $ident` does nothing.
Reviewed By: Imxset21
Differential Revision: D17290473
fbshipit-source-id: 23d86e5d0dcd5c2d4e53c7a36b4267101dd4b45c
Summary:
That's something we'd like to do for a while - for each request track how many requests it
sends to our storages (i.e. manifold and xdb). That might make perf debugging easier.
There's a concern that it might increase cpu usage, and I'll run a canary to check
if it's the case.
Reviewed By: krallin
Differential Revision: D17091115
fbshipit-source-id: 27fea314241d883ced72d88d39f2e188716a1b9a
Summary:
Previously we had separate load shedding configs for quicksand and
non-quicksand. They were hard to manage because we had to duplicate each load
shedding option for quicksand and non-quicksand. More importantly we had to
duplicate that in the code, and that was error prone.
Now we have a single quicksand_multiplier instead, which is applied to every
load-shedding option if the request is coming from quicksand. Let's use it.
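The single-multiplier scheme can be sketched as follows (struct and field names are illustrative, not the real config types):

```rust
// Sketch: one multiplier applied uniformly to every load-shedding limit when
// the request originates from quicksand, instead of duplicating each limit
// in both the config and the code.
struct LoadShedLimits {
    max_gettreepack: u64,
    max_egress_bytes: u64,
    quicksand_multiplier: f64, // e.g. 0.5 => quicksand gets half the budget
}

impl LoadShedLimits {
    fn limit(&self, base: u64, from_quicksand: bool) -> u64 {
        if from_quicksand {
            (base as f64 * self.quicksand_multiplier) as u64
        } else {
            base
        }
    }
}

fn main() {
    let limits = LoadShedLimits {
        max_gettreepack: 1000,
        max_egress_bytes: 1_000_000,
        quicksand_multiplier: 0.5,
    };
    assert_eq!(limits.limit(limits.max_gettreepack, true), 500);
    assert_eq!(limits.limit(limits.max_egress_bytes, false), 1_000_000);
}
```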
Reviewed By: krallin
Differential Revision: D16986262
fbshipit-source-id: 46a2ed9eb6e591d8c8a70f8d57efb1cb7836db78
Summary:
This is a mechanical part of rename, does not change any commit messages in
tests, does not change the scuba table name/config setting. Those are more
complex.
Reviewed By: krallin
Differential Revision: D16890120
fbshipit-source-id: 966c0066f5e959631995a1abcc7123549f7495b6
Summary: Proper duration types can help catch more bugs at compile time than passing around usize. This moves the Rust Configerator client's timeout argument from usize to a proper duration type.
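The gist of the change, sketched with std types (the real Configerator client signatures are not shown here; these functions are stand-ins):

```rust
use std::time::Duration;

// Before: the unit is invisible at the call site. Is 30 seconds? Milliseconds?
fn get_config_untyped(_path: &str, _timeout: usize) -> &'static str {
    "config"
}

// After: the type carries the unit, so a caller cannot confuse ms with s.
fn get_config(_path: &str, timeout: Duration) -> &'static str {
    assert!(timeout > Duration::ZERO);
    "config"
}

fn main() {
    let _ = get_config_untyped("some/path", 30); // 30 what?
    let c = get_config("some/path", Duration::from_secs(30)); // unambiguous
    assert_eq!(c, "config");
}
```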
Reviewed By: dtolnay
Differential Revision: D16854216
fbshipit-source-id: a067c13a56a1dc8e5bad5c9f6243774fd9e0b462
Summary: Clean up non-test usage of CoreContext::test_mock left over from T37478150
Reviewed By: farnz
Differential Revision: D16804838
fbshipit-source-id: f420b8186557a42e9b6c78437c0fb76c9a343b31
Summary:
If load limiting isn't enabled, then tracking load doesn't really make sense, since we don't have a category to record it into (we end up using an empty category instead, which doesn't feel right). This patch reworks load limiting so that it all happens within a `LoadLimiter` struct, which means we won't record load at all if load limiting is not enabled. This also removes a few string creations that were happening here and there to recreate the same strings over and over.
The main reason I did this is because the first time you bump load, we'll create a connection to Memcache, and for some reason that seems to be pretty slow (this only seems to happen on the first connection, so this might be expected). This makes running Mononoke locally slower since the first request you make to it takes ~2 seconds. I suspect it also makes tests slower (at least those tests that don't run in a network blackhole where presumably establishing a connection fails immediately).
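The shape of the rework can be sketched as: the context holds an `Option<LoadLimiter>`, and bumping load is a no-op when limiting is disabled (types and names here are illustrative simplifications):

```rust
use std::collections::HashMap;

// Illustrative sketch: all load-limiting state lives in one struct, so when
// limiting is disabled (None) nothing is recorded and no Memcache-style
// connection would ever be established.
struct LoadLimiter {
    category: String, // a real category, never an empty placeholder
    counters: HashMap<String, u64>,
}

impl LoadLimiter {
    fn bump(&mut self, metric: &str) {
        *self.counters.entry(metric.to_string()).or_insert(0) += 1;
    }
}

struct Ctx {
    load_limiter: Option<LoadLimiter>,
}

impl Ctx {
    fn bump_load(&mut self, metric: &str) {
        // No limiter configured => load limiting disabled => record nothing.
        if let Some(limiter) = self.load_limiter.as_mut() {
            limiter.bump(metric);
        }
    }
}

fn main() {
    let mut disabled = Ctx { load_limiter: None };
    disabled.bump_load("egress_bytes"); // no-op: no empty-category recording

    let mut enabled = Ctx {
        load_limiter: Some(LoadLimiter {
            category: "prod".to_string(),
            counters: HashMap::new(),
        }),
    };
    enabled.bump_load("egress_bytes");
    assert_eq!(enabled.load_limiter.unwrap().counters["egress_bytes"], 1);
}
```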
Reviewed By: StanislavGlebik
Differential Revision: D16762054
fbshipit-source-id: faeb07e058c63b597916b6379022435e21a13e85
Summary:
If the perfcounters hashmap is empty, we log "{}". Check if
the perf counters are empty before logging to prevent this.
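The fix amounts to a guard like the following, a minimal sketch with a plain HashMap standing in for the perf counters (the serialization here is a hand-rolled stand-in to keep the example std-only):

```rust
use std::collections::HashMap;

// Only serialize and log the counters if there is anything to log; otherwise
// an empty map would be logged as the useless string "{}".
fn extra_context(counters: &HashMap<String, i64>) -> Option<String> {
    if counters.is_empty() {
        return None; // nothing to log: log nothing at all
    }
    // Stand-in for real serialization (e.g. serde_json).
    let mut parts: Vec<String> = counters
        .iter()
        .map(|(k, v)| format!("\"{}\":{}", k, v))
        .collect();
    parts.sort();
    Some(format!("{{{}}}", parts.join(",")))
}

fn main() {
    assert_eq!(extra_context(&HashMap::new()), None);
    let mut c = HashMap::new();
    c.insert("gets".to_string(), 3);
    assert_eq!(extra_context(&c), Some("{\"gets\":3}".to_string()));
}
```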
Reviewed By: krallin
Differential Revision: D16712702
fbshipit-source-id: fecc6c10f5cd0acba2a3b3b96f8d6268660bac0d
Summary: This updates our repo config to allow passing through Filestore params. This will be useful to conditionally enable Filestore chunking for new repos.
Reviewed By: HarveyHunt
Differential Revision: D16580700
fbshipit-source-id: b624bb524f0a939f9ce11f9c2983d49f91df855a
Summary:
Running Mononoke locally makes it look like 'Loading hooks' takes a lot of
time, while in reality it's the skiplists. Let's add logs to make this clearer.
Reviewed By: farnz
Differential Revision: D16460545
fbshipit-source-id: 8fe43d2b4efb8b2cdc13a1363f36350fb11cda0a
Summary:
Improve memory usage on mononoke startup and reduce number of small allocations. Done via:
* Pre-size CHashMap used by SkiplistEdgeMapping, working around the 4x multiplier and lack of load factor awareness in CHashMap::with_capacity
* Add a SingleEdge optimization to SkiplistNodeType so as to save vector overhead in the case of one edge ( this is the common case )
* Size the HashMap in rust thrift deserialization with HashMap::with_capacity
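The single-edge optimization can be sketched as an enum that avoids a Vec's heap allocation in the common one-edge case (types are simplified; the real SkiplistNodeType carries changeset ids and generation numbers):

```rust
// Simplified sketch of the SingleEdge optimization: most skiplist nodes have
// exactly one outgoing edge, so storing it inline avoids the Vec's heap
// allocation and capacity overhead in the common case.
enum SkiplistEdges {
    SingleEdge(u64),  // inline: no heap allocation
    Edges(Vec<u64>),  // rare multi-edge case
}

impl SkiplistEdges {
    fn len(&self) -> usize {
        match self {
            SkiplistEdges::SingleEdge(_) => 1,
            SkiplistEdges::Edges(v) => v.len(),
        }
    }
}

fn main() {
    let common = SkiplistEdges::SingleEdge(42);
    let rare = SkiplistEdges::Edges(vec![1, 2, 3]);
    assert_eq!(common.len(), 1);
    assert_eq!(rare.len(), 3);
}
```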
Reviewed By: krallin
Differential Revision: D16265993
fbshipit-source-id: 99c3a7149493d824a3c00540bc5557410d0273fc
Summary: As above, add the output for load_limit_config to debug formatting.
Reviewed By: StanislavGlebik
Differential Revision: D16357353
fbshipit-source-id: d5ce5aa3801563de33902c7e2216dcee904db76e
Summary:
In earlier diffs in this stack, I updated the callsites that reference XDB tiers to use concrete &str types (which is what they were receiving until now ... but it wasn't spelled out as-is).
In this diff, I'm updating them to use owned `String` instead, which lets us hoist up `to_string()` and `clone()` calls in the stack, rather than pass down references only to copy them later on.
This allows us to skip some unnecessary copies. It turns out we were doing quite a few rounds of "turn this String into a reference, pass it down the stack, then turn it back into a String".
Reviewed By: farnz
Differential Revision: D16260372
fbshipit-source-id: faec402a575833f6555130cccdc04e79ddb8cfef
Summary:
It's used in only a very few places, and most likely that's by accident. We
pass the logger via CoreContext now.
Reviewed By: krallin
Differential Revision: D16336953
fbshipit-source-id: 36ea4678b3c3df448591c606628b93ff834fae45
Summary:
This diff is loosely based on D15851686.
There were a few problems with the previous load shedding api. The biggest
problem was that it used static variables that required explicit
initialization. So if someone forgets to add an init in any tool (say, in
mononoke_admin), and then calls the code that runs `should_throttle`, the
binary would just crash.
The second big problem was that on each `bump_load` call configerator config
was parsing json config over and over again. I didn't measure how expensive it
was, but it looked very counter-intuitive that we put load counters in static
variables (IIRC to avoid allocations), but then we reparse the config on each
call.
To avoid these problems this diff moves ratelimiting logic into CoreContext.
The config is parsed only once on CoreContext creation and all global variables
are now local. Because of these changes `loadlimiter` library is no longer
necessary.
Note that with the load limiter, counters are allocated on each
`bumpLoad/shouldThrottle` call. I'll investigate how expensive that actually is.
One drawback is that it's now trickier to track egressbytes for stderr, so that was removed.
We can add it back later if needed.
Reviewed By: farnz
Differential Revision: D16221607
fbshipit-source-id: f934cb2fc778df1679acc7d6eb2a1a8b4deb0aac
Summary:
Instantiating a new DB connection may require remote calls to be made to e.g. Hipster to allocate a new certificate (this is only the case when connecting to MySQL).
Currently, our bindings to our underlying DB locator make a blocking call to pretend that this operation is synchronous: https://fburl.com/ytmljxkb
This isn't ideal, because this call might actually take time, and we might also occasionally want to retry it (we've had issues in our MySQL tests with acquiring certificates that retrying should resolve). Running this synchronously makes doing so inefficient.
This patch doesn't update that, but it fixes everything on the Rust side of things to stop expecting connections to return a `Result` (and to start expecting a Future instead).
In a follow up diff, I'll work on making the changes in common/rust/sql to start returning a Future here.
Reviewed By: StanislavGlebik
Differential Revision: D16221857
fbshipit-source-id: 263f9237ff9394477c65e455de91b19a9de24a20
Summary:
This config option should be enabled in tests. We can use it to not run prod
services in a non-test environment.
It will be used in the next diff to not load configerator in tests
Reviewed By: krallin
Differential Revision: D16221512
fbshipit-source-id: 7e0ba9c1d46b652a06e0e767de5df78d1671951a
Summary:
This adds support for disabling arbitrary hooks by name using a `--disable-hook` flag. This serves as a replacement for our current approach, which is to muck with configuration when we need to disable a hook on an ad-hoc basis, with mixed results.
There are a couple use cases for which this is useful:
- Running the Hook Tailer.
- Running Mononoke Server locally.
In both cases, you probably don't want the verify integrity and check_unittests hooks enabled, respectively because the former might not be installed on your server (and it'll do the wrong thing for hook tailing anyway), and because the latter requires a token file you're not likely to have.
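The flag boils down to filtering the configured hook list through a set of disabled names, roughly like this (hook names and the helper are illustrative):

```rust
use std::collections::HashSet;

// Sketch: drop any configured hook whose name was passed via --disable-hook,
// instead of editing repo configuration ad hoc.
fn active_hooks(configured: &[&str], disabled: &HashSet<&str>) -> Vec<String> {
    configured
        .iter()
        .filter(|name| !disabled.contains(*name))
        .map(|name| name.to_string())
        .collect()
}

fn main() {
    let configured = ["verify_integrity", "check_unittests", "limit_filesize"];
    // e.g. from: --disable-hook verify_integrity --disable-hook check_unittests
    let disabled: HashSet<&str> =
        ["verify_integrity", "check_unittests"].iter().copied().collect();
    assert_eq!(active_hooks(&configured, &disabled), vec!["limit_filesize"]);
}
```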
Reviewed By: ahornby
Differential Revision: D16201500
fbshipit-source-id: 76404eab72b9335d82235c63003c2425f0ecc480
Summary:
We have problems with expensive gettreepack requests coming from quicksand.
I'll add load shedding, but before I do this I need to find what the limits for
load shedding are.
Let's use ods for that
Reviewed By: ikostia
Differential Revision: D16182208
fbshipit-source-id: 70367c876fb392f368583442ad1e45db593f9e0b
Summary:
Report to Scuba whenever someone tries to access a blobstore which is blacklisted. Scuba reporting is done for any `get` or `put` method call.
Because of the possible overload - given the high number of requests Mononoke receives, and that CensoredBlobstore makes the verification before we add the caching layer for blobstores - I considered reporting at most one bad request per second. If multiple requests to blacklisted blobstores are made in less than one second, only the first request should be reported. Again, this is not the best approach (since it does not report all of them), but performance-wise it is the best solution.
NOTE: I also wrote an implementation using `RwLock` (instead of the current `AtomicI64`), but atomic variables should be faster than locks, so I gave up on that idea.
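The once-per-second throttle can be sketched with a single atomic timestamp and a compare-and-swap, which keeps the hot path lock-free (this is a simplification of the approach described; the Scuba call itself is not shown, and the names are illustrative):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Sketch: report at most one blacklisted-blobstore access per second.
// One atomic holds the epoch second of the last report; compare_exchange
// ensures only one thread per second "wins" the right to report.
struct ThrottledReporter {
    last_report_secs: AtomicI64,
}

impl ThrottledReporter {
    fn new() -> Self {
        ThrottledReporter { last_report_secs: AtomicI64::new(0) }
    }

    /// Deterministic core, taking the clock as a parameter for testability.
    fn should_report_at(&self, now_secs: i64) -> bool {
        let last = self.last_report_secs.load(Ordering::Relaxed);
        now_secs > last
            && self
                .last_report_secs
                .compare_exchange(last, now_secs, Ordering::Relaxed, Ordering::Relaxed)
                .is_ok()
    }

    /// Entry point using the real clock.
    fn should_report(&self) -> bool {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_secs() as i64;
        self.should_report_at(now)
    }
}

fn main() {
    let reporter = ThrottledReporter::new();
    assert!(reporter.should_report_at(100)); // first access in second 100: reported
    assert!(!reporter.should_report_at(100)); // same second: dropped
    assert!(reporter.should_report_at(101)); // next second: reported again
    let _ = reporter.should_report(); // real-clock variant
}
```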
Reviewed By: ikostia, StanislavGlebik
Differential Revision: D16108456
fbshipit-source-id: 9e5338c50a1c7d15f823a2b8af177ffdb99e399f
Summary:
We want to decrease memory and cpu usage for gettreepack requests. We have a
few parallel efforts to fix the problem, and having a common benchmark will
help us coordinate.
It's super simple - it runs a function once to warm up a cache, then performs the
runs in parallel. Running the benchmark with `env time -v` can track memory usage.
Differential Revision: D16120018
fbshipit-source-id: 2e13de92b1316fac5c7b86a88f11122ba166d210
Summary:
Added an option to control whether censoring should be enabled or
disabled for each repository. The option is added in `server.toml` as `censoring` and
is set to true or false. If `censoring` is not specified, the default
is true (censoring is enabled).
By disabling `censoring`, the check of whether a key is blacklisted is
omitted, therefore all files are fetchable.
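In `server.toml` terms, the option would look something like this (exact placement within the file is illustrative):

```toml
# Per-repo censoring switch; defaults to true when omitted.
censoring = false
```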
Reviewed By: ikostia
Differential Revision: D16029509
fbshipit-source-id: e9822c917fbcec3b3683d0e3619d0ef340a44926
Summary: Blob repo waits for MyRouter now, so we don't need a separate myrouter wait before going into blobrepo.
Reviewed By: farnz
Differential Revision: D15941643
fbshipit-source-id: 408919e76f7fa2577a5192cded44e11bc97de840
Summary:
Currently, we implicitly expect that caching is enabled if we're dealing with a remote repository, but that means cachelib must be enabled when running with a remote repository, and that is ... slow.
This can be problematic in two cases:
In tests. It makes MySQL tests unbearably slow, and a little more flaky because we end up using so much CPU. With this patch, MySQL tests remain slower than SQLite tests, but by a factor of < 2, which is a pretty substantial improvement.
Running trivial administrative commands (e.g. `mononoke_admin`), notably using a dev build (which is right now unbearably slow). With this patch, such a trivial command is about 6x faster:
```
[torozco@devvm4998.lla1 ~/fbcode] time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD --skip-caching bookmarks list --kind publishing
Jun 21 08:57:36.658 INFO using repo "instagram-server" repoid RepositoryId(2102)
master c96ac4654e4d2da45a9597af859adeac9dba3d7ca964cb42e5c96bc153f185e3 2c5713ad27262b91bf1dfaf21b3cf34fe3926c8d
real 0m5.299s
user 0m5.097s
sys 0m0.699s
[torozco@devvm4998.lla1 ~/fbcode] time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
I0621 08:57:59.299988 1181997 CacheAllocator-inl.h:3123] Started worker 'PoolRebalancer'
Jun 21 08:57:59.328 INFO using repo "instagram-server" repoid RepositoryId(2102)
master c96ac4654e4d2da45a9597af859adeac9dba3d7ca964cb42e5c96bc153f185e3 2c5713ad27262b91bf1dfaf21b3cf34fe3926c8d
real 0m28.620s
user 0m27.680s
sys 0m2.466s
```
This is also nice because it means the MySQL tests won't talk to Memcache anymore.
---
Note: in this refactor, I made `Caching` an enum so it can't accidentally be swapped with some other boolean.
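The enum mentioned above is the classic fix for boolean-parameter mixups; a minimal sketch (function body is illustrative):

```rust
// Sketch: an explicit enum cannot be accidentally swapped with an unrelated
// boolean at a call site the way positional `bool` arguments can.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Caching {
    Enabled,
    Disabled,
}

fn open_repo(caching: Caching) -> &'static str {
    match caching {
        Caching::Enabled => "with cachelib",
        Caching::Disabled => "skip-caching fast path",
    }
}

fn main() {
    // open_repo(true) would compile with a bool parameter even if the caller
    // meant some other flag; Caching::Disabled is unambiguous.
    assert_eq!(open_repo(Caching::Disabled), "skip-caching fast path");
    assert_eq!(open_repo(Caching::Enabled), "with cachelib");
}
```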
---
Finally, it also uses up quite a bit less RAM (we no longer need 2GB of RAM to output one line of bookmarks — although we're still using quite a bit!):
```
[torozco@devvm4998.lla1 ~/fbcode] env time buck-out/gen/scm/mononoke/admin#binary/admin --skip-caching --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
Jun 21 09:18:36.074 INFO using repo "instagram-server" repoid RepositoryId(2102)
master abdd2f78dafeaa8d4b96897955a63844b31324f9d89176b3a62088d0e2ae2b22 1702392d14bf7a332bf081518cb1ea3c83a13c39
5.08user 0.68system 0:05.28elapsed 109%CPU (0avgtext+0avgdata 728024maxresident)k
6776inputs+0outputs (8major+115477minor)pagefaults 0swaps
[torozco@devvm4998.lla1 ~/fbcode] env time buck-out/gen/scm/mononoke/admin#binary/admin --repo-id 2102 --mononoke-config-path /home/torozco/local/.mononoke_exec/config/PROD bookmarks list --kind publishing
I0621 09:19:01.385933 1244489 CacheAllocator-inl.h:3123] Started worker 'PoolRebalancer'
Jun 21 09:19:01.412 INFO using repo "instagram-server" repoid RepositoryId(2102)
master abdd2f78dafeaa8d4b96897955a63844b31324f9d89176b3a62088d0e2ae2b22 1702392d14bf7a332bf081518cb1ea3c83a13c39
26.96user 2.27system 0:27.93elapsed 104%CPU (0avgtext+0avgdata 2317716maxresident)k
11416inputs+5384outputs (17major+605118minor)pagefaults 0swaps
```
Reviewed By: farnz
Differential Revision: D15941644
fbshipit-source-id: 0df4a74ccd0220a786ebf0e883e1a9b8aab0a647
Summary:
This updates the mononoke server code to support booting without myrouter. This required 2 changes:
- There were a few callsites where we didn't handle not having a myrouter port.
- In our function that waits for myrouter, we were failing if we had no myrouter port, but that's not desirable: if we don't have a myrouter port, we simply don't need to wait.
Arguably, this isn't 100% complete yet. Notably, RepoReadWriteStatus still requires myrouter. I'm planning to create a bootcamp task for this since it's not blocking my work adding integration tests, but it would be a nice-to-have.
Speaking of further refactoring, it'd be nice if we supported a `SqlConstructors::with_xdb` function that did this matching for us, so we didn't have to duplicate it all over the place. I'm also planning to bootcamp this.
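The "no port means nothing to wait for" logic reduces to something like this (the real function is future-based; this sketch is synchronous, and the names and the port value are illustrative):

```rust
// Sketch: if no myrouter port was provided, there is no myrouter to wait
// for, so readiness is immediate instead of an error.
fn myrouter_ready(port: Option<u16>) -> Result<(), String> {
    match port {
        // Not using myrouter at all: nothing to wait for.
        None => Ok(()),
        Some(p) => wait_until_listening(p),
    }
}

// Stand-in for the real "poll until myrouter accepts connections" logic.
fn wait_until_listening(_port: u16) -> Result<(), String> {
    Ok(())
}

fn main() {
    assert!(myrouter_ready(None).is_ok()); // previously this case failed
    assert!(myrouter_ready(Some(6030)).is_ok());
}
```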
Reviewed By: farnz
Differential Revision: D15855431
fbshipit-source-id: 96187d887c467abd48ac605c28b826d8bf09862b
Summary:
By saving bookmarks when the client calls `heads` (and thus is starting discovery), we can fix the race that stopped us supporting `listkeys` as a bundle2 part.
This saves about 1 second per no-op pull on my devvm - I'm hoping for more improvement on Sandcastle.
Reviewed By: StanislavGlebik
Differential Revision: D15625211
fbshipit-source-id: 47e59848dff56fcf9d893ee3b3c329d69883a57e
Summary:
1. Add a `myrouter_ready` function to common/sql_ext/sr/lib.rs.
2. Refactor main.rs and repo_handlers.rs to use the new function.
Reviewed By: ikostia
Differential Revision: D15623501
fbshipit-source-id: 7b9d6c5fd7c33845148dfacefbcf1bf3c6afaa5d
Summary:
A large portion of Mononoke health/performance is dominated by files being kept
in memory. There is a direct link between the traffic we're sending out to clients
and how much memory we have in use. Therefore this is a potentially effective
counter to limit against.
Reviewed By: krallin
Differential Revision: D15664331
fbshipit-source-id: f7ffaa9604d51d337bb05237dee5d8a3e43ce7ad
Summary:
In order to be able to use load limiting, our extension needs to be initialized
to receive the proper configuration.
Reviewed By: krallin
Differential Revision: D15664332
fbshipit-source-id: 987adf9183f53cc0d6b847ad782ff31851503721
Summary:
This is just a boilerplate, giving `resolve` access to the config option,
which controls whether we should allow pure pushes. This option will be
enabled by default and we'll disable it explicitly for repos where it makes
no sense.
Reviewed By: StanislavGlebik
Differential Revision: D15520450
fbshipit-source-id: b1bb913c14a6aac6aa68ed8b8a7b4ff270da1688
Summary:
This adds a sanity check that limits the count of matches in `list_all_bookmarks_with_prefix`.
If we find more matches than the limit, then an error will be returned (right now, we don't have support for e.g. offsets in this functionality, so the only alternative approach is for the caller to retry with a more specific pattern).
The underlying goal is to ensure that we don't trivially expose Mononoke to accidental denial of service when a caller lists `*` and we end up querying literally all bookmarks.
I picked a fairly conservative limit here (500,000), which is > 5 times the number of bookmarks we currently have (we can load what we have right now successfully... but it's pretty slow).
Note that listing pull default bookmarks is not affected by this limit: this limit is only used when our query includes scratch bookmarks.
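The sanity check amounts to capping the result count and erroring past the limit, sketched here over an iterator of matches (the real code works against SQL queries; names are illustrative):

```rust
// Sketch: take at most `limit` matches; if even one more exists, the pattern
// was too broad and the caller should retry with a more specific one.
fn list_with_limit<I: Iterator<Item = String>>(
    matches: I,
    limit: usize,
) -> Result<Vec<String>, String> {
    // Pull limit + 1 items so we can detect overflow without counting them all.
    let collected: Vec<String> = matches.take(limit + 1).collect();
    if collected.len() > limit {
        return Err(format!("bookmark query matched more than {} bookmarks", limit));
    }
    Ok(collected)
}

fn main() {
    let few = vec!["heads/a".to_string(), "heads/b".to_string()];
    assert_eq!(list_with_limit(few.into_iter(), 500_000).unwrap().len(), 2);

    let too_many = (0..4).map(|i| format!("scratch/{}", i));
    assert!(list_with_limit(too_many, 3).is_err());
}
```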
Reviewed By: StanislavGlebik
Differential Revision: D15413620
fbshipit-source-id: 1030204010d78a53372049ff282470cdc8187820
Summary:
This updates our receive path for B2xInfinitepush to create new scratch bookmarks.
Those scratch bookmarks will:
- Be non-publishing.
- Be non-pull-default.
- Not be replicated to Mercurial (there is no entry in the update log).
I added a sanity check on infinite pushes to validate that bookmarks fall within a given namespace (which is represented as a Regexp in configuration). We'll want to determine whether this is a good mechanism and what the regexp for this should be prior to landing (I'm also considering adding a soft-block mode that would just ignore the push instead of blocking it).
This ensures that someone cannot accidentally perform an infinitepush onto master by tweaking their client-side configuration.
---
Note that, as of this diff, we do not support the B2xInfinitepushBookmarks part (i.e. backup bookmarks). We might do that separately later, but if we do, it won't be through scratch bookmarks (we have too many backup bookmarks for this to work).
Reviewed By: StanislavGlebik
Differential Revision: D15364677
fbshipit-source-id: 23e67d4c3138716c791bb8050459698f8b721277