Summary: Write mostly stores are often in the process of being populated. Add an option to control whether scrub errors are raised for missing values in write mostly stores.
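A minimal sketch of what such an option could look like. The enum and function names here are hypothetical (they are not taken from the diff), and the rule that missing values in normal stores are always errors is an assumption:

```rust
/// Hypothetical option: how scrub treats a value missing from a write mostly store.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WriteMostlyMissing {
    /// Raise a scrub error for the missing value.
    Error,
    /// Do not raise an error: the store is probably still being populated.
    Ignore,
}

/// The kind of store a value was found to be missing from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StoreKind {
    Normal,
    WriteMostly,
}

/// Returns true if a value missing from a store of `kind` should be raised
/// as a scrub error under the given policy.
pub fn is_scrub_error(kind: StoreKind, policy: WriteMostlyMissing) -> bool {
    match (kind, policy) {
        // Assumption: a missing value in a normal store is always an error.
        (StoreKind::Normal, _) => true,
        (StoreKind::WriteMostly, WriteMostlyMissing::Error) => true,
        (StoreKind::WriteMostly, WriteMostlyMissing::Ignore) => false,
    }
}
```

With `WriteMostlyMissing::Ignore`, only gaps in normal stores are reported.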
Differential Revision: D28393689
fbshipit-source-id: dfc371dcc3b591beadead82608a747958b53f580
Summary:
This is going to enable the background update in SegmentedChangelog to log
entries to Scuba.
The scuba sample builder is not fundamentally different than other elements of
the environment. It is used slightly differently to, for example, Logger,
because it has to be cloned in all places that want to log rows, but otherwise it
has the same characteristics.
Reviewed By: krallin
Differential Revision: D28210008
fbshipit-source-id: 68468868d13f29dddf21095bd7526cb4ff690786
Summary: The upstream crate has landed my PR for zstd 1.4.9 support and made a release, so we can remove this patch now.
Reviewed By: ikostia
Differential Revision: D28221163
fbshipit-source-id: b95a6bee4f0c8d11f495dc17b2737c9ac9142b36
Summary:
Right now we write straight to a logger with no filter, so no matter the log
level we print this stuff out. Let's fix it.
While we're at it, move this back to debug level.
I'd made this trace in my recent cmdlib refactoring (which resulted in us
properly initializing logging in all binaries), since I assumed we just had level
filtering working but with debug-logging enabled and I didn't want to have to
update every single test, but it turns out that the reason we didn't print it
out at trace is just because that's not enabled at all in our slog build:
D28097080.
Reviewed By: StanislavGlebik
Differential Revision: D28116053
fbshipit-source-id: f59d9a70ea3c3d834adea16f2686bfc244672b14
Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.
For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode.
However, this is only used for servers and only when accepting websocket
connections, and those users are just using Hyper as a HTTP client.
Reviewed By: farnz
Differential Revision: D28091331
fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.
Reviewed By: farnz
Differential Revision: D28091091
fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
Summary:
NOTE: there is one final pre-requisite here, which is that we should default all Mononoke binaries to `--use-mysql-client` because the other SQL client implementations will break once this lands. That said, this is probably the right time to start reviewing.
There's a lot going on here, but Tokio updates being what they are, it has to happen as just one diff (though I did try to minimize churn by modernizing a bunch of stuff in earlier diffs).
Here's a detailed list of what is going on:
- I had to add a number of `cargo_toml_dir` entries for binaries in `eden/mononoke/TARGETS`, because we have to use 2 versions of Bytes concurrently at this time, and the two cannot co-exist in the same Cargo workspace.
- Lots of little Tokio changes:
- Stream abstractions moving to `tokio-stream`
- `tokio::time::delay_for` became `tokio::time::sleep`
- `tokio::sync::watch::Sender::broadcast` became `tokio::sync::watch::Sender::send`
- `tokio::sync::Semaphore::acquire` returns a `Result` now.
- `tokio::runtime::Runtime::block_on` no longer takes a `&mut self` (just a `&self`).
- `Notify` grew a few more methods with different semantics. We only use this in tests, I used what seemed logical given the use case.
- Runtime builders have changed quite a bit:
- My `no_coop` patch is gone in Tokio 1.x, but it has a new `tokio::task::unconstrained` wrapper (also from me), which I included on `MononokeApi::new`.
- Tokio now detects your logical CPUs, not physical CPUs, so we no longer need to use `num_cpus::get()` to figure it out.
- Tokio 1.x now uses Bytes 1.x:
- At the edges (i.e. streams returned to Hyper or emitted by RepoClient), we need to return Bytes 1.x. However, internally we still use Bytes 0.5 in some places (notably: Filestore).
- In LFS, this means we make a copy. We used to do that a while ago anyway (in the other direction) and it was never a meaningful CPU cost, so I think this is fine.
- In Mononoke Server it doesn't really matter because that still generates ... Bytes 0.1 anyway, so there was a copy before from 0.1 to 0.5 and now it's from 0.1 to 1.x.
- In the very few places where we read stuff using Tokio from the outside world (historical import tools for LFS), we copy.
- tokio-tls changed a lot, they removed all the convenience methods around connecting. This resulted in updates to:
- How we listen in Mononoke Server & LFS
- How we connect in hgcli.
- Note: all this stuff has test coverage.
- The child process API changed a little bit. We used to have a ChildWrapper around the hg sync job to make a Tokio 0.2.x child look more like a Tokio 1.x Child, so now we can just remove this.
- Hyper changed their Websocket upgrade mechanism (you now need the whole `Request` to upgrade, whereas before that you needed just the `Body`), so I changed up our code a little bit in Mononoke's HTTP acceptor to defer splitting up the `Request` into parts until after we know whether we plan to upgrade it.
- I removed the MySQL tests that didn't use mysql client, because we're leaving that behind and don't intend to support it on Tokio 1.x.
Reviewed By: mitrandir77
Differential Revision: D26669620
fbshipit-source-id: acb6aff92e7f70a7a43f32cf758f252f330e60c9
Summary:
MyRouter is no longer used by Mononoke services. It is deprecated and will stop working when we upgrade Tokio.
This diff removes MyRouter support from Mononoke and simplifies the Mysql connection type struct.
Before, we had `MysqlOptions` and the `MysqlConnectionType` enum to represent what kind of client we want to use. Now we use only MySQL FFI, so I removed `MysqlConnectionType` completely and put everything into the options struct.
As setting up the connections (aka conn pool) is not an async operation, some of the methods don't need to be async anymore. Because this diff is already enormous, I'm refactoring this in the next one.
Reviewed By: StanislavGlebik
Differential Revision: D28007850
fbshipit-source-id: 32c3740f4bb132f06e1e256b0530ace755446cdd
Summary:
This change makes it so that our binaries do not instantiate a real configo
client in integration test setup.
Reviewed By: ahornby
Differential Revision: D28026790
fbshipit-source-id: 0fb9ce66a1324e845f4b8a80d4479263ec6e4ee1
Summary:
Update the zstd crates.
This also patches async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 to update to zstd 1.4.9 can land.
Reviewed By: jsgf, dtolnay
Differential Revision: D27942174
fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
Summary:
This diff makes the MySQL FFI client the default option for a MySQL connection. It means that if no arguments are provided, the MySQL FFI client is used. The `--use-mysql-client` option is still accepted, as it is used in the configs, and will be removed a bit later.
I also removed raw connections as a way to connect to MySQL from Mononoke, as it is no longer used. Although I had to keep some `sql_ext` API for now because other projects rely on it.
(I talked to the teams and they are willing to switch to the new client as well. I'm helping where it's possible to replace these raw xdb conns.)
Reviewed By: krallin
Differential Revision: D27925435
fbshipit-source-id: 4f08eef07df676a4e6be58b6e351be3e3d3e8ab7
Summary:
Cargo builds run multiple tests in the same process, which was causing a failure since MononokeMatches started doing the one-time log::set_boxed_logger() call.
I made MononokeMatches::new() return an error rather than panic in this case, as it was already returning a Result anyway.
The failing test was only testing a very small section of code in block_execute, so this change removes it (it's covered by an integration test). It also gave some MononokeMatches coverage as a side effect (also covered by an integration test).
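To illustrate the error-instead-of-panic behaviour, here is a small sketch using only the standard library. `GLOBAL_LOGGER` and `init_logger` are hypothetical stand-ins, not the actual MononokeMatches code, and a `String` stands in for the real logger:

```rust
use std::sync::OnceLock;

// Stand-in for a process-wide logger slot that may only be set once.
static GLOBAL_LOGGER: OnceLock<String> = OnceLock::new();

/// Hypothetical initializer: installs the global logger, and returns an
/// error (instead of panicking) if one was already installed.
pub fn init_logger(name: &str) -> Result<(), String> {
    GLOBAL_LOGGER
        .set(name.to_string())
        .map_err(|_| format!("logger already initialized, cannot install {:?}", name))
}
```

A second call in the same process (as happens when Cargo runs several tests together) gets an `Err` that the caller can handle.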
Reviewed By: krallin
Differential Revision: D27891187
fbshipit-source-id: 9787029b610040cbf0125ab79748d6a3e540d2ae
Summary:
After doing some local benchmarking (using MononokeApi instantiation as the
benchmark), one thing that's apparent is that we have quite a few parameters
here and that tuning them is likely to be a challenge.
One parameter in particular is the batch "objective", which controls how many
requests we want to see in the last batching interval before we choose to
batch (this is `rendezvous_dispatch_min_threshold`).
The problem with this is that there is no good, real-world metric to
set it based on. This is in contrast to the other parameters we have, which do
have some reasonable metric to compare to:
- rendezvous_dispatch_delay_ms: this is overhead we add to queries, so it
should be small & on the order of query execution latency (i.e. a few ms).
- rendezvous_dispatch_max_threshold: this controls how big our batches get, so
it should be on the order of what makes a SQL query too big (i.e. less than
a hundred records).
In contrast, we want to set `rendezvous_dispatch_min_threshold` such that
batching kicks in before we start using too many concurrent connections (which
is what query batching seeks to reduce), but the problem is that those two
numbers aren't directly connected. One clear problem, for example, is that if
our DB is in-region vs. out-of-region, then for a given query execution time,
and a desired concurrency level before batching kicks in, we'd need different
values of `rendezvous_dispatch_min_threshold` (it would have to kick in faster
for the out-of-region workload).
So, this diff updates rendezvous to actually track concurrent connection count
before we force batching. This is the actual metric we care about here, and it
has pretty natural "real world" values we can look at to decide where to set
it (our connection pool, which is limited to 100 concurrent connections, and
our open connection baseline).
Note: I set this at 5 because that's more or less what servers look like
outside of spikes for Bonsai hg mapping, and for Changesets, where I'm planning to
introduce this in the future:
- bonsai: https://fburl.com/ods/6d4a9qb5
- changesets: https://fburl.com/ods/kuq5x1vw (note: to make sense of this,
focus on just one server, otherwise the constant spikes we get sort of hide
the big picture).
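As a rough sketch of the decision described above (all names here are hypothetical and are not the real rendezvous options): requests dispatch immediately while the concurrent connection count is below the allowance, and are batched once it is reached, with batches capped at a maximum size.

```rust
/// Hypothetical tuning knobs; the field names are illustrative only.
pub struct BatchingOptions {
    /// Concurrent connections allowed before batching is forced.
    pub free_connections: usize,
    /// Upper bound on the number of keys in one batched query.
    pub max_batch_size: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Dispatch {
    /// Send the query right away on its own connection.
    Immediate,
    /// Queue the query so it can be merged with others.
    Batched,
}

/// Chooses how to dispatch a request given the current connection count.
pub fn choose_dispatch(in_flight: usize, opts: &BatchingOptions) -> Dispatch {
    if in_flight < opts.free_connections {
        Dispatch::Immediate
    } else {
        Dispatch::Batched
    }
}

/// Splits queued keys into batches no larger than `max_batch_size`.
pub fn split_batches(keys: &[u64], opts: &BatchingOptions) -> Vec<Vec<u64>> {
    // `chunks` panics on a size of zero, so clamp to at least one.
    keys.chunks(opts.max_batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With `free_connections` set to 5, the sixth concurrent request is the first one that gets batched.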
Reviewed By: farnz
Differential Revision: D27792603
fbshipit-source-id: 1a9189f6b50d48444b3373bd1cb14dc51b85a6d2
Summary:
Like it says in the title. There's no reason for this to be an ad-hoc "throw in
an arg" when everything else is done by adding arg types.
Reviewed By: HarveyHunt
Differential Revision: D27791333
fbshipit-source-id: 38e5a479800179b249ace5cc599340cb84eb53e2
Summary:
Like it says in the title. Let's remove ad-hoc "add an arg then look up the arg"
mechanisms like this one.
Reviewed By: HarveyHunt
Differential Revision: D27791334
fbshipit-source-id: 257cea7763ab5130525ad739fe4ebdda4e8bfeb6
Summary:
This module is way too big and bundles many different functions:
- Our app builder
- Our matches object and environment initialization
- A bunch of utility functions
Let's split it up
Reviewed By: HarveyHunt
Differential Revision: D27790730
fbshipit-source-id: 8353b18a28fde5267d03ba0342c8cb98ad855e37
Summary:
This isn't useful anymore. Let's ask our MononokeMatches what is set up for
caching instead of parsing the args one more time.
Reviewed By: HarveyHunt
Differential Revision: D27767697
fbshipit-source-id: 9da83769284a4aed4a96cd0eb212f42dd01ade87
Summary:
There is a very frustrating operation that happens often when working on the
Mononoke code base:
- You want to add a flag
- You want to consume it in the repo somewhere
Unfortunately, when we need to do this, we end up having to thread this from a
million places and parse it out in every single main() we have.
This is a mess, and it results in every single Mononoke binary starting with
heaps of useless boilerplate:
```
let matches = app.get_matches();
let (caching, logger, mut runtime) = matches.init_mononoke(fb)?;
let config_store = args::init_config_store(fb, &logger, &matches)?;
let mysql_options = args::parse_mysql_options(&matches);
let blobstore_options = args::parse_blobstore_options(&matches)?;
let readonly_storage = args::parse_readonly_storage(&matches);
```
So, this diff updates us to just use MononokeEnvironment directly in
RepoFactory, which means none of that has to happen: we can now add a flag,
parse it into MononokeEnvironment, and get going.
While we're at it, we can also remove blobstore options and all that jazz from
MononokeApiEnvironment since now it's there in the underlying RepoFactory.
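The shape of this change can be sketched roughly as follows. This is a simplified illustration, not the real types: the `Environment` fields, the `--blobstore-put-retries` flag and `describe_repo` are all hypothetical.

```rust
use std::sync::Arc;

/// Hypothetical, trimmed-down environment, parsed once from the command line.
#[derive(Debug, Default)]
pub struct Environment {
    pub readonly_storage: bool,
    /// Hypothetical newly added flag.
    pub blobstore_put_retries: u32,
}

impl Environment {
    /// Parses flags of the form `--name` or `--name=value`.
    pub fn from_args(args: &[&str]) -> Self {
        let mut env = Environment::default();
        for arg in args {
            if *arg == "--readonly-storage" {
                env.readonly_storage = true;
            } else if let Some(value) = arg.strip_prefix("--blobstore-put-retries=") {
                // Unparseable values fall back to 0 in this sketch.
                env.blobstore_put_retries = value.parse().unwrap_or(0);
            }
        }
        env
    }
}

/// The factory holds the whole environment instead of individual options.
pub struct RepoFactory {
    env: Arc<Environment>,
}

impl RepoFactory {
    pub fn new(env: Arc<Environment>) -> Self {
        Self { env }
    }

    /// Any option on the environment is readable here with no extra plumbing.
    pub fn describe_repo(&self, name: &str) -> String {
        format!(
            "{} (readonly={}, put_retries={})",
            name, self.env.readonly_storage, self.env.blobstore_put_retries
        )
    }
}
```

Adding a flag then touches only the environment parsing and the place that consumes it, rather than every `main()`.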
Reviewed By: HarveyHunt
Differential Revision: D27767700
fbshipit-source-id: e1e359bf403b4d3d7b36e5f670aa1a7dd4f1d209
Summary:
Basically every single Mononoke binary starts with the same preamble:
- Init mononoke
- Init caching
- Init logging
- Init tunables
Some of them forget to do it, some don't, etc. This is a mess.
To make things messier, our initialization consists of a bunch of lazy statics
interacting with each other (init logging & init configerator are kinda
intertwined due to the fact that configerator wants a logger but dynamic
observability wants a logger), and methods you must only call once.
This diff attempts to clean this up by moving all this initialization into the
construction of MononokeMatches. I didn't change all the accessor methods
(though I did update those that would otherwise return things instantiated at
startup).
I'm planning to do a bit more on top of this, as my actual goal here is to make
it easier to thread arguments from MononokeMatches to RepoFactory, and to do so
I'd like to just pass my MononokeEnvironment as an input to RepoFactory.
Reviewed By: HarveyHunt
Differential Revision: D27767698
fbshipit-source-id: 00d66b07b8c69f072b92d3d3919393300dd7a392
Summary: We have deprecated it in favor of an argument that takes a boolean value.
Reviewed By: farnz
Differential Revision: D27709429
fbshipit-source-id: 45e9569188f2e9d017f1c5bf61f7c61bc0e5318a
Summary: SQLBlob doesn't benefit from sharing a pool with other MySQL users, but does benefit from more aggressive connection timeouts. Give it its own pool, which we can tweak later.
Reviewed By: krallin
Differential Revision: D27651133
fbshipit-source-id: 8f5216ec0506b217f9365babfe1ebac00f68a9a9
Summary: Use `RepoFactory` to construct repositories for all users of `cmdlib`.
Reviewed By: krallin
Differential Revision: D27363471
fbshipit-source-id: c9a483b41709fd90406c6600936671bf9ba61625
Summary: We have support for backup-repo-id, but tw blobimport doesn't have an id and only has the source repo name to use. Let's add support similar to the other repo-id/source-repo-id etc.
Reviewed By: StanislavGlebik
Differential Revision: D27325583
fbshipit-source-id: 44b5ec7f99005355b8eaa4c066cb7168ec858049
Summary: It's no longer in use, so remove it completely to simplify the code
Reviewed By: ahornby
Differential Revision: D27234699
fbshipit-source-id: ec26f4e283d48b05e19b951b8485ca6fa7751072
Summary: We have new config fields available that can specify default compression level, let's read and use them.
Reviewed By: StanislavGlebik
Differential Revision: D27127455
fbshipit-source-id: 27935fd58da5f1150c9caf56d9601c37f2ae3581
Summary: All important jobs (SCS Server, LFS Server, Mononoke Server, derived data) have switched successfully. Roll up anything that's been missed by switching the default and letting contbuild take care of it
Reviewed By: krallin
Differential Revision: D26980991
fbshipit-source-id: 2c9f7cd56c38e9e1a2f8374c76141e7a99c88a2a
Summary:
Atm in tests a separate ConfigStore with file source is created for some configs and then a reference to it is dropped immediately ([see get_config_handle function in mod.rs](https://fburl.com/diffusion/fpkj7ekv)). This is uncomfortable as we may need a reference to e.g. force update configs in tests.
Instead of creating separate stores we can reuse static configerator which already uses local files (in tests).
Reviewed By: krallin
Differential Revision: D26725515
fbshipit-source-id: 24269cd93b7d35216c025807c3f3eb527688b72b
Summary: We have a number of sleeps in our integration tests. The two main reasons are configs & tunables that need reloading. Currently, we have no way of force-reloading those.
Reviewed By: krallin
Differential Revision: D26615732
fbshipit-source-id: 217c4ae039abd398972b4a9764d08e18d6182493
Summary:
I accidentally broke this in D26544410 (097e4ad00c) when I updated it to use
schedule_stats_aggregation_preview (in Tokio 0.2 and up, you can't create
an interval stream off the runtime, but in Tokio 0.1 you could).
We only use this method in 2 places, so it probably makes sense to just get rid
of it anyway, which is what this diff does. The alternative is better as it
spawns this unconditionally, so if we get it wrong, it'll fail in tests,
even though our tests don't pass `--fb303-port`, whereas
`start_fb303_and_stats_agg` will only start stats aggregation if it's passed.
Reviewed By: ahornby
Differential Revision: D26690223
fbshipit-source-id: 7d151a3c46fa428f00ac32601da161609fb498f7
Summary:
For dependencies, V2 puts "version" as the first attribute of the dependency, or just after "package" if present.
In V2 the workspace section comes after the patch section. Because V2 autoformats the patch section while V1 takes it as is, the manual entries in third-party/rust/Cargo.toml had to be formatted by hand.
The thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that have been incorrectly left when the corresponding Cargo.toml was removed.
Reviewed By: ikostia
Differential Revision: D26618363
fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92
Summary:
The on demand update code we have is the most basic logic that we could have.
The main problem is that it has long and redundant write locks. This change
reduces the write lock strictly to the section that has to update the in memory
IdDag.
Updating the Dag has 3 phases:
* loading the data that is required for the update;
* updating the IdMap;
* updating the IdDag;
The Dag can function well for serving requests as long as the commits involved
have been built so we want to have easy read access to both the IdMap and the
IdDag. The IdMap is a very simple structure and because it's described as an
Arc<dyn IdMap> we push the update locking logic to the storage. The IdDag is a
complicated structure that we ask to update itself. Those functions take
mutable references. Updating the storage of the iddag to hide the complexities
of locking is more difficult. We deal with the IdDag directly by wrapping it in
a RwLock. The RwLock allows for easy read access which we expect to be the
predominant access pattern.
Updates to the dag are not completely stable so racing updates can have
conflicting results. In case of conflicts one of the update processes would have
to restart. It's easier to reason about the process if we just allow one
"thread" to start an update process. The update process is locked by a sync
mutex. The "threads" that fail the race to update are asked to wait until the
ongoing update is complete. The waiters will poll on a shared future that
tracks the ongoing dag update. After the update is complete the waiters will go
back to checking if the data they have is available in the dag. It is possible
that the dag is updated in between determining that an update is needed and
acquiring the ongoing_update lock. This is fine because the update building
process checks the state of the dag before updating it and updates only what is
necessary, if anything.
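The locking scheme can be sketched with the standard library alone. This is a simplified stand-in, not the actual SegmentedChangelog code: a single `u64` replaces the IdMap and IdDag, and waiters block on the mutex here rather than polling a shared future.

```rust
use std::sync::{Mutex, RwLock};

/// Stand-in for the in-memory dag: tracks the highest commit id built so far.
pub struct OnDemandDag {
    /// Read-mostly state; the write lock is held only while applying an update.
    built_up_to: RwLock<u64>,
    /// Serializes update processes so only one runs at a time.
    ongoing_update: Mutex<()>,
}

impl OnDemandDag {
    pub fn new() -> Self {
        Self {
            built_up_to: RwLock::new(0),
            ongoing_update: Mutex::new(()),
        }
    }

    /// Cheap read access, which is the predominant access pattern.
    pub fn contains(&self, id: u64) -> bool {
        *self.built_up_to.read().unwrap() >= id
    }

    /// Ensures `id` is built. Returns true if this call performed the update.
    pub fn ensure_built(&self, id: u64) -> bool {
        if self.contains(id) {
            return false;
        }
        // Only one caller runs the update; the others wait here until it is done.
        let _guard = self.ongoing_update.lock().unwrap();
        // Re-check: the dag may have been updated while we were waiting.
        if self.contains(id) {
            return false;
        }
        // Loading the data needed for the update would happen here, without
        // holding the write lock.
        let new_head = id;
        // The write lock covers only the in-memory update itself.
        *self.built_up_to.write().unwrap() = new_head;
        true
    }
}
```

Because the state is re-checked after the update lock is acquired, callers that lose the race do no redundant work.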
Reviewed By: krallin
Differential Revision: D26508430
fbshipit-source-id: cd3bceed7e0ffb00aee64433816b5a23c0508d3c
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.
We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!
Note that we still have a few transitive dependencies on Tokio 0.1:
- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests
This isn't the end of the world though, we can live with that :)
Reviewed By: ahornby
Differential Revision: D26544410
fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
Summary:
This was a thing that was only ever used in Mononoke, and we don't think it's
usable and haven't been using it. Let's get rid of it. As-is, it won't even work
for most people due to its (indirect) dependency on Tokio 0.1.
Reviewed By: StanislavGlebik
Differential Revision: D26512243
fbshipit-source-id: faa16683f2adb20dfba43c4768486b982bc02ff9
Summary: In preparation for adding C++ client alternatives, use a wrapper around the Thrift client. This currently just calls the Thrift client directly, but will grow the ability to call the C++ client, too. For now, there's a dummy parameter to control the use of the C++ client.
Reviewed By: ahornby
Differential Revision: D26513598
fbshipit-source-id: b26cd9a9d89ab0502510b8533df4a60f5ca65292
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously we only ignored rust_bindgen_library, so binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.
Reviewed By: zertosh
Differential Revision: D26544150
fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
Summary: Add a cache pool for the new bonsai_svnrev_mapping
Reviewed By: krallin
Differential Revision: D26511043
fbshipit-source-id: feaa1bca525e84ebea7462bcc9a814dcb9e5a478
Summary: The latest libc has fixes for compiling on the M1 mac, let's update it.
Reviewed By: dtolnay
Differential Revision: D26476625
fbshipit-source-id: d9a997e69c428d53c51fc52353289a6510314c50
Summary:
Like it says in the title. This adds support for enabling slab rebalancing
that targets an even eviction age across all slabs, instead of trying to
maximize hit rate. It also lets us rebalance more often (right now we do it
every 5 minutes).
Reviewed By: HarveyHunt
Differential Revision: D26424323
fbshipit-source-id: a90e5a6b08cdfb01b5be8f995002bb5d5a38c4a2
Summary:
Autocargo V2 will use a more structured format for the autocargo field; with
the help of the `cargo_toml` crate it will be easy to deserialize and handle
it.
Also the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might often be wrong, especially if someone tries to publish a package from fbcode, then the private facebook folders might be shipped. Let's just not set it and in the new system one will be able to set it explicitly via autocargo parameter on a rule.
Reviewed By: ahornby
Differential Revision: D26339606
fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
Summary:
See the earlier diff for what this flag controls.
When booting SCS, we poll a lot of nested FuturesUnordered. This results in
very inefficient behavior in Tokio's cooperative scheduling, and results in us
spending 50% of our total CPU (in fact, a full thread) on just yielding, with
most other threads being idle.
With this change, we use 20+ threads running work that is scheduled by the main
thread, which is what we want.
Note that this applies to all Mononoke binaries. This has only been especially
bad in SCS startup, but we've also not benefited from this feature anywhere,
so rather than leaving this footgun in other apps, let's take it out
everywhere.
Reviewed By: markbt, StanislavGlebik
Differential Revision: D26399889
fbshipit-source-id: 0a13e1275d367e49c2342cb85cb6cd0047cda224
Summary: Give data some time to appear in other stores before triggering repair action.
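A minimal sketch of the idea, assuming the repair decision is based on how long ago the value was first seen. The function and parameter names are hypothetical and not taken from the diff:

```rust
use std::time::Duration;

/// Returns true if a value missing from some stores should be repaired now.
/// `age` is how long ago the value was first seen (an assumption of this
/// sketch); `grace` is the waiting period given to other stores to catch up.
pub fn should_repair(age: Duration, grace: Duration) -> bool {
    age >= grace
}
```

Values younger than the grace period are left alone, since they may still be on their way to the other stores.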
Differential Revision: D26123924
fbshipit-source-id: 71dc84dc243ec5452aaa7686882c0a176bcb0dd8