Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.
For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode.
However, this is only used for servers and only when accepting websocket
connections, and those users are just using Hyper as a HTTP client.
Reviewed By: farnz
Differential Revision: D28091331
fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.
Reviewed By: farnz
Differential Revision: D28091091
fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
Summary:
NOTE: there is one final pre-requisite here, which is that we should default all Mononoke binaries to `--use-mysql-client` because the other SQL client implementations will break once this lands. That said, this is probably the right time to start reviewing.
There's a lot going on here, but Tokio updates being what they are, it has to happen as just one diff (though I did try to minimize churn by modernizing a bunch of stuff in earlier diffs).
Here's a detailed list of what is going on:
- I had to add a number `cargo_toml_dir` for binaries in `eden/mononoke/TARGETS`, because we have to use 2 versions of Bytes concurrently at this time, and the two cannot co-exist in the same Cargo workspace.
- Lots of little Tokio changes:
- Stream abstractions moving to `tokio-stream`
- `tokio::time::delay_for` became `tokio::time::sleep`
- `tokio::sync:⌚:Sender::send` became `tokio::sync:⌚:Sender::broadcast`
- `tokio::sync::Semaphore::acquire` returns a `Result` now.
- `tokio::runtime::Runtime::block_on` no longer takes a `&mut self` (just a `&self`).
- `Notify` grew a few more methods with different semantics. We only use this in tests, I used what seemed logical given the use case.
- Runtime builders have changed quite a bit:
- My `no_coop` patch is gone in Tokio 1.x, but it has a new `tokio::task::unconstrained` wrapper (also from me), which I included on `MononokeApi::new`.
- Tokio now detects your logical CPUs, not physical CPUs, so we no longer need to use `num_cpus::get()` to figure it out.
- Tokio 1.x now uses Bytes 1.x:
- At the edges (i.e. streams returned to Hyper or emitted by RepoClient), we need to return Bytes 1.x. However, internally we still use Bytes 0.5 in some places (notably: Filestore).
- In LFS, this means we make a copy. We used to do that a while ago anyway (in the other direction) and it was never a meaningful CPU cost, so I think this is fine.
- In Mononoke Server it doesn't really matter because that still generates ... Bytes 0.1 anyway so there was a copy before from 0.1 to 0.5 and it's from 0.1 to 1.x.
- In the very few places where we read stuff using Tokio from the outside world (historical import tools for LFS), we copy.
- tokio-tls changed a lot, they removed all the convenience methods around connecting. This resulted in updates to:
- How we listen in Mononoke Server & LFS
- How we connect in hgcli.
- Note: all this stuff has test coverage.
- The child process API changed a little bit. We used to have a ChildWrapper around the hg sync job to make a Tokio 0.2.x child look more like a Tokio 1.x Child, so now we can just remove this.
- Hyper changed their Websocket upgrade mechanism (you now need the whole `Request` to upgrade, whereas before that you needed just the `Body`, so I changed up our code a little bit in Mononoke's HTTP acceptor to defer splitting up the `Request` into parts until after we know whether we plan to upgrade it.
- I removed the MySQL tests that didn't use mysql client, because we're leaving that behind and don't intend to support it on Tokio 1.x.
Reviewed By: mitrandir77
Differential Revision: D26669620
fbshipit-source-id: acb6aff92e7f70a7a43f32cf758f252f330e60c9
Summary:
Knowing the prepushrebase changeset id is required for retroactive_review.
retroactive_review checks landed commits, but verify_integrity hook runs on a commit before landing. This way the landed commit has no straightforward connection with the original one and retroactive_review can't acknowledge if verify_integrity have seen it.
Reviewed By: krallin
Differential Revision: D27911317
fbshipit-source-id: f7bb0cfbd54fa6ad2ed27fb9d4d67b9f087879f1
Summary:
Update the zstd crates.
This also patches async-compression crate to point at my fork until upstream PR https://github.com/Nemo157/async-compression/pull/117 to update to zstd 1.4.9 can land.
Reviewed By: jsgf, dtolnay
Differential Revision: D27942174
fbshipit-source-id: 26e604d71417e6910a02ec27142c3a16ea516c2b
Summary:
This updates the bookmarks cache TTL to be something we configure using
tunables instead of repo configs. There's a few goals here:
- Less us tune the pressure we put on SQL via a tunable
- Letting us experiment more easily with disabling this cache and tuning the
WBC poll interval.
Right now, this defaults to what we had set up in all our repo configs, which
is 2000ms.
Reviewed By: farnz
Differential Revision: D27915403
fbshipit-source-id: 4361d38939e5b2a0fe37442ed8c1dd0611af5098
Summary:
Like it says in the title. It would be a good idea for us to have support for
tweaking this so that we can control the load it forces onto our backend
storage.
Reviewed By: StanislavGlebik
Differential Revision: D27909638
fbshipit-source-id: 1783e1fddbbd609877ff3a9b3084f12e005fca4b
Summary: I am removing this change because we've decided to store prepushrebase changeset id server-side.
Reviewed By: ikostia
Differential Revision: D27853518
fbshipit-source-id: 888897bc48c67477309b09af5f8c1825ce20cbca
Summary:
After doing some local benchmarking (using MononokeApi instantiation as the
benchmark), one thing that's apparent is that we have quite a few parameters
here and that tuning them is likely to be a challenge.
One parameter in particular is the batch "objective", which controls how many
requests we want to see in the last batching interval before we choose to
batch (this is `rendezvous_dispatch_min_threshold`).
The problem with this is this is that there is no good, real-world, metric to
set it based on. This in contrast to the other parameters we have, which do
have some reasonable metric to compare to:
- rendezvous_dispatch_delay_ms: this is overhead we add to queries, so it
should be small & on the order of query execution latency (i.e. a few ms).
- rendezvous_dispatch_max_threshold: this controls how big our batches get, so
it should be on the order of what makes a SQL query too big (i.e. less than
a hundred records).
In contrast, we want to set `rendezvous_dispatch_min_threshold` such that
batching kicks in before we start using too many concurrent connections (which
is what query batching seeks to reduce), but the problem is that those two
numbers aren't directly connected. One clear problem, for example, is that if
our DB is in-region vs. out of-region, then for a given query execution time,
and a desired concurrency level before batching kicks in, we'd need different
values of `rendezvous_dispatch_min_threshold` (it would have to kick in faster
for the out-of-region workload).
So, this diff updates rendez vou to actually track concurrent connection count
before we force batching. This is the actual metric we care about here, and it
has a pretty natural "real world" values we can look at to decide where to set
it (our connection pool — which is limited at 100 concurrent connections —, and
our open connection baseline).
Note: I set this at 5 because that's more or less what servers look like
outside of spikes for Bonsai hg mapping, and of Changesets where I'm planning to
introduce this in the future:
- bonsai: https://fburl.com/ods/6d4a9qb5
- changesets: https://fburl.com/ods/kuq5x1vw (note: to make sense of this,
focus on just one server, otherwise the constnat spikes we get sort of hide
the big picture).
Reviewed By: farnz
Differential Revision: D27792603
fbshipit-source-id: 1a9189f6b50d48444b3373bd1cb14dc51b85a6d2
Summary:
Like it says in the title, this adds support for publishing our max open
connections to ODS. Note that this is a little more involved than I would like
for it to be, but there is no way to get direct access to this information.
This means, we need to:
- Expose how many open connections we have in flight (this is done earlier in
this stack in the Rust MySQL bindings).
- Periodically get this information out out for MySQL, put it in a timeseries.
- Get the max out of said timeseries and publish it to a counter so that it can
be fetched in ODS.
This is what this diff does. Note that I've only done this for read pools,
largely because I think they're the ones we tend to exhaust the most and I'd
like to see if there is value in exposing those counters before I use them.
We do the aggregation on a dedicated thread here. I contemplated making this a
Tokio task, but I figured making it a thread would make it easier to see if
it's misbehaving in any way (also: note that the SQL client allocates a bunch
of threads already anyway).
Reviewed By: HarveyHunt
Differential Revision: D27678955
fbshipit-source-id: c7b386f3a182bae787d77e997d108d8a74a6402b
Summary:
Previously we ran into an issue where client has sent us too large known
request, and we passed it all the way to the mysql.
Mysql slow log shows that we have quite a few slow queries
(https://fburl.com/scuba/mysql_slow_log/w0ugmc1i), so it might be that these
requests are still coming, but because of the problems in the logging (see
previous diff), we can't know it for sure.
In any case, adding a knob like that can be useful
Reviewed By: farnz
Differential Revision: D27650806
fbshipit-source-id: c4c82b7b5781a85c349abb4e5fa534b5e8f125a0
Summary:
When creating a commit via scs api we need to validate a few things (e.g. that
the file that a commit is trying to delete existed in the parent), and in order
to do that we need to use a manifest. Previously we were using fsnodes
manifests, however fsnodes is the slowest manifest to derive, and that makes
the whole create_commit operation much slower. Let's try to use skeleton
manifest, which is the fastest to derive.
Reviewed By: markbt
Differential Revision: D27587664
fbshipit-source-id: a60cab4956063bf26c0f1ec8c9cfa05233bb3cc0
Summary:
Direct reads mean we cut the Manifold server out of the loop for bigger reads, and is an enabler for Hedwig and other ways to distribute a big blob between tasks in a region.
Weak consistency means that we don't always check the master server for metadata for a blob, which should speed up reads especially in remote regions; it depends on direct reads to function, as it allows us to grab the read plan from Memcache and then go directly to the backend storage (ZippyDB, Everstore) instead of via the Manifold servers
Reviewed By: StanislavGlebik
Differential Revision: D27264690
fbshipit-source-id: 78fe77dfa4f85b5bf657b64e56577f4e00af0b94
Summary:
While Mononoke should support pushing large commits, it's not clear if we need
(or want) to support pushing a lot of commits. At the very least pushing lots of commits at the same
time can create problems with derivation, but there could be more places that might break.
Besides there's usually an easy way to work around that i.e. sqush commits or push in small batches.
If we ever need to import a lot of it we use tools like blobimport/repo_import,
so blocking pushing of lots of commits should be fine.
Reviewed By: farnz
Differential Revision: D27237798
fbshipit-source-id: 435e61110f145b06a177d6e492905fccd38d83da
Summary:
This introduces a basic building block for query batching. I called this
rendezvous, since it's about multiple queries meeting up in the same place :)
There are a few (somewhat conflicting) goals this tries to satisfy, so let's go
over them:
1), we'd like to reduce the total number of queries made by batch jobs. For
example, group hg bonsai lookups made by the walker. Those jobs are
characterized by the fact that they have a lot of queries to make, all the
time. Here's an example: https://fburl.com/ods/zuiep7yh.
2), we'd like to reduce the overall number of connections held to MySQL by
our tasks. The main way we achieve this is by reducing the maximum number of
concurrent queries. Indeed, a high total number of queries doesn't necessarily
result in a lot of connections as long as they're not concurrent, because we
can reuse connections. On the other hand, if you dispatch 100 concurrent
queries, that _does_ use 100 connections. This is something that applies to
batch jobs due to their query volume, but also to "interactive" jobs like
Mononoke Server or SCS, just not all the time. Here's an example:
https://fburl.com/ods/o6gp07qp (you can see the query count is overall low, but
sometimes spikes substantially).
2.1) It's also worth noting that concurrent queries are often the result of
many clients wanting the same data, so deduplication is also useful here.
3), we also don't want to impact the latency of interactive jobs when they
need to a little query here or there (i.e. it's largely fine if our jobs all
hold a few connections to MySQL and use them somewhat consistently).
4), we'd like this to make it easier to do batching right. For example, if
you have 100 Bonsais to map to hg, you should be able to just map and call
`future::try_join_all` and have that do the right thing.
5), we don't want "bad" queries to affect other queries negatively. One
example would be the occasional queries we make to Bonsai <-> Hg mapping in
`known` for thousands (if not more) of rows.
6), we want this to be easy to incorporate into the codebase.
So, how do we try to address all of this? Here's how:
- We ... do batching, and we deduplicate requests in a batch. This is the
easier bit and should address #1, #2 and #2.1, #4.
- However, batching is conditional. We notably don't batch very large requests
with the rest (addresses #5). We also don't batch small queries all the time:
we only batch if we are observing a throughput of queries that suggests we
can find some benefit in batching (this targets #3).
- Finally, we have some utilities for common cases like having to group by repo
id (this is `MultiRendezVous`), and this is all configurable via tunables
(and the default is to not do anything).
Reviewed By: StanislavGlebik
Differential Revision: D27010317
fbshipit-source-id: 4a2397255f9785c6722c02e4d419438fd0aafa07
Summary:
Knowing the prepushrebase changeset id is required for retroactive review.
Retroactive review checks landed commits, but verify integrity hook runs on a commit before landing. This way the landed commit has no straightforward connection with the original one and retroactive review can't acknowledge if verify integrity have seen it.
Reviewed By: StanislavGlebik
Differential Revision: D26944453
fbshipit-source-id: af1ec3c2e7fd3efc6572bb7be4a8065afa2631c1
Summary:
This tunable is not used anymore, we use
getbundle_high_low_gen_num_difference_threshold instead. Let's remove it.
Differential Revision: D26984966
fbshipit-source-id: 4e8ded5982f7e90c90476ff758b766df55644273
Summary:
Remove case conflict checking on upload. Disallowing case conflicts will
always be done during bookmark movement by checking the skeleton manifests.
A side-effect of this change is that configuring a repository with case
conflict checks, but not enabling skeleton manifests, will mean that commits
can't be landed in that repository, as there are no skeleton manifests to
check.
Reviewed By: ahornby
Differential Revision: D26781269
fbshipit-source-id: b4030bea5d92fa87f182a70d31f7e9d9e74989a9
Summary:
It was added for the initial rollout only so that we can fallback quickly if
needed (see D26221250 (7115cf31d2)). We can remove it now since the feature has been enabled
for a few weeks already with no big issues.
Reviewed By: krallin
Differential Revision: D26909490
fbshipit-source-id: 849fac4838c272e92a04a971869842156e88a1cf
Summary:
We ran into an issue while uploading too many blobs at once to darkstorm repo.
We were able to workaround this issue by spawning less blobstore writes at
once.
It's still a bit unclear why this issue happens exactly, but I'd like to make
the number of concurrent uploaded blobs configurable so that we can tweak it if
necessary.
Differential Revision: D26883061
fbshipit-source-id: 57c0d6fc51548b3c7404ebd55b5e07deba9e0601
Summary:
This diffs add a layer of indirection between fbinit and tokio, thus allowing
us to use fbinit with tokio 0.2 or tokio 1.x.
The way this works is that you specify the Tokio you want by adding it as an
extra dependency alongside `fbinit` in your `TARGETS` (before this, you had to
always include `tokio-02`).
If you use `fbinit-tokio`, then `#[fbinit::main]` and `#[fbinit::test]` get you
a Tokio 1.x runtime, whereas if you use `fbinit-tokio-02`, you get a Tokio 0.2
runtime.
This diff is big, because it needs to change all the TARGETS that reference
this in the same diff that introduces the mechanism. I also didn't produce it
by hand.
Instead, I scripted the transformation using this script: P242773846
I then ran it using:
```
{ hg grep -l "fbinit::test"; hg grep -l "fbinit::main" } | \
sort | \
uniq | \
xargs ~/codemod/codemod.py \
&& yes | arc lint \
&& common/rust/cargo_from_buck/bin/autocargo
```
Finally, I grabbed the files returned by `hg grep`, then fed them to:
```
arc lint-rust --paths-from ~/files2 --apply-patches --take RUSTFIXDEPS
```
(I had to modify the file list a bit: notably I removed stuff from scripts/ because
some of that causes Buck to crash when running lint-rust, and I also had to add
fbcode/ as a prefix everywhere).
Reviewed By: mitrandir77
Differential Revision: D26754757
fbshipit-source-id: 326b1c4efc9a57ea89db9b1d390677bcd2ab985e
Summary:
Atm in tests a separate ConfigStore with file source is created for some configs and then a reference to it is dropped immediately ([see get_config_handle function in mod.rs](https://fburl.com/diffusion/fpkj7ekv)). This is uncomfortable as we may need a reference to e.g. force update configs in tests.
Instead of creating separate stores we can reuse static configerator which already uses local files (in tests).
Reviewed By: krallin
Differential Revision: D26725515
fbshipit-source-id: 24269cd93b7d35216c025807c3f3eb527688b72b
Summary: We have a number of sleeps in our integration tests. The two main reasons are configs & tunables that need reloading. Currently, we have no way of force-reloading those.
Reviewed By: krallin
Differential Revision: D26615732
fbshipit-source-id: 217c4ae039abd398972b4a9764d08e18d6182493
Summary:
For dependencies V2 puts "version" as the first attribute of dependency or just after "package" if present.
Workspace section is after patch section in V2 and since V2 autoformats patch section then the third-party/rust/Cargo.toml manual entries had to be formatted manually since V1 takes it as it is.
The thrift files are to have "generated by autocargo" and not only "generated" on their first line. This diff also removes some previously generated thrift files that have been incorrectly left when the corresponding Cargo.toml was removed.
Reviewed By: ikostia
Differential Revision: D26618363
fbshipit-source-id: c45d296074f5b0319bba975f3cb0240119729c92
Summary:
The earlier diffs in this stack have removed all our dependencies on the Tokio
0.1 runtime environment (so, basically, `tokio-executor` and `tokio-timer`), so
we don't need this anymore.
We do still have some deps on `tokio-io`, but this is just traits + helpers,
so this doesn't actually prevent us from removing the 0.1 runtime!
Note that we still have a few transitive dependencies on Tokio 0.1:
- async-unit uses tokio-compat
- hg depends on tokio-compat too, and we depend on it in tests
This isn't the end of the world though, we can live with that :)
Reviewed By: ahornby
Differential Revision: D26544410
fbshipit-source-id: 24789be2402c3f48220dcaad110e8246ef02ecd8
Summary:
The changes (and fixes) needed were:
- Ignore rules that are not rust_library or thrift_library (previously only ignore rust_bindgen_library, so that binary and test dependencies were incorrectly added to Cargo.toml)
- Thrift package name to match escaping logic of `tools/build_defs/fbcode_macros/build_defs/lib/thrift/rust.bzl`
- Rearrange some attributes, like features, authors, edition etc.
- Authors to use " instead of '
- Features to be sorted
- Sort all dependencies as one instead of grouping third party and fbcode dependencies together
- Manually format certain entries from third-party/rust/Cargo.toml, since V2 formats third party dependency entries and V1 just takes them as is.
Reviewed By: zertosh
Differential Revision: D26544150
fbshipit-source-id: 19d98985bd6c3ac901ad40cff38ee1ced547e8eb
Summary:
Autocargo V2 will use a more structured format for autocargo field
with the help of `cargo_toml` crate it will be easy to deserialize and handle
it.
Also the "include" field is apparently obsolete as it is used for cargo-publish (see https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields). From what I know this might be often wrong, especially if someone tries to publish a package from fbcode, then the private facebook folders might be shipped. Lets just not set it and in the new system one will be able to set it explicitly via autocargo parameter on a rule.
Reviewed By: ahornby
Differential Revision: D26339606
fbshipit-source-id: 510a01a4dd80b3efe58a14553b752009d516d651
Summary:
Like it says in the title, this adds support for exposing EdenAPI in Mononoke
Server. That's it!
Differential Revision: D26131777
fbshipit-source-id: 15ed2d6d80b1ea06763adc0b7312d1cab2df5b76
Summary: This is more flexible.
Reviewed By: StanislavGlebik
Differential Revision: D26168559
fbshipit-source-id: 5946b8b06b3a577f1a8398a228467925f748acf7
Summary: Just as we have global strings/ints, let's have per-repo ones.
Reviewed By: StanislavGlebik
Differential Revision: D26168541
fbshipit-source-id: f31cb4d556231d8f13f7d7dd521086497d52288b
Summary:
Please see added test. Without this diff such test does not even compile,
as `new_values_by_repo` is moved out by `self.#names.swap(Arc::new(new_values_by_repo));` after processing the first tunable (line 202).
Reviewed By: StanislavGlebik
Differential Revision: D26168371
fbshipit-source-id: 3cd9d77b72554eb97927662bc631611fa91eaecb
Summary:
Yesterday we had an alarm when blobstore sync queue got overloaded again. This
time it was caused by a large commit cloud commit landing and writing lots of
content and alias blobs.
As we discussed before, let's add an option that would allow us to not write to
blobstore sync queue for commit cloud pushes of content and aliases.
It would slightly increase the latency, but will protect blobstore sync queue
from overloading.
Reviewed By: farnz
Differential Revision: D26129038
fbshipit-source-id: 0e96887e3aa3cf26880899c820f556bb16c437cb
Summary:
Lots of generated code in this diff. Only code change was in
`common/rust/cargo_from_buck/lib/cargo_generator.py`.
Path/git-only dependencies (ie `mydep = { path = "../foo/bar" }`) are not
publishable to crates.io. However, we are allowed to specify both a path/git
_and_ a version. When building locally, the path/git is chosen. When publishing,
the version on crates.io is chosen.
See https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#multiple-locations .
Note that I understand that not all autocargo projects are published on crates.io (yet).
The point of this diff is to allow projects to slowly start getting uploaded.
The end goal is autocargo generated `Cargo.toml`s that can be `cargo publish`ed
without further modification.
Reviewed By: lukaspiatkowski
Differential Revision: D26028982
fbshipit-source-id: f7b4c9d4f4dd004727202bd98ab10e201a21e88c
Summary:
A bit of background: some time ago I landed an optimization to getbundle, which
split the commits user want to fetch into "high generation numbers"
(which means commits are close to main bookmark) and "low generation numbers"
(commits that are far away from the main bookmark). User can fetch "low
generation numbers" if e.g. a commit to an old release branch was landed.
Processing them separately can yield significant speedups.
Previously we chose a threshold statically e.g. if a commit has generation
number lower than X then it's considered to have a low generation number.
Threshold was chosen arbitrarily and relatively small [1], and it didn't work
that well for commits that were above this threshold, but still not close
enough to main bookmark - we still see cpu spikes.
Instead let's define a commit as having low generation number if it's >X
commits farther from the commit with the highest generation number.
Reviewed By: ahornby
Differential Revision: D25995936
fbshipit-source-id: 57eba4ba288114f430722266a8326183b3b1f0bd
Summary:
Yesterday I landed a diff (D25950531 (79c34c5094)) which allows changing a mapping by
landing a commit with a special commit extra. The implementation was done
inside backsyncer.
However, this was incorrect for a few reasons, the main reason being backsyncer
is not the only thing that rewrites commits between repos (x-repo commit lookup
is another thing for example).
So we need to push this implementation down to CommitSyncer level so that
everyone rewrites commits in the same way.
This diff fixes this issue, and moves all the logic of figuring out correct
mapping version down to CommitSyncer level. Doing so allowed me to also
simplify backsyncer to use normal sync_commit() method to backsync commits.
Another important note is about how we handle commits with no parents. We allow
syncing them only if there's an unambiguous version that we can pick. In case
of any ambiguity we don't sync them and return an error, which means that e.g.
merging a new repo and simultaneously changing the mapping is not possible.
Reviewed By: ikostia
Differential Revision: D25975842
fbshipit-source-id: a87fee545ac1305832ac905337610e7b87884477
Summary:
At the moment the procedure to change commit sync mapping version is fairly
complicated [1]. This diff attempts to simplify one aspect of it.
At the moment we need to disable pushredirection and manually sync a commit
between two repos. Disabling pushredirection is quite error prone and led to
SEVs in the past. With this diff we just need to create a commit with a special
extra "change-xrepo-mapping" and let the backsyncer rewrite a commit with the
new version for us.
This means that landing a commit with this extra should not be allowed to
everyone - I'm going to change bookmark movement to check that.
Reviewed By: ikostia
Differential Revision: D25950531
fbshipit-source-id: 78b1f4950e6bb7da8b511e02685ed6084c29a680
Summary:
We had issues with mononoke writing too much blobstore sync queue entries while
deriving data for large commits. We've contemplated a few solutions, and
decided to give this one a go.
This approach forces derive data to use Background SessionClass which has an
effect of not writing data to blobstore sync queue if the write to blobstore
was successful (it still writes data to the queue otherwise). This should
reduce the number of entries we write to the blobstore sync queue
significantly. The downside is that writes might get a bit slower - our
assumption is that this slowdown is acceptable. If that's not the case we can
always disable this option.
This diff overrides SessionClass for normal ::derive() method. However there's
also batch_derive() - this one will be addressed in the next diff.
One thing to note - we still write derived data mapping to blobstore sync queue. That should be find as we have a constant number of writes per commits.
Reviewed By: krallin
Differential Revision: D25910464
fbshipit-source-id: 4113d00bc0efe560fd14a5d4319b743d0a100dfa
Summary:
The goal of this stack is to start logging commits to scribe even if a commit was
introduced by scs create_commit/move_bookmark api. Currently we don't do that.
Initially I had bigger plans and I wanted to log to scribe only from bookmarks_movement and remove scribe logging from unbundle/processing.rs, but it turned out to be trickier to implement. In general, the approach we use right now where in order to log to scribe we need to put `log_commit_to_scribe` call in all the places that can possibly create commits/move bookmarks seems wrong, but changing it is a bit hard. So for now I decided to solve the main problem we have, which is the fact that we don't get scribe logs from repos where bookmarks is moved via scs methods.
To fix that I added an additional option to CreateBookmark/UpdateBookmark structs. If this option is set to true then before moving/creating the bookmark it finds all draft commits that are going to be made public by the bookmark move i.e. all draft ancestors of new bookmark destinationl. This is unfortunate that we have to do this traversal on the critical path of the move_bookmark call, but in practice I hope it won't be too bad since we do similar traversal to record bonsai<->git mappings. In case my hopes are wrong we have scuba logging which should make it clear that this is an expensive operation and also we have a tunable to disable this behavioiur.
Also note that we don't use PushParams::commit_scribe_category. This is intentional - PushParams::commit_scribe_category doesn't seem useful at all, and I plan to deprecate it later in the stack. Even later it would make sense to deprecate PushrebaseParams::commit_scribe_category as well, and put commit_scribe_category optoin in another place in the config.
Reviewed By: markbt
Differential Revision: D25558248
fbshipit-source-id: f7dedea8d6f72ad40c006693d4f1a265977f843f
Summary:
The backfiller may read or write to the blobstore too quickly. Apply QPS
limits to the backfill batch context to keep the read or write rate acceptable.
Reviewed By: ahornby
Differential Revision: D25371966
fbshipit-source-id: 276bf2dd428f7f66f7472aabd9e943eec5733afe