Summary:
We have some tests that are a bit racy: they write bookmarks from one
process, then look at them from another process, but that can fail because
we have a cache on bookmarks that holds them for 2 seconds.
This is normally not a huge issue because we don't access said bookmarks, but
now that, as of my earlier diff, we run a warm bookmarks cache, it *is* a
problem. This fixes that. We can expand this later to do things like reloading
tunables, but for now it satisfies one basic use case.
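To illustrate the race, here is a minimal sketch of a TTL-style bookmark cache (illustrative names, not the actual Mononoke type): a value written by another process stays invisible until the entry expires, so tests need an explicit way to drop the cached entry.

```rust
use std::time::{Duration, Instant};

// Minimal sketch of a TTL cache (not the actual Mononoke type).
struct TtlCache<T> {
    value: Option<(T, Instant)>,
    ttl: Duration,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { value: None, ttl }
    }

    // Returns the cached value while it is still fresh; a bookmark
    // written by another process is invisible until then.
    fn get(&self) -> Option<T> {
        match &self.value {
            Some((v, written)) if written.elapsed() < self.ttl => Some(v.clone()),
            _ => None,
        }
    }

    fn put(&mut self, v: T) {
        self.value = Some((v, Instant::now()));
    }

    // The escape hatch the racy tests need: drop the cached entry so
    // the next read goes back to the backing store.
    fn purge(&mut self) {
        self.value = None;
    }
}
```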
Reviewed By: ahornby
Differential Revision: D26149371
fbshipit-source-id: 11c7f64b1ae45f6a0de142be25ab367cb25df567
Summary:
Right now, if you enable Mononoke API, you always get a warm bookmarks cache
(WBC), for all derived data kinds, and with a delay. This isn't great for a few
reasons:
- Some places don't really care about bookmarks being warm to begin with. This
is the case in the bookmarks filler (note that right now, it still does
polling to satisfy the WBC API, it's just not "warm").
- Some places don't want a delay, or don't want all kinds. This is the case for
Mononoke Server (which doesn't use Mononoke API right now, but that's what
I'm working towards), or EdenAPI, which uses a WBC sort-of-de-facto but
doesn't really care (but likely will in the future, and will want to follow
Mononoke Server's behavior).
As of this diff, we can now configure this when initializing `Mononoke`. I also
split out all those args into a `MononokeEnvironment` because the args list
was getting out of hand. One thing I did not do is make a way to instantiate
`MononokeEnvironment` from args (but we should do it at some point in the
future).
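A minimal sketch of what per-caller configuration could look like; `BookmarkCacheKind` and the field names here are hypothetical, not the actual `MononokeEnvironment` API:

```rust
// Hypothetical sketch; `BookmarkCacheKind` and the field names are
// illustrative, not the actual `MononokeEnvironment` API.
#[derive(Clone, Copy, PartialEq, Debug)]
enum BookmarkCacheKind {
    // e.g. the bookmarks filler: no warming (and no polling) at all
    Disabled,
    // warm the cache, optionally without the update delay
    Enabled { with_delay: bool },
}

struct MononokeEnvironment {
    bookmark_cache: BookmarkCacheKind,
    // ...the other args that previously traveled as a long parameter list
}

impl MononokeEnvironment {
    fn wants_warm_bookmarks(&self) -> bool {
        self.bookmark_cache != BookmarkCacheKind::Disabled
    }
}
```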
Reviewed By: StanislavGlebik
Differential Revision: D26100706
fbshipit-source-id: 1daa6335f3ce2b297929a84788bc5b4d9ad6432f
Summary:
This test module accidentally got lost when I added a `mod tests { ... }` in
the containing module. This brings it back and modernizes the tests that could
be modernized. The push redirection test has way too much boilerplate to be
manageable, so for now I removed it. I'll see if I can bring it back after some
refactoring I'm doing.
I'll also see if there's a way we can lint / warn against inline modules
shadowing other files.
Reviewed By: ahornby
Differential Revision: D26124354
fbshipit-source-id: 7b24c4fe635bf8197142ab9ee087631ed49f10be
Summary:
I'd like to be able to use mononoke_api from repo_client, but the hg parts
depend on repo_client, so that's a circular dependency.
This splits out the hg parts of Mononoke API so that places that don't want
them don't have to import them. This is similar to what we did with blobrepo.
Reviewed By: StanislavGlebik
Differential Revision: D26099495
fbshipit-source-id: 73a9c7b5dc95feceb35b5eabccf697e9aa0a27de
Summary:
Right now, if there is no Hipster ACL in a repo listener, we default to denying
all access.
This is kind of annoying when using Mononoke locally but behind a trusted HTTP
proxy, because that means you cannot access the repo at all (if you're not
behind a proxy, then your TLS identities are used instead and everything is
fine).
If we trust a given proxy to impersonate literally anyone, we probably trust
them to access the repo to begin with (since the set of people with access is
not empty and therefore there is always at least someone they could impersonate
that has access), so this is what this diff does.
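The decision logic boils down to something like the following sketch (illustrative function, not the actual repo listener code):

```rust
// Sketch of the decision described above (illustrative, not the actual
// repo listener code): with no ACL configured, a trusted proxy is let
// through instead of being denied.
fn is_allowed(acl: Option<bool>, is_trusted_proxy: bool) -> bool {
    match acl {
        // An ACL exists: it alone decides.
        Some(acl_permits) => acl_permits,
        // No ACL: trust the proxy, since it could impersonate someone
        // with access anyway.
        None => is_trusted_proxy,
    }
}
```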
Reviewed By: johansglock
Differential Revision: D26073274
fbshipit-source-id: 0ef06cb6283d7f69072b712d3cb5a8383a493998
Summary:
If the socket is already shutdown for writes, then not being able to shut it
down is fine.
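A sketch of the idea, assuming the condition surfaces as `std::io::ErrorKind::NotConnected` (the helper name is illustrative):

```rust
use std::io;

// Sketch (helper name is illustrative): treat "already shut down for
// writes" as success instead of propagating it as an error.
fn ignore_not_connected(res: io::Result<()>) -> io::Result<()> {
    match res {
        Err(e) if e.kind() == io::ErrorKind::NotConnected => Ok(()),
        other => other,
    }
}
```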
Reviewed By: ahornby
Differential Revision: D26052499
fbshipit-source-id: 2da6c34f657317419df00a0b7ba615e0eb351e0d
Summary:
Like it says in the title, this updates our HTTP stack to Hyper. There are a
few reasons to do this here:
- We can remove all the manual parsing & generation of HTTP, and instead let
Hyper (i.e. a HTTP framework) handle HTTP for us.
- We can use / reuse more pre-existing abstractions for things where we have to
implement HTTP handling (rather than just try to update to a websocket ASAP),
like the net speed test.
- And finally, my main motivation for this is that this will make it much
easier to load EdenAPI into Mononoke Server as a service. At this point, we
have a `Request` to pass into a function that returns a `Response`, which is
exactly what EdenAPI is, so hooking it in will be trivial.
There's a lot going on in this diff, so here is an overview. Overall, our
connection handling process is:
- Accept connection
- Perform TLS handshake
- Check if the remote is trusted
- Check ALPN:
- If hgcli, then read the preamble then run wireproto over the socket.
- If http, hand off the socket to Hyper. Hyper will call into our
MononokeHttpService (which is a Hyper Service) once per request.
- If websocket upgrade, accept the upgrade, then run wireproto over the
resulting I/O (i.e. the upgraded connection). An upgrade takes over the
connection, so implicitly this means there won't be further requests.
- If health check or net speed test, handle it. There might be multiple
requests here via connection reuse.
- This is where hooking EdenAPI will happen. We can instantiate Gotham
here: it also is a Hyper Service, so we just need to call it.
While in there, I've modelled those various states using structs instead of
passing a handful of arguments here or there.
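The ALPN branch of the dispatch above can be sketched roughly like this; the protocol strings and the `Handler` type are stand-ins, not the real Mononoke Server types:

```rust
// Illustrative sketch of the dispatch above; the protocol strings and
// the Handler type are stand-ins, not the real Mononoke Server types.
#[derive(Debug, PartialEq)]
enum Handler {
    // hgcli: read the preamble, then run wireproto over the socket
    Wireproto,
    // http: hand the socket to Hyper, which calls MononokeHttpService
    // once per request (websocket upgrades, the health check, and the
    // net speed test are handled inside that service)
    HyperService,
}

fn dispatch(alpn: &str) -> Option<Handler> {
    match alpn {
        "hgcli" => Some(Handler::Wireproto),
        "http/1.1" | "h2" => Some(Handler::HyperService),
        _ => None,
    }
}
```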
Reviewed By: johansglock
Differential Revision: D26018641
fbshipit-source-id: dd757d72fe0f17f7c98c1948a6aa34d0c4d95cbf
Summary:
Like it says in the title, this updates our implementation of Globalrevs to
be a little more relaxed, and allows you to create and move bookmarks as long as
they are ancestors of the "main" Globalrevs bookmark (but NOT to pushrebase to
them later, because we only want to allow ONE globalrevs-publishing bookmark
per repo).
While in there, I also deduplicated how we instantiate pushrebase hooks a
little bit. If anything, this could be better in the pushrebase crate, but
that'd be a circular dependency between pushrebase & bookmarks movement.
Eventually, the callsites should probably be using bookmarks movement anyway,
so leaving pushrebase as the low-level crate and bookmarks movement as the high
level one seems reasonable.
Reviewed By: StanislavGlebik
Differential Revision: D26020274
fbshipit-source-id: 5ff6c1a852800b491a16d16f632462ce9453c89a
Summary:
Prior to this diff, only Mononoke Server initialized
`MononokeScubaSampleBuilder` in a way that used the observability context, and
therefore respected verbosity settings.
Let's make a generic sample initializing function use this config too.
Reviewed By: ahornby
Differential Revision: D26156986
fbshipit-source-id: 632bda279e7f3905367b82db5b36f81264156de3
Summary: This is more flexible.
Reviewed By: StanislavGlebik
Differential Revision: D26168559
fbshipit-source-id: 5946b8b06b3a577f1a8398a228467925f748acf7
Summary: Just as we have global strings/ints, let's have per-repo ones.
Reviewed By: StanislavGlebik
Differential Revision: D26168541
fbshipit-source-id: f31cb4d556231d8f13f7d7dd521086497d52288b
Summary:
Please see added test. Without this diff such test does not even compile,
as `new_values_by_repo` is moved out by `self.#names.swap(Arc::new(new_values_by_repo));` after processing the first tunable (line 202).
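The bug class is the classic move-in-a-loop: a value passed by value into `swap` is gone after the first iteration. A simplified sketch of the fixed shape, with illustrative types rather than the real tunables code:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified sketch of the bug class (illustrative types, not the real
// tunables code): inside the per-tunable loop, passing the map by value
// into `swap` moves it out on the first iteration, so the fix is to
// clone (or build) a fresh Arc per tunable.
fn update_all(
    names: &[&str],
    new_values_by_repo: HashMap<String, i64>,
) -> Vec<Arc<HashMap<String, i64>>> {
    let mut swapped = Vec::new();
    for _name in names {
        // Clone per tunable instead of moving the map on iteration one.
        swapped.push(Arc::new(new_values_by_repo.clone()));
    }
    swapped
}
```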
Reviewed By: StanislavGlebik
Differential Revision: D26168371
fbshipit-source-id: 3cd9d77b72554eb97927662bc631611fa91eaecb
Summary:
If your disk1 is an external HFS-formatted disk, then
eden_apfs_mount_helper will fail to create apfs subvolumes on
it. Instead, use the disk backing the mount.
Reviewed By: fanzeyi
Differential Revision: D26096296
fbshipit-source-id: baa45181afb6610a095c864eb3183e5af76ec4e0
Summary:
It is already broken with segmented changelog (it assumes 0..len(repo) are
valid revs). It is super slow and cannot be optimized efficiently. The _only_
non-zero-exit-code usage in the past month is like:
hg log -r 'reverse(children(ancestors(remote/master) and branchpoint()) and draft() and age("<4d"))'
which takes 40 to 100s and can be rewritten using more efficient queries like `parents(roots(draft()))`.
Reviewed By: singhsrb
Differential Revision: D26158011
fbshipit-source-id: 7957710f27af8a83920021a228e4fa00439b6f3d
Summary:
Migrate most crates to tokio 1.0. The exception is edenfs-client, which has
some dependencies on `//common/rust/shed/fbthrift_ext` and seems non-trivial
to upgrade. It creates a separate tokio runtime so it shouldn't be affected
feature-wise.
Reviewed By: singhsrb
Differential Revision: D26152862
fbshipit-source-id: c84c43b1b1423eabe3543bccde34cc489b7805be
Summary: Configure segmented changelog to use caching when caching is requested.
Reviewed By: krallin
Differential Revision: D26121496
fbshipit-source-id: d0711a5939b5178b3a93d081019cfab47996da40
Summary:
Caching for the IdMap to speed things up.
Values for a key do not change. The IdMap is versioned for a given Repository.
We use the version of the IdMap in the generation of the cache keys. We set the
"site version" to be the IdMap version.
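A sketch of the key scheme described above (the key format is illustrative): embedding the IdMap version in the key means a version bump implicitly invalidates all old entries.

```rust
// Sketch of the scheme (key format is illustrative): values for a key
// never change within one IdMap version, so putting the version in the
// key makes a version bump an implicit cache flush.
fn idmap_cache_key(repo_id: u64, idmap_version: u64, key: &str) -> String {
    format!("idmap.{}.{}.{}", repo_id, idmap_version, key)
}
```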
Reviewed By: krallin
Differential Revision: D26121498
fbshipit-source-id: 7e82e40b818d1132a7e86f4cd7365dd38056348e
Summary: This implementation is used for all things that are cached in Mononoke.
Reviewed By: quark-zju
Differential Revision: D26121497
fbshipit-source-id: a0088b539f3c3656921ab9a7a25c6442996aed18
Summary:
Logging all these throttling notifications is not necessary. There can
sometimes be big batches of fetches (like 100s of thousands). Let's reduce this
by a factor of 1000.
Note we would also like to log which process triggered these fetches, what
endpoint they use, etc. This will help us identify the workflows causing them,
so we can address them or skip aux data fetching in those code paths.
But this requires some fiddling with ObjectFetchContext and the logging
code, so it's gonna take a bit longer :(
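The reduction amounts to sampled logging; a minimal sketch (the names and exact mechanism here are illustrative):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of the reduction (names and mechanism are illustrative): keep
// one throttling notification out of every SAMPLE_RATE.
const SAMPLE_RATE: u64 = 1000;
static COUNTER: AtomicU64 = AtomicU64::new(0);

fn should_log() -> bool {
    COUNTER.fetch_add(1, Ordering::Relaxed) % SAMPLE_RATE == 0
}
```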
Reviewed By: genevievehelsel
Differential Revision: D25505654
fbshipit-source-id: e7c40164db86fadf4baf0afd0c52879e0cb2568b
Summary:
For `repo.transaction("tr-name")`, this records `Transaction: tr-name` in the
metalog commit description.
It can be helpful to narrow things down for commands with multiple
transactions.
In the future we might want to attach more data to the logging (ex. what the
commit cloud local and remote states were at the time of syncing). However, I
didn't do it now since metalog is designed to hold repository data, not too
much logging data. With a better logging infra we might want to move `config`
out of metalog, associated with metalog root ids.
Reviewed By: DurhamG
Differential Revision: D25984805
fbshipit-source-id: 59c074272cff555c6ff11dd755f7e3ce9a292eb6
Summary:
On setup, SCS initializes the repos concurrently, warming up derived data for each repo, warming the bookmark cache, and fetching skiplists.
Fetching skiplists is an expensive operation with two steps: asynchronously getting a large blob from the blobstore, then synchronously deserializing it.
While running on the same task as warming the bookmark cache, the deserialization takes all the CPU, and the other futures have to wait: they can't process results returned by MySQL queries or connect to the DB. Thus SCS eventually fails to acquire a new connection or to perform a query in a reasonable time, and terminates.
Spawning skiplist fetching in a separate task unblocks the thread where the warming is running.
This was first noticed in TW tasks because, after the MySQL rollout, some of the SCS tasks started to take an hour to start.
To debug and localize the issue, we added debug output to see what exactly blocks the repo initialization; it turned out that once skiplist fetching started, the rest was blocked.
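The shape of the fix, sketched with a std thread standing in for spawning a separate tokio task: move the CPU-bound deserialization off the thread that is also warming the bookmark cache.

```rust
use std::thread;

// Sketch of the fix, with a std thread standing in for spawning a
// separate tokio task: the CPU-heavy deserialization (a dummy sum
// here) no longer starves the task warming the bookmark cache.
fn deserialize_off_thread(blob: Vec<u8>) -> thread::JoinHandle<usize> {
    thread::spawn(move || {
        // Stand-in for the expensive synchronous skiplist deserialization.
        blob.iter().map(|b| *b as usize).sum()
    })
}
```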
Reviewed By: StanislavGlebik
Differential Revision: D26128171
fbshipit-source-id: fe9e1882af898950cf16d8e939dc6bc6be56510e
Summary:
`async_common.py` needs to be ignored for the Python 2 build since it
cannot be parsed by Python 2.
Reviewed By: xavierd
Differential Revision: D26142682
fbshipit-source-id: f921e7a35781b3336ba745e886380afc26d5ca36
Summary:
`blobstore_healer` works by healing blobs in turn, with some level of concurrency. Healing each blob consumes at least `O(blob_size)` of memory, so healing multiple blobs consumes their combined size of memory. Because blob sizes are not distributed uniformly, we cannot just calculate the desired concurrency level once and for all. Prior to this diff, that is what we did, and whenever a few multi-GB blobs ended up in the same concurrently-healed batch, the healer OOMed.
To help with this problem, this diff starts using dynamic concurrency: it assigns a weight to each healed blob and only concurrently heals up to a certain total weight of blobs. This way, we can limit the total amount of memory consumed by our healer.
This solution is not perfect for a variety of reasons:
- if a single blob is larger than the total allowed weight, we'll still let it through. It's better than never healing it, but it means that OOMs are still possible in theory.
- we do not yet know the sizes of all the blobs in the queue. To mitigate that, I took a look at the known sizes distribution and saw that between 0 and 2KB is the most common size range. I defaulted to a 1KB size for unknown blobs.
Note 1: I had to make `heal_blob` consume its args instead of borrowing them because `buffered_weight_limited` needs a `'static` lifetime for the futures.
Note 2: When using `futures_ext`, I explicitly rename them to `futures_03_ext`, despite the fact that `blobstore_healer` does not depend on the older version. This is because `Cargo.toml` uses the same `[dependencies]` section for the combined dependencies of all the targets in the same `TARGETS` file. As there are other targets that claim the name of `futures_ext` for 0.1 version, I decided that it's easier to just use `_03_` name here than fix in other places. We can always change that of course.
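A simplified sketch of the weighting idea (the real code drives futures through the `buffered_weight_limited` combinator; this only shows the batching rule, with the 1KB default for unknown sizes):

```rust
// Simplified sketch of the weighting rule only (the real healer drives
// futures through a `buffered_weight_limited` combinator): group blobs
// into batches whose combined weight stays under a cap, defaulting
// unknown sizes to 1KB.
const DEFAULT_WEIGHT: u64 = 1024;

fn batch_by_weight(sizes: &[Option<u64>], max_weight: u64) -> Vec<Vec<usize>> {
    let mut batches = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut weight = 0u64;
    for (i, size) in sizes.iter().enumerate() {
        let w = size.unwrap_or(DEFAULT_WEIGHT);
        // A blob larger than the cap still goes through, alone in its
        // batch: better than never healing it.
        if !current.is_empty() && weight + w > max_weight {
            batches.push(std::mem::take(&mut current));
            weight = 0;
        }
        current.push(i);
        weight += w;
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```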
Reviewed By: krallin
Differential Revision: D26106044
fbshipit-source-id: 4931d86d6e85d055ed0eefdd357b9ba6266a1c37
Summary: This was just a thin wrapper around with_metadata_database_config and it was using old futures, so remove it.
Differential Revision: D26100512
fbshipit-source-id: 22aa40ed73df2555645ba1d639fee3ae3dd38a09
Summary:
Currently the data layer eats all errors from remote stores and treats
them as KeyErrors. This hides connection issues from users behind obscure
KeyErrors. Let's make it so that any non-key error reported by the remote store
is propagated up as a legitimate error.
This diff makes Http errors from EdenApi show up with a nicer error message,
suggesting that the user run fixmyserver.
Further fixes will probably be necessary to categorize other errors from the
remote store more nicely.
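The intended classification can be sketched like this (the types here are illustrative, not the real data layer's):

```rust
// Illustrative types, not the real data layer's: only a genuine
// key-not-found becomes a KeyError; other remote store failures
// propagate as real errors instead of being eaten.
enum FetchOutcome {
    Found(Vec<u8>),
    Missing,
    Failed(String),
}

fn classify(outcome: FetchOutcome) -> Result<Option<Vec<u8>>, String> {
    match outcome {
        FetchOutcome::Found(data) => Ok(Some(data)),
        // Surfaced as a KeyError upstream.
        FetchOutcome::Missing => Ok(None),
        // e.g. a connection issue: now a legitimate error.
        FetchOutcome::Failed(err) => Err(err),
    }
}
```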
Reviewed By: quark-zju
Differential Revision: D26117726
fbshipit-source-id: 7d7dee6ec101c6a1d226185bb27423d977096050
Summary:
Before this change, we could throttle only based on one identity matching one of the identities from the user's set of identities.
Now we can match against a subset of the user's identities.
Depends on D26125638.
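In set terms, the new rule is a subset check; a sketch with illustrative types:

```rust
use std::collections::HashSet;

// Sketch with illustrative types: the rule fires only if every identity
// it names is in the user's identity set (previously one match sufficed).
fn rule_matches(rule_identities: &HashSet<&str>, user_identities: &HashSet<&str>) -> bool {
    rule_identities.is_subset(user_identities)
}
```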
Reviewed By: krallin
Differential Revision: D26125637
fbshipit-source-id: 534326264b9093e46fbdda846516fdaceb40c931
Summary:
For fsnodes and skeleton manifests it should be possible to allow gaps in the
commits that are backfilled. Any commit in one of these gaps can be quickly
derived from the nearest ancestor that is derived, and since each commit's
derived data is independent, there is no loss from omitting them.
Add the `--gap-size` option to `backfill_derived_data`. This allows `fsnodes`,
`skeleton_manifests`, and other derived data types that implement it in the
future to skip some of the commits during backfill. This will speed up
backfilling and reduce the amount of data that is stored for infrequently
accessed historical commits.
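A sketch of what gapped selection could look like (illustrative, not the actual `backfill_derived_data` logic): every `gap_size`-th commit plus the head is derived, and the rest are left for on-demand derivation.

```rust
// Illustrative sketch, not the actual `backfill_derived_data` logic:
// derive every `gap_size`-th commit plus the head of a linear stretch;
// commits in the gaps are derived on demand from the nearest derived
// ancestor.
fn commits_to_derive(commits: &[u64], gap_size: usize) -> Vec<u64> {
    commits
        .iter()
        .enumerate()
        .filter(|(i, _)| {
            gap_size <= 1 || i % gap_size == gap_size - 1 || *i == commits.len() - 1
        })
        .map(|(_, c)| *c)
        .collect()
}
```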
Reviewed By: StanislavGlebik
Differential Revision: D25997854
fbshipit-source-id: bf4df3f5c16a913c13f732f6837fc4c817664e29
Summary:
While looking into SCS performance, we noticed that fetching the skiplist from
the blobstore takes a lot of CPU, and it looks like the slow part comes from
the many allocations we do in the SQL blobstore. Even though we might have
successfully fetched the skiplist from Manifold, we'd still waste CPU time
trying to assemble the object in the xdb blobstore.
This diff fixes it by preallocating. Note, however, that we don't have the
precise size we need to allocate, only an upper bound (number of chunks * max
chunk size). I opted to allocate less, so that we don't waste memory on small
requests (i.e. it's not great to allocate a 1MB buffer for a 100-byte object).
Note that I suspect the underlying problem might be in the BytesMut
extend_from_slice() implementation - it might be unoptimized. However, this is
just a hunch; I haven't investigated it. For now, let's do the preallocation
on our side.
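The trade-off can be sketched like this (the divisor is an illustrative choice, and the real code works on `BytesMut` rather than `Vec<u8>`):

```rust
// Sketch of the trade-off (the divisor is an illustrative choice; the
// real code works on BytesMut rather than Vec<u8>): we only know an
// upper bound of chunks * max_chunk_size, so reserve deliberately less
// and let the buffer grow if we guessed low.
fn assemble(chunks: &[&[u8]], max_chunk_size: usize) -> Vec<u8> {
    let upper_bound = chunks.len() * max_chunk_size;
    // Avoid a 1MB buffer for a 100-byte object.
    let mut out = Vec::with_capacity(upper_bound / 4);
    for chunk in chunks {
        out.extend_from_slice(chunk);
    }
    out
}
```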
Reviewed By: krallin
Differential Revision: D26145078
fbshipit-source-id: a50ba72656ffe6053af993fdec07ce55ddddacf3
Summary:
Replace `common::Job` by using `futures::Join` and `futures::Ready`.
We still need a heterogeneous variant of `Either`, where the output types of the
two futures differ, so extract this from `Job` as `common::Either2`, which
returns `either::Either<LeftFuture::Out, RightFuture::Out>`.
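Simplified to plain values rather than futures, the heterogeneous shape looks like this (sketch; the real `common::Either2` wraps two futures and yields an `either::Either` of their outputs):

```rust
// Sketch simplified to plain values; the real `common::Either2` wraps
// two futures and yields `either::Either` of their outputs.
enum Either2<L, R> {
    Left(L),
    Right(R),
}

impl<L, R> Either2<L, R> {
    // Collapse both arms to a common type, mirroring how a caller
    // consumes the either::Either of the two arm outputs.
    fn map_either<T>(self, f: impl FnOnce(L) -> T, g: impl FnOnce(R) -> T) -> T {
        match self {
            Either2::Left(l) => f(l),
            Either2::Right(r) => g(r),
        }
    }
}
```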
Reviewed By: ahornby
Differential Revision: D25867668
fbshipit-source-id: 13c90b212c64ca5eae67217a1cecd9aee5e40a38
Summary:
krallin noticed that we aren't using the warm bookmark cache anymore. It turned
out the reason was that the client uses the `listkeyspatterns` call to fetch
bookmarks, not `listkeys`. This diff makes `listkeyspatterns` use the warm
bookmark cache as well.
Reviewed By: markbt
Differential Revision: D26124605
fbshipit-source-id: 637db8d66934cabc1793f9f615fefddd07c3af62