Summary: I'm going to want to be able to test against a single ephemeral shard, as well as run in production against a real DB. Use the standard config to make that possible.
Reviewed By: ahornby
Differential Revision: D21048697
fbshipit-source-id: 644854e2c831a9410c782ca1fddc1c4b5f324d03
Summary:
This is something I've wanted to have for a long time: instead of having to open a writable db shell, we can now just use the admin command. Also, this will be easier to document in the oncall wikis.
NB: this is lacking the `delete` functionality atm, but that one is almost never needed.
Reviewed By: krallin
Differential Revision: D21039606
fbshipit-source-id: 7b329e1782d1898f1a8a936bc711472fdc118a96
Summary:
We don't use this anymore (instead we just do backtesting in bulk). Let's get
rid of it.
Reviewed By: farnz
Differential Revision: D21042083
fbshipit-source-id: af5aea3033a4d58ba61b8f22d7dc1249a112933e
Summary:
I'd like to clean up this code a little bit since I'm going to make a few
changes and would like to avoid mixing too many old and new futures.
Reviewed By: farnz
Differential Revision: D21042081
fbshipit-source-id: d6a807ce9c60d09d82c6b8c6866ea23b8ef45f21
Summary:
run_in_range isn't being used anywhere. Let's get rid of it. Also, let's not
make run_in_range0 a method on Tailer since it's more of a helper function.
Reviewed By: farnz
Differential Revision: D21042084
fbshipit-source-id: 2678a94ce4b0b6ae1c97e47eb02652bcbf238b0d
Summary:
The hook_tailer is broken in mode/dev right now because it blows up with a
debug assertion in clap complaining that `--debug` is being added twice. This
is because it sets up its own logging, which is really not needed.
Let's just take this all out: it's not necessary.
Reviewed By: farnz
Differential Revision: D21040108
fbshipit-source-id: 75ec70717ffcd0778730a0960607c127a958fe52
Summary: It's nice to be able to use a Bonsai ID if that's what you have.
Reviewed By: farnz
Differential Revision: D21040109
fbshipit-source-id: 4dfc447437053f9d7f4a1c9b3753d51fe5d02491
Summary: Add a few debug-level log lines during server startup so we can see which part of startup is slow.
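For illustration only, a hedged sketch of the kind of log line this adds (the phase names are invented; Mononoke logs via slog):

```rust
use slog::{debug, Logger};

// Hypothetical startup phase, for illustration; the real phases and
// messages in the server startup path differ.
fn open_blobstore(logger: &Logger) {
    debug!(logger, "starting blobstore setup");
    // ... actual setup work ...
    debug!(logger, "blobstore setup complete");
}
```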
Reviewed By: quark-zju
Differential Revision: D21054216
fbshipit-source-id: 5dfb7b58fffb360506f34e3f2bb9e8b51fcc5e6b
Summary:
Previously, an extension adding the "changeset" pushop might forget to call the
prepushoutgoing hooks, so they would never run.
Reviewed By: DurhamG
Differential Revision: D21008487
fbshipit-source-id: a6bc506c7e1695854aca3d3b2cd118ef1c390c52
Summary: `#![deny(warnings)]` does nothing outside of the crate root file, so this was a no-op.
Reviewed By: singhsrb
Differential Revision: D21054214
fbshipit-source-id: dc1931c0a186eb42aae7700dd006550616f29a70
Summary:
This is needed because the tonic crate (see the diff stack) relies on tokio ^0.2.13.
We can't go to a newer version because a bug that affects mononoke was introduced in 0.2.14 (discussion started on T65261126). The issue was reported upstream: https://github.com/tokio-rs/tokio/issues/2390
This diff simply changed the version number in `fbsource/third-party/rust/Cargo.toml` and ran `fbsource/third-party/rust/reindeer/vendor`.
Also ran `buck run //common/rust/cargo_from_buck:cargo_from_buck` to fix the tokio version in the generated cargo files.
Reviewed By: krallin
Differential Revision: D21043344
fbshipit-source-id: e61797317a581aa87a8a54e9e2ae22655f22fb97
Summary:
In getbundle, we compute the set of new draft commit ids. This is used to
include tree and file data in the bundle when draft commits are fully hydrated,
and will also be used to compute the set of mutation information we will
return.
Currently this calculation only computes the non-common draft heads. It
excludes all of the ancestors, which should be included. This is because it
re-uses the prepare_phases code, which doesn't quite do what we want.
Instead, separate out these calculations into two functions (see the sketch after this list):
* `find_new_draft_commits_and_public_roots` finds the draft heads
and their ancestors that are not in the common set, as well as the
public roots the draft commits are based on.
* `find_phase_heads` finds and generates phase head information for
the public heads, draft heads, and the nearest public ancestors of the
draft heads.
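A minimal sketch of the first calculation over a toy commit graph, using `u64` ids and a `HashMap` parent map as stand-ins for the real repo and phases types:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Illustrative only: walk back from the draft heads, collecting draft
/// commits not in `common`, plus the public roots the drafts are based on.
/// The real getbundle code works over the repo and its phases, not a map.
fn find_new_draft_commits_and_public_roots(
    parents: &HashMap<u64, Vec<u64>>,
    is_public: impl Fn(u64) -> bool,
    common: &HashSet<u64>,
    heads: &[u64],
) -> (HashSet<u64>, HashSet<u64>) {
    let mut drafts = HashSet::new();
    let mut public_roots = HashSet::new();
    let mut queue: VecDeque<u64> = heads.iter().copied().collect();
    while let Some(cs) = queue.pop_front() {
        if common.contains(&cs) {
            continue;
        }
        if is_public(cs) {
            // The walk stops at public commits: these are the roots.
            public_roots.insert(cs);
        } else if drafts.insert(cs) {
            // Newly-seen draft commit: include it and walk its parents too.
            queue.extend(parents.get(&cs).into_iter().flatten().copied());
        }
    }
    (drafts, public_roots)
}
```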
Reviewed By: StanislavGlebik
Differential Revision: D20871337
fbshipit-source-id: 2f5804253b8b4f16b649d737f158fce2a5102002
Summary:
Computing deltas forces the client to have the previous version locally, which it
may not have, forcing a full fetch of the blob just to compute a delta. Since
deltas are a way to save on bandwidth usage, fetching a blob to compute one
negates their benefits.
Reviewed By: DurhamG
Differential Revision: D20999424
fbshipit-source-id: ae958bb71e6a16cfc77f9ccebd82eec00ffda0db
Summary:
A new method on the BonsaiDerived trait that derives data for a batch of commits.
The default implementation just derives them in parallel, so it's not particularly
useful. However, it can be overridden if a particular derived data type has a more
efficient way of deriving a batch of commits.
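A hedged sketch of that shape, with stand-in types; the real `BonsaiDerived` trait has different signatures and bounds, and takes a context and repo as arguments:

```rust
use futures::future;

// Stand-ins for illustration; the real trait uses Mononoke's ChangesetId
// and error types.
type ChangesetId = [u8; 32];
type Error = Box<dyn std::error::Error + Send + Sync>;

#[async_trait::async_trait]
trait BonsaiDerived: Sized + Send {
    async fn derive(csid: ChangesetId) -> Result<Self, Error>;

    /// Default implementation: derive every commit in parallel. Override
    /// when a derived data type has a more efficient bulk strategy.
    async fn batch_derive(csids: Vec<ChangesetId>) -> Result<Vec<Self>, Error> {
        future::try_join_all(csids.into_iter().map(Self::derive)).await
    }
}
```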
Reviewed By: farnz
Differential Revision: D21039983
fbshipit-source-id: 3c6a7eaa682f5eaf6b8a768ca61d6f8a8f1258a7
Summary: This makes backfilling a great deal faster.
Reviewed By: krallin
Differential Revision: D21040292
fbshipit-source-id: f6d06cbc76e710b4812f15e85eba73b24cdbbd3e
Summary:
Unfortunately, `BonsaiChangeset::get_changeset_id()` is a fairly expensive
operation, since it'll clone, serialize, and hash the changeset. In hooks in
particular, since we run this once per hook execution (and therefore once per
file), that can become a problem.
Indeed, on a commit with 1K file changes, hooks run for ~30 seconds
(P129058164). According to perf, the overwhelming majority of that time is
spent computing hashes of bonsai changesets. A commit with 10K changes hit
the same bottleneck, and took 3.5 hours.
This diff updates hooks to compute the changeset id just once, which brings our
time down to O(N) (where N = file changes).
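As a hedged illustration of the fix with toy types (the real hook code is structured differently), the point is simply hoisting the expensive computation out of the per-file loop:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct BonsaiChangeset {
    files: Vec<String>,
}

impl BonsaiChangeset {
    /// Stand-in for the real get_changeset_id(): it serializes and hashes
    /// the whole changeset, so each call is O(changeset size).
    fn get_changeset_id(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.files.hash(&mut h);
        h.finish()
    }
}

fn run_hooks(cs: &BonsaiChangeset) {
    // Before: get_changeset_id() was called once per file hook execution.
    // After: compute the id once and reuse it for every file.
    let cs_id = cs.get_changeset_id();
    for file in &cs.files {
        run_file_hook(cs_id, file);
    }
}

fn run_file_hook(_cs_id: u64, _file: &str) { /* per-file hook body */ }
```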
Reviewed By: StanislavGlebik
Differential Revision: D21039811
fbshipit-source-id: 73f9939ffc7d095e717bdb5efc46dbf4ad312c65
Summary: This is generally helpful to log — see later in this stack.
Reviewed By: HarveyHunt
Differential Revision: D21039810
fbshipit-source-id: 4087db70b3f56f47270c10eb31a37f33c61778df
Summary:
This is important for various syncs.
Note: there's an obvious race condition; the TOCTTOU window is non-zero for existing bookmark locations. I don't think this is a problem, as we can always re-run the admin command.
Reviewed By: StanislavGlebik
Differential Revision: D21017448
fbshipit-source-id: 1e89df0bb33276a5a314301fb6f2c5049247d0cf
Summary:
Use the deleted manifest to search for deleted paths in repos with linear history. For merged history it returns an error, as if there were no such path.
The commit where the path was deleted is returned as the first commit in the history stream; the rest is the history before the deletion.
Reviewed By: StanislavGlebik
Differential Revision: D20897083
fbshipit-source-id: e75e53f93f0ca27b51696f416b313466b9abcee8
Summary:
Some time ago we decided on the "redaction" naming for this feature. A few
places were left unfixed.
Reviewed By: xavierd
Differential Revision: D21021354
fbshipit-source-id: 18cd86ae9d5c4eb98b843939273cfd4ab5a65a3a
Summary:
We should use the HgsqlName to check the repo lock, because that's the name
Mercurial uses for its repo lock there.
Reviewed By: farnz
Differential Revision: D20943177
fbshipit-source-id: 047be6cb31da3ee006c9bedc3de21d655a4c2677
Summary:
The name for a repository in hgsql might not match that of the repository itself.
Let's use the hgsql repo name instead of the repo name for syncing globalrevs.
Reviewed By: farnz
Differential Revision: D20943175
fbshipit-source-id: 605c623918fd590ba3b7208b92d2fedf62062ae1
Summary:
This parses the Hgsql name out of the repo config. While in there, I also
noticed that our tests force us to have a default impl right now (there are
otherwise waaaay too many fields to specify), but at the same time we don't use
it everywhere. So, in an effort to clean up, I updated hooks to use a default.
I added a newtype wrapper for the hgsql name, since this will let me update the
globalrev syncer and SQL repo lock implementation to require a HgsqlName
instead of a string and have the compiler prove that all callsites are doing
so.
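A minimal sketch of what such a newtype looks like (the real definition and its derives live in the config code):

```rust
/// Newtype for the hgsql repo name: functions that need it can demand an
/// HgsqlName, and the compiler rejects call sites passing a bare String.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct HgsqlName(pub String);

impl AsRef<str> for HgsqlName {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

// Hypothetical consumer: it can no longer accidentally be handed the
// repository's own name instead of the hgsql one.
fn check_repo_lock(name: &HgsqlName) {
    let _ = name.as_ref();
}
```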
Reviewed By: farnz
Differential Revision: D20942177
fbshipit-source-id: bfbba6ba17cf3e3cad0be0f8406e41e5a6e6c3d4
Summary:
See D20941946 for why this is being added. This just brings in the updated
Thrift definition.
Reviewed By: farnz
Differential Revision: D20942176
fbshipit-source-id: c060f80666cb79f1498023276b7a09ec12bf52b4
Summary:
This diff may not have quite the right semantics.
It switches `prefetch_content` to async syntax,
in the process getting rid of the old function `spawn_future`,
which assumes old-style futures, in favor of using
`try_for_each_concurrent` to handle concurrency.
Along the way, we were able to remove a couple levels of clones.
I *think* that the old code - in which each call to `spawn_future`
would spin off its own future on the side but then also wait
for completion, and then we buffered - would run at most 256
versions of `prefetch_content_node` at a time, and the current
code is the same. But it's possible that I've either halved or
doubled the concurrency somehow here, if I lost track of the
details.
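A hedged sketch of the new shape, with a simplified error type and `fetch_one` standing in for `prefetch_content_node`:

```rust
use futures::stream::{self, TryStreamExt};

// Drives at most 256 fetches at any one time, which is the concurrency
// the old spawn-then-buffer(256) code was believed to provide.
async fn prefetch_all(keys: Vec<String>) -> Result<(), std::io::Error> {
    stream::iter(keys.into_iter().map(Ok))
        .try_for_each_concurrent(256, |key| async move { fetch_one(&key).await })
        .await
}

async fn fetch_one(_key: &str) -> Result<(), std::io::Error> {
    // stand-in for the real prefetch of one content node
    Ok(())
}
```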
Reviewed By: krallin
Differential Revision: D20665559
fbshipit-source-id: d95d50093f7a9ea5a04c835baea66e07a7090d14
Summary:
The revisionstore is a large crate with many dependencies; split out the types part, which is most likely to be shared between different pieces of eden/mononoke infrastructure.
With this split it was easy to get eden/mononoke/mercurial/bundles
Reviewed By: farnz
Differential Revision: D20869220
fbshipit-source-id: e9ee4144e7f6250af44802e43221a5b6521d965d
Summary:
By switching to the new futures API, we can save a few heap allocations
and reduce the indentation of the code.
Reviewed By: krallin
Differential Revision: D20666338
fbshipit-source-id: 730a97e0365c31ec1a8ab2995cba6dcbf7982ecd
Summary:
We had accumulated lots of unused dependencies, and had several test_deps in deps instead. Clean this all up to reduce build times and speed up autocargo processing.
Net removal is of around 500 unneeded dependency lines, which represented false dependencies; by removing them, we should get more parallelism in dev builds, and less overbuilding in CI.
Reviewed By: krallin, StanislavGlebik
Differential Revision: D20999762
fbshipit-source-id: 4db3772cbc3fb2af09a16601bc075ae8ed6f0c75
Summary:
RepoBlobstore is currently a type alias for the underlying blobstore type. This
is less than ideal for a few reasons:
- It means we can't add convenience methods on it. Notably, getting access to
the underlying blobstore can be helpful in tests, but as-is we cannot do that
(see the test that I updated in the LFS server change in this diff for an
example).
- Since the various blobstores we use for wrapping are blobstores themselves,
it is possible when deconstructing the repo blobstore to accidentally forget
to remove one layer. By making the internal blobstore a `T` (sketched below),
we can let the compiler prove that deconstructing the `RepoBlobstore` is done
properly.
Most of the changes in this diff are slight refactorings to make this compile
(e.g. removing obsolete trait bounds, etc.), but there are a couple functional
changes:
- I've extracted the RedactedBlobstore configuration into its own Arc. This
enables us to pull it back out of a RedactedBlobstore without having to copy
the actual data that's in it.
- I've removed `as_inner()` and `into_inner()` from `RedactedBlobstore`. Those
methods didn't really make sense. They had 2 use cases:
- Deconstruct the `RedactedBlobstore` (to rebuild a new blobstore). This is
better handled by `as_parts()`.
- Get the underlying blobstore to make a request. This is better handled by
yielding the blobstore when checking for access, which also ensures you
cannot accidentally bypass redaction by using `as_inner()` (which also
allowed me to remove a clone of the blobstore in the process).
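A hedged sketch of the struct-over-alias idea mentioned above (heavily simplified; the real RepoBlobstore also carries redaction and prefixing concerns):

```rust
// Instead of `type RepoBlobstore = Wrapper<...>`, wrap the inner blobstore
// in a struct generic over `T`, so layers can only be added or removed
// through methods the compiler can check.
pub struct RepoBlobstore<T> {
    inner: T,
}

impl<T> RepoBlobstore<T> {
    pub fn new(inner: T) -> Self {
        Self { inner }
    }

    /// Convenience accessor, e.g. for tests that want the raw blobstore.
    pub fn as_inner_blobstore(&self) -> &T {
        &self.inner
    }

    /// Deconstruct explicitly; no layer can be forgotten by accident.
    pub fn into_inner_blobstore(self) -> T {
        self.inner
    }
}
```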
Reviewed By: farnz
Differential Revision: D20941351
fbshipit-source-id: 9fa566702598b916cb87be6b3f064cd7e8e0b3e0
Summary:
Filenode envelopes have metadata; let's display it as well.
Although I've never seen it be non-empty, whenever I investigate some
filenode difference, I would like to know for sure.
Reviewed By: StanislavGlebik
Differential Revision: D20951954
fbshipit-source-id: 188321591e0d591d31e1ca765994f953dc23221c
Summary: It says I was doing it the slow way. Do it the fast way.
Reviewed By: krallin
Differential Revision: D20926911
fbshipit-source-id: 65790d510d626e70a402c22a2df5d7606427aa7f
Summary: In production, we'll never look at blobstores on their own. Use the standard cachelib and memcache layers in benchmarks to test with caching.
Reviewed By: krallin
Differential Revision: D20926910
fbshipit-source-id: 030dcf7ced76293eda269a31adc153eb6d51b48a
Summary: This lets us look at a blobstore's behaviour for repeated single reads, parallel same-blob reads, and parallel reads to multiple blobs.
Reviewed By: krallin
Differential Revision: D20920206
fbshipit-source-id: 24d9a58024318ff3454fbbf44d6f461355191c55
Summary: Not in use any more - all hooks are now in Bonsai form - so remove it.
Reviewed By: krallin
Differential Revision: D20891164
fbshipit-source-id: b92f169a0ec3a4832f8e9ec8dc9696ce81f7edb3
Summary: These hooks now run on modified files, not just added files, after porting to Bonsai form.
Reviewed By: krallin
Differential Revision: D20891166
fbshipit-source-id: 93a142f91c0bea7f5fe5e541530c644d215dce3a
Summary:
We were excluding warmup, which might take a noticeable amount of time. Let's
measure everything.
Reviewed By: krallin
Differential Revision: D20920211
fbshipit-source-id: f48b0c2425eb2bae2991fa537dde1bc61b5e44ac
Summary: This allows the client to do proper feature detection.
Reviewed By: krallin
Differential Revision: D20910379
fbshipit-source-id: c7b9d4073e94518835b39809caf8b068f70cbc2f
Summary:
The Mercurial SHA1 is defined as:
sorted([p1, p2]) + content
The client wants to be able to verify the commit hashes returned by
getcommitdata. Therefore, also write the sorted parents so the client can
calculate the SHA1 easily without fetching SHA1s of parents. This is
useful because we also want to make commit SHA1s lazy on client-side.
I also changed the NULL behavior so the server does not return
content for the NULL commit, as it would fail the SHA1 check.
The server expects the client to already know how to handle
the NULL special case.
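For reference, a sketch of the check this enables on the client, assuming the `sha1` crate (NULL-parent handling omitted; the real client code differs):

```rust
use sha1::{Digest, Sha1};

/// Mercurial node hash: SHA1 of the two parent hashes in sorted order,
/// followed by the content.
fn hg_node_id(p1: &[u8; 20], p2: &[u8; 20], content: &[u8]) -> [u8; 20] {
    let (lo, hi) = if p1 <= p2 { (p1, p2) } else { (p2, p1) };
    let mut hasher = Sha1::new();
    hasher.update(lo);
    hasher.update(hi);
    hasher.update(content);
    hasher.finalize().into()
}
```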
Reviewed By: krallin
Differential Revision: D20910380
fbshipit-source-id: 4a9fb8ef705e93c759443b915dfa67d03edaf047
Summary:
This makes sense to have when running locally. If you're running Mononoke LFS
locally, then implicitly your access is governed by whether you have access to
the underlying data. If you are on the source control team and you do have
access, it makes sense to let you run without ACL checks (since you could
rebuild from source anyway).
Reviewed By: farnz
Differential Revision: D20897249
fbshipit-source-id: 43e8209952f22aa68573c9b94a34e83f2c88f11b
Summary:
When a client requests a blob that is redacted, we should tell them that,
instead of returning a 500. This does that: we now return a `410 Gone` when
redacted content is accessed.
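A hedged sketch of the mapping, using the `http` crate's status codes (the error enum is a stand-in for the LFS server's own error type):

```rust
use http::StatusCode;

// Stand-in error kinds for illustration.
enum FetchError {
    Redacted,
    Internal,
}

fn status_for(err: &FetchError) -> StatusCode {
    match err {
        // Redacted content is intentionally unavailable: tell the client.
        FetchError::Redacted => StatusCode::GONE, // 410
        FetchError::Internal => StatusCode::INTERNAL_SERVER_ERROR, // 500
    }
}
```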
Reviewed By: farnz
Differential Revision: D20897251
fbshipit-source-id: fc6bd75c82e0cc92a5dbd86e95805d0a1c8235fb
Summary:
If a blob is redacted, we shouldn't crash in batch. Instead, we should return
that the blob exists, and let the download path tell the client that the blob
is redacted. This diff does that.
Reviewed By: HarveyHunt
Differential Revision: D20897247
fbshipit-source-id: 3f305dfd9de4ac6a749a9eaedce101f594284d16
Summary:
502 made a bit of sense since we can occasionally proxy things to upstream, but
it's not very meaningful because our inability to service a batch request is
never fully upstream's fault (it would not be a failure if we had everything
internally).
So, let's just return a 500, which makes more sense.
Reviewed By: farnz
Differential Revision: D20897250
fbshipit-source-id: 239c776d04d2235c95e0fc0c395550f9c67e1f6a
Summary:
I noticed this while doing some unrelated work on this code. Basically, if we
get an error from upstream, then we shouldn't return an error to the client
*unless* upstream being down means we are unable to satisfy their request
(meaning, we are unable to say whether a particular piece of content is
definitely present or definitely missing).
This diff fixes that. Instead of checking for success when hearing from
upstream _then_ running our routing logic, let's instead only fail if, in the
course of trying to route the client, we discover that we need a URL from
upstream AND upstream has failed.
Concretely, this means that if upstream blew up but internal has all the data
we want, we ignore the fact that upstream is down. In practice, internal is
usually very fast (because it's typically all locally-cached) so this is
unlikely to really occur in real life, but it's still a good idea to account
for this failure scenario.
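A hedged sketch of the decision order (hypothetical types; the real routing code is considerably more involved):

```rust
// Stand-ins for illustration.
struct Url(String);
enum Action {
    Download(Url),
    Upload,
}

fn route(
    internal_url: Option<Url>,
    upstream: Result<Option<Url>, String>, // Err = upstream is down
) -> Result<Action, String> {
    // If internal can serve the object, upstream's health is irrelevant.
    if let Some(url) = internal_url {
        return Ok(Action::Download(url));
    }
    // Only now do we need upstream; propagate its failure if it is down.
    match upstream? {
        Some(url) => Ok(Action::Download(url)),
        None => Ok(Action::Upload), // definitely missing everywhere
    }
}
```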
Reviewed By: HarveyHunt
Differential Revision: D20897252
fbshipit-source-id: f5a8598e8a9da382d0d7fa6ea6a61c2eee8ae44c
Summary:
Right now we have a couple functions, but they're not easily composable. I'd
like to make the redacted blobs configurable when creating a test repo, but I
also don't want to have 2 new variants, so let's create a little builder for
test repos.
This should make it easier to extend in the future to add more customizability
to test repos, which should in turn make it easier to write unit tests :)
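A hedged sketch of the builder shape (hypothetical fields; the real builder would configure more than redaction):

```rust
use std::collections::HashMap;

// Hypothetical test repo holding only the bits relevant here.
pub struct TestRepo {
    pub redacted: HashMap<String, String>,
}

#[derive(Default)]
pub struct TestRepoBuilder {
    redacted: Option<HashMap<String, String>>,
}

impl TestRepoBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Configure redacted blobs without needing a new constructor variant.
    pub fn redacted(mut self, redacted: HashMap<String, String>) -> Self {
        self.redacted = Some(redacted);
        self
    }

    pub fn build(self) -> TestRepo {
        TestRepo {
            redacted: self.redacted.unwrap_or_default(),
        }
    }
}
```

Usage would then look like `TestRepoBuilder::new().redacted(map).build()`, and future options can be added as new builder methods.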
Reviewed By: HarveyHunt
Differential Revision: D20897253
fbshipit-source-id: 3cb9b52ffda80ccf5b9a328accb92132261616a1
Summary:
This asyncifies the internals of `subcommand_tail`, which
loops over a stream, by taking the operation performed in
the loop and making it an async function.
The resulting code saves a few heap allocations by reducing
clones, and is also *much* less indented, which helps with
readability.
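A hedged sketch of the resulting shape (stand-in types; the real loop body is the tailing operation):

```rust
use futures::stream::{Stream, StreamExt};

struct Entry; // stand-in for the items the tailer loops over

async fn process_one(_entry: Entry) {
    // the body that used to live in a combinator closure
}

async fn tail(mut entries: impl Stream<Item = Entry> + Unpin) {
    // A flat `while let` instead of nested combinators: fewer clones are
    // needed because the async fn can borrow across await points.
    while let Some(entry) = entries.next().await {
        process_one(entry).await;
    }
}
```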
Reviewed By: krallin
Differential Revision: D20664511
fbshipit-source-id: 8e81a1507e37ad2cc59e616c739e19574252e72c