Commit Graph

257 Commits

Author SHA1 Message Date
Stanislau Hlebik
c2fec9f75d mononoke: pass &str instead of String
Summary: small cleanup - will help remove clone in the later diff

Reviewed By: farnz

Differential Revision: D20441266

fbshipit-source-id: ee15c7bdbfe2f42ccd1d10abbbe0d43e89547df5
2020-03-13 09:26:16 -07:00
Mateusz Kwapich
0cb4e3eca6 return generation numbers
Summary:
let's return the generation numbers as part of commit info - they are cheap to
obtain.

Reviewed By: StanislavGlebik

Differential Revision: D20426744

fbshipit-source-id: 50c7017c55aeba04fb9059e2c1db19f2fb0a6e5e
2020-03-13 08:31:07 -07:00
Mateusz Kwapich
a522cde826 add generation numbers to the changeset API
Summary:
Let's return those to the caller. The changeset is already fetched at this
point so this should come at no extra cost.

Reviewed By: markbt

Differential Revision: D20426745

fbshipit-source-id: 7e28f5fd44efdb502b8e37a7afac3ea9113e2c5e
2020-03-13 08:31:07 -07:00
Mateusz Kwapich
e70e7a9017 rename get_generation_number_by_bonsai to get_generation_number
Summary: I'm about to introduce one more usecase of it so let's rename it first.

Reviewed By: farnz

Differential Revision: D20393776

fbshipit-source-id: d74146fa212cdc4989a18c2cbd28307f58994759
2020-03-13 08:31:07 -07:00
Kostia Balytskyi
23dee89ed0 mononoke: make commit validator manifest-diff-based
Summary:
Currently, x-repo commit validator runs full working copy comparison every time. This is slow, as it requires fetching of full manifests for the commit versions in both a small and a large repo. For those of leafs, which are different, filenodes are also fetched, so that `ContentId`s can be compared. It's slow. Because it's slow, we skip the majority of commits and validate only 1 in ~40 `bookmarks_update_log` entries. Obviously, this is not ideal.

To address this, this diff introduces a new way to run validation: based on comparing full manifest diffs. For each entry of the `bookmarks_update_log` of the large repo, we:
- unfold it into a list of commits it introduces into the repository
- fetch each commit
- see how it rewrites to small repos
- compute manifest differences between the commit and it's parent in each involved repo
- compare those differences
- check that topological order relationship remains sane (i.e. if `a` is a p1 of `b` in a small repo, than `a'` must be a p1 of `b'` in a large repo, where `a'` and `b'` are remappings of `a` and `b`)

In addition, this diff adds some integration tests and gets rid of the skipping logic.

Reviewed By: StanislavGlebik

Differential Revision: D20007320

fbshipit-source-id: 6e4647e9945e1da40f54b7f5ed79651927b7b833
2020-03-13 07:46:48 -07:00
Kostia Balytskyi
206a18703f mononoke: refactor cross_repo_sync/validation a little bit
Summary:
Replace ownership moves with borrows here and there, add fetching of `FileType`
as well.

Reviewed By: krallin

Differential Revision: D20387564

fbshipit-source-id: 378511402277d744ddfbb68e01a3f5d1707f6d08
2020-03-13 07:46:47 -07:00
Stanislau Hlebik
c96d794502 mononoke: fix integration test regression by reducing number of sqlite shards
Summary:
Even simplest test took ~30 seconds to finish. Turned out most of the time is
sepnt in creating sqlite shards, and the reason we haven't noticed it before is
because we haven't use sqlblob in integration tests until recently (until
February 24 to be precise - D20001261).

We don't need 100 sqlite shards, let's decrease it downto 5.

Reviewed By: HarveyHunt

Differential Revision: D20438107

fbshipit-source-id: e71fd4cabf908e3d92b446fc518a0e5dd64a00bb
2020-03-13 06:28:02 -07:00
Stefan Filip
78deb9142b mononoke: add stats collection for getcommitdata
Summary: Stats. Cargo culted from `gettreepack`.

Reviewed By: StanislavGlebik

Differential Revision: D20377955

fbshipit-source-id: f64b702d3f9f3c76915db5b397dfaac77a848fa2
2020-03-12 14:36:13 -07:00
Stefan Filip
974ccafbb0 mononoke: initial implementation for getcommitdata
Summary:
The input for getcommitdata is a list of HgChangesetIds. For every entry, the
endpoint retrieves the commit, formats it like the revlog then returns it back
to the client.
The format for the returned entries is:
```
Hash " " Length LF
RevlogContent LF
```
Looking for recommendations for how to structure the code better.
Looking for recommendations on implementation requirements:
metrics, throttling, veriftying hash for returned bytes.

Reviewed By: krallin

Differential Revision: D20376665

fbshipit-source-id: 5d9eb0d581fd2b352cf3ce44f4777ad45076c8f4
2020-03-12 14:36:13 -07:00
Stefan Filip
f17234a665 mercurial: add method for directly loading RevlogChangeset
Summary:
The data portion of the changelog revlog from local checkouts is going away.
Instead of fetching data from disk, data will be fetched from the server.
For migration purposes we want the data from the server to match the data that
would have been available locally.

Note. I don't understand the difference between `RevlogChangeset` and
`HgChangesetContent`. I want to eventually call
`mercurial_revlog::changeset::serialize_cs` and this change reduces
the number of transformations required. The whole process is more direct and
thus easier to follow.

Reviewed By: StanislavGlebik

Differential Revision: D20373568

fbshipit-source-id: f2f4eccdf6a35b322e13c5a4b3fa18d5d289848e
2020-03-12 14:36:12 -07:00
Stefan Filip
450073c203 commands: add getcommitdata command
Summary:
The diff only contains HgCommand signatures. No implementation yet.

The purpose of the getcommitdata command is to return the serialized contents
of a commit. Given a Mercurial Changelog Id, the endpoint returns the same
contents that the Revlog would return on a Mercurial server.

At this point I am looking for suggestions regarding the protocol and the
implementation. My assumption is that both request and response can be fully
kept in memory. I think that we may decide that the request is going to be
streamed to the client so the initial protocol allows for that.

Requirements:
Input: HgChangelogId
Output: Changelog entry contents

Protocol Summary:
```
Request: "getcommitdata" LF "nodes " Length LF Length*(HEXDIG|" ")
Response: *(40HEXDIG Length LF Length*(%x00-79) LF)
```

A bit of a silly protocol. Let me know what recommendations you have.
The Request is modelled after the "known" command. This allows for efficient
batching compared to a batch command model. It's a bit awkward that we don't
pass in the number of HgChangelogId entries that we have in the request but
that is the existing protocol.

For every HgChangelogId in the request the response will first have a line
with the HgChangelogId that was requested and the length of the contents.
The next line will contain the contents followed by line feed.

Reviewed By: krallin

Differential Revision: D20345367

fbshipit-source-id: 50dffff4f6c60396f564f2f1f519744ce730bf96
2020-03-12 14:36:12 -07:00
Xiaoting Tang
1951fa305e Replace deprecated rust-crypto in eden/mononoke/git/git_types
Summary:
rust-crypto hasn't been updated since 2016, we are replacing it according to https://github.com/RustSec/advisory-db/blob/master/crates/rust-crypto/RUSTSEC-2016-0005.toml
Specifically, we replace crypto::Digest with digest::Digest, crypto::Sha1 with sha1::Sha1.

Reviewed By: krallin

Differential Revision: D20424524

fbshipit-source-id: aa5d268cb5d3c4b1b4e9f7c6f1f3c41d559f121a
2020-03-12 13:56:37 -07:00
Arun Kulshreshtha
7a22f95631 mononoke_api: use fbinit::compat_test
Summary:
Avoid boilerplate in unit tests by consistently using `fbinit::compat_test`.

This is a very mechanical change; most of the modified lines are due to indentation changes.

Reviewed By: krallin

Differential Revision: D20414564

fbshipit-source-id: 11c6bd702de28c8a21978f0345d2b0ff8a574e72
2020-03-12 12:34:07 -07:00
Stanislau Hlebik
e3bf91e944 mononoke: add generate_placeholder_diff option
Summary:
Let's use functionality added in D20389226 so that we can generate
diffs even for large files.

I've contemplated between two approaches: either "silently" generate
placeholder diff for files that are over the limit or add a new option
where client can request these placeholders. I choose the latter for a few
reasons:

1) To me option #1 might cause surprises e.g. diffing a single large file
doesn't fail, but diffing two files whose size is (LIMIT / 2 + 1) will fail.
Option #2 let's the client be very explicit about what it needs - and it also
won't return a placeholder when the actual content is expected!
2) Option #2 makes the client think about what it wants to be returned,
and it seems in line with source control thrift API design in general i.e.
commit_file_diffs is not trivial to use by design to because force client to
think about edge cases. And it seems that forcing client to think about additional
important edge case (i.e. large file) makes sense.
3) The code change is simpler with option #2

I also thought about adding generate_placeholder_diff parameter to CommitFileDiffsParams,
but that makes integration with scs_helper.py harder.

Reviewed By: markbt

Differential Revision: D20417787

fbshipit-source-id: ab9b32fd7a4768043414ed7d8bf39e3c6f4eb53e
2020-03-12 12:00:26 -07:00
Thomas Orozco
62a027768f mononoke: add a test for BFS fetching over SSH
Summary:
This adds a test demonstrating that we can perform BFS fetching over SSH. The
test should demonstrate that the fetches are:

- Done in a BFS fashion (we don't fetch the entire trees before comparing them,
  instead we do a fetch at each layer in the tree). In particular, note that
  the cc tree, which is unchanged, doesn't get explored at all.
- Done using the right paths / filenodeids, and therefore the right linknodes
  are located.

Reviewed By: farnz

Differential Revision: D20387124

fbshipit-source-id: b014812b0e6e85a5cdf6abefe3fe4f47b004461e
2020-03-12 11:16:07 -07:00
Thomas Orozco
999333424e mononoke/server: add support for logging Scuba to a file
Summary:
This is convenient because it makes it possible to tell what is going on within
Mononoke from the outside (e.g. introspect perf counters).

I'll land this after D20387115.

Reviewed By: farnz

Differential Revision: D20387125

fbshipit-source-id: ee070c4d658a0ec8f232152fe8b34bd0b56e6888
2020-03-12 11:16:07 -07:00
Thomas Orozco
86ad431e16 mononoke/fastreplay: add an replay_outcome_permille counter
Summary:
Similarly to D20418610, let's keep track of request failure rates in ODS
counters so they can be health-cheked by SF.

Reviewed By: farnz

Differential Revision: D20418705

fbshipit-source-id: 1d281e06c59414a0610836106370592596b47e39
2020-03-12 08:18:11 -07:00
Thomas Orozco
b60606cf9a mononoke/request_handler: log success permille to its own counter
Summary:
Earlier, I added counters for successes and failures, but unfortunately
canaries do not allow for tracking formulas, so I also need the actual request
success rate logged.

More broadly speaking, it probably does make sense to have a very high level
"success rate" counter to query anyway. This adds it.

Reviewed By: farnz

Differential Revision: D20418610

fbshipit-source-id: 94f115f8201aef4f6498f4da98e170194186f0f8
2020-03-12 08:13:09 -07:00
Thomas Orozco
06d11938cf mononoke/hgproto: expect comma-terminated list of directories
Summary:
We need to differentiate the empty directory from no directories. Adding a
trailing comma after each directory instead of separating them achieves that.

Reviewed By: StanislavGlebik

Differential Revision: D20309700

fbshipit-source-id: 387ec477560968392de0a9631d67ccb591bd3cab
2020-03-12 06:35:34 -07:00
Thomas Orozco
87491d14a7 mononoke/repo_client: add designated manifest fetching in gettreepack
Summary:
This will allow the hg client to do tree fetching like we do in the API Server,
but through the SSH protocol — i.e. by passing a series a manifest ids and
their paths, without recursion on the server side through gettreepack.

Reviewed By: StanislavGlebik

Differential Revision: D20307442

fbshipit-source-id: a6dca03622becdebf41b264381fdd5837a7d4292
2020-03-12 06:35:34 -07:00
Stanislau Hlebik
ba871d3bdc xdiff: allow rendering diff for large files
Summary:
The goal of the stack is to support "rendering" diffs for large files in scs
server. Note that rendering is in quotes - we are fine with just showing a
placeholder like "Binary file ... differs". This is still better than the
current behaviour which just return an error.

In order to do that I suggest to tweak xdiff library to accept FileContentType
which can be either Normal(...) meaning that we have file content available, or
Omitted, which usually means the file is large and we don't even want to fetch it, and we
just want xdiff to generate a placeholder.

Reviewed By: markbt, krallin

Differential Revision: D20389226

fbshipit-source-id: 0b776d4f143e2ac657d664aa9911f6de8ccfea37
2020-03-12 04:27:23 -07:00
Thomas Orozco
2955121955 mononoke/lfs_server: get rid of futures-old, tokio-old, bytes-old
Summary:
Our post-request processing was still implemented using old futures. Let's get
rid of that, and make it all new futures. This will make it easier to maintain,
and means we can get rid of the tokio-old dependency.

Reviewed By: mitrandir77

Differential Revision: D20343188

fbshipit-source-id: 513e124805e2ca588a1e312d8f5ca697ed6030c8
2020-03-11 10:22:28 -07:00
Thomas Orozco
399fd6c573 mononoke/{edenapi,lfs}_server: update to new Hyper, new Bytes, new Gotham
Summary:
This updates the lfs server and eden api server to use a newer version of
Gotham, which comes along with an updated version of Bytes and Hyper.

A few things had to change for this:

- New bytes don't support concatenation, so we need to fold them ourselves,
  except...
- ... new Hyper bodies don't tell you how big they are (either in requests or
  responses), so we need to inspect headers to find the size instead (I added
  this in `gotham_ext::body_ext::BodyExt`, although it arguably belongs more in
  a `hyper_ext` crate, but creating a new crate for just this seems overkill).
- New Hyper requires its data stream to be `Sync` for reasons that have more to
  do with developer experience than runtime
  (https://github.com/hyperium/hyper/pull/1857). Unfortunately, our Filestore
  streams aren't `Sync`, because our `BoxFuture` contains a `dyn Future` that
  isn't explicitly `Sync` (which is how we pull things out of blobstores). Even
  if `BoxFuture` contained a `Sync` future, that still wouldn't be enough
  anyway, because `compat()` explicitly implements `!Sync` on the stream it
  returns. I'll ask upstream in Hyper if this can possibly change in the
  future, but for now we can work around it by wrapping the stream in a
  channel. I'll keep an eye out for performance here.
- When I updated our "pre state data" tweaks on top of Gotham, I renamed those
  to "socket data", since that's a better name or what they are (hence the
  changes here).
- I updated the lfs_protocol to stop depending on Hyper and instead depend on
  http, since that's all we need here.

As you review this, please pay close attention to the updated implementation of
`SignalStream`. Since this is a custom `Stream` in new futures, it requires a
bit of `unsafe { ... }`.

Note that, unfortunately, the diff includes both the Gotham update and the
server updates, since they have to happen together.

Reviewed By: kulshrax, dtolnay

Differential Revision: D20342689

fbshipit-source-id: a490db96ca7c4da8ff761cb80c1e7e3c836bad87
2020-03-11 10:22:28 -07:00
Thomas Orozco
04f347484b mononoke: allow selecting a priority in hgcli, and passing it to Mononoke
Summary:
This adds the ability to specify a priority in hgcli, and to pass it on to
Mononoke. This will be used to replay commit cloud traffic at a lower priority.

Reviewed By: farnz

Differential Revision: D20038573

fbshipit-source-id: 4055d28ee295e2b15c15945bd3741f6d739ead3a
2020-03-11 08:54:51 -07:00
Thomas Orozco
c5917acc3f mononoke: context_concurrency_blobstore
Summary:
This adds a blobstore that can reach into a CoreContext in order to identify
the allowed level of concurrency for blobstore requests initiated by this
CoreContext. This will let us replay infinitepush bundles with limits on a
per-request basis.

Reviewed By: farnz

Differential Revision: D20038575

fbshipit-source-id: 07299701879b7ae65ad9b7ff6e991ceddf062b24
2020-03-11 08:54:51 -07:00
Stanislau Hlebik
0f6a8e259f mononoke: log reponame to derived data scuba table
Reviewed By: HarveyHunt

Differential Revision: D20363458

fbshipit-source-id: b52f0051b8b00fc601d00686bc732585317403bc
2020-03-10 11:42:31 -07:00
Aida Getoeva
3e510491cc scs/history: don't buffer stream if there is no need
Summary:
Time filters for file history require fetching changeset or changeset info to decide whether to include the commit into response. To speed up the process instead of sequential mapping, I buffer the map stream in batches of 100.
However it is unfortunate to fetch additional 100 history changesets if there is no time filters and only several commits were requested. Keeping also in mind that the most of the requests don't care about time.
Avoiding that would speed up the history generation,

I changed the changeset_path API so now it just returns the stream of changeset contexts. And commit_path API for history if there is no time filters collects everything to vector, otherwise applies them and also collects to vector. Then this vector is converted into response using FuturesOrdered.

Reviewed By: StanislavGlebik

Differential Revision: D20287497

fbshipit-source-id: 0c6b1315eccddb48f67bf5fa732bdf7c9a54a489
2020-03-10 10:03:05 -07:00
Stanislau Hlebik
4241ca0272 mononoke: add repo name in BlobRepo
Summary:
A lot of callsites want to know repo name. Currently they need to pass it from
the place where repo was initialized, and that's quite awkward, and in some
places even impossible (i.e. in derived data, where I want to log reponame).

This diff adds reponame in BlobRepo

Reviewed By: krallin

Differential Revision: D20363065

fbshipit-source-id: 5e2eb611fb9d58f8f78638574fdcb32234e5ca0d
2020-03-10 08:32:33 -07:00
Stanislau Hlebik
a7f67cb698 mononoke: remove in_memory_writes* method
Summary: DangerousOverride trait is better

Reviewed By: krallin

Differential Revision: D20362886

fbshipit-source-id: f083936bce60faa0c60ccd4672704fc642b68234
2020-03-10 07:51:28 -07:00
Thomas Orozco
6f5c623899 mononoke/repo_listener: track request success / failure in ODS
Summary:
We track this in Scuba right now (and alarm on it), but tracking it in ODS will
make it easier to incorporate in our canary and post-release health check
workflow.

Reviewed By: StanislavGlebik

Differential Revision: D20361803

fbshipit-source-id: 99fb514d41f9cda42c3c9a82f3b8d6681285430a
2020-03-10 06:25:11 -07:00
Mateusz Kwapich
9fd7f0d2b4 improve the check for conflicts during insert
Summary:
Before we assumed that if the rows_affected length doesn't match the number of
entries we were trying to insert we have a conflict. Let's verify if we really
have conflict or we're trying to insert the same entry twice.

Reviewed By: krallin

Differential Revision: D20343219

fbshipit-source-id: 19e032439fdd65f5fe1afe1a10b401bc2fe33462
2020-03-10 05:47:05 -07:00
Mateusz Kwapich
e1bf77097f test showing the blobimport problem.
Summary: Running blobimport twice on the same commit seems to cause problems.

Reviewed By: krallin

Differential Revision: D20343218

fbshipit-source-id: 4d572630e7c15c219bee8db15cc879b2cb8602fe
2020-03-10 05:47:05 -07:00
Alex Hornby
b8ca854c0b mononoke: walker: add ability to walk all published bookmarks
Summary: Add ability to walk all published bookmarks as there may be multiple important bookmarks

Reviewed By: krallin

Differential Revision: D20249806

fbshipit-source-id: aff2ee1ec7d51a9e4fb6e1e803612abd207fd6cb
2020-03-10 05:26:35 -07:00
Lukas Piatkowski
6365fa6509 rust-shed: add no-op memcache_stub implementation to the shed
Reviewed By: mitrandir77

Differential Revision: D20304739

fbshipit-source-id: bd2956619f6a5cf1551bccb921780e7a542e9859
2020-03-10 01:07:46 -07:00
Aida Getoeva
84c0aa9e8c mononoke: allow to derive changeset info
Summary: This also replaces RepoBlobstore with generic Blobstore.

Reviewed By: krallin

Differential Revision: D20336107

fbshipit-source-id: 4e09430d462053d5ccb0b2f0eca7a9d0417b4b5e
2020-03-09 16:41:53 -07:00
Stanislau Hlebik
b0cb300af8 mononoke: introduce BlobrepoBuilder
Summary:
The goal of the whole stack is quite simple (add reponame field to BlobRepo), but
this stack also tries to make it easier to initialize BlobRepo.

To do that BlobrepoBuilder was added. It now accepts RepoConfig instead of 6
different fields from RepoConfig - that makes it easier to pass a field from
config into BlobRepo. It also allows to customize BlobRepo. Currently it's used
just to add redaction override, but later we can extend it for other use cases
as well, with the hope that we'll be able to remove a bunch of repo-creation
functions from cmdlib.

Because of BlobrepoBuilder we no longer need open_blobrepo function. Later we
might consider removing open_blobrepo_given_datasources as well.

Note that this diff *adds* a few new clones. I don't consider it being a big
problem, though I'm curious to hear your thoughts folks.

Note that another option for the implementation would be to take a reference to objects
instead of taking them by value. I briefly looked into how they used, and lot of them are passed to the
objects that actually take ownership of what's inside these config fields. I.e. Blobstore essentially takes ownership
of BlobstoreOptions, because it needs to store manifold bucket name.
Same for scuba_censored_table, filestore_params, bookmarks_cache_ttl etc. So unless I'm missing anything, we can
either pass them as reference and then we'll have to copy them, or we can
just pass a value from BlobrepoBuilder directly.

Reviewed By: krallin

Differential Revision: D20312567

fbshipit-source-id: 14634f5e14f103b110482557254f084da1c725e1
2020-03-09 12:05:57 -07:00
Stanislau Hlebik
44effbe62e mononoke: log derived data type and changeset as separate fields
Summary:
We've recently added new scuba table for derived data
(https://fburl.com/scuba/mononoke_derived_data/e4sekisf), and looks like our
previous format of logging is not very useful. It's better to have separate
fields for changeset id and derived data type, since it makes aggregation
easier.

Reviewed By: krallin

Differential Revision: D20309093

fbshipit-source-id: 48f5f04e0412002ef04028e34b12bf267a9b6834
2020-03-09 10:56:59 -07:00
Mark Thomas
63d704f456 mercurial_types: add mock functions for generating commit IDs
Summary:
The mock crates contain a standard set of mock commits ids with all nybbles
set to a single hex value.

For the mutation tests I want to be able to generate them from a number and
have more than 15 changeset IDs.  Add a new function that generates an hg
changeset ID from a number.

Reviewed By: krallin

Differential Revision: D20287382

fbshipit-source-id: 1f57de89f19e2e2eea8dbfea969a4d54510e23d8
2020-03-09 07:24:46 -07:00
Stanislau Hlebik
c1ea56c17c mononoke: asyncify blobrepo_factory
Summary:
Note that comparing to many other asyncifying efforts, this one actually adds
one more clone instead of removing them. This is the clone of a logger field.
That shouldn't matter much because it can be cleaned up later and because this
function will be called once per repo.

Reviewed By: krallin

Differential Revision: D20311122

fbshipit-source-id: ace2a108790b1423f8525d08bdea9dc3a2e3c37c
2020-03-09 05:02:05 -07:00
Stanislau Hlebik
c15405d96d mononoke: remove unnecessary clones from lfs_server
Reviewed By: krallin

Differential Revision: D20310686

fbshipit-source-id: 662a9cf358e389e848b1196bbf03469aec691ab8
2020-03-06 13:07:10 -08:00
Kuba Zika
6a25dbee81 Simplify error pattern matching
Summary:
Instead of returning `anyhow::Error` wrapping an `ErrorKind` enum
from each Thrift client method, just return an error type specific
to that method. This will make error handling simpler and less
error-prone by removing the need to downcast the returned error.

This diff also removes the `ErrorKind` enums so that we can be sure
that there are no leftover places trying to downcast to them.

(Note: this ignores all push blocking failures!)

Reviewed By: dtolnay

Differential Revision: D20260398

fbshipit-source-id: f0dd96a7b83dd49f6b30948660456539012f82e6
2020-03-06 12:09:38 -08:00
Stanislau Hlebik
b905d1399c mononoke: change error of log messages
Summary:
Previously we could have "Started ..." before "Starting ..."
This diff fixes it.

Reviewed By: krallin

Differential Revision: D20277406

fbshipit-source-id: 3c2f3fa1723c2e0852c6b114592ab7ad90be17ff
2020-03-06 10:24:24 -08:00
Thomas Orozco
9493a05e7b mononoke/filestore: update store_bytes to chunk content
Summary:
This updates the store_bytes method to chunk incoming data instead of uploading
it as-is. This is unfortunately a bit hacky (but so was the previous
implementation), since it means we have to hash the data before it has gone
through the Filestore's preparation.

That said, one of the invariants of the filestore is that chunk size shouldn't
affect the Content ID (and there is fairly extensive test coverage for this),
so, notionally, this does work.

Performance-wise, it does mean we are hashing the object twice. That actually
was the case before as well anyway (since obtain the ContentId for FileContents
would clone them then hash them).

The upshot of this change is that large files uploaded through unbundle will
actually be chunked (whereas before, they wouldn't be).

Long-term, we should try and delete this method, as it is quite unsavory to
begin with. But, for now, we don't really have a choice since our content
upload path does rely on its existence.

Reviewed By: StanislavGlebik

Differential Revision: D20281937

fbshipit-source-id: 78d584b2f9eea6996dd1d4acbbadc10c9049a408
2020-03-06 07:43:07 -08:00
Thomas Orozco
56a7ce8697 mononoke/filestore: make FilestoreConfig Copy and pass it by value
Summary:
This is a very small struct (2 u64s) that really doesn't need to be passed by
reference. Might as well just pass it by value.

Differential Revision: D20281936

fbshipit-source-id: 2cc64c8ab6e99ee50b2e493eff61ea34d6eb54c1
2020-03-06 02:00:23 -08:00
Lukas Piatkowski
bdb3b625d1 blobstore: cover more blobstores to make them OSS buildable
Reviewed By: farnz

Differential Revision: D20221288

fbshipit-source-id: 708be6d429e673dcb4201b88541dff2bf9fca153
2020-03-06 01:33:38 -08:00
Lukas Piatkowski
7ddcdd818c mononoke: make sql_ext OSS buildable
Summary: separate out the Facebook-specific pieces of the sql_ext crate

Reviewed By: ahornby

Differential Revision: D20218219

fbshipit-source-id: e933c7402b31fcd5c4af78d5e70adafd67e91ecd
2020-03-06 01:33:38 -08:00
David Tolnay
754a755eee rust: Rename tokio_preview:: to tokio::
Summary:
Context: https://fb.workplace.com/groups/rust.language/permalink/3338940432821215/

This codemod replaces all dependencies on `//common/rust/renamed:tokio-preview` with `fbsource//third-party/rust:tokio-preview` and their uses in Rust code from `tokio_preview::` to `tokio::`.

This does not introduce any collisions with `tokio::` meaning 0.1 tokio because D20235404 previously renamed all of those to `tokio_old::` in crates that depend on both 0.1 and 0.2 tokio.

This is the tokio version of what D20213432 did for futures.

Codemod performed by:

```
rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| sed 's,TARGETS$,:,' \
| xargs \
    -x \
    buck query "labels(srcs, rdeps(%Ss, //common/rust/renamed:tokio-preview, 1))" \
| xargs sed -i 's,\btokio_preview::,tokio::,'

rg \
    --files-with-matches \
    --type-add buck:TARGETS \
    --type buck \
    --glob '!/experimental' \
    --regexp '(_|\b)rust(_|\b)' \
| xargs sed -i 's,//common/rust/renamed:tokio-preview,fbsource//third-party/rust:tokio-preview,'
```

Reviewed By: k21

Differential Revision: D20236557

fbshipit-source-id: 15068b93a0a944d6249a1d9f63840a4c61c9c1ba
2020-03-05 14:25:10 -08:00
Thomas Orozco
3ee98c82e2 mononoke/microwave: add support for changesets
Summary:
This updates microwave to also support changesets, in addition to filenodes.
Those create a non-trivial amount of SQL load when we warm up the cache (due to
sequential reads), which we can eliminate by loading them through microwave.

They're also a bottleneck when manifests are loaded already.

Note: as part of this, I've updated the Microwave wrapper methods to panic if
we try to access a method that isn't instrumented. Since we'd be running
the Microwave builder in the background, this feels OK (because then we'd find
out if we call them during cache warmup unexpectedly).

Reviewed By: farnz

Differential Revision: D20221463

fbshipit-source-id: 317023677af4180007001fcaccc203681b7c95b7
2020-03-05 11:57:43 -08:00
Thomas Orozco
dd38f1fdb2 mononoke/cache_warmup: conditionally use microwave for faster warmup
Summary:
This incorporates microwave into the cache warmup process. See earlier in this
stack for a description of what this does, how it works, and why it's useful.

Reviewed By: ahornby

Differential Revision: D20219904

fbshipit-source-id: 52db74dc83635c5673ffe97cd5ff3e06faba7621
2020-03-05 11:57:43 -08:00
Pavel Aslanov
95bf3a32a4 Report bytes sent via perf counters for stream_out_shallow command
Summary: Report bytes sent via perf counters for `stream_out_shallow` command

Reviewed By: krallin

Differential Revision: D20283114

fbshipit-source-id: 1f354904c68322b941ff0c035bb0b811e41e74a1
2020-03-05 08:58:21 -08:00