Summary: We had to do one more cleanup, so we need to bump memcache keys again
Reviewed By: farnz
Differential Revision: D16856098
fbshipit-source-id: 4c072f1aac80982cab9b8c34d9e4f11f41e48ab2
Summary:
We have a FetchKey object in the filestore that serves double duty: it is both what you use to ask for reads and what you use to write aliases. One downside of this is that the same enum is used for reads and writes, which means you could accidentally try to write an alias for the "canonical" fetch key, effectively overwriting your content.
That feels bad, and the underlying reason is that the typing doesn't really reflect what we are trying to do here: support a set of aliases for pieces of content, and expose an enum that allows clients to access content by alias or by canonical ID.
This patch updates the type hierarchy to be more in line with what we are trying to do (note that you can no longer store a canonical fetch key). One downside is that it makes the API a bit more verbose, since you now need to say e.g. `FetchKey::Aliased(Alias::Sha256(...))` instead of just `FetchKey::Sha256`. I think the tradeoff is reasonable, but I'm curious to hear other folks' thoughts.
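As a rough sketch of the described hierarchy (variant names and payload types here are assumptions for illustration, not the actual Mononoke definitions):

```rust
// Sketch only: names and payloads are assumed, not taken from the real diff.
// The point is that alias writes accept `Alias`, so the canonical key can no
// longer be stored as an alias.
#[derive(Debug, PartialEq)]
pub enum Alias {
    Sha1([u8; 20]),
    Sha256([u8; 32]),
}

#[derive(Debug, PartialEq)]
pub enum FetchKey {
    Canonical(u64), // u64 stands in for ContentId
    Aliased(Alias),
}

// Reads accept either form of key...
pub fn describe(key: &FetchKey) -> &'static str {
    match key {
        FetchKey::Canonical(_) => "fetch by canonical ContentId",
        FetchKey::Aliased(Alias::Sha1(_)) => "fetch via sha1 alias",
        FetchKey::Aliased(Alias::Sha256(_)) => "fetch via sha256 alias",
    }
}

// ...but alias writes take an `Alias` directly, so there is no way to
// accidentally overwrite the canonical entry with an alias write.
pub fn alias_blobstore_key(alias: &Alias) -> String {
    match alias {
        Alias::Sha1(_) => "alias.sha1.<hex>".to_string(),
        Alias::Sha256(_) => "alias.sha256.<hex>".to_string(),
    }
}
```

This is where the extra verbosity comes from: clients now write `FetchKey::Aliased(Alias::Sha256(...))` rather than a flat `FetchKey::Sha256(...)`.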
Reviewed By: farnz
Differential Revision: D16764582
fbshipit-source-id: 00c596d1370b52ef2aea04ac44cdcd19b7a2c92a
Summary:
This updates an earlier diff I had that introduced the typed fetch method on the Blobstore trait. Since we now have Storable and Loadable traits, it makes a little more sense to implement those for MononokeId and use that instead.
I was hoping this would mean we could move the implementation of those traits into the mononoke_types crate, but unfortunately it looks like the `where T: MononokeId` implementation for `Loadable` can't live there (Rust wants it next to the trait definition).
That said, this change does mean we can get rid of some manual BlobstoreBytes creation here and there, so that's still an upside.
In doing so, I updated the Loadable and Storable traits to require only a Blobstore reference, and to take ownership of what they're storing. This saves some clones, especially in the latter.
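A minimal sketch of the trait shapes described above, assuming a `HashMap` as a stand-in for the blobstore (the real traits are async, take a `CoreContext`, and return futures):

```rust
use std::collections::HashMap;

// Toy stand-in for the blobstore.
type Blobstore = HashMap<String, Vec<u8>>;

// store() takes ownership of the value, saving a clone of the bytes.
trait Storable {
    type Key;
    fn store(self, blobstore: &mut Blobstore) -> Self::Key;
}

// load() only needs references to the key and the blobstore.
trait Loadable {
    type Value;
    fn load(&self, blobstore: &Blobstore) -> Option<Self::Value>;
}

struct FileContents(Vec<u8>);
struct ContentKey(String);

impl Storable for FileContents {
    type Key = ContentKey;
    fn store(self, blobstore: &mut Blobstore) -> ContentKey {
        let key = format!("content.len{}", self.0.len()); // toy key scheme
        blobstore.insert(key.clone(), self.0); // bytes are moved, not cloned
        ContentKey(key)
    }
}

impl Loadable for ContentKey {
    type Value = Vec<u8>;
    fn load(&self, blobstore: &Blobstore) -> Option<Vec<u8>> {
        blobstore.get(&self.0).cloned()
    }
}
```

Taking `self` by value in `store` is what saves the clones mentioned above; callers that still need the value afterwards clone explicitly at the callsite.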
Reviewed By: aslpavel
Differential Revision: D16733733
fbshipit-source-id: 3ec3a97cef90d6850f2939ba3f5ec6e6e5b459fb
Summary:
We may need to inspect the list of blacklisted files for a given commit.
This command makes it possible to do so.
Reviewed By: krallin
Differential Revision: D16643467
fbshipit-source-id: ad389c4b9262ffa9ee6ea9b792b4866a23bebaa5
Summary:
The `get_content_by_path` method will be replaced by the more general
mononoke_api. For now, move it to a legacy module so it doesn't get in the
way.
Reviewed By: krallin
Differential Revision: D16803205
fbshipit-source-id: 020f445bc042607d6d09f15014e96f81a7f22f74
Summary: Remove `get_changeset_by_bookmark` as it is no longer used.
Reviewed By: StanislavGlebik
Differential Revision: D16803208
fbshipit-source-id: 935e4c4598aa1121d815adcf8ae3d280e60b10c3
Summary:
After we've replaced all broken linknodes (see code in the previous diff in the
stack), we'll need to update memcache keys.
Reviewed By: farnz
Differential Revision: D16803181
fbshipit-source-id: 4392f8859999e9b8a127b34f3452de4b0e04efd4
Summary: Clean up non-test usage of CoreContext::test_mock left over from T37478150
Reviewed By: farnz
Differential Revision: D16804838
fbshipit-source-id: f420b8186557a42e9b6c78437c0fb76c9a343b31
Summary:
This is needed for inter-repo commit sync. See https://fb.quip.com/O9iNAMSNTo1F
for more details.
Reviewed By: farnz
Differential Revision: D16804787
fbshipit-source-id: e0f7aace790f2c27b6f3449cbc6a34c2f31ef212
Summary:
This updates Mononoke to support LFS metadata when serving data over getpackv2.
However, in doing so, I've also refactored the various ways in which we currently access file data to serve it to clients or to process client uploads (when we need to compute deltas). The motivation is that we've had several issues recently where some protocols knew about some functionality and others didn't. Notably, redaction and LFS were supported in getfiles, but neither was supported in getpack or eden_get_data.
This patch refactors all those callsites away from blobrepo and routes them through repo_client/remotefilelog, which provides a common internal method to fetch a filenode and return its metadata and bytes (prepare_blob), plus separate protocol-specific implementations:
- getpackv1: metadata + file content (basically the existing fetch_raw_filenode_bytes function)
- getpackv2: metadata + file content + getpackv2 metadata
- getfiles: just file content, with file history tied into its response
- eden_get_data: uses getpackv1
Here are a few notable changes worth calling out as you review this:
- The getfiles method used to get its filenode from get_maybe_draft_filenode, but all it needed was the copy info. However, the updated method gets its filenode from the envelope (which also has this data). This should be equivalent.
- I haven't been able to remove fetch_raw_filenode_bytes yet, because there's a callsite that still uses it and it's not entirely clear to me whether that path is actually exercised and why. I'll look into it, but for now I left it unchanged.
- I've used the Mercurial implementation of getpack metadata here. This feels like the better approach so we can reuse some of the code, but historically I don't think we've depended on many Mercurial crates. Let me know if there's a reason not to do that.
Finally, there are a couple of things to be aware of as you review:
- I removed some more `Arc<BlobRepo>` in places where it made it more difficult to call the new remotefilelog methods.
- I updated the implementation to get copy metadata out of a file envelope to not require copying the metadata into a mercurial::file::File only to immediately discard it.
- I cleaned up an LFS integration test a little bit. There are few functional changes there, but it makes tests a little easier to work with.
Reviewed By: farnz
Differential Revision: D16784413
fbshipit-source-id: 5c045d001472fb338a009044ede1e22ccd34dc55
Summary:
Start moving mercurial related stuff to `mercurial` directory:
- rename `mercurial` to `mercurial_revlog` and moved to `/mercurial/revlog`
- move `mercurial_types` to `/mercurial/types`
- move `mercurial_bundles` to `/mercurial/bundles`
Reviewed By: farnz
Differential Revision: D16783728
fbshipit-source-id: 79cf1757bb7cc84a6273a4a3c486242b1ef4cd00
Summary: Use memcache leases to prevent simultaneous generation of derived data for the same changeset id.
Reviewed By: StanislavGlebik
Differential Revision: D16666659
fbshipit-source-id: 47e57449ab854e595a9dc860414c49afa966be01
Summary: Switching from manually computing content SHA1 to using the SHA1 from `ContentMetadata`
Reviewed By: krallin
Differential Revision: D16714362
fbshipit-source-id: 7578d91c26775da88db3c60cbb76a4ac86cd6917
Summary:
This is needed, so that we do not break the existing clients when they update
to commits with redacted content.
Reviewed By: farnz
Differential Revision: D16752748
fbshipit-source-id: fdd4e2931fd362d2bcb5eb590d56ac71b74a8b03
Summary: We need this to not break clients with `remotefilelog.fetchpacks=True`.
Reviewed By: krallin
Differential Revision: D16735900
fbshipit-source-id: a993610c2e95cfde95a8be1e31eaad825f95dc15
Summary:
I plan to use it for `getpackv1`/`getpackv2`/edenapi handling as well.
Note: there might be a better place to put this code; given that it will be used for edenapi, I am open to suggestions.
Reviewed By: krallin
Differential Revision: D16735901
fbshipit-source-id: 6a71d80bb9223cd18f080163dcb20146c91498bf
Summary:
If load limiting isn't enabled, then tracking load doesn't really make sense, since we don't have a category to record it into (we end up using an empty category instead, which doesn't feel right). This patch reworks load limiting so that it all happens within a `LoadLimiter` struct, which means we won't record load at all if load limiting is not enabled. This also removes a few string creations that were happening here and there to reproduce the same strings over and over.
The main reason I did this is that the first time you bump load, we create a connection to Memcache, and for some reason that seems to be pretty slow (this only seems to happen on the first connection, so it might be expected). This makes running Mononoke locally slower, since the first request you make to it takes ~2 seconds. I suspect it also makes tests slower (at least those tests that don't run in a network blackhole, where presumably establishing a connection fails immediately).
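A rough sketch of the shape (struct fields, method names, and the f64 load unit are all assumptions for illustration): callers hold an `Option<LoadLimiter>`, so when limiting is disabled nothing is recorded and the backing-store connection is never opened.

```rust
// Sketch only: names and types are assumed, not the real Mononoke API.
pub struct LoadLimiter {
    category: String, // always present: no more empty-category fallback
    load: f64,
}

impl LoadLimiter {
    pub fn new(category: impl Into<String>) -> Self {
        Self { category: category.into(), load: 0.0 }
    }

    pub fn bump_load(&mut self, delta: f64) {
        // In the real code, the first bump is what lazily opens the (slow)
        // memcache connection; disabled limiters never reach this point.
        self.load += delta;
    }

    pub fn load(&self) -> f64 {
        self.load
    }

    pub fn category(&self) -> &str {
        &self.category
    }
}

// With limiting disabled the Option is None and this is a no-op.
pub fn bump(limiter: &mut Option<LoadLimiter>, delta: f64) {
    if let Some(limiter) = limiter.as_mut() {
        limiter.bump_load(delta);
    }
}
```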
Reviewed By: StanislavGlebik
Differential Revision: D16762054
fbshipit-source-id: faeb07e058c63b597916b6379022435e21a13e85
Summary: Stop inserting to the repo_id column in preparation for its removal. Remove column from sqlite tests.
Reviewed By: HarveyHunt
Differential Revision: D16731404
fbshipit-source-id: cb7b7d796a705455bb239bbd158cf87071f7b17c
Summary:
This resolves the list of files at a given commit to a list of `ContentId`s
and unblacklists all of those `ContentId`s.
Note 1: This means that these particular versions of files will be
unblacklisted in all commits which reference them.
Note 2: There's no auditing of this admin command invocation; if the
need for one arises, we should add it externally.
Reviewed By: krallin
Differential Revision: D16643436
fbshipit-source-id: 3af488e3e4d748027605587ab1e08c0abd0da109
Summary:
This is done so that `remove` and `list` can be implemented in terms of the same
boilerplate with different closures.
NB: I don't particularly like how this refactoring looks, but I do think it
looks better than copy/pasting the code and I think it is of no major
importance, given that this is an admin command.
Reviewed By: krallin
Differential Revision: D16643437
fbshipit-source-id: 7f64b112eda1ce963dffed897bf2c9fa42234d79
Summary: In the following diffs, I will implement the actual logic.
Reviewed By: krallin
Differential Revision: D16639720
fbshipit-source-id: f42b91476035343a6f00b11e4921e26836b1a574
Summary: This adds support for running benchmarks using sqlblob. I've refactored the command a little bit so that destinations are subcommands now.
Reviewed By: HarveyHunt
Differential Revision: D16733238
fbshipit-source-id: 963365ed3492cddf1c7302f18fc44148d9c1ec16
Summary: Building on the previous diff, expose the newly added ability to fetch the manifest entries of all path components of the given set of paths as a new method on BlobRepo. This method will be called from the API server as a way to perform selective prefetching of manifest nodes for client commands that operate on only a subset of paths for a given commit.
Reviewed By: krallin
Differential Revision: D16704978
fbshipit-source-id: 6ab68ff7c0d0b76d3317d8e2bd6d9085f52f3477
Summary:
The `find_entries_in_manifest` method uses a data structure called
`QueryTree` to guide manifest traversal. Previously, this traversal would
only return the manifest entries corresponding to the final component of each
path traversed. This diff adds the optional ability to return the entry for
every path component traversed, regardless of whether that entry was explicitly
requested. This is useful for prefetching in Mercurial, where we'd like to
cache these entries locally on the client to avoid unnecessary server
roundtrips when accessing trees with common path prefixes.
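To illustrate the difference, here is a toy sketch (a flat `HashMap` from full path to a fake node id stands in for the real manifest, and the names are made up for this example):

```rust
use std::collections::HashMap;

// Sketch only: with include_intermediates, the entry for every prefix of
// the path is returned, not just the entry for the final component.
fn entries_along_path(
    manifest: &HashMap<String, u64>,
    path: &str,
    include_intermediates: bool,
) -> Vec<(String, u64)> {
    let mut out = Vec::new();
    let mut prefix = String::new();
    let components: Vec<&str> = path.split('/').collect();
    for (i, component) in components.iter().enumerate() {
        if !prefix.is_empty() {
            prefix.push('/');
        }
        prefix.push_str(component);
        let is_final = i == components.len() - 1;
        if include_intermediates || is_final {
            if let Some(&id) = manifest.get(&prefix) {
                out.push((prefix.clone(), id));
            }
        }
    }
    out
}
```

With the flag on, a query for `a/b/c` also yields the entries for `a` and `a/b`, which is exactly what the client wants to cache to skip future roundtrips.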
Reviewed By: krallin
Differential Revision: D16704979
fbshipit-source-id: 6436c1f8772cc4a8db74994435be8ba87352ff44
Summary: Bump a counter each time an API method is called.
Reviewed By: fanzeyi
Differential Revision: D16733219
fbshipit-source-id: 078ef56aad3b74d29013d3af284dbcbb817ed678
Summary: It is not used anywhere, and it is not obvious why it should be public.
Reviewed By: ahornby
Differential Revision: D16730490
fbshipit-source-id: 8f24b871a7c1f5cca00ea8418fe4e7e94a48e2f1
Summary:
While investigating API server memory usage with HarveyHunt (made possible by the fact that he enabled Strobelight there!), we eventually identified that the root cause is that Actix keeps around a pool of response buffers after it responds to requests.
The pool contains up to 128 buffers, and over time those will resize up to whatever is the size of the biggest content the API server served up in a single chunk. So, if you're periodically serving 500MB chunks, then eventually you'll have 128 * 500MB chunks in your pool. That's kinda sad for memory usage; Harvey has a post coming up explaining this more in-depth. Look for `SharedBytesPool` in the Actix code if you're curious to know more.
One way to improve things is to avoid letting those buffers get too big by serving our responses in smaller chunks. Since we now have a Filestore that chunks, we can do that, so that's what this patch does.
Now, this only works for files that are _actually_ chunked when the Filestore returns them. Other files will still arrive as one big chunk, so Actix will still pool a big buffer for them (though we _could_ fix this by re-chunking on the fly when we read things). However, the plan here is simply to go back and re-chunk the existing large blobs in our repos, and then we're done.
As an aside, note that I didn't touch the Thrift API here. So, that API still puts everything into a single buffer. That's not ideal, but it's not catastrophic since that buffer gets freed (unlike the Actix response buffers).
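The fix falls out of the buffer-growth math: capping the size of each served piece caps how big the pooled buffers can ever get (128 * chunk_size instead of 128 * 500MB). A trivial sketch of the re-chunking idea (the real code streams filestore chunks; it does not collect everything into a Vec):

```rust
// Sketch only: bounding each piece bounds the pooled response buffers.
fn rechunk(data: &[u8], chunk_size: usize) -> Vec<Vec<u8>> {
    assert!(chunk_size > 0);
    data.chunks(chunk_size).map(|chunk| chunk.to_vec()).collect()
}
```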
Reviewed By: fanzeyi
Differential Revision: D16709352
fbshipit-source-id: b01fba478118bd8f286e096cf52b8e7d7b7e8c42
Summary: Add support for fb303 and ODS stats to cmdlib. Use them in blobstore_healer to monitor healer status.
Reviewed By: krallin
Differential Revision: D16671526
fbshipit-source-id: 8f95fc097133f2fe697ea3cf70fd9d0fa3b3649c
Summary:
If the perfcounters hashmap is empty, we log "{}". Check if
the perf counters are empty before logging to prevent this.
Reviewed By: krallin
Differential Revision: D16712702
fbshipit-source-id: fecc6c10f5cd0acba2a3b3b96f8d6268660bac0d
Summary:
This command allows a file to be uploaded to the blobstore using
the filestore - which allows for chunking if required.
Additionally, port the other filestore commands to use logger
rather than println!().
Reviewed By: krallin
Differential Revision: D16712167
fbshipit-source-id: 0fad13c29035a550f8caee7a02d37071eb87cd73
Summary:
The ffi call we use for launching a thrift server is a blocking call: [`thrift_server_serve`](diffusion/FBS/browse/master/fbcode/common/rust/srserver/src/RustThriftServer.cpp;4dc16e1ee0878536fa1fede8913e104ab998a270$366). That means calling it directly from an async fn is bad because it would block the executor and prevent any other task from making progress. To work around that, callers need to know that they must only call this off of their task's thread, such as on a threadpool.
This diff moves the use of threadpool into the srserver crate, exposing `serve` instead as a simple async fn.
**Before:**
```
future::poll_fn(move |_| tokio_threadpool::blocking(|| thrift.serve())).await??;
```
**After:**
```
thrift.serve().await?;
```
We keep the old blocking behavior under the new name `serve_blocking()`. This is needed for the Actix use case such as in Mononoke's apiserver, where Actix takes care of juggling threads itself and we don't need to bring our own threadpool.
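A stdlib-only sketch of the underlying pattern, under the assumption that moving the blocking call off the caller's thread is the whole trick (the real code uses tokio_threadpool inside srserver and exposes a proper async fn):

```rust
use std::thread;

// Stand-in for thrift_server_serve: blocks the calling thread until the
// server shuts down.
fn serve_blocking() -> Result<(), String> {
    Ok(())
}

// Moving the blocking call onto a dedicated thread means the caller's
// (async) executor thread is never blocked; waiting for the result becomes
// a join instead of a stalled task.
fn serve() -> thread::JoinHandle<Result<(), String>> {
    thread::spawn(serve_blocking)
}
```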
Reviewed By: Imxset21
Differential Revision: D16684766
fbshipit-source-id: bca2502b4eba8875569ed6c277f5606a6e2cefa0
Summary: Now that the `BoxFuture`-based server traits originally located at crate::server::Trait have been removed in D16683500, we can rename the `async fn`-based server traits from crate::server_async::Trait to crate::server::Trait.
Reviewed By: Imxset21
Differential Revision: D16684619
fbshipit-source-id: f98220c656779f9f56472b2b94df454c073631ad
Summary: This updates blobimport to run filenode computations on their own Tokio task. This is particularly helpful when dealing with a repo with large files: recomputing filenodes can be a fairly expensive process if a large number of large files have moved. Running those on separate tasks allows for better utilization of multiple CPUs.
Reviewed By: HarveyHunt
Differential Revision: D16667040
fbshipit-source-id: 0746af4da876c0b4313b86aca32c5e261a316a8c
Summary: This updates blobimport to retry on failures to fetch through its LFS helper.
Reviewed By: HarveyHunt
Differential Revision: D16667042
fbshipit-source-id: dfaec8fc029030a7d90806018301d22525d094df
Summary: This updates blobimport to a newer Rust-2018-style. Rust 2018 is nicer.
Reviewed By: StanislavGlebik
Differential Revision: D16667043
fbshipit-source-id: a217cbef0337e7d7cf288f972e9b4af0340a8e0b
Summary:
This updates blobimport to avoid using a per-changeset limit for the number of blob uploads (which used to be 100), and instead to use a global blob upload limit, and a global LFS import limit.
This allows for finer-grained control over the resource utilization (notably in terms of Manifold quota) of blobimport.
While I was in there, I also eliminated an `Arc<BlobRepo>` that was lying around, since we don't use those anymore.
Reviewed By: HarveyHunt
Differential Revision: D16667044
fbshipit-source-id: 9fc2f347969c7ca9472ce8dd3d4e2f1acb175b66
Summary: Updates blobimport to check for existing LFS blobs. This is perfectly safe, since at worst it'll fail closed (i.e. blobs will be missing and blobimport will hang).
Reviewed By: HarveyHunt
Differential Revision: D16648560
fbshipit-source-id: ed9da7f2fa69c28451ac058a4e1adc937d111b31
Summary: This updates blobimport to allow specifying a different commit limit than the default of 100. This makes it a little more convenient when tuning limits.
Reviewed By: HarveyHunt
Differential Revision: D16648558
fbshipit-source-id: 2e8c8496487a79bec84ec964302b1d6e871caf2a
Summary:
Let's log a bit more often than once every 5,000 commits. Logging 10 times per
run seems reasonable.
Reviewed By: HarveyHunt
Differential Revision: D16621623
fbshipit-source-id: b5ae481f5d899a12c736bb2de74efc8b370002df
Summary:
If you give blobimport a bad path for the LFS helper, it just tells you the
file doesn't exist, which is not very useful if you don't know what file we're
talking about. This fixes that.
Reviewed By: HarveyHunt
Differential Revision: D16621624
fbshipit-source-id: ff7fb5c8800604f8799c682e5957c888d2b01647
Summary:
Sometimes, our Mononoke CLI tests report errors in mode/opt without actually
outputting any error messages. Those error
messages actually get buffered by our logger, and are normally emitted when
said logger is dropped.
However, I have a suspicion that by calling process::exit(), we're not giving
Rust a chance to drop the logger (because that eventually just calls the exit
syscall), and as a result our buffer isn't getting flushed.
This patch attempts to fix this by removing all calls to process::exit in the
admin CLI, and instead returning an exit code from main, which I hope will
allow main to drop everything before it returns.
More broadly speaking, I don't think it's very good practice to exit from all
over the place in the admin CLI, so this fixes that :)
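A small sketch of why returning from main matters here (an AtomicBool stands in for the logger's buffered records; names are made up for this example):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static FLUSHED: AtomicBool = AtomicBool::new(false);

// Stand-in for the buffering logger: flushing happens in Drop.
struct BufferedLogger;

impl Drop for BufferedLogger {
    fn drop(&mut self) {
        FLUSHED.store(true, Ordering::SeqCst);
    }
}

// Calling std::process::exit(1) inside this function would terminate the
// process before _logger's Drop ran, losing the buffered messages.
// Returning an exit code instead lets the logger unwind and flush; main
// can then exit with that code after everything has been dropped.
fn run() -> i32 {
    let _logger = BufferedLogger;
    // ... command logic; suppose it fails:
    1
}
```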
Reviewed By: HarveyHunt
Differential Revision: D16687747
fbshipit-source-id: 38e987463363e239d4f9166050a8dd26a4bef985
Summary:
Address nits from D16599270:
* use map instead of and_then
* use &[T] instead of &Vec<T>
Reviewed By: StanislavGlebik
Differential Revision: D16666838
fbshipit-source-id: ad0afa2d44d7a713409ac75bab599a35f5cf1a87
Summary:
fanzeyi pointed out that get_metadata calling a different get_metadata because
of `use metadata::*` was confusing. This cleans that up.
Reviewed By: fanzeyi
Differential Revision: D16669239
fbshipit-source-id: f1e56fa9691c885ed56475c911f9407454afc4bb
Summary:
This diff introduces `bounded_traversal_dag`, which can handle arbitrary DAGs and detect invalid DAGs with cycles, but it has limitations in comparison to `bounded_traversal`:
- `bounded_traversal_dag` keeps the `Out` result of computation for all the nodes,
but `bounded_traversal` only keeps results for nodes that have not been completely
evaluated
- `In` has the additional constraints of being `Eq + Hash + Clone`
- `Out` has the additional constraint of being `Clone`
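A rough sketch of why those constraints appear, using u32 node ids as a stand-in for `In` and u64 depths for `Out` (the real API is future-based and bounds its in-flight work; this recursion does neither):

```rust
use std::collections::HashMap;

// Sketch only: memoizing Out for every node is what requires
// In: Eq + Hash (map key) + Clone, and Out: Clone (one node's result is
// shared between all of its parents). An in-progress stack detects cycles,
// in which case the traversal yields None, mirroring how
// bounded_traversal_dag reports an invalid "DAG".
fn longest_path(
    node: u32,
    children: &HashMap<u32, Vec<u32>>,
    memo: &mut HashMap<u32, u64>, // Out kept for *all* nodes
    in_progress: &mut Vec<u32>,
) -> Option<u64> {
    if let Some(&depth) = memo.get(&node) {
        return Some(depth); // shared subgraph: evaluated only once
    }
    if in_progress.contains(&node) {
        return None; // cycle: not a valid DAG
    }
    in_progress.push(node);
    let mut deepest = 0u64;
    for child in children.get(&node).cloned().unwrap_or_default() {
        deepest = deepest.max(longest_path(child, children, memo, in_progress)?);
    }
    in_progress.pop();
    memo.insert(node, deepest + 1);
    Some(deepest + 1)
}
```

On a diamond (two parents sharing one child), the child is evaluated once and its memoized `Out` is cloned for the second parent, which is exactly the behavior that costs the extra memory relative to `bounded_traversal`.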
Reviewed By: krallin
Differential Revision: D16621004
fbshipit-source-id: b9f60e461d5d50e060be4f5bb6b970f16a9b99f9
Summary: This adds debug subcommands `metadata` and `verify` for the Filestore. Those respectively output the metadata for a file, and verify that the file is reachable through all of its aliases.
Reviewed By: ahornby
Differential Revision: D16621789
fbshipit-source-id: 4a2156bfffb9d9641ce58f6d5f691364ba9dc145