Summary: This is actually missing from `HgQueuedBackingStore`. This diff fixes that and assigns low priority to these prefetch requests.
Reviewed By: chadaustin
Differential Revision: D20655681
fbshipit-source-id: f3c92b358e16e980390ac7adcae27d41ae5a7277
Summary: This diff bumps the open call from FUSE to High priority (which is higher than any other blob open request at the moment). This has improved the user experience of EdenFS when it is importing many other things from other channels (Thrift, etc.).
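As a rough illustration of the priority ordering described here, the sketch below queues import requests and serves the highest priority first, keeping arrival order within a priority level. `ImportPriority`, `ImportRequest`, `LowerUrgency`, and `RequestQueue` are hypothetical names for this example and are not taken from the EdenFS sources.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical priority levels; the real EdenFS type may differ.
enum class ImportPriority { Low, Normal, High };

struct ImportRequest {
  std::string blobId;
  ImportPriority priority;
  std::uint64_t sequence; // arrival order, used to keep ties FIFO
};

// Heap ordering: the highest priority, then the earliest arrival, ends up on top.
struct LowerUrgency {
  bool operator()(const ImportRequest& a, const ImportRequest& b) const {
    if (a.priority != b.priority) {
      return a.priority < b.priority;
    }
    return a.sequence > b.sequence;
  }
};

class RequestQueue {
 public:
  void enqueue(std::string blobId, ImportPriority priority) {
    queue_.push(ImportRequest{std::move(blobId), priority, nextSequence_++});
  }

  bool empty() const {
    return queue_.empty();
  }

  // Removes and returns the most urgent request. The queue must not be empty.
  ImportRequest dequeue() {
    assert(!queue_.empty());
    ImportRequest request = queue_.top();
    queue_.pop();
    return request;
  }

 private:
  std::priority_queue<ImportRequest, std::vector<ImportRequest>, LowerUrgency>
      queue_;
  std::uint64_t nextSequence_ = 0;
};
```

With this ordering, a FUSE open enqueued at `High` is dequeued before pending `Normal` Thrift imports, and `Low` prefetch requests are served last.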
Reviewed By: chadaustin
Differential Revision: D20287389
fbshipit-source-id: 319bc44ef8be5c904d7cf0db7cc2f8be28b4760a
Summary:
Add a fetch context interface to ObjectStore that allows tracing cache
hits, backing store fetches, and fetch durations in the context of a
diff or checkout operation.
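A minimal sketch of what such a context could look like, using only the standard library. The names `Origin`, `FetchContext`, `CountingFetchContext`, and `TracingStore` are assumptions made for this example; the actual interface added by the diff may differ.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>

// Where a fetch was satisfied (hypothetical enumeration).
enum class Origin : std::size_t { LocalCache = 0, BackingStore = 1 };

// Callers pass a context into each fetch so results can be attributed to
// the operation (for example a diff or a checkout) that triggered them.
class FetchContext {
 public:
  virtual ~FetchContext() = default;
  virtual void didFetch(Origin origin, std::chrono::nanoseconds elapsed) = 0;
};

class CountingFetchContext final : public FetchContext {
 public:
  void didFetch(Origin origin, std::chrono::nanoseconds elapsed) override {
    const auto index = static_cast<std::size_t>(origin);
    assert(index < counts_.size());
    ++counts_[index];
    total_ += elapsed;
  }

  std::size_t count(Origin origin) const {
    return counts_[static_cast<std::size_t>(origin)];
  }

  std::chrono::nanoseconds totalDuration() const {
    return total_;
  }

 private:
  std::array<std::size_t, 2> counts_{};
  std::chrono::nanoseconds total_{0};
};

// Toy store: the first request for an ID counts as a backing store fetch,
// later requests for the same ID count as cache hits.
class TracingStore {
 public:
  std::string getBlob(const std::string& id, FetchContext& context) {
    const auto start = std::chrono::steady_clock::now();
    Origin origin = Origin::LocalCache;
    auto it = cache_.find(id);
    if (it == cache_.end()) {
      origin = Origin::BackingStore;
      it = cache_.emplace(id, "contents of " + id).first;
    }
    context.didFetch(
        origin,
        std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start));
    return it->second;
  }

 private:
  std::unordered_map<std::string, std::string> cache_;
};
```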
Reviewed By: simpkins
Differential Revision: D19135625
fbshipit-source-id: d0d8f134b1c89f7ba4971a404a46a69a1704ba5c
Summary:
Remove some half-baked, unnecessary logic for caching sizes separately
from SHA-1. Eden's backing stores do not support chunking large files
yet, so there's no value in caching content SHA-1 and size
separately. This fixes a scenario where fetching blob size and then
SHA-1 would result in two backing store imports.
Reviewed By: fanzeyi
Differential Revision: D19169096
fbshipit-source-id: dc32f3313e5f4230c06a5bbaa67da7bf0febaba8
Summary:
Traditionally, ObjectStore relied on HgBackingStore writing directly
to LocalStore in order to cache trees. This had the unfortunate side
effect that other backing store implementations did not benefit from
tree caching.
Move tree caching into ObjectStore so all backing stores benefit from
tree caching.
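The idea can be sketched as a cache that sits in front of any backing store implementation. This is a simplified, synchronous illustration: `FakeBackingStore` and `CachingObjectStore` are hypothetical, trees are modeled as plain strings, and an in-memory map stands in for LocalStore.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Simplified backing store interface; the real one is asynchronous.
class BackingStore {
 public:
  virtual ~BackingStore() = default;
  virtual std::string getTree(const std::string& id) = 0;
};

// Stand-in backing store that counts how often it is asked for a tree.
class FakeBackingStore final : public BackingStore {
 public:
  std::string getTree(const std::string& id) override {
    ++fetchCount;
    return "tree:" + id;
  }

  int fetchCount = 0;
};

// Caching happens here, so every BackingStore implementation benefits
// without writing to the cache itself.
class CachingObjectStore {
 public:
  explicit CachingObjectStore(BackingStore& backing) : backing_(backing) {}

  const std::string& getTree(const std::string& id) {
    auto it = treeCache_.find(id);
    if (it == treeCache_.end()) {
      it = treeCache_.emplace(id, backing_.getTree(id)).first;
    }
    return it->second;
  }

 private:
  BackingStore& backing_;
  std::unordered_map<std::string, std::string> treeCache_;
};
```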
Reviewed By: simpkins, fanzeyi
Differential Revision: D19168211
fbshipit-source-id: b1019591ebb4760cc8b933b9adb82174b8e5fa1f
Summary:
This diff changes the return type of `BackingStore::getBlob` from `folly::Future` to `folly::SemiFuture` to prevent executor leaks.
This also removes the need for backing stores to hold `serverThreadPool`.
----
**Changes**
* `ObjectStore` now needs to hold a `folly::Executor::KeepAlive` that is used to turn `SemiFuture`s it gets from backing stores into `Future`.
* Signature changes of the implementations of `BackingStore` class.
* For tests, I chose to use `QueuedImmediateExecutor` in place of `UnboundedQueueExecutor`, as it basically executes tasks inline. I'm concerned that introducing a thread pool executor in tests may make them flaky.
Reviewed By: wez
Differential Revision: D18669664
fbshipit-source-id: 0cae89f365dcf8b345b49d64469a530cf25d4ac5
Summary: Adds a function which takes both the manifestID and the commitID to get a Tree. This will be used in `checkOutRevision()` and this allows us to skip looking up the manifestID since the caller can just pass it in themselves.
Reviewed By: wez
Differential Revision: D18719405
fbshipit-source-id: 919f0a7c84bff4a2f0bc20110c45bd272f9e9107
Summary: Tracing was not an accurate name for what this directory had become. So rename it to telemetry.
Reviewed By: wez
Differential Revision: D17923303
fbshipit-source-id: fca07e8447d9b9b3ea5d860809a2d377e3c4f9f2
Summary: Uses the existing RequestData class to make calls to static functions to set and get the `didImportFromBackingStore` flag.
Reviewed By: simpkins
Differential Revision: D16461868
fbshipit-source-id: e3ed39249f5dd1a842ad06a204b5933014b12f7f
Summary: Now that Eden depends on open source fb303, EDEN_HAVE_STATS is unnecessary. Remove it.
Reviewed By: simpkins, strager
Differential Revision: D15526905
fbshipit-source-id: 2354f1b92545a089de0e91e7c33515fa0b74b067
Summary: Added the cli command `eden stats object-store` for querying the counts on what part of the object store was responsible for finding the blob or blob size (local store or backing store). This will tell us how well local and in-memory caching works for different workflows.
Reviewed By: chadaustin
Differential Revision: D15934535
fbshipit-source-id: 70345f11a51c3c6996dc001d4101744395a3d182
Summary:
Implements size-only local storage, as opposed to storing metadata. This is useful when the blob's SHA-1 is not needed. This diff prevents SHA-1 computations, which can be especially expensive for large blobs.
From D15934535, operations such as `ls -l` and `stat` will get the size of a blob in two ways:
1) The blob's size is already stored locally, so it will be deserialized and returned.
2) The blob is fetched from the backing store, stored, and its size is returned.
This diff optimizes the second case, because SHA-1 is no longer computed.
Reviewed By: strager
Differential Revision: D15723239
fbshipit-source-id: a868f3bf6b68a83ddafb057dc3e4e65f0a2dd989
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary: `getSize` and `getSha1` were misleading function names, since the functions refer to the size and hash of a given blob and not the object store itself. These functions have been renamed to `getBlobSize` and `getBlobSha1`.
Reviewed By: chadaustin
Differential Revision: D15696510
fbshipit-source-id: 4dd31659f60969fa90d8e2b39f43c46a2b7dff7c
Summary: I decoupled the getSize() function from the getMetadata() function, using a naive implementation for now. This was necessary because I want to add support for fetching only the size of a blob during a request like `ls -l`. Right now, the size and hash of a blob are coupled in a Metadata object, so if the size is requested, the whole file must be downloaded to calculate the hash, which is expensive for large files.
Reviewed By: chadaustin, strager
Differential Revision: D15678216
fbshipit-source-id: 8f68692768faaae0e65373ffe608d09ae49bbc42
Summary:
Because Mercurial blob IDs change without the contents changing, and
because files get unloaded upon checkout, rebasing across a large
distance in history can result in status fetching a lot of
metadata. Keep a smallish LRU cache for SHA-1 and size by blob ID.
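For illustration, a small LRU cache keyed by blob ID could look like the sketch below. `BlobMetadata` here is a simplified stand-in (hex SHA-1 string plus size) and `MetadataLru` is a hypothetical name; the cache actually used by the diff is not shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified metadata record for this example.
struct BlobMetadata {
  std::string sha1Hex;
  std::uint64_t size;
};

class MetadataLru {
 public:
  explicit MetadataLru(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  void put(const std::string& blobId, BlobMetadata metadata) {
    auto it = index_.find(blobId);
    if (it != index_.end()) {
      // Existing entry: update it and mark it most recently used.
      it->second->second = std::move(metadata);
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (index_.size() == capacity_) {
      // Evict the least recently used entry, which sits at the back.
      index_.erase(order_.back().first);
      order_.pop_back();
    }
    order_.emplace_front(blobId, std::move(metadata));
    index_[blobId] = order_.begin();
  }

  std::optional<BlobMetadata> get(const std::string& blobId) {
    auto it = index_.find(blobId);
    if (it == index_.end()) {
      return std::nullopt;
    }
    // A hit also marks the entry most recently used.
    order_.splice(order_.begin(), order_, it->second);
    return it->second->second;
  }

 private:
  using Entry = std::pair<std::string, BlobMetadata>;

  std::size_t capacity_;
  std::list<Entry> order_; // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

Because the cache is keyed by blob ID rather than by content, two Mercurial blob IDs with identical contents still occupy two entries; the cache only avoids refetching metadata for IDs seen recently.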
Reviewed By: strager
Differential Revision: D10419965
fbshipit-source-id: 81499573814775471913db05f924767c3bab300e
Summary:
Now that the import bug has been fixed for some time, it's likely few
people have cached empty files. And if they do, `eden gc; eden
restart` is a fine workaround.
On the other hand, reimporting empty files gums up the importer, and
I've seen several people recently complaining about performance that
could be partially attributed to the fact that their Eden needed to
verify empty files. simpkins has talked about adding a bit to the
cached blob to determine whether it needs reimporting or not, but it's
probably going to take a while for anyone to implement that, and this
reverification logic is hurting people today.
Reviewed By: strager
Differential Revision: D10456519
fbshipit-source-id: 657bc377ee16ce93494075bde4388aed59dceecf
Summary:
Instead of calling getBlobMetadata in multiple places and only using
the .sha1 field, add a getSha1 function directly to ObjectStore. This
gives ObjectStore the latitude to fetch it and store it in different ways.
Reviewed By: wez
Differential Revision: D10227935
fbshipit-source-id: 180830534db3c42c07f04216599e496406af5ced
Summary: We've diverged in a few places from clang-format, so run it across the entirety of Eden.
Reviewed By: wez
Differential Revision: D10137785
fbshipit-source-id: 9603c2eeddc7472c33041ae60e3e280065095eb7
Summary:
Part of the larger project to modify Future<T>::then to be r-value qualified and use Future<T>::thenTry or Future<T>::thenValue.
The goal is to disambiguate folly::Future and to improve type and lifetime safety of Future and its methods.
Codemod:
future<T>.then(callable with operator()(not-a-try)) to future<T>.thenValue(callable with operator()(not-a-try)).
future<T>.then(callable with operator()()) to future<T>.thenValue(callable with operator()(auto&&)).
future<T>.then(callable with operator()(auto)) to future<T>.thenValue(callable with operator()(auto)).
future<T>.then(callable with operator()(folly::Try<T>)) to future<T>.thenTry(callable)
Reviewed By: Orvid
Differential Revision: D9819578
fbshipit-source-id: f9e31f47354c041ecbf0a90953cbe50ebfda6adc
Summary:
Part of the larger project to modify Future<T>::then to be r-value qualified and use Future<T>::thenTry or Future<T>::thenValue.
The goal is to disambiguate folly::Future and to improve type and lifetime safety of Future and its methods.
Codemod:
future<T>.then(callable with operator()(not-a-try)) to future<T>.thenValue(callable with operator()(not-a-try)).
future<T>.then(callable with operator()()) to future<T>.thenValue(callable with operator()(auto&&)).
future<T>.then(callable with operator()(auto)) to future<T>.thenValue(callable with operator()(auto)).
Reviewed By: Orvid
Differential Revision: D9696716
fbshipit-source-id: d71433c75af8422b2f16733c0b18a417d5a4cf2e
Summary:
When we load an empty blob from the LocalStore double check with the
BackingStore to confirm that it should actually be empty.
We have seen multiple instances of files being incorrectly imported as empty.
So far this error has always been fixed by a re-import. We still haven't
tracked down the root cause, but this change should help work around the issue
by ensuring that we double check the file contents before returning the data.
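The check can be sketched roughly as follows. This is a synchronous simplification: `VerifyingStore`, `putLocal`, and the `Importer` callback are hypothetical names, and an in-memory map stands in for the LocalStore.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

class VerifyingStore {
 public:
  // Stand-in for a BackingStore import; returns the blob contents.
  using Importer = std::function<std::string(const std::string&)>;

  explicit VerifyingStore(Importer importer)
      : importer_(std::move(importer)) {}

  // Seeds the local cache directly, e.g. to simulate a bad earlier import.
  void putLocal(const std::string& id, std::string contents) {
    local_[id] = std::move(contents);
  }

  std::string getBlob(const std::string& id) {
    auto it = local_.find(id);
    if (it != local_.end() && !it->second.empty()) {
      return it->second; // non-empty cached blobs are trusted as-is
    }
    // Cache miss, or a cached empty blob that may be a bad import:
    // ask the backing store again and overwrite the cached copy.
    std::string fresh = importer_(id);
    local_[id] = fresh;
    return fresh;
  }

 private:
  Importer importer_;
  std::unordered_map<std::string, std::string> local_;
};
```

Note that in this sketch a file that is genuinely empty is re-imported on every access, since the cached empty value can never be distinguished from a bad import.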
Reviewed By: chadaustin
Differential Revision: D9476522
fbshipit-source-id: 6d57cf15c42736ecbcb106a731259b77db06d8f1
Summary:
Have getBlobMetadata always return a Future. It's a little unfortunate
that this will always allocate, but it sounds like we might decide to
put all RocksDB access on a background thread to increase CPU
parallelism.
Reviewed By: bolinfest
Differential Revision: D8101464
fbshipit-source-id: 6e9ec95050c366c7c57519e3f68b311470b2addd
Summary:
Remove getTreeFuture and have getTree always return a Future. It's a
little unfortunate that this will always allocate, but it sounds like
we might decide to put all RocksDB access on a background thread to
increase CPU parallelism.
Reviewed By: bolinfest
Differential Revision: D8101430
fbshipit-source-id: e12b7ab07b3468114a58753768655c107265b8af
Summary:
Remove getBlobFuture and have getBlob always return a Future. It's a
little unfortunate that this will always allocate, but it sounds like
we might decide to put all RocksDB access on a background thread to
increase CPU parallelism.
Reviewed By: bolinfest
Differential Revision: D8101402
fbshipit-source-id: d6cbbd7fe4fe55bad661c9158297db2f03f7d352
Summary:
This removes the main point of contention for eden prefetch
in two ways:
1. We batch up the complete list of blobs so that they can be processed
in bulk rather than stalling the tree walk
2. We can ask remotefilelog to check and fetch that list to the local
hgcache, again as a batch, rather than by forcing the data to be
loaded through into the local store
The goal of this prefetch is to bulk load data from the mercurial server
so that a subsequent file access doesn't have to make a one-off ssh session
for each one, rather than making sure that all the data is loaded into
the local store.
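The batching approach can be illustrated with the sketch below: walk the tree once to collect every blob ID, then hand the whole list to the fetcher in a single call. `TreeNode`, `FakeBatchFetcher`, and `prefetchTree` are hypothetical and only model the shape of the change; they do not reflect the remotefilelog API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified tree: each node lists its file blobs and its subdirectories.
struct TreeNode {
  std::vector<std::string> blobIds;
  std::vector<TreeNode> children;
};

// Walks the tree without fetching anything, so the walk never stalls
// on an individual blob.
void collectBlobIds(const TreeNode& node, std::vector<std::string>& out) {
  out.insert(out.end(), node.blobIds.begin(), node.blobIds.end());
  for (const auto& child : node.children) {
    collectBlobIds(child, out);
  }
}

// Stand-in for a fetcher that accepts a whole batch in one request.
struct FakeBatchFetcher {
  std::size_t requests = 0;
  std::size_t blobsFetched = 0;

  void prefetchBlobs(const std::vector<std::string>& ids) {
    ++requests;
    blobsFetched += ids.size();
  }
};

// Collects the complete list first, then issues one bulk request.
std::size_t prefetchTree(const TreeNode& root, FakeBatchFetcher& fetcher) {
  std::vector<std::string> ids;
  collectBlobIds(root, ids);
  if (!ids.empty()) {
    fetcher.prefetchBlobs(ids);
  }
  return ids.size();
}
```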
Reviewed By: chadaustin
Differential Revision: D7965818
fbshipit-source-id: 753400460d633b5467c5110e3f5608ce06106e00
Summary:
This allows using multiple cores when supported by
the BackingStore, and improves the throughput of prefetches.
Reviewed By: chadaustin
Differential Revision: D7888343
fbshipit-source-id: 1747f4ec4edf9ace02d54a4fb0ea3e8f509f51e5
Summary:
Promote the folly logging code out of the experimental subdirectory.
We have been using this for several months in a few projects and are pretty
happy with it so far.
After moving it out of the experimental/ subdirectory I plan to update
folly::Init() to automatically support configuring it via a `--logging` command
line flag (similar to the initialization it already does today for glog).
Reviewed By: yfeldblum, chadaustin
Differential Revision: D7755455
fbshipit-source-id: 052db34c97f7516728f7cbb1a5ad959def2f6efb
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.
This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`
Reviewed By: bolinfest
Differential Revision: D6232695
fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
Summary:
Originally I thought this would help move towards removing a
`future.get()` call from FileInode, but it turned out to not make a difference
to that code.
It does make it a bit less of a chore to deal with the Journal related diff
callbacks added in D5896494 though, and is a move towards a future where we
could potentially return cached and shared instances of these objects.
This diff is a mechanical change to alter the return type so that we can share
instances returned from the object store interface. It doesn't change any
functionality.
Reviewed By: simpkins
Differential Revision: D5919268
fbshipit-source-id: efe4b3af74e80cf1df20e81b4386450b72fa2c94
Summary:
Update eden to log via the new folly logging APIs rather than with glog.
This adds a new --logging flag that takes a logging configuration string.
By default we set the log level to INFO for all eden logs, and WARNING for
everything else. (I suspect we may eventually want to run with some
high-priority debug logs enabled for some or all of eden, but this seems like a
reasonable default to start with.)
Reviewed By: wez
Differential Revision: D5290783
fbshipit-source-id: 14183489c48c96613e2aca0f513bfa82fd9798c7
Summary:
Now that the non-future versions of these APIs have been removed, rename
getBlobFuture() to getBlob(), and getTreeFuture() to getTree()
Reviewed By: wez
Differential Revision: D5295690
fbshipit-source-id: 30dcb88854b23160692b9cd83a632f863e07b491
Summary:
Remove the blocking getBlob() API.
There were a few call sites in FileData still using this blocking API. For now
I simply updated them to use getBlobFuture() and make a blocking get() call on
the returned future. These call sites already had TODO comments documenting
the blocking behavior.
I plan to rename getBlobFuture() to getBlob() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295726
fbshipit-source-id: 748fd7a140b9b59da339a330071f732bba38cb35
Summary:
Remove the blocking getTree() API. All call sites are using getTreeFuture()
instead now.
I plan to rename getTreeFuture() to getTree() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295725
fbshipit-source-id: 6b40b4c808da94a9c68decae3ce38c7d13fbe9f5
Summary:
Make sure the Eden code compiles cleanly with -Wshadow-compatible-local.
Pretty much all of the warnings were issues with lambdas shadowing names from
their parent context (even though they didn't ask to capture those names from
the parent).
Reviewed By: wez
Differential Revision: D4644849
fbshipit-source-id: 66629cd98b5af4760f3fbb256e44c0bc47e52316
Summary:
Update copyright statements to "2016-present". This makes our updated lint
rules happy and complies with the recommended license header statement.
Reviewed By: wez, bolinfest
Differential Revision: D4433594
fbshipit-source-id: e9ecb1c1fc66e4ec49c1f046c6a98d425b13bc27
Summary:
Update the ObjectStore and BackingStore classes to have APIs that return
folly::Future objects, rather than blocking until the requested data is loaded.
For now most users still call the blocking versions of getBlob() and getTree().
Furthermore, all of the Future-based implementations actually still block
until the data is ready. I will update the code to use these new APIs in
future diffs, and then deprecate the non-future based versions.
Reviewed By: bolinfest
Differential Revision: D4318055
fbshipit-source-id: a250c23b418e69b597a4c6a95dbe80c56da5c53b
Summary:
Hash objects are small enough (20 bytes) that it isn't worth allocating them on
the heap. This updates LocalStore::getSha1ForBlob() to return a
folly::Optional<Hash>, and ObjectStore::getSha1ForBlob() to return a plain
Hash.
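A sketch of the by-value return style, using `std::optional` and `std::array` as stand-ins for `folly::Optional` and Eden's `Hash` class. The classes below are simplified for illustration; in particular, throwing on a missing entry is an assumption of this example and not necessarily what ObjectStore does.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// A SHA-1 digest is 20 bytes, cheap enough to copy and return by value.
using Hash = std::array<std::uint8_t, 20>;
static_assert(sizeof(Hash) == 20, "Hash should be exactly 20 bytes");

class LocalStore {
 public:
  void putSha1(const std::string& blobId, const Hash& sha1) {
    sha1s_[blobId] = sha1;
  }

  // Absence is expressed with an empty optional, not a null heap pointer.
  std::optional<Hash> getSha1ForBlob(const std::string& blobId) const {
    auto it = sha1s_.find(blobId);
    if (it == sha1s_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

 private:
  std::unordered_map<std::string, Hash> sha1s_;
};

class ObjectStore {
 public:
  explicit ObjectStore(const LocalStore& local) : local_(local) {}

  // Always yields a Hash; this sketch throws when none is available.
  Hash getSha1ForBlob(const std::string& blobId) const {
    if (auto sha1 = local_.getSha1ForBlob(blobId)) {
      return *sha1;
    }
    throw std::out_of_range("no SHA-1 for blob " + blobId);
  }

 private:
  const LocalStore& local_;
};
```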
Reviewed By: bolinfest
Differential Revision: D4298162
fbshipit-source-id: 9cf54f2997ba8c3b2346db315a2aca41e580b078
Summary:
In addition to storing the SHA-1 of each file's contents, also store the size.
This will allow us to more quickly look up the file size, without having to
retrieve the file contents.
I haven't yet added an API to ObjectStore to retrieve the full BlobMetadata
object; I will do that in a subsequent diff. One benefit for now is that this
does avoid double-computing the SHA-1 in ObjectStore::getSha1ForBlob() if we
had to load the blob.
Reviewed By: bolinfest
Differential Revision: D4298157
fbshipit-source-id: 4d83ebfa631c93fcef06ca1cd0ba0e1a70a2476d
Summary:
Add some verbose logging about when trees and blobs are loaded in the object
store.
Reviewed By: bolinfest
Differential Revision: D3434182
fbshipit-source-id: 3e8d2617290604f119e6164d15d63324a4c9a2aa
Summary:
Update the HgImporter class to support retrieving file contents from mercurial.
This also includes simple code for storing the data in the LocalStore using
git's blob serialization format. In the future I think it would perhaps be
better to drop the "blob<length>" prefix, and instead just use a RocksDB column
family to separate blob data from other types of data. However, for now using
the git format is simplest for keeping compatibility with the getBlob() code.
Reviewed By: bolinfest
Differential Revision: D3416691
fbshipit-source-id: 268787533be2172b2dbedc3bf06464eabf3d2c5e
Summary:
This adds an HgBackingStore implementation which can load tree data from a
mercurial repository. Blob loading is not implemented yet, but will come in a
separate diff.
This also adds a minimal GitBackingStore class. The GitBackingStore has nearly
no functionality, but is needed to keep the existing git functionality working.
Reviewed By: bolinfest
Differential Revision: D3409743
fbshipit-source-id: dbebf53e9de08bd1469e489baa48b84cbf889511