Summary:
The ProcessNameCache code is compiled on Windows now, this definiton could
cause issues with different cpp files compiling different version of the
ProcessNameCache. To avoid this, let's remove it from Stub.h, this also removes
a bunch of #ifdef.
Reviewed By: chadaustin
Differential Revision: D23693490
fbshipit-source-id: 8f3f7b1128235b9a60f850e688b9e98910c202fc
Summary:
We are unifying C++ APIs for accessing optional and unqualified fields:
https://fb.workplace.com/groups/1730279463893632/permalink/2541675446087359/.
This diff migrates code from accessing data members generated from unqualified
Thrift fields directly to the `field_ref` API, i.e. replacing
```
thrift_obj.field
```
with
```
*thrift_obj.field_ref()
```
The `_ref` suffixes will be removed in the future once data members are private
and names can be reclaimed.
The output of this codemod has been reviewed in D20039637.
The new API is documented in
https://our.intern.facebook.com/intern/wiki/Thrift/FieldAccess/.
drop-conflicts
Reviewed By: yfeldblum
Differential Revision: D22631599
fbshipit-source-id: 9bfcaeb636f34a32fd871c7cd6a2db4a7ace30bf
Summary:
Prefetching metadata for the entries in a tree when we fetch it saves us
an extra round trip to the server to fetch a blob when only the metadata
for that blob is fetched. (This can happen often while parsing targets in
builds)
We will to prefetch the metadata for each of the entries in a tree when
we fetch the tree and store the metadata for each entry under that
entries id (to make looking up the entry metadata by its id quick)
However, we also don't want to unnecessarily fetch data from
the server if we already done so.
To accomplish this we will also store the metadata for each entry under the tree
id in the local store. This will: 1) allow us to check if we have already fetched
the metadata from the server when we are fetching a tree (we only have the
tree id easily available here to storing the metadata under the tree id makes
it much easier/less expensive to do this check). 2) allow us to refil the
metadata for each entry stored under that entries blob id if it has been cleared
from the local store (this may happen is the local store is gets large and gets
partially cleaned to reclaim space).
This implements the method to store tree metadata for all entries under the tree
id and under the blob id for each entry.
Reviewed By: chadaustin
Differential Revision: D22239173
fbshipit-source-id: d4e0ffd642ce0b4034188cfc4eeaf2ea05f54e77
Summary:
This introduces a class to manipulate the metadata for all the entries in a
tree. This adds serialization and deserialization to this class so that it can
be written to the local store.
Why do we need this? We need some way to easily check when we have already
fetched metadata for a tree and do not need to refetch this from the server to
avoid expensive network requests. Later diffs add functionally to store the metadata
for tree entries in the local store under the tree hash using this class.
Reviewed By: chadaustin
Differential Revision: D21959015
fbshipit-source-id: 0c0e8750737f3076c1f9604d0319cab7f2658656
Summary: This diff made fetch threshold configurable, so we can change it later as repository size grows.
Reviewed By: fanzeyi
Differential Revision: D22337850
fbshipit-source-id: 4b46420cb4e7164a3f1080279d67fa5f90549cd8
Summary: This diff updated `ObjectStore` to send a `FetchHeavy` event to Scuba when the number of fetching requests of a process has reached 2000.
Reviewed By: fanzeyi
Differential Revision: D22292104
fbshipit-source-id: b7ac48412868216ea960c8681a5fb71c587952bc
Summary:
While this isn't the right fix, this is what shipped in our packages, for the
sake of being able to reproduce the package, let's land this as it is. A
future change will remove this ifdef.
Below is pkaush original description:
In Eden Windows we treat all the files as regular files and don't have a
concept of symlinks and executable files. Fixing the TreeEntryType::getType()
to return REGULAR_FILE for executable file and symlink.
Reviewed By: wez
Differential Revision: D20481051
fbshipit-source-id: 0b0c4d7aea28134383ef45aeafc02930b420286b
Summary:
These 3 tests compile without issues on Windows. The RocksDB one is weird,
while it compiles with no hickups, I simply cannot run the resulting test
binary, and I'm not sure how to debug this. The local store one fails in folly.
Reviewed By: chadaustin
Differential Revision: D21393724
fbshipit-source-id: db90bf20a9d116bc8aa493703997c5e8da76eb1f
Summary:
This is a rough pass that resolves a linker issue on MSVC by
switching to inline static member functions.
Reviewed By: chadaustin
Differential Revision: D20529163
fbshipit-source-id: 578ed440758c685091d3e039e261638e027db17a
Summary:
This diff introduces a `Priority` type for EdenFS. This type is used to pass along the priority of a request.
The priority class itself contains two parts, `kind` and `offset`. `kind` uses the first 4-bytes and the reset 12-bytes are used to store offset. The idea is that we can roughly assign a priority kind to most of the requests and offset is used to dynamically tweak the priority of some particular requests. For example, when we saw a process is generate millions of requests we can use this to express "normal priority but less important than other process's normal priority".
Reviewed By: chadaustin
Differential Revision: D20287652
fbshipit-source-id: 9a849fb6cc6ba5e443fea978d5b4dc3ab8ca906a
Summary:
Add a fetch context interface to ObjectStore that allows tracing cache
hits, backing store fetches, and fetch durations in the context of a
diff or checkout operation.
Reviewed By: simpkins
Differential Revision: D19135625
fbshipit-source-id: d0d8f134b1c89f7ba4971a404a46a69a1704ba5c
Summary: Pass a GitIgnoreStack* and isIgnored flag through the diff operation. It is used later in the stack when we go down this code path in the TreeInode diffing code. When context->loadFileContentsFromPath() is null, the gitignore loading code will be ignored, used when we do not want to honor a gitignore file. For example, during `getScmStatusBetweenRevisions()`.
Reviewed By: chadaustin
Differential Revision: D18647089
fbshipit-source-id: 20d2abd2ef61669465e134165da5a0ac5e987cca
Summary: Simplify the definition and use of KeySpace and move it into its own header.
Reviewed By: simpkins
Differential Revision: D19353441
fbshipit-source-id: ef07677d927a48839b709711388abeb3c1ed9679
Summary:
Remove some half-baked, unnecessary logic for caching sizes separately
from SHA-1. Eden's backing stores do not support chunking large files
yet, so there's no value in caching content SHA-1 and size
separately. This fixes a scenario where fetching blob size and then
SHA-1 would result in two backing store imports.
Reviewed By: fanzeyi
Differential Revision: D19169096
fbshipit-source-id: dc32f3313e5f4230c06a5bbaa67da7bf0febaba8
Summary:
This diff turns the return type of `BackingStore::getBlob` from `folly::Future` into `folly::SemiFuture` to prevent executor leaks.
This also enable us to remove the need of holding `serverThreadPool` from backing stores.
----
**Changes**
* `ObjectStore` now needs to hold a `folly::Executor::KeepAlive` that is used to turn `SemiFuture`s it gets from backing stores into `Future`.
* Signature changes of the implementations of `BackingStore` class.
* For tests, I chose to use `QueuedImmediateExecutor` in place of `UnboundedQueueExecutor` as it will basically execute tasks inline. I'm concerned introducing thread pool executor in tests may turn tests flaky.
Reviewed By: wez
Differential Revision: D18669664
fbshipit-source-id: 0cae89f365dcf8b345b49d64469a530cf25d4ac5
Summary:
A spike in automatic GCs usually implies something has gone wrong. Log
an event for each one, recording the cache size prior to the GC and
the cache size after.
Reviewed By: simpkins
Differential Revision: D18902580
fbshipit-source-id: 158b2635733a415a9fcc7c412b2c0f44ed04aa01
Summary: Enabling getScmStatusBetweenRevisions to work on Windows and its tests.
Reviewed By: simpkins
Differential Revision: D18431271
fbshipit-source-id: eee82538e2fc3d7e371c96fc271cd9662ea6d737
Summary: There are a couple of functions in `store/Diff.h` which are not used elsewhere except for testing. These functions are tested by `diffCommitsForStatus()` anyways so these standalone functions are not needed.
Reviewed By: chadaustin
Differential Revision: D18690006
fbshipit-source-id: f2b24575c17403d7241896f35f4e0e16bb03b7ce
Summary: Even if blobs have different hashes, they could have the same contents. For example, if between the two revisions being compared, if a file was changed and then later reverted. In that case, the contents would be the same but the blobs would have different hashes. Currently, `getScmStatusBetweenRevisions()` would report false positives in this case. This is also needed so we do not report false positives in `getScmStatus()` when hit this code path
Reviewed By: simpkins
Differential Revision: D18647086
fbshipit-source-id: 66e12648a24fd7e5612eee5e599a5b81c7c5f2d1
Summary: Formatting had diverged in a few places. Fix that up.
Reviewed By: fanzeyi
Differential Revision: D18123219
fbshipit-source-id: 832cdd70789642f665a029196998928a9173be81
Summary: Removes `TreeDiffer` class and passes `DiffContext` through standalone `TreeDiffer` functions as first argument as per comment on D17400466 for setup for processing gitignores in the `TreeDiffer` codepath. (also this allows for easy implementation of short circut of `future_getScmStatusBetweenRevisions` similar to D17531102)
Reviewed By: chadaustin
Differential Revision: D17717977
fbshipit-source-id: d480d212474bd80aeac9cd9bb901f97562b62b13
Summary: Use callback to save ScmStatus instead of storing status inside a `TreeDiffer` object. Involes a bit of restructuring of some code to avoid circular dependencies and library creation. Mostly renames and file moves, some funtion moves as well.
Reviewed By: simpkins
Differential Revision: D17400466
fbshipit-source-id: fcd194a4c20204dffd3d11cd4a083564dc0ea938
Summary: Added the cli command `eden stats object-store` for querying the counts on what part of the object store was responsible for finding the blob or blob size (local store or backing store). This will tell us how well local and in-memory caching works for different workflows.
Reviewed By: chadaustin
Differential Revision: D15934535
fbshipit-source-id: 70345f11a51c3c6996dc001d4101744395a3d182
Summary:
Implements size-only local storage, as opposed to storing metadata. This is useful when the blob's SHA-1 is not needed. This diff prevents SHA-1 computations, which can be especially expensive for large blobs.
From D15934535, operations such as `ls -l` and `stat` will get the size of a blob in two ways:
1) The blob's size is already stored locally, so it will be deserialized and returned.
2) The blob is fetched from the backing store, stored, and its size is returned.
This diff optimizes the second case, because SHA-1 is no longer computed.
Reviewed By: strager
Differential Revision: D15723239
fbshipit-source-id: a868f3bf6b68a83ddafb057dc3e4e65f0a2dd989
Summary:
Update the copyright & license headers in CMake files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487079
fbshipit-source-id: 715e559464c19a0070d6e55a095b3fc7d61ad2f8
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary: The first step towards decoupling size from the metadata object in the local store. This is necessary for the next step, which is to serialize size separately from the combined metadata object.
Reviewed By: chadaustin
Differential Revision: D15719637
fbshipit-source-id: 979555e3e362f2b698b3af3a0b6db1e350e07dcd
Summary: Wrote tests for retrieving hashes from the object store and refactored existing tests. Includes positive and negative tests.
Reviewed By: strager, pkaush
Differential Revision: D15697488
fbshipit-source-id: 44aa8a36fbc1d1e56dcbbb6bcb665d3784cbb476
Summary: `getSize` and `getSha1` were misleading function names, since the functions refer to the size and hash of a given blob and not the object store itself. These functions have been renamed to `getBlobSize` and `getBlobSha1`.
Reviewed By: chadaustin
Differential Revision: D15696510
fbshipit-source-id: 4dd31659f60969fa90d8e2b39f43c46a2b7dff7c
Summary: I decoupled the getSize() function from the getMetadata() function, using a naive implementation for now. This was necessary because I want to add support for fetching only the size of a blob during a request like `ls -l`. Right now, the size and hash of a blob are coupled in a Metadata object, so if the size is requested, the whole file must be downloaded to calculate the hash, which is expensive for large files.
Reviewed By: chadaustin, strager
Differential Revision: D15678216
fbshipit-source-id: 8f68692768faaae0e65373ffe608d09ae49bbc42
Summary:
beholdunittests
This enables some plumbing for running some of the
tests using the gtest/gmock machinery in cmake.
Part of this diff is removing the FindGMock.cmake file from the
eden repo; we now pull this in from the shared cmake library
that is populated by shipit.
Reviewed By: simpkins
Differential Revision: D14993344
fbshipit-source-id: 51caf9518c7f3a083a3b90cda10324c3a8170359
Summary: this fixes linking the tests when building with cmake in D14993344
Reviewed By: chadaustin
Differential Revision: D15167223
fbshipit-source-id: 1dda3beed3b6ff3788f3992783c58b4b92f697f2
Summary:
Build the store tests on mode/mac. RocksDB doesn't build yet (and we
will likely move away from it anyway) so split it into its own target.
Reviewed By: simpkins
Differential Revision: D15002840
fbshipit-source-id: fa0fea1d52df06526032a6fd585e2da7273910ef
Summary: These allow more tests to compile
Reviewed By: chadaustin
Differential Revision: D14994582
fbshipit-source-id: 6b2a0b276fda64c7f27e28ea9e6d548aaaa1db7c
Summary:
If TreeInode::startLoadingInode() is in progress, and EdenServer::startTakeoverShutdown() is called, edenfs can deadlock:
1. Thread A: A FUSE request calls TreeInode::readdir() -> TreeInode::prefetch() -> TreeInode::startLoadingInode() on the children TreeInode-s -> RocksDbLocalStore::getFuture().
2. Thread B: A takeover request calls EdenServer::performTakeoverShutdown() -> InodeMap::shutdown().
3. Thread C: RocksDbLocalStore::getFuture() (called in step 1) completes -> TreeInode::inodeLoadComplete(). (The inodeLoadComplete continuation was registered by TreeInode::registerInodeLoadComplete().)
4. Thread C: After TreeInode::inodeLoadComplete() returns, the TreeInode's InodePtr is destructed, dropping the reference count to 0.
5. Thread C: InodeMap::onInodeUnreferenced() -> InodeMap::shutdownComplete() -> EdenMount::shutdown() (called in step 2) completes -> EdenServer::performTakeoverShutdown().
6. Thread C: EdenServer::performTakeoverShutdown() -> localStore_.reset() -> RocksDbLocalStore::~RocksDbLocalStore().
7. Thread C: RocksDbLocalStore::~RocksDbLocalStore() signals the thread pool to exit and waits for the pool's threads to exit. Because thread C is one of the threads managed by RocksDbLocalStore's thread pool, the signal is never handled and RocksDbLocalStore::~RocksDbLocalStore() never finishes.
Fix this deadlock by executing EdenServer::shutdown()'s callback (in EdenServer::performTakeoverShutdown()) on a different thread.
Reviewed By: simpkins
Differential Revision: D14337058
fbshipit-source-id: 1d63b4e7d8f5103a2dde31e329150bf763be3db7
Summary:
In the future, we will likely coalesce redundant fetches at the
BlobAccess/ObjectStore layer. To measure what we actually want,
populate a normal ObjectStore with a NullLocalStore and add counters
to FakeBackingStore.
Reviewed By: wez
Differential Revision: D13454331
fbshipit-source-id: 2fbf393d159f94e84c24ac53ccc207162fa754b7
Summary:
There was a bug in BlobCache where, if you had an interest handle to a
blob, but that blob was evicted anyway and then something else caused
it to be reloaded, dropping your interest handle would cause the blob
to be incorrectly evicted since the reference counts were no longer
compatible. Add a version to cache items and only decrement the
reference count on an item if the interest handle and item agree.
Reviewed By: strager
Differential Revision: D13405144
fbshipit-source-id: aee052bf777e7225551c3ae2b8b69a99f4f77691
Summary:
Have `eden stats` print the size of the blob cache if the running
edenfs has information about it.
Reviewed By: strager
Differential Revision: D13349220
fbshipit-source-id: 9f59f4399f2d4283aa80bcb54ba73c51d555d502
Summary:
A later diff needed a constant for the SHA-1 of an empty buffer. While
I'm at it, I made Hash a little bit nicer to use.
Reviewed By: strager
Differential Revision: D13224195
fbshipit-source-id: b2fb1437be042215b5b398a8c7fc9fc5dd115e9e
Summary: In support of making Eden's file access stateless, add a facade that connects the BlobCache and ObjectStore, allowing FileInode to fetch blobs, minimizing reloads and memory usage.
Reviewed By: strager
Differential Revision: D10850143
fbshipit-source-id: 4307f7c1143aecad1284ea3cadf3e4efca9a3925
Summary: Add a BlobCache with a maximum cache size and a minimum entry count and interest-based eviction.
Reviewed By: strager
Differential Revision: D12972062
fbshipit-source-id: 1958f7f500c051a5bc0b39b5b89a6f0fc1774b0f
Summary:
Update all of the C++ unit tests that create temporary files and directories
to use the new `facebook::eden::makeTempFile()` and
`facebook::eden::makeTempDir()` functions.
Note that this likely changes the behavior of some code paths in meaningful
ways: `/dev/shm` normally does not support `getxattr()`, and Eden's overlay
code attempts to store the SHA-1 for materialized files as using extended
attributes. This means that the tests will now typically hit the fallback
code path and will not store SHA-1 data in the overlay.
Reviewed By: chadaustin, strager
Differential Revision: D12971162
fbshipit-source-id: 6cc5eba2e04be7e9a13a30e90883feadfb78f9ce
Summary:
Because Mercurial blob IDs change without the contents changing, and
because files get unloaded upon checkout, rebasing across a large
distance in history can result in status fetching a lot of
metadata. Keep a smallish LRU cache for SHA-1 and size by blob ID.
Reviewed By: strager
Differential Revision: D10419965
fbshipit-source-id: 81499573814775471913db05f924767c3bab300e
Summary:
This is part of "the great r-valuification of folly::Future":
* This is something we should do for safety in general.
* Context: `Future::get(...)` means both `Future::get()` and `Future::get(Duration)`
* Using lvalue-qualified `Future::get(...)` has caused some failures around D7840699 since lvalue-qualification hides that operation's move-out semantics - leads to some use of future operations that are really not correct, but are not obviously incorrect.
* Problems with `Future::get(...) &`: it moves-out the result but doesn't invalidate the Future - the Future remains (technically) valid even though it actually is partially moved-out. Callers can subsequently access that moved-out result via things like `future.get(...)`, `future.result()`, `future.value()`, etc. - these access an already-moved-out result which is/can be surprising.
* Reasons `Future::get(...) &&` is better: its semantics are more obvious and user-testable. It moves-out the Future, leaving it with `future.valid() == false`.
Reviewed By: LeeHowes
Differential Revision: D8756764
fbshipit-source-id: 5c75c79cebcec77658a3a4605087716e969a376d
Summary:
Have getBlobMetadata always return a Future. It's a little unfortunate
that this will always allocate, but it sounds like we might decide to
put all RocksDB access on a background thread to increase CPU
parallelism.
Reviewed By: bolinfest
Differential Revision: D8101464
fbshipit-source-id: 6e9ec95050c366c7c57519e3f68b311470b2addd
Summary:
Remove getTreeFuture and have getTree always return a Future. It's a
little unfortunate that this will always allocate, but it sounds like
we might decide to put all RocksDB access on a background thread to
increase CPU parallelism.
Reviewed By: bolinfest
Differential Revision: D8101430
fbshipit-source-id: e12b7ab07b3468114a58753768655c107265b8af