Commit Graph

38 Commits

Author SHA1 Message Date
Andrey Chursin
ae684f3993 explicit Hash20 instead of Hash [proxy hash removal 2/n]
Summary:
This is a fairly mechanical diff that finalizes the split of Hash into ObjectId and Hash20.

More specifically this diff does two things:
* Replaces `Hash` with `Hash20`
* Removes alias `using Hash = Hash20`

Reviewed By: chadaustin

Differential Revision: D31324202

fbshipit-source-id: 780b6d2a422ddf6d0f3cfc91e3e70ad10ebaa8b4
2021-10-01 12:43:26 -07:00
Andrey Chursin
0af2511a3f separate out ObjectId [proxy hash removal 1/n]
Summary:
The goal of this stack is to remove the Proxy Hash type, but to achieve that we first need to address some tech debt in the Eden codebase.

For a long time EdenFs had a single Hash type that was used for many different use cases.

One of the major uses of the Hash type is to identify internal EdenFs objects such as blobs, trees, and others.

We seem to have reached agreement that we need a different type for those identifiers, so this diff introduces a separate ObjectId type to denote the new identifier type and replaces _some_ usages of Hash with ObjectId.

We still retain the original Hash type for other use cases.

Roughly speaking, this is how this diff separates between Hash and ObjectId:

**ObjectId**:
* Everything that is stored in the local store (blobs, trees, commits)

**Hash20**:
* Explicit hashes (Sha1 of the blob)
* Hg identifiers: manifest id and blob hg id

For now, in this diff, ObjectId has exactly the same content as Hash, but this will change in future diffs. Doing it this way keeps the diff size manageable, whereas migrating to the new ObjectId right away would produce an insanely large diff that would be hard both to write and to review.
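
As a minimal sketch of the split described above (not the actual EdenFS classes), the two types can share the same 20-byte content while remaining distinct to the compiler. The member names `Storage`, `kRawSize`, and `getBytes` are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical fixed-width hash, e.g. the SHA-1 of a blob or an hg identifier.
class Hash20 {
 public:
  static constexpr std::size_t kRawSize = 20;
  using Storage = std::array<std::uint8_t, kRawSize>;

  Hash20() = default; // zero-filled via the default member initializer
  explicit Hash20(const Storage& bytes) : bytes_(bytes) {}

  const Storage& getBytes() const { return bytes_; }
  bool operator==(const Hash20& other) const { return bytes_ == other.bytes_; }

 private:
  Storage bytes_{};
};

// Hypothetical identifier for objects kept in the local store.
// It has the same content as Hash20 for now, but it is a separate type,
// so its representation can change later without touching Hash20 users.
class ObjectId {
 public:
  ObjectId() = default;
  explicit ObjectId(const Hash20::Storage& bytes) : bytes_(bytes) {}

  const Hash20::Storage& getBytes() const { return bytes_; }
  bool operator==(const ObjectId& other) const { return bytes_ == other.bytes_; }

 private:
  Hash20::Storage bytes_{};
};

// The compiler rejects accidental mixing of the two identifier kinds.
static_assert(!std::is_convertible<Hash20, ObjectId>::value, "types stay distinct");
static_assert(!std::is_convertible<ObjectId, Hash20>::value, "types stay distinct");
```

Keeping the constructors `explicit` is what makes every remaining conversion between the two types visible at the call site.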

There are a few more things that need to be done before we can get to the meat of removing proxy hashes:

1) Replace include Hash.h with ObjectId.h where needed
2) Remove Hash type, explicitly rename rest of Hash usages to Hash20
3) Modify content of ObjectId to support new use cases
4) Modify serialized metadata and possibly other places that assume ObjectId size is fixed and equal to Hash20 size

Reviewed By: chadaustin

Differential Revision: D31316477

fbshipit-source-id: 0d5e4460a461bcaac6b9fd884517e129aeaf4baf
2021-10-01 10:25:46 -07:00
Genevieve Helsel
55378444c8 remove dynamic cast on backing store for getting the repo name
Summary: Instead of dynamic casting to find the repo name, all backing stores can return an optional repo name, and callers can check whether the optional is set.
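
A rough sketch of the pattern, assuming a virtual accessor named `getRepoName` (the real method name and return type are not given in this message); `NamedBackingStore` and `describeRepo` are hypothetical names used only for illustration:

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

class BackingStore {
 public:
  virtual ~BackingStore() = default;

  // By default a store has no repo name; subclasses opt in by overriding.
  virtual std::optional<std::string_view> getRepoName() const {
    return std::nullopt;
  }
};

// Hypothetical store that knows which repository it serves.
class NamedBackingStore : public BackingStore {
 public:
  explicit NamedBackingStore(std::string name) : name_(std::move(name)) {}

  std::optional<std::string_view> getRepoName() const override {
    return std::string_view(name_);
  }

 private:
  std::string name_;
};

// Callers check the optional instead of dynamic_cast-ing to a concrete type.
std::string describeRepo(const BackingStore& store) {
  if (auto name = store.getRepoName()) {
    return std::string(*name);
  }
  return "<unknown>";
}
```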

Reviewed By: zhengchaol

Differential Revision: D31214723

fbshipit-source-id: 9d10114ff6bde13254d3a3caaf2401f87d07ffd7
2021-09-27 17:30:29 -07:00
Xavier Deguillard
e6135bbf0a store: return fetch origin from backing store
Summary:
The `ObjectFetchContext::Origin::FromBackingStore` is widely interpreted as
meaning that a network fetch was performed, but for some backing stores, this
isn't true. The Mercurial backing store for instance can either read data from
its on-disk cache, or from the network. Since both have very different
characteristics we shouldn't bundle them in the same enum value.

Since the backing store knows how data was obtained, let's have the backing
store return how it was obtained to enable the ObjectStore to properly record
this information. The `FromBackingStore` is also renamed to make it clearer
what its purpose is.

Reviewed By: zhengchaol

Differential Revision: D31118906

fbshipit-source-id: ee42a0c9d221f870742de07c0df7c732bc79d880
2021-09-23 14:23:30 -07:00
Chad Austin
a4ba22dc48 rename Hash to Hash20
Summary:
In preparation for expanding to variable-width hashes, rename the
existing hash type to Hash20.

Reviewed By: genevievehelsel

Differential Revision: D28967365

fbshipit-source-id: 8ca8c39bf03bd97475628545c74cebf0deb8e62f
2021-09-08 16:27:10 -07:00
Xavier Deguillard
ae518f2784 store: devirtualize BackingStore::recordFetch
Summary:
The recordFetch method is an implementation detail of a BackingStore and thus we
don't need to explicitly make it virtual.

Differential Revision: D30459635

fbshipit-source-id: 34f847ca906f81924c99c26b4e8af646e91fd735
2021-08-23 11:05:03 -07:00
Xavier Deguillard
fe0ea26fdf store: avoid copying proxy hashes during prefetch
Summary:
Looking at strobelight when performing an `eden prefetch` shows that a lot of
time is spent copying data around. The list of hashes to prefetch is, for
instance, copied 4 times. Let's reduce this to a single copy, made when
converting Hash to a ByteRange.

Reviewed By: chadaustin

Differential Revision: D30433285

fbshipit-source-id: 922e6e5c095bd700ee133e9bb219904baf2ae1ac
2021-08-23 11:05:02 -07:00
Yipu Miao
a721dd3ef3 Add getTreeEntryForRootId API to BackingStore to get a tree entry by RootId
Summary: Add a new method to backingstore so we can get TreeEntry by rootID

Reviewed By: chadaustin

Differential Revision: D29889482

fbshipit-source-id: 93e63624e75c7d559c4de6f68821a8efa0e0c184
2021-08-20 17:11:23 -07:00
Mark Juggurnauth-Thomas
02c0bfc9e3 make hg inform edenfs of newly created root manifests
Summary:
If Mercurial asks EdenFS to update to a commit that it has just created, this
can cause a long delay while EdenFS tries to import the commit.

EdenFS needs to resolve the commit to a root manifest.  It does this via the
import helper, but the import helper won't know about the commit until it is
restarted, which takes a long time.

To fix this, we add an optional "root manifest" parameter to the checkout or
reset parents thrift calls.  This allows the Mercurial client to inform EdenFS
of the root manifest that it already knows about, allowing EdenFS to skip this
step.
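
The effect of the optional parameter can be sketched as follows. `RootResolver` and its slow lookup callback are hypothetical stand-ins for the import helper path; the real thrift signatures are not shown in this message.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical resolver mapping a commit id to its root manifest id.
class RootResolver {
 public:
  using SlowLookup = std::function<std::string(const std::string&)>;

  explicit RootResolver(SlowLookup lookup) : lookup_(std::move(lookup)) {}

  // If the caller already knows the root manifest, use it and skip the
  // expensive lookup; otherwise fall back to the slow path.
  std::string resolve(
      const std::string& commit,
      const std::optional<std::string>& rootManifest = std::nullopt) {
    if (rootManifest) {
      return *rootManifest;
    }
    ++slowLookups_;
    return lookup_(commit);
  }

  int slowLookups() const { return slowLookups_; }

 private:
  SlowLookup lookup_;
  int slowLookups_ = 0;
};
```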

Reviewed By: chadaustin

Differential Revision: D29845604

fbshipit-source-id: 61736d84971cd2dd9a8fdaa29a1578386246e4bf
2021-07-29 10:01:02 -07:00
Chad Austin
bb1cccac89 introduce a variable-width RootId type that identifies the root of an EdenFS checkout's contents
Summary:
Backing stores differentiate between individual tree objects and the
root of a checkout. For example, Git and Mercurial roots are commit
hashes. Allow EdenFS to track variable-width roots to better support
arbitrary backing stores.
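
One plausible shape for such a type is a thin wrapper around a string, so that a root can be any length. The sketch below is an assumption for illustration; the accessor name `value` is not taken from the source.

```cpp
#include <string>
#include <utility>

// Hypothetical variable-width root identifier. The contents are opaque here:
// only the backing store knows how to interpret them.
class RootId {
 public:
  RootId() = default;
  explicit RootId(std::string id) : id_(std::move(id)) {}

  const std::string& value() const { return id_; }

  bool operator==(const RootId& other) const { return id_ == other.id_; }
  bool operator!=(const RootId& other) const { return !(*this == other); }

 private:
  std::string id_;
};
```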

Reviewed By: genevievehelsel

Differential Revision: D28619584

fbshipit-source-id: d94f1ecd21a0c416c1b4933341c70deabf386496
2021-06-07 17:25:31 -07:00
Chad Austin
894eaa9840 move root ID parsing and rendering into BackingStore
Summary:
The meaning of the root ID is defined by the BackingStore, so move
parsing and rendering into the BackingStore interface.

Reviewed By: xavierd

Differential Revision: D28560426

fbshipit-source-id: 7cfed4870d48016811b604348742754f6cdbd842
2021-06-03 11:07:14 -07:00
Chad Austin
d45b2711a2 remove the dead getTreeForManifest
Summary: getTreeForManifest is no longer called, so remove it.

Reviewed By: genevievehelsel

Differential Revision: D28306796

fbshipit-source-id: e51a32fa7d75c54b2e3525e88c162247b4496560
2021-05-10 11:53:30 -07:00
Katie Mancini
be0cd8da1e enable skipping Metadata prefetches during eden prefetches
Summary:
This is the plumbing to allow us to skip Metadata prefetching during eden
prefetches. These can trigger a bunch of wasted network requests
when we are fetching files anyways. (These network requests are wasted since we
fetch the file contents and most of them are being throttled on sandcastle anyways.)

We won't necessarily always want to skip metadata prefetching: we will still want it
for the watchman queries, but for `eden prefetch` we will probably want to skip it. This
is why we are making it an option in the GlobParams.

Reviewed By: chadaustin

Differential Revision: D24640754

fbshipit-source-id: 20db62d4c0e59fe17cb6535c86ac8f1e3877879c
2020-11-11 16:30:02 -08:00
Ailin Zhang
106ce3af12 record fetch in HgQueuedBackingStore
Summary:
After `startRecordingFetch()` is called on `HgQueuedBackingStore`, record the path for each fetched file. When `stopRecordingFetch()` is called, stop recording and return the collected file paths.

`startRecordingFetch()` and `stopRecordingFetch()` will be used in the next diff.
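
A minimal sketch of this start/stop recording behavior, assuming a mutex-guarded set of paths. The `FetchRecorder` class and the choice of `std::unordered_set` are illustrative assumptions, not the actual `HgQueuedBackingStore` code.

```cpp
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical helper that collects fetched paths between start and stop.
class FetchRecorder {
 public:
  void startRecordingFetch() {
    std::lock_guard<std::mutex> guard(mutex_);
    recording_ = true;
  }

  // Called for every fetched file; a no-op unless recording is active.
  void recordFetch(const std::string& path) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (recording_) {
      paths_.insert(path);
    }
  }

  // Stops recording and hands the collected paths to the caller.
  std::unordered_set<std::string> stopRecordingFetch() {
    std::lock_guard<std::mutex> guard(mutex_);
    recording_ = false;
    std::unordered_set<std::string> collected;
    collected.swap(paths_);
    return collected;
  }

 private:
  std::mutex mutex_;
  bool recording_ = false;
  std::unordered_set<std::string> paths_;
};
```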

Reviewed By: chadaustin

Differential Revision: D22797037

fbshipit-source-id: a1fe30424d3c2884ffe139a0062b1e36328fd4fe
2020-08-04 06:50:45 -07:00
Katie Mancini
b71d531a39 add data fetch logging for prefetch
Summary:
This adds logging for files fetched in prefetch, as was already added for
blob and tree fetches.

This is needed to log the fetches caused by the glob files thrift call. The
purpose of this is to help debug the cause of unexpected data fetches (See
D22448048 for more motivation).

Reviewed By: genevievehelsel

Differential Revision: D22561619

fbshipit-source-id: 5ae78b99fb0c7d863d8223b93492b0d0210ddf9e
2020-07-26 23:09:40 -07:00
Ailin Zhang
03a2308028 get priority from ObjectFetchContext in BackingStore
Summary: Previously, the `getBlob` and `getTree` methods of `BackingStore` and all its sub-classes accepted both `ObjectFetchContext` and `ImportPriority` as arguments. Now, `ImportPriority` is removed because we can get the priority from `ObjectFetchContext`.
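
The signature change can be sketched like this. The `getPriority` accessor, the `PrefetchFetchContext` subclass, and the request queue are assumptions for illustration; only the type names `BackingStore`, `ObjectFetchContext`, and `ImportPriority` come from the message.

```cpp
#include <string>
#include <vector>

enum class ImportPriority { Low, Normal, High };

class ObjectFetchContext {
 public:
  virtual ~ObjectFetchContext() = default;
  // Hypothetical accessor: the context carries the priority of the request.
  virtual ImportPriority getPriority() const { return ImportPriority::Normal; }
};

// Hypothetical context used for background prefetches.
class PrefetchFetchContext : public ObjectFetchContext {
 public:
  ImportPriority getPriority() const override { return ImportPriority::Low; }
};

struct QueuedRequest {
  std::string id;
  ImportPriority priority;
};

class BackingStore {
 public:
  // Before: getBlob(id, context, priority). The separate priority argument
  // is gone because the context already knows it.
  void getBlob(const std::string& id, const ObjectFetchContext& context) {
    queue_.push_back({id, context.getPriority()});
  }

  const std::vector<QueuedRequest>& queue() const { return queue_; }

 private:
  std::vector<QueuedRequest> queue_;
};
```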

Reviewed By: kmancini

Differential Revision: D22650629

fbshipit-source-id: e1b0c57a059f11504b28b2c17d698bb58f51e1ee
2020-07-24 08:24:02 -07:00
Katie Mancini
a0b05b4bf0 thread ObjectFetchContext to backing store
Summary:
This passes ObjectFetchContext into the backing store to prepare for adding
logging for the cause of server fetches.

In following changes I will add logging in the HgQueuedBackingStore.
Ultimately we will want to move this logging to be closer to the data fetching
(in HgDatapackStore and HgImporter), but I plan to temporarily add logging to
the HgQueuedBackingStore to simplify so that we can more quickly roll out.

Reviewed By: chadaustin

Differential Revision: D22022992

fbshipit-source-id: ccb428458cbf7a1e33aaf9be9d0d766c45acedb3
2020-06-23 10:02:40 -07:00
Zeyi (Rice) Fan
bcc69fc668 implement prefetch for HgQueuedBackingStore
Summary: Prefetch is actually missing from `HgQueuedBackingStore`. This diff implements it and assigns low priority to these prefetch requests.

Reviewed By: chadaustin

Differential Revision: D20655681

fbshipit-source-id: f3c92b358e16e980390ac7adcae27d41ae5a7277
2020-04-06 19:12:42 -07:00
Wez Furlong
0a6aa21d77 eden: fix multiply defined symbols issue with ImportPriority
Summary:
This is a rough pass that resolves a linker issue on MSVC by
switching to inline static member functions.

Reviewed By: chadaustin

Differential Revision: D20529163

fbshipit-source-id: 578ed440758c685091d3e039e261638e027db17a
2020-03-20 10:56:08 -07:00
Zeyi (Rice) Fan
2da686d315 add priority to BackingStore interface
Summary: This diff adds the `Priority` type introduced in the previous diff to the `BackingStore` interface, with the default value set to `Priority::Normal`.

Reviewed By: chadaustin

Differential Revision: D20197071

fbshipit-source-id: a92f1b49bb82e3478042e5e3b79b047d834755ea
2020-03-17 02:31:23 -07:00
Zeyi (Rice) Fan
a431e64e4e eden: periodically refresh content store to give it chances to release mapped files
Summary:
As reported by JT, EdenFS may hold the file descriptors of mapped pack files for too long, even after the files are deleted by external processes, thus taking up more disk space.

This diff fixes this problem by making EdenFS periodically rescan the pack files.

Reviewed By: chadaustin

Differential Revision: D19395439

fbshipit-source-id: 4bfd6a7ac13dceb3099d2704d62b3825433aff4b
2020-01-17 15:00:01 -08:00
Zeyi (Rice) Fan
071a3682fd eden: semi-futurify BackingStore::getTree
Summary: D18669664

Reviewed By: chadaustin

Differential Revision: D18670157

fbshipit-source-id: 74c8c3e2fae16973079e5d3d3bc2fe18adf088a7
2019-12-20 16:14:22 -08:00
Zeyi (Rice) Fan
99c2a6ca2e eden: semi-futurify BackingStore::getTreeForCommit
Summary: Same as D18669664.

Reviewed By: chadaustin

Differential Revision: D18669966

fbshipit-source-id: 54974528259a91f8f222bd60e897d28f41675351
2019-12-20 16:14:22 -08:00
Zeyi (Rice) Fan
aeaa26c274 eden: semi-futurify BackingStore::getTreeForManifest
Summary: see D19107687

Reviewed By: genevievehelsel

Differential Revision: D19107687

fbshipit-source-id: 652084ad0e64884d6273a4206d26d572915e3a51
2019-12-20 16:14:22 -08:00
Zeyi (Rice) Fan
068ff196bd eden: SemiFuture-ify BackingStore::getBlob
Summary:
This diff turns the return type of `BackingStore::getBlob` from `folly::Future` into `folly::SemiFuture` to prevent executor leaks.

This also enables us to remove the need for backing stores to hold `serverThreadPool`.

----

**Changes**

* `ObjectStore` now needs to hold a `folly::Executor::KeepAlive` that is used to turn `SemiFuture`s it gets from backing stores into `Future`.
* Signature changes of the implementations of `BackingStore` class.
* For tests, I chose to use `QueuedImmediateExecutor` in place of `UnboundedQueueExecutor` as it will basically execute tasks inline. I'm concerned that introducing a thread pool executor in tests may make them flaky.

Reviewed By: wez

Differential Revision: D18669664

fbshipit-source-id: 0cae89f365dcf8b345b49d64469a530cf25d4ac5
2019-12-20 16:14:21 -08:00
Chad Austin
11589b3595 enable and fix more warnings
Summary: Enable a couple nice-to-have warnings.

Reviewed By: simpkins

Differential Revision: D18800030

fbshipit-source-id: 5bca073c6dd0b2d40ba8c2c9725fe152f20042a5
2019-12-20 16:14:17 -08:00
Genevieve Helsel
504a255355 add getTreeForManifest
Summary: Adds a function which takes both the manifestID and the commitID to get a Tree. This will be used in `checkOutRevision()` and this allows us to skip looking up the manifestID since the caller can just pass it in themselves.

Reviewed By: wez

Differential Revision: D18719405

fbshipit-source-id: 919f0a7c84bff4a2f0bc20110c45bd272f9e9107
2019-12-09 16:25:27 -08:00
Andres Suarez
fbdb46f5cb Tidy up license headers
Reviewed By: chadaustin

Differential Revision: D17872966

fbshipit-source-id: cd60a364a2146f0dadbeca693b1d4a5d7c97ff63
2019-10-11 05:28:23 -07:00
Adam Simpkins
aa5e6c7295 update license headers in C++ files
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2

Reviewed By: wez

Differential Revision: D15487078

fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
2019-06-19 17:02:45 -07:00
Chad Austin
2a9e1e2f29 remove fallback for correcting empty files
Summary:
Now that the import bug has been fixed for some time, it's likely few
people have cached empty files. And if they do, `eden gc; eden
restart` is a fine workaround.

On the other hand, reimporting empty files gums up the importer, and
I've seen several people recently complaining about performance that
could be partially attributed to the fact that their Eden needed to
verify empty files. simpkins has talked about adding a bit to the
cached blob to determine whether it needs reimporting or not, but it's
probably going to take a while for anyone to implement that, and this
reverification logic is hurting people today.

Reviewed By: strager

Differential Revision: D10456519

fbshipit-source-id: 657bc377ee16ce93494075bde4388aed59dceecf
2018-10-22 20:27:26 -07:00
Adam Simpkins
a88763ae96 re-verify blob contents for empty blobs loaded from the LocalStore
Summary:
When we load an empty blob from the LocalStore double check with the
BackingStore to confirm that it should actually be empty.

We have seen multiple instances of files being incorrectly imported as empty.
So far this error has always been fixed by a re-import.  We still haven't
tracked down the root cause, but this change should help work around the issue
by ensuring that we double check the file contents before returning the data.

Reviewed By: chadaustin

Differential Revision: D9476522

fbshipit-source-id: 6d57cf15c42736ecbcb106a731259b77db06d8f1
2018-08-23 14:22:58 -07:00
Adam Simpkins
a5f53c6e3a avoid blocking on HgProxyHash::getBatch() in hg importer threads
Summary:
D8065370 changed the HgImporter code to make a blocking `get()` call on a
future from the hg importer thread pool.  This caused deadlocks, since all of
the hg importer threads could become stuck waiting on these `get()` calls to
complete.  These would be waiting on RocksDbLocalStore threads which were in
turn all busy waiting to schedule operations on the HgImporter threads.

This fixes the code to use `Future::then()` rather than `Future::get()` to
avoid blocking the HgImporter threads on these operations.

Reviewed By: wez

Differential Revision: D8438777

fbshipit-source-id: a0d647b10ef5a182be2d19f636c2dbc24eab1b23
2018-06-14 22:02:38 -07:00
Wez Furlong
8be54b4a1b prefetch file batch for hg import helper
Summary:
This removes the main point of contention for eden prefetch
in two ways:

1. We batch up the complete list of blobs so that they can be processed
   in bulk rather than stalling the tree walk
2. We can ask remotefilelog to check and fetch that list to the local
   hgcache, again as a batch, rather than by forcing the data to be
   loaded through into the local store

The goal of this prefetch is to bulk load data from the mercurial server
so that a subsequent file access doesn't have to make a one-off ssh session
for each one, rather than making sure that all the data is loaded into
the local store.
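
The batching idea from point 1 can be sketched as a tree walk that only collects blob ids, followed by a single bulk request. `Tree`, `BatchImporter`, and `prefetchTree` are hypothetical names; the real import helper and remotefilelog calls are not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory tree: blob ids at this level plus child trees.
struct Tree {
  std::vector<std::string> blobIds;
  std::vector<Tree> children;
};

// Walk the whole tree first, collecting ids without fetching anything.
void collectBlobIds(const Tree& tree, std::vector<std::string>& out) {
  out.insert(out.end(), tree.blobIds.begin(), tree.blobIds.end());
  for (const Tree& child : tree.children) {
    collectBlobIds(child, out);
  }
}

// Stand-in for the import helper: counts how many bulk requests it receives.
class BatchImporter {
 public:
  void prefetchBatch(const std::vector<std::string>& ids) {
    ++requests_;
    blobsRequested_ += ids.size();
  }

  int requests() const { return requests_; }
  std::size_t blobsRequested() const { return blobsRequested_; }

 private:
  int requests_ = 0;
  std::size_t blobsRequested_ = 0;
};

// One bulk request for the complete list, instead of one request per blob.
void prefetchTree(const Tree& root, BatchImporter& importer) {
  std::vector<std::string> ids;
  collectBlobIds(root, ids);
  if (!ids.empty()) {
    importer.prefetchBatch(ids);
  }
}
```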

Reviewed By: chadaustin

Differential Revision: D7965818

fbshipit-source-id: 753400460d633b5467c5110e3f5608ce06106e00
2018-05-25 13:51:27 -07:00
Chad Austin
8b9261f2a1 run clang-format across all C++ files
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.

This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`

Reviewed By: bolinfest

Differential Revision: D6232695

fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
2017-11-03 16:02:03 -07:00
Adam Simpkins
251da81f36 update all copyright statements to "2016-present"
Summary:
Update copyright statements to "2016-present".  This makes our updated lint
rules happy and complies with the recommended license header statement.

Reviewed By: wez, bolinfest

Differential Revision: D4433594

fbshipit-source-id: e9ecb1c1fc66e4ec49c1f046c6a98d425b13bc27
2017-01-20 22:03:02 -08:00
Adam Simpkins
fc202f81e5 add new Future-based APIs to ObjectStore
Summary:
Update the ObjectStore and BackingStore classes to have APIs that return
folly::Future objects, rather than blocking until the requested data is loaded.

For now most users still call the blocking versions of getBlob() and getTree().
Furthermore, all of the Future-based implementations actually still block
until the data is ready.  I will update the code to use these new APIs in
future diffs, and then deprecate the non-future based versions.

Reviewed By: bolinfest

Differential Revision: D4318055

fbshipit-source-id: a250c23b418e69b597a4c6a95dbe80c56da5c53b
2016-12-13 18:12:21 -08:00
Adam Simpkins
eae8ee41e9 start adding an HgBackingStore implementation
Summary:
This adds an HgBackingStore implementation which can load tree data from a
mercurial repository.  Blob loading is not implemented yet, but will come in a
separate diff.

This also adds a minimal GitBackingStore class.  The GitBackingStore has nearly
no functionality, but is needed to keep the existing git functionality working.

Reviewed By: bolinfest

Differential Revision: D3409743

fbshipit-source-id: dbebf53e9de08bd1469e489baa48b84cbf889511
2016-06-13 15:16:30 -07:00
Adam Simpkins
d9be0757b8 add a BackingStore API
Summary:
Add the basic BackingStore interface, plus a NullBackingStore implementation
that always returns null.  This updates the ObjectStore to query the
BackingStore if data is not found in the LocalStore.

Additionally, this updates EdenServer to manage the BackingStore objects.  It
maintains a map of the BackingStore objects created for each known repository.

Reviewed By: bolinfest

Differential Revision: D3409602

fbshipit-source-id: 2920dc4c24ee1ec37efb542f058d0d121ceb5532
2016-06-13 15:16:29 -07:00