Commit Graph

16 Commits

Author SHA1 Message Date
Wez Furlong
0de8d10208 move LocalStore::PutXXX methods to a LocalStore::WriteBatch class
Summary:
Add LocalStore::WriteBatch class so that LocalStore users
can manage own batches independently.  This makes it safer to
perform concurrent batched updates as the current implementation
assumes (incorrectly) that it has exclusive ownership of the batch mode setting.

Reviewed By: simpkins

Differential Revision: D5911697

fbshipit-source-id: 62005245ce2542b87eac206a15d8657bcca81d83
2017-09-28 20:53:40 -07:00
Wez Furlong
78ef6207e3 use rocksdb column families to separate key spaces
Summary:
We're running down a performance problem with hg status that we believe
is something happening at a higher level in the code but noticed that there
were a lot of reads of the rocksdb SST files.  In the strace output for those
we observed file content data being read.  The status operation shouldn't
need file contents; what's happening is that we're over-fetching some metadata
but happen to be scooping up the file contents from the SST file because we
use the same key prefixes and differentiate the keyspace with key suffixes.

This diff implements the use of rocksdb column families to partition things
more effectively and results in a speed up of around 6x in this scenario.

Furthermore, applying point lookup optimization options yields an additional
2x performance improvement to our rocksdb performance.

As part of this diff, I've removed the hash set that we were using to allow
checking whether a key was present in the store; it wasn't very useful
and would have had to be split into one set per keyspace with this diff;
easier to just remove it.

Reviewed By: bolinfest

Differential Revision: D5781906

fbshipit-source-id: 97f068ade546fd09f391e60a7a57fec0e9081e67
2017-09-07 14:50:42 -07:00
Adam Simpkins
a6ae3edab9 move eden/utils and eden/fuse into eden/fs
Summary:
This change makes it so that all of the C++ code related to the edenfs daemon
is now contained in the eden/fs subdirectory.

Reviewed By: bolinfest, wez

Differential Revision: D4889053

fbshipit-source-id: d0bd4774cc0bdb5d1d6b6f47d716ecae52391f37
2017-04-14 11:39:02 -07:00
Wez Furlong
d32fc630f2 switch hg manifest import to two passes
Summary:
previously, the importer would read the entire manifest
and emit data to the store as it resolved complete directory entries.
The entire manifest data would be buffered and sent out to the store.

In the scenario where one subtree has been modified and a commit has
been made, only the parents of the subdirectory need to be hashed
and stored, but we would compute and try to store everything anyway.

While this diff can't avoid having to compute hashes for everything (we need
tree manifest data for that), by breaking the import into two passes we can
potentially avoid interrogating the LocalStore about every tree in the entire
manifest during an import; we only need to store the trees that are missing and
can simply cut out sub trees that are already present.

This saves us IO at the expense of buffering the manifest tree in memory
for the duration of the first pass; it is an acceptable trade off.

Reviewed By: simpkins

Differential Revision: D4832166

fbshipit-source-id: 0a40cb851c65393b407a8161db05c4b1795fb11a
2017-04-06 10:52:06 -07:00
Wez Furlong
d808e1d46c improve batching during importing
Summary:
This explicitly uses a WriteBatch during importing.  This results in a
5x perf gain for import operations.

I've also added in a read-before-write check to make sure that we're not
attempting to overwrite a key that already exists.  Since this is a
content-addressed store, a given key should always have the same
value; it should be immutable.  Overwriting an existing key is an
opportunity to change data that should never change and is also wasteful
of IO bandwidth.

Ideally we'd be able to exploit this to avoid descending into a directory;
I'd like to hook that up to the treemanifest stuff in mercurial in a
follow on diff.

Reviewed By: bolinfest

Differential Revision: D4783598

fbshipit-source-id: 24f95495d09a852859b346559fdf83d064de2b51
2017-03-31 11:20:14 -07:00
Adam Simpkins
251da81f36 update all copyright statements to "2016-present"
Summary:
Update copyright statements to "2016-present".  This makes our updated lint
rules happy and complies with the recommended license header statement.

Reviewed By: wez, bolinfest

Differential Revision: D4433594

fbshipit-source-id: e9ecb1c1fc66e4ec49c1f046c6a98d425b13bc27
2017-01-20 22:03:02 -08:00
Adam Simpkins
970159aa5c normalize path arguments
Summary:
This updates the EdenServer and LocalStore classes to require more arguments be
passed in as AbsolutePath arguments rather than just plain strings.

This updates the main program to process path arguments using canonicalPath().

Reviewed By: bolinfest

Differential Revision: D4332273

fbshipit-source-id: 3d235a767963b11129c3897ad027ad761b6dae50
2016-12-15 13:02:38 -08:00
Adam Simpkins
c81ee4c997 update getSha1ForBlob() to return the Hash by value
Summary:
Hash objects are small enough (20 bytes) that it isn't worth allocating them on
the heap.  This updates LocalStore::getSha1ForBlob() to return a
folly::Optional<Hash>, and ObjectStore::getSha1ForBlob() to return a plain
Hash.

Reviewed By: bolinfest

Differential Revision: D4298162

fbshipit-source-id: 9cf54f2997ba8c3b2346db315a2aca41e580b078
2016-12-12 17:50:36 -08:00
Adam Simpkins
5857b36df8 update LocalStore to also store the blob size along with its SHA-1
Summary:
In addition to storing the SHA-1 of each file's contents, also store the size.
This will allow us to more quickly look up the file size, without having to
retreive the file size.

I haven't yet added an API to ObjectStore to retreive the full BlobMetadata
object; I will do that in a subsequent diff.  One benefit for now is that this
does avoid double-computing the SHA-1 in ObjectStore::getSha1ForBlob() if we
had to load the blob.

Reviewed By: bolinfest

Differential Revision: D4298157

fbshipit-source-id: 4d83ebfa631c93fcef06ca1cd0ba0e1a70a2476d
2016-12-12 17:50:35 -08:00
Caren Thomas
adc13d4ed6 make put and get for trees/blobs symmetric
Summary: This change updates LocalStore to perform serialization of trees and blobs internally so that its users don't need to be aware of the internal serialization format. Previously, the get and put APIs were asymmetric such that the get APIs returned deserialized Tree and Blob objects, while put required raw serialized bytes. After this change, put will also use deserialized Tree and Blob objects.

Reviewed By: simpkins

Differential Revision: D3589899

fbshipit-source-id: 2e572e6ec5af44d66206b178a03f7a9d619b2290
2016-07-25 12:34:25 -07:00
Adam Simpkins
5f639c037b support retrieving file data from mercurial
Summary:
Update the HgImporter class to support retrieving file contents from mercurial.

This also includes simple code for storing the data in the LocalStore using
git's blob serialization format.  In the future I think it would perhaps be
better to drop the "blob<length>" prefix, and instead just use a RocksDB column
family to separate blob data from other types of data.  However, for now using
the git format is simplest for keeping compatibility with the getBlob() code.

Reviewed By: bolinfest

Differential Revision: D3416691

fbshipit-source-id: 268787533be2172b2dbedc3bf06464eabf3d2c5e
2016-06-15 14:24:11 -07:00
Adam Simpkins
6a9f974f31 add a generic LocalStore get() and put() methods
Summary:
Add APIs for storing arbitrary (key, value) data.

This will allow BackingStore implementations to store additional metadata, such
as mapping mercurial commit IDs to the eden root tree ID.

Eventually we may want to use RocksDB column families to partition the
different types of data being put into the LocalStore.  However, for now this
just uses a single key space.  We can add column family support in a separate
diff, if desired.

Reviewed By: bolinfest

Differential Revision: D3409866

fbshipit-source-id: 19a1d340b65bff2081981bf5daf32d5ad15b60c4
2016-06-13 15:16:30 -07:00
Adam Simpkins
eae8ee41e9 start adding an HgBackingStore implementation
Summary:
This adds an HgBackingStore implementation which can load tree data from a
mercurial repository.  Blob loading is not implemented yet, but will come in a
separate diff.

This also adds a minimal GitBackingStore class.  The GitBackingStore has nearly
no functionality, but is needed to keep the existing git functionality working.

Reviewed By: bolinfest

Differential Revision: D3409743

fbshipit-source-id: dbebf53e9de08bd1469e489baa48b84cbf889511
2016-06-13 15:16:30 -07:00
Adam Simpkins
1b36d4bf83 add a StoreResult class
Summary:
Add a new StoreResult which wraps the std::string returned by RocksDB.

This replaces the std::unique<string> that LocalStore::get() used to return.
This lets us avoid a memory allocation.  StoreResult can also represent a "not
found" result, so that this case can be processed efficiently without having to
throw an exception.

Additionally, StoreResult is move-only so we can't ever unintentionally copy
the string data, which is potentially expensive.  It also provides APIs for
creating IOBuf wrappers, or moving the string to the heap so we can create an
managed IOBuf around it.

Reviewed By: bolinfest

Differential Revision: D3403958

fbshipit-source-id: ab0c304988a53eda50341ecc2f96ae5235e5260c
2016-06-08 19:01:13 -07:00
Adam Simpkins
96cea91e54 various minor efficiency improvements in LocalStore
Summary:
- Add a Sha1Key class that can more efficiently compute the key for
  file content SHA-1 values, without having to copy it into a new std::string
  object.  (In practice fbstring would have avoided having to actually allocate
  memory, but it was still an extra data copy.)

- The code was always converting the hash keys to hex on get and put
  operations, just in case it needed it if an error occurred.  This diff
  changes the code to only compute the hex value if an error actually occurred.

Reviewed By: bolinfest

Differential Revision: D3403889

fbshipit-source-id: 5abd8ef202cb00677a84a03a82e2a3d21f16cd2f
2016-06-08 14:54:01 -07:00
Facebook Github Bot 5
2eeea32117 Initial commit
fbshipit-source-id: 2bcefbd0cd127cc5ea982e074ea6819d7aac3d7a
2016-05-12 14:09:13 -07:00