Summary:
Originally I thought this would help move towards removing a
`future.get()` call from FileInode, but it turned out to not make a difference
to that code.
It does make it a bit less of a chore to deal with the Journal related diff
callbacks added in D5896494 though, and is a move towards a future where we
could potentially return cached and shared instances of these objects.
This diff is a mechanical change to alter the return type so that we can share
instances returned from the object store interface. It doesn't change any
functionality.
Reviewed By: simpkins
Differential Revision: D5919268
fbshipit-source-id: efe4b3af74e80cf1df20e81b4386450b72fa2c94
Summary:
akushner reported this error situation with full details in this paste:
{P58193090}
(see also comments on the paste which include the edenfs.log)
Summarizing here, what appears to be happening is that there is a local commit
for which there is no local tree manifest data. We try to load this from the
local treepacks and fail, so we fall back to asking the server, which fails.
In this case we want to fall back to performing a flat manifest import but
since the error is a generic runtime_error it doesn't get caught by the
code that decides to fallback on line 639.
Let's assume that an error reading the chunk header for the prefetchtrees
request is really a MissingKeyError so that we can attempt the fallback.
Reviewed By: simpkins
Differential Revision: D5898439
fbshipit-source-id: ea4f3f3ef22ee72c888c592a93c5c8b9269f39b0
Summary:
Add LocalStore::WriteBatch class so that LocalStore users
can manage own batches independently. This makes it safer to
perform concurrent batched updates as the current implementation
assumes (incorrectly) that it has exclusive ownership of the batch mode setting.
Reviewed By: simpkins
Differential Revision: D5911697
fbshipit-source-id: 62005245ce2542b87eac206a15d8657bcca81d83
Summary:
We're running down a performance problem with hg status that we believe
is something happening at a higher level in the code but noticed that there
were a lot of reads of the rocksdb SST files. In the strace output for those
we observed file content data being read. The status operation shouldn't
need file contents; what's happening is that we're over-fetching some metadata
but happen to be scooping up the file contents from the SST file because we
use the same key prefixes and differentiate the keyspace with key suffixes.
This diff implements the use of rocksdb column families to partition things
more effectively and results in a speed up of around 6x in this scenario.
Furthermore, applying point lookup optimization options yields an additional
2x performance improvement to our rocksdb performance.
As part of this diff, I've removed the hash set that we were using to allow
checking whether a key was present in the store; it wasn't very useful
and would have had to be split into one set per keyspace with this diff;
easier to just remove it.
Reviewed By: bolinfest
Differential Revision: D5781906
fbshipit-source-id: 97f068ade546fd09f391e60a7a57fec0e9081e67
Summary:
The on-disk treemanifest store doesn't contain an empty tree for the null
manifest ID. We have to explicitly check for this ID in software and generate
an empty tree in this case.
Reviewed By: wez
Differential Revision: D5750053
fbshipit-source-id: d1a58df45f9025ff5a4757f0a814f92dd58798e8
Summary:
The fixes to the fastmanifest code should be landing in prod
tomorrow, so this diff turns on trees in anticipation of that.
Folks working in eden will need to also use hg-dev until this ships out.
Reviewed By: simpkins
Differential Revision: D5740960
fbshipit-source-id: fac3b59183ceb63b6af704715fb5a5b9daed013d
Summary:
this is required together with D5711177 to successfully perform an `hg
amend`. The changes in D5711177 cause the pending pack files to get written
out and the amend command will then call into eden to change the parents of the
commit.
We need to be able to resolve the tree from the packfiles when this happens,
but since this happens within the default refresh interval in the store code
(which is ~100ms) we need to explicitly refresh the set of pack files.
This is most easily accomplished by forcing a refresh on a tree miss.
This is probably fine if you assume that we won't legitimately be asked
to resolve non-existent trees very frequently.
Reviewed By: simpkins
Differential Revision: D5712623
fbshipit-source-id: 4d0034affcc276f1ae29caac36aa5596e52cd746
Summary:
Currently treemanifest import fails when a mercurial transaction is in progress
(if the tree in question was created by the pending transaction), and this
breaks many mercurial workflows.
Turning treemanifest back off by default, to fall back to the slower but
functional flat manifest import.
D5685880 adds framework to run the eden integration tests with treemanifest
enabled; those tests can be used to tell when treemanifest is working well
enough that we can turn it back on.
Reviewed By: bolinfest
Differential Revision: D5692894
fbshipit-source-id: 5ee84cf73db4cd87bbdaae1edd92c74058fa00e2
Summary:
Change the default value for --use_hg_tree_manifest to true, to re-enable
treemanifest import by default. This should work much better now that we are
using the latest treemanifest code and now support fetching treemanifest data
from the remote server when it is not found locally.
Reviewed By: bolinfest
Differential Revision: D5647204
fbshipit-source-id: 424baf4de1d3b247c1e04a838040baeb5976404b
Summary:
This refactors the treemanifest union store initialization code, and fixes a
crash introduced in D5560787 if --use_hg_tree_manifest is enabled but the
underlying repository does not support treemanifest.
This moves the treemanifest initialization code into a new importTreeManifest()
helper function, and separates it from parsing the CMD_STARTED response. We
now only initialize unionStore_ if FLAGS_use_hg_tree_manifest is enabled and
the repository supports treemanifest. Other places in the code now only check
if unionStore_ is non-null to see if they should try importing via
treemanifest.
Splitting importTreeManifest() from the CMD_STARTED response parsing will
potentially make things cleaner in the future for using multiple
hg_import_helper.py processes. Even if we start multiple of these processes,
we only need one instance of the C++ union store, and we don't need to
re-initialize the treemanifest separately each time.
Reviewed By: bolinfest
Differential Revision: D5647203
fbshipit-source-id: 71e14156f1d0ebc0880dd819c5b041b7c1146818
Summary:
Update the hg import tester utility to recursively import all trees under the
specified path by default. Passing in `--tree_import_recurse=no` can be used
to disable this behavior.
Reviewed By: wez
Differential Revision: D5588902
fbshipit-source-id: 1426dd18e0fb86728429f203b5be4742998886ed
Summary:
Add a flag to prevent HgImporter from trying to fetch missing tree data from
the remote mercurial server.
This is mainly useful for debugging purposes to force operations to fail if we
are missing some tree data locally.
Reviewed By: wez
Differential Revision: D5588900
fbshipit-source-id: b9f688383a0c1fceab614341473b357cda31a88b
Summary:
The `entry->flag` field in a ManifestEntry object is a `char*`, but it is not
actually a nul-terminated string. When it is non-null, only the first
character should be used. This updates the eden logging code to log just this
one character, instead of trying to log it like a normal `char*` and reading
garbage after the flag character.
Reviewed By: wez
Differential Revision: D5588899
fbshipit-source-id: 63b727a3aaee838f3a0cf6d765140964dd3ea746
Summary:
Summary
This removes the stale copy of the fb-hgext C/C++ code at eden/hg/datastorage.
The eden include paths have been updated to find the newer includes from
scm/hgext.
This will remove the treemanifest code from the eden-hg repository. This code
is available in the fb-hgext bitbucket repository instead. However, I will
likely send out a diff to `#ifdef` out the treemanifest integration from eden
in open source builds. This will ensure that open source eden builds can
build without using the treemanifest code.
Reviewed By: wez
Differential Revision: D5588675
fbshipit-source-id: ab3f26c2797e95baaedeb9ca73fee56ddb521d6a
Summary:
Update the hg_import_helper.py script to include additional information in the
initial CMD_STARTED response that it sends after it has successfully
initialized.
This response now includes:
- A protocol version number, to help catch errors if edenfs somehow ends up
invoking an incompatible version of the hg_import_helper.py script.
- Whether treemanifest is supported in this repository.
- If treemanifest is supported, the list of treemanifest pack directories.
This eliminates the need for the separate GET_CACHE_PATH command that was
previously sent immediately after the CMD_STARTED response.
This also fixes the code to always let the python code select the pack
directories. Previously the C++ code assumed the local pack directory was
aways located at ".hg/store/packs/manifest" under the repoPath. This may not
be correct if repoPath happens to be a repository created with the share
extension.
Reviewed By: wez
Differential Revision: D5560787
fbshipit-source-id: e796222f42927a85e886227b3a7b2ccf9c1ef1bd
Summary:
This is a first pass at updating eden to fetch tree manifest data from a remote
mercurial server on demand. The code now catches MissingKeyErrors thrown by
the datastore code, and then fetches the requested tree data from the remote
server.
This makes it safe to use tree manifest for import, as we can now fetch missing
tree data as needed, instead of always returning I/O errors whenever we reach a
tree not available locally.
Unfortunately, the performance of downloading tree data is currently not very
good: it takes 90+ seconds plus to download any tree data. The current
mercurial server code appears to always provide full recursive tree data
(unless you ask for the delta between two manifests, which we do not have).
Even when asking for very small subdirectories, the server appears to send the
full tree manifest data for the entire repository, sometimes taking longer than
if we had asked for the full repository data to start with.
Note that tree manifest import is still disabled by default with this diff, and
must be explicitly enabled by running edenfs with `--use_hg_tree_manifest`.
Reviewed By: wez
Differential Revision: D5544817
fbshipit-source-id: 940e0c914f055edc1beee438b0ac50c2f8b08b03
Summary:
Add logic to the hg import tester tool to support recursively importing
specific subdirectories using the tree manifest import mechanism.
Reviewed By: wez
Differential Revision: D5544818
fbshipit-source-id: 1637d32691c30dfab8d59599891feab6fb27bcdb
Summary:
This makes several improvements to the hg import tester script:
- If no --edenDir flag is specified, initialize a new temporary directory to
keep the RocksDB data store.
- Add a `--rocksdb_options_file` flag to allow controlling the options used for
the RocksDB store.
- Add an --import_type flag to allow explicitly selecting if we should test
the flat manifest or tree manifest import code.
- Add a --flat_import_file flag, to allow testing a pre-generated flat manifest
input data file, rather than retrieving the data from mercurial. This allows
benchmarking only the C++ import code, and eliminating the python portion of
the import. The input file can be generated by running
`hg_import_helper.py --manifest <revision>`
Reviewed By: wez
Differential Revision: D5541732
fbshipit-source-id: 340af4fea872412248d41453792b2179f0afa466
Summary:
Update `hg_import_helper.py` to take `--config` options to set mercurial
options, just like the normal `hg` command.
This makes it easier to override mercurial settings during testing. For
instance, this lets us override the `remotefilelog.cachepath` option for
testing tree manifest import without hitting the real system hgcache
directory.
Reviewed By: bolinfest
Differential Revision: D5523181
fbshipit-source-id: 8bb6f5f244fa979872275cbb6ee797d1477a9b5b
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.
Reviewed By: igorsugak
Differential Revision: D5454343
fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
Summary:
This updates hg_import_helper.py to override
`mercurial.txnutil._mayhavepending()` to always return True. This allows it
access to commits that are still part of a pending transaction, as long as
`writepending()` has been called on the transaction. (We could have set the
`HG_PENDING` environment variable to the repository root path instead of
monkey-patching `_mayhavepending()`, but this felt more fragile--it requires
normalizing the repository root the same way that mercurial does, and making
sure we get the correct repository root when using the share extension.)
This allows us to remove our override of `commitctx.markcommitted()` in the hg
extension. Previously we replaced this method to defer calling
`dirstate.setparents()` until the transaction was complete. We no longer have
to do this since eden can now access the pending commit.
This should also help fix bugs in the rebase workflow where we previously
failed trying to checkout a still-pending commit.
Reviewed By: wez
Differential Revision: D5345451
fbshipit-source-id: 9c32ab655254c79f030c10b7c9d02563decb04f7
Summary:
The treemanifest import code doesn't fully work: it can end up importing the
root tree using treemanifest data, but if data for all subtrees is not also
present in the local hgcache directory it doesn't know how to load them. This
results in error messages in the log like:
failed to load inode 812: MissingKeyError: unable to find delta chain
Filesystem clients see I/O errors.
Reviewed By: wez
Differential Revision: D5382256
fbshipit-source-id: 601f6705a6382cfff80ed9c64f221b0854e5c255
Summary:
This adds unit tests for the treemanifest import code, and fixes a couple of
bugs triggered by the tests: directory permissions were being set incorrectly,
and trees were being stored in the LocalStore with the wrong hash, causing
lookup errors later.
The HgImportTest code previously used HgImporter::importManifest(), which ended
up exercising only the flatmanifest code. This updates HgImporter to expose
separate importFlatManifest() and importTreeManifest() APIs, and updates the
tests to check both import mechanisms.
Reviewed By: wez
Differential Revision: D5365959
fbshipit-source-id: 3799ee3b937296e8025b666dd7176dbe7e268a31
Summary:
Refactor the HgImporter code to have separate functions for importing tree data
with treemanifest vs flatmanifest.
This will make it easier to unit test both code paths. In the future we may
also want to provide more control over which import mechanism to use for
particular repositories, and this will also make that easier.
Reviewed By: wez
Differential Revision: D5365957
fbshipit-source-id: 37bf7d9bc62332a1844405056cdee363dad77033
Summary:
Fix HgManifestImporter to always call LocalStore::disableBatchMode()
after calling enableBatchMode(). Previously if an error occurred importing a
tree disableBatchMode() would never be called, resulting in a crash when
enableBatchMode() was called the next time we tried to import a tree.
This API is still not thread-safe, and will break if multiple callers try to
use the enableBatchMode/disableBatchMode APIs simultaneously. In the future we
should move the batch functionality to a separate class, so each caller that
wants to do batching manages their own write batch, and this isn't controlled
globally for LocalStore.
Reviewed By: wez
Differential Revision: D5365958
fbshipit-source-id: 0616f0af5029de6ebdfee768b8fddda5b6d2dfd1
Summary:
This is mostly needed because of the `folly:file` target split, but there are a few other that had changes that weren't reflected.
This also re-enables auto-deps for `multifeed/aggregator/processor/TARGETS` as it works again.
Reviewed By: yfeldblum
Differential Revision: D5362875
fbshipit-source-id: f9d2793249c0bfe644d0504f19839c108648ea3b
Summary:
[Folly] Remove `construct_in_place`.
It is an alias for `in_place`, and unnecessary.
Reviewed By: Orvid
Differential Revision: D5362354
fbshipit-source-id: 39ce07bad87185e506fe7d780789b78060008182
Summary: It doesn't need to exist anymore
Reviewed By: yfeldblum
Differential Revision: D5318746
fbshipit-source-id: c70b184f4b3fc12ede4632d6b3d43de16ed758c7
Summary:
Format all of the TARGETS files under eden/fs with the autodeps tool.
A few rocksdb include statements require comments so that autodeps can
correctly tell which dependency this include comes from. The rocksdb library's
source file structure unfortunately does not match the layout of how its header
files get installed, so autodeps cannot figure this out automatically.
Reviewed By: wez
Differential Revision: D5316000
fbshipit-source-id: f8163adca79ee4a673440232d6467fb83e56aa10
Summary: Add unit tests for HgImporter to better test this import logic.
Reviewed By: bolinfest
Differential Revision: D5309171
fbshipit-source-id: b5619d8271fef1cc31a0f642ec2fcbf9eccced54
Summary:
Update the findImportHelperPath() function to search all parent directories
containing the binary for the hg import helper script, rather than hard-coding
exactly how many directories deep we expect the edenfs binary to be placed in
the build output directory.
This makes it possible to use HgImporter in unit tests and other binaries that
aren't built in the exact same directory as the main edenfs binary.
Reviewed By: bolinfest
Differential Revision: D5309170
fbshipit-source-id: 051b26ea3017d8a9a2b9e2c59e164c28b640d1f5
Summary:
Update HgBackingStore and GitBackingStore to accept the repository path as an
AbsolutePathPiece rather than as a plain StringPiece.
Reviewed By: bolinfest
Differential Revision: D5309172
fbshipit-source-id: aca4818c3add11d07ece796298f6175ca4fb1448
Summary:
Update eden to log via the new folly logging APIs rather than with glog.
This adds a new --logging flag that takes a logging configuration string.
By default we set the log level to INFO for all eden logs, and WARNING for
everything else. (I suspect we may eventually want to run with some
high-priority debug logs enabled for some or all of eden, but this seems like a
reasonable default to start with.)
Reviewed By: wez
Differential Revision: D5290783
fbshipit-source-id: 14183489c48c96613e2aca0f513bfa82fd9798c7
Summary:
Now that the non-future versions of these APIs have been removed, rename
getBlobFuture() to getBlob(), and getTreeFuture() to getTree()
Reviewed By: wez
Differential Revision: D5295690
fbshipit-source-id: 30dcb88854b23160692b9cd83a632f863e07b491
Summary:
Remove the blocking getBlob() API.
There were a few call sites in FileData still using this blocking API. For now
I simply updated them to use getBlobFuture() and make a blocking get() call on
the returned future. These call sites already had TODO comments documenting
the blocking behavior.
I plan to rename getBlobFuture() to getBlob() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295726
fbshipit-source-id: 748fd7a140b9b59da339a330071f732bba38cb35
Summary:
Remove the blocking getTree() API. All call sites are using getTreeFuture()
instead now.
I plan to rename getTreeFuture() to getTree() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295725
fbshipit-source-id: 6b40b4c808da94a9c68decae3ce38c7d13fbe9f5
Summary:
Remove the ObjectStores.h and .cpp files. These files contained helper
functions for looking up Tree and TreeEntry objects based on a deep relative
path.
These functions are no longer used anywhere anymore (the code that performs
deep lookups always does inode lookups instead). Additionally, this file is
the only place still using some of the blocking, non-Future-based APIs for
object lookups. Deleting these files will let us remove these deprecated
blocking APIs.
Reviewed By: wez
Differential Revision: D5295691
fbshipit-source-id: 89229827305490eba4d3a581954a90c5cf63238d
Summary:
This is generated by applying clang-tidy `-checks=modernize-use-override` to all the c++ code in project eden.
It enforces the use of the keywords `virtual`, `verride` and `final` in a way compliant to the style guide.
Reviewed By: igorsugak
Differential Revision: D5108807
fbshipit-source-id: 596f2d73f1137de350114416edb1c37da5423ed5
Summary:
On OSX mercurial libraries are not installed in the standard pythonpath. This
option lets me pass a pythonpath to the hg importer script only, without
polluting the pythonpath for other subprocess invocations (such as python 3
based hooks, which are not happy with python2 pythonpaths).
Reviewed By: bolinfest
Differential Revision: D4992222
fbshipit-source-id: aa8e734970344215cc8783e164fd119c3ee6aed6
Summary:
These files are defining flags, so they should import gflags.
This import being missing causes open source build to fail.
Reviewed By: bolinfest
Differential Revision: D4992246
fbshipit-source-id: 61eb35b43ae6f9ff88bd34a9ebc8db6d80ac9d08
Summary:
When tree manifest data is available, we don't need to parse the full
manifest any more, which is great because we can simply and quickly load the
root manifest and be done with a checkout operation. We can then quickly load
the manifest entry for a given directory path on demand, making us closer to our
ideal of O(what-you-use).
This diff plumbs in the manifest code from mercurial so that we can reuse
the decoder implementation to get the manfiest data and translate it into
something that we can put into our LocalStore.
There's a little wrinkle when it comes to files; mercurial doesn't support
a means for getting the file contents based on just its hash, we have to
provide the filename and revision hash for that. We have existing code
to create proxy entries in the store to map from a proxy hash to a tuple
with the appropriate data needed to resolve the file entry.
We need to extend its use to the trees that we generate because we need
to be able to propagate that path down to a child tree when we generate
the tree entry for a file for the first time.
If a tree manifest is not available for a given revision, we fall back
to the existing implementation that parses the full flat manifest.
Note: the flat manifest import generates entries with different hashes
compared to the tree manifest hash version. While we could make the
serialization match up, there's little risk of this causing problems
for us in practice, so punting on that for now.
Reviewed By: simpkins
Differential Revision: D4873069
fbshipit-source-id: 865001996745700586f571ed67371aed1668a6a7
Summary:
This change makes it so that all of the C++ code related to the edenfs daemon
is now contained in the eden/fs subdirectory.
Reviewed By: bolinfest, wez
Differential Revision: D4889053
fbshipit-source-id: d0bd4774cc0bdb5d1d6b6f47d716ecae52391f37
Summary:
previously, the importer would read the entire manifest
and emit data to the store as it resolved complete directory entries.
The entire manifest data would be buffered and sent out to the store.
In the scenario where one subtree has been modified and a commit has
been made, only the parents of the subdirectory need to be hashed
and stored, but we would compute and try to store everything anyway.
While this diff can't avoid having to compute hashes for everything (we need
tree manifest data for that), by breaking the import into two passes we can
potentially avoid interrogating the LocalStore about every tree in the entire
manifest during an import; we only need to store the trees that are missing and
can simply cut out sub trees that are already present.
This saves us IO at the expense of buffering the manifest tree in memory
for the duration of the first pass; it is an acceptable trade off.
Reviewed By: simpkins
Differential Revision: D4832166
fbshipit-source-id: 0a40cb851c65393b407a8161db05c4b1795fb11a
Summary:
This updates all of the references to gtest and gmock with googletest.
The change is mechanilcal, generated with the following one-liner:
```lang=bash
hg grep -lwE '(gtest|gmock)' 'glob:**/TARGETS' | grep -v '^third-party-buck' | xargs perl -pi -e '
$gt=qr!(["'"'"'])gtest\g1!;
(
s!$gt(\s*,\s*(.any.|None))(\s*,\s*)?\),?!\1googletest\1\2, \1gtest\1\),!g or
s!$gt((\s*,\s*(.any.|None)[^\)]+))\),?!\1googletest\1\2\),!g or
s!\(\s*$gt,?\s*\),?!\(\1googletest\1, None, \1gtest\1\),!g or
s!$gt,?!\(\1googletest\1, None, \1gtest\1\),!g
) unless /(name|type) *=/;
$gm=qr!(["'"'"'])gmock\g1!;
(
s!$gm(\s*,\s*(.any.|None))(\s*,\s*)?\),?!\1googletest\1\2, \1gmock\1\),!g or
s!$gm((\s*,\s*(.any.|None)[^\)]+))\),?!\1googletest\1\2\),!g or
s!\(\s*$gm,?\s*\),?!\(\1googletest\1, None, \1gmock\1\),!g or
s!$gm,?!\(\1googletest\1, None, \1gmock\1\),!g
) unless /(name|type) *=/;
'
```
Reviewed By: meyering
Differential Revision: D4643237
fbshipit-source-id: fda7f41760c7e44254231df87634631c343e6355
Summary:
The hg_import_helper script that eden uses to import data from mercurial keeps
a long-lived repository object open. This caches some data about the
repository, and if new commits are added after it was created, it can fail to
see them.
This updates hg_import_helper.py to catch errors that occur when trying to use
the repository objects. The code will invalidate the repository object and
then retry the operation once, in the hopes that it will now succeed after
invalidation.
Reviewed By: bolinfest
Differential Revision: D4752659
fbshipit-source-id: 1c75c84766d6bbda0710882a338eaa09e0cb0030