Summary:
Format all of the TARGETS files under eden/fs with the autodeps tool.
A few rocksdb include statements require comments so that autodeps can
correctly tell which dependency this include comes from. The rocksdb library's
source file structure unfortunately does not match the layout of how its header
files get installed, so autodeps cannot figure this out automatically.
Reviewed By: wez
Differential Revision: D5316000
fbshipit-source-id: f8163adca79ee4a673440232d6467fb83e56aa10
Summary:
Update EdenServer to catch SIGTERM and SIGINT, and to stop gracefully on these
signals.
Reviewed By: wez
Differential Revision: D5315323
fbshipit-source-id: 16611190c8d522a78cf6b624d97d4f7ecc478f96
Summary: Add unit tests for HgImporter to better test this import logic.
Reviewed By: bolinfest
Differential Revision: D5309171
fbshipit-source-id: b5619d8271fef1cc31a0f642ec2fcbf9eccced54
Summary: This includes adjusting them to conform to Folly's style for single file sources and headers, making targets as small as possible, not providing aggregate targets, and using auto-deps.
Reviewed By: yfeldblum
Differential Revision: D5312078
fbshipit-source-id: 1c7f5aa04da3bad236ffa23000c7bda47b82e3ef
Summary:
Update the findImportHelperPath() function to search all parent directories
containing the binary for the hg import helper script, rather than hard-coding
exactly how many directories deep we expect the edenfs binary to be placed in
the build output directory.
This makes it possible to use HgImporter in unit tests and other binaries that
aren't built in the exact same directory as the main edenfs binary.
Reviewed By: bolinfest
Differential Revision: D5309170
fbshipit-source-id: 051b26ea3017d8a9a2b9e2c59e164c28b640d1f5
Summary:
Update HgBackingStore and GitBackingStore to accept the repository path as an
AbsolutePathPiece rather than as a plain StringPiece.
Reviewed By: bolinfest
Differential Revision: D5309172
fbshipit-source-id: aca4818c3add11d07ece796298f6175ca4fb1448
Summary:
1. Moved read, write, mkdir, rm methods in hg/lib/hg_extension_test_base.py to lib/test_case.py.
2. Added integration test case to test unload free inodes.
Reviewed By: simpkins
Differential Revision: D5277870
fbshipit-source-id: b93b6049a10357cf8c92366e6dca3968f7f30c30
Summary:
Update eden to log via the new folly logging APIs rather than with glog.
This adds a new --logging flag that takes a logging configuration string.
By default we set the log level to INFO for all eden logs, and WARNING for
everything else. (I suspect we may eventually want to run with some
high-priority debug logs enabled for some or all of eden, but this seems like a
reasonable default to start with.)
Reviewed By: wez
Differential Revision: D5290783
fbshipit-source-id: 14183489c48c96613e2aca0f513bfa82fd9798c7
Summary:
Now that the non-future versions of these APIs have been removed, rename
getBlobFuture() to getBlob(), and getTreeFuture() to getTree()
Reviewed By: wez
Differential Revision: D5295690
fbshipit-source-id: 30dcb88854b23160692b9cd83a632f863e07b491
Summary:
Remove the blocking getBlob() API.
There were a few call sites in FileData still using this blocking API. For now
I simply updated them to use getBlobFuture() and make a blocking get() call on
the returned future. These call sites already had TODO comments documenting
the blocking behavior.
I plan to rename getBlobFuture() to getBlob() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295726
fbshipit-source-id: 748fd7a140b9b59da339a330071f732bba38cb35
Summary:
Remove the blocking getTree() API. All call sites are using getTreeFuture()
instead now.
I plan to rename getTreeFuture() to getTree() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295725
fbshipit-source-id: 6b40b4c808da94a9c68decae3ce38c7d13fbe9f5
Summary:
Remove the ObjectStores.h and .cpp files. These files contained helper
functions for looking up Tree and TreeEntry objects based on a deep relative
path.
These functions are no longer used anywhere anymore (the code that performs
deep lookups always does inode lookups instead). Additionally, this file is
the only place still using some of the blocking, non-Future-based APIs for
object lookups. Deleting these files will let us remove these deprecated
blocking APIs.
Reviewed By: wez
Differential Revision: D5295691
fbshipit-source-id: 89229827305490eba4d3a581954a90c5cf63238d
Summary:
1.Added a new thrift method to unload free inodes of a directory and its sub directories.
2.Added a new debug sub command 'eden debug unload <path>' to cli tools to unload free inodes.
Reviewed By: simpkins
Differential Revision: D5261038
fbshipit-source-id: 85b4c5ae18c0ae24c666a44ac9892765e753397f
Summary:
This is a major change to Eden's Hg extension.
Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.
In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.
However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.
Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).
In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)
That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store in as Thrift-serialized data in an ordinary file along with the `_copymap`
data.
In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.
One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, like in the implementation of `eden_dirstate.status()`
in order to make `hg remove` work).
In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!
I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.
For the reasons explained above, this revision deletes a lot of code `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.
Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)
The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.
Reviewed By: simpkins
Differential Revision: D5071778
fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
Summary:
This is generated by applying clang-tidy `-checks=modernize-use-override` to all the c++ code in project eden.
It enforces the use of the keywords `virtual`, `verride` and `final` in a way compliant to the style guide.
Reviewed By: igorsugak
Differential Revision: D5108807
fbshipit-source-id: 596f2d73f1137de350114416edb1c37da5423ed5
Summary: The `nodiscard` attribute was added to standard C++ in C++17. Prefer the standard formulation when it is available; otherwise, use `__attribute__((__warn_unused_result__))` on compilers that support the GNU extensions, and `_Check_return_` on MSVC. The old `FOLLY_WARN_UNUSED_RESULT` is now expands to `FOLLY_NODISCARD`.
Reviewed By: yfeldblum
Differential Revision: D5105137
fbshipit-source-id: 9aa22e81cd9f0b89f9343433aeae3ba365227ccb
Summary:
We're using `window` to constrain resources and maintain an upper
bound on the number of futures that we spawn. Ironically, for a largish number
of already completed futures, this results in a recursive chain of `spawn`
calls in the `window` implementation and results in a stack overflow (the kind
that crashes a program, not the kind that answers programming questions).
This diff removes the use of `window` and replaces it with `collect`.
An earlier iteration of this diff just made all of these calls serially,
blocking the main thread. This at least does things in parallel; both
can torture the system for pathological walks, but at least the parallel
one should complete faster.
Reviewed By: simpkins
Differential Revision: D4932774
fbshipit-source-id: 40d39a85029f38ff69a530070efb879a81950451
Summary:
It is an error to not provide mode argument when open is
called with O_CREAT.
Reviewed By: oebeling
Differential Revision: D5020517
fbshipit-source-id: 87665ceb16ebdff797f8bf3d33bea082191e3121
Summary:
On OSX mercurial libraries are not installed in the standard pythonpath. This
option lets me pass a pythonpath to the hg importer script only, without
polluting the pythonpath for other subprocess invocations (such as python 3
based hooks, which are not happy with python2 pythonpaths).
Reviewed By: bolinfest
Differential Revision: D4992222
fbshipit-source-id: aa8e734970344215cc8783e164fd119c3ee6aed6
Summary:
These files are defining flags, so they should import gflags.
This import being missing causes open source build to fail.
Reviewed By: bolinfest
Differential Revision: D4992246
fbshipit-source-id: 61eb35b43ae6f9ff88bd34a9ebc8db6d80ac9d08
Summary:
Update Dirstate::remove() to directly call TreeInode::unlink() instead of going
through the EdenDispatcher. Internal code should be able to use TreeInode
directly without going through the FUSE Dispatcher. This prevents us from
having to look up the TreeInode object by inode number again.
This also fixes TreeInode::unlink() and TreeInode::rmdir() to perform FUSE
cache invalidation correctly. EdenDispatcher used to do this for unlink()
calls, but it was missing for rmdir() calls.
Reviewed By: wez
Differential Revision: D4968832
fbshipit-source-id: 51fda5de196392c640436ca6ad7d8333e6799db9
Summary:
This replaces the shouldFileBeDeletedByHgRemove() function with a much simpler
and safer check.
It turns out this code was more complicated than it needed to be: it looked up
the inode in question and confirmed that it still existed, even though it's
calling function had already previously looked up the inode object.
I believe this deletes the last place in the code that still assumes that an
InodeBase object must be loaded if the inode has previously been materialized.
This also removes one of the last remaining places that directly holds the
TreeInode contents_ lock outside of TreeInode.cpp.
Reviewed By: bolinfest
Differential Revision: D4968834
fbshipit-source-id: a37bc0d7889eb81cca5b045cbc82732a53ef53d4
Summary:
This updates the Dirstate::addAll() code to use EdenMount::diff() internally,
instead of getStatusForExistingDirectory().
This lets us delete getStatusForExistingDirectory() and several other helper
functions that it used. The logic in getStatusForExistingDirectory() was based
on the incorrect assumption that files can only be modified from their expected
source control state if they are materialized. This could result in incorrect
results in a variety of cases (particularly after renames, or "hg reset"
operations). The EdenMount::diff() code does not have this problem.
That said, this new addAll() code does still have some performance issues--it
currently does a full tree diff for each input path. We should ideally fix
this in the future to only diff the necessary subtree for each path. However,
in the short term this trade-off seems worth being able to delete all of this
older, buggy code. diff() should be cheap enough in most cases that this won't
be a major problem unless a large number of paths are given as input.
Reviewed By: bolinfest
Differential Revision: D4968835
fbshipit-source-id: 1834aa98a26dcaa0e1c06c7ac25c57944fa1b5f7
Summary:
Make sure we use pointers to const GitIgnoreStack objects when possible.
Also change the code to store the root GitIgnoreStack object for a diff()
operation in the DiffContext object, instead of having EdenMount::diff()
allocate it separately on the heap. This will make it easier to consolidate a
bit more logic in the DiffContext class in the future. In particular, I
suspect we will want a version of diff() that only works on one portion of the
tree. Putting more of the functionality in DiffContext will make it slightly
easier to share code between the full-mount diff and the subtree diff
functions. This also happens to save a memory allocation for now.
Reviewed By: wez
Differential Revision: D4968833
fbshipit-source-id: 1dc33b3d44cdf00e93b22d810c3a736d27c13638
Summary:
Update the eden thrift APIs to support multiple thrift parents, and update the
hg extension to use the new thrift APIs.
Reviewed By: bolinfest
Differential Revision: D4949139
fbshipit-source-id: 98a7bd17ccad7be6a429df34ecfaf3fe7ae0b39a
Summary:
This updates the ClientConfig and EdenMount code to support storing two parent
commits.
This changes the on-disk SNAPSHOT file contents add an 8-byte header that
includes a file identifier and a file format version number, followed by up to
two commit hashes. The code currently can read either the old or new format
from the SNAPSHOT file. We should be able to drop the code for reading the old
format fairly soon if we want, though.
This diff only updates the ClientConfig and EdenMount code, and does not yet
update the thrift APIs or the eden mercurial extension yet. I will update the
rest of the code in a subsequent diff.
Reviewed By: bolinfest, wez
Differential Revision: D4943917
fbshipit-source-id: cf456e67b845aa0cf8b45c822985cb932df107b4
Summary: Add some unit tests that check the behavior of setattr() and getattr().
Reviewed By: bolinfest, wez
Differential Revision: D4941215
fbshipit-source-id: 6691f51aea742d0159a16112ca291546a8adc928
Summary:
The error messages for various failure cases in EXPECT_THROW_RE() and
EXPECT_THROW_ERRNO() were missing a space after the statement being executed.
Reviewed By: bolinfest
Differential Revision: D4941175
fbshipit-source-id: e10b4bff7db027acc5cc7ad5c12af248d55c7cd8
Summary:
This fixes "hg commit" so that it correctly updates the in-memory snapshot.
This has been broken ever since I added the in-memory snapshot when
implementing checkout(). The existing scmMarkCommitted() method updated only
the Dirstate object and the on-disk SNAPSHOT file.
This diff fixes checkout() and resetCommit() to clear the Dirstate user
directives correctly, and then replaces calls to scmMarkCommitted() with
resetCommit().
Reviewed By: bolinfest
Differential Revision: D4935943
fbshipit-source-id: 5ffcfd5db99f30c730ede202c5e013afa682bac9
Summary:
Update the gitignore code so that patterns ending in a trailing slash correctly
match only directories. Fortunately we have file type information available
during TreeInode::computeDiff(), so getting the correct file type information
does not add any extra overhead. The older ignore checking code in
Dirstate.cpp can also do a reasonable job of getting file type information.
That code should also be removed at some point in the future, and use the
TreeInode logic for computing ignore status.
Reviewed By: bolinfest
Differential Revision: D4901326
fbshipit-source-id: 1222c8142876c91e1b80ec937ec84c0c28737224
Summary:
The temporary Hash object doesn't live long enough for the copy into
the string, so make an explicit temporary to extend its lifetime.
Reviewed By: simpkins
Differential Revision: D4900457
fbshipit-source-id: 948d919b929285c30ee1f9f0111cba20b60d4dc6
Summary:
When tree manifest data is available, we don't need to parse the full
manifest any more, which is great because we can simply and quickly load the
root manifest and be done with a checkout operation. We can then quickly load
the manifest entry for a given directory path on demand, making us closer to our
ideal of O(what-you-use).
This diff plumbs in the manifest code from mercurial so that we can reuse
the decoder implementation to get the manfiest data and translate it into
something that we can put into our LocalStore.
There's a little wrinkle when it comes to files; mercurial doesn't support
a means for getting the file contents based on just its hash, we have to
provide the filename and revision hash for that. We have existing code
to create proxy entries in the store to map from a proxy hash to a tuple
with the appropriate data needed to resolve the file entry.
We need to extend its use to the trees that we generate because we need
to be able to propagate that path down to a child tree when we generate
the tree entry for a file for the first time.
If a tree manifest is not available for a given revision, we fall back
to the existing implementation that parses the full flat manifest.
Note: the flat manifest import generates entries with different hashes
compared to the tree manifest hash version. While we could make the
serialization match up, there's little risk of this causing problems
for us in practice, so punting on that for now.
Reviewed By: simpkins
Differential Revision: D4873069
fbshipit-source-id: 865001996745700586f571ed67371aed1668a6a7
Summary:
This tweaks the main.cpp and EdenServer.cpp code so that we can avoid having
separate internal and open source builds of the eden/fs/service:server library.
Only one small function is different between the two builds. This updates the
code so that this function is only used from main.cpp, instead of being used
inside EdenServer.cpp.
Reviewed By: wez
Differential Revision: D4889721
fbshipit-source-id: 97abda92c36752516fb8a8e73b4cf50580db8d79
Summary:
This change makes it so that all of the C++ code related to the edenfs daemon
is now contained in the eden/fs subdirectory.
Reviewed By: bolinfest, wez
Differential Revision: D4889053
fbshipit-source-id: d0bd4774cc0bdb5d1d6b6f47d716ecae52391f37
Summary:
Move the code for the command-line tool up one directory, out of eden/fs.
This better separates the code so that eden/fs contains code for the edenfs
daemon, while eden/cli contains code for the command line tool.
Reviewed By: bolinfest
Differential Revision: D4888633
fbshipit-source-id: 5041e292c5353d05122eefe5db3257289e31239a
Summary:
Fix a subtle crash during checkout when handling newly added entries that
already exist in the working directory: CheckoutAction passed the entry name to
checkoutUpdateEntry() as a PathComponentPiece. However, this
PathComponentPiece could refer to the entry name owned by newScmEntry_, and it
also passed newScmEntry_ into checkoutUpdateEntry() as an rvalue reference.
As a result, if the string data was stored invalidated by the move the name
would no longer be valid when checkoutUpdateEntry() tried to use it.
This bug is triggered by doing an "hg update --clean", where a file added in
the destination commit already exists on disk, and has an entry name of 23
characters or less. (The 23 character limit is fbstring's upper bound on
small string optimizations, where it will store the string data inline in the
object, causing it to be invalidated on move.)
This also fixes a crash in a VLOG() statement when the verbose log level for
TreeInode.cpp was set to 4 or greater.
Reviewed By: bolinfest
Differential Revision: D4882544
fbshipit-source-id: 917ede6eeae2224aaa0724b8b30324f3c3a5c924
Summary:
This is a bit too magical -- it's not clear that the thing produces an Options object. If you do know that you can chain further option setters off this thing, it's nice, but otherwise, the first impression is "what just happened?".
So, let's have one good way for doing things.
Reviewed By: yfeldblum
Differential Revision: D4863947
fbshipit-source-id: 3dfe83cfc077d47f604f47dcb21149fbaa2d2243
Summary:
Update the hg extension to implement dirstate.rebuild(). This is necessary for
the `hg reset` command. This also now implements dirstate.setparents() for
cases when there is only one parent.
Reviewed By: wez
Differential Revision: D4823780
fbshipit-source-id: 802de006e03860995095dc3af17acb2eb05f4e8b
Summary:
Update the checkOutRevision() thrift API to accept the commit ID as a
BinaryHash rather than as a hexadecimal ASCII string, to make it more
consistent with our other thrift APIs.
Reviewed By: wez
Differential Revision: D4823779
fbshipit-source-id: 6226f926533ef86d60f8b1d659bc5d912e2855e6
Summary:
The intent is to provide a way to locate the SNAPSHOT file
for tools that want to have a very fast way to figure out the commit
id without making any RPCs or subprocess invocations.
Reviewed By: simpkins
Differential Revision: D4824176
fbshipit-source-id: 5adca225d9984146852dad1e83de0d903848c1e5
Summary:
This simplifies the InodePtrImpl class by removing support for pointer-to-const
Inode objects. We don't currently use pointer-to-const Inode objects, and I
don't really anticipate that we will ever need this.
The primary motivating factor behind this diff is a recent change in clang that
breaks the existing code: clang PR31606 prevents children classes from
inheriting template constructor instantiations from their parent which take a
reference to the parent class type as the only argument.
While this could have been fixed by explicitly defining the necessary
constructors in the child class (or simply by overriding newPtrLocked() and
newPtrFromExisting() to return the subclass type), dropping support for
pointer-to-const allows us to simplify the code and resolve this issue.
Reviewed By: wez
Differential Revision: D4825526
fbshipit-source-id: 999de352788b13818b7f23788bb34686e193cd5d
Summary:
previously, the importer would read the entire manifest
and emit data to the store as it resolved complete directory entries.
The entire manifest data would be buffered and sent out to the store.
In the scenario where one subtree has been modified and a commit has
been made, only the parents of the subdirectory need to be hashed
and stored, but we would compute and try to store everything anyway.
While this diff can't avoid having to compute hashes for everything (we need
tree manifest data for that), by breaking the import into two passes we can
potentially avoid interrogating the LocalStore about every tree in the entire
manifest during an import; we only need to store the trees that are missing and
can simply cut out sub trees that are already present.
This saves us IO at the expense of buffering the manifest tree in memory
for the duration of the first pass; it is an acceptable trade off.
Reviewed By: simpkins
Differential Revision: D4832166
fbshipit-source-id: 0a40cb851c65393b407a8161db05c4b1795fb11a
Summary:
This avoids hard coding knowledge of the FB-specific `local` dir.
This is only suitable for use on already-mounted eden instances,
so the function accepts either the mounted path or the eden
configuration dir.
Reviewed By: bolinfest
Differential Revision: D4829166
fbshipit-source-id: df9daed6024cdec6222be90f5cb624dd9ca0da0b
Summary:
Folly is dropping support for GCC 4.8 now that the branch for last HHVM release that needs to support it has been cut.
This codemod's away all the trivially identifiable uses of `folly::make_unique` in fbcode outside of folly, mcrouter and hphp.
A future diff will kill it in those, and a diff after that will kill it entirely.
Reviewed By: djwatson
Differential Revision: D4498157
fbshipit-source-id: 5a2c05ffd29e5358298acb53dab189a699b022cc