Commit Graph

271 Commits

Author SHA1 Message Date
Adam Simpkins
f8930c5325 fix Dispatcher::symlink() API
Summary:
Fix Dispatcher::symlink() to accept the symlink contents as a StringPiece
rather than a PathComponentPiece.  symlink contents can be any arbitrary
string, and are not required to be a valid, normalized path name.
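
As a rough illustration of why a path-component type is too strict here, the sketch below uses a hypothetical `isValidPathComponent()` check. It assumes a path component must be a single non-empty name with no `/` that is not `.` or `..`; the real PathComponentPiece rules may differ.

```cpp
#include <string_view>

// Hypothetical stand-in for the validation a path-component type enforces
// (an assumption, not Eden's actual code): a single, non-empty name that
// contains no '/' and is neither "." nor "..".
bool isValidPathComponent(std::string_view s) {
  return !s.empty() && s != "." && s != ".." &&
      s.find('/') == std::string_view::npos;
}
```

Ordinary symlink targets such as `../lib/libfoo.so` or `/usr/bin/python` fail this check, so they can only be passed through as plain strings.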

Reviewed By: wez

Differential Revision: D4325380

fbshipit-source-id: 88448bee50ea192c06442dc70042c7d17d49a12f
2016-12-14 15:36:11 -08:00
Adam Simpkins
4cb1fa6379 use EdenMount::getFileInode() in getSHA1ForPath()
Summary:
Update EdenServiceHandler::getSHA1ForPath() to replace its own custom path
lookup code with EdenMount::getFileInode().

This also ended up fixing the error message to correctly include the path name
on EISDIR errors.

Reviewed By: wez

Differential Revision: D4325066

fbshipit-source-id: 9aa3932c71c33e6bc11d2c71cc8f1badb4c0dcb7
2016-12-14 15:36:11 -08:00
Adam Simpkins
007726931b add a new InodeError class
Summary:
InodeError is a subclass of std::system_error that accepts an InodePtr to the
inode that it refers to.  This makes it easier to construct error objects that
retain information about the inode they refer to.

InodeError also avoids computing the inode path until the error message is
actually needed.  This should make it less expensive in cases where errors are
thrown and handled internally without ever using the human-readable error
message.  It is possible that the file may have been renamed or unlinked by the
time the error message is computed.  However, this race condition might still
exist even if we computed the path at the time when the error is constructed.
getLogPath() will construct a usable human-readable string even if the file has
been unlinked.
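
A minimal sketch of the lazy-message idea, not the actual InodeError code. The class name `LazyPathError` is hypothetical, and a callback stands in for the InodePtr so that the example is self-contained.

```cpp
#include <cerrno>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical stand-in for InodeError: the path comes from a callback
// (standing in for the InodePtr) and is only computed on the first what().
class LazyPathError : public std::system_error {
 public:
  LazyPathError(int errnum, std::function<std::string()> pathFn)
      : std::system_error(errnum, std::generic_category()),
        state_(std::make_shared<State>()) {
    state_->pathFn = std::move(pathFn);
  }

  const char* what() const noexcept override {
    try {
      std::lock_guard<std::mutex> guard(state_->lock);
      if (!state_->computed) {
        state_->message = state_->pathFn() + ": " + code().message();
        state_->computed = true;
      }
      return state_->message.c_str();
    } catch (...) {
      // Fall back to the plain errno text if building the message fails.
      return std::system_error::what();
    }
  }

 private:
  // Shared so that copies made while the exception propagates reuse one
  // cached message.
  struct State {
    std::mutex lock;
    std::function<std::string()> pathFn;
    std::string message;
    bool computed = false;
  };
  std::shared_ptr<State> state_;
};
```

Code that catches the error and only inspects `code()` never pays for the path lookup; the callback runs at most once, on the first `what()` call.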

Reviewed By: wez

Differential Revision: D4325043

fbshipit-source-id: c9683a80b022f281ca4583a9b7f73b15277335bb
2016-12-14 15:36:11 -08:00
Adam Simpkins
fc202f81e5 add new Future-based APIs to ObjectStore
Summary:
Update the ObjectStore and BackingStore classes to have APIs that return
folly::Future objects, rather than blocking until the requested data is loaded.

For now most users still call the blocking versions of getBlob() and getTree().
Furthermore, all of the Future-based implementations actually still block
until the data is ready.  I will update the code to use these new APIs in
future diffs, and then deprecate the non-future based versions.

Reviewed By: bolinfest

Differential Revision: D4318055

fbshipit-source-id: a250c23b418e69b597a4c6a95dbe80c56da5c53b
2016-12-13 18:12:21 -08:00
Michael Bolin
5e08c5e1a7 Introduce Dirstate::getStatusForDirectory().
Summary:
This will make it easier to implement `hg add <directory>` such that
`<directory>` is expanded on the server rather than on the client.

Reviewed By: simpkins

Differential Revision: D4318735

fbshipit-source-id: cf0e89bd95eb58304cd23e70beb77bc7151f2c5c
2016-12-13 12:00:22 -08:00
Michael Bolin
3196a05b86 Filter .hg directory on the server when computing getStatus().
Summary:
Previously, `.hg` entries were filtered when calling `hg status` on the client,
but this should be happening on the server. This updates the logic in
`Dirstate.cpp` to properly exclude `.hg` when traversing the overlay for
modified directories, so this will eliminate a bunch of unnecessary computation
and simplify the client.

I'm unsure of how best to implement the ownership relation for the set that
contains `.hg`. Please advise! I know that I could categorically exclude
`".hg"` in `getModifiedDirectoriesRecursive()`, but I haven't used it in enough
scenarios yet to be sure that's the right thing to do. For example, if it were
a Git repo, arguably we should consider `.hg` and not `.git`, so I could also
require the set to be a parameter of `Dirstate`, but I want to make sure I get
the ownership stuff right.

Reviewed By: simpkins

Differential Revision: D4316531

fbshipit-source-id: a0f13ca1c3c620b686435c8aa6485ba4e850f043
2016-12-13 12:00:22 -08:00
Michael Bolin
7382936b9b Added tests for getEntryForPath().
Summary: These should have been added as part of D4270526.

Reviewed By: simpkins

Differential Revision: D4315382

fbshipit-source-id: 5920ff38f9cc63540e4813e8ab40f79ad46f9ec1
2016-12-13 12:00:22 -08:00
Michael Bolin
1619acb2c7 Change getModifiedDirectoriesForMount() to return a vector instead of a unique_ptr.
Summary: Follow-up on a comment that came out of the review for D4249214.

Reviewed By: simpkins

Differential Revision: D4314979

fbshipit-source-id: 76384474092e6fd48394f6faf8b84ba6220c556a
2016-12-13 12:00:22 -08:00
Adam Simpkins
5e3f6fb644 move fs/store/testutil code into fs/testharness
Summary:
Move the FakeObjectStoreTest class into fs/testharness, along with the
TestMount and TestBackingStore classes.  This simply consolidates the test
utility code into a single location.

Reviewed By: bolinfest

Differential Revision: D4317517

fbshipit-source-id: 4e19590c5ffde88b66f2c8d4a964352ec349031c
2016-12-12 18:24:31 -08:00
Adam Simpkins
c81ee4c997 update getSha1ForBlob() to return the Hash by value
Summary:
Hash objects are small enough (20 bytes) that it isn't worth allocating them on
the heap.  This updates LocalStore::getSha1ForBlob() to return a
folly::Optional<Hash>, and ObjectStore::getSha1ForBlob() to return a plain
Hash.
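
The two return styles can be sketched as follows. This is an illustration with standard-library types only: `std::optional` stands in for folly::Optional, `LocalCache` and `getSha1OrThrow()` are hypothetical names, and throwing std::domain_error on a miss is an assumption of the sketch.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// A SHA-1 is 20 bytes, small enough to copy around by value.
using Hash = std::array<std::uint8_t, 20>;
static_assert(sizeof(Hash) == 20, "Hash should be exactly 20 bytes");

// Hypothetical local cache: a lookup may legitimately miss, so it returns
// an optional value instead of a heap-allocated pointer.
class LocalCache {
 public:
  void put(const std::string& blobId, const Hash& sha1) {
    sha1s_[blobId] = sha1;
  }

  std::optional<Hash> getSha1ForBlob(const std::string& blobId) const {
    auto it = sha1s_.find(blobId);
    if (it == sha1s_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

 private:
  std::unordered_map<std::string, Hash> sha1s_;
};

// Hypothetical higher-level lookup: always returns a plain Hash, and
// reports a missing entry with an exception (an assumption of this sketch).
Hash getSha1OrThrow(const LocalCache& cache, const std::string& blobId) {
  if (auto hash = cache.getSha1ForBlob(blobId)) {
    return *hash;
  }
  throw std::domain_error("no SHA-1 recorded for blob " + blobId);
}
```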

Reviewed By: bolinfest

Differential Revision: D4298162

fbshipit-source-id: 9cf54f2997ba8c3b2346db315a2aca41e580b078
2016-12-12 17:50:36 -08:00
Adam Simpkins
483256a64d document ObjectStore APIs
Summary:
Add comments in ObjectStore.h documenting the fact that the get* APIs all throw
std::domain_error when the specified ID does not exist, and never return
nullptr.

Also update the FakeObjectStore class used for testing to follow this behavior.

Reviewed By: bolinfest

Differential Revision: D4298160

fbshipit-source-id: c5509bb3aa2ed76619b06b733ad240aaa5f00862
2016-12-12 17:50:35 -08:00
Adam Simpkins
5857b36df8 update LocalStore to also store the blob size along with its SHA-1
Summary:
In addition to storing the SHA-1 of each file's contents, also store the size.
This will allow us to more quickly look up the file size, without having to
retrieve the file contents.

I haven't yet added an API to ObjectStore to retrieve the full BlobMetadata
object; I will do that in a subsequent diff.  One benefit for now is that this
does avoid double-computing the SHA-1 in ObjectStore::getSha1ForBlob() if we
had to load the blob.

Reviewed By: bolinfest

Differential Revision: D4298157

fbshipit-source-id: 4d83ebfa631c93fcef06ca1cd0ba0e1a70a2476d
2016-12-12 17:50:35 -08:00
Adam Simpkins
89fb0f811b add InodePtr, TreeInodePtr, and FileInodePtr type names
Summary:
Define InodePtr, TreeInodePtr, and FileInodePtr as aliases for std::shared_ptr
of the underlying inode type.  This also updates all of the code to use these
new type names.

This will make it easier to swap out std::shared_ptr with a custom pointer type in
the future.  (I believe we will need a custom type in the future so that we
can have more precise control of the reference counting so we can load and
unload Inode objects on demand.  std::shared_ptr::unique() doesn't quite
provide the flexibility we need, and is also being deprecated in C++17.)
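
The aliases amount to something like the sketch below. The inode classes here are empty placeholders rather than the real Eden types, and `asBase()` is a hypothetical helper that only shows the implicit conversion.

```cpp
#include <memory>

// Empty placeholder classes; the real inode types carry much more state.
struct InodeBase {
  virtual ~InodeBase() = default;
};
struct TreeInode : InodeBase {};
struct FileInode : InodeBase {};

// Call sites name these aliases instead of spelling out std::shared_ptr,
// so the underlying pointer type can later be changed in one place.
using InodePtr = std::shared_ptr<InodeBase>;
using TreeInodePtr = std::shared_ptr<TreeInode>;
using FileInodePtr = std::shared_ptr<FileInode>;

// Hypothetical helper: a FileInodePtr converts implicitly to an InodePtr.
InodePtr asBase(FileInodePtr file) {
  return file;
}
```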

Reviewed By: bolinfest

Differential Revision: D4297791

fbshipit-source-id: 1080945649290e676f62689592159f1166159b20
2016-12-12 17:50:35 -08:00
Adam Simpkins
e7b2e8bd52 update InodeBase to track its path
Summary:
This updates the InodeBase code to track its location in the filesystem.
Since we do not support hard links, each inode has a single path where it
exists.

Tracking this data allows us to implement getPath() as a method of InodeBase.

This code is not really complete yet, but it seems worth getting the current
code in as-is.  The location data is not updated properly on unlinks or
renames, but it looks like the existing InodeNameManager code does not get
updated either.  I am working on some additional refactoring of inode object
management, and it will be easier to come back and fix the unlink and rename
handling after this refactoring is further along.

Reviewed By: bolinfest

Differential Revision: D4297591

fbshipit-source-id: 82ceb326e4f9c376f627b1d8f49bb7db3cfc2b0b
2016-12-12 17:50:35 -08:00
Michael Bolin
309d0da769 Make it possible to make a commit from Eden.
Summary:
In this revision, we override `committablectx.markcommitted()` to make a Thrift
call to Eden to record the new commit. For now, this defers the need for us to
implement `edendirstate.normal()`, though we expect we will have to provide a
proper implementation at some point.

Because `hg update` is not implemented yet, this puts us in a funny state where
we have to restart eden after `hg commit` to ensure all of our `TreeEntry` and
other in-memory data structures are in the correct state.

Reviewed By: simpkins

Differential Revision: D4249214

fbshipit-source-id: 8ec06dfee67070f008dd93a0ee6c810ce75d2faa
2016-12-10 01:07:06 -08:00
Michael Bolin
d60888d210 Support hg remove <directory>.
Summary:
This refines the initial support for `hg remove` by adding support for
directories.

Reviewed By: simpkins

Differential Revision: D4270546

fbshipit-source-id: c97dfea555ad489ddda01ad2587f1856b1953e02
2016-12-09 19:36:17 -08:00
Michael Bolin
1e1b36afe5 Introduce getEntryForPath().
Summary:
`getEntryForPath()` is helpful when we don't know (or care) whether the path
corresponds to a file or a directory.

Reviewed By: simpkins

Differential Revision: D4270526

fbshipit-source-id: 6dd2f76df1749e040dda788a74055d9da2156a4d
2016-12-09 19:36:17 -08:00
Michael Bolin
017b35bca6 Add support for hg remove for an individual file.
Summary:
This adds support for `hg remove <file>` in Eden and sets up some of the scaffolding
to support `hg remove <directory>`.

Note that the Thrift API for `scmRemove()` is slightly different than that of `scmAdd()`
in that it returns a list of error messages to display to the user rather than throwing
an exception. In practice, for batch operations, Mercurial will allow some operations
to succeed while others may fail, so it is possible to have multiple error messages to
return.
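
The difference in API shape can be sketched like this. `removePaths()` and the error text are illustrative only and do not reflect the actual Thrift definition of `scmRemove()`.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical batch removal: every path is attempted, and failures are
// collected as messages instead of aborting the batch with an exception.
std::vector<std::string> removePaths(
    std::set<std::string>& tracked,
    const std::vector<std::string>& paths) {
  std::vector<std::string> errors;
  for (const auto& path : paths) {
    if (tracked.erase(path) == 0) {
      // Record the failure and keep processing the remaining paths.
      errors.push_back("not removing " + path + ": file is untracked");
    }
  }
  return errors;
}
```

With this shape, a call that removes one tracked file and names one untracked file succeeds for the first and returns a single message for the second.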

Unlike the current implementation of `hg add`, this does the directory traversal
on the server rather than on the client. Once we work out how to do this for
`hg remove`, we should figure out how to reuse the logic for `hg add`.

Reviewed By: simpkins

Differential Revision: D4263068

fbshipit-source-id: d084774d562c48c59664f313eba229d4197929fe
2016-12-09 19:36:17 -08:00
Michael Bolin
ab750945fe Fix an edge case in hg remove handling.
Summary:
The check on the type of an exception was inverted, which meant
`hg remove <path>` would throw an exception if the parent directory of `path`
did not exist. This is not correct because the user should be able to expect
to do:

```
mkdir -p /tmp/example
cd /tmp/example
hg init
mkdir mydir
touch mydir/a
hg add mydir/a
rm -rf mydir/a
hg rm mydir/a
```

In this scenario, `mydir` does not exist when `hg rm` is called, but the command
should succeed, making `mydir/a` no longer tracked.

Reviewed By: simpkins

Differential Revision: D4268451

fbshipit-source-id: 517d81252aa8e4b6bd1a32dece14776a9f7dd6f7
2016-12-09 16:11:59 -08:00
Michael Bolin
c58ab82952 Ignore "group" and "other" bits of mode_t when diff'ing tree entries.
Summary:
A `TreeEntry` reports a `mode_t` whose "group" and "other" bits are set in a way
that reflects the "owner" bits (so they are non-zero). By comparison, the `mode`
of a `TreeInode::Entry` will reflect the permissions on disk in the overlay (if
the file is materialized). In general, the overlay bits will probably be the
same as those in the `TreeEntry` since we expect the user will rarely mess
with the "group" and "other" bits, but we're seeing a difference more often
right now because of t14953681.
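
A minimal sketch of such a comparison, assuming POSIX `mode_t` constants. The function name is hypothetical, and whether the real code compares the file-type bits in the same step is not stated here.

```cpp
#include <sys/stat.h>
#include <sys/types.h>

// Compare two modes while ignoring the "group" and "other" permission bits.
// The file-type bits and the "owner" bits still have to match.
bool sameModeIgnoringGroupAndOther(mode_t a, mode_t b) {
  constexpr mode_t kIgnored = S_IRWXG | S_IRWXO;
  return (a & ~kIgnored) == (b & ~kIgnored);
}
```

Under this comparison a regular file with mode 0644 and one with mode 0600 are treated as equal, while 0644 and 0744 are not.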

Reviewed By: simpkins

Differential Revision: D4298214

fbshipit-source-id: 2919be94c6bba61135838ee86bbc68aa4031af7c
2016-12-09 15:42:04 -08:00
Adam Simpkins
da04640287 move InodeBase from eden/fuse to eden/fs/inodes
Summary:
Move the InodeBase class from the lower-level fusell code up to the
eden/fs/inodes layer, now that everything else that uses it is in
eden/fs/inodes.

I plan to start changing the ownership model of inode objects a bit, and this
will allow the InodeBase class to interact with EdenDispatcher and other
classes in eden/fs/inodes.

Reviewed By: bolinfest

Differential Revision: D4283392

fbshipit-source-id: 9e1d6fb81dc223f905847cbe8d165a40ad0aca4d
2016-12-07 20:05:20 -08:00
Christopher Dykes
d478ee151b Rename stdin, etc. in Subprocess to work with MSVC
Summary:
`stdin`, `stdout` and `stderr` are macros that expand to function calls with the MSVC CRT implementation. This is also the case for musl-libc. This means that Subprocess simply cannot be compiled on those platforms without changing the API.
To solve that, we change the API and deprecate the old API.

For more fun, `stdin`, `stdout` and `stderr` are also macros in glibc, they just expand to other identifiers rather than a function call.

Reviewed By: yfeldblum

Differential Revision: D4229544

fbshipit-source-id: 97f1a3b228b83cfdcaffee56d729063ea235e608
2016-12-07 11:43:09 -08:00
Adam Simpkins
9fd0a3a286 update remaining references to DirInode in the TestMount code
Summary:
Update the TestMount APIs to use the term "TreeInode" instead of "DirInode" now
that the DirInode class is no more.

Reviewed By: bolinfest

Differential Revision: D4257171

fbshipit-source-id: 2407dfc25405f7161987260c0299f1723831e264
2016-12-01 17:52:31 -08:00
Adam Simpkins
f9a99e0ba1 remove fusell::DirInode and fusell::FileInode
Summary:
Now that the EdenDispatcher class has been moved into eden/fs, we no longer
need the distinction between TreeInode and fusell::DirInode, and FileInode and
fusell::FileInode.

This diff deletes the fusell versions of these classes, and updates all of the
code to always directly use TreeInode and FileInode.  This allows us to get rid
of the remaining dynamic_casts between these pairs of classes.

Reviewed By: bolinfest

Differential Revision: D4257165

fbshipit-source-id: e2b6f328b9605ca0e2882f5cf7a3983fb4470cdf
2016-12-01 17:52:31 -08:00
Adam Simpkins
2a08798f88 move InodeDispatcher from eden/fuse to eden/fs
Summary:
Move the InodeDispatcher class out of the lower-level fusell namespace in
eden/fuse and into the higher-level eden code in eden/fs/inodes.  I also
renamed it from InodeDispatcher to EdenDispatcher, in anticipation of it
getting more eden-specific functionality in the future.

The fusell::MountPoint class is now independent of the Dispatcher type, and can
work with any Dispatcher subclass.  Previously the MountPoint class was
responsible for owning the InodeDispatcher object.  Now its caller (EdenMount
in our case) is responsible for supplying a Dispatcher object that is owned
externally.

Several parts of EdenDispatcher had to be updated as a result of the namespace
move, but I tried to keep this change somewhat minimal.  I did update it from
using fusell::DirInode and fusell::FileInode to eden's TreeInode and FileInode
classes directly.  However, there still remains more clean-up work to do.  I
will split remaining changes out into upcoming diffs.

Reviewed By: bolinfest

Differential Revision: D4257163

fbshipit-source-id: dc9c2526640798f9f924ae2531218ba2c45d1d0a
2016-12-01 17:52:31 -08:00
Adam Simpkins
bf252cb3e9 eliminate MountPoint::setRootInode()
Summary:
Drop the MountPoint::setRootInode() method, and have EdenMount perform the
operation directly on the InodeDispatcher.

Reviewed By: bolinfest

Differential Revision: D4257158

fbshipit-source-id: af4a696de2d36979c658972104361f225f482338
2016-12-01 17:52:31 -08:00
Adam Simpkins
fc29bccd78 move InodeNameManager access to EdenMount
Summary:
Update call sites in eden/fs to access the InodeNameManager through the
EdenMount object rather than the MountPoint.

It turns out that there was only one call site in TreeInode, and all other
callers in eden/fs get it indirectly via TreeInode::getNameMgr().

Reviewed By: bolinfest

Differential Revision: D4257156

fbshipit-source-id: 9f0212134b20c8dd8943827c17aa16ee7274bc36
2016-12-01 17:52:31 -08:00
Adam Simpkins
0af5d187c2 move MountPoint::getInode* methods into EdenMount
Summary:
Move getInodeBaseForPath(), getDirInodeForPath(), and getFileInodeForPath()
entirely into the EdenMount class, and make sure all call sites are using the
EdenMount methods rather than the MountPoint methods.

Reviewed By: bolinfest

Differential Revision: D4257153

fbshipit-source-id: d528cfad174757c3c9f23e62a0616f8bf1976da7
2016-12-01 17:52:31 -08:00
Adam Simpkins
f0082e9178 call EdenMount::getDispatcher() and EdenMount::getRootInode()
Summary:
Update all code in eden/fs to call EdenMount::getDispatcher() instead of
getting the underlying MountPoint from the EdenMount and then calling
getDispatcher() on it.  This will allow me to move the InodeDispatcher from
MountPoint to EdenMount in a subsequent diff.  This also simplifies many of the
callers of this method.

Additionally, add an EdenMount::getRootInode() method, and update call sites to
use this rather than having to look up the InodeDispatcher and call
getRootInode() or getDirInode(FUSE_ROOT_ID) on it.

Reviewed By: bolinfest

Differential Revision: D4257152

fbshipit-source-id: 33e6f6b8853db2a88f4f2c221122eea50e796390
2016-12-01 17:52:31 -08:00
Adam Simpkins
1d00843bb3 short-term code for checking ignore status
Summary:
This updates the Dirstate to also check if untracked files are ignored or not.

This is somewhat inefficient, as we have to perform a separate check for each
untracked file we find.  Ideally we should perform all of the dirstate
computation in a single tree walk, and check ignore status as we go.  This
would allow us to skip ignored directories entirely, rather than potentially
having to check each file inside them.  I intend to work on cleaning this up in
the future, but it will require refactoring some of the inode code first.

Reviewed By: bolinfest

Differential Revision: D4225308

fbshipit-source-id: 49e444c85cbb6ede11cc6e19052fdd16cf8aab9f
2016-12-01 17:52:30 -08:00
Adam Simpkins
74fa63d0d1 store a pointer to the EdenMount in the Dirstate
Summary:
Update the Dirstate to store a pointer to the EdenMount object that owns it,
rather than storing pointers to the lower-level MountPoint and ObjectStore
objects.

This change is necessary in order for me to move more functionality from
MountPoint to EdenMount.  (In particular, I plan to move the InodeDispatcher to
the EdenMount.)

As part of this change I also started moving some APIs from MountPoint to
EdenMount.  For now the EdenMount versions are just thin wrappers on top of the
MountPoint APIs.  I will move the functionality directly into EdenMount in a
future diff.

Reviewed By: bolinfest

Differential Revision: D4255675

fbshipit-source-id: 93749c6516c3cea4b4ae93de4ca49ddf05f4d260
2016-12-01 17:52:30 -08:00
Adam Simpkins
87c2d46ce3 write dirstate file atomically
Summary:
Write the dirstate data using the new folly::writeFileAtomic() function.
This ensures that the dirstate file will always contain full, valid contents,
even if we crash or run out of disk space partway through writing out the data.

This diff also includes a couple other minor tweaks:
- Update Dirstate to store the DirstatePersistence object directly inline,
  rather than allocating it separately on the heap.
- Update the DirstatePersistenceTest code to prefix temporary directories with
  "eden_test".  This just makes it easier to tell if the tests fail to clean up
  after themselves for any reason.
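
A common way to get this kind of guarantee is to write to a temporary file and then rename it over the destination. The sketch below shows that pattern with the standard library only; it is not the folly::writeFileAtomic() implementation, and the function name and `.tmp` suffix are made up for the example.

```cpp
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Write `data` to a temporary file next to `path`, then rename it into
// place. On POSIX systems rename() replaces the destination atomically, so
// readers see either the old contents or the complete new contents.
// This sketch does not fsync, so it does not cover power loss.
void writeFileAtomicSketch(const std::string& path, const std::string& data) {
  const std::string tmp = path + ".tmp";
  {
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    if (!out) {
      // Open or write failed (for example, the disk is full): discard the
      // partial temporary file and leave the destination untouched.
      std::remove(tmp.c_str());
      throw std::runtime_error("failed to write " + tmp);
    }
  }
  if (std::rename(tmp.c_str(), path.c_str()) != 0) {
    std::remove(tmp.c_str());
    throw std::runtime_error("failed to rename " + tmp + " to " + path);
  }
}
```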

Reviewed By: bolinfest

Differential Revision: D4254027

fbshipit-source-id: 6b6601b65aeacdee998a6c4260e972d5fb2426ac
2016-12-01 17:52:30 -08:00
Adam Simpkins
aaa3332644 simplify EdenMount and Dirstate construction
Summary:
This cleans up construction of the EdenMount and Dirstate objects:

- The EdenMount constructor is now responsible for creating the Overlay and
  Dirstate objects.
- The Dirstate constructor is now responsible for loading the
  DirstatePersistence file.
- The EdenMount now takes ownership of the ClientConfig object, and stores it
  for later use.
- The ClientConfig object now has a method to get the path to the
  DirstatePersistence file.
- I added a ClientConfig::createTestConfig() method, so that the TestMount code
  can now use the same EdenMount constructor as the normal code.

This simplifies the logic in EdenServiceHandler and TestMount, and makes some
of the initialization dependencies a little bit simpler.

This change is necessary in order for me to move some logic from
fusell::MountPoint into EdenMount.  The Dirstate object will need a pointer
back to its EdenMount object, and this diff enables that.

Reviewed By: bolinfest

Differential Revision: D4249393

fbshipit-source-id: 439786accbf48c8696dbc6ca4fe77a4c6bdeab65
2016-12-01 17:52:30 -08:00
Michael Bolin
1cd336846e Validate the value of a UserStatusDirective read from disk.
Summary: Follow-up work from D4181868.

Reviewed By: simpkins

Differential Revision: D4256227

fbshipit-source-id: 3e40a0cae4ab132046d96029d2881c2127ed4c83
2016-11-30 19:04:06 -08:00
Adam Simpkins
0e613bd96b rename TreeEntryFileInode to FileInode
Summary:
Rename TreeEntryFileInode to FileInode, and TreeEntryFileHandle to FileHandle.
These class names were long and awkward.

It's slightly unfortunate that we now have classes named both
eden::fuse::FileInode and eden::fuse::fusell::FileInode, but I don't believe
this should cause any major problems.  If we want to eliminate these name
collisions in the future I would advocate for renaming the fusell versions to
something like "FileInodeIface".

Reviewed By: bolinfest

Differential Revision: D4217909

fbshipit-source-id: 899672a318d7ae39595f2c18e171f8fd6cebedc6
2016-11-30 15:49:13 -08:00
Islam AbdelRahman
dde067e6a6 Update TARGETS files to use buckified RocksDB (Use RocksDB version 4.13.3)
Summary:
Reland D4181299
but after upgrading to RocksDB 4.13.3 to fix UBSAN issues

Reviewed By: siying

Differential Revision: D4241833

fbshipit-source-id: b342ad892354f60d8e867bda08055c5583e0abeb
2016-11-29 14:37:06 -08:00
Michael Bolin
7d79026f1a Use const auto& base to avoid doing a copy.
Reviewed By: simpkins

Differential Revision: D4244182

fbshipit-source-id: ffdb55f59af45b0675336d4b9450c8b9fb90af5a
2016-11-29 11:19:08 -08:00
Michael Bolin
8c19620e62 Support directories in Dirstate::computeDelta().
Summary:
Previous to this commit, the `Dirstate` logic only worked correctly when the
changes occurred in the root directory. Obviously that is very limiting, so this
adds support for changes in arbitrary directories at arbitrary depths.

This also introduces support for things like a file being replaced by a
directory of the same name or vice versa. The tests have been updated to verify
these cases.

One interesting design change that fell out of this was the addition of the
`removedDirectories` field to the `DirectoryDelta` struct. As you can see,
all entries in a removed directory need to be processed by the new
`addDeletedEntries()` function. These require special handling because deleted
directories do not show up in the traversal of modified directories.

In contrast, new directories do show up in the traversal, so they require a
different type of special handling. Specifically, this call will return `NULL`:

```
auto tree = getTreeForDirectory(directory, rootTree.get(), objectStore);
```

When this happens, we must pass an empty vector of tree entries to
`computeDelta()` rather than `&tree->getTreeEntries()`. Admittedly, the special
case of new directories is much simpler than the special case of deleted ones.

Reviewed By: simpkins

Differential Revision: D4219478

fbshipit-source-id: 4c805ba3d7688c4d12ab2ced003a7f5c19ca07eb
2016-11-29 06:51:14 -08:00
Michael Bolin
98c56736cd Add some error messages for some DCHECKs.
Reviewed By: simpkins

Differential Revision: D4219453

fbshipit-source-id: 391dd5ec57e01b2f09154eb991eb3e8e2e969f62
2016-11-26 12:01:41 -08:00
Michael Bolin
9156794f06 Pass ClientConfig as a raw pointer rather than transferring ownership.
Summary:
This is a better fix for the quick fix introduced by D4198939.
It turns out that the `EdenMount` does not need to take ownership
of the `ClientConfig`, so removing the `std::move()` makes this code
much simpler because instead of declaring a bunch of variables
early in `mountImpl()` so that we can "hold on" to them before `EdenMount`
takes ownership of the `ClientConfig`, we can declare them closer to where they
are actually used.

Note that we may want `EdenMount` to actually take ownership of the
`ClientConfig` in the future, but we'll cross that bridge when we come to it.

Reviewed By: simpkins

Differential Revision: D4199000

fbshipit-source-id: 67411a9a5ef630a9d481aebc94631c79da4ab2c4
2016-11-26 12:01:41 -08:00
Michael Bolin
b078392a9c Add Thrift endpoints for Hg dirstate.
Summary:
This also introduces the change where the `EdenMount` creates
and takes ownership of the `Dirstate`.

To clean some of this up, I had to expose a `getEdenDir()` method on `EdenServer`
that returns an `AbsolutePathPiece`. This was previously stored internally as a
`std::string`, so I had to clean up a bunch of path construction that was using `edenDir_`.

Reviewed By: simpkins

Differential Revision: D4123763

fbshipit-source-id: 270b182521c1a84bb054832f4b5f92af849d67e4
2016-11-26 12:01:41 -08:00
Michael Bolin
0f834ea809 Flip Dirstate -> EdenMount dependency.
Summary:
Previously, `Dirstate` took a `std::shared_ptr<EdenMount>`, but now it takes
pointers to a `MountPoint` and an `ObjectStore` because it does not need the
entire `EdenMount`. Ultimately, this will enable us to have `EdenMount` create
the `Dirstate` itself, but that will be done in a follow-up commit.

Fortunately, it was pretty easy to remove the references to `edenMount_` in
`Dirstate.cpp` and rewrite them in terms of `mountPoint_` or `objectStore_`.
The one thing that I also decided to move was `getModifiedDirectoriesForMount()`
because I already needed to create an `EdenMounts` file (admittedly not a
great name) to collect some utility functions that use members of an `EdenMount`
while not having access to the `EdenMount` itself.

As part of this change, all of the code in `eden/fs/model/hg` has been moved to
`eden/fs/inodes` so that it is alongside `EdenMount`. We are going to change
the `Dirstate` from an Hg-specific concept to a more general concept.

`LocalDirstatePersistence` is no longer one of two implementations of
`DirstatePersistence`. (The other was `FakeDirstatePersistence`.) Now there is
just one concrete implementation called `DirstatePersistence` that takes its
implementation from `LocalDirstatePersistence`. Because there is no longer a
`FakeDirstatePersistence`, `TestMount` must create a `DirstatePersistence` that
uses a `TemporaryFile`.

Because `TestMount` now takes responsibility for creating the `Dirstate`, it
must also give callers the ability to specify the user directives. To that end,
`TestMountBuilder` got an `addUserDirectives()` method while `TestMount` got a
`getDirstate()` method. Surprisingly, `TestMountTest` did not need to be updated
as part of this revision, but `DirstateTest` needed quite a few updates
(which were generally mechanical).

Reviewed By: simpkins

Differential Revision: D4230154

fbshipit-source-id: 9b8cb52b45ef5d75bc8f5e62a58fcd1cddc32bfa
2016-11-26 12:01:41 -08:00
Michael Bolin
0a174e7128 Implement LocalDirstatePersistence.
Summary: This is an implementation of DirstatePersistence that persists data to a local file.

Reviewed By: simpkins

Differential Revision: D4181868

fbshipit-source-id: 7177b2ef67cd3aec56e5ad10f41169cc5ec69d81
2016-11-26 12:01:41 -08:00
Adam Simpkins
e96c3e1ca7 fully implement gitignore glob pattern matching
Summary:
Update the gitignore handling code to perform pattern matching the same way git
does.  Previously the code just called the standard fnmatch() function, which
does not handle "**" in patterns the same way git does.

This includes our own new implementation of glob pattern matching.  I did
evaluate several other options before writing our own implementation here:

- The wildmatch() code used by git (and watchman, and rsync) has a few
  downsides: it is not distributed by itself as a library anywhere else.
  Therefore we would probably have to include a copy of this code in our
  repository.  Making another copy is unfortunate, and somewhat undesirable
  from a legal and licensing perspective.  This code also only works with
  nul-terminated strings, and our code deals primarily with non-terminated
  StringPiece objects.

- I did look at translating glob patterns into regular expressions and using
  re2 to perform matching.  Unfortunately re2 turns out to be substantially
  slower than wildmatch() for typical gitignore patterns.

This new implementation performs some preprocessing on the glob pattern, and
generates a pattern opcode buffer.  Eden can perform this glob preprocessing
when it first loads a .gitignore file, and can then save and re-use this result
each time it needs to match a filename.  Doing this preprocessing allows
matching to be done 50% to 100% faster than wildmatch() for typical glob
patterns.
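
The compile-once, match-many idea can be sketched as below. This is a much-simplified illustration and not the implementation in this diff: the opcode set and the names `compileGlob()` and `matchCompiled()` are hypothetical, and only literals, `?`, `*`, and the `**/` form are handled (no character classes, escapes, or trailing `/**`).

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

enum class Op { Literal, AnyChar, Star, AnyDirs };

struct Instr {
  Op op;
  char ch;  // only meaningful for Op::Literal
};

// Preprocess the pattern once into a small opcode buffer.
std::vector<Instr> compileGlob(std::string_view pattern) {
  std::vector<Instr> prog;
  for (std::size_t i = 0; i < pattern.size();) {
    const bool atComponentStart = (i == 0 || pattern[i - 1] == '/');
    if (atComponentStart && pattern.substr(i, 3) == "**/") {
      prog.push_back({Op::AnyDirs, '\0'});  // zero or more directories
      i += 3;
    } else if (pattern[i] == '*') {
      while (i < pattern.size() && pattern[i] == '*') {
        ++i;  // any other run of '*' is treated as a single '*'
      }
      prog.push_back({Op::Star, '\0'});
    } else if (pattern[i] == '?') {
      prog.push_back({Op::AnyChar, '\0'});
      ++i;
    } else {
      prog.push_back({Op::Literal, pattern[i]});
      ++i;
    }
  }
  return prog;
}

// Backtracking matcher over the opcode buffer.
bool matchFrom(const std::vector<Instr>& prog, std::size_t pc,
               std::string_view text, std::size_t pos) {
  while (pc < prog.size()) {
    const Instr& in = prog[pc];
    switch (in.op) {
      case Op::Literal:
        if (pos >= text.size() || text[pos] != in.ch) return false;
        ++pos;
        ++pc;
        break;
      case Op::AnyChar:
        if (pos >= text.size() || text[pos] == '/') return false;
        ++pos;
        ++pc;
        break;
      case Op::Star:
        // Try every possible length of a run that contains no '/'.
        for (std::size_t end = pos;; ++end) {
          if (matchFrom(prog, pc + 1, text, end)) return true;
          if (end >= text.size() || text[end] == '/') return false;
        }
      case Op::AnyDirs:
        // Match zero directories, or skip past any number of '/'.
        if (matchFrom(prog, pc + 1, text, pos)) return true;
        for (std::size_t i = pos; i < text.size(); ++i) {
          if (text[i] == '/' && matchFrom(prog, pc + 1, text, i + 1)) {
            return true;
          }
        }
        return false;
    }
  }
  return pos == text.size();
}

bool matchCompiled(const std::vector<Instr>& prog, std::string_view text) {
  return matchFrom(prog, 0, text, 0);
}
```

A caller would run `compileGlob()` once per pattern when the ignore file is loaded, then call `matchCompiled()` for each candidate filename.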

Reviewed By: bolinfest

Differential Revision: D4194573

fbshipit-source-id: 46bc6a61b6d8066f4bbdb5d3e74265a3e72e42cc
2016-11-21 15:26:07 -08:00
Adam Simpkins
b7ff172fc6 initial framework for gitignore file handling
Summary:
This adds some initial code for handling gitignore files.

I did check to see if there were APIs from libgit2 that we could leverage for
this, but it does not look like we can easily use their functionality.  The
libgit2 ignore code seems to be tightly coupled with their repository data
structures, and it requires that you actually have a git repository.

This code isn't quite 100% compatible with git's semantics yet.  In particular:

- For now we are just using fnmatch() to do the matching.  This is currently
  inefficient as we have to do string allocations on each match attempt.  This
  also doesn't quite match git's behavior, particularly with regard to "**"
  inside patterns.

- The code currently does not have a mechanism for indicating if a path refers
  to a directory or not, so trailing slashes in the pattern are not honored
  correctly.

We will probably need to implement our own fnmatch-like function in the future
to solve these issues.
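
The `**` mismatch is easy to reproduce with the POSIX fnmatch() function. In the sketch below the wrapper name `fnmatchPath()` is hypothetical and the choice of FNM_PATHNAME is an assumption; the flags used by the code in this diff are not shown here.

```cpp
#include <fnmatch.h>

// Thin wrapper around POSIX fnmatch(). It needs nul-terminated strings, and
// with FNM_PATHNAME a '*' never matches a '/'.
bool fnmatchPath(const char* pattern, const char* path) {
  return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

With this wrapper, `**/foo` behaves like `*/foo`: it matches `a/foo`, but not `a/b/foo` or a top-level `foo`. Git's ignore rules treat a leading `**/` as "match in all directories", so git would match all three.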

Reviewed By: bolinfest

Differential Revision: D4156480

fbshipit-source-id: 8ceaefd3805358ae2edc29bfc316e5c8f2fb7d31
2016-11-21 15:26:07 -08:00
Felipe Silva
341bd51603 Revert D4181299: Update TARGETS files to use buckified RocksDB
Summary: This reverts commit 5b68a82fb1658c4af6edc898bc9bc4b5113ee785

Differential Revision: D4181299

fbshipit-source-id: df31a97b12da85c2fca46a1049c37e23e41cfe99
2016-11-20 19:55:07 -08:00
Michael Bolin
27c8864401 Allow initial state for Dirstate to be passed to the constructor.
Summary:
For now, initial state is represented by a
`std::unordered_map<RelativePath, HgUserStatusDirective>`.

Reviewed By: simpkins

Differential Revision: D4123461

fbshipit-source-id: 83a99e1f504dd1efca1bc1ed33cbc3f116787a80
2016-11-18 19:26:04 -08:00
Michael Bolin
12eac0f5db Implement Dirstate::remove().
Summary:
This adds the logic to power `hg rm`. There are comprehensive tests that attempt to cover
all of the edge cases.

This evolved to become a complex change because I realized that I needed to change
my internal representation of the dirstate to implement it properly. Specifically, we now maintain
a map (`userDirectives`) of files that have been explicitly scheduled for change via `hg add` or `hg rm`.

To compute the result of `hg status`, we find the changes between the manifest/root tree
and the overlay and also consult `userDirectives`. `Dirstate::getStatus()` was updated
considerably as part of this commit due to the introduction of `userDirectives`.

As such, `Dirstate::remove()` must do several things:
* Defend the integrity of the dirstate by throwing appropriate exceptions for invalid inputs.
* Delete the specified file, if appropriate.
* Update `userDirectives`, if appropriate.

Although `Dirstate::add()` was not the focus of this commit, it also had to be updated to
match the pattern introduced by `Dirstate::remove()`.

Some important features that are still not supported are:
* Handling ignored files correctly.
* Storing copy/move information.
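
The three responsibilities of `Dirstate::remove()` listed above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real class: `DirstateSketch`, `Directive`, the one-letter status codes, and the use of plain string sets for the manifest and the overlay are all hypothetical.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for HgUserStatusDirective; the real enumerators are
// not given in the text above.
enum class Directive { Add, Remove };

class DirstateSketch {
 public:
  DirstateSketch(std::set<std::string> manifest, std::set<std::string> overlay)
      : manifest_(std::move(manifest)), overlay_(std::move(overlay)) {}

  void add(const std::string& path) {
    if (overlay_.count(path) == 0) {
      throw std::invalid_argument(path + ": no such file");
    }
    if (manifest_.count(path) == 0) {
      userDirectives_[path] = Directive::Add;
    }
  }

  void remove(const std::string& path) {
    // 1. Reject invalid inputs before mutating anything.
    auto it = userDirectives_.find(path);
    if (it != userDirectives_.end() && it->second == Directive::Add) {
      throw std::invalid_argument(path + ": file is scheduled for add");
    }
    if (manifest_.count(path) == 0) {
      throw std::invalid_argument(path + ": file is untracked");
    }
    // 2. Delete the file from the working copy if it is still there.
    overlay_.erase(path);
    // 3. Record the user's intent.
    userDirectives_[path] = Directive::Remove;
  }

  // Status is derived from the manifest, the overlay, and userDirectives.
  char status(const std::string& path) const {
    auto it = userDirectives_.find(path);
    if (it != userDirectives_.end()) {
      return it->second == Directive::Add ? 'A' : 'R';
    }
    bool onDisk = overlay_.count(path) > 0;
    if (manifest_.count(path) > 0) {
      return onDisk ? 'C' : '!';
    }
    return onDisk ? '?' : ' ';
  }

 private:
  std::set<std::string> manifest_;  // files in the committed tree
  std::set<std::string> overlay_;   // files present in the working copy
  std::map<std::string, Directive> userDirectives_;
};
```

The sketch ignores file contents entirely, so it cannot report modified files; it only shows how an explicit directive takes precedence over the manifest/overlay comparison.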

Reviewed By: simpkins

Differential Revision: D4104503

fbshipit-source-id: d5d45a279e16ded584c6cd4d528ba92d2c8e2993
2016-11-18 19:26:04 -08:00
Islam AbdelRahman
9290e0e6c9 Update TARGETS files to use buckified RocksDB
Summary: Update all fbcode projects TARGETS to move to fbcode/rocksdb

Reviewed By: siying

Differential Revision: D4181299

fbshipit-source-id: 5b68a82fb1658c4af6edc898bc9bc4b5113ee785
2016-11-18 16:40:05 -08:00
Wez Furlong
96e19f1fe5 fix test_bad_mount_path
Summary:
D4014598 changed this line but didn't change the test expectations.
Since it seems desirable for the mount point name to be used, I've reverted
back to the prior state for this line.

Reviewed By: bolinfest

Differential Revision: D4202265

fbshipit-source-id: bddc01436e0a5921a3b0b2c01c0fd2c32f5f1960
2016-11-17 18:44:58 -08:00
Michael Bolin
61ff7492db Fix a bug where generate-hooks-dir was creating the wrong structure.
Summary:
Originally, D3858635 was going to introduce a scheme for hooks where the
repo type was included in the path:

    /etc/eden/hooks/hg/post-clone <args...>

But over the course of the review, we decided to make the repo type a
parameter:

    /etc/eden/hooks/post-clone hg <other-args...>

Unfortunately, `generate-hooks-dir` was not updated as part of that
change and it is not covered by unit tests. This error was particularly hard
to discover because of how `ENOENT` is handled, so I added a log statement for
that.

Reviewed By: simpkins

Differential Revision: D4200277

fbshipit-source-id: ffffd871cd78dcaeb717be8f1e01893ce9643a47
2016-11-17 14:39:06 -08:00
Michael Bolin
9f42f9a5f0 Fix a bug where an AbsolutePathPiece was getting collected.
Summary:
This is a quick and dirty fix for an issue that was causing
a confusing bug: the memory for the `AbsolutePathPiece`
was getting reclaimed, so when it was later read as the value
for a path, it failed because it was binary garbage.

This is mainly caused by the `std::move(config)` that passes
the `ClientConfig` to the `EdenMount` constructor. I will do
some more general cleanup for that in a follow-up revision,
but I wanted to have this change in its own commit that makes
it clear where the failure/fix were coming from.
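
The general shape of this hazard, and one way to avoid it, can be sketched as below. `ConfigSketch` and `MountSketch` are hypothetical, heavily simplified stand-ins for `ClientConfig` and `EdenMount`; the actual fix in this commit may look different.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for ClientConfig and EdenMount.
struct ConfigSketch {
  std::string mountPath;
};

class MountSketch {
 public:
  explicit MountSketch(ConfigSketch config) : config_(std::move(config)) {}

  // The mount owns its config, so this reference stays valid for the
  // lifetime of the mount.
  const std::string& path() const { return config_.mountPath; }

 private:
  ConfigSketch config_;
};

// Risky pattern (deliberately not written out here): a non-owning view taken
// from `config` before std::move(config) may end up referring to a
// moved-from or already-destroyed object once the mount takes ownership.
//
// Safer pattern: move first, then ask the new owner for the path.
std::string mountAndReportPath(ConfigSketch config) {
  MountSketch mount(std::move(config));
  return mount.path();  // copied out while `mount` is still alive
}
```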

Reviewed By: simpkins

Differential Revision: D4198939

fbshipit-source-id: 19e0423a1bee924fa6cc2edc8bae534ef472c988
2016-11-17 13:22:06 -08:00
Andrii Korotkov
8f94770d20 Rename thrift methods from NWorker/NPoolThreads to NumIOWorker/NumCPUWorkerThreads for all remaining directories
Summary:
Use new, less confusing names for mentioned thrift methods.

Codemod with 'Yes to All'. Reverted changes in thrift/

Reviewed By: yfeldblum

Differential Revision: D4076812

fbshipit-source-id: 4962ec0aead1f6a45efc1ac7fc2778f39c72e1d0
2016-11-04 15:31:04 -07:00
Michael Bolin
b43fa6ada7 Update license header for open source.
Reviewed By: simpkins

Differential Revision: D4119726

fbshipit-source-id: 0d2d17ca6caed233a26d77f0d197c1462a34e53f
2016-11-02 18:08:27 -07:00
Michael Bolin
ff20e0f5c0 Introducing Hg Dirstate abstraction in C++.
Summary:
This is the start of the C++ dirstate implementation. It's possible that this
commit does too many things at once:
* Introduces `Dirstate` type.
* Includes logic for serializing/deserializing the dirstate's data so that it persists across Eden restarts.
* Includes logic for basic `hg add` calls.
* Includes unit tests where we model Eden usage via the TestMount utility.

I'm backing this up in Phabricator with `--plan-changes` to start until I get
some basic `hg add` functionality working end-to-end. When that looks good, I'll
determine if/how this should be split into smaller commits.

Reviewed By: wez

Differential Revision: D4023232

fbshipit-source-id: 7fc931d547ccadb34f7caae93bc4eb8f91f6ceb8
2016-11-01 17:49:08 -07:00
Michael Bolin
8112c2a529 Introduce TestMount::hasFileAt().
Summary:
This is a utility that should be generally useful in creating tests,
including the test of `TestMount` itself.

Reviewed By: simpkins

Differential Revision: D4073653

fbshipit-source-id: dda1d8ea8d29aa071a31f8e2afab324f9109e9b2
2016-10-28 14:58:24 -07:00
Michael Bolin
65e078dd1e Introduce TestMount::readFile().
Summary:
This is a utility that should be generally useful in creating tests,
including the test of `TestMount` itself.

As you can see, this helped uncover a bug in the way we were
inserting blobs into `LocalStore`.

Reviewed By: simpkins

Differential Revision: D4073039

fbshipit-source-id: 42683fd0bfdb0a1e77df9324fcaa79091f45e83d
2016-10-28 14:58:24 -07:00
Michael Bolin
24d6612232 Make getRootInode() method available through the stack.
Summary: This is a follow-up revision from a comment on D4013464.

Reviewed By: wez

Differential Revision: D4050278

fbshipit-source-id: 1e46526f58a07e1eedd8ace1a6d84a919240d899
2016-10-21 14:17:33 -07:00
Michael Bolin
f5f9545bd3 Introduce getTreeForDirectory helper function.
Summary:
This is analogous to the existing `getEntryForFile()` helper function that we
have, and I was able to rewrite `getEntryForFile()` in terms of
`getTreeForDirectory()`, which simplifies the code considerably.
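
A minimal sketch of how a file lookup can be layered on a directory lookup, assuming a hypothetical in-memory tree model (`TreeSketch`, `EntrySketch`, `splitPath()`); the real functions work against an `ObjectStore` and Eden's path types.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct TreeSketch;

// Hypothetical in-memory model: an entry is a directory if it has a subtree.
struct EntrySketch {
  std::string name;
  std::shared_ptr<TreeSketch> subtree;  // null for regular files
};

struct TreeSketch {
  std::map<std::string, EntrySketch> entries;

  const EntrySketch* getEntryPtr(const std::string& name) const {
    auto it = entries.find(name);
    return it == entries.end() ? nullptr : &it->second;
  }
};

// Splits "a/b/c" into {"a", "b", "c"}, skipping empty components.
std::vector<std::string> splitPath(const std::string& path) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (start <= path.size()) {
    auto end = path.find('/', start);
    if (end == std::string::npos) {
      end = path.size();
    }
    if (end > start) {
      parts.push_back(path.substr(start, end - start));
    }
    start = end + 1;
  }
  return parts;
}

// Walks down from `root`; returns nullptr if any component is missing or is
// not a directory. The result is only valid while `root` is alive.
const TreeSketch* getTreeForDirectory(
    const TreeSketch& root, const std::vector<std::string>& components) {
  const TreeSketch* current = &root;
  for (const auto& name : components) {
    const EntrySketch* entry = current->getEntryPtr(name);
    if (entry == nullptr || !entry->subtree) {
      return nullptr;
    }
    current = entry->subtree.get();
  }
  return current;
}

// File lookup expressed as "find the parent directory, then the basename".
const EntrySketch* getEntryForFile(
    const TreeSketch& root, const std::string& path) {
  std::vector<std::string> parts = splitPath(path);
  if (parts.empty()) {
    return nullptr;
  }
  std::string basename = parts.back();
  parts.pop_back();
  const TreeSketch* dir = getTreeForDirectory(root, parts);
  if (dir == nullptr) {
    return nullptr;
  }
  const EntrySketch* entry = dir->getEntryPtr(basename);
  return (entry != nullptr && !entry->subtree) ? entry : nullptr;
}
```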

Also moved things from `eden/fs/model/hg/misc.h` to
`eden/fs/store/ObjectStores.h`, which is much more appropriate.

Reviewed By: wez

Differential Revision: D4032817

fbshipit-source-id: ff4d32120fb050f8b5c5c53b7f2e94b524781648
2016-10-21 13:32:02 -07:00
Michael Bolin
139c9dec3b Move getRootTree() helper from TestMount to EdenMount.
Summary:
This is not a one-liner and this is needed for the upcoming `Dirstate` class,
so moving this code to a place where it is more easily reusable.

Reviewed By: simpkins

Differential Revision: D4032001

fbshipit-source-id: 7d8d87802665ac2993ec0a3ac73c5f645fe4a1aa
2016-10-21 13:32:02 -07:00
Michael Bolin
a23cb5d8a2 Introduce getModifiedDirectoriesForMount().
Summary:
Performs a depth-first traversal of the overlay to find modified
directories and returns them in that order.
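
One way such a traversal could look, as a sketch only: `OverlayDirSketch` is a hypothetical in-memory model of the overlay, and pre-order visiting (parent before children, children in name order) is an assumption, since the text above does not say which depth-first order is used.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical in-memory model of an overlay directory.
struct OverlayDirSketch {
  bool modified = false;
  std::map<std::string, std::unique_ptr<OverlayDirSketch>> children;

  // Returns the named child directory, creating it if needed.
  OverlayDirSketch& child(const std::string& name) {
    auto& slot = children[name];
    if (!slot) {
      slot.reset(new OverlayDirSketch());
    }
    return *slot;
  }
};

// Pre-order depth-first walk: a directory is reported before its children.
void collectModified(
    const OverlayDirSketch& dir,
    const std::string& path,
    std::vector<std::string>& out) {
  if (dir.modified) {
    out.push_back(path);
  }
  for (const auto& kv : dir.children) {
    std::string childPath = path.empty() ? kv.first : path + "/" + kv.first;
    collectModified(*kv.second, childPath, out);
  }
}

// Returns relative paths of modified directories; the root is "".
std::vector<std::string> getModifiedDirectories(const OverlayDirSketch& root) {
  std::vector<std::string> out;
  collectModified(root, "", out);
  return out;
}
```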

Reviewed By: simpkins

Differential Revision: D4025309

fbshipit-source-id: 09d8ed41b250dddbfb3fe545643ec3fd755a430e
2016-10-21 13:32:02 -07:00
Michael Bolin
83996c4630 Move some logic out of EdenServiceHandler so it can be reused by the dirstate and test harness.
Summary:
Now that I've done all this work, I'm not sure whether it is a good idea or even
necessary. I'll keep it in my back pocket.

Reviewed By: simpkins

Differential Revision: D4014598

fbshipit-source-id: 6ded3cc29838e964b56833ac24dff19e9de040f5
2016-10-21 13:32:02 -07:00
Michael Bolin
25c8407d96 Add mkdir(), overwriteFile(), and deleteFile() to TestMount.
Summary:
These are new helper methods we need to create test scenarios.
They will be used in upcoming revisions.

Reviewed By: wez

Differential Revision: D4046981

fbshipit-source-id: 9c66c456be57006173e4a65eed603de4a426a438
2016-10-21 13:32:02 -07:00
Michael Bolin
4d4538edec Fix bug I introduced in D4034298.
Summary: facepalm

Reviewed By: wez

Differential Revision: D4048640

fbshipit-source-id: 7a1ff55b7152d781c317c8e7f55c1afe4541fc12
2016-10-19 17:11:42 -07:00
Michael Bolin
9b1764f5ea Introducing TestMount to help create Eden mounts for unit tests.
Summary: This should be useful for my upcoming unit tests for the Hg dirstate.

Reviewed By: simpkins

Differential Revision: D4013464

fbshipit-source-id: 46460186abfa104aa026894068cd160e52c94729
2016-10-19 13:21:14 -07:00
Michael Bolin
1953391b36 Introduce TreeEntry.getMode() because getOwnerPermissions() was not doing the expected thing.
Summary: This will make it easier to compare a `TreeEntry` with a `TreeInode::Entry`.

Reviewed By: simpkins

Differential Revision: D4034298

fbshipit-source-id: 29674e2902661bf46394ea71b81537b35bd4b107
2016-10-19 10:54:11 -07:00
Michael Bolin
6c8512bbad Extract logic into MountPoint.getInodeBaseForPath() method.
Summary: This should make some of the upcoming test harness work a little easier.

Reviewed By: simpkins

Differential Revision: D4011747

fbshipit-source-id: 87ee80a6d641a29be9027b163b1adee496f4452f
2016-10-18 12:19:32 -07:00
Michael Bolin
7f20232d4b New EdenMount constructor.
Summary:
I need this for the upcoming test harness so I can avoid creating a
`ClientConfig`, which is currently a huge pain to do from a unit test.

Reviewed By: simpkins

Differential Revision: D4010842

fbshipit-source-id: 03d1e1de9c3047340a6f26202d4b432f4a8620b4
2016-10-18 12:19:31 -07:00
Michael Bolin
6f14b7f6d0 Fix "heap-use-after-free" issues in misc.cpp and miscTest.cpp.
Summary:
This was reported by ASAN.

The major issue was that `FakeObjectStore` was returning a copy of a `Tree`,
so it was not the case that the `TreeEntry*` returned by `getEntryForFile()`
was guaranteed to be "owned by" the `Tree* root` that was passed in. To address
this, we change `getEntryForFile()` to now return a copy of the `TreeEntry*`
that it gets back from `getEntryPtr()`. It really comes down to this line:

```
auto entry = currentDirectory->getEntryPtr(piece.basename());
```

because we cannot guarantee that `currentDirectory` will live past the end of
`getEntryForFile()`, so we cannot guarantee that the return value of
`currentDirectory->getEntryPtr()` will, either.
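
The ownership problem can be sketched with a few hypothetical types (`TreeSketch`, `TreeEntrySketch`, `FakeStoreSketch`); this is an illustration of the copy-before-return idea under those assumptions, not the actual signatures.

```cpp
#include <map>
#include <memory>
#include <string>

struct TreeEntrySketch {
  std::string name;
  unsigned mode;
};

struct TreeSketch {
  std::map<std::string, TreeEntrySketch> entries;

  // Borrowed pointer: valid only while this TreeSketch is alive.
  const TreeEntrySketch* getEntryPtr(const std::string& name) const {
    auto it = entries.find(name);
    return it == entries.end() ? nullptr : &it->second;
  }
};

// Hypothetical store that hands out a fresh copy of the tree on every call,
// so the caller is the sole owner of what it gets back.
struct FakeStoreSketch {
  TreeSketch stored;

  std::unique_ptr<TreeSketch> getTree() const {
    return std::unique_ptr<TreeSketch>(new TreeSketch(stored));
  }
};

std::unique_ptr<TreeEntrySketch> getEntryForFile(
    const FakeStoreSketch& store, const std::string& basename) {
  std::unique_ptr<TreeSketch> currentDirectory = store.getTree();
  const TreeEntrySketch* entry = currentDirectory->getEntryPtr(basename);
  if (entry == nullptr) {
    return nullptr;
  }
  // Copy the entry before `currentDirectory` is destroyed at the end of
  // this scope; returning `entry` itself would leave the caller with a
  // dangling pointer.
  return std::unique_ptr<TreeEntrySketch>(new TreeEntrySketch(*entry));
}
```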

Special thanks to meyering and yfeldblum for helping me debug this.

Reviewed By: simpkins

Differential Revision: D4024627

fbshipit-source-id: 6295e6f2b1d2f544271b2aebad27a4ad3ae04563
2016-10-14 17:53:18 -07:00
Michael Bolin
402a5f8124 Fix "heap-use-after-free" issue reported in FakeObjectStoreTest.cpp.
Summary: This was reported by ASAN.

Reviewed By: simpkins

Differential Revision: D4024528

fbshipit-source-id: b8d45132ba6c01f7d17a425e557934658dc6b4a8
2016-10-14 15:57:09 -07:00
Michael Bolin
670b69cc6b Introduce getEntryForFile() utility function.
Summary:
Utility function that given a `Tree` and a `RelativePathPiece`, returns the
corresponding `TreeEntry` in the `ObjectStore`, if it exists.

Reviewed By: wez

Differential Revision: D3980261

fbshipit-source-id: 2808a4ca45be84e3a6bb91b0cf2db19a3bf88798
2016-10-14 10:41:29 -07:00
Michael Bolin
a252833e70 Introduce a FakeObjectStore for use in unit tests.
Summary:
In an upcoming revision, I am going to introduce a utility function that takes
an `ObjectStore` (well, now an `IObjectStore`) as a parameter and I want to be
able to test it. Having a `FakeObjectStore` should make this considerably easier
without having to resort to mocks.

Reviewed By: simpkins

Differential Revision: D3980580

fbshipit-source-id: 5886e2055c893e749cc898226e1baade776c3ea7
2016-10-14 10:41:29 -07:00
Wez Furlong
f5c781c3f0 some prep work for hypothesis testing
Summary:
Adds a very basic example of testing eden functionality with hypothesis.

We'll be building on this with stateful testing in a follow on diff tomorrow.

There's some prep/setup work in the base test class that can be removed when an updated version of hypothesis ships and is updated in our third-party repo.

Reviewed By: simpkins

Differential Revision: D3968250

fbshipit-source-id: 46382c3bf2d6a0edbd60ac2b048b1bae26ca2572
2016-10-14 07:26:13 -07:00
Michael Bolin
04932226b7 Make facebook::eden::Hash hashable.
Summary: This is necessary so that it can be used as the key in an `unordered_map`.
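
For illustration, making a type usable as an `unordered_map` key generally means providing `operator==` and a `std::hash` specialization. The sketch below uses a hypothetical 20-byte `HashSketch`; how the real `Hash` computes its bucket hash is not stated here, and reusing the leading bytes is just one plausible choice.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical 20-byte (SHA-1 sized) hash value.
struct HashSketch {
  std::array<std::uint8_t, 20> bytes;

  bool operator==(const HashSketch& other) const {
    return bytes == other.bytes;
  }
};

namespace std {
template <>
struct hash<HashSketch> {
  size_t operator()(const HashSketch& h) const noexcept {
    // The bytes of a cryptographic hash are already well distributed, so
    // the leading bytes are reused directly as the bucket hash.
    size_t result;
    memcpy(&result, h.bytes.data(), sizeof(result));
    return result;
  }
};
}  // namespace std
```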

Reviewed By: simpkins

Differential Revision: D3980575

fbshipit-source-id: d225a98f957f9aae2f2f50a6cc365011d953c92e
2016-10-12 15:53:30 -07:00
Michael Bolin
ce7d1cdd3b Add a test for Tree.getEntryPtr().
Summary:
Apparently we did not have an existing unit test for `Tree`, so this adds one.
The other methods should be tested, as well, but I'm about to use `getEntryPtr()`
elsewhere, which is why I just focused on this one for the moment.

Reviewed By: simpkins

Differential Revision: D3980150

fbshipit-source-id: 33456fd621a1894606605af4fee06ba42d124752
2016-10-06 22:20:35 -07:00
Wez Furlong
4738a0dbba activate hypothesis deps
Summary:
We want to use these with Eden

Depends on D3961190
Depends on D3961193
Depends on D3961196
Depends on D3961208

Reviewed By: rhysparry

Differential Revision: D3961232

fbshipit-source-id: 56f5a1811625303514e4398a6d47ea90ba348724
2016-10-06 10:01:27 -07:00
Adam Simpkins
5b2e9cc7dc don't crash if getMaterializedEntries() is called with a bad mount point
Summary:
getMaterializedEntries() would previously try to dereference a null pointer
if the input mount path did not refer to a valid mount point.

Reviewed By: bolinfest, wez

Differential Revision: D3942600

fbshipit-source-id: 2a8c9aa87d2bd8175f7bc77f3d6293ad25e9c198
2016-10-04 11:04:18 -07:00
Adam Simpkins
17cf69d3b2 add some helper functions for constructing EdenErrors
Summary:
Add some helper functions for constructing EdenError objects from a few
different types of arguments.  Also update eden.thrift to indicate that most
functions can throw EdenErrors on failure.

Reviewed By: bolinfest, wez

Differential Revision: D3942588

fbshipit-source-id: 1b561c5310a8a218f88c38c70499e087fe47bbe0
2016-10-04 11:04:17 -07:00
Adam Simpkins
859a4c265b update the python client library to be python 2.x compatible
Summary:
Python 2.x requires the current class name be passed into super().
Add arguments to super so that we can use this inside a mercurial extension.
(Mercurial only supports python 2.x.)

Reviewed By: bolinfest

Differential Revision: D3942573

fbshipit-source-id: 06df55f217631a398004c0d25448d3a612f772e9
2016-09-30 19:13:13 -07:00
Adam Simpkins
29111f3733 normalize mount paths when doing config look-ups
Summary:
The keys in the config directory map are normalized, absolute paths to the
mount point.  When trying to look up a mount point make sure we also always use
a normalized absolute path.
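
The idea can be sketched as follows, assuming purely lexical normalization. `normalizeAbsolutePath()` and `lookupClient()` are hypothetical helpers written for this example; the sketch does not resolve symlinks and expects the caller to have made the path absolute already.

```cpp
#include <map>
#include <string>
#include <vector>

// Lexically normalizes an absolute path: collapses repeated '/', drops '.'
// components and trailing slashes, and resolves '..' against the previous
// component. The filesystem is never consulted.
std::string normalizeAbsolutePath(const std::string& path) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (start < path.size()) {
    auto end = path.find('/', start);
    if (end == std::string::npos) {
      end = path.size();
    }
    std::string part = path.substr(start, end - start);
    if (part == "..") {
      if (!parts.empty()) {
        parts.pop_back();
      }
    } else if (!part.empty() && part != ".") {
      parts.push_back(part);
    }
    start = end + 1;
  }
  std::string result;
  for (const auto& part : parts) {
    result += "/";
    result += part;
  }
  return result.empty() ? std::string("/") : result;
}

// Looks up a mount point in a map keyed by normalized absolute paths.
// Returns nullptr when the mount point is unknown.
const std::string* lookupClient(
    const std::map<std::string, std::string>& configMap,
    const std::string& mountPath) {
  auto it = configMap.find(normalizeAbsolutePath(mountPath));
  return it == configMap.end() ? nullptr : &it->second;
}
```

Normalizing on both insertion and lookup means `/tmp/eden/mounts/main/` and `/tmp//eden/./mounts/main` resolve to the same key.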

Reviewed By: bolinfest

Differential Revision: D3942565

fbshipit-source-id: 63db838ffc7139d779925adf07c50f849d73bcc5
2016-09-30 19:07:45 -07:00
Wez Furlong
0f4132c35f ensure that we set materialized=true when loading overlay
Summary:
We were hitting an assertion in the case where we did a `mkdir`
followed by a `rename` followed by `getMaterializedEntries`.

The issue is that our in-memory representation has a boolean to indicate
whether a dir inode is materialized, but our serialization format does
not have this bit.  When we loaded the data we were not setting the
field to true and this was caught by the DCHECK.

If we have serialized data for a dir then it is, by definition, materialized
and we should just set that field to true.

Reviewed By: bolinfest

Differential Revision: D3900795

fbshipit-source-id: 62d8281e7a1009056d274888c9aff87664d2e09f
2016-09-26 13:54:14 -07:00
Michael Bolin
634e96872e Add initial support for hooks akin to Git hooks for Eden.
Summary:
This design is inspired by that of Git hooks:
https://git-scm.com/docs/githooks

By default, `/etc/eden/hooks` should be the place where Eden looks for
hooks; however, this can be overridden in `~/.edenrc` on a per-`repository` basis.
This directory should be installed as part of installing Eden.
There is information in `eden/hooks/README.md` about this.

The first hook that is supported is for post-clone logic for a repository.

This change demonstrates the need for an `eden config --get <value>`
analogous to what Git has, as hooks should be able to leverage this in their
own scripts. This change introduces a `TODO` in `post-clone.py` where such a
feature would be useful, so that I could add the following to my `~/.edenrc`
to develop the Eden extension for Hg:

```
[hooks]
hg.edenextension = /data/users/mbolin/fbsource/fbcode/eden/hg/eden

[repository fbsource]
path = /data/users/mbolin/fbsource
type = hg
hooks = /data/users/mbolin/eden-hooks
```

Note that this revision also introduces a `generate-hooks-dir` script that can be
used to generate the standard `/etc/eden/hooks` directory that we intend to
distribute with Eden. This is also useful in creating the basis for a custom `hooks`
directory that can be specified as shown above in an `~/.edenrc` file.

Reviewed By: simpkins

Differential Revision: D3858635

fbshipit-source-id: 215ca26379a4b3b0a07d50845fd645b4d9ccf0f2
2016-09-26 13:53:05 -07:00
Wez Furlong
734fbf0c59 fix build in @mode/opt
Summary: CI didn't catch this, annoying!

Reviewed By: simpkins

fbshipit-source-id: 8217cfdc75365a5c1a5d2962792805d35b31d1b9
2016-09-26 13:52:25 -07:00
Wez Furlong
878ce3138a fix issue with renaming between different dirs
Summary:
simpkins spotted this; we were passing the wrong path down to the overlay saving dir.

This adds a test to prove that the source and destination directory contents
are correct both immediately after performing the rename and after remounting,
where we just read the serialized data.

Reviewed By: simpkins

Differential Revision: D3888694

fbshipit-source-id: 7f5fb5be417db5c693ac8a07b85abbffdbfe0fff
2016-09-26 13:52:25 -07:00
Wez Furlong
aef0c5b279 implement getFileInformation
Summary: Fairly straightforward.

Reviewed By: simpkins

Differential Revision: D3872989

fbshipit-source-id: def2dfc624a6aa08ad089f19bd3a8438e26f0bbd
2016-09-26 13:52:25 -07:00
Wez Furlong
8f4373571b implement getFilesChangedSince
Summary:
This is pretty straightforward; we just walk back until we hit the
boundary with the requested JournalPosition.sequenceNumber
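
A minimal sketch of that walk, assuming the journal is a singly linked chain of deltas from newest to oldest. `JournalDeltaSketch` is a hypothetical type, and treating the requested sequence number as an exclusive boundary is an assumption.

```cpp
#include <cstdint>
#include <memory>
#include <set>
#include <string>

// Hypothetical journal entry: each delta points at the one before it.
struct JournalDeltaSketch {
  std::uint64_t sequenceNumber;
  std::set<std::string> changedFiles;
  std::shared_ptr<const JournalDeltaSketch> previous;
};

// Walks back from the latest delta and collects the files from every delta
// that is newer than `sinceSequence`.
std::set<std::string> getFilesChangedSince(
    const std::shared_ptr<const JournalDeltaSketch>& latest,
    std::uint64_t sinceSequence) {
  std::set<std::string> result;
  for (const JournalDeltaSketch* delta = latest.get();
       delta != nullptr && delta->sequenceNumber > sinceSequence;
       delta = delta->previous.get()) {
    result.insert(delta->changedFiles.begin(), delta->changedFiles.end());
  }
  return result;
}
```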

Reviewed By: simpkins

Differential Revision: D3872970

fbshipit-source-id: 1405f05957346d7ac513070f0407a477548aff1d
2016-09-26 13:52:25 -07:00
Wez Furlong
82c57b2bf8 implement getCurrentJournalPosition thrift API
Summary:
populate the position from the latest journal delta.

To facilitate this, we also define the mountGeneration value to be a
combination of the pid and the time at which we created the EdenMount object,
as well as a global counter that we bump for each mount.

The precise value and meaning of these bits really doesn't matter, just that we
are unlikely to pick the same value for this same mountPoint path again if we
were to remount in the future.
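
One possible way to combine the three ingredients, as a sketch: the bit layout below (16 bits of pid, 32 bits of time, 16 bits of counter) and the name `makeMountGeneration()` are assumptions made for this example, not the layout used by the commit.

```cpp
#include <unistd.h>

#include <atomic>
#include <cstdint>
#include <ctime>

// Builds a value that is unlikely to repeat for a later mount: low bits of
// the pid, the current time in seconds, and a per-process mount counter.
std::uint64_t makeMountGeneration() {
  static std::atomic<std::uint16_t> mountCounter{0};
  std::uint64_t pid = static_cast<std::uint64_t>(getpid()) & 0xffff;
  std::uint64_t now =
      static_cast<std::uint64_t>(std::time(nullptr)) & 0xffffffff;
  std::uint64_t counter = mountCounter.fetch_add(1);
  return (pid << 48) | (now << 16) | counter;
}
```

The counter distinguishes mounts created within the same second by the same process, while the pid and time cover restarts.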

Since we are now in a position to report JournalPosition values to clients, now
is also a good time to fill out the `currentPosition` field for the
`getMaterializedEntries` thrift call, and to check that this value is
consistent with the value we return via `getCurrentJournalPosition`.

Reviewed By: simpkins

Differential Revision: D3872952

fbshipit-source-id: 2fbc25d2e9711035b66ab1bf5d746507b72de265
2016-09-26 13:52:25 -07:00
Wez Furlong
8b41b90108 sample the snapshot id in the journal at mount time
Summary:
This just populates the initial snapshot hash in the journal.

The `addDelta` method will propagate this into subsequent deltas if the delta
to be added has hash values that have not been set from the default 0-filled
hash values.
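
The propagation rule can be sketched like this. `JournalSketch`, `DeltaSketch`, and `HashSketch` are hypothetical, and the choice to fill both `fromHash` and `toHash` from the previous delta's `toHash` is an assumption about what "propagate" means here.

```cpp
#include <array>
#include <cstdint>
#include <memory>
#include <utility>

using HashSketch = std::array<std::uint8_t, 20>;
const HashSketch kZeroHash{};  // value-initialized: all zero bytes

struct DeltaSketch {
  HashSketch fromHash{};  // zero-filled means "not set"
  HashSketch toHash{};
  std::shared_ptr<const DeltaSketch> previous;
};

class JournalSketch {
 public:
  // Seeds the journal with the snapshot that is checked out at mount time.
  explicit JournalSketch(const HashSketch& initialSnapshot) {
    auto first = std::make_shared<DeltaSketch>();
    first->fromHash = initialSnapshot;
    first->toHash = initialSnapshot;
    latest_ = first;
  }

  void addDelta(std::shared_ptr<DeltaSketch> delta) {
    // A delta that leaves its hashes unset inherits the current snapshot,
    // i.e. "nothing about the checked-out commit changed".
    if (delta->toHash == kZeroHash) {
      delta->toHash = latest_->toHash;
    }
    if (delta->fromHash == kZeroHash) {
      delta->fromHash = latest_->toHash;
    }
    delta->previous = latest_;
    latest_ = std::move(delta);
  }

  const DeltaSketch& latest() const { return *latest_; }

 private:
  std::shared_ptr<const DeltaSketch> latest_;
};
```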

Reviewed By: simpkins

Differential Revision: D3872936

fbshipit-source-id: d0014ded40488a2be04d5a381e1d9815c7f0a638
2016-09-26 13:52:25 -07:00
Wez Furlong
07df3d8fbc additional query API for our thrift interface
Summary:
This diff adds a couple more things to our thrift interface:

1. Introduces JournalPosition
2. Adds methods to query the current JournalPosition and obtain a
   delta since a given JournalPosition
3. Augments getMaterializedFiles to also return the current JournalPosition
4. Adds a method to evaluate a `glob` against Eden
5. Adds a method using thrift streaming to subscribe to realtime changes

Could probably finesse the naming a little bit.

The JournalPosition allows reasoning about changes to files that are not part
of an Eden snapshot.  Internally the journal position is just the
SequenceNumber from the journal datastructures, but when we expose it to
clients we need to be able to distinguish between a sequence number from the
current instance of the eden service and a prior incarnation (eg: if the
process has been restarted, and we have no way to recreate the journal we need
to be able to indicate this to the client if they ask about changes in that
range).   For the convenience of the client we also include the `toHash` (the
most recent hash from the journal entry) which is likely useful for the `hg`
dirstate operations; it is useful to know that the snapshot may have changed
since the last query about the dirstate.

The `getFileInformation` method returns the instantaneously available `stat()`
like information about the requested list of files.   Since we simply don't
have historical data on how files in the overlay looked (only how they look
now), this method does not allow passing in a JournalPosition.  When it comes
to comparing historical data, we will need to add an API that accepts two
snapshot hashes and generates the results from there.  This particular method
is geared up to understanding the current state of the world; the obvious use
case is plugging in the file list from `getFilesChangedSince` into this
function to figure out what's what.

* Do we want a function that combines `getFilesChangedSince` + `getFileInformation` into a single RPC?

Why is there a glob method?  It's to support a use-case in the watchman/buck
integration.  I'm just sketching it out in the thrift interface at this stage.
In the future we also need to be able to express how to carry out a tree walk,
but that will require some query predicates that I don't want to get hung up on
specifying immediately.

Why is the streaming stuff in its own thrift file?  We can't generate code for
it in java or perhaps also python.  It's only needed to plumb data into
watchman so it's broken out into its own definition.  Nothing depends on that
file yet, so it's probably not specified quite right.  The important thing is
how the subscribe method looks: it's essentially the same as the method to
query a delta, but it keeps emitting deltas as they are produced.  This is
another API that will benefit from query predicates when we get around to
specifying them.

I've added `JournalDelta::fromHash` and `JournalDelta::toHash` to hold the
appropriate snapshot ids in the journal entry; this will allow us to indicate
when we've checked out a new snapshot, or created a new snapshot.  We have
no way to populate these yet; I commented on D3762646 about storing the
`snapshotID` that we have during `EdenServiceHandler::mountImpl` into either
the `EdenMount` or the proposed `RootInode` class.  Once we have that we
can simply sample it and store it as we generate `JournalDelta`s.

Reviewed By: simpkins

Differential Revision: D3860804

fbshipit-source-id: 896c24c354e6f58328fb45c24b16915d9e937108
2016-09-26 13:52:25 -07:00
Wez Furlong
ca929bcfa5 hook up journal functions to filesystem change operations
Summary:
This is pretty simplistic: we just wlock and add a delta for the set
of file(s) that were changed in a given fuse operation (this is typically 1
file, but rename affects 2).

To reduce boilerplate very slightly, I've added an initializer_list constructor
for JournalDelta that makes it less cumbersome to create a JournalDelta for a
list of files.
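
Roughly, the pattern looks like the sketch below. All names are hypothetical, and a plain `std::mutex` stands in for the `wlock` on a `Synchronized<Journal>`.

```cpp
#include <cstddef>
#include <initializer_list>
#include <mutex>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct JournalDeltaSketch {
  JournalDeltaSketch() = default;
  // Lets callers write JournalDeltaSketch{from, to} for a list of files.
  JournalDeltaSketch(std::initializer_list<std::string> files)
      : changedFiles(files) {}

  std::set<std::string> changedFiles;
};

class LockedJournalSketch {
 public:
  void addDelta(JournalDeltaSketch delta) {
    // Exclusive lock for the duration of the append.
    std::lock_guard<std::mutex> guard(mutex_);
    deltas_.push_back(std::move(delta));
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return deltas_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::vector<JournalDeltaSketch> deltas_;
};

// Most operations touch one file...
void recordWrite(LockedJournalSketch& journal, const std::string& path) {
  journal.addDelta(JournalDeltaSketch{path});
}

// ...but a rename affects both the source and the destination.
void recordRename(
    LockedJournalSketch& journal,
    const std::string& from,
    const std::string& to) {
  journal.addDelta(JournalDeltaSketch{from, to});
}
```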

Reviewed By: simpkins

Differential Revision: D3866053

fbshipit-source-id: cd918e2c98c022d5ef79430cd8ab4aef88875239
2016-09-26 13:52:25 -07:00
Wez Furlong
c400658464 initial take on a Journal API
Summary:
This implements a pretty simple change Journal and associated
JournalDelta.

The Journal is intended to be held in memory and not persisted to disk.
The idea is that we'll hold a `Synchronized<Journal>` along with the
other mount data and grab a `wlock` on it each time we want to add
a change record.

This diff doesn't change any other existing functionality.

Reviewed By: simpkins

Differential Revision: D3660162

fbshipit-source-id: a6b6fa28dd12e4d34718956167ee87f8cb2d89ca
2016-09-26 13:52:25 -07:00
Wez Furlong
e54df2e422 add getMaterializedEntries thrift call
Summary:
Adds a thrift call that returns the list of materialized entries from the whole tree.

This is intended to be plugged into the mercurial dirstate extension.

Reviewed By: simpkins

Differential Revision: D3851805

fbshipit-source-id: 8429fdb4eeccc32928e8abc154d4e6fd49343556
2016-09-26 13:52:24 -07:00
Kirill Sazonov
af13d51a22 Fix fbcode projects that depend on fbthrift
Summary:
The previous diff in this stack moves the following buck targets:
`//thrift/lib/java/src:thrift` -> `//thrift/lib/java:thrift`
`//thrift/lib/java/src/com/facebook/thrift/direct_server:DirectServer` -> `//thrift/lib/java:DirectServer`
`//thrift/lib/java/src/com/facebook/thrift/direct_server:TDirectServer` -> `//thrift/lib/java:TDirectServer`

This diff fixes fbcode projects to reflect that change via the following script:
  find . -name TARGETS -type f -exec sed -i "s#'//thrift/lib/java/src:thrift'#'//thrift/lib/java:thrift'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#'@/thrift/lib/java/src:thrift'#'@/thrift/lib/java:thrift'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"//thrift/lib/java/src:thrift\"#\"//thrift/lib/java:thrift\"#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"@/thrift/lib/java/src:thrift\"#\"@/thrift/lib/java:thrift\"#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#'//thrift/lib/java/src/com/facebook/thrift/direct_server:DirectServer'#'//thrift/lib/java:DirectServer'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#'@/thrift/lib/java/src/com/facebook/thrift/direct_server:DirectServer'#'@/thrift/lib/java:DirectServer'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"/thrift/lib/java/src/com/facebook/thrift/direct_server:DirectServer\"#\"//thrift/lib/java:DirectServer\"#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"@/thrift/lib/java/src/com/facebook/thrift/direct_server:DirectServer\"#\"@/thrift/lib/java:DirectServer\"#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#'//thrift/lib/java/src/com/facebook/thrift/direct_server:TDirectServer'#'//thrift/lib/java:TDirectServer'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#'@/thrift/lib/java/src/com/facebook/thrift/direct_server:TDirectServer'#'@/thrift/lib/java:TDirectServer'#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"/thrift/lib/java/src/com/facebook/thrift/direct_server:TDirectServer\"#\"//thrift/lib/java:TDirectServer\"#g" {} \;
  find . -name TARGETS -type f -exec sed -i "s#\"@/thrift/lib/java/src/com/facebook/thrift/direct_server:TDirectServer\"#\"@/thrift/lib/java:TDirectServer\"#g" {} \;

After that, the following files, found via BigGrep, were fixed manually:
  - .buckconfig
  - adsatlas/rex/pom.xml
  - facer/engine-java-client/.classpath
  - facer/vizi_java_utils/.classpath
  - java-foundations/scripts/tasks-csv/tasks.2016.java8.csv
  - mobile/device_categorization/service/copy-libs.sh
  - nettools/collection/job_manager/java/src/com/facebook/nettools/collection/job_manager/CachedModel.java
  - nettools/collection/job_manager/java/src/com/facebook/nettools/collection/job_manager/Util.java
  - stampede/example/buckconfig
  - titan/.hgsparse-fbsource
  - tools/build/buck/macro_lib/convert/thrift_library.py

Reviewed By: ryandm

Differential Revision: D3873877

fbshipit-source-id: fb008c13f75d663016b15c6fcfcd563033e7fbb8
2016-09-26 13:52:24 -07:00
Wez Furlong
4017d77bd0 eden: save ~30 seconds or ~30% waiting for buck test eden/...
Summary:
Was chatting with simpkins the other day and he mentioned that our
instrumented hg wrappers are quite CPU intensive.  This diff switches us to
running the underlying `hg.real` or `git.real` in our integration tests when we
find them in the path.

Reviewed By: bolinfest

Differential Revision: D3865996

fbshipit-source-id: d047749356f0c1c0662774e25801f3578f9f9243
2016-09-26 13:52:24 -07:00
Adam Simpkins
c9cb49c986 fix license message in hg_import_helper.py
Summary:
hg_import_helper.py imports mercurial python modules, which are GPLv2+, so this
code also needs to be licensed under GPLv2+, rather than the BSD-style license
used for the bulk of the eden code base.

Reviewed By: wez

Differential Revision: D3833508

fbshipit-source-id: eb2a8969a5a88c12444a3778875609f24e145e6b
2016-09-12 19:49:13 -07:00
Wez Furlong
c077dced83 eden: fix @mode/opt build
Summary:
Annoying that gcc and clang behave differently here.  The compilation
error is due to gcc not seeing the implicit this pointer for some of these
method calls, so we need to explicitly use it.

Reviewed By: simpkins

Differential Revision: D3846973

fbshipit-source-id: 3d5b8b8b8c9bbab1e7935cff0e65677f76d116fb
2016-09-12 19:26:05 -07:00
Michael Bolin
7a05213f34 New Thrift endpoint: getBindMounts(mountPoint).
Summary:
Buck needs this API so that it knows which paths under a project
root it should exclude when deciding whether it can ask Eden for its
SHA-1 or if it must compute it on its own.

Reviewed By: simpkins

Differential Revision: D3840658

fbshipit-source-id: 5eddc0bef423d3b3ee165d2a4b0bbf193f94f61a
2016-09-12 18:29:15 -07:00
Wez Furlong
9c79b74456 eden: re-do overlay serialization
Summary:
we now serialize the overlay data for each directory independently.

When we mount, we try to load the root overlay data.  The children are lazy
loaded as the inodes are instantiated.

Structural changes cause the overlay data for the impacted dirs to get saved out.

I need to make a pass over this to fixup comments and so on, I just wanted to get this diff out first.

I moved the overlay stuff from `eden/fs/overlay` -> `eden/fs/inodes` since most
of the overlay-ness is handled in `TreeInode` now; the `Overlay` class is
really just for carrying around the paths and providing the serialization
helpers.

Reviewed By: simpkins

Differential Revision: D3787108

fbshipit-source-id: f0e089a829defd953535b9d0a96b102ac729261b
2016-09-09 16:57:58 -07:00
Wez Furlong
e6239f63c4 eden: merge overlay into the inode objects
Summary:
It was starting to get pretty complex to manage locking across the
inodes, filedata, overlay and soon the journal, so as a simplifying step, this
folds data that was tracked by the overlay into the TreeInode itself.

This is the first diff in a short series for this.  This one:

1. Breaks the persistent overlay information, so shutting down eden and
   bringing it back up will lose your changes (to be restored in the
   following diff)
2. Allows deferring materialization of file data in more cases
3. Allows renaming dirs.

The approach here is now to keep just one source of information about the
directory contents; when we construct a TreeInode we import this data from the
Tree and then apply mutations to it locally.

Each inode can be mutated independently from others; we only need to lock the 1,
2 or 3 participating inodes in the various mutation operations.

I'll tackle persistence of the mutations in the following diff, but the high
level plan for that (to help understand this diff) is to always keep the
directory inodes for mutations alive as inode objects.  We make use of the
canForget functionality introduced by D3774269 to ensure that these don't
get evicted early.   On startup we'll load this information from the overlay
area.

This model simplifies some of the processing around reading dirs and looking up
children.

Since the overlay data now tracks the appropriate tree or content hash
we can be much more lazy at materializing data, especially in the rename
case.  For example, renaming "fbcode" to "fbcod" doesn't require us to
recursively materialize the "fbcode" tree.

Depends on D3653706

Reviewed By: simpkins

Differential Revision: D3657894

fbshipit-source-id: d4561639845ca93b93487dc84bf11ad795927b1f
2016-09-09 16:57:58 -07:00
Wez Furlong
96d6684947 eden: fix data race on shutdown
Summary:
We can't allow ~EdenServer to delete the memory until we're sure that
the other threads are done.  To ensure that, we need to notify the condition
variable while the aux thread still holds the lock.  This makes sure that the
thread destroying the EdenServer waits for the aux thread to release the lock
before we check the predicate and proceed to deleting the memory.
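
A minimal sketch of the pattern described above, assuming a simplified
stand-in for EdenServer: `Server`, `mountFinished`, `waitForMounts` and
`runShutdownDemo` are hypothetical names, not the real code.  The point is only
that `notify_all()` is called while the mutex is still held, so the waiting
thread cannot get past `wait()` and destroy the condition variable until the
notifying thread has finished with it.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the server object that owns the
// mutex and condition variable.
class Server {
 public:
  // Runs on the aux thread once its work is done.
  void mountFinished() {
    std::lock_guard<std::mutex> guard(mutex_);
    finished_ = true;
    // Notify while still holding the lock.  The waiter cannot re-acquire
    // mutex_ and return from wait() until this function releases it, so it
    // cannot destroy cv_ while notify_all() is still running.
    cv_.notify_all();
  }

  // Runs on the thread that is about to destroy the object.
  bool waitForMounts() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return finished_; });
    return finished_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool finished_ = false;
};

bool runShutdownDemo() {
  auto server = std::make_unique<Server>();
  Server* raw = server.get();
  std::thread aux([raw] { raw->mountFinished(); });

  bool sawFinished = server->waitForMounts();
  server.reset();  // the aux thread no longer touches the object
  aux.join();
  return sawFinished;
}
```

If the notification were sent after releasing the lock, the waiter could see
the predicate become true, return and destroy the condition variable while the
aux thread was still inside `notify_all()`, which is the race reported below.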

```
SUMMARY  ThreadSanitizer: data race /
/common/concurrency/Event.cpp:107 in facebook::common::concurrency::Event::set() const
==================
I0909 14:51:18.543072 4147554 main.cpp:173] edenfs performing orderly shutdown
I0909 14:51:18.555794 4148654 Channel.cpp:177] session completed
I0909 14:51:18.556011 4148654 EdenServer.cpp:192] mount point "/tmp/eden_test.0ostuc90/mounts/main" stopped
==================
WARNING: ThreadSanitizer: data race (pid=4147554)
  Write of size 8 at 0x7fff9e182d90 by main thread:
    #0 pthread_cond_destroy <null> (edenfs+0x00000007671a)
    #1 facebook::eden::EdenServer::~EdenServer() /
/eden/fs/service/EdenServer.cpp:93 (libeden_fs_service_server.so+0x0000000b96cd)
    #2 main /
/eden/fs/service/main.cpp:176 (edenfs+0x000000018515)

  Previous read of size 8 at 0x7fff9e182d90 by thread T73:
    #0 pthread_cond_broadcast <null> (edenfs+0x0000000765b7)
    #1 __gthread_cond_broadcast /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/include/x86_64-facebook-linux/bits/gthr-default.h:852 (libstdc++.so.6+0x0000000e14f8)
    #2 std::condition_variable::notify_all() /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/condition_variable.cc:72 (libstdc++.so.6+0x0000000e14f8)
    #3 facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0::operator()() const /
/
/eden/fs/service/EdenServer.cpp:145 (libeden_fs_service_server.so+0x0000000bcdb5)
    #4 std::_Function_handler<void (), facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0>::_M_invoke(std::_Any_data const&) /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2039 (libeden_fs_service_server.so+0x0000000bcab0)
    #5 std::function<void ()>::operator()() const /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2439 (libeden_fuse_fusell.so+0x00000020fbb9)
    #6 facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0::operator()() const /
/eden/fuse/MountPoint.cpp:69 (libeden_fuse_fusell.so+0x000000237447)
    #7 void std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::_M_invoke<>(std::_Index_tuple<>) /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1699 (libeden_fuse_fusell.so+0x000000237048)
    #8 std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::operator()() /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1688 (libeden_fuse_fusell.so+0x000000236ff8)
    #9 std::thread::_Impl<std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()> >::_M_run() /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/thread:115 (libeden_fuse_fusell.so+0x000000236d8c)
    #10 execute_native_thread_routine /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000e6ec0)
```

Reviewed By: simpkins

Differential Revision: D3844846

fbshipit-source-id: 545474bc1aff8621dbeb487dcd6b54c82828ff3b
2016-09-09 16:57:57 -07:00