Commit Graph

51 Commits

Author SHA1 Message Date
Wez Furlong
3da68e5adc dumb merge of MountPoint into EdenMount
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.

Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.

Reviewed By: bolinfest

Differential Revision: D5778212

fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
2017-09-08 19:25:34 -07:00
Michael Bolin
83b3c38095 Fix for hg split in Eden.
Summary:
Before this change, `hg split` crashed complaining that `node` was a
`changectxwrapper` instead of a 20-byte hash when it was sent as `parent1`
of `WorkingDirectoryParents` in `resetParentCommits()`. Now we use `node()` to
get the hash from the `destctx` that we have already extracted via this line
earlier in `merge_update()`:

    destctx = repo[node]

The change to `eden/hg/eden/__init__.py` eliminated the crash, but was
not sufficient on its own to make `hg split` work correctly. There was also a fix
required in `Dirstate.cpp` where the `onSnapshotChanged()` callback was clearing out
entries of both `NotApplicable` and `BothParents` from `hgDirstateTuples`.
It appears that only `NotApplicable` entries should be cleared. (I tried leaving
`NotApplicable` entries in there, but that broke `eden/integration/hg/graft_test.py`.)

I suspected that the logic to clear out `hgDestToSourceCopyMap` in
`Dirstate::onSnapshotChanged` was also wrong, so I deleted it and all of the
integration tests still pass. Admittedly, we are pretty weak in our test coverage
for use cases that write to the `hgDestToSourceCopyMap`. In general, we should
rely on Mercurial to explicitly remove entries from `hgDestToSourceCopyMap`.
We have a Thrift API, `hgClearDirstate()`, that `eden_dirstate` can use to categorically
clear out `hgDirstateTuples` and `hgDestToSourceCopyMap`, if necessary.

Finally, creating a proper integration test for `hg split` required creating a value for
`HGEDITOR` that could write different commit messages for different commits.
To that end, I added a `create_editor_that_writes_commit_messages()` utility as a
method of `HgExtensionTestBase` and updated its `hg()` method to take `hgeditor`
as an optional parameter.

Reviewed By: wez

Differential Revision: D5758236

fbshipit-source-id: 5cb8bf4207d4e802726cd93108fae4a6d48f45ec
2017-09-06 21:20:45 -07:00
Michael Bolin
5cf1692782 Add new Thrift API: hgClearDirstate(mountPoint)
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.

As it stands, the way we subclass `dirstate`, it does not appear like `clear()`
should be called, in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate.`

Reviewed By: wez

Differential Revision: D5675298

fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a
2017-08-22 16:50:24 -07:00
Michael Bolin
e050ffcce2 Add new Thrift API: hgDeleteDirstateTuple(mountPoint, relativePath)
Summary:
Previously, we were overloading `hgSetDirstateTuple()` to also make it possible
to delete an entry from the internal `hgDirstateTuples` map. Now we have an
explicit method to do this, which enables us to remove some hacks/TODOs.

Reviewed By: simpkins

Differential Revision: D5665170

fbshipit-source-id: bc0753d4990c8966fd9e6c50b29a954d5023292f
2017-08-18 21:49:59 -07:00
Michael Bolin
5a2246acbf Address a major outstanding TODO in Dirstate::hgGetDirstateTuple.
Summary:
Previously, `Dirstate::hgGetDirstateTuple()` was reporting a status of
`DirstateNonnormalFileStatus::NotTracked` even when the true status was
`Normal`.

Falsely reporting the status has serious consequences when running `hg log` on
an existing, tracked file. Specifically, it causes this `f not in wctx`
condition to be `True` here:

https://phab.mercurial-scm.org/diffusion/HG/browse/default/mercurial/cmdutil.py;da8bdeb1be28b976909a963c89e974264686e2bb$2316

which in turn causes the slow path to be selected:

https://phab.mercurial-scm.org/diffusion/HG/browse/default/mercurial/cmdutil.py;da8bdeb1be28b976909a963c89e974264686e2bb$2320

For large repositories like ours, this can be very, very slow.

There are still some TODOs in the new implementation, but this seems much more
faithful to the true implementation than what we had before.

Reviewed By: quark-zju

Differential Revision: D5655741

fbshipit-source-id: 07b953e23e4d74c480ac2d94dfc6a8df9df4fcbb
2017-08-18 21:49:59 -07:00
Yedidya Feldblum
d680237331 Remove construct_in_place
Summary:
[Folly] Remove `construct_in_place`.

It is an alias for `in_place`, and unnecessary.

Reviewed By: Orvid

Differential Revision: D5362354

fbshipit-source-id: 39ce07bad87185e506fe7d780789b78060008182
2017-07-02 10:35:18 -07:00
Adam Simpkins
b0d1e57975 update logging statements to use folly logging APIs
Summary:
Update eden to log via the new folly logging APIs rather than with glog.

This adds a new --logging flag that takes a logging configuration string.
By default we set the log level to INFO for all eden logs, and WARNING for
everything else.  (I suspect we may eventually want to run with some
high-priority debug logs enabled for some or all of eden, but this seems like a
reasonable default to start with.)

Reviewed By: wez

Differential Revision: D5290783

fbshipit-source-id: 14183489c48c96613e2aca0f513bfa82fd9798c7
2017-06-22 13:50:13 -07:00
Adam Simpkins
5740a0645b remove ObjectStores.h
Summary:
Remove the ObjectStores.h and .cpp files.  These files contained helper
functions for looking up Tree and TreeEntry objects based on a deep relative
path.

These functions are no longer used anywhere anymore (the code that performs
deep lookups always does inode lookups instead).  Additionally, this file is
the only place still using some of the blocking, non-Future-based APIs for
object lookups.  Deleting these files will let us remove these deprecated
blocking APIs.

Reviewed By: wez

Differential Revision: D5295691

fbshipit-source-id: 89229827305490eba4d3a581954a90c5cf63238d
2017-06-21 17:20:48 -07:00
Michael Bolin
57f5d72a27 Reimplement dirstate used by Eden's Hg extension as a subclass of Hg's dirstate.
Summary:
This is a major change to Eden's Hg extension.

Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.

In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.

However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.

Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).

In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)

That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store in as Thrift-serialized data in an ordinary file along with the `_copymap`
data.

In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.

One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, like in the implementation of `eden_dirstate.status()`
in order to make `hg remove` work).

In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!

I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.

For the reasons explained above, this revision deletes a lot of code `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.

Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)

The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.

Reviewed By: simpkins

Differential Revision: D5071778

fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
2017-05-26 12:05:29 -07:00
Adam Simpkins
c3f47540b0 update the Dirstate to use TreeInode directly to unlink files
Summary:
Update Dirstate::remove() to directly call TreeInode::unlink() instead of going
through the EdenDispatcher.  Internal code should be able to use TreeInode
directly without going through the FUSE Dispatcher.  This prevents us from
having to look up the TreeInode object by inode number again.

This also fixes TreeInode::unlink() and TreeInode::rmdir() to perform FUSE
cache invalidation correctly.  EdenDispatcher used to do this for unlink()
calls, but it was missing for rmdir() calls.

Reviewed By: wez

Differential Revision: D4968832

fbshipit-source-id: 51fda5de196392c640436ca6ad7d8333e6799db9
2017-05-01 18:54:42 -07:00
Adam Simpkins
a521832407 simplify the Dirstate::removeAll() code
Summary:
This replaces the shouldFileBeDeletedByHgRemove() function with a much simpler
and safer check.

It turns out this code was more complicated than it needed to be: it looked up
the inode in question and confirmed that it still existed, even though it's
calling function had already previously looked up the inode object.

I believe this deletes the last place in the code that still assumes that an
InodeBase object must be loaded if the inode has previously been materialized.
This also removes one of the last remaining places that directly holds the
TreeInode contents_ lock outside of TreeInode.cpp.

Reviewed By: bolinfest

Differential Revision: D4968834

fbshipit-source-id: a37bc0d7889eb81cca5b045cbc82732a53ef53d4
2017-05-01 18:54:41 -07:00
Adam Simpkins
51df0c0663 refactor Dirstate::addAll() to clean up some old code
Summary:
This updates the Dirstate::addAll() code to use EdenMount::diff() internally,
instead of getStatusForExistingDirectory().

This lets us delete getStatusForExistingDirectory() and several other helper
functions that it used.  The logic in getStatusForExistingDirectory() was based
on the incorrect assumption that files can only be modified from their expected
source control state if they are materialized.  This could result in incorrect
results in a variety of cases (particularly after renames, or "hg reset"
operations).  The EdenMount::diff() code does not have this problem.

That said, this new addAll() code does still have some performance issues--it
currently does a full tree diff for each input path.  We should ideally fix
this in the future to only diff the necessary subtree for each path.  However,
in the short term this trade-off seems worth being able to delete all of this
older, buggy code.  diff() should be cheap enough in most cases that this won't
be a major problem unless a large number of paths are given as input.

Reviewed By: bolinfest

Differential Revision: D4968835

fbshipit-source-id: 1834aa98a26dcaa0e1c06c7ac25c57944fa1b5f7
2017-05-01 18:54:41 -07:00
Adam Simpkins
87cbfe142b update the in-memory snapshot correctly after a commit
Summary:
This fixes "hg commit" so that it correctly updates the in-memory snapshot.
This has been broken ever since I added the in-memory snapshot when
implementing checkout().  The existing scmMarkCommitted() method updated only
the Dirstate object and the on-disk SNAPSHOT file.

This diff fixes checkout() and resetCommit() to clear the Dirstate user
directives correctly, and then replaces calls to scmMarkCommitted() with
resetCommit().

Reviewed By: bolinfest

Differential Revision: D4935943

fbshipit-source-id: 5ffcfd5db99f30c730ede202c5e013afa682bac9
2017-04-24 18:06:59 -07:00
Adam Simpkins
841c9006e3 honor trailing slashes on gitignore patterns
Summary:
Update the gitignore code so that patterns ending in a trailing slash correctly
match only directories.  Fortunately we have file type information available
during TreeInode::computeDiff(), so getting the correct file type information
does not add any extra overhead.  The older ignore checking code in
Dirstate.cpp can also do a reasonable job of getting file type information.
That code should also be removed at some point in the future, and use the
TreeInode logic for computing ignore status.

Reviewed By: bolinfest

Differential Revision: D4901326

fbshipit-source-id: 1222c8142876c91e1b80ec937ec84c0c28737224
2017-04-20 21:21:15 -07:00
Adam Simpkins
a6ae3edab9 move eden/utils and eden/fuse into eden/fs
Summary:
This change makes it so that all of the C++ code related to the edenfs daemon
is now contained in the eden/fs subdirectory.

Reviewed By: bolinfest, wez

Differential Revision: D4889053

fbshipit-source-id: d0bd4774cc0bdb5d1d6b6f47d716ecae52391f37
2017-04-14 11:39:02 -07:00
Adam Simpkins
6cced7c449 implement "hg status" using EdenMount::diff()
Summary:
This is an initial pass at implementing "hg status" using the new EdenMount and
TreeInode diff() code.

More work still needs to be done to clean things up here, but this makes
"hg status" more correct in the face of modified files and directories that are
not modified, and vice-versa.

In particular, the following still needs to be cleaned up in future diffs:
- Update the "hg status" code to more correctly process user directives for
  paths that did not appear in the diff output.  This will probably be
  simplified some by correctly updating the user directive list on checkout and
  resetCommit() operations.  It may also make things simpler if we store the
  user directives in a hierarchical path map, rather than in a flat list.
- Update the addAll() code to also use the new diff logic, rather than the
  getStatusForExistingDirectory() function.
- Clean up the locking some.  I think it may be worth changing the code to use
  a single lock to manage both the current commit ID for the mount and the
  dirstate user directives.  We probably should hold this lock for the duration
  of a diff operation.  The diff operation should also return the commit ID
  that it is based against.

Reviewed By: wez

Differential Revision: D4752281

fbshipit-source-id: 6a6d7e2815321b09c75e95aa7460b0da41931dc0
2017-03-30 21:35:00 -07:00
Wez Furlong
4235784907 add .eden "magic" dir
Summary:
It's not really magic because we don't have a virtual directory
inode base any more.  Instead, we mkdir and populate it at mount time.

What is slightly magical about it is that we give it some special powers:

* We know the inode number of the eden dir and prevent unlink operations
  on it or inside it.
* The .eden dir is present in the contents of the root inode and will
  show up when that directory is `readdir`'d
* When resolving a child of a TreeInode by name, we know to return the
  magic `.eden` inode number.  This means that it is possible to `stat`
  and consume the `.eden` directory from any directory inside the eden
  mount, even though it won't show up in `readdir` for those child dirs.

The contents of the `.eden` dir are:

* `socket` - a symlink back to the unix domain socket that our thrift
  server is listening on.  This means that it is a simple
  `readlink(".eden/socket")` operation to discover both whether a directory
  is part of an eden mount and how to talk to the server.

* `root` - a symlink back to the root of this eden mount.  This allows
  using `readlink(".eden/root")` as a simple 1-step operation to find
  the root of an eden mount, and avoids needing to walk up directory
  by directory as is the common pattern for locating `.hg` or `.git`
  dirs.

Reviewed By: simpkins

Differential Revision: D4637285

fbshipit-source-id: 0eabf98b29144acccef5c83bd367493399dc55bb
2017-03-24 23:07:42 -07:00
Adam Simpkins
d1e1e4c621 refactor FileInode to avoid using its parent's TreeInode::Entry
Summary:
Refactor FileInode and FileData so that they no longer store the
TreeInode::Entry that refers to this file.  The TreeInode::Entry is owned by
the parent TreeInode and is not safe to access without holding the TreeInode
contents_ lock, which the FileInode code never did.  Additionally, once a
FileInode is unlocked the TreeInode::Entry is destroyed.  Previously the
FileInode code would retain its pointer to this now invalid data, and could
still dereference it.

In the future I think it might be worth refactoring the FileInode/FileData
split a bit more, but this at least addresses the issues around invalid
accesses to entry_.

Reviewed By: bolinfest

Differential Revision: D4688058

fbshipit-source-id: 17d5d0c4e756afc6e3c4bb279cb1bb5231f3134f
2017-03-10 18:29:29 -08:00
Adam Simpkins
fed342da30 store overlay files using inode numbers instead of paths
Summary:
Refactor the Overlay code to store data using inode numbers rather than the
affected file's path in the repository.  This simplifies the TreeInode code a
bit, as we no longer have to rename overlay files to stay in sync with the file
paths.  This also eliminates some crashes when trying to update overlay files
for inodes that have been unlinked (and hence no longer have a path).  This
also includes a few fixes to avoid writing journal entries for unlinked files
too.  Additionally this contains a few fixes to how mode bits are stored in the
overlay, and fixes a bug where create() was ignoring the mode argument.

Reviewed By: wez

Differential Revision: D4517578

fbshipit-source-id: c1e31497dcf62c322b0deff72b0a02675b0509ab
2017-02-10 14:17:52 -08:00
Adam Simpkins
d1556c3601 implement Future-based recursive Inode lookup
Summary:
Rename the current EdenMount::getInodeBase() function to
EdenMount::getInodeBaseBlocking() and add a new getInodeBase() function that
performs the lookup asynchronously and returns a Future<InodePtr>.

This is implemented with a TreeInode::getChildRecursive() function that allows
asynchronous recursive lookups at any point in the tree.

Reviewed By: bolinfest

Differential Revision: D4427855

fbshipit-source-id: aca55e681a48c848b085b7fc6a13efe6cf0a4e3a
2017-01-25 16:56:12 -08:00
Adam Simpkins
251da81f36 update all copyright statements to "2016-present"
Summary:
Update copyright statements to "2016-present".  This makes our updated lint
rules happy and complies with the recommended license header statement.

Reviewed By: wez, bolinfest

Differential Revision: D4433594

fbshipit-source-id: e9ecb1c1fc66e4ec49c1f046c6a98d425b13bc27
2017-01-20 22:03:02 -08:00
Adam Simpkins
3a73253057 add new InodePtr classes
Summary:
This defines our own custom smart pointer type for Inode objects.  This will
provide us with more control over Inode lifetime, allowing us to decide if we
want to unload them immediately when they become unreferenced, or keep them
around for a while.

This will also allow us to fix some memory management issues around EdenMount
destruction.  Currently we destroy the EdenMount immediately when it is
unmounted.  This can cause issues if other parts of the code are still holding
references to Inode objects from this EdenMount.  Our custom InodePtr class
will also allow us to delay destroying the EdenMount until all of its Inodes
have been destroyed.

This diff adds the new pointer types and updates the code to use them, but does
not actually implement destroying unreferenced inodes yet.  The logic for that
has proven to be slightly subtle; I will split it out into its own separate
diff.

Reviewed By: wez

Differential Revision: D4351072

fbshipit-source-id: 7a9d81cbd226c9662a79a2f2ceda82fe2651f312
2017-01-17 15:03:20 -08:00
Adam Simpkins
9ba08b9d1e Update FileData to store a pointer to its FileInode
Summary:
Have FileData objects store a pointer to the FileInode that owns them, rather
than just to the EdenMount.

FileData objects still store a direct pointer to their FileInode's mutex_ and
entry_.  It's potentially worth just accessing these through inode_ in the
future.

Reviewed By: bolinfest

Differential Revision: D4348103

fbshipit-source-id: 1f8497979bfc89c6a192ca0195209335db0d911c
2016-12-22 15:36:29 -08:00
Adam Simpkins
5b346bbff2 update code to use InodeMap
Summary:
This updates all of the eden code to use the new InodeMap class.  This replaces
the InodeNameManager class and the unordered_map previously stored in the
EdenDispatcher.

Reviewed By: bolinfest

Differential Revision: D4325750

fbshipit-source-id: d80ae7581ba79ca2b63155e184995a3e83e85dc1
2016-12-22 15:36:29 -08:00
Michael Bolin
1ea4af26c5 Use the StatusCode from Thrift as the canonical representation.
Summary:
Most of the work for this diff was achieved via:

```
find eden -type f | xargs sed -i -e 's#ThriftHgStatusCode#StatusCode#g'
find eden -type f | xargs sed -i -e 's#HgStatusCode#StatusCode#g'
```

Reviewed By: simpkins

Differential Revision: D4237975

fbshipit-source-id: 2ee04a89101291c8972ac7bd3ff6cca92cbb0799
2016-12-16 17:49:05 -08:00
Michael Bolin
dfd1634170 Move batch processing of hg add <directory> to the server.
Summary:
This should make the common case of `hg add` or `hg add .` much more efficient
because it no longer performs a walk of the entire repository from the client
side.

Reviewed By: simpkins

Differential Revision: D4317871

fbshipit-source-id: 7061553fe0de0c4afa84b4f835316965088675e8
2016-12-15 13:02:38 -08:00
Adam Simpkins
fc202f81e5 add new Future-based APIs to ObjectStore
Summary:
Update the ObjectStore and BackingStore classes to have APIs that return
folly::Future objects, rather than blocking until the requested data is loaded.

For now most users still call the blocking versions of getBlob() and getTree().
Furthermore, all of the Future-based implementations actually still block
until the data is ready.  I will update the code to use these new APIs in
future diffs, and then deprecate the non-future based versions.

Reviewed By: bolinfest

Differential Revision: D4318055

fbshipit-source-id: a250c23b418e69b597a4c6a95dbe80c56da5c53b
2016-12-13 18:12:21 -08:00
Michael Bolin
5e08c5e1a7 Introduce Dirstate::getStatusForDirectory().
Summary:
This will make it easier to implement `hg add <directory>` such that
`<directory>` is expanded on the server rather than on the client.

Reviewed By: simpkins

Differential Revision: D4318735

fbshipit-source-id: cf0e89bd95eb58304cd23e70beb77bc7151f2c5c
2016-12-13 12:00:22 -08:00
Michael Bolin
3196a05b86 Filter .hg directory on the server when computing getStatus().
Summary:
Previously, `.hg` entries were filtered when calling `hg status` on the client,
but this should be happening on the server. This updates the logic in
`Dirstate.cpp` to properly exclude `.hg` when traversing the overlay for
modified directories, so this will eliminate a bunch of unnecessary computation
and simplify the client.

I'm unsure of how best to implement the ownership relation for the set that
contains `.hg`. Please advise! I know that I could categorically exclude
`".hg"` in `getModifiedDirectoriesRecursive()`, but I haven't used it enough
scenarios yet to be sure that's the right thing to do. For example, if it were
a Git repo, arguably we should consider `.hg` and not `.git`, so I could also
require the set to be a parameter of `Dirstate`, but I want to make sure I get
the ownership stuff right.

Reviewed By: simpkins

Differential Revision: D4316531

fbshipit-source-id: a0f13ca1c3c620b686435c8aa6485ba4e850f043
2016-12-13 12:00:22 -08:00
Michael Bolin
1619acb2c7 Change getModifiedDirectoriesForMount() to return a vector instead of a unique_ptr.
Summary: Follow-up on a comment that came out of the review for D4249214.

Reviewed By: simpkins

Differential Revision: D4314979

fbshipit-source-id: 76384474092e6fd48394f6faf8b84ba6220c556a
2016-12-13 12:00:22 -08:00
Adam Simpkins
c81ee4c997 update getSha1ForBlob() to return the Hash by value
Summary:
Hash objects are small enough (20 bytes) that it isn't worth allocating them on
the heap.  This updates LocalStore::getSha1ForBlob() to return a
folly::Optional<Hash>, and ObjectStore::getSha1ForBlob() to return a plain
Hash.

Reviewed By: bolinfest

Differential Revision: D4298162

fbshipit-source-id: 9cf54f2997ba8c3b2346db315a2aca41e580b078
2016-12-12 17:50:36 -08:00
Adam Simpkins
89fb0f811b add InodePtr, TreeInodePtr, and FileInodePtr type names
Summary:
Define InodePtr, TreeInodePtr, and FileInodePtr as aliases for std::shared_ptr
of the underlying inode type.  This also updates all of the code to use these
new type names.

This will make it easier swap out std::shared_ptr with a custom pointer type in
the future.  (I believe we will need a custom type in the future so that we
can have more precise control of the reference counting so we can load and
unload Inode objects on demand.  std::shared_ptr::unique() doesn't quite
provide the flexibility we need, and is also being deprecated in C++17.)

Reviewed By: bolinfest

Differential Revision: D4297791

fbshipit-source-id: 1080945649290e676f62689592159f1166159b20
2016-12-12 17:50:35 -08:00
Michael Bolin
309d0da769 Make it possible to make a commit from Eden.
Summary:
In this revision, we override `committablectx.markcommitted()` to make a Thrift
call to Eden to record the new commit. For now, this defers the need for us to
implement `edendirstate.normal()`, though we expect we will have to provide a
proper implementation at some point.

Because `hg update` is not implemented yet, this puts us in a funny state where
we have to restart eden after `hg commit` to ensure all of our `TreeEntry` and
other in-memory data structures are in the correct state.

Reviewed By: simpkins

Differential Revision: D4249214

fbshipit-source-id: 8ec06dfee67070f008dd93a0ee6c810ce75d2faa
2016-12-10 01:07:06 -08:00
Michael Bolin
d60888d210 Support hg remove <directory>.
Summary:
This refines the initial support for `hg remove` by adding support for
directories.

Reviewed By: simpkins

Differential Revision: D4270546

fbshipit-source-id: c97dfea555ad489ddda01ad2587f1856b1953e02
2016-12-09 19:36:17 -08:00
Michael Bolin
017b35bca6 Add support for hg remove for an individual file.
Summary:
This adds support for `hg remove <file>` in Eden and sets up some of the scaffolding
to support `hg remove <directory>`.

Note that the Thrift API for `scmRemove()` is slightly different than that of `scmAdd()`
in that it returns a list of error messages to display to the user rather than throwing
an exception. In practice, for batch operations, Mercurial will allow some operations
to succeed while others may fail, so it is possible to have multiple error messages to
return.

Unlike the current implementation of `hg add`, this does the directory traversal
on the server rather than on the client. Once we work out how to do this for
`hg remove`, we should figure out how to reuse the logic for `hg add`.

Reviewed By: simpkins

Differential Revision: D4263068

fbshipit-source-id: d084774d562c48c59664f313eba229d4197929fe
2016-12-09 19:36:17 -08:00
Michael Bolin
ab750945fe Fix an edge case in hg remove handling.
Summary:
The check on the type of an exception was inverted, which meant
`hg remove <path>` would throw an exception if the parent directory of `path`
did not exist. This is not correct because the user should be able to expect
to do:

```
mkdir -p /tmp/example
cd /tmp/example
hg init
mkdir mydir
touch mydir/a
hg add mydir/a
rm -rf mydir/a
hg rm mydir/a
```

In this scenario, `mydir` does not exist when `hg rm` is called, but the command
should succeed, making `mydir/a` no longer tracked.

Reviewed By: simpkins

Differential Revision: D4268451

fbshipit-source-id: 517d81252aa8e4b6bd1a32dece14776a9f7dd6f7
2016-12-09 16:11:59 -08:00
Michael Bolin
c58ab82952 Ignore "group" and "other" bits of mode_t when diff'ing tree entries.
Summary:
A `TreeEntry` reports a `mode_t` whose "group" and "other" bits are set in a way
that reflects the "owner" bits (so they are non-zero). By comparison, the `mode`
of a `TreeInode::Entry` will reflect the permissions on disk in the overlay (if
the file is materialized). In general, the overlay bits will probably be the
same as those in the `TreeEntry` since we expect the user will rarely mess
with the "group" and "other" bits, but we're seeing a difference more often
right now because of t14953681.

Reviewed By: simpkins

Differential Revision: D4298214

fbshipit-source-id: 2919be94c6bba61135838ee86bbc68aa4031af7c
2016-12-09 15:42:04 -08:00
Adam Simpkins
da04640287 move InodeBase from eden/fuse to eden/fs/inodes
Summary:
Move the InodeBase class from the lower-level fusell code up to the
eden/fs/inodes layer, now that everything else that uses it is in
eden/fs/inodes.

I plan to start changing the ownership model of inode objects a bit, and this
will allow the InodeBase class to interact with EdenDispatcher and other
classes in eden/fs/inodes.

Reviewed By: bolinfest

Differential Revision: D4283392

fbshipit-source-id: 9e1d6fb81dc223f905847cbe8d165a40ad0aca4d
2016-12-07 20:05:20 -08:00
Adam Simpkins
f9a99e0ba1 remove fusell::DirInode and fusell::FileInode
Summary:
Now that the EdenDispatcher class has been moved into eden/fs, we no longer
need the distinction between TreeInode and fusell::DirInode, and FileInode and
fusell::FileInode.

This diff deletes the fusell versions of these classes, and updates all of the
code to always directly use TreeInode and FileInode.  This allows us to get rid
of the remaining dynamic_casts between these pairs of classes.

Reviewed By: bolinfest

Differential Revision: D4257165

fbshipit-source-id: e2b6f328b9605ca0e2882f5cf7a3983fb4470cdf
2016-12-01 17:52:31 -08:00
Adam Simpkins
2a08798f88 move InodeDispatcher from eden/fuse to eden/fs
Summary:
Move the InodeDispatcher class out of the lower-level fusell namespace in
eden/fuse and into the higher-level eden code in eden/fs/inodes.  I also
renamed it from InodeDispatcher to EdenDispatcher, in anticipation of it
getting more eden-specific functionality in the future.

The fusell::MountPoint class is now independent of the Dispatcher type, and can
work with any Dispatcher subclass.  Previously the MountPoint class was
responsible for owning the InodeDispatcher object.  Now its caller (EdenMount
in our case) is responsible for supplying a Dispatcher object that is owned
externally.

Several parts of EdenDispatcher had to be updated as a result of the namespace
move, but I tried to keep this change somewhat minimal.  I did update it from
using fusell::DirInode and fusell::FileInode to eden's TreeInode and FileInode
classes directly.  However, there still remains more clean-up work to do.  I
will split remaining changes out into upcoming diffs.

Reviewed By: bolinfest

Differential Revision: D4257163

fbshipit-source-id: dc9c2526640798f9f924ae2531218ba2c45d1d0a
2016-12-01 17:52:31 -08:00
Adam Simpkins
0af5d187c2 move MountPoint::getInode* methods into EdenMount
Summary:
Move getInodeBaseForPath(), getDirInodeForPath(), and getFileInodeForPath()
entirely into the EdenMount class, and make sure all call sites are using the
EdenMount methods rather than the MountPoint methods.

Reviewed By: bolinfest

Differential Revision: D4257153

fbshipit-source-id: d528cfad174757c3c9f23e62a0616f8bf1976da7
2016-12-01 17:52:31 -08:00
Adam Simpkins
f0082e9178 call EdenMount::getDispatcher() and EdenMount::getRootInode()
Summary:
Update all code in eden/fs to call EdenMount::getDispatcher() instead of
getting the underlying MountPoint from the EdenMount and then calling
getDispatcher() on it.  This will allow me to move the InodeDispatcher from
MountPoint to EdenMount in a subsequent diff.  This also simplifies many of the
callers of this method.

Additionally, add an EdenMount::getRootInode() method, and update call sites to
use this rather than having to look up the InodeDispatcher and call
getRootInode() or getDirInode(FUSE_ROOT_ID) on it.

Reviewed By: bolinfest

Differential Revision: D4257152

fbshipit-source-id: 33e6f6b8853db2a88f4f2c221122eea50e796390
2016-12-01 17:52:31 -08:00
Adam Simpkins
1d00843bb3 short-term code for checking ignore status
Summary:
This updates the Dirstate to also check if untracked files are ignored or not.

This is somewhat inefficient, as we have perform a separate check for each
untracked file we find.  Ideally we should perform all of the dirstate
computation in a single tree walk, and check ignore status as we go.  This
would allow us to skip ignored directories entirely, rather than potentially
having to check each file inside them.  I intend to work on cleaning this up in
the future, but it will require refactoring some of the inode code first.

Reviewed By: bolinfest

Differential Revision: D4225308

fbshipit-source-id: 49e444c85cbb6ede11cc6e19052fdd16cf8aab9f
2016-12-01 17:52:30 -08:00
Adam Simpkins
74fa63d0d1 store a pointer to the EdenMount in the Dirstate
Summary:
Update the Dirstate to store a pointer to the EdenMount object that owns it,
rather than storing pointers to the lower-level MountPoint and ObjectStore
objects.

This change is necessary in order for me to move more functionality from
MountPoint to EdenMount.  (In particular, I plan to move the InodeDispatcher to
the EdenMount.)

As part of this change I also started moving some APIs from MountPoint to
EdenMount.  For now the EdenMount versions are just thin wrappers on top of the
MountPoint APIs.  I will move the functionality directly into EdenMount in a
future diff.

Reviewed By: bolinfest

Differential Revision: D4255675

fbshipit-source-id: 93749c6516c3cea4b4ae93de4ca49ddf05f4d260
2016-12-01 17:52:30 -08:00
Adam Simpkins
87c2d46ce3 write dirstate file atomically
Summary:
Write the dirstate data using the new folly::writeFileAtomic() function.
This ensures that the dirstate file will always contain full, valid contents,
even if we crash or run out of disk space partway through writing out the data.

This diff also includes a couple other minor tweaks:
- Update Dirstate to store the DirstatePersistence object directly inline,
  rather than allocating it separately on the heap.
- Update the DirstatePersistenceTest code to prefix temporary directories with
  "eden_test".  This just makes it easier to tell if the tests fail to clean up
  after themselves for any reason.

Reviewed By: bolinfest

Differential Revision: D4254027

fbshipit-source-id: 6b6601b65aeacdee998a6c4260e972d5fb2426ac
2016-12-01 17:52:30 -08:00
Adam Simpkins
aaa3332644 simplify EdenMount and Dirstate construction
Summary:
This cleans up construction of the EdenMount and Dirstate objects:

- The EdenMount constructor is now responsible for creating the Overlay and
  Dirstate objects.
- The Dirstate constructor is now responsible for loading the
  DirstatePersistence file.
- The EdenMount now takes ownership of the ClientConfig object, and stores it
  for later use.
- The ClientConfig object now has a method to get the path to the
  DirstatePersistence file.
- I added a ClientConfig::createTestConfig() method, so that the TestMount code
  can now use the same EdenMount constructor as the normal code.

This simplifies the logic in EdenServiceHandler and TestMount, and makes some
of the initialization dependencies a little bit simpler.

This change is necessary in order for me to move some logic from
fusell::MountPoint into EdenMount.  The Dirstate object will need a pointer
back to its EdenMount object, and this diff enables that.

Reviewed By: bolinfest

Differential Revision: D4249393

fbshipit-source-id: 439786accbf48c8696dbc6ca4fe77a4c6bdeab65
2016-12-01 17:52:30 -08:00
Adam Simpkins
0e613bd96b rename TreeEntryFileInode to FileInode
Summary:
Rename TreeEntryFileInode to FileInode, and TreeEntryFileHandle to FileHandle.
These class names were long and awkward.

It's slightly unfortunate that we now have classes named both
eden::fuse::FileInode and eden::fuse::fusell::FileInode, but I don't believe
this should cause any major problems.  If we want to eliminate these name
collisions in the future I would advocate for renaming the fusell versions to
something like "FileInodeIface".

Reviewed By: bolinfest

Differential Revision: D4217909

fbshipit-source-id: 899672a318d7ae39595f2c18e171f8fd6cebedc6
2016-11-30 15:49:13 -08:00
Michael Bolin
7d79026f1a Use const auto& base to avoid doing a copy.
Reviewed By: simpkins

Differential Revision: D4244182

fbshipit-source-id: ffdb55f59af45b0675336d4b9450c8b9fb90af5a
2016-11-29 11:19:08 -08:00
Michael Bolin
8c19620e62 Support directories in Dirstate::computeDelta().
Summary:
Previous to this commit, the `Dirstate` logic only worked correctly when the
changes occurred in the root directory. Obviously that is very limiting, so this
adds support for changes in arbitrary directories at arbitrary depths.

This also introduces support for things like a file being replaced by a
directory of same name or vice versa. The tests have been updated to verify
these cases.

One interesting design change that fell out of this was the addition of the
`removedDirectories` field to the `DirectoryDelta` struct. As you can see,
all entries in a removed directory need to be processed by the new
`addDeletedEntries()` function. These require special handling because deleted
directories do not show up in the traversal of modified directories.

In contrast, new directories do show up in the traversal, so they require a
different type of special handling. Specifically, this call will return `NULL`:

```
auto tree = getTreeForDirectory(directory, rootTree.get(), objectStore);
```

When this happens, we must pass an empty vector of tree entries to
`computeDelta()` rather than `&tree->getTreeEntries()`. Admittedly, the special
case of new directories is much simpler than the special case of deleted ones.

Reviewed By: simpkins

Differential Revision: D4219478

fbshipit-source-id: 4c805ba3d7688c4d12ab2ced003a7f5c19ca07eb
2016-11-29 06:51:14 -08:00
Michael Bolin
98c56736cd Add some error messages for some DCHECKs.
Reviewed By: simpkins

Differential Revision: D4219453

fbshipit-source-id: 391dd5ec57e01b2f09154eb991eb3e8e2e969f62
2016-11-26 12:01:41 -08:00