Summary:
Previously, `.hg` entries were filtered when calling `hg status` on the client,
but this should be happening on the server. This updates the logic in
`Dirstate.cpp` to properly exclude `.hg` when traversing the overlay for
modified directories, so this will eliminate a bunch of unnecessary computation
and simplify the client.
I'm unsure of how best to implement the ownership relation for the set that
contains `.hg`. Please advise! I know that I could categorically exclude
`".hg"` in `getModifiedDirectoriesRecursive()`, but I haven't used it enough
scenarios yet to be sure that's the right thing to do. For example, if it were
a Git repo, arguably we should consider `.hg` and not `.git`, so I could also
require the set to be a parameter of `Dirstate`, but I want to make sure I get
the ownership stuff right.
Reviewed By: simpkins
Differential Revision: D4316531
fbshipit-source-id: a0f13ca1c3c620b686435c8aa6485ba4e850f043
Summary: These should have been added as part of D4270526.
Reviewed By: simpkins
Differential Revision: D4315382
fbshipit-source-id: 5920ff38f9cc63540e4813e8ab40f79ad46f9ec1
Summary: Follow-up on a comment that came out of the review for D4249214.
Reviewed By: simpkins
Differential Revision: D4314979
fbshipit-source-id: 76384474092e6fd48394f6faf8b84ba6220c556a
Summary:
Move the FakeObjectStoreTest class into fs/testharness, along with the
TestMount and TestBackingStore classes. This simply consolidates the test
utility code into a single location.
Reviewed By: bolinfest
Differential Revision: D4317517
fbshipit-source-id: 4e19590c5ffde88b66f2c8d4a964352ec349031c
Summary:
Hash objects are small enough (20 bytes) that it isn't worth allocating them on
the heap. This updates LocalStore::getSha1ForBlob() to return a
folly::Optional<Hash>, and ObjectStore::getSha1ForBlob() to return a plain
Hash.
Reviewed By: bolinfest
Differential Revision: D4298162
fbshipit-source-id: 9cf54f2997ba8c3b2346db315a2aca41e580b078
Summary:
Add comments in ObjectStore.h documenting the fact that the get* APIs all throw
std::domain_error when the specified ID does not exist, and never return
nullptr.
Also update the FakeObjectStore class used for testing to follow this behavior.
Reviewed By: bolinfest
Differential Revision: D4298160
fbshipit-source-id: c5509bb3aa2ed76619b06b733ad240aaa5f00862
Summary:
In addition to storing the SHA-1 of each file's contents, also store the size.
This will allow us to more quickly look up the file size, without having to
retreive the file size.
I haven't yet added an API to ObjectStore to retreive the full BlobMetadata
object; I will do that in a subsequent diff. One benefit for now is that this
does avoid double-computing the SHA-1 in ObjectStore::getSha1ForBlob() if we
had to load the blob.
Reviewed By: bolinfest
Differential Revision: D4298157
fbshipit-source-id: 4d83ebfa631c93fcef06ca1cd0ba0e1a70a2476d
Summary:
Define InodePtr, TreeInodePtr, and FileInodePtr as aliases for std::shared_ptr
of the underlying inode type. This also updates all of the code to use these
new type names.
This will make it easier swap out std::shared_ptr with a custom pointer type in
the future. (I believe we will need a custom type in the future so that we
can have more precise control of the reference counting so we can load and
unload Inode objects on demand. std::shared_ptr::unique() doesn't quite
provide the flexibility we need, and is also being deprecated in C++17.)
Reviewed By: bolinfest
Differential Revision: D4297791
fbshipit-source-id: 1080945649290e676f62689592159f1166159b20
Summary:
This updates the InodeBase code to track its location in the filesystem.
Since we do not support hard links, each inode has a single path where it
exists.
Tracking this data allows us to implement getPath() as a method of InodeBase.
This code is not really complete yet, but it seems worth getting the current
code in as-is. The location data is not updated properly on unlinks or
renames, but it looks like the existing InodeNameManager code does not get
updated either. I am working on some additional refactoring of inode object
management, and it will be easier to come back and fix the unlink and rename
handling after this refactoring is further along.
Reviewed By: bolinfest
Differential Revision: D4297591
fbshipit-source-id: 82ceb326e4f9c376f627b1d8f49bb7db3cfc2b0b
Summary:
In this revision, we override `committablectx.markcommitted()` to make a Thrift
call to Eden to record the new commit. For now, this defers the need for us to
implement `edendirstate.normal()`, though we expect we will have to provide a
proper implementation at some point.
Because `hg update` is not implemented yet, this puts us in a funny state where
we have to restart eden after `hg commit` to ensure all of our `TreeEntry` and
other in-memory data structures are in the correct state.
Reviewed By: simpkins
Differential Revision: D4249214
fbshipit-source-id: 8ec06dfee67070f008dd93a0ee6c810ce75d2faa
Summary:
This refines the initial support for `hg remove` by adding support for
directories.
Reviewed By: simpkins
Differential Revision: D4270546
fbshipit-source-id: c97dfea555ad489ddda01ad2587f1856b1953e02
Summary:
`getEntryForPath()` is helpful when we don't know (or care) whether the path
corresponds to a file or a directory.
Reviewed By: simpkins
Differential Revision: D4270526
fbshipit-source-id: 6dd2f76df1749e040dda788a74055d9da2156a4d
Summary:
This adds support for `hg remove <file>` in Eden and sets up some of the scaffolding
to support `hg remove <directory>`.
Note that the Thrift API for `scmRemove()` is slightly different than that of `scmAdd()`
in that it returns a list of error messages to display to the user rather than throwing
an exception. In practice, for batch operations, Mercurial will allow some operations
to succeed while others may fail, so it is possible to have multiple error messages to
return.
Unlike the current implementation of `hg add`, this does the directory traversal
on the server rather than on the client. Once we work out how to do this for
`hg remove`, we should figure out how to reuse the logic for `hg add`.
Reviewed By: simpkins
Differential Revision: D4263068
fbshipit-source-id: d084774d562c48c59664f313eba229d4197929fe
Summary:
The check on the type of an exception was inverted, which meant
`hg remove <path>` would throw an exception if the parent directory of `path`
did not exist. This is not correct because the user should be able to expect
to do:
```
mkdir -p /tmp/example
cd /tmp/example
hg init
mkdir mydir
touch mydir/a
hg add mydir/a
rm -rf mydir/a
hg rm mydir/a
```
In this scenario, `mydir` does not exist when `hg rm` is called, but the command
should succeed, making `mydir/a` no longer tracked.
Reviewed By: simpkins
Differential Revision: D4268451
fbshipit-source-id: 517d81252aa8e4b6bd1a32dece14776a9f7dd6f7
Summary:
A `TreeEntry` reports a `mode_t` whose "group" and "other" bits are set in a way
that reflects the "owner" bits (so they are non-zero). By comparison, the `mode`
of a `TreeInode::Entry` will reflect the permissions on disk in the overlay (if
the file is materialized). In general, the overlay bits will probably be the
same as those in the `TreeEntry` since we expect the user will rarely mess
with the "group" and "other" bits, but we're seeing a difference more often
right now because of t14953681.
Reviewed By: simpkins
Differential Revision: D4298214
fbshipit-source-id: 2919be94c6bba61135838ee86bbc68aa4031af7c
Summary:
Move the InodeBase class from the lower-level fusell code up to the
eden/fs/inodes layer, now that everything else that uses it is in
eden/fs/inodes.
I plan to start changing the ownership model of inode objects a bit, and this
will allow the InodeBase class to interact with EdenDispatcher and other
classes in eden/fs/inodes.
Reviewed By: bolinfest
Differential Revision: D4283392
fbshipit-source-id: 9e1d6fb81dc223f905847cbe8d165a40ad0aca4d
Summary:
`stdin`, `stdout` and `stderr` are macros that expand to function calls with the MSVC CRT implementation. This is also the case for musl-libc. This means that Subprocess simply cannot be compiled on those platforms without changing the API.
To solve that, we change the API and deprecate the old API.
For more fun, `stdin`, `stdout` and `stderr` are also macros in glibc, they just expand to other identifiers rather than a function call.
Reviewed By: yfeldblum
Differential Revision: D4229544
fbshipit-source-id: 97f1a3b228b83cfdcaffee56d729063ea235e608
Summary:
Update the TestMount APIs to use the term "TreeInode" instead of "DirInode" now
that the DirInode class is no more.
Reviewed By: bolinfest
Differential Revision: D4257171
fbshipit-source-id: 2407dfc25405f7161987260c0299f1723831e264
Summary:
Now that the EdenDispatcher class has been moved into eden/fs, we no longer
need the distinction between TreeInode and fusell::DirInode, and FileInode and
fusell::FileInode.
This diff deletes the fusell versions of these classes, and updates all of the
code to always directly use TreeInode and FileInode. This allows us to get rid
of the remaining dynamic_casts between these pairs of classes.
Reviewed By: bolinfest
Differential Revision: D4257165
fbshipit-source-id: e2b6f328b9605ca0e2882f5cf7a3983fb4470cdf
Summary:
Move the InodeDispatcher class out of the lower-level fusell namespace in
eden/fuse and into the higher-level eden code in eden/fs/inodes. I also
renamed it from InodeDispatcher to EdenDispatcher, in anticipation of it
getting more eden-specific functionality in the future.
The fusell::MountPoint class is now independent of the Dispatcher type, and can
work with any Dispatcher subclass. Previously the MountPoint class was
responsible for owning the InodeDispatcher object. Now its caller (EdenMount
in our case) is responsible for supplying a Dispatcher object that is owned
externally.
Several parts of EdenDispatcher had to be updated as a result of the namespace
move, but I tried to keep this change somewhat minimal. I did update it from
using fusell::DirInode and fusell::FileInode to eden's TreeInode and FileInode
classes directly. However, there still remains more clean-up work to do. I
will split remaining changes out into upcoming diffs.
Reviewed By: bolinfest
Differential Revision: D4257163
fbshipit-source-id: dc9c2526640798f9f924ae2531218ba2c45d1d0a
Summary:
Drop the MountPoint::setRootInode() method, and have EdenMount perform the
operation directly on the InodeDispatcher.
Reviewed By: bolinfest
Differential Revision: D4257158
fbshipit-source-id: af4a696de2d36979c658972104361f225f482338
Summary:
Update call sites in eden/fs to access the InodeNameManager through the
EdenMount object rather than the MountPoint.
It turns out that there was only one call site in TreeInode, and all other
callers in eden/fs get it indirectly via TreeInode::getNameMgr().
Reviewed By: bolinfest
Differential Revision: D4257156
fbshipit-source-id: 9f0212134b20c8dd8943827c17aa16ee7274bc36
Summary:
Move getInodeBaseForPath(), getDirInodeForPath(), and getFileInodeForPath()
entirely into the EdenMount class, and make sure all call sites are using the
EdenMount methods rather than the MountPoint methods.
Reviewed By: bolinfest
Differential Revision: D4257153
fbshipit-source-id: d528cfad174757c3c9f23e62a0616f8bf1976da7
Summary:
Update all code in eden/fs to call EdenMount::getDispatcher() instead of
getting the underlying MountPoint from the EdenMount and then calling
getDispatcher() on it. This will allow me to move the InodeDispatcher from
MountPoint to EdenMount in a subsequent diff. This also simplifies many of the
callers of this method.
Additionally, add an EdenMount::getRootInode() method, and update call sites to
use this rather than having to look up the InodeDispatcher and call
getRootInode() or getDirInode(FUSE_ROOT_ID) on it.
Reviewed By: bolinfest
Differential Revision: D4257152
fbshipit-source-id: 33e6f6b8853db2a88f4f2c221122eea50e796390
Summary:
This updates the Dirstate to also check if untracked files are ignored or not.
This is somewhat inefficient, as we have perform a separate check for each
untracked file we find. Ideally we should perform all of the dirstate
computation in a single tree walk, and check ignore status as we go. This
would allow us to skip ignored directories entirely, rather than potentially
having to check each file inside them. I intend to work on cleaning this up in
the future, but it will require refactoring some of the inode code first.
Reviewed By: bolinfest
Differential Revision: D4225308
fbshipit-source-id: 49e444c85cbb6ede11cc6e19052fdd16cf8aab9f
Summary:
Update the Dirstate to store a pointer to the EdenMount object that owns it,
rather than storing pointers to the lower-level MountPoint and ObjectStore
objects.
This change is necessary in order for me to move more functionality from
MountPoint to EdenMount. (In particular, I plan to move the InodeDispatcher to
the EdenMount.)
As part of this change I also started moving some APIs from MountPoint to
EdenMount. For now the EdenMount versions are just thin wrappers on top of the
MountPoint APIs. I will move the functionality directly into EdenMount in a
future diff.
Reviewed By: bolinfest
Differential Revision: D4255675
fbshipit-source-id: 93749c6516c3cea4b4ae93de4ca49ddf05f4d260
Summary:
Write the dirstate data using the new folly::writeFileAtomic() function.
This ensures that the dirstate file will always contain full, valid contents,
even if we crash or run out of disk space partway through writing out the data.
This diff also includes a couple other minor tweaks:
- Update Dirstate to store the DirstatePersistence object directly inline,
rather than allocating it separately on the heap.
- Update the DirstatePersistenceTest code to prefix temporary directories with
"eden_test". This just makes it easier to tell if the tests fail to clean up
after themselves for any reason.
Reviewed By: bolinfest
Differential Revision: D4254027
fbshipit-source-id: 6b6601b65aeacdee998a6c4260e972d5fb2426ac
Summary:
This cleans up construction of the EdenMount and Dirstate objects:
- The EdenMount constructor is now responsible for creating the Overlay and
Dirstate objects.
- The Dirstate constructor is now responsible for loading the
DirstatePersistence file.
- The EdenMount now takes ownership of the ClientConfig object, and stores it
for later use.
- The ClientConfig object now has a method to get the path to the
DirstatePersistence file.
- I added a ClientConfig::createTestConfig() method, so that the TestMount code
can now use the same EdenMount constructor as the normal code.
This simplifies the logic in EdenServiceHandler and TestMount, and makes some
of the initialization dependencies a little bit simpler.
This change is necessary in order for me to move some logic from
fusell::MountPoint into EdenMount. The Dirstate object will need a pointer
back to its EdenMount object, and this diff enables that.
Reviewed By: bolinfest
Differential Revision: D4249393
fbshipit-source-id: 439786accbf48c8696dbc6ca4fe77a4c6bdeab65
Summary:
Rename TreeEntryFileInode to FileInode, and TreeEntryFileHandle to FileHandle.
These class names were long and awkward.
It's slightly unfortunate that we now have classes named both
eden::fuse::FileInode and eden::fuse::fusell::FileInode, but I don't believe
this should cause any major problems. If we want to eliminate these name
collisions in the future I would advocate for renaming the fusell versions to
something like "FileInodeIface".
Reviewed By: bolinfest
Differential Revision: D4217909
fbshipit-source-id: 899672a318d7ae39595f2c18e171f8fd6cebedc6
Summary:
It is important to make the local of the Hg Eden extension configurable so we
can develop it.
Reviewed By: simpkins
Differential Revision: D4223733
fbshipit-source-id: 1746d0cda590f24fc48e0d781b557824ff5f60d3
Summary:
Previous to this commit, the `Dirstate` logic only worked correctly when the
changes occurred in the root directory. Obviously that is very limiting, so this
adds support for changes in arbitrary directories at arbitrary depths.
This also introduces support for things like a file being replaced by a
directory of same name or vice versa. The tests have been updated to verify
these cases.
One interesting design change that fell out of this was the addition of the
`removedDirectories` field to the `DirectoryDelta` struct. As you can see,
all entries in a removed directory need to be processed by the new
`addDeletedEntries()` function. These require special handling because deleted
directories do not show up in the traversal of modified directories.
In contrast, new directories do show up in the traversal, so they require a
different type of special handling. Specifically, this call will return `NULL`:
```
auto tree = getTreeForDirectory(directory, rootTree.get(), objectStore);
```
When this happens, we must pass an empty vector of tree entries to
`computeDelta()` rather than `&tree->getTreeEntries()`. Admittedly, the special
case of new directories is much simpler than the special case of deleted ones.
Reviewed By: simpkins
Differential Revision: D4219478
fbshipit-source-id: 4c805ba3d7688c4d12ab2ced003a7f5c19ca07eb
Summary:
This is a better fix for the quick fix introduced by D4198939.
It turns out that the `EdenMount` does not need to take ownership
of the `ClientConfig`, so removing the `std::move()` makes this code
much simpler because instead of declaring a bunch of variables
early in `mountImpl()` so that we can "hold on" to them before `EdenMount`
takes ownership of the `ClientConfig`, we can declare them closer to where they
are actually used.
Note that we may want `EdenMount` to actually take ownership of the
`ClientConfig` in the future, but we'll cross that bridge when we come to it.
Reviewed By: simpkins
Differential Revision: D4199000
fbshipit-source-id: 67411a9a5ef630a9d481aebc94631c79da4ab2c4
Summary:
This also introduces the change where the `EdenMount` creates
and takes ownership of the `Dirstate`.
To clean some of this up, I had to expose a `getEdenDir()` method on `EdenServer`
that returns an `AbsolutePathPiece`. This was previously stored internally as a
`std::string`, so I had to clean up a bunch of path construction that was using `edenDir_`.
Reviewed By: simpkins
Differential Revision: D4123763
fbshipit-source-id: 270b182521c1a84bb054832f4b5f92af849d67e4
Summary:
Previously, `Dirstate` took a `std::shared_ptr<EdenMount>`, but now it takes
pointers to a `MountPoint` and an `ObjectStore` because it does not need the
entire `EdenMount`. Ultimately, this will enable us to have `EdenMount` create
the `Dirstate` itself, but that will be done in a follow-up commit.
Fortunately, it was pretty easy to remove the references to `edenMount_` in
`Dirstate.cpp` and rewrite them in terms of `mountPoint_` or `objectStore_`.
The one thing that I also decided to move was `getModifiedDirectoriesForMount()`
because I already needed to create an `EdenMounts` file (admittedly not a
great name) to collect some utility functions that use members of an `EdenMount`
while not having access to the `EdenMount` itself.
As part of this change, all of the code in `eden/fs/model/hg` has been moved to
`eden/fs/inodes` so that it is alongside `EdenMount`. We are going to change
the `Dirstate` from an Hg-specific concept to a more general concept.
`LocalDirstatePersistence` is no longer one of two implementations of
`DirstatePersistence`. (The other was `FakeDirstatePersistence`.) Now there is
just one concrete implementation called `DirstatePersistence` that takes its
implementation from `LocalDirstatePersistence`. Because there is no longer a
`FakeDirstatePersistence`, `TestMount` must create a `DirstatePersistence` that
uses a `TemporaryFile`.
Because `TestMount` now takes responsibility for creating the `Dirstate`, it
must also give callers the ability to specify the user directives. To that end,
`TestMountBuilder` got an `addUserDirectives()` method while `TestMount` got a
`getDirstate()` method. Surprisingly, `TestMountTest` did not need to be updated
as part of this revision, but `DirstateTest` needed quite a few updates
(which were generally mechanical).
Reviewed By: simpkins
Differential Revision: D4230154
fbshipit-source-id: 9b8cb52b45ef5d75bc8f5e62a58fcd1cddc32bfa
Summary: This is an implementation of DirstatePersistence that persists data to a local file.
Reviewed By: simpkins
Differential Revision: D4181868
fbshipit-source-id: 7177b2ef67cd3aec56e5ad10f41169cc5ec69d81
Summary:
Update the gitignore handling code to perform pattern matching the same way git
does. Previously the code just called the standard fnmatch() function, which
does not handle "**" in patterns the same way git does.
This includes our own new implementation of glob pattern matching. I did
evaluate several other options before writing our own implementation here:
- The wildmatch() code used by git (and watchman, and rsync) has a few
downsides: it is not distributed by itself as a library anywhere else.
Therefore we would probably have to include a copy of this code in our
repository. Making another copy is unfortunate, and somewhat undesirable
from a legal and licensing perspective. This code also only works with
nul-terminated strings, and our code deals primarily with non-terminated
StringPiece objects.
- I did look at translating glob patterns in to regular expressions and using
re2 to perform matching. Unfortunately re2 turns out to be substantially
slower than wildmatch() for typical gitignore patterns.
This new implementation performs some preprocessing on the glob pattern, and
generates a pattern opcode buffer. Eden can perform this glob preprocessing
when it first loads a .gitignore file, and can then save and re-use this result
each time it needs to match a filename. Doing this preprocessing allows
matching to be done 50% to 100% faster than wildmatch() for typical glob
patterns.
Reviewed By: bolinfest
Differential Revision: D4194573
fbshipit-source-id: 46bc6a61b6d8066f4bbdb5d3e74265a3e72e42cc
Summary:
This adds some initial code for handling gitignore files.
I did check to see if there were APIs from libgit2 that we could leverage for
this, but it does not look like we can easily use their functionality. The
libgit2 ignore code seems to tightly coupled with their repository data
structures, and it requires that you actually have a git repository.
This code isn't quite 100% compatible with git's semantics yet. In particular:
- For now we are just using fnmatch() to do the matching. This is currently
inefficient as we have to do string allocations on each match attempt. This
also doesn't quite match git's behavior, particularly with regard to "**"
inside patterns.
- The code currently does not have a mechanism for indicating if a path refers
to a directory or not, so trailing slashes in the pattern are not honored
correctly.
We will probably need to implement our own fnmatch-like function in the future
to solve these issues.
Reviewed By: bolinfest
Differential Revision: D4156480
fbshipit-source-id: 8ceaefd3805358ae2edc29bfc316e5c8f2fb7d31
Summary:
For now, initial state is represented by a
`std::unordered_map<RelativePath, HgUserStatusDirective>`.
Reviewed By: simpkins
Differential Revision: D4123461
fbshipit-source-id: 83a99e1f504dd1efca1bc1ed33cbc3f116787a80
Summary:
This adds the logic to power `hg rm`. There are comprehensive tests that attempt to cover
all of the edge cases.
This evolved to become a complex change because I realized that I needed to change
my internal representation of the dirstate to implement it properly. Specifically, we now maintain
a map (`userDirectives`) of files that have been explicitly scheduled for change via `hg add` or `hg rm`.
To compute the result of `hg status`, we find the changes between the manifest/root tree
and the overlay and also consult `userDirectives`. `Dirstate::getStatus()` was updated
considerably as part of this commit due to the introduction of `userDirectives`.
As such, `Dirstate::remove()` must do several things:
* Defend the integrity of the dirstate by throwing appropriate exceptions for invalid inputs.
* Delete the specified file, if appropriate.
* Update `userDirectives`, if appropriate.
Although `Dirstate::add()` was not the focus of this commit, it also had to be updated to
match the pattern introduced by `Dirstate::remove()`.
Some important features that are still not supported are:
* Handling ignored files correctly.
* Storing copy/move information.
Reviewed By: simpkins
Differential Revision: D4104503
fbshipit-source-id: d5d45a279e16ded584c6cd4d528ba92d2c8e2993
Summary:
D4014598 changed this line but didn't change the test expectations.
Since it seems desirable for the mount point name to be used, I've reverted
back to the prior state for this line.
Reviewed By: bolinfest
Differential Revision: D4202265
fbshipit-source-id: bddc01436e0a5921a3b0b2c01c0fd2c32f5f1960
Summary:
Originally, D3858635 was going to introduce a scheme for hooks where the
repo type was included in the path:
/etc/eden/hooks/hg/post-clone <args...>
But over the course of the review, we decided to make the repo type a
parameter:
/etc/eden/hooks/post-clone hg <other-args...>
Unfortunately, `generate-hooks-dir` was not updated as part of that
change and it is not covered by unit tests. This error was particularly hard
to discover because of how `ENOENT` is handled, so I added a log statement for
that.
Reviewed By: simpkins
Differential Revision: D4200277
fbshipit-source-id: ffffd871cd78dcaeb717be8f1e01893ce9643a47
Summary:
This is a quick and dirty fix for this issue that was causing
and confusing bug where the memory for the `AbsolutePathPiece`
was getting reclaimed, so when it was later read as the value
for a path, it failed because it was binary garbage.
This is mainly caused by the `std::move(config)` that passes
the `ClientConfig` to the `EdenMount` constructor. I will do
some more general cleanup for that in a follow-up revision,
but I wanted to have this change in its own commit that makes
it clear where the failure/fix were coming from.
Reviewed By: simpkins
Differential Revision: D4198939
fbshipit-source-id: 19e0423a1bee924fa6cc2edc8bae534ef472c988
Summary:
Use new, less confusing names for mentioned thrift methods.
Codemod with 'Yes to All'. Reverted changes in thrift/
Reviewed By: yfeldblum
Differential Revision: D4076812
fbshipit-source-id: 4962ec0aead1f6a45efc1ac7fc2778f39c72e1d0