Summary: Small refactoring prior to a fix in a later diff.
Reviewed By: simpkins
Differential Revision: D7358218
fbshipit-source-id: 198160df1aa0dc65b917c5a96c2998c3675b03fe
Summary: Now we can write unit tests for takeover. :)
Reviewed By: simpkins
Differential Revision: D7345419
fbshipit-source-id: 05741dab65016c59109bf190c83be2f676a6141d
Summary:
This replaces the `ensureDataLoaded()`, `materializeForWrite()`, and
`materializeAndTruncate()` methods in `FileInode` with new safer alternatives:
`runWhileDataLoaded()`, `runWhileMaterialized()`, and `truncateAndRun()`
These new methods take a function to run, and run it while holding the state
lock with the FileInode guaranteed to be in the desired state. They also
require the caller move in the state lock as an argument, to help ensure that
the caller gets the locking behavior correct.
The old methods required that the caller not already hold the state lock while
calling them. This pre-condition was violated in `FileInode::write()` and
possibly other places. They also required releasing the lock between they
confirmed the file state and when the caller's function ran, requiring
additional lock and unlock operations, and also making the code harder to
reason about since other threads could change the state during the time when
the lock was released.
Reviewed By: chadaustin
Differential Revision: D7363180
fbshipit-source-id: d8e667d0bc7006c519252a8d0682af97517997eb
Summary:
The changes in this diff comments out unused parameters. All changes are automated using clang-tidy.
This will allow us to enable `-Wunused-parameter` as error.
Reviewed By: simpkins
Differential Revision: D7371610
fbshipit-source-id: 0134e2f0b916313d690c073a46d747c52399a226
Summary:
FileInode::read() did not handle being called in the BLOB_LOADING or NOT_LOADED
states. This method is only called from EdenFileHandle::read(), which does no
checking that it is fully loaded first.
This fixes read() to always call ensureDataLoaded() first.
Reviewed By: chadaustin
Differential Revision: D7359878
fbshipit-source-id: f5a1c8a28db3267da3180b67f970430e3ea291da
Summary:
While debugging a takeover crash (task #25590050) we wanted to help confirm
that inode objects weren't ever getting copied somehow. That turned out not to
be the issue, but it still seems worth explicitly deleting the copy and move
constructors and assignment operators for InodeBase.
Reviewed By: wez
Differential Revision: D7342050
fbshipit-source-id: ae07214072bfb8353584a07066aae2dc1adf2797
Summary:
Update the getScmStatusBetweenRevisions() thrift call to use the new
diffCommits() function that diffs source control Tree objects without creating
TreeInode objects to perform the diff.
This addresses two bugs:
- Each call to EdenMount::diffRevisions() constructed a new root inode
associated with the mount, and this would never get destroyed. It was not
destroyed at the end of the diffRevisions() call since inodes are normally
not destroyed immediately when they are unreferenced. It was not destroyed
during EdenMount::shutdown() since EdenMount didn't have any references to
these additional root inode structures and their children.
- EdenMount::diffRevisions() incorrectly swapped ADDED and REMOVED statuses in
the result.
Reviewed By: wez
Differential Revision: D7341609
fbshipit-source-id: 16e755a0ff685f51c977c3b27d6af96908f33494
Summary:
Add a function for diffing two source control commits without needing to
instantiate TreeInode objects.
Reviewed By: wez
Differential Revision: D7341604
fbshipit-source-id: 557eef87faa2785ab96d51b09569a46f892a71f6
Summary:
Fix the code to generate exceptions based on an errno error using
std::generic_category rather than std::system_category.
Reviewed By: yfeldblum
Differential Revision: D7329997
fbshipit-source-id: 3fe257bbbc7a631c801f31120592c8bdbc25c8bf
Summary:
Add a new utils/SystemError.h header with helper functions to check if a
`std::system_error` contains an errno value.
Most of the code in Eden previously only checked for `std::system_category`
when looking for errno values. `std::generic_category` is the correct category
to use for errno exceptions, but folly/Exception.h incorrectly throws them as
`std::system_category` today. This change makes Eden treat either error type
as errno values for now.
Reviewed By: yfeldblum
Differential Revision: D7329999
fbshipit-source-id: 67a3c3ea10371c53a2e34236b7575deac4cbd53a
Summary:
This helps distinguish `facebook::eden::FileHandle` from
`facebook::eden::fusell::FileHandle`, and will make it possible to eliminate
the separate `facebook::eden::fusell` namespace in the future.
Reviewed By: chadaustin
Differential Revision: D7314456
fbshipit-source-id: 0869427766fe666d119a59c7df1c97825a8ad36e
Summary:
This test exercises code under eden/fs/fuse, so put it with that directory's
tests.
Reviewed By: chadaustin
Differential Revision: D7314457
fbshipit-source-id: b664aaa3086b2e65f7ae8b76ae71878f39fd9d17
Summary:
There's some code that uses Entry::hasInodeNumber to decide whether an
inode is in the process of being loaded. This change is a
prerequisite for D7052470.
Reviewed By: simpkins
Differential Revision: D7311182
fbshipit-source-id: e04a89c814950e6ce33e73b963b3ffb0f0e353b8
Summary:
Remove the `FuseChannel::stealFuseDevice()` method, and instead change the
session completion future to return the FUSE device FD if it is still valid.
This ensures we only extract the FUSE device when it is still valid: if an
unmount event is being processed by a FUSE worker thread simultaneously with a
call to `FuseChannel::takeoverStop()` the session completion future will return
an empty `folly::File` since the FUSE device has been shut down.
This also helps clarify the synchronization semantics around modification to
the `fuseDevice_` member variable.
Reviewed By: wez
Differential Revision: D7300847
fbshipit-source-id: 02ced8fc9d24e7cd526d911782949d0bfbf0e5a7
Summary:
The FuseChannel destructor previously could be invoked while there are still
outstanding requests, which would cause problems when responses were generated
for these requests.
This makes the FuseChannel destructor private, requiring users to always
allocate FuseChannel objects on the heap and destroy them using a destroy()
method. The destroy() method still blocks waiting for the FUSE worker threads
to exit, but if there are still outstanding FUSE requests it defers the
deletion of the FuseChannel until they are complete.
The EdenMount class currently couldn't ever destroy FuseChannel objects with
outstanding requests, since it waits for the completion future to fire before
invoking the destructor. Fixing this behavior is mostly for correctness of the
FuseChannel API and for the benefit of our unit tests.
Reviewed By: chadaustin
Differential Revision: D7297829
fbshipit-source-id: 643d1332be84a1f25ee30ba2a2ea3515806f00ef
Summary:
Make ServerState also store the clock as a member variable. This moves more
server-wide state into the ServerState class, and allows us to eliminate
another argument to the EdenMount constructor.
Reviewed By: chadaustin
Differential Revision: D7297832
fbshipit-source-id: 2a063d67752f46686987390b3faefb304455568a
Summary:
Move the EdenCPUThreadPool from EdenServer to ServerState.
This ensures that all EdenMount objects always have a thread pool. Previously
most unit tests were creating an EdenMount without a thread pool set, but the
FileInode code and TreeInode code could end up wanting to use the thread pool
even if no FuseChannel is running.
Reviewed By: chadaustin
Differential Revision: D7297828
fbshipit-source-id: 9861b3439cd091afe31fa13f449aa14656983318
Summary:
Remove the `FuseChannel::getSessionCompleteFuture()` method and instead have
`initialize()` and `initializeFromTakeover()` return the `Future` that can be
used to wait on session completion.
This makes it explicit from an API perspective that this session completion
future will only be invoked if initialization is successful. It also
eliminates the possibility of anyone calling getSessionCompleteFuture() more
than once.
I also changed the session completion future to use a folly::SemiFuture rather
than a folly::Future, to make it explicit that callers generally should not
execute their callback inline in the same thread where the SemiFuture is
fulfilled.
Reviewed By: chadaustin
Differential Revision: D7297835
fbshipit-source-id: 3a1157951f0738f1692833ed5e875c3e9c6d6d69
Summary: Testing convenience. The NO_CHECK constructor is no longer necessary now that simpkins made the original constructor constexpr.
Reviewed By: simpkins
Differential Revision: D7295539
fbshipit-source-id: 717f5a7a8d3d9790966fad441da7b3308495e2c3
Summary: Cleaning up the takeover initialization code for EdenMount and InodeMap.
Reviewed By: simpkins
Differential Revision: D7294419
fbshipit-source-id: 58506f04259bb1017e6cd2e80e40e5820de6200f
Summary:
Instead of having a rule that save() must be called after
InodeMap::shutdown, just have InodeMap::shutdown return the
SerializedInodeMap if it's desired.
Reviewed By: simpkins
Differential Revision: D7292773
fbshipit-source-id: 2ba35fc729e19af122fe5d6c5a3287ad6b8af946
Summary:
This changes FuseChannel to fulfill the session complete future immediately in
the thread where it finishes. This may occur inside a FUSE worker thread.
It is now up to the caller to use `via()` if they want their callback to run in
a separate thread.
This simplifies the FuseChannel code and also addresses a crash that occurs in
the unit tests sometimes: If the FuseChannel destructor is invoked without
explicitly invoking the FuseChannel, it would schedule the session complete
promise to be fulfilled in a separate EventBase thread. However, the promise
may then have already been destroyed by the time it can run in the EventBase
thread.
There are still additional synchronization issues to fix in the FuseChannel
tests, but this resolves one problem.
Reviewed By: chadaustin
Differential Revision: D7282001
fbshipit-source-id: be64d3ef6a0e664ed7a2cf93a549acc140184095
Summary:
Empty .gitignore files should be processed normally, and treated as no new
ignore rules. The logic in D6659654 had accidentally introduced a bug where we
would completely skip processing any directories that contained empty
.gitignore files.
Reviewed By: wez
Differential Revision: D7261276
fbshipit-source-id: 033e199f15d7763bff5f9a080a226c50e481aa9d
Summary:
- Update FuseChannel::initializate() to not require an Executor. Rather than
performing initialization using the supplied Executor, it simply starts one
worker thread first and performs initialization there.
- Change the API semantics slightly so that the session complete future is
invoked only if initialization succeeds. Previously the session complete
future could be invoked if initialization failed as well.
- Replace the activeThreads counter with a stoppedThreads counter to determine
when the session complete future should be invoked. The previous
activeThreads behavior was somewhat racy, since the activeThreads counter was
incremented inside each worker thread, but most places that checked it did
not wait to ensure that all worker threads had successfully incremented it
yet.
Reviewed By: chadaustin
Differential Revision: D7243910
fbshipit-source-id: 93b3465509bd9bf6fa90ea097e70dac3193172f9
Summary: It's a little surprising to me the clang build passed without these explicit template instantiations!
Reviewed By: simpkins
Differential Revision: D7253536
fbshipit-source-id: 2f48d5571777f4e978b6947183eefb03158d5014
Summary:
Split up the FuseChannel and EdenMount APIs for starting a brand new FUSE
channel vs taking over an existing channel.
These operations are somewhat different as initializing a new FUSE channel
requires performing a negotiation with the kernel that will not complete
immediately. Therefore these APIs return a Future object. Taking over an
existing FUSE channel is always an immediate operation, and therefor does not
need to return a Future object.
Reviewed By: chadaustin
Differential Revision: D7241997
fbshipit-source-id: 22780f2df76b2d554ab2a4043b6425fa7a4e9c94
Summary:
Ensure that we ignore SIGPIPE in unit tests. The
`FuseChannel::requestSessionExit()` function sends SIGPIPE to its worker
threads, and expects this to wake up the worker threads if they are stuck
inside a read() call.
In normal program builds folly/Subprocess.cpp causes SIGPIPE to be ignored on
startup. However, Subprocess.cpp is not necessarily linked into all of our
unit tests. I haven't tracked down exactly why, but SIGPIPE in the unit tests
seems to kill the entire process only in opt-ubsan builds.
In the future we probably should clean up how requestSessionExit() triggers its
worker threads to exit: the current code is still potentially racy, since the
worker threads could receive the SIGPIPE just after checking `sessionFinished_`
and before calling `read()`, in which case they may go back to sleep rather
than exiting. However for now this should be a simple fix to get the tests
passing in all build modes.
Reviewed By: wez
Differential Revision: D7224370
fbshipit-source-id: a56fe3331fc5aa6a49ccbe6b0678b479a44ffc07
Summary:
To make it clearer to me why all the calls to newPtrLocked were safe, and to
eliminate some duplication, I captured the newPtrLocked call patterns into
member functions on LoadedInode and TreeInode::Entry.
Reviewed By: simpkins
Differential Revision: D7207542
fbshipit-source-id: 25de77e72c0898be43b3fbdddab835d64101755e
Summary:
To make it clearer why newPtrLocked is safe during the Inode class
construction process, add a takeOwnership function (better name
suggestions welcome) that converts from the newly-instantiated
unique_ptr to a managed InodePtr.
Reviewed By: simpkins
Differential Revision: D7204818
fbshipit-source-id: 8a40189588442490120623da86195c6fc99c9c51
Summary:
This diff is mostly preparation. It's submitted separately to
remove mechanical refactoring noise from the following diff, where
every entry is assigned an inode number upon construction.
Reviewed By: simpkins
Differential Revision: D7116832
fbshipit-source-id: 2943c45340a9a751eb52bf13e19d233d829494c0
Summary:
If the root TreeInode wants to allocate inode numbers, the inode
allocator must be initialized first. But complete InodeMap
initialization requires the root TreeInode. So split this into two
parts.
Also, I changed the inode allocator to a single atomic increment instead
of a lock acquisiton.
Finally, the extra assertions in this diff uncovered what looks like a
bug in the takeover logic where nextInodeNumber_ could end up being
smaller than the value in the takeover data, since the max inode
number from the overlay was assigned after loading from takeover data.
Reviewed By: simpkins
Differential Revision: D7107706
fbshipit-source-id: ec43cc81c11d709261598739c622609b372433a2
Summary:
I spent way too long trying to figure out why my refactorings were
causing invariant errors inside InodeMap. It turns out that we
initialize the root TreeInode before InodeMap::initialize is called,
which I suspect resulted in duplicate inode numbers being handed out.
Reviewed By: simpkins
Differential Revision: D7106302
fbshipit-source-id: b459734fb96bfbb6b4b27a1d23de8b6406d30ca4
Summary:
This adds a FakeFuse class to simulate a FUSE connection to the kernel, but
that is actually communicating with our userspace test harness code. This also
adds a FakePrivHelper class which implements the PrivHelper interface but
returns FakeFuse connections rather than real FUSE connections to the kernel.
This will enable us to write unit tests that exercise more of the FUSE and
EdenMount logic, and will also enable us to test error conditions and ordering
behaviors that are difficult to reliably reproduce with a real FUSE mount.
This also includes some very basic tests using this new code. The code in
fuse/test/FuseChannelTest.cpp creates a FuseChannel using a FakeFuse object,
and the code in inodes/test/FuseTest.cpp creates a full EdenMount object using
FakePrivHelper and FakeFuse. The tests are pretty similar for now, and only
exercise the FUSE initialization code. I will expand these tests in subsequent
diffs.
Reviewed By: wez
Differential Revision: D7050826
fbshipit-source-id: 4f82375d65664ca3a5a228a577caa4d1d47454fe
Summary:
I'm seeing test failures that I have not yet understood and I
thought they might be caused by an implicit conversion from
fusell::InodeNumber to bool. Well, they're not, but this is how I
discovered that. I'm not sure I want to land this change, but I'm
going to leave it around until I figure out what's happening with my
other diffs.
Reviewed By: simpkins
Differential Revision: D7077635
fbshipit-source-id: 50bf67026d2d0da0220c4709e3db24d841960f4b
Summary:
I am working on a stack of diffs that changes how we allocate inode
numbers to tree entries. I was hitting test failures I could not
understand, so in the process of trying to understand the flows
through InodeMap, I found newChildLoadStarted to be redundant with
shouldLoadChild.
Note: Today, allocateInodeNumber() acquires the InodeMap's lock, but
in a later diff, inode numbers will be assigned en masse during
TreeInode construction.
Reviewed By: simpkins
Differential Revision: D7059719
fbshipit-source-id: 624b861040d585d2cae41d7ec2aae7d528ff8336
Summary:
Previously we returned an ENOENT error in response to a FUSE lookup() call for
a name that does not exist. However, this does not allow FUSE to cache the
result, so we will continue to receive lookup() calls for this path in the
future.
This changes EdenDispatcher to return a successful response with an inode
number of 0 instead. This tells the kernel that the name does not exist, but
allows the kernel to cache this negative lookup result (for as long as
specified in the entry_valid field in the response).
Reviewed By: wez
Differential Revision: D7076811
fbshipit-source-id: a2b9977e58d6b6eecb584699b9d93b5ad29ad5ad
Summary:
Update the TreeInode code to always update the FUSE cache whenever we add a new
entry to the directory.
Up to now we have been fine without this since the kernel never cached negative
lookup responses for us. In order to turn on FUSE caching of negative lookup()
responses we need to invalidate the cache whenever we create a new entry.
- Have create(), symlink(), mknod() and mkdir() invalidate the FUSE entry cache
if they are not being invoked from a FUSE request. These are almost always
invoked from a FUSE request, but flushing the cache is the correct thing to
do if we add code in the future that can trigger these from a thrift call or
via some other non-FUSE code path.
- Change the checkout code to invalidate the FUSE cache whenever it creates new
children entries.
Reviewed By: wez
Differential Revision: D7076812
fbshipit-source-id: d6489995b415260e84b3701c49713b0ef514f85d
Summary:
This removes the TARGETS files from the eden github repository. The
open source buck build has been failing for several months, since buck
removed support for the thrift_library() rule.
I will potentially take a stab at adding CMake build support for Eden
at some point in the future.
Reviewed By: chadaustin
Differential Revision: D6893233
fbshipit-source-id: e6023094a807cf481ac49998c6f21b213be6c288
Summary: Per code review comments on D6983198, this simplifies the way we check if mode bits have changed in a meaningful-to-source-control way.
Reviewed By: simpkins
Differential Revision: D7015339
fbshipit-source-id: 548ead337fbea1c1dcb72b880921671e9b6188ac
Summary:
mode_t isn't really part of a TreeEntry and I also wanted to see all
the places where we convert an entry type from source control into
mode bits.
Reviewed By: simpkins
Differential Revision: D6983198
fbshipit-source-id: ce1d0976f5fc5130c34a8c93c07a4e26a7cdaf71
Summary:
This is the type of a tree entry, which may be another tree, so
FileType is not an accurate name.
Reviewed By: simpkins
Differential Revision: D6981168
fbshipit-source-id: 997eb8a27f599310ed678ce221c8083722db8bff
Summary:
While we're changing how timestamps are stored, we might as well
implement this small-ish efficiency. Instead of storing timestamps as
a timespec (16 bytes), store them as 64-bit nanoseconds with a range
slightly larger than what ext4 supports. Assuming a million inodes,
this saves 24 MB.
This diff introduces the EdenTimestamp type. The next diff will start
using it.
Reviewed By: simpkins
Differential Revision: D6957659
fbshipit-source-id: 4551af6f5b8c1ff610ba88795f69e7d69d7f605d
Summary:
I want to rename FileType to TreeEntryType so I removed this one first
and replaced all of its uses with an isTree() method.
Reviewed By: simpkins
Differential Revision: D6980501
fbshipit-source-id: 105b8c599585e63efd44043e761db40e2824e77e
Summary:
Our Model TreeEntry code was a bit too general - in reality, both git
and hg only support a handful of specific tree entries: regular files,
executable files, symlinks, and trees. (git also supports
submodules.) This diff delays the expansion of a TreeEntry's type
into a full mode_t.
Reviewed By: simpkins
Differential Revision: D6980003
fbshipit-source-id: 73729208000668078a180b728d7e0bb9169c6f3c
Summary: I should have caught this before...
Reviewed By: wez
Differential Revision: D6995395
fbshipit-source-id: 08efd0aacb051f2b1754930b311e3a42afc3c538
Summary:
Another small refactoring that I don't want mixed into my
bigger diffs.
Reviewed By: wez
Differential Revision: D6927482
fbshipit-source-id: 28cc60dfdffb50921a5ef9cca4e2814b90d3b701
Summary:
There were places we were acquiring a lock unnecessarily. In
addition, I'm looking at reducing the number of places where we store
the full mode_t to see if we can get away with dirtype_t or something
similar.
Reviewed By: wez
Differential Revision: D6972140
fbshipit-source-id: bb29a4473f3056e39596600d22e67374ca484735
Summary:
Update FileInode so that getattr() and setattr() both return a reasonable value
in st_blocks.
Previously we always returned 0 in st_blocks, which caused applications like
`du` to always report files as using 0 space in Eden mounts. Now we compute
st_blocks based on st_size, so that `du` will report reasonable estimates for
when scanning the size of subdirectories inside an Eden mount.
Reviewed By: chadaustin
Differential Revision: D6932098
fbshipit-source-id: bd29e46821176e510f420e6e2b6ce480b80d50ff
Summary:
Move all of the privhelper functionality into a PrivHelper class. The
ServerState object now stores the PrivHelper object to use, rather than having
a global singleton.
This will make it easier to stub out the PrivHelper functionality during unit
tests.
Reviewed By: wez
Differential Revision: D6929862
fbshipit-source-id: e3edcb0a03ba9afdf34554cb961fd74557cdd6e3
Summary:
Update EdenMount::diff() to use the UserInfo object stored in the shared
ServerState rather than calling UserInfo::lookup() on each diff operation.
Reviewed By: wez
Differential Revision: D6929865
fbshipit-source-id: a68ab1fa9eb345b59972e67c3aac258b4dbcdab5
Summary:
Add a new ServerState class to store process-wide state that is shared across
multiple mounts. Up until now we have passed around the shared state data as
separate variables.
This is intentionally separate from the existing EdenServer class to allow unit
tests to create EdenMount objects without an EdenServer object.
Reviewed By: wez
Differential Revision: D6929861
fbshipit-source-id: 5f22efb6d79dfd70031be1dc37f494c2ad8af902
Summary:
We no longer need this and I believe that this was contributing
to this source of flakiness in our CI; this stack trace triggers when
we get an aborted or short read from the kernel:
```
*** Aborted at 1517827659 (Unix time, try 'date -d 1517827659') ***
*** Signal 11 (SIGSEGV) (0x2d0) received by PID 153706 (pthread TID 0x7f63ea37d700) (linux TID 160573) (maybe from PID 720, UID 0) (code: address not mapped to object), stack trace: ***
@ 0000000002abfa2d folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
./folly/experimental/symbolizer/SignalHandler.cpp:413
@ 00007f646e96dacf (unknown)
@ 00007f646e96aa20 __pthread_kill
@ 0000000001873748 facebook::eden::fusell::FuseChannel::requestSessionExit()
./eden/fs/fuse/FuseChannel.cpp:483
@ 000000000187d2f7 facebook::eden::fusell::FuseChannel::processSession()
./eden/fs/fuse/FuseChannel.cpp:656
@ 000000000187dab9 facebook::eden::fusell::FuseChannel::fuseWorkerThread(unsigned long)
./eden/fs/fuse/FuseChannel.cpp:493
@ 00007f646f267170 execute_native_thread_routine
@ 00007f646e9637a8 start_thread
@ 00007f646e047a7c __clone
```
my theory is that we're allowing shutdownImpl to free things out from under other
threads before they've all seen this signal and wound down fully. This is slightly
speculative in that I haven't managed to reproduce this stack trace on my devserver.
We don't really need this additional signal any longer.
Reviewed By: simpkins
Differential Revision: D6907734
fbshipit-source-id: 0f0138b631a7201fc9a4a1c93c2cde846e869cbd
Summary: Small refactoring I should have done with the previous diff.
Reviewed By: simpkins
Differential Revision: D6927152
fbshipit-source-id: 1dcda01134c3d63c62169c5728dba24ca0eebd68
Summary:
Update `TreeInode::diff()` to check if its hash matches the source control tree
it is being compared to, and return early if they are identical.
I'm surprised that I forgot to include this initially when implementing
`TreeInode::diff()`
This makes `hg status` faster when a large number of unmodified directories
have been loaded.
Reviewed By: chadaustin
Differential Revision: D6890615
fbshipit-source-id: 561630d0220b4875dbf3678161cdb41a8aa4fc82
Summary:
This re-orders some of the code in `TreeInode::diff()` slightly. This should
not affect the behavior of the code.
This moves the `isIgnored` check inside the main `contents_.wlock()` block.
This reduces the number of places where we grab the lock, and will help keep
things simple for an upcoming diff where I need to add some more checks in this
code with the lock held.
This also changes `inodeFuture` to use the new `Future::makeEmpty()`
constructor rather than having to use an `Optional<Future>`
Reviewed By: chadaustin
Differential Revision: D6890616
fbshipit-source-id: 354bbf6a6be6d356fd23e6c0fb6b534679bbe0bb
Summary: Small things I've needed in later diffs.
Reviewed By: wez
Differential Revision: D6877755
fbshipit-source-id: c9002eb0b92dbd8fe9c4f636d2ca79b25cde331f
Summary:
While working on timestamp storage, the fact that
InodeTimestamps was a member of InodeBase kept getting in the way.
Make it its own type.
Reviewed By: simpkins
Differential Revision: D6862835
fbshipit-source-id: 91d8984764f0586b9fa52e961eb5606a530e0416
Summary: This API seems like it should be const, as it does not modify the clock.
Reviewed By: chadaustin, zhupanov
Differential Revision: D6869719
fbshipit-source-id: c8bf4ccab34538b59e6baeedd0b0ff88b328236e
Summary:
This diff moves the mount-time initialization handling
out of the main loop. This rationale for this is:
* We don't (and shouldn't!) need to process FUSE_INIT for takeover
processing, and this structure allows us to make stronger assertions
about our state.
* we can avoid spinning up multiple threads in the (rare!) case that
the FUSE_INIT fails
* It is now a little harder for exceptions during initialization to
escape our notice.
In rearranging this stuff, I found a race condition in the worker thread
shutdown; we could erroneously emit a completion event before all of
the threads had been torn down and this resulted in sporadic integration
test failures hitting the assertion for the number of joined threads
in the destructor.
Reviewed By: simpkins
Differential Revision: D6766330
fbshipit-source-id: 32afb5a7c739c75aebfdb0a8f896eec5f41ad33f
Summary:
Dir's contents were represented as a vector of 64-bit
pointers to 48-byte structs. This change removes that layer of
indirection, reducing memory usage and slightly pessimizing insertion.
The diff is mostly mechanical outside of the TreeInode.h changes and
calls to emplace..
I'll run memory tests tomorrow, though it's a gamble as to whether
private bytes will show a difference. I may need to shrink the Entry
struct too.
Reviewed By: wez
Differential Revision: D6804957
fbshipit-source-id: b126656dbc7951565e74b6401adde6353e809056
Summary:
Tiny thing I noticed when reading code. Keep the entry name
as a StringPiece rather than bouncing through char*.
Reviewed By: zhupanov
Differential Revision: D6820080
fbshipit-source-id: 884e55f74094f44012efbe44b86d8e5903300967
Summary:
isSameAs calls getSha1 which was failing on symlinks. The
original concern was that asking for the SHA-1 of a symlink is
ambiguous: do you want the hash of the symlink or the target? But we
already check for whether you are requesting the SHA-1 of a symlink in
EdenServiceHandler, so it's redundant and incorrect to check in
FileInode too.
Reviewed By: simpkins
Differential Revision: D6847489
fbshipit-source-id: 13966da06bcde75c5c568e09fef14e735de47cfb
Summary:
It's interesting to see the total number of loaded files
vs. trees when the loaded inode count is high.
Reviewed By: wez
Differential Revision: D6765874
fbshipit-source-id: 178b30184428bd5cf5e005eb475e4f5a1476c385
Summary:
I'm not sure what was wrong with the old code, but I
simplified and clarified all of the time math and now `eden debug
unload` behaves as I'd expect it should.
Reviewed By: simpkins
Differential Revision: D6764962
fbshipit-source-id: 3ed359d4ab4652e95d1538a0982c24185999351c
Summary:
rdev doesn't add any value yet. We can add it back if we want
to implement support.
Reviewed By: simpkins
Differential Revision: D6792346
fbshipit-source-id: ce16317074f1daa456737c55804da8fb7f2b7a94
Summary:
This slightly improves the code in EdenMount by having
just a single call to the FuseChannel constructor with an optional
`fuse_init_out` instance that we can use to determine if we are
performing a takeover or not.
Reviewed By: simpkins
Differential Revision: D6766328
fbshipit-source-id: ece140a1572e2934a3e35bfe25b83af910346c18
Summary: Minor stylistic changes that were done during constification but factored out.
Reviewed By: chadaustin
Differential Revision: D6774976
fbshipit-source-id: d18cd339153cf16ff69be0de5f3eb019a4baa1a0
Summary:
The value we get is of enum class, so it is guaranteed to be in the valid set.
Omitting default we get compiler support if we ever expand the set of values.
Throw becomes obviated, because it becomes unreachable.
Reviewed By: simpkins
Differential Revision: D6771224
fbshipit-source-id: 039f0545d20649193b3fb74981ee7f027f31fd8e
Summary: Replaced memset used to 0-init structs with POD fields with = {} init.
Reviewed By: simpkins
Differential Revision: D6770868
fbshipit-source-id: 190205a277e3c73711813bc515b860b107d41f83
Summary:
This is moving some files around in preparation for
moving TakeoverData to using thrift for its serialization
Reviewed By: simpkins
Differential Revision: D6733405
fbshipit-source-id: 235ba237546f8ef606de8445db45683ce38a2d2c
Summary:
This implements a TODO/FATAL that is important for
graceful restarts to be useful in my "acid test" scenario,
which is to perform a graceful restart while buck build is
running.
Reviewed By: simpkins
Differential Revision: D6700189
fbshipit-source-id: dec1b818ebc9e907841bc127ee08c953b59d6487
Summary:
Previously, `EdenMount::create` would implicitly call
`EdenMount::initialize` which would load the root inode and the `.eden` magical
directory. That's fine for the fresh mount case, but in the case of the
graceful restart we need to take the opportunity to apply the `InodeMap`
from the old process before we start muddying its state.
This diff breaks out the `initialize` method from the `create` method and
makes the mount code call it after potentially loading the `InodeMap` from
the takeover data.
In addition, this diff removes the the `root->loadMaterializedChildren()`
call from the mount initialization code. It is no longer required to do
this eagerly and it makes things simpler and our memory profile a little
smaller to defer this (I haven't measured how much that impacts things).
Reviewed By: simpkins
Differential Revision: D6691182
fbshipit-source-id: 52033a6d64105b658314a919f69dbfcd4eea242b
Summary:
I'm so-so on a bit of the implementation here, but it works!
I had to change the `takeoverPromise` from the `pair<fuseDevice, connInfo>`
to a new helper struct because we now have three distinct pieces of data
to pass out of EdenMount to build up the overall TakeoverData.
The key change in this diff is that we have to release all of the file handles
we're maintaining in the `FileHandleMap` prior to shutting down the `InodeMap`,
otherwise the `InodeMap` will never complete (it waits for all inodes to be
unreferenced, and that cannot happen while there are open file handles). I've
made the `FileHandleMap` serialization and clearing contingent on performing a
takeover shutdown because that feels like the safest thing to do wrt. not
losing any pending writes.
Reviewed By: simpkins
Differential Revision: D6672437
fbshipit-source-id: 7b1f0f8e7ff09dbed850c7737383ecdf1e5ff0c7
Summary:
This is the key portion that makes the graceful restart
function. This diff connects almost all of the moving pieces together;
it informs the priv helper about the takeover mount and transfers
the InodeMap information into the new generation of the eden server.
It doesn't yet load the fileHandleMap (will tackle that in a follow up diff)
Reviewed By: simpkins
Differential Revision: D6670903
fbshipit-source-id: 1770d99eb1477440a6c1deed83b0da55b9c1bbe4
Summary:
this isn't how we really want to do this long term, it's
just the most expedient short term implementation.
This diff provides an implementation of the `InodeMap::save()` which
was previously a stub method; the new implementation returns a thrift
structure that encompasses the unloaded inodes in the map, and adds
a corresponding load() method that performs the reverse transformation.
The struct is serialized into the Takeover data.
This diff doesn't hook up the real serialized data to EdenServer; that will happen
in a follow-on diff.
The way that we actually want to handle this longer term is to store the
`numFuseReferences` field into the overlay file on disk, but to do so we
will need to add a mountGeneration field alongside it and ensure that we
always write out the correct information at the correct times. In addition,
we'd need to add equivalent data to TreeInode::Entry and add accessors that
safely return the correct values in the correct situations.
In the interest of getting something working, I've just dumped this code in
here.
I've also placed the thrift structure for this in `fuse/handlemap.thrift`;
this is a slight layering violation but one that feels "OK" in light of
the imminent refactor of the Takeover data struct to be its own thrift
struct anyway.
Reviewed By: simpkins
Differential Revision: D6670904
fbshipit-source-id: 11a0918954c741935c587e46fcb0e38849010de1
Summary:
This adds some plumbing to thread the fuse device descriptor and
negotiated capabilities through to the takeover code.
I initially wanted to just make the
unmount future yield the device descriptor, but since that uses
`SharedPromise` it is not compatible with a move-only type like
`folly:File`, so I added an optional promise to deal with just that.
I'm also populating the takeover mount information (path, bind mounts)
for each mount point.
Reviewed By: simpkins
Differential Revision: D6494509
fbshipit-source-id: a90684292dc1d8e06ce2c0721eadd8d393377f33
Summary: A prior shelve contained this fix but never got landed.
Reviewed By: wez
Differential Revision: D6676206
fbshipit-source-id: b8c733be663ff56e1a0625f09ec505891d430084
Summary:
We'll need a way to twiddle the refcount by more than
increment at a time for graceful restart. This mirrors the
same pattern we use to forget() by an arbitrary number.
Reviewed By: chadaustin
Differential Revision: D6668809
fbshipit-source-id: 6e5dc33b5e40f98f01293c89152bfe1e0879f572
Summary: We forgot to remove this when we moved to be fully inode based in the overlay
Reviewed By: chadaustin
Differential Revision: D6668810
fbshipit-source-id: b79af85a4bbbcefd9227ad69bb8d57b5274cdaed
Summary:
It is no longer correct to assert that state->file is set if O_TRUNC happened
before blob import from hg finished. It surprises me we never saw a crash
because of that. Also, the O_TRUNC path after blob import finishes can never
complete a future, so don't try.
Reviewed By: wez
Differential Revision: D6656699
fbshipit-source-id: 5e245fc46185714e5f5d81c2680835a3497747ff
Summary:
Today, if a file is ever opened for read, each FileInode keeps a copy
of the data as long as the FileInode is around. This results in
excessive memory consumption under common mistakes like repo-wide grep
or `hg revert .`.
I will audit all of the state machine transitions and blob accesses
before landing this diff.
Reviewed By: wez
Differential Revision: D6598957
fbshipit-source-id: 1eb4aeb08057ce993a29a86d298e153675fee4a1
Summary:
This came up when I was auditing the rules about when it's
safe to read from a FileInode. read() must only be called while openCount > 0.
Reviewed By: wez
Differential Revision: D6604898
fbshipit-source-id: 829ddc335bd58201c2b456ee544cdc6253ebf66c
Summary:
In a follow-on diff, the constraint will be that state->blob
will only be guaranteed valid after ensureDataLoaded() while a
FileHandle is alive. Thus, ensureDataLoaded() must return a
FileHandle.
Reviewed By: wez
Differential Revision: D6586237
fbshipit-source-id: ccc269d322b8c725c93145df5de2add9a2b90207
Summary: This is annoying because it isn't actually shadowing and because this isn't picked up in our default mode mode, or in sandcastle.
Reviewed By: chadaustin
Differential Revision: D6656548
fbshipit-source-id: e624cb563d6396c1ab6b93eae14651c16a8c0cd3
Summary:
Now that we have full control over the fuse session,
moving the threads here makes a lot of sense and makes things
easier from an overall state management perspective.
FuseChannel now provides a future that make it easier for EdenMount to wait
until all the threads and outstanding requests have completed.
There is also a future to determine when the first of the threads
has exited which is used to detect an error condition. We could
go a bit further with this and have the error condition propagate
out from our dispatcher, but that's a bit further away from my
goal of making the graceful restart stuff work.
I've removed the request counting stuff from Dispatcher as we
can now use the definitive signals from FuseChannel in its place.
Reviewed By: chadaustin
Differential Revision: D6580247
fbshipit-source-id: 6bcc3b8b531d59a3fdd0ca6fd09410ad64f8221a
Summary:
This serves a few purposes:
1. We can avoid some conditional code inside eden if we know that
we have a specific fuse_kernel.h header implementation.
2. We don't have to figure out a way to propagate the kernel
capabilities through the graceful restart process.
3. libfuse3 removed the channel/session hooks that we've been
using thus far to interject ourselves for mounting and
graceful restarting, so we were already effectively the
walking dead here.
4. We're now able to take advtange of the latest aspects of
the fuse kernel interface without being tied to the implementation
of libfuse2 or libfuse3. We're interested in the readdirplus
functionality and will look at enabling that in a future diff.
This may make some things slightly harder for the more immediate
macOS port but I belive that we're in a much better place overall.
This diff is relatively mechanical and sadly is (unavoidably) large.
The main aspects of this diff are:
1. The `fuse_ino_t` type was provided by libfuse so we needed to
replace it with our own definition. This has decent penetration
throughout the codebase.
2. The confusing `fuse_file_info` type that was multi-purpose and
had fields that were sometimes *in* parameters and sometimes *out*
parameters has been removed and replaced with a simpler *flags*
parameter that corresponds to the `open(2)` flags parameter.
The *out* portions are subsumed by existing file handle metadata
methods.
3. The fuse parameters returned from variations of the `LOOKUP` opcode
now return the fuse kernel type for this directly. I suspect
that we may need to introduce a compatibility type when we revisit
the macOS port, but this at least makes this diff slightly simpler.
You'll notice that some field and symbol name prefixes vary as
a result of this.
4. Similarly for `setattr`, libfuse separated the kernel data into
two parameters that were a little awkward to use; we're now just
passing the kernel data through and this, IMO, makes the interface
slightly more understandable.
5. The bulk of the code from `Dispatcher.cpp` that shimmed the
libfuse callbacks into the C++ virtual methods has been removed
and replaced by a `switch` statement based dispatcher in
`FuseChannel`. I'm not married to this being `switch` based
and may revise this to be driven by an `unordered_map` of
opcode -> dispatcher method defined in `FuseChannel`. Regardless,
`Dispatcher.cpp` is now much slimmer and should be easier to
replace by rolling it together into `EdenDispatcher.cpp` in
the future should we desire to do so.
6. This diff disables dispatching `poll` and `ioctl` calls. We
didn't make use of them and their interfaces are a bit fiddly.
7. `INTERRUPT` is also disabled here. I will re-enable it in
a follow-up diff where I can also revise how we track outstanding
requests for graceful shutdown.
8. I've imported `fuse_kernel.h` from libfuse. This is included
under the permissive 2-clause BSD license that it allows for
exactly this integration purpose.
Reviewed By: simpkins
Differential Revision: D6576472
fbshipit-source-id: 7cb088af5e06fe27bf22a1bed295c18c17d8006c
Summary:
Easy to overlook this; the issue is that we need to explicitly
do something about the error case when we're stitching together Promises
by hand, otherwise we will silently drop exceptions.
Flat manifest imports are failing in `RestartTestHg` in master
at the moment. That error was silently being swallowed and the test would
hang until it timed out.
This is an uglyish hack to explicitly propagate the error condition so that
that test will error out.
This diff doesn't fix the source of the manifest import issue; that is addressed
in the next diff (turned out to be that the `--takeover` flag was not being
passed correctly)
Reviewed By: bolinfest
Differential Revision: D6627973
fbshipit-source-id: b7093890f543618a11682e939f8802f1309831d4
Summary:
This is a codemod to change from using @/ to // in basic cases.
- TARGETS files with lines starting with @/ (but excluding @/third-party:
- autodeps lines in source and TARGETS files ( (dep|manual)=@/ ), excluding @/third-party
- Targets in string macros
The only thing left of the old format should be @/third-party:foo:bar
drop-conflicts
Reviewed By: ttsugriy
Differential Revision: D6605465
fbshipit-source-id: ae50de2e1edb3f97c0b839d4021f38d77b7ab64c
Summary:
Update the logic for dry-run checkouts to exit early whenever we hit a
directory that is unmodified from the original source control state. In this
case we can skip this entire subdirectory since we know there cannot be any
conflicts.
This should make dry-run checkout operations much faster in cases where there
are lots of changes between the source and destination source control trees,
but the working directory state has relatively few modified files.
Reviewed By: bolinfest
Differential Revision: D6605282
fbshipit-source-id: b5423749f3d47b10ed8d599ffaa0667c72fbaec2
Summary:
Previously the checkout code always called `JournalDiffCallback::performDiff()`
to get the list of unclean files before starting the checkout operation. After
the checkout completed it called `JournalDiffCallback::performDiff()` to get
the unclean files after the checkout operation.
Making both diff calls should be unnecessary: the checkout code can only modify
files that were unclean before the checkout (in the case of a `FORCEe` checkout)
or files that were modified between the source and destination source control
trees. Therefore the `performDiff()` call after checkout completes should be
unnecessary, and can be removed.
This diff also eliminates the initial `performDiff()` call for `DRY_RUN`
checkouts. We do not add a journal entry at all for `DRY_RUN` operations, so
we can skip this diff computation.
Reviewed By: wez
Differential Revision: D6455969
fbshipit-source-id: 20e0ac0d16d1fde844c1d6165a96611bfb370597
Summary:
I almost added a copy in a later diff and realized it would
have let me, but done the wrong thing.
Reviewed By: wez
Differential Revision: D6585810
fbshipit-source-id: 15295d04b06df397113be6080e1f2f6b8a473745
Summary: Added type identification capability to InodeBase and its descendants FileInode and TreeInode.
Reviewed By: simpkins
Differential Revision: D6564902
fbshipit-source-id: ce9300102d6d6d1c42616eb1e32042f21f6e6cce
Summary:
Add EdenCPUThreadPool and UnboundedQueueThreadPool types to make it clearer
that it's always okay for prefetch, deferred diff entry, and hg import to
shuttle work back to the main thread pool.
This diff changes no behavior - it just makes some invariants explicit.
Reviewed By: wez, simpkins
Differential Revision: D6504117
fbshipit-source-id: 3400ad55c00b3719ecba31807fd992442f622cd9
Summary:
Added to Eden capability to incorporate default user and general system level gitignore files.
NOTE: Work in progress, sending the review out to calibrate/ensure I am on right track.
Reviewed By: simpkins
Differential Revision: D6482863
fbshipit-source-id: 9834ca1a577a9599a1f8cb2243dca4e714866be8
Summary:
There's no technical reason to block an open() request until the data
load / materialization returns. This change returns immediately from
open() and then waits if necessary in a subsequent write() call.
Reviewed By: wez
Differential Revision: D6391486
fbshipit-source-id: 862f87e3c3a0d760bacb0f8ca7acc479037fec2f
Summary:
Follow-up to comments in D6466209. All access to the clock goes
through the Clock interface, making time deterministic in unit tests.
Reviewed By: simpkins
Differential Revision: D6477973
fbshipit-source-id: 24e51bdb52d0d079b34d91598d2e787d361f2525
Summary: Follow-up from D6366189. First use of the new FakeClock!
Reviewed By: simpkins
Differential Revision: D6466209
fbshipit-source-id: 4d4d8a9a83df2bee11149e7a0cbddaaf734d0e04
Summary:
Introduce a Clock seam. This will allow us to write tests around
ctime, mtime, and atime logic.
Reviewed By: wez
Differential Revision: D6392543
fbshipit-source-id: 1721d76d2364b135b4ef5c078ef60f7f8526259e
Summary:
open() called materializeInParent unconditionally, and setattr never
called it, implying it was possible to truncate a file without
materializing the parent. This change makes sure to precisely call
materializeInParent whenever the state transitions to materialized.
Reviewed By: wez
Differential Revision: D6389794
fbshipit-source-id: 1e740e133a83d5090a6b9801154b7eaeccb07f22
Summary:
To make the materialization code paths a bit clearer, this decouples
materialization from a blob and truncation. It also caches opened
files if openCount > 0 in both the truncation and materialization from
blob paths.
Reviewed By: wez
Differential Revision: D6388318
fbshipit-source-id: c95a85f5bdaa405130f2f7260143592cdc90d45e
Summary:
Replace the initLoggingGlogStyle() function with a more generic initLogging()
function that accepts a log config string to be parsed with parseLogConfig().
Reviewed By: bolinfest, yfeldblum
Differential Revision: D6342086
fbshipit-source-id: fb1bffd11f190b70e03e2ccbf2b30be08d655242
Summary:
Update TreeInode::computeCheckoutActions() to short circuit on non-materialized
directories that are identical to both the destination and source state. This
speeds up `hg update --clean` operations by allowing us to skip portions of the
tree that do not require changes.
This does not have much effect for update operations without `--clean`: those
already short-circuited the operation in `TreeInode::processCheckoutEntry()`
when the source and destination hashes were the same. That code cannot
short-circuit in the forced update case, since the forced update may have to
revert local modifications.
Reviewed By: bolinfest
Differential Revision: D6455970
fbshipit-source-id: 393acb3272745751a56e06dba0c7505ff2bfad44
Summary:
Previously, we used the Mercurial code `g` when faced with an `UNTRACKED_ADDED`
file conflict, but that was allowing merges to silently succeed that should not
have. This revision changes our logic to use the code `m` for merge, which
unearthed that we were not honoring the user's `update.check` setting properly.
Because we use `update.check=noconflict` internally at Facebook, we changed the
Eden integration tests to default to verifying Hg running with this setting. To
support it properly, we had to port this code from `update.py` in Mercurial to
our own `_determine_actions_for_conflicts()` function:
```
if updatecheck == 'noconflict':
for f, (m, args, msg) in actionbyfile.iteritems():
if m not in ('g', 'k', 'e', 'r', 'pr'):
msg = _("conflicting changes")
hint = _("commit or update --clean to discard changes")
raise error.Abort(msg, hint=hint)
```
However, this introduced an interesting issue where the `checkOutRevision()`
Thrift call from Hg would update the `SNAPSHOT` file on the server, but
`.hg/dirstate` would not get updated with the new parents until the update
completed on the client. With the new call to `raise error.Abort` on the client,
we could get in a state where the `SNAPSHOT` file had the hash of the commit
assuming the update succeeded, but `.hg/dirstate` reflected the reality where it
failed.
To that end, we changed `checkOutRevision()` to take a new parameter,
`checkoutMode`, which can take on one of three values: `NORMAL`, `DRY_RUN`, and
`FORCE`. Now if the user tries to do an ordinary `hg update` with
`update.check=noconflict`, we first do a `DRY_RUN` and examine the potential
conflicts. Only if the conflicts should not block the update do we proceed with
a call to `checkOutRevision()` in `NORMAL` mode.
To make this work, we had to make a number of changes to `CheckoutAction`,
`CheckoutContext`, `EdenMount`, and `TreeInode` to keep track of the
`checkoutMode` and ensure that no changes are made to the working copy when a
`DRY_RUN` is in effect.
One minor issue (for which there is a `TODO`) is that a `DRY_RUN` will not
report any `DIRECTORY_NOT_EMPTY` conflicts that may exist. As `TreeInode` is
implemented today, it is a bit messy to report this type of conflict without
modifying the working copy along the way.
Finally, any `UNTRACKED_ADDED` conflict should cause an update to
abort to match the behavior in stock Mercurial if the user has the following
config setting:
```
[commands]
update.check = noconflict
```
Though the original name for this setting was:
```
[experimental]
updatecheck = noconflict
```
Although I am on Mercurial 4.4.1, the `update.check` setting does not seem to
take effect when I run the integration tests, but the `updatecheck` setting
does, so for now, I set both in `hg_extension_test_base.py` with a `TODO` to
remove `updatecheck` once I can get `update.check` to do its job.
Reviewed By: simpkins
Differential Revision: D6366007
fbshipit-source-id: bb3ecb1270e77d59d7d9e7baa36ada61971bbc49
Summary:
This fixes a crash in EdenMount::destroy() if EdenMount::create() failed to
load the root inode. Previously the code called shutdownImpl() in this case
which tried to unload all inodes and crashed since the root inode was null.
This also fixes EdenMount::destroy() to properly handle the FUSE_ERROR and
FUSE_DONE cases.
Reviewed By: wez
Differential Revision: D6434355
fbshipit-source-id: 39c5f4472d6ebbcf881b4c9c8c8fd67686032ec1
Summary:
The goal is to provide a fast path for watchman to flesh
out the total set of changed files when it needs relay that information
on to consumers.
We choose not to include the full list in the Journal when checking out
between revisions because it will not always be needed and may be an
expensive `O(repo)` operation to compute. This means that watchman
needs to expand that information for itself, and that is currently
a fairly slow query to invoke through mercurial.
Since watchman is responding to journal events from eden we know that
we have tree data for the old and new hashes and thus we should be
able to efficiently compute that diff.
This implementation is slightly awful because it will instantiate an
unlinked TreeInode object for one side of the query, and will in
turn populate any children that differ as it walks down the tree.
A follow on diff will look at making a flavor of the diff code that
can diff raw Tree objects instead.
Reviewed By: bolinfest
Differential Revision: D6305844
fbshipit-source-id: 7506c9ba1f4febebcdc283c414261810a3951588
Summary:
This should have been removed as part of D6179950.
Ideally, Buck would error when this happens, but apparently `glob()` does not
complain when patterns do not match any files, even when the pattern does not
contain any wildcards. There appears to be some code at Facebook that is
exploiting this behavior.
Reviewed By: simpkins
Differential Revision: D6421529
fbshipit-source-id: c6f982624e0e12a911bc12ab1e8239ba4358ea56
Summary:
This corrects a bug where the in-memory timestamp for new files would
be set to the last checkout time instead of when the file was created.
Reviewed By: simpkins
Differential Revision: D6366189
fbshipit-source-id: c5fa8c779726d3a75c2d57b2a161293297eb9271
Summary:
Per a discussion with wez, let's make it explicit in the comments that the kernel handles
O_EXCL for us.
Reviewed By: simpkins
Differential Revision: D6365157
fbshipit-source-id: 638f67cd89a9450ff084e0ef77c731ef738bf518
Summary:
In some workloads we're seeing folks run out of file descriptors.
We forgot that we'd taken out the code that closes the underlying fds.
This diff takes a run at adding a simple counter of the open file handle
objects that is incremented when they are constructed and decremented
when they are destroyed.
When the count falls to zero we release the file handle.
Note that we unconditionally open files when we first load the inodes
from the overlay. I tried to defer that open attempt and it broke
the timestamp overlay test. I think we can revisit that aspect in
a follow on diff; for now we should be more resilient to transiently
opened files from things like ripgrep or similar.
Reviewed By: simpkins
Differential Revision: D6097090
fbshipit-source-id: 9a48220002e760fb1ffb8d7e2a68fa7036558b78
Summary:
This bug is part of a bigger issue in our Mercurial integration where
`UNTRACKED_ADDED` conflicts are being silently swallowed in our Hg extension
whereas stock Mercurial presents these as conflicts and forces the user to deal
with them. The Mercurial issues will be addressed in a follow-up change.
Reviewed By: simpkins
Differential Revision: D6365580
fbshipit-source-id: 831e27ce1da90ea605033b2b9988fe400ba404aa
Summary:
In preparation for bringing D6097090 across the finish line, we need a
more explicit mechanism for determining which state the inode is in.
The issue is that whether an inode is materialized was checked in a
couple ways: the nonexistence of the hash field or the definedness of
the file field. This diff introduces an explicit enum indicating the
state.
Reviewed By: simpkins
Differential Revision: D6325955
fbshipit-source-id: 3682a4ebc9330193baadbb33a4dd9845f26e59a6
Summary:
Eden attempts to initialize timestamps of newly loaded inodes with the time of
the last checkout operation performed in this mount. Unfortunately it had a
number of bugs in this logic:
EdenMount had two separate fields attempting to track the last checkout time:
`lastCheckoutTime_` and `parentInfo_.lastCheckoutTime`.
Unfortunately neither field was actually updated on checkout operations.
Additionally, `lastCheckoutTime_` did not have any locking to allow it to be
updated. `parentInfo_.lastCheckoutTime` did have locking, but it used the
mount point's checkout lock, so it could not be accessed during checkout
operations.
This diff removes `parentInfo_.lastCheckoutTime`, keeping only
`lastCheckoutTime_`. It also converts `lastCheckoutTime_` to a
`struct timespec` since this is most often needed as a `timespec`. It also
adds a new mountpoint-wide lock for synchronizing accessing to this variable.
Reviewed By: bolinfest
Differential Revision: D6356698
fbshipit-source-id: db54f9bb297b5febe4642e2b3fcc8055a6afc199
Summary: More work towards encapsulating a FileInode's internal state machine.
Reviewed By: wez
Differential Revision: D6316013
fbshipit-source-id: 9c8303b35a0de1ba69207c7f59be88c5fb037ad8
Summary:
This is preparation for a change I'm preparing that makes the state
transitions in FileInode clearer and safer.
Reviewed By: wez
Differential Revision: D6304272
fbshipit-source-id: 493ee80517443432f790abf9806000eecb03651c
Summary:
The gtest macros in this file were moved to folly/test/TestUtils.h
Update everything to just use folly/test/TestUtils.h directly.
Reviewed By: chadaustin
Differential Revision: D6301759
fbshipit-source-id: 7f2841c12d5bea15376f782fb3bf3bfef16039c7
Summary:
Moving the post-clone step out of C++ makes it so that
`ClientConfig` in C++ no longer knows about `hooks` and requires
`RepoConfig` to know about `hooks`. This helps us reduce
`ClientConfig`'s dependency on `ConfigData`, as my next step
is to move the remaining information that `ClientConfig` gets from
`ConfigData` directly into the client's `edenrc` file under
`~/local/.eden`.
Reviewed By: chadaustin
Differential Revision: D6310544
fbshipit-source-id: dec7a21281ab49e0416b8872757970a4eff2d943
Summary:
I created this change by running:
```
find eden -name TARGETS | grep -v eden/fs/fuse/TARGETS | grep -v eden/fs/service/TARGETS | xargs autodeps
```
apparently `eden/fs/fuse/TARGETS` and `eden/fs/service/TARGETS` have some
constructions that `autodeps` does not understand, so I filtered those out.
Reviewed By: StanislavGlebik
Differential Revision: D6319982
fbshipit-source-id: 7b3683d1507409dde6d6570e9b13811168aa6859
Summary:
This is follow-up to the lock ordering issues in
StreamingSubscriber. The Journal locks are now finer-grained and no
locks are held while the subscribers are invoked. This change
prevents future deadlocks.
Reviewed By: wez
Differential Revision: D6281410
fbshipit-source-id: 797c164395831752f61cc15928b6d4ce4dab1b68
Summary:
With the subsequent diff that enables multiple concurrent
hg importers, I was seeing this deadlock during rebase; each of
the worker threads was being blocked until it saturated the various
thread pools and locked up the server.
This removes the blocking call and replaces it with a SharedPromise
to allow multiple callers to wait for the load.
Note that this changes the semantics of ensureDataLoaded slightly:
previously the blob load was synchronous while the write lock was
held. Now we release the write lock until the load completes.
I've added some post-condition checks to validate that we don't
break any state. I believe that our usage is such that we don't
do anything that might mess things up. Am I missing anything?
Reviewed By: simpkins
Differential Revision: D6264900
fbshipit-source-id: 4aa2870d95f0f0ec48d87299978cb87af99e3969
Summary:
Address some feedback on D6264900 by only storing the sha-1 in the
overlay when it's immediately available. This avoids the possibility
of the FUSE thread getting blocked.
Reviewed By: simpkins
Differential Revision: D6292175
fbshipit-source-id: 06c2372724e58c485d9a302dde36c34885109acf
Summary:
The underlying issue is that we were reporting a `MODIFIED_REMOVED`
conflict as a `MODIFIED_MODIFIED` conflict. This put us in a state where
Mercurial expected to find a file in the new manifest, but failed because the
file was not present in that revision, so no such file could be found.
Somewhat surprisingly, the appropriate handler for a `MODIFIED_REMOVED`
conflict already existed in our Mercurial extension, but there was no logic on
the server that would generate a `MODIFIED_REMOVED` conflict previous to
this change.
Like D6204916, this was an issue I ran into when trying to create a repro case
for the issue that was fixed in D6199215.
Reviewed By: wez
Differential Revision: D6270272
fbshipit-source-id: 6604eea00b0794cd44b01d2ba6b9ea10db32d556
Summary: Make this function match our C++ guidelines.
Reviewed By: wez
Differential Revision: D6288591
fbshipit-source-id: 1a4f52a8c1e0497df938533fe29da10264eb1ccf
Summary:
Add log messages that will help debug the behavior of `TreeInode::diff()`
I added this since I saw a case where `getScmStatus()` was returning a file as
modified when it did not appear to have mode or contents changes.
Unfortunately we did not have any existing debug log messages for this, and the
problem went away after restarting edenfs, so I wasn't able to track it down.
Having these log messages will help debug this in the future if we run into it
again.
Reviewed By: bolinfest
Differential Revision: D6281905
fbshipit-source-id: 736028d6efe46a387bed6bf98d971685f83da3ec
Summary: The comments and the code did not agree on what fields were valid when, so I corrected them and added runtime enforcement.
Reviewed By: wez
Differential Revision: D6260555
fbshipit-source-id: 13e35e22c44b9a5b4cb30cdfd48e2bf7b0e116d3
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.
Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.
This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.
Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.
In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.
Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shutdown Eden, delete the old mounts, restart Eden, and re-clone.
In the end, this change deletes a number of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.
Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.
Reviewed By: simpkins
Differential Revision: D6179950
fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
Summary: Continue moving the asynchrony of loading closer to the state access so, in a future world, we still behave if unloading occurs during access.
Reviewed By: simpkins
Differential Revision: D6238523
fbshipit-source-id: 722fe5bba90b55204d50393314d225f42680409b
Summary: To fix up the invariants in FileInode's API, we intend to remove ensureDataLoaded() and materializeForWrite(). But first I'm going to see if there is any pain caused by removing their calls.
Reviewed By: simpkins
Differential Revision: D6234049
fbshipit-source-id: c39c6d018d164b32903414d3750b875a897af210
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.
This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`
Reviewed By: bolinfest
Differential Revision: D6232695
fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
Summary: If writing out the symlink file in the overlay fails, clean up the file before releasing the lock.
Reviewed By: wez
Differential Revision: D6234214
fbshipit-source-id: 91ccf917a26cc1a73c3963dea7dd87364857fa03
Summary:
Most functions in TreeInode that call Journal::addDelta() release the
TreeInode::contents_ lock before locking the journal. However, `doRename()`
locked the journal while still holding the contents_ locks for the directories
affected by the rename.
This updates `doRename()` to make sure we release the TreeInode locks properly
before attempting to update the journal. This should help ensure that we don't
have any lock ordering issues here.
Reviewed By: bolinfest
Differential Revision: D6212075
fbshipit-source-id: 84503acf47e9d97b81075932326825832cef9514
Summary:
For a while, the recommended approach (e.g. from folks on the thrift team, from folks on the javafoundations team) has been to use `languages = ["java-swift"]`. This renaming brings the code in line with the recommendation. There are no actual changes to any builds, this is just a rename.
This is the second diff in a small stack. It's a fully automated change that uses buildozer to rename `*-java` rules to `*-javadeprecated`.
Differential Revision: D6189950
fbshipit-source-id: a08ba86cb92293a07eceb4b6b29385c7bf647b72
Summary:
Now that we make use of the `.hg/dirstate` file, we should ensure that it is
initialized correctly. This takes the commit hash from the `SNAPSHOT` file and
passes it as an argument to the `post-clone` hook. In the Hg `post-clone` hook,
it writes the `.hg/dirstate` file such that the first 20 bytes are the binary
version of the commit hash and the next 20 bytes are nul bytes.
Reviewed By: simpkins
Differential Revision: D6174440
fbshipit-source-id: 29d17f88ca3f63fa6490e1db88bc2121aced74d6
Summary:
We have a couple of issues with the watchman/eden integration:
1. If you "win" the race, you can cause a segfault during shutdown
by changing files during unmount. This causes the journal updating
code to trigger a send to the client, but the associated eventBase
has already been destroyed.
2. We don't proactively send any signal to the subscriber (in practice: watchman)
when we unmount. Watchman detects the eden shutdown by noticing that its
socket has closed but has no way to detect an unmount.
This diff tries to connect the unmount with the set of subscribers and tries
to cause the thrift socket to close out.
Reviewed By: bolinfest
Differential Revision: D6162717
fbshipit-source-id: 42d4a005089cd9cddf204997b1643570488f04c3
Summary:
Previously, the `savebackup()` and `restorebackup()` methods in `eden_dirstate`
only retained the parent commit hashes. With this change, now the dirstate tuples
and entries in the copymap for the dirstate are also included as part of the saved
state.
Failing to restore all of the state caused issues when doing things like aborting
an `hg split`, as observed by one of our users. Although this fix works, we ultimately
plan to move the responsibility for persisting dirstate data out of Eden and into the
Hg extension. Then the data will live in `.hg/dirstate` like it would for the default
dirstate implementation.
Reviewed By: simpkins
Differential Revision: D6145420
fbshipit-source-id: baa077dee73847a47cc171cd980cdd272b3a3a99
Summary:
When performing an source control checkout operation, we attempt to remove
directories that are empty after the checkout. However, this code path was
missing a call to flush the kernel cache for these directories.
As a result, even though eden thought the directory not longer existed, and
would not report it in `readdir()` results, the kernel would return stale
information from its cache when explicitly accessing this path.
Reviewed By: bolinfest
Differential Revision: D6151543
fbshipit-source-id: 6031feb268ff6f980c885efb26c3d43243dec3f4
Summary:
Add a few extra debug logs to record `processCheckoutEntry()` and
`saveOverlayPostCheckout()` calls.
Reviewed By: bolinfest
Differential Revision: D6151544
fbshipit-source-id: ca6faa8fd1fe53df1e70305f5527360c918841d1
Summary:
CodeMod: Replace `wangle/concurrency` with `folly/executors`.
The headers in `wangle/concurrency/` are now but shims to equivalent headers in `folly/executors/`.
Reviewed By: jsedgwick
Differential Revision: D6120852
fbshipit-source-id: 358ceabea7ad79f84b803ed8e3aecb2a57fdd077
Summary:
Everything that's going in system/ besides CpuId and Subprocess,
which are included in hphp
Reviewed By: mzlee
Differential Revision: D6102263
fbshipit-source-id: 564ef584c341a4ac79db14a9d58fe23ce51e78b3
Summary:
This is needed to unblock some other work on folly.
```
foundation/dependency_management/ensure-explicit-dependencies.sh -e '("boost", None, "boost_filesystem")' boost/filesystem.hpp"
```
Reviewed By: yfeldblum
Differential Revision: D6094896
fbshipit-source-id: 3d4f7b42fa54514cf8c055f5fc477c3a2463bb0a
Summary:
I changed this around at the last minute before landing D5896494 and
it seemed to work, but the optimized build is pretty consistent at falling over
at runtime with a zero page non-null address.
Reviewed By: bolinfest
Differential Revision: D6092224
fbshipit-source-id: 93448615ac3bbea30bfe7729085808d9926f7dab
Summary:
Since adding prefetch support, we're triggering this code
each time we perform a lookup operation on a dir which means that
the eden logs are made a bit more noisy with this output than
previously.
So, let's remove the warning.
Reviewed By: simpkins
Differential Revision: D5905454
fbshipit-source-id: 0abc9b42284a5224cef908293ab377d8858977ec
Summary:
We were previously generating a simple JournalDelta consisting of
just the from/to snapshot hashes. This is great from a `!O(repo)` perspective
when recording what changed but makes it difficult for clients downstream
to reason about changes that are not tracked in source control.
This diff adds a concept of `uncleanPaths` to the journal; these are paths
that we think are/were different from the hashes in the journal entry.
Since JournalDelta needs to be able to be merged I've opted for a simple
list of the paths that have a differing status; I'm not including all of
the various dirstate states for this because it is not obvious how to
reconcile the state across successive snapshot change events.
The `uncleanPaths` set is populated with an initial set of different paths as
the first part of the checkout call (prior to changing the hash), and then is
updated after the hash has changed to capture any additional differences.
Care needs to be taken to avoid recursively attempting to grab the parents lock
so I'm replicating just a little bit of the state management glue in the
`performDiff` method.
The Journal was not setting the from/to snapshot hashes when merging deltas.
This manifested in the watchman integration tests; we'd see the null revision
as the `from` and the `to` revision held the `from` revision(!).
On the watchman side we need to ask source control to expand the list of
files that changed when the from/to hashes are different; I've added code
to handle this. This doesn't do anything smart in the case that the
source control aware queries are in use. We'll look at that in a following
diff as it isn't strictly eden specific.
`watchman clock` was returning a basically empty clock unconditionally,
which meant that most since queries would report everything since the start
of time. This is most likely contributing to poor Buck performance, although
I have not investigated the performance aspect of this. It manifested itself
in the watchman integration tests.
Reviewed By: simpkins
Differential Revision: D5896494
fbshipit-source-id: a88be6448862781a1d8f5e15285ca07b4240593a
Summary:
Running Mercurial's own integration tests revealed that we had a bug here:
https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-histedit-arguments.t
Somewhat unsurprisingly, it was time to finally address a longstanding `TODO`
in `Dirstate.cpp`. The issue was that, after running `hg merge --tool :local`,
`hg status` was not including a merged file in the list of modified files. Because
the information from `hg status` is used to create a commit context, that meant
that when a commit was made after running `hg merge`, the commit did not
include the merged file in the list of files for the commit, which differs from
Mercurial's behavior.
Most of the implementation of `hg status` on the Eden side is done by
`EdenMount.diff()`. However, in this case, `diff()` does not categorize the
merged file by invoking one of the methods of `InodeDiffCallback` because
as far as `EdenMount` is concerned, the file has not changed because `EdenMount`
is unaware of the `Dirstate`. We already have some existing cases where we have
to do some post-processing on the result of `EdenMount.diff()` using information
in the `Dirstate` (e.g., files that are marked for addition or removal), so the fix was
to add a check for the case when the file is flagged as "needs merging" and
then including it as modified in the `hg status` output, as appropriate.
Reviewed By: wez
Differential Revision: D6005603
fbshipit-source-id: 7d4dd80e1a2e9f4b98243da80989e0e9119a566d
Summary:
CodeMod: Change `#include`'s of `wangle/concurrent/GlobalExecutor.h` to use `folly`.
The file in `wangle/` is just a shim around the same file in `folly/executors/GlobalExecutor.h`. Just codemod all the `#include` sites.
Reviewed By: Orvid
Differential Revision: D5981467
fbshipit-source-id: ad7f0dce959e2760d3977b04925945e0447abc1d
Summary:
Originally I thought this would help move towards removing a
`future.get()` call from FileInode, but it turned out to not make a difference
to that code.
It does make it a bit less of a chore to deal with the Journal related diff
callbacks added in D5896494 though, and is a move towards a future where we
could potentially return cached and shared instances of these objects.
This diff is a mechanical change to alter the return type so that we can share
instances returned from the object store interface. It doesn't change any
functionality.
Reviewed By: simpkins
Differential Revision: D5919268
fbshipit-source-id: efe4b3af74e80cf1df20e81b4386450b72fa2c94
Summary: Per wez, this makes the MODIFIED case consistent with the other conflict types (e.g. local_remote). Side benefit of avoiding some naming conflicts in the Haskell/Rust thrift tooling.
Reviewed By: wez
Differential Revision: D5882327
fbshipit-source-id: 3ec68c44d8c8a5c5675f1ced3842d29376d46fe2
Summary:
This moves the bind mounting and post-clone script running
functionality to methods of the EdenMount class and makes the whole
mount flow return a `Future<>`. The higher level goal is to make it
easier to see where and how we want to tweak this flow to support
graceful restart.
This is mostly straight forward but care is required to avoid deadlocks; there
are two scenarios:
# We fulfill the fuse start promise in the context of the fuse thread that is
handling the fuse initialization packet before it has signalled to the kernel
that it has come up. This can be solved by using `via(mainEventBase_)`, but...
# When remounting all the mounts on startup, we're running in the
`mainEventBase_` thread so the simplistic solution to 1. would cause us to
deadlock on startup (visible in the remount integration tests).
So to avoid this, we shunt the completion of the future via a CPU pool.
Also worth noting: the way we were setting up the global CPU pool with
wangle wasn't correct; it takes a weak reference to the pool which was
then getting destroyed when our prepare method returned. It just happened
to work for us in the facebook specific build because something else was
setting up a different CPU executor.
I've reconciled this by just setting up a thread pool of our own and
using that explicitly.
Reviewed By: bolinfest
Differential Revision: D5798659
fbshipit-source-id: f1c48730f283f6962f6cd706c02d82ea2952e369
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.
Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.
This also updates the log messages to communicate when a there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.
Reviewed By: wez
Differential Revision: D5800833
fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
Summary:
Previously, we were clearing entries in `hgDirstateTuples` for which:
```
mergeState == NotApplicable
```
but we should have been checking for:
```
mergeState == NotApplicable AND status == Normal
```
The previous logic was causing us to erroneously clear entries in a state like:
```
mergeState == NotApplicable AND status == MarkedForRemoval
```
This bug manifested itself when grafting a change that removed a file.
The file was removed from disk, but Eden did not know that it had been
`MarkedForRemoval`, so it would report the removed file as "missing" in
`hg status`.
Reviewed By: wez
Differential Revision: D5797270
fbshipit-source-id: 29740dfaa8102db868b95e932716773787f317ac
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.
I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out. It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.
The second of the promises is associated with the fuse thread pool
winding down. The attached future can be extracted and used by the
client of the EdenMount class. This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close. In the current integration, since we ignore
the result of that future, the handle is implicitly closed.
These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.
Reviewed By: bolinfest
Differential Revision: D5778214
fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.
Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.
Reviewed By: bolinfest
Differential Revision: D5778212
fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
Summary:
This is leading up to folding the MountPoint code into
the EdenMount class.
There's still a mention of the MountPoint in Dispatcher.h; that is
being dealt with in the following diff.
Reviewed By: bolinfest
Differential Revision: D5778215
fbshipit-source-id: 996640b3773988a4738ad55bb13de45e1ffe1880
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.
This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel. The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure. Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.
The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads. The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished. A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.
I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.
Reviewed By: bolinfest
Differential Revision: D5751521
fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
Summary:
This partially fixes up a perf problem when performing status when a large
number of inodes have been loaded but not materialized (eg: by `find /edenfs
-ls`).
For the FileInode case we'd end up requesting the SHA1 from the store
twice in parallel only to compare it and decide that the file has not
been changed(!)
The remediation is to cut this code over to calling `FileInode::isSameAs` so that
we can short-circuit some of this work. In addition, we can avoid loading
subtrees if we haven't materialized them and the hash matches up.
Reviewed By: simpkins
Differential Revision: D5783044
fbshipit-source-id: f40da3fadfcf8d9e19221d41e3a5a980454717db
Summary:
Before this change, `hg split` crashed complaining that `node` was a
`changectxwrapper` instead of a 20-byte hash when it was sent as `parent1`
of `WorkingDirectoryParents` in `resetParentCommits()`. Now we use `node()` to
get the hash from the `destctx` that we have already extracted via this line
earlier in `merge_update()`:
destctx = repo[node]
The change to `eden/hg/eden/__init__.py` eliminated the crash, but was
not sufficient on its own to make `hg split` work correctly. There was also a fix
required in `Dirstate.cpp` where the `onSnapshotChanged()` callback was clearing out
entries of both `NotApplicable` and `BothParents` from `hgDirstateTuples`.
It appears that only `NotApplicable` entries should be cleared. (I tried leaving
`NotApplicable` entries in there, but that broke `eden/integration/hg/graft_test.py`.)
I suspected that the logic to clear out `hgDestToSourceCopyMap` in
`Dirstate::onSnapshotChanged` was also wrong, so I deleted it and all of the
integration tests still pass. Admittedly, we are pretty weak in our test coverage
for use cases that write to the `hgDestToSourceCopyMap`. In general, we should
rely on Mercurial to explicitly remove entries from `hgDestToSourceCopyMap`.
We have a Thrift API, `hgClearDirstate()`, that `eden_dirstate` can use to categorically
clear out `hgDirstateTuples` and `hgDestToSourceCopyMap`, if necessary.
Finally, creating a proper integration test for `hg split` required creating a value for
`HGEDITOR` that could write different commit messages for different commits.
To that end, I added a `create_editor_that_writes_commit_messages()` utility as a
method of `HgExtensionTestBase` and updated its `hg()` method to take `hgeditor`
as an optional parameter.
Reviewed By: wez
Differential Revision: D5758236
fbshipit-source-id: 5cb8bf4207d4e802726cd93108fae4a6d48f45ec
Summary:
The serialized data for each file handle needs to be enough
to re-construct the handle when we load it into a new process later
on. We need the inode number, the file handle number that we communicated
to the kernel and a flag to let us know whether it is a file or a dir.
Note that the file handle allocation strategy already accomodates the
idea of migrating to a new process; we don't need to serialize anything
like a next file handle id number.
This doesn't implement instantiating the handles from the loaded state,
it is just the plumbing for saving and loading that state information.
Reviewed By: bolinfest
Differential Revision: D5733079
fbshipit-source-id: 8fb8afb8ae9694d013ce7a4a82c31bc876ed33c9
Summary:
We're not doing anything with this today. It's not
clear whether we should be doing sanity checks (eg: block attempts
to write to a handle that was opened only for reading) or whether
the kernel is going to do that for us, so I've broken this out
as a separate diff from the removal of FileData.
Reviewed By: bolinfest
Differential Revision: D5723064
fbshipit-source-id: b73452dfb4edf88b57fef1ad604bb2bde93bacc1
Summary: These don't exist any more, so remove them
Reviewed By: bolinfest
Differential Revision: D5722861
fbshipit-source-id: 7db112dfab1dfdcf517452b314bd912ec8760bd1
Summary:
Added new tool to report stat information of EdenFs like fuse counters, Memory counters, latencies, Inode status for all the mount points etc.
eden stat : Prints the general information about eden like list of mount points, loaded unloaded and materialized inodes in each mount point. Also this reports how well periodic unload job is doing by reporting the number of unloaded inodes by periodic job.
eden stat io : Prints how many number of calls made to a system call in Edenfs.
eden stat memory : returns the memory stat for edenfs.
eden stat latency : reports the latencies of system calls in Edenfs.
Reviewed By: bolinfest
Differential Revision: D5660345
fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
Summary:
This test was supposed to be a part of D5627411 but it was causing strange behaviour so was brought to a separate diff for further investigation.
After investigating, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix for this was to implement a way to get the relative path of the file even after the inode is unloaded.
Reviewed By: simpkins
Differential Revision: D5646929
fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
Summary:
1.Modified `TreeInode::unloadChildrenNow()` to return number of inodes that have been unloaded.
2.Modified `EdenServiceHandler::unloadInodeForPath()` to return number of inodes that are unloaded.
Reviewed By: simpkins
Differential Revision: D5627539
fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.
As it stands, the way we subclass `dirstate`, it does not appear like `clear()`
should be called, in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate.`
Reviewed By: wez
Differential Revision: D5675298
fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a
Summary:
Previously, we were overloading `hgSetDirstateTuple()` to also make it possible
to delete an entry from the internal `hgDirstateTuples` map. Now we have an
explicit method to do this, which enables us to remove some hacks/TODOs.
Reviewed By: simpkins
Differential Revision: D5665170
fbshipit-source-id: bc0753d4990c8966fd9e6c50b29a954d5023292f
Summary: Modified `TreeInode::unloadChildrenNow` such that inodes are unloaded whose age is greater than a specific age.
Reviewed By: simpkins
Differential Revision: D5526137
fbshipit-source-id: 91e2364d55e31befedcf43d98c26467e1a472ef9
Summary:
Update all of the code using ThreadLocal<EdenStats> to pass in a non-default
Tag parameter to the ThreadLocal template.
A non-default tag parameter is required to use the accessAllThreads() method on
the ThreadLocal object. We need to use accessAllThreads() to perform stats
aggregation correctly. Currently the EdenServer code is only performing
aggregation for stats in the FunctionScheduler.
This diff only updates the ThreadLocal<EdenStats> type, but does not contain
any behavior changes. I will fix the stats aggregation in a subsequent diff.
Reviewed By: bolinfest
Differential Revision: D5657268
fbshipit-source-id: bc4b6f56324eb8d3052c023fd3f6f64f99b1d4e0
Summary:
Updated time stamps of TreeInode accurately on mkdir,rmdir,mknode,symlink,create,unlink and readdir.
updated the `TreeInode::getattr` function to return in-memory timestamps.
Reviewed By: simpkins
Differential Revision: D5568183
fbshipit-source-id: c36f7fb767cd4342aab5cc983eea56e37cd2077e
Summary: Removed creation time from `FileInode::state` which was used for getting timestamps of files that are not materialized.Now that we added timestamps to file inodes and tree inodes we no longer need this.
Reviewed By: wez
Differential Revision: D5552761
fbshipit-source-id: 6013b1f694045e08ada7bd64114c4f2e52848fef
Summary:
updating atime,ctime,mtime of FileInode on read, write and setattr system calls.
modified `FileInode::stat` function to return accurate inmemory timestamps.
Reviewed By: simpkins
Differential Revision: D5552666
fbshipit-source-id: 86d446f72908663f8db509b7b789d9f35d17df3a
Summary:
Added `TreeInode::setInodeAttr` a helper function used in `InodeBase::setattr`. Also,added `InodeBase::setAtime` ,`InodeBase::setMtime` and implemented them in `FileInode` and `TreeInode`.
Moved updating timestamp logic to `InodeBase::setattr` from `FileInode::setInodeAttr` and `TreeInode::setInodeAttr`.
Reviewed By: simpkins
Differential Revision: D5545422
fbshipit-source-id: 597cfabb3062166a058cf32776acb50a1bc0c61c
Summary: Removed `FileInode::setattr` from `FileInode` and added a helper function `setInodeAttr` to perform FileInode or TreeInode specific setattr operations in `InodeBase::setattr`. This diff contains implementation of setattr for FileInode i.e for files, will add setattr implementation for directories in another diff.
Reviewed By: simpkins
Differential Revision: D5544968
fbshipit-source-id: 089491d07a603e111966987ef390b6e597aba28c
Summary:
We're seeing that this is always set to true for eden,
which is causing buck to run slower than it should.
To make this work correctly, I've augmented our journal data structure
so that it can track create, change and remove events for the various
paths.
I've also plumbed rename events into the journal.
This requires a slightly more complex merge routine, so I've refactored the two
call sites that were merging in slightly different contexts so that they can
now share the same guts of the merge routine. Perhaps slightly
counterintuitive in the merge code is that we merge a record from the past into
the state for now and this is a bit backwards compared to how people think.
I've expanded the eden integration test to check that we don't mix up
create/change/removes for the same path in a given window.
On the watchman side, we use the presence of the filename in the createdPaths
set as a hint that the file is new. In that case we will set the watchman
`ctime` (which is not the state ctime but is really the *created clock time*)
to match the current journal position if the file is new, or leave it set
to 0 if the file is not known to be new. This will cause the `is_new`
flag to be set appropriately by the code in `watchman/query/eval.cpp`;
if the sequence is 0 then it should never be set to true. Otherwise (when
the file was in the `createPaths` set) it will be set to the current journal
position and this will be seen as newer than the `since` constraint on
the query and cause the file to show as `new`.
Reviewed By: bolinfest
Differential Revision: D5608538
fbshipit-source-id: 8d78f7da05e5e53110108aca220c3a97794f8cc2
Summary:
1. Added a new structure `InodeBase::InodeTimestamps` to wrap atime,ctime,mtime together. This new structure helps in avoiding usage of `struct stat` for timestamps.
2. Modified function `Overlay::openFile` ,`Overlay::updateTimestampToHeader`, `Overlay::deserializeOverlayDir`, `Overlay::parseHeader` to use this new structure for timestamps instead of `struct stat`. Also, modified code in places where this change is being affected.
3. Added new helper methods `FileInode::setattrTimes` and `TreeInode::setattrTimes` to set timestamps in FileInode and TreeInode during setattr. Implementation of setattr for FileInode and TreeInode is in the diffs stacked above this diff.
4. Replaced atime, ctime, mtime in `FileInode::State`, `TreeInode::Dir` to `FileInode::State::timeStamps` and `TreeInode::State::timeStamps`. Made other necessary changes to support this change.
Reviewed By: simpkins
Differential Revision: D5596854
fbshipit-source-id: 2786b7b695508a62fdf8f7829f1ce76054b61c52
Summary: Currently we have two functions `FileInode::setattr` and `FileInode::setAttr` which are used to set given attributes to a `FileInode`. Merged both the functions in to one function called `FileInode::setattr` and removed `FileInode::setAttr`.
Reviewed By: wez
Differential Revision: D5538490
fbshipit-source-id: ec241fad25d6e4694865e5fc3c0a3500e4838bdd
Summary:
Added a new function `InodeBase::updateOverlayHeader` and implemented `FileInode::updateOverlayHeader` and `TreeInode::updateOverlayHeader` to update inmemory timestamps to overlay header when an inode is unreferenced.
Added helper functions in `Overlay` class to read and update timestamps in to the overlay file. Also,modified `Overlay::loadOverlayDir` to read and populate timestamps from overlay header in to treeinode.
Modified constructor of `FileInode::state` to read timestamps from overlay file and to populate inode timestamps.
Added test case to check if time stamps are updated and read correctly on remount.
Fixed a lint warning in TARGETS file
Reviewed By: simpkins
Differential Revision: D5535429
fbshipit-source-id: f6b758f70101c65d316a35101aacc9a3363f7aed
Summary:
This fixes EdenServer::unmount() to actually wait for all EdenMount cleanup
to complete, and fixes unmountAll() to return a Future that correctly waits for
all mount points to be cleaned up.
Previously `unmount()` waited for the mount point to be unmounted from the
kernel, but did not wait for EdenMount shutdown to complete. Previously
EdenMount shutdown was not triggered until the last reference to the
shared_ptr<EdenMount> was released. This often happened in the FUSE channel
thread that triggered the mountFinished() call--it would still hold a
reference to this pointer, and would not release it until after
mountFinished() returns. As a result, when the main thread was shutting down,
`main()` would call `unmountAll()`, and then return soon after it completed.
Some FUSE channel threads may still be running at this point, still performing
`EdenMount` shutdown while the main thread was exiting. This could result in
crashes and deadlocks as shutdown tried to access objects already destroyed by
the main thread.
With this change `EdenMount::shutdown()` is triggered explicitly during
`mountFinished()`, and `unmount()` will not complete until this finishes.
The `EdenMount` object may still exist at this point, and could still be
deleted by the FUSE channel thread, but the deletion now only requires freeing
the memory and does not require accessing other data that may have been cleaned
up by the main thread.
We should still clean up the FUSE channel thread handling in the future, to
make sure these threads are joined before the main thread exits. However, that
cleanup can wait until a separate diff. Ideally I would like to move more of
the mount and unmount logic from EdenServer and EdenServiceHandler and put that
code in EdenMount instead.
Reviewed By: bolinfest
Differential Revision: D5541318
fbshipit-source-id: 470332478357a85c314bc40458373cb0f827f62b
Summary: Changed the initialization of last checkout time in Tree inode constructor by grabbing a lock on `TreeInode::contents_`
Reviewed By: simpkins
Differential Revision: D5524857
fbshipit-source-id: e86ee7d986ca0c280ba156ba9146d6d1f9fa722e
Summary:
In testing, we discovered a case where concurrent Hg operations in EdenFS would
//sometimes// fail with `ECONNREFUSED` when trying to read
`<eden-mount>/.eden/socket`.
This was very confusing, as the standard reasons for `ECONNREFUSED` did not seem
to apply:
- We verified that Eden had not crashed.
- We verified that Eden's UNIX domain socket had a sufficiently large backlog
(1024) such that we should not be at risk of exhausting it with two simultaneous
Hg commands.
It turned out that there was a race condition in our `readlink()` command, which
could cause `<eden-mount>/.eden/socket` to resolve to the wrong path. Failing to
connect to this path manifested itself as an `ECONNREFUSED` error.
It turned out that `readlink()` used `FileInode::readAll()`, which was
performing an `lseek()` followed by a `read()` on a file descriptor while the
descriptor was protected by a //read lock// instead of a //write lock//. Because
the `lseek()` causes a state change, it needs to be protected by a write lock.
Changing the type of the lock fixed the issue.
The only other caller of `FileInode::readAll()` appears to be in
`TreeInode::loadGitIgnoreThenDiff()`, so presumably this fixes a possible race
condition there, as well.
Reviewed By: simpkins
Differential Revision: D5533001
fbshipit-source-id: 86cf84c45463b2ae194d6f46909ea67c0f71d065
Summary:
Make a few minor tweaks to the strace-style logging added in D5464387.
- Call the log categories "eden.strace.<mount_path>" instead of
"eden/strace<mount_path>". The folly logging library uses '.' to separate
nodes in the log category hierarchy, so this puts all of the strace messages
under the "eden.strace" category, which it itself part of the "eden"
category. The "<mount_path>" will contain slashes inside it, but slashes are
not treated specially in log category names.
- Rename `EdenMount::getLogger()` to `EdenMount::getStraceLogger()` since this
logger should be used only for strace-style events, and not for general log
messages for this mount point.
Reviewed By: bolinfest
Differential Revision: D5515245
fbshipit-source-id: 9d833d9fbff47c6a57a7afefeae92755ff0e28b7
Summary: Added atime,ctime,mtime for tracking timestamps for directories inmemory and initialized them to the last checkout time during the creation of TreeInode.
Reviewed By: bolinfest
Differential Revision: D5440950
fbshipit-source-id: 639cf1ce6722f80dde35d33849aa712aa30301a8
Summary:
Added a new data member lastCheckoutTime to EdenMount class to store the time when checkout operation is performed. Also added a method to get the last checkout time which internally returns the lastCheckoutTime in EdenMount class.
Added new fields atime,mtime,ctime in FileInode::State structure to keep track of timestamps in memory. Initialzed timestamps in FileInode::State constructor by calling getLastCheckOutTime from EdenMount class.
Still have to add timestamp tracking for directories and have to initialize them with lastCheckout time.This probably will be done in a seperate diff.
Reviewed By: bolinfest
Differential Revision: D5437682
fbshipit-source-id: e3d6b1bc0c2192538dd3b0d9a6017ceb3ca0843d
Summary:
Moved all the member functions from `FileData` class to `FileInode` class
and made `FileInode` methods independent of shared `FileData` object.
Removed `FileData.h` and `FileData.cpp` files as they are not needed anymore.
Modified functions `FileInode::getSHA1()` and `FileInode::isSameAsFast` and
modified few testcases which are currently using `FileData` class and made
sure that all the test cases are passing.
Reviewed By: bolinfest
Differential Revision: D5430128
fbshipit-source-id: 3e8e6c490e92e4e602355e4ce39b67c450ec53f8
Summary:
D5483953 added a check to ensure that getMode() can only be called on entries
that do not have a loaded inode. However, a few places in the code were still
calling getMode() on tree entries without checking if the inode was loaded or
not.
These crashes were caught in the integration tests run by `buck test eden/...`,
but were not caught by sandcastle tests on the original diff since sandcastle
only runs the eden unit tests, and not the integration tests.
All of these locations only needed to check the file type, which is safe to do
even if the inode is loaded, since the file type portion of the mode can never
change on an existing inode. Only the permissions bits are unsafe to access
once an inode has been loaded (since we need to ask the inode itself for the
latest permissions bits).
I updated these call sites to stop using getMode() and instead use functions
that explicitly check only the file type bits.
Reviewed By: bolinfest
Differential Revision: D5501256
fbshipit-source-id: c989dab2fdacb5b9cdecb6f5101795298f57e78b
Summary:
Before this commit, TreeInode::Entry was a struct which had two
private members: mode and inode. In this commit,
1. TreeInode::Entry was changed from struct to class.
2. Appropriate getters and setters were introduced for the public members to
make them private.
3. Existing code accessing the public members directly was modified to use
the getters instead.
4. A couple of TODOs were added to address Overlay::saveOverlayDir()'s access
of child inode information.
Reviewed By: simpkins
Differential Revision: D5483953
fbshipit-source-id: 50d526731e193a3a4a32742bd4d49deb9ee6b432
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.
Reviewed By: igorsugak
Differential Revision: D5454343
fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
Summary:
Log messages to an eden.strace category.
This allows us to enable/disable based on the mount_path. For example:
./buck-out/gen/eden/cli/cli.par daemon -F -- --logging eden/strace/data/users/ekent/eden-NEW=DBG7
Thus, we are using a category, rather than filename (default)
Reviewed By: bolinfest
Differential Revision: D5464387
fbshipit-source-id: 6a54badd6bb806cfcda1742fd970073d91303396
Summary:
1.Refactored FileInode and FileData code.
2.Moved some data members of file data into struct State inside file inode
3.Refactoring code such that FileData and FileInode classes are eventually moved to FileInode class.
Reviewed By: simpkins
Differential Revision: D5414427
fbshipit-source-id: cf24721a65541ddfdec7ead4a035de4da3fd5bb5
Summary:
- Updated makeShared() to return a folly::Future<std::shared_ptr<EdenMount>> instead of just a std::shared_ptr<EdenMount>.
- Updated callers to use the returned future (get() for now)
- Refactor makeShared() and the EdenMount constructor to avoid blocking.
Reviewed By: simpkins
Differential Revision: D5424088
fbshipit-source-id: f026a3a3e4abb3593bafda76673e12c55da26322
Summary: Added header to the files that are materialized .
Reviewed By: simpkins
Differential Revision: D5387990
fbshipit-source-id: 1d551d674a39e01d6314d8bb3308e7bea3e669fc
Summary:
This updates the TreeInode code to remove the redundant materialized flag.
A TreeInode should have a Tree Hash if and only if it is dematerialized, so
there is no need for an extra `materialized` boolean.
This diff also fixes an issue in TreeInode::saveOverlayPostCheckout() where it
was not correctly informing it's parent TreeInode of the change if it moved
from one dematerialized state to another (with a different TreeInode hash).
This fixes the code to correctly call `parent->childDematerialized()` when it
needs to inform the parent that it now refers to a different source control
hash.
Reviewed By: wez
Differential Revision: D5336629
fbshipit-source-id: b4d86ecdef2f5faefbc243a09d869c02384ae95c
Summary:
Add a command to deserialize and display information about files in the
overlay. This can be used to help debug the current state of files in the
overlay.
Reviewed By: wez
Differential Revision: D5332014
fbshipit-source-id: 25b01579df33aa9f1926c0144e9f03aa8ece38fc
Summary:
1.Added a new method to create header.
2.Added header to the overlay files of directories.
3.Added test class OverlayTest for Overlay related tests.
Reviewed By: simpkins
Differential Revision: D5335134
fbshipit-source-id: 31f59e7af70a3eeae6350261ded5d8b1bec2b9d0
Summary:
This is mostly needed because of the `folly:file` target split, but there are a few other that had changes that weren't reflected.
This also re-enables auto-deps for `multifeed/aggregator/processor/TARGETS` as it works again.
Reviewed By: yfeldblum
Differential Revision: D5362875
fbshipit-source-id: f9d2793249c0bfe644d0504f19839c108648ea3b
Summary:
[Folly] Remove `construct_in_place`.
It is an alias for `in_place`, and unnecessary.
Reviewed By: Orvid
Differential Revision: D5362354
fbshipit-source-id: 39ce07bad87185e506fe7d780789b78060008182
Summary:
Update the Overlay class to hold a lock on the info file inside the overlay for
as long as it exists.
This just adds an extra layer to help ensure that two separate edenfs processes
are not operating on the same overlay directory at the same time. We already
hold a lock on the .eden directory itself, which generally should be sufficient
protection, since overlay directories are always scoped to a .eden directory.
However, adding a per-Overlay lock will help ensure that we do not have issues
where we try to open an Overlay directory twice, particularly when remounting
an existing mount point that we just closed.
Differential Revision: D5326023
fbshipit-source-id: eb8b213225b8d6905a982db0bfac73a17d1bd246
Summary:
Add a basic unit test that adds a new file, remounts the mount point, and
confirms the materialized contents are loaded from the overlay correctly on
remount.
Differential Revision: D5326024
fbshipit-source-id: d2446030802cc4afe5af09460d590ccf8c43e525
Summary:
Add a main() function to the indoes unit test, to allow a `--logging` flag to
control log levels.
I will eventually update our default test `main()` function to include this
flag, but for now adding a custom `main()` for the inodes test is the simplest
fix.
Reviewed By: wez
Differential Revision: D5326022
fbshipit-source-id: 36f497658fdb21639408fc599cf75908b9c9acb3
Summary: It doesn't need to exist anymore
Reviewed By: yfeldblum
Differential Revision: D5318746
fbshipit-source-id: c70b184f4b3fc12ede4632d6b3d43de16ed758c7
Summary:
this code was present to help in some benchmarking
in the very early prototype code. I noticed that this was
still lingering when reviewing a diff the other day.
Remove it!
Reviewed By: simpkins
Differential Revision: D5334263
fbshipit-source-id: 9ecccf1f9922b4c8f46b2529da665e2fdf11ab7a
Summary:
Format all of the TARGETS files under eden/fs with the autodeps tool.
A few rocksdb include statements require comments so that autodeps can
correctly tell which dependency this include comes from. The rocksdb library's
source file structure unfortunately does not match the layout of how its header
files get installed, so autodeps cannot figure this out automatically.
Reviewed By: wez
Differential Revision: D5316000
fbshipit-source-id: f8163adca79ee4a673440232d6467fb83e56aa10
Summary:
1. Moved read, write, mkdir, rm methods in hg/lib/hg_extension_test_base.py to lib/test_case.py.
2. Added integration test case to test unload free inodes.
Reviewed By: simpkins
Differential Revision: D5277870
fbshipit-source-id: b93b6049a10357cf8c92366e6dca3968f7f30c30
Summary:
Update eden to log via the new folly logging APIs rather than with glog.
This adds a new --logging flag that takes a logging configuration string.
By default we set the log level to INFO for all eden logs, and WARNING for
everything else. (I suspect we may eventually want to run with some
high-priority debug logs enabled for some or all of eden, but this seems like a
reasonable default to start with.)
Reviewed By: wez
Differential Revision: D5290783
fbshipit-source-id: 14183489c48c96613e2aca0f513bfa82fd9798c7
Summary:
Now that the non-future versions of these APIs have been removed, rename
getBlobFuture() to getBlob(), and getTreeFuture() to getTree()
Reviewed By: wez
Differential Revision: D5295690
fbshipit-source-id: 30dcb88854b23160692b9cd83a632f863e07b491
Summary:
Remove the blocking getBlob() API.
There were a few call sites in FileData still using this blocking API. For now
I simply updated them to use getBlobFuture() and make a blocking get() call on
the returned future. These call sites already had TODO comments documenting
the blocking behavior.
I plan to rename getBlobFuture() to getBlob() in a subsequent diff.
Reviewed By: wez
Differential Revision: D5295726
fbshipit-source-id: 748fd7a140b9b59da339a330071f732bba38cb35
Summary:
Remove the ObjectStores.h and .cpp files. These files contained helper
functions for looking up Tree and TreeEntry objects based on a deep relative
path.
These functions are no longer used anywhere anymore (the code that performs
deep lookups always does inode lookups instead). Additionally, this file is
the only place still using some of the blocking, non-Future-based APIs for
object lookups. Deleting these files will let us remove these deprecated
blocking APIs.
Reviewed By: wez
Differential Revision: D5295691
fbshipit-source-id: 89229827305490eba4d3a581954a90c5cf63238d
Summary:
This is a major change to Eden's Hg extension.
Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.
In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.
However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.
Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).
In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)
That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store in as Thrift-serialized data in an ordinary file along with the `_copymap`
data.
In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.
One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, like in the implementation of `eden_dirstate.status()`
in order to make `hg remove` work).
In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!
I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.
For the reasons explained above, this revision deletes a lot of code `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.
Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)
The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.
Reviewed By: simpkins
Differential Revision: D5071778
fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
Summary:
This is generated by applying clang-tidy `-checks=modernize-use-override` to all the c++ code in project eden.
It enforces the use of the keywords `virtual`, `verride` and `final` in a way compliant to the style guide.
Reviewed By: igorsugak
Differential Revision: D5108807
fbshipit-source-id: 596f2d73f1137de350114416edb1c37da5423ed5
Summary: The `nodiscard` attribute was added to standard C++ in C++17. Prefer the standard formulation when it is available; otherwise, use `__attribute__((__warn_unused_result__))` on compilers that support the GNU extensions, and `_Check_return_` on MSVC. The old `FOLLY_WARN_UNUSED_RESULT` is now expands to `FOLLY_NODISCARD`.
Reviewed By: yfeldblum
Differential Revision: D5105137
fbshipit-source-id: 9aa22e81cd9f0b89f9343433aeae3ba365227ccb
Summary:
These files are defining flags, so they should import gflags.
This import being missing causes open source build to fail.
Reviewed By: bolinfest
Differential Revision: D4992246
fbshipit-source-id: 61eb35b43ae6f9ff88bd34a9ebc8db6d80ac9d08
Summary:
Update Dirstate::remove() to directly call TreeInode::unlink() instead of going
through the EdenDispatcher. Internal code should be able to use TreeInode
directly without going through the FUSE Dispatcher. This prevents us from
having to look up the TreeInode object by inode number again.
This also fixes TreeInode::unlink() and TreeInode::rmdir() to perform FUSE
cache invalidation correctly. EdenDispatcher used to do this for unlink()
calls, but it was missing for rmdir() calls.
Reviewed By: wez
Differential Revision: D4968832
fbshipit-source-id: 51fda5de196392c640436ca6ad7d8333e6799db9
Summary:
This replaces the shouldFileBeDeletedByHgRemove() function with a much simpler
and safer check.
It turns out this code was more complicated than it needed to be: it looked up
the inode in question and confirmed that it still existed, even though it's
calling function had already previously looked up the inode object.
I believe this deletes the last place in the code that still assumes that an
InodeBase object must be loaded if the inode has previously been materialized.
This also removes one of the last remaining places that directly holds the
TreeInode contents_ lock outside of TreeInode.cpp.
Reviewed By: bolinfest
Differential Revision: D4968834
fbshipit-source-id: a37bc0d7889eb81cca5b045cbc82732a53ef53d4
Summary:
This updates the Dirstate::addAll() code to use EdenMount::diff() internally,
instead of getStatusForExistingDirectory().
This lets us delete getStatusForExistingDirectory() and several other helper
functions that it used. The logic in getStatusForExistingDirectory() was based
on the incorrect assumption that files can only be modified from their expected
source control state if they are materialized. This could result in incorrect
results in a variety of cases (particularly after renames, or "hg reset"
operations). The EdenMount::diff() code does not have this problem.
That said, this new addAll() code does still have some performance issues--it
currently does a full tree diff for each input path. We should ideally fix
this in the future to only diff the necessary subtree for each path. However,
in the short term this trade-off seems worth being able to delete all of this
older, buggy code. diff() should be cheap enough in most cases that this won't
be a major problem unless a large number of paths are given as input.
Reviewed By: bolinfest
Differential Revision: D4968835
fbshipit-source-id: 1834aa98a26dcaa0e1c06c7ac25c57944fa1b5f7
Summary:
Make sure we use pointers to const GitIgnoreStack objects when possible.
Also change the code to store the root GitIgnoreStack object for a diff()
operation in the DiffContext object, instead of having EdenMount::diff()
allocate it separately on the heap. This will make it easier to consolidate a
bit more logic in the DiffContext class in the future. In particular, I
suspect we will want a version of diff() that only works on one portion of the
tree. Putting more of the functionality in DiffContext will make it slightly
easier to share code between the full-mount diff and the subtree diff
functions. This also happens to save a memory allocation for now.
Reviewed By: wez
Differential Revision: D4968833
fbshipit-source-id: 1dc33b3d44cdf00e93b22d810c3a736d27c13638
Summary:
Update the eden thrift APIs to support multiple thrift parents, and update the
hg extension to use the new thrift APIs.
Reviewed By: bolinfest
Differential Revision: D4949139
fbshipit-source-id: 98a7bd17ccad7be6a429df34ecfaf3fe7ae0b39a
Summary:
This updates the ClientConfig and EdenMount code to support storing two parent
commits.
This changes the on-disk SNAPSHOT file contents add an 8-byte header that
includes a file identifier and a file format version number, followed by up to
two commit hashes. The code currently can read either the old or new format
from the SNAPSHOT file. We should be able to drop the code for reading the old
format fairly soon if we want, though.
This diff only updates the ClientConfig and EdenMount code, and does not yet
update the thrift APIs or the eden mercurial extension yet. I will update the
rest of the code in a subsequent diff.
Reviewed By: bolinfest, wez
Differential Revision: D4943917
fbshipit-source-id: cf456e67b845aa0cf8b45c822985cb932df107b4
Summary: Add some unit tests that check the behavior of setattr() and getattr().
Reviewed By: bolinfest, wez
Differential Revision: D4941215
fbshipit-source-id: 6691f51aea742d0159a16112ca291546a8adc928
Summary:
This fixes "hg commit" so that it correctly updates the in-memory snapshot.
This has been broken ever since I added the in-memory snapshot when
implementing checkout(). The existing scmMarkCommitted() method updated only
the Dirstate object and the on-disk SNAPSHOT file.
This diff fixes checkout() and resetCommit() to clear the Dirstate user
directives correctly, and then replaces calls to scmMarkCommitted() with
resetCommit().
Reviewed By: bolinfest
Differential Revision: D4935943
fbshipit-source-id: 5ffcfd5db99f30c730ede202c5e013afa682bac9
Summary:
Update the gitignore code so that patterns ending in a trailing slash correctly
match only directories. Fortunately we have file type information available
during TreeInode::computeDiff(), so getting the correct file type information
does not add any extra overhead. The older ignore checking code in
Dirstate.cpp can also do a reasonable job of getting file type information.
That code should also be removed at some point in the future, and use the
TreeInode logic for computing ignore status.
Reviewed By: bolinfest
Differential Revision: D4901326
fbshipit-source-id: 1222c8142876c91e1b80ec937ec84c0c28737224
Summary:
The temporary Hash object doesn't live long enough for the copy into
the string, so make an explicit temporary to extend its lifetime.
Reviewed By: simpkins
Differential Revision: D4900457
fbshipit-source-id: 948d919b929285c30ee1f9f0111cba20b60d4dc6
Summary:
This change makes it so that all of the C++ code related to the edenfs daemon
is now contained in the eden/fs subdirectory.
Reviewed By: bolinfest, wez
Differential Revision: D4889053
fbshipit-source-id: d0bd4774cc0bdb5d1d6b6f47d716ecae52391f37
Summary:
Fix a subtle crash during checkout when handling newly added entries that
already exist in the working directory: CheckoutAction passed the entry name to
checkoutUpdateEntry() as a PathComponentPiece. However, this
PathComponentPiece could refer to the entry name owned by newScmEntry_, and it
also passed newScmEntry_ into checkoutUpdateEntry() as an rvalue reference.
As a result, if the string data was stored invalidated by the move the name
would no longer be valid when checkoutUpdateEntry() tried to use it.
This bug is triggered by doing an "hg update --clean", where a file added in
the destination commit already exists on disk, and has an entry name of 23
characters or less. (The 23 character limit is fbstring's upper bound on
small string optimizations, where it will store the string data inline in the
object, causing it to be invalidated on move.)
This also fixes a crash in a VLOG() statement when the verbose log level for
TreeInode.cpp was set to 4 or greater.
Reviewed By: bolinfest
Differential Revision: D4882544
fbshipit-source-id: 917ede6eeae2224aaa0724b8b30324f3c3a5c924
Summary:
The intent is to provide a way to locate the SNAPSHOT file
for tools that want to have a very fast way to figure out the commit
id without making any RPCs or subprocess invocations.
Reviewed By: simpkins
Differential Revision: D4824176
fbshipit-source-id: 5adca225d9984146852dad1e83de0d903848c1e5
Summary:
This simplifies the InodePtrImpl class by removing support for pointer-to-const
Inode objects. We don't currently use pointer-to-const Inode objects, and I
don't really anticipate that we will ever need this.
The primary motivating factor behind this diff is a recent change in clang that
breaks the existing code: clang PR31606 prevents children classes from
inheriting template constructor instantiations from their parent which take a
reference to the parent class type as the only argument.
While this could have been fixed by explicitly defining the necessary
constructors in the child class (or simply by overriding newPtrLocked() and
newPtrFromExisting() to return the subclass type), dropping support for
pointer-to-const allows us to simplify the code and resolve this issue.
Reviewed By: wez
Differential Revision: D4825526
fbshipit-source-id: 999de352788b13818b7f23788bb34686e193cd5d
Summary:
This diff fixes FileData::stat() so that we report reasonable timestamp values
on non-materialized files, rather than always leaving them as 0. We set the
timestamps to the time that we created the FileInode. This ensures that
timestamps are updated correctly when files are modified by a checkout
operation.
Note that for materialized files the code reports the timestamp of the overlay
file. This diff does not modify that behavior. However, this behavior is
incorrect, as the overlay file timestamps are not updated by a FUSE client
opening, modifying, then closing a file (since we keep the underyling overlay
file handle open, and don't close it).
In the future we'll need to implement our own tracking of atime, mtime, and
ctime values. We should probably store these in a header inside the overlay
file. For now, this diff is a stop-gap measure that ensures we at least update
non-materialized file timestamps correctly on checkouts.
Reviewed By: bolinfest
Differential Revision: D4765632
fbshipit-source-id: 478da6441e213cdfe830f1c5129212182ce4eeb0
Summary:
This diff adds a new "eden debug" command, with a few subcommands:
- "eden debug inode": Report information about currently loaded inodes
- "eden debug tree": Report data about source control Tree objects
- "eden debug blobmeta": Report metadata about source control Blob objects
- "eden debug blob": Report the contents of source control Blob objects
This diff also includes the thrift APIs in edenfs to support these commands.
Reviewed By: bolinfest
Differential Revision: D4760528
fbshipit-source-id: 7dc2bd6e0e952ba6fb61702c672fb9417d603ffa
Summary:
This updates all of the references to gtest and gmock with googletest.
The change is mechanilcal, generated with the following one-liner:
```lang=bash
hg grep -lwE '(gtest|gmock)' 'glob:**/TARGETS' | grep -v '^third-party-buck' | xargs perl -pi -e '
$gt=qr!(["'"'"'])gtest\g1!;
(
s!$gt(\s*,\s*(.any.|None))(\s*,\s*)?\),?!\1googletest\1\2, \1gtest\1\),!g or
s!$gt((\s*,\s*(.any.|None)[^\)]+))\),?!\1googletest\1\2\),!g or
s!\(\s*$gt,?\s*\),?!\(\1googletest\1, None, \1gtest\1\),!g or
s!$gt,?!\(\1googletest\1, None, \1gtest\1\),!g
) unless /(name|type) *=/;
$gm=qr!(["'"'"'])gmock\g1!;
(
s!$gm(\s*,\s*(.any.|None))(\s*,\s*)?\),?!\1googletest\1\2, \1gmock\1\),!g or
s!$gm((\s*,\s*(.any.|None)[^\)]+))\),?!\1googletest\1\2\),!g or
s!\(\s*$gm,?\s*\),?!\(\1googletest\1, None, \1gmock\1\),!g or
s!$gm,?!\(\1googletest\1, None, \1gmock\1\),!g
) unless /(name|type) *=/;
'
```
Reviewed By: meyering
Differential Revision: D4643237
fbshipit-source-id: fda7f41760c7e44254231df87634631c343e6355
Summary:
Tweaks the stats stuff so that we can name the histograms and export them with p50, p90, p99 percentiles.
This is made a bit ugly because the fb303 stats code isn't yet open source, so
there's a moderately ugly preprocessor directive that we assume to be true for
our internal build. We'll need to force that to be 0 for the open source build
until we open the stats code and make it available.
Reviewed By: bolinfest
Differential Revision: D4801868
fbshipit-source-id: 643909e63bd4a74b2cfa580be131f65c5673bc94
Summary:
this is the bare minimum to support creating unix domain sockets.
We only support using mknod to create a unix socket; other uses will yield an error.
I've added an rdev field as a sibling of the existing mode field that we track,
as that is the additional parameter that we need to track as part of the
special file node.
Special file nodes are tracked in the overlay as empty files.
Reviewed By: bolinfest
Differential Revision: D4774099
fbshipit-source-id: 0824b7e509063faa8bede7aff82a7c51930c4f83
Summary:
This is an initial pass at implementing "hg status" using the new EdenMount and
TreeInode diff() code.
More work still needs to be done to clean things up here, but this makes
"hg status" more correct in the face of modified files and directories that are
not modified, and vice-versa.
In particular, the following still needs to be cleaned up in future diffs:
- Update the "hg status" code to more correctly process user directives for
paths that did not appear in the diff output. This will probably be
simplified some by correctly updating the user directive list on checkout and
resetCommit() operations. It may also make things simpler if we store the
user directives in a hierarchical path map, rather than in a flat list.
- Update the addAll() code to also use the new diff logic, rather than the
getStatusForExistingDirectory() function.
- Clean up the locking some. I think it may be worth changing the code to use
a single lock to manage both the current commit ID for the mount and the
dirstate user directives. We probably should hold this lock for the duration
of a diff operation. The diff operation should also return the commit ID
that it is based against.
Reviewed By: wez
Differential Revision: D4752281
fbshipit-source-id: 6a6d7e2815321b09c75e95aa7460b0da41931dc0
Summary:
It's not really magic because we don't have a virtual directory
inode base any more. Instead, we mkdir and populate it at mount time.
What is slightly magical about it is that we give it some special powers:
* We know the inode number of the eden dir and prevent unlink operations
on it or inside it.
* The .eden dir is present in the contents of the root inode and will
show up when that directory is `readdir`'d
* When resolving a child of a TreeInode by name, we know to return the
magic `.eden` inode number. This means that it is possible to `stat`
and consume the `.eden` directory from any directory inside the eden
mount, even though it won't show up in `readdir` for those child dirs.
The contents of the `.eden` dir are:
* `socket` - a symlink back to the unix domain socket that our thrift
server is listening on. This means that it is a simple
`readlink(".eden/socket")` operation to discover both whether a directory
is part of an eden mount and how to talk to the server.
* `root` - a symlink back to the root of this eden mount. This allows
using `readlink(".eden/root")` as a simple 1-step operation to find
the root of an eden mount, and avoids needing to walk up directory
by directory as is the common pattern for locating `.hg` or `.git`
dirs.
Reviewed By: simpkins
Differential Revision: D4637285
fbshipit-source-id: 0eabf98b29144acccef5c83bd367493399dc55bb