Summary:
This is moving some files around in preparation for
moving TakeoverData to using thrift for its serialization
Reviewed By: simpkins
Differential Revision: D6733405
fbshipit-source-id: 235ba237546f8ef606de8445db45683ce38a2d2c
Summary: Minor refactor replacing if with switch and a loop control comparison.
Reviewed By: wez
Differential Revision: D6730849
fbshipit-source-id: 69ac7be9fae3be6274b56dd2baea433ed2437f7c
Summary:
Changed `eden clone` to check out master commit in both git and hg.
Previously, it checked out the current commit for the repo.
Reviewed By: simpkins
Differential Revision: D6663754
fbshipit-source-id: 92b185ccca5d082dc2bde9c8b191c82a2a4f06b4
Summary:
This implements a TODO/FATAL that is important for
graceful restarts to be useful in my "acid test" scenario,
which is to perform a graceful restart while buck build is
running.
Reviewed By: simpkins
Differential Revision: D6700189
fbshipit-source-id: dec1b818ebc9e907841bc127ee08c953b59d6487
Summary:
Fix lines longer than 80 characters, and also reduce the number of times the
initial directory listing is repeated throughout this file.
Reviewed By: wez
Differential Revision: D6710372
fbshipit-source-id: bdb02cbebabeff7d7c6c88aebee7ebab1865535b
Summary:
Update the statfs() code to return non-zero values for namelen and frsize.
Returning 0 for namelen was causing problems for programs that checked
`pathconf(path, _PC_NAME_MAX)` and tried to honor this value. For instance,
GNU patch would try to generate files with a 0-length name since we indicated
the maximum name length was 0.
I haven't investigated too closely, but this behavior might have broken only
recently when we stopped using libfuse. libfuse may have been setting this to
255 for us. I didn't see code in libfuse that would do this, but I'm fairly
sure GNU patch was working correctly very recently.
Reviewed By: wez
Differential Revision: D6710370
fbshipit-source-id: fc9a0320cd7c1eb2545219a3ec123c2f0644fb5d
Summary:
This avoids the long delay and the `Cpp2Worker.cpp:281] Failed to join
outstanding requests.` message from showing up in the eden logs during shutdown
and graceful restart.
Reviewed By: chadaustin
Differential Revision: D6693117
fbshipit-source-id: 3c56314f288a596264cddae0d8bbab66ab19e9fe
Summary:
Previously, `EdenMount::create` would implicitly call
`EdenMount::initialize` which would load the root inode and the `.eden` magical
directory. That's fine for the fresh mount case, but in the case of the
graceful restart we need to take the opportunity to apply the `InodeMap`
from the old process before we start muddying its state.
This diff breaks out the `initialize` method from the `create` method and
makes the mount code call it after potentially loading the `InodeMap` from
the takeover data.
In addition, this diff removes the the `root->loadMaterializedChildren()`
call from the mount initialization code. It is no longer required to do
this eagerly and it makes things simpler and our memory profile a little
smaller to defer this (I haven't measured how much that impacts things).
Reviewed By: simpkins
Differential Revision: D6691182
fbshipit-source-id: 52033a6d64105b658314a919f69dbfcd4eea242b
Summary:
This removes the duplicated logic and makes it a little
bit easier to follow the initialization sequence.
It doesn't change the behavior, just moves some code around.
Reviewed By: simpkins
Differential Revision: D6691180
fbshipit-source-id: 2068dbe56ebe9a6136d69689997aec8dedd32be0
Summary:
This is the "simple" threading through of a `doTakeover`
flag and the return of `Optional<TakeoverData>` in the unmount code.
Reviewed By: simpkins
Differential Revision: D6691181
fbshipit-source-id: 4a384787783c16085f2e9964964023ba07cefca3
Summary:
I'm so-so on a bit of the implementation here, but it works!
I had to change the `takeoverPromise` from the `pair<fuseDevice, connInfo>`
to a new helper struct because we now have three distinct pieces of data
to pass out of EdenMount to build up the overall TakeoverData.
The key change in this diff is that we have to release all of the file handles
we're maintaining in the `FileHandleMap` prior to shutting down the `InodeMap`,
otherwise the `InodeMap` will never complete (it waits for all inodes to be
unreferenced, and that cannot happen while there are open file handles). I've
made the `FileHandleMap` serialization and clearing contingent on performing a
takeover shutdown because that feels like the safest thing to do wrt. not
losing any pending writes.
Reviewed By: simpkins
Differential Revision: D6672437
fbshipit-source-id: 7b1f0f8e7ff09dbed850c7737383ecdf1e5ff0c7
Summary:
This is the key portion that makes the graceful restart
function. This diff connects almost all of the moving pieces together;
it informs the priv helper about the takeover mount and transfers
the InodeMap information into the new generation of the eden server.
It doesn't yet load the fileHandleMap (will tackle that in a follow up diff)
Reviewed By: simpkins
Differential Revision: D6670903
fbshipit-source-id: 1770d99eb1477440a6c1deed83b0da55b9c1bbe4
Summary:
this isn't how we really want to do this long term, it's
just the most expedient short term implementation.
This diff provides an implementation of the `InodeMap::save()` which
was previously a stub method; the new implementation returns a thrift
structure that encompasses the unloaded inodes in the map, and adds
a corresponding load() method that performs the reverse transformation.
The struct is serialized into the Takeover data.
This diff doesn't hook up the real serialized data to EdenServer; that will happen
in a follow-on diff.
The way that we actually want to handle this longer term is to store the
`numFuseReferences` field into the overlay file on disk, but to do so we
will need to add a mountGeneration field alongside it and ensure that we
always write out the correct information at the correct times. In addition,
we'd need to add equivalent data to TreeInode::Entry and add accessors that
safely return the correct values in the correct situations.
In the interest of getting something working, I've just dumped this code in
here.
I've also placed the thrift structure for this in `fuse/handlemap.thrift`;
this is a slight layering violation but one that feels "OK" in light of
the imminent refactor of the Takeover data struct to be its own thrift
struct anyway.
Reviewed By: simpkins
Differential Revision: D6670904
fbshipit-source-id: 11a0918954c741935c587e46fcb0e38849010de1
Summary:
This puts the data into the takeover information during takeover
shutdown, but doesn't do anything to pull it out again (that will be in a follow on diff).
The serialization stuff could be done a little bit more efficiently (since we
will perform an extra thrift serialization step just to compute the length, and
repeat it again later), but we're planning on replacing this with thrift
serializing soon, once simpkins diff stack lands, so I'm not losing sleep over
it.
Reviewed By: simpkins
Differential Revision: D6668846
fbshipit-source-id: e6d01428bd506a9e93b427db499770fce0a0983a
Summary:
This fulfils a TODO but doesn't do anything useful
with that data at this time.
Reviewed By: simpkins
Differential Revision: D6552750
fbshipit-source-id: 0c441fd0c2ab43785b4d98c4ca6ff643a20629e0
Summary:
This adds some plumbing to thread the fuse device descriptor and
negotiated capabilities through to the takeover code.
I initially wanted to just make the
unmount future yield the device descriptor, but since that uses
`SharedPromise` it is not compatible with a move-only type like
`folly:File`, so I added an optional promise to deal with just that.
I'm also populating the takeover mount information (path, bind mounts)
for each mount point.
Reviewed By: simpkins
Differential Revision: D6494509
fbshipit-source-id: a90684292dc1d8e06ce2c0721eadd8d393377f33
Summary:
The unix `patch` command attempts to issue an `FS_IOC_GETFLAGS`
ioctl and blows up when we return `ENOSYS`. The man page for ioctl says:
```
ENOTTY The specified request does not apply to the kind of object that
the file descriptor fd references.
```
so let's return that error code to ioctl.
In addition, the integration test I added for this trips up when it calls
`llistxattr` on the file; turns out we don't need to insert the
`fuse_getxattr_out` when we're returning the attribute list and that it
is only needed when measuring up the required length, so let's move
things around to resolve this and make it clearer.
Reviewed By: chadaustin
Differential Revision: D6685568
fbshipit-source-id: 81963ffe9af30db5634e5e96b7a8aa1485859d65
Summary: D6161089 removes autoconf support in fbthrift. So let's switch to cmake.
Reviewed By: wez
Differential Revision: D6675252
fbshipit-source-id: 5d04d3e285f80aec4ce3b566c5046c6f2c3d76b1
Summary: A prior shelve contained this fix but never got landed.
Reviewed By: wez
Differential Revision: D6676206
fbshipit-source-id: b8c733be663ff56e1a0625f09ec505891d430084
Summary:
this comment is not accurate since removing libfuse.
The good news is that doing that meant that I didn't need to
touch DirList when removing libfuse.
Reviewed By: chadaustin
Differential Revision: D6672533
fbshipit-source-id: 54216685b03b8f6dd7ee96b7bc38bb91d1b33366
Summary:
We'll need a way to twiddle the refcount by more than
increment at a time for graceful restart. This mirrors the
same pattern we use to forget() by an arbitrary number.
Reviewed By: chadaustin
Differential Revision: D6668809
fbshipit-source-id: 6e5dc33b5e40f98f01293c89152bfe1e0879f572
Summary: We forgot to remove this when we moved to be fully inode based in the overlay
Reviewed By: chadaustin
Differential Revision: D6668810
fbshipit-source-id: b79af85a4bbbcefd9227ad69bb8d57b5274cdaed
Summary:
Switch from glog to folly logging. Also derive from folly::ColdClass to
indicate that the `EdenBug` class is only ever created on unexpected code
paths.
Reviewed By: wez
Differential Revision: D6609613
fbshipit-source-id: 33621c7bafc946df245224da9b6a559e3dfd58d6
Summary:
Update the integration tests to understand the new `eden list` output format
changes introduced in D6661303.
Reviewed By: chadaustin
Differential Revision: D6665272
fbshipit-source-id: 13f65a7a66f997e6b8135436811fe3fc396eb38f
Summary:
Update the TakeoverClient and TakeoverServer code to use the new UnixSocket
helper class for exchanging messages, file descriptors, and credential
information.
This does not change the message serialization code much yet, it merely changes
the code to use the UnixSocket class for I/O.
Reviewed By: wez
Differential Revision: D6494979
fbshipit-source-id: 3129fe8605b1b3b7a24e6e84e94dccf3ea2b4170
Summary:
Add new classes that help send data, file descriptors, and credential
information over unix domain sockets.
UnixSocket provides a low-level, raw callback API, similar to that provided by
the classes in folly/io/async/. UnixSocketFuture is a slightly higher-level
wrapper class that provides a Future-based API on top of this. I expect that
most places in eden will probably use the UnixSocketFuture API, but callers
that repeatedly wait for new messages (like the privhelper server) may want to
use the raw callback API instead.
This will help simplify several places in eden that communicate over unix
domain sockets. Both the privhelper code and the takeover code needs to send
file descriptors over unix domain sockets. They currently each use their own
separate message handling logic for this. We currently communicate with the
hg_import_helper.py script over a pipe, but this could easily be switched to
use this new UnixSocket class as well.
Reviewed By: wez
Differential Revision: D6494981
fbshipit-source-id: 80bd7f06e5b884fc4148162e1a8a3b478acce209
Summary:
Title says it all. Uses a switch statement so the compiler
might warn if a header's opcode isn't handled.
Reviewed By: wez
Differential Revision: D6664378
fbshipit-source-id: 51986c423c943bf3b51cbd342ea4bdedb5fe1188
Summary:
eden doesn't support ioctl which was spamming "unhandled fuse
opcode 39" into the log. Only log each unhandled opcode once.
Reviewed By: wez
Differential Revision: D6663175
fbshipit-source-id: c1bea671c8fb720ae0ac234c4f187251930a8bee
Summary:
`eden list` enumerates all of the mounts in the Eden config,
whether or not the daemon is running and whether or not the mount is
currently active or not. This diff adds an (active) suffix on a mount
if the daemon is running and the mount is currently active.
Reviewed By: simpkins
Differential Revision: D6661303
fbshipit-source-id: c098e90fc9a77f16c723c707cc4da3ee3d4c5abb
Summary:
It is no longer correct to assert that state->file is set if O_TRUNC happened
before blob import from hg finished. It surprises me we never saw a crash
because of that. Also, the O_TRUNC path after blob import finishes can never
complete a future, so don't try.
Reviewed By: wez
Differential Revision: D6656699
fbshipit-source-id: 5e245fc46185714e5f5d81c2680835a3497747ff
Summary:
Today, if a file is ever opened for read, each FileInode keeps a copy
of the data as long as the FileInode is around. This results in
excessive memory consumption under common mistakes like repo-wide grep
or `hg revert .`.
I will audit all of the state machine transitions and blob accesses
before landing this diff.
Reviewed By: wez
Differential Revision: D6598957
fbshipit-source-id: 1eb4aeb08057ce993a29a86d298e153675fee4a1
Summary:
This came up when I was auditing the rules about when it's
safe to read from a FileInode. read() must only be called while openCount > 0.
Reviewed By: wez
Differential Revision: D6604898
fbshipit-source-id: 829ddc335bd58201c2b456ee544cdc6253ebf66c
Summary:
In a follow-on diff, the constraint will be that state->blob
will only be guaranteed valid after ensureDataLoaded() while a
FileHandle is alive. Thus, ensureDataLoaded() must return a
FileHandle.
Reviewed By: wez
Differential Revision: D6586237
fbshipit-source-id: ccc269d322b8c725c93145df5de2add9a2b90207
Summary: Drive-by cleanup. `CompareString` is not used anywhere and `std::less<>` would be better anyway.
Reviewed By: wez
Differential Revision: D6655104
fbshipit-source-id: 0fee22172c93335493400e6317d92f9e1e77b40e
Summary: This is annoying because it isn't actually shadowing and because this isn't picked up in our default mode mode, or in sandcastle.
Reviewed By: chadaustin
Differential Revision: D6656548
fbshipit-source-id: e624cb563d6396c1ab6b93eae14651c16a8c0cd3
Summary:
Now that we have full control over the fuse session,
moving the threads here makes a lot of sense and makes things
easier from an overall state management perspective.
FuseChannel now provides a future that make it easier for EdenMount to wait
until all the threads and outstanding requests have completed.
There is also a future to determine when the first of the threads
has exited which is used to detect an error condition. We could
go a bit further with this and have the error condition propagate
out from our dispatcher, but that's a bit further away from my
goal of making the graceful restart stuff work.
I've removed the request counting stuff from Dispatcher as we
can now use the definitive signals from FuseChannel in its place.
Reviewed By: chadaustin
Differential Revision: D6580247
fbshipit-source-id: 6bcc3b8b531d59a3fdd0ca6fd09410ad64f8221a
Summary:
I'm mostly interested in this as the definitive way to know
how many outstanding fuse requests there are, which is important
for synchronization for graceful restart.
An important part of this diff is to use of the `RequestContextScopeGuard`.
Previously, `RequestData::create` would implicitly create a new
`folly::RequestContext` and associate it with the current thread each
time a new request was processed on the fuse worker thread. Critically,
it would never take any action to remove that association. What that
meant was that during shutdown some number of fuse worker threads were
extending the lifetime of some fuse requests and that leads either to
an use-after-free (as is the case in this diff stack up until this diff),
or with the stronger synchronization that we desire in graceful restarts,
a deadlock.
Reviewed By: chadaustin
Differential Revision: D6578933
fbshipit-source-id: f82f75e75a398d1f6beacb9466060e7bd99adbc1
Summary:
Use a map to lookup the handler for a given opcode.
While we're in here, let's reduce some more boiler plate by factoring
out the code that preps the request and associates it with the stats
histogram by moving that into the opcode map.
Reviewed By: chadaustin
Differential Revision: D6578934
fbshipit-source-id: 9e3f73beb9dc5597f095e81ef21adf1549420e9d
Summary:
This serves a few purposes:
1. We can avoid some conditional code inside eden if we know that
we have a specific fuse_kernel.h header implementation.
2. We don't have to figure out a way to propagate the kernel
capabilities through the graceful restart process.
3. libfuse3 removed the channel/session hooks that we've been
using thus far to interject ourselves for mounting and
graceful restarting, so we were already effectively the
walking dead here.
4. We're now able to take advtange of the latest aspects of
the fuse kernel interface without being tied to the implementation
of libfuse2 or libfuse3. We're interested in the readdirplus
functionality and will look at enabling that in a future diff.
This may make some things slightly harder for the more immediate
macOS port but I belive that we're in a much better place overall.
This diff is relatively mechanical and sadly is (unavoidably) large.
The main aspects of this diff are:
1. The `fuse_ino_t` type was provided by libfuse so we needed to
replace it with our own definition. This has decent penetration
throughout the codebase.
2. The confusing `fuse_file_info` type that was multi-purpose and
had fields that were sometimes *in* parameters and sometimes *out*
parameters has been removed and replaced with a simpler *flags*
parameter that corresponds to the `open(2)` flags parameter.
The *out* portions are subsumed by existing file handle metadata
methods.
3. The fuse parameters returned from variations of the `LOOKUP` opcode
now return the fuse kernel type for this directly. I suspect
that we may need to introduce a compatibility type when we revisit
the macOS port, but this at least makes this diff slightly simpler.
You'll notice that some field and symbol name prefixes vary as
a result of this.
4. Similarly for `setattr`, libfuse separated the kernel data into
two parameters that were a little awkward to use; we're now just
passing the kernel data through and this, IMO, makes the interface
slightly more understandable.
5. The bulk of the code from `Dispatcher.cpp` that shimmed the
libfuse callbacks into the C++ virtual methods has been removed
and replaced by a `switch` statement based dispatcher in
`FuseChannel`. I'm not married to this being `switch` based
and may revise this to be driven by an `unordered_map` of
opcode -> dispatcher method defined in `FuseChannel`. Regardless,
`Dispatcher.cpp` is now much slimmer and should be easier to
replace by rolling it together into `EdenDispatcher.cpp` in
the future should we desire to do so.
6. This diff disables dispatching `poll` and `ioctl` calls. We
didn't make use of them and their interfaces are a bit fiddly.
7. `INTERRUPT` is also disabled here. I will re-enable it in
a follow-up diff where I can also revise how we track outstanding
requests for graceful shutdown.
8. I've imported `fuse_kernel.h` from libfuse. This is included
under the permissive 2-clause BSD license that it allows for
exactly this integration purpose.
Reviewed By: simpkins
Differential Revision: D6576472
fbshipit-source-id: 7cb088af5e06fe27bf22a1bed295c18c17d8006c
Summary: eden rage crashes sometimes due to too much inode info in the info dump. Replacing per-inode info dump with the stats summary from the stats module.
Reviewed By: chadaustin
Differential Revision: D6641312
fbshipit-source-id: 16910aa21306db4e5533217bd2ffb7b37440a807