Summary:
Several tools rely on the .eden directory being present to detect EdenFS.
Unfortunately, removing it by hand makes it disappear, which breaks these
tools. Thankfully, ProjectedFS provides ways to deny some renames/deletions.
Reviewed By: chadaustin
Differential Revision: D34433910
fbshipit-source-id: 4d37a6d9eeea3f16f4e1401beac20ef6d6dce71d
Summary: Adds trace bus and other tracing components to PrjfsChannel. This will be used to enable tracing in the next diff.
Reviewed By: xavierd
Differential Revision: D33845993
fbshipit-source-id: a4666ebff3e535773786b220371e6cf2a0a18ea9
Summary:
Directory[0] invalidations can trigger recursive ProjectedFS callbacks onto
EdenFS. When this happens, we need to make sure that the calling context doesn't
hold any locks/threads that the callback will need to run. Otherwise, we can run
into a deadlock where the callback tries to acquire a lock held by the calling
context.
In this situation, checkout is this calling context, and it has been shown that
directory invalidation may in some cases trigger a recursive lookup. While the
lookup callback doesn't use any shared locks, it pushes work to the server
state thread pool in `ObjectStore::getTree`. Unfortunately, this thread pool is
also where checkout runs, and thus if too many directory invalidations are
occurring at the same time, all the threads in the thread pool might be blocked
and thus `ObjectStore::getTree` may never complete, causing the lookup callback
to not complete, causing the invalidation to not complete, causing checkout to
not complete.
The solution is to move the invalidation away from the thread pool. That way,
checkout doesn't block in the thread pool while the invalidation is ongoing,
allowing `ObjectStore::getTree` to push work onto it and run it.
[0]: File invalidation won't trigger these recursive callbacks.
Reviewed By: chadaustin
Differential Revision: D33773892
fbshipit-source-id: a6516afdc5c68ac8d1d7ba75f91c45fb12cc6ee4
Summary:
In the case of notifications, these are running in a background thread,
potentially after the original context has been destroyed. This would therefore
lead to a use-after-free. By using shared_ptr, the lifetimes are easier to
understand and less error-prone, so let's use them in the dispatcher.
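To illustrate the lifetime argument, here is a minimal sketch using only the
standard library. `RequestContext`, `notifyInBackground` and `runNotification`
are hypothetical names made up for this example, not the actual EdenFS types.
The background work holds its own shared_ptr copy, so the context stays alive
after the caller has dropped its reference.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the per-request context; not the EdenFS type.
struct RequestContext {
  std::string path;
};

// Starts background work that keeps the context alive by holding its own
// shared_ptr copy, and returns the thread so the caller can join it.
std::thread notifyInBackground(
    std::shared_ptr<RequestContext> ctx,
    std::string& out) {
  assert(ctx != nullptr);
  return std::thread([ctx = std::move(ctx), &out] {
    // Safe even if the caller already dropped its own reference.
    out = "handled " + ctx->path;
  });
}

std::string runNotification() {
  std::string result;
  std::thread worker;
  {
    auto ctx = std::make_shared<RequestContext>();
    ctx->path = "foo/bar";
    worker = notifyInBackground(ctx, result);
  } // The caller's reference is gone here; the lambda's copy remains.
  worker.join();
  return result;
}
```

With a raw pointer or reference instead, the lambda could run after the inner
scope has ended and read freed memory.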
Reviewed By: chadaustin
Differential Revision: D33724774
fbshipit-source-id: 048a62a66e9ef09369cc66f9547501cf7316863f
Summary:
Now that notifications are running in a serial executor, and are also issuing
disk IO, notifications are significantly slower than they used to be. While
writing to the working copy is an overall anti-pattern, some workflows (like
Unity) do so, and it's thus critical that their performance isn't affected
negatively.
In order to solve this, we can simply move the handling of notifications to the
background and answer the notification immediately. Since notifications can no
longer fail, we shouldn't need to send an error back to ProjectedFS, which would
be ignored anyway.
The drawback is of course that applications are no longer blocked while the
notification is being processed in EdenFS, and thus any operation that needs to
get a synced up inode hierarchy will need to wait on all the pending
notifications.
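A rough sketch of this trade-off, assuming a single worker thread and plain
standard library primitives. `NotificationQueue`, `enqueue` and
`waitForPending` are hypothetical names for illustration and not the EdenFS
implementation: the callback enqueues and returns right away, and anything
that needs a synced-up view has to drain the queue first.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical serial background queue; names are illustrative only.
class NotificationQueue {
 public:
  NotificationQueue() : worker_([this] { run(); }) {}

  ~NotificationQueue() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    worker_.join();
  }

  // Called from the notification callback: enqueue and return immediately.
  void enqueue(std::function<void()> work) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending_.push_back(std::move(work));
      ++outstanding_;
    }
    cv_.notify_all();
  }

  // Called by operations that need a synced-up view: block until drained.
  void waitForPending() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return outstanding_ == 0; });
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
      if (pending_.empty()) {
        return; // Only reached when stopping with nothing left to do.
      }
      auto work = std::move(pending_.front());
      pending_.pop_front();
      lock.unlock();
      work(); // Run outside the lock so enqueue() never waits on it.
      lock.lock();
      assert(outstanding_ > 0);
      --outstanding_;
      cv_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> pending_;
  int outstanding_ = 0;
  bool stopping_ = false;
  std::thread worker_; // Declared last so the other members exist first.
};
```

A caller that needs consistent state would call `waitForPending()` before
reading it, which is where the cost moved away from the writer shows up.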
Reviewed By: chadaustin, genevievehelsel
Differential Revision: D32480993
fbshipit-source-id: 7ad6b07f540f7d9a52a35a0ff3b94911ef5267af
Summary:
With Facebook having been renamed Meta Platforms, we need to change the license
headers.
Reviewed By: fanzeyi
Differential Revision: D33407812
fbshipit-source-id: b11bfbbf13a48873f0cea75f212cc7b07a68fb2e
Summary:
The makeImmediateFuture is merely a glorified try/catch, and thus the entirety
of the readdir callback would run immediately, and would block on every size
sequentially. This is inefficient and may cause files to be fetched one at a time.
By reworking the inner logic, we can collect all the sizes beforehand prior to
servicing the callback, allowing for sizes to be fetched concurrently, and for
completing the callback asynchronously.
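The difference between fetching sizes one at a time and starting them all
up front can be sketched as follows. This uses std::async and std::future as
stand-ins for the futures used in EdenFS, and `fetchSize` / `fetchAllSizes`
are hypothetical helpers. Note that this sketch still blocks while collecting
the results, so it only illustrates the concurrent fetching part, not the
asynchronous completion of the callback.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <string>
#include <vector>

// Hypothetical size lookup standing in for a (possibly remote) fetch.
std::uint64_t fetchSize(const std::string& name) {
  return name.size();
}

// Starts every size lookup first, then collects the results, so the
// lookups can overlap instead of running one after another.
std::vector<std::uint64_t> fetchAllSizes(
    const std::vector<std::string>& names) {
  std::vector<std::future<std::uint64_t>> pending;
  pending.reserve(names.size());
  for (const auto& name : names) {
    pending.push_back(std::async(std::launch::async, fetchSize, name));
  }

  std::vector<std::uint64_t> sizes;
  sizes.reserve(names.size());
  for (auto& size : pending) {
    sizes.push_back(size.get());
  }
  assert(sizes.size() == names.size());
  return sizes;
}
```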
Reviewed By: chadaustin
Differential Revision: D31782916
fbshipit-source-id: d6315347492e969ffa79037dc2a4f275f4b95a8d
Summary:
This moves all the conversions to folly::Future one level up. This allows
removing all folly::Future code from PrjfsDispatcherImpl.
Reviewed By: genevievehelsel
Differential Revision: D31746626
fbshipit-source-id: cb9b78c9ea2f09045f4bc921ea820a77b5832ddb
Summary:
The timeout-based notification mechanism appears to be at the root of a lot of
EdenFS crashes. It also doesn't detect all of the network failures and is being
replaced with a more robust mechanism which will handle these cases.
For now, let's just get rid of the timeout-based mechanism.
Reviewed By: fanzeyi
Differential Revision: D31739683
fbshipit-source-id: d6baa03fe3579dc36c9f435bd16df39799e51ef9
Summary:
This moves some Prjfs logic into the channel code, which allows for
de-duplicating a bit of code. This will also make a subsequent change in the
rename code easier to do.
Differential Revision: D30023970
fbshipit-source-id: 7efa6dcc4318213e9d266932527b5a56edacefd7
Summary:
Instead of having one "Dispatcher" type that the various backends overload,
let's simply have a per-mount-type dispatcher type. The previous model worked
fine when EdenFS supported only one way of mounting a repository, but with NFS
coming, Unix platforms will support both FUSE and NFS, making the Dispatcher
overload nonsensical.
As a behavioral change, the dispatcher lifetime and ownership is changed a bit.
It used to live for the duration of the EdenMount object, but is now tied to
the channel lifetime, as it is now owned by it.
Reviewed By: kmancini
Differential Revision: D26329477
fbshipit-source-id: 3959b90a4909e3ab0898caa308f54686f59a943c
Summary:
The RcuPtr abstraction allows us to use RCU instead of the significantly more
expensive Synchronized<shared_ptr>. This should reduce the cost of all the
callbacks while not sacrificing the guarantee that unmounting a repository
needs to wait for all the pending callbacks to complete.
A new rcu_domain is used as the pending callbacks may sleep and take a long
time to complete when the servers aren't reachable. To avoid penalizing all the
other RCU clients, it's best to be isolated in its own domain.
Reviewed By: kmancini
Differential Revision: D25351535
fbshipit-source-id: bd40d59056e3e710c28c42d651b79876be496bc3
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance.
That GUID is then propagated automatically to the placeholders when these are
created in response to a getPlaceholderInfo callback.
When the placeholders are created by EdenFS when invalidating directories, we
have to pass a GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatching due to the mount point GUID being generated at startup time
and not re-used.
One of the most common issues that users have reported is that operations on
the repository sometimes start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error to the root of the repository shows that one of the
folders and its descendants don't share the same GUID; removing it solves the
issue.
It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step that we can take to solve this is to always re-use
the GUID, which will hopefully lead to the GUID always being the same and
the error going away.
Reviewed By: fanzeyi
Differential Revision: D25513122
fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
Summary:
EdenFS can now show notifications to the user when something goes wrong.
Reviewed By: chadaustin
Differential Revision: D24864354
fbshipit-source-id: fabc30f14bc022b4367af562481235fe984df458
Summary:
One of the main sub-par user experiences on Windows is the lack of notification
of any kind when EdenFS can't reach the Mercurial servers. Prior to this diff,
the callbacks would never return, causing commands to simply hang for the
user.
As a first step, let's add a timeout. A later step will hook the notification
mechanism used on macOS/Linux to display a notification when timeouts occur.
The only callback that doesn't have a proper timeout is the notification one,
as timing out on these would mean that EdenFS won't have registered that some
files/directories have been materialized which will lead to inconsistencies
later.
Reviewed By: kmancini
Differential Revision: D24809645
fbshipit-source-id: 0ddd9d443a17db405a3edbaa8edecf3764c31d37
Summary:
As described in the previous diff, unmounting a repo while a request is pending
would lead to a use after free. To solve this, we can wrap the inner channel
with a shared_ptr, and set it to NULL whenever unmount is in progress.
While this solution has a fairly large overhead due to requiring at least 2
atomics per callback (one for the lock, the second one for the shared_ptr
copy), it is correct. A future improvement will swap these with an RCU pointer
to reduce the callback cost to almost nothing.
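The scheme described above can be sketched roughly as follows. This is a
simplified illustration using std::mutex in place of whatever lock EdenFS
uses, and `InnerChannel`, `ChannelHolder` and `handleCallback` are
hypothetical names. Each callback pays for one lock acquisition and one
shared_ptr copy, and unmount resets the pointer so that new callbacks are
refused while in-flight ones keep the object alive.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical inner channel; the real class has far more state.
struct InnerChannel {
  std::atomic<int> handled{0};
};

class ChannelHolder {
 public:
  explicit ChannelHolder(std::shared_ptr<InnerChannel> inner)
      : inner_(std::move(inner)) {
    assert(inner_ != nullptr);
  }

  // Callback path: one lock acquisition plus one shared_ptr copy.
  std::shared_ptr<InnerChannel> acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    return inner_;
  }

  // Unmount path: new callbacks see nullptr from now on. The object is
  // freed once the last in-flight callback drops its copy.
  void beginUnmount() {
    std::lock_guard<std::mutex> lock(mutex_);
    inner_.reset();
  }

 private:
  std::mutex mutex_;
  std::shared_ptr<InnerChannel> inner_;
};

// Returns false when the mount is going away and the request is refused.
bool handleCallback(ChannelHolder& holder) {
  auto channel = holder.acquire();
  if (!channel) {
    return false;
  }
  channel->handled.fetch_add(1);
  return true;
}
```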
Reviewed By: chadaustin
Differential Revision: D25071423
fbshipit-source-id: 77d14a38403bef3e276d3e5e48e6fd95dd641964
Summary:
There is currently a race condition where unmounting a repo can happen
concurrently with a ProjectedFS notification/callback. Depending on who wins
that race, this can lead to a use-after-free as the PrjfsChannel/EdenMount
would be freed but the callback would still have a reference to it.
To solve this, we need to keep track of inflight requests, and in particular
make sure that memory isn't freed before all the pending callbacks have
completed. And that effectively means that we need to refcount the channel used
by these callbacks so we only free the memory when nobody else is using it.
The first step towards this is splitting the channel in 2 halves.
Reviewed By: chadaustin
Differential Revision: D25071422
fbshipit-source-id: 743f38c9b19ba534961d06ea6f2ddc96b685fe19
Summary:
As of right now, opendir is the expensive callback due to fetching the sizes
for all the files in a directory. This strategy however breaks down when
timeouts are added, as very large directories would trigger the timeout, not
because EdenFS is having a hard time reaching Mononoke, but because of
bandwidth limitations.
To avoid this issue, we need to have a per-file timeout and thus make opendir
just trigger the futures, but not wait on them. The waiting bit will happen
at readdir time, which will also enable having a timeout per file.
The one drawback to this is that the ObjectFetchContext that is passed in by
opendir cannot live long enough to be used in the size future. For now, we can
use a null context, but a proper context will need to be passed in, in the
future.
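A minimal sketch of the opendir/readdir split, assuming std::async and
std::future as stand-ins for the futures used in EdenFS. `DirHandle`,
`openDir` and `readDir` are hypothetical names, and the fetch context
discussed above is left out entirely.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <future>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical directory handle: opendir fills it, readdir drains it.
struct DirHandle {
  std::vector<std::pair<std::string, std::future<std::uint64_t>>> entries;
};

// opendir: start one size lookup per file and return without waiting.
// sizeOf must return std::uint64_t.
template <typename SizeFn>
DirHandle openDir(const std::vector<std::string>& names, SizeFn sizeOf) {
  DirHandle handle;
  for (const auto& name : names) {
    handle.entries.emplace_back(
        name, std::async(std::launch::async, sizeOf, name));
  }
  return handle;
}

// readdir: wait on each file separately, so the timeout applies per file
// and not to the directory as a whole. nullopt marks a timed-out entry.
// Call this at most once per handle, since get() consumes each future.
std::vector<std::optional<std::uint64_t>> readDir(
    DirHandle& handle,
    std::chrono::milliseconds perFileTimeout) {
  std::vector<std::optional<std::uint64_t>> sizes;
  for (auto& [name, size] : handle.entries) {
    assert(size.valid());
    if (size.wait_for(perFileTimeout) == std::future_status::ready) {
      sizes.push_back(size.get());
    } else {
      sizes.push_back(std::nullopt);
    }
  }
  return sizes;
}
```

One caveat of this stand-in: a future returned by std::async blocks in its
destructor until the task finishes, so destroying a `DirHandle` with
timed-out entries still waits for them.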
Reviewed By: wez
Differential Revision: D24895089
fbshipit-source-id: e10ceae2f7c49b4a006b15a34f85d06a2863ae3a
Summary:
Instead of logging in the Dispatcher, move strace logging to
FuseChannel where it can be standardized for all FUSE request types.
Reviewed By: wez
Differential Revision: D24035838
fbshipit-source-id: c84d8c27b62f9944e2d26a35a7ed7bbbeeb5bf0e
Summary:
While on Linux these can't fail (or, to be more precise: it doesn't matter),
they can on Windows. One such example is when a user locks a file and triggers
an update that modifies this file. The invalidation will fail, and thus the
update should keep track of that file not being updated properly.
Previously, the invalidation would raise an exception, but that proved to be
the wrong approach as some state would need to be rolled back, which the
exception didn't help with. For that, let's just return a Try and make sure that
we handle all the cases properly.
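The idea of returning a result instead of throwing can be sketched as below.
To stay self-contained, this uses std::optional<std::string> as a stand-in
for a Try-like type (it is not folly::Try), and `invalidate`, `FileState`
and `updateFiles` are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Minimal stand-in for a Try-like result: empty means success, otherwise
// it carries the error message. This is not folly::Try.
using InvalidationResult = std::optional<std::string>;

// Hypothetical invalidation that fails for locked files.
InvalidationResult invalidate(const std::string& path, bool locked) {
  if (locked) {
    return "cannot invalidate locked file: " + path;
  }
  return std::nullopt;
}

struct FileState {
  std::string path;
  bool locked;
};

// Invalidates every file and returns the paths that were not updated, so
// the caller can record them instead of unwinding through an exception.
std::vector<std::string> updateFiles(const std::vector<FileState>& files) {
  std::vector<std::string> notUpdated;
  for (const auto& file : files) {
    if (auto error = invalidate(file.path, file.locked)) {
      notUpdated.push_back(file.path);
    }
  }
  assert(notUpdated.size() <= files.size());
  return notUpdated;
}
```

Because the failure is an ordinary return value, the loop keeps going and the
remaining files are still processed.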
Reviewed By: chadaustin
Differential Revision: D24163672
fbshipit-source-id: ac881984138eefa65c053478a160e2a653fd3fdf
Summary: This is not needed; we can use the PrjfsChannel class directly where needed.
Reviewed By: chadaustin
Differential Revision: D23946259
fbshipit-source-id: eafcd38c0927fa282d62ada0986a7ef8b612174b
Summary:
This enables autodeps, and brings us one step closer to building EdenFS with
Buck on Windows.
Reviewed By: fanzeyi
Differential Revision: D23857794
fbshipit-source-id: c8587a6f7b9e4d9575a62f592c1d0737dff2a8f0
Summary:
If we are to look at the dependency graph for EdenFS on Windows, we would
notice that the eden_prjfs target depends on eden_inodes, and vice versa,
causing a cycle. While CMake is perfectly happy with that, Buck doesn't like
that. The solution to removing this cycle is to move the code that needs the
dependency to eden_inodes into the eden_inodes target, and that's the
EdenDispatcher. However, since PrjfsChannel needs to hold a dispatcher to call
into it, it needs to know the methods exposed by the dispatcher. To achieve
this, a simple abstract class is added; this is the same as what is done for
FUSE.
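The shape of this dependency inversion can be sketched as follows. The class
names and the `lookup` method are hypothetical and only illustrate the
layering: the channel side sees the abstract interface, while the concrete
implementation lives with the inodes code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Lives with the channel: only the interface, no dependency on inodes.
class Dispatcher {
 public:
  virtual ~Dispatcher() = default;
  virtual std::string lookup(const std::string& path) = 0;
};

// Lives with the channel and depends only on the interface above.
class Channel {
 public:
  explicit Channel(std::unique_ptr<Dispatcher> dispatcher)
      : dispatcher_(std::move(dispatcher)) {
    assert(dispatcher_ != nullptr);
  }

  std::string onLookup(const std::string& path) {
    return dispatcher_->lookup(path);
  }

 private:
  std::unique_ptr<Dispatcher> dispatcher_;
};

// Lives with the inodes code: the concrete implementation, which is free
// to depend on inode types without creating a cycle.
class InodeDispatcher final : public Dispatcher {
 public:
  std::string lookup(const std::string& path) override {
    return "inode:" + path;
  }
};
```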
Reviewed By: wez
Differential Revision: D23857540
fbshipit-source-id: c495c67d43724f648e5ffa17776e4d5d4513698a
Summary:
Now that the win directory only contains the mount directory, we can rename it
to be more faithful to its intent. Since this is about ProjectedFS, let's
rename it "prjfs".
Reviewed By: chadaustin
Differential Revision: D23828561
fbshipit-source-id: cb31fe4652fd4356dc2579028d3ae2c7935371a7