Commit Graph

26 Commits

Author SHA1 Message Date
Xavier Deguillard
6bb10367dd prjfs: prevent .eden from being removed
Summary:
Several tools rely on the .eden directory to be present to detect EdenFS.
Unfortunately, removing it by hand would make it disappear, breaking these
tools. Thankfully, ProjectedFS provides ways to prevent some renames/deletions
from happening.

Reviewed By: chadaustin

Differential Revision: D34433910

fbshipit-source-id: 4d37a6d9eeea3f16f4e1401beac20ef6d6dce71d
2022-03-07 15:36:00 -08:00
Michael Cuevas
78884089c8 refactor trace fs code
Summary: address follow-up comments on D33931891 (a699b4907f)

Reviewed By: chadaustin

Differential Revision: D34100600

fbshipit-source-id: a7dcf330bc15b7fa3c22e9fd5abd57a523702da3
2022-02-10 12:03:43 -08:00
Michael Cuevas
a699b4907f add edenfsctl trace fs support for ProjectedFS
Summary: Add logic to enable `eden trace fs` command on Windows

Reviewed By: xavierd

Differential Revision: D33931891

fbshipit-source-id: c875a52ab10181abda904e59551e6096ca08e417
2022-02-08 14:46:13 -08:00
Michael Cuevas
cc2bd70e5b add tracing to ProjectedFS
Summary: Adds trace bus and other tracing components to PrjfsChannel. This will be used to enable tracing in the next diff.

Reviewed By: xavierd

Differential Revision: D33845993

fbshipit-source-id: a4666ebff3e535773786b220371e6cf2a0a18ea9
2022-02-08 14:46:12 -08:00
Xavier Deguillard
67e6d48b0d inodes: move prjfs directory invalidation to a different thread
Summary:
Directory[0] invalidations can trigger recursive ProjectedFS callbacks onto
EdenFS. When this happens, we need to make sure that the calling context doesn't
hold any locks/threads that the callback will need to run. We otherwise can run
into a deadlock situation where the callback may try to acquire a lock held by
the calling context.

In this situation, checkout is this calling context, and it has been shown that
directory invalidation may in some cases trigger a recursive lookup. While the
lookup callback doesn't use any shared locks, it pushes work to the server
state thread pool in `ObjectStore::getTree`. Unfortunately, this thread pool is
also where checkout runs, and thus if too many directory invalidations are
occurring at the same time, all the threads in the thread pool might be blocked
and thus `ObjectStore::getTree` may never complete, causing the lookup callback
to not complete, causing the invalidation to not complete, causing checkout to
not complete.

The solution is to move the invalidation away from the thread pool, that way
checkout doesn't block in the thread pool while the invalidation is ongoing,
allowing `ObjectStore::getTree` to push work onto it and run it.

[0]: File invalidation won't trigger these recursive callbacks.
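
The fix described above can be sketched as follows. This is an illustrative
Python model, not the EdenFS C++ code: `server_pool`, `invalidation_thread`,
and the function names are hypothetical stand-ins, and the pool size is
arbitrary. The point is only that checkout hands the invalidation to a
dedicated thread, so the shared pool keeps free workers for the recursive
lookup.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the shared server thread pool (size chosen arbitrarily).
server_pool = ThreadPoolExecutor(max_workers=2)
# Hypothetical dedicated thread that runs invalidations off the shared pool.
invalidation_thread = ThreadPoolExecutor(max_workers=1)


def get_tree(path):
    """Push the fetch onto the shared pool, as the lookup callback does."""
    return server_pool.submit(lambda: f"tree:{path}")


def invalidate_directory(path):
    """The invalidation triggers a recursive lookup and blocks on it."""
    return get_tree(path).result(timeout=5)


def checkout(path):
    """Runs on the shared pool; hands the invalidation off instead of blocking."""
    return invalidation_thread.submit(invalidate_directory, path)


def run_checkouts(paths):
    """Start more checkouts than the pool has workers and collect the results."""
    checkouts = [server_pool.submit(checkout, p) for p in paths]
    invalidations = [c.result(timeout=5) for c in checkouts]
    return [i.result(timeout=5) for i in invalidations]
```

If `checkout` called `invalidate_directory` inline instead, two concurrent
checkouts would occupy both workers of this two-thread pool, and `get_tree`
could never run.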

Reviewed By: chadaustin

Differential Revision: D33773892

fbshipit-source-id: a6516afdc5c68ac8d1d7ba75f91c45fb12cc6ee4
2022-01-27 17:42:19 -08:00
Xavier Deguillard
8d4b18ab84 prjfs: use shared_ptr for ObjectFetchContext
Summary:
In the case of notifications, these are running in a background thread,
potentially after the original context has been destroyed. This would therefore
lead to use after free. By using shared_ptr, the lifetimes are easier to
understand and less error prone, so let's use them in the dispatcher.

Reviewed By: chadaustin

Differential Revision: D33724774

fbshipit-source-id: 048a62a66e9ef09369cc66f9547501cf7316863f
2022-01-26 10:24:35 -08:00
Xavier Deguillard
4308f92515 prjfs: move notification handling to the background
Summary:
Now that notifications are running in a serial executor, and are also issuing
disk IO, notifications are significantly slower than they used to be. While
writing to the working copy is an overall anti-pattern, some workflows (like
Unity) do so, and it's thus critical that their performance isn't affected
negatively.

In order to solve this, we can simply move the handling of notifications to the
background and answer the notification immediately. Since notifications can no
longer fail, we shouldn't need to send an error back to ProjectedFS, which would
be ignored anyway.

The drawback is of course that applications are no longer blocked while the
notification is being processed in EdenFS, and thus any operation that needs to
get a synced up inode hierarchy will need to wait on all the pending
notifications.
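
A minimal sketch of this trade-off, written in Python for illustration only
(EdenFS itself is C++). The class and method names are hypothetical: `notify`
answers immediately and queues the work, and `wait_for_pending` is what an
operation needing a synced-up inode hierarchy would call first.

```python
import queue
import threading


class NotificationHandler:
    """Processes notifications serially on a background thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self.materialized = set()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            path = self._queue.get()
            self.materialized.add(path)  # stand-in for the slow disk IO
            self._queue.task_done()

    def notify(self, path):
        """Queue the notification and answer right away with success."""
        self._queue.put(path)
        return 0

    def wait_for_pending(self):
        """Block until every queued notification has been processed."""
        self._queue.join()
```

A caller that reads `materialized` without calling `wait_for_pending()` first
may observe a stale view, which is the drawback noted above.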

Reviewed By: chadaustin, genevievehelsel

Differential Revision: D32480993

fbshipit-source-id: 7ad6b07f540f7d9a52a35a0ff3b94911ef5267af
2022-01-18 17:23:14 -08:00
Xavier Deguillard
a29d465ee8 fs: fix license header
Summary:
With Facebook having been renamed Meta Platforms, we need to change the license
headers.

Reviewed By: fanzeyi

Differential Revision: D33407812

fbshipit-source-id: b11bfbbf13a48873f0cea75f212cc7b07a68fb2e
2022-01-04 15:00:07 -08:00
Xavier Deguillard
87ef50135e prjfs: make readdir more asynchronous
Summary:
The makeImmediateFuture is merely a glorified try/catch, and thus the entirety
of the readdir callback would run immediately, and would block on every size
sequentially. This is inefficient and may cause files to be fetched one at a time.

By reworking the inner logic, we can collect all the sizes prior to
servicing the callback, allowing for sizes to be fetched concurrently, and for
completing the callback asynchronously.
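
The difference between awaiting each size in turn and collecting them all
first can be illustrated with a small Python sketch. It uses asyncio rather
than the ImmediateFuture type mentioned above, and `fetch_size` and `readdir`
are hypothetical names.

```python
import asyncio


async def fetch_size(name, sizes):
    """Stand-in for a size fetch that may need a network round trip."""
    await asyncio.sleep(0.01)
    return sizes[name]


async def readdir(names, sizes):
    """Start every size fetch up front, then wait for all of them together."""
    fetched = await asyncio.gather(*(fetch_size(n, sizes) for n in names))
    return list(zip(names, fetched))
```

With `asyncio.gather`, the total wait is roughly that of the slowest fetch
instead of the sum of all of them.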

Reviewed By: chadaustin

Differential Revision: D31782916

fbshipit-source-id: d6315347492e969ffa79037dc2a4f275f4b95a8d
2021-11-05 12:55:58 -07:00
Xavier Deguillard
d9cf8b002d inodes: use ImmediateFuture in PrjfsDispatcher
Summary:
This moves all the conversions to folly::Future one level up. This allows
removing all folly::Future code from PrjfsDispatcherImpl.

Reviewed By: genevievehelsel

Differential Revision: D31746626

fbshipit-source-id: cb9b78c9ea2f09045f4bc921ea820a77b5832ddb
2021-10-26 19:23:09 -07:00
Xavier Deguillard
a75fd9a013 prjfs: kill timeout notification
Summary:
The timeout based notification mechanism appears to be at the root of a lot of
EdenFS crashes. It also doesn't detect all of the network failures and is being
replaced with a more robust mechanism which will handle these cases.

For now, let's just get rid of the timeout based mechanism.

Reviewed By: fanzeyi

Differential Revision: D31739683

fbshipit-source-id: d6baa03fe3579dc36c9f435bd16df39799e51ef9
2021-10-26 12:06:02 -07:00
Xavier Deguillard
c04383bb90 prjfs: refactor the dispatcher interface
Summary:
This moves some Prjfs logic into the channel code, which allows for
de-duplicating a bit of code. This will also make a subsequent change in the
rename code easier to do.

Differential Revision: D30023970

fbshipit-source-id: 7efa6dcc4318213e9d266932527b5a56edacefd7
2021-08-10 11:23:41 -07:00
Xavier Deguillard
62076b545e inodes: move dispatchers around
Summary:
Instead of having one "Dispatcher" type that the various backends overload,
let's simply have a per-mount-type dispatcher type. The previous model worked
fine when EdenFS supported only one way of mounting a repository, but with NFS
coming, Unix platforms will support both FUSE and NFS, making the Dispatcher
overload nonsensical.

As a behavioral change, the dispatcher lifetime and ownership is changed a bit.
It used to live for the duration of the EdenMount object, but is now tied to
the channel lifetime, as it is now owned by it.

Reviewed By: kmancini

Differential Revision: D26329477

fbshipit-source-id: 3959b90a4909e3ab0898caa308f54686f59a943c
2021-02-10 11:52:06 -08:00
Xavier Deguillard
1106d40eae prjfs: replace Synchronized<shared_ptr> with RcuPtr
Summary:
The RcuPtr abstraction allows us to use RCU instead of the significantly more
expensive Synchronized<shared_ptr>. This should reduce the cost of all the
callbacks while not sacrificing the guarantee that unmounting a repository
needs to wait for all the pending callbacks to complete.

A new rcu_domain is used as the pending callbacks may sleep and take a long
time to complete when the servers aren't reachable. To avoid penalizing all the
other RCU clients, it's best to be isolated in its own domain.

Reviewed By: kmancini

Differential Revision: D25351535

fbshipit-source-id: bd40d59056e3e710c28c42d651b79876be496bc3
2021-01-05 12:17:32 -08:00
Xavier Deguillard
34edb7b618 win: re-use guid for the lifetime of the checkout
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance.
That GUID is then propagated automatically to the created placeholders when
these are created as a response to a getPlaceholderInfo callback.

When the placeholders are created by EdenFS when invalidating directories, we
have to pass a GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatching due to the mount point GUID being generated at startup time
and not re-used.

One of the most common issues that users have reported is that sometimes
operations on the repository start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error to the root of the repository shows that one of the
folders and its descendants don't share the same GUID; removing it solves the
issue.

It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step that we can take to solve this is to always re-use
the GUID, and that hopefully will lead to the GUID always being the same and
the error going away.

Reviewed By: fanzeyi

Differential Revision: D25513122

fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
2020-12-15 08:07:49 -08:00
Xavier Deguillard
dceab3479b prjfs: thread notifications in PrjfsChannel
Summary:
EdenFS can now show notifications to the user when something goes wrong.

Reviewed By: chadaustin

Differential Revision: D24864354

fbshipit-source-id: fabc30f14bc022b4367af562481235fe984df458
2020-12-03 10:45:30 -08:00
Xavier Deguillard
6d90ceea25 prjfs: add timeout to ProjectedFS callbacks
Summary:
One of the main sub-par user experiences on Windows is the lack of notification
of any kind when EdenFS can't reach the Mercurial servers. Prior to this diff,
the callbacks would never return, causing commands to simply hang for the
user.

As a first step, let's add a timeout. A later step will hook the notification
mechanism used on macOS/Linux to display a notification when timeouts occur.

The only callback that doesn't have a proper timeout is the notification one,
as timing out on these would mean that EdenFS won't have registered that some
files/directories have been materialized which will lead to inconsistencies
later.

Reviewed By: kmancini

Differential Revision: D24809645

fbshipit-source-id: 0ddd9d443a17db405a3edbaa8edecf3764c31d37
2020-12-03 10:45:29 -08:00
Xavier Deguillard
faf0985885 prjfs: wait for pending requests before unmounting
Summary:
As described in the previous diff, unmounting a repo while a request is pending
would lead to a use after free. To solve this, we can wrap the inner channel
with a shared_ptr, and set it to NULL whenever unmount is in progress.

While this solution has a fairly large overhead due to requiring at least 2
atomics per callback (one for the lock, the second one for the shared_ptr
copy), it is correct. A future improvement will swap these with an RCU pointer
to reduce the callback cost to almost nothing.
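
The idea can be modeled roughly as below. Python has no shared_ptr, so this
sketch replaces the reference count with an explicit counter guarded by a
condition variable. The class names and methods are hypothetical, and only the
shape (take a reference under the lock, null it on unmount, wait for pending
callbacks) follows the description above.

```python
import threading


class InnerChannel:
    def handle(self, request):
        return f"handled:{request}"


class Channel:
    def __init__(self):
        self._cond = threading.Condition()
        self._inner = InnerChannel()
        self._pending = 0

    def callback(self, request):
        with self._cond:
            inner = self._inner  # copy the reference under the lock
            if inner is None:
                return None  # unmount in progress: refuse the request
            self._pending += 1
        try:
            return inner.handle(request)
        finally:
            with self._cond:
                self._pending -= 1
                self._cond.notify_all()

    def unmount(self):
        with self._cond:
            self._inner = None  # new callbacks can no longer get a reference
            while self._pending:
                self._cond.wait()  # wait for in-flight callbacks to finish
```

Once `unmount()` returns, no callback still holds the inner channel, so it can
be torn down safely.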

Reviewed By: chadaustin

Differential Revision: D25071423

fbshipit-source-id: 77d14a38403bef3e276d3e5e48e6fd95dd641964
2020-12-03 10:45:29 -08:00
Xavier Deguillard
54f6aae49c prjfs: add an inner channel
Summary:
There is currently a race condition where unmounting a repo can happen
concurrently with a ProjectedFS notification/callback. Depending on who wins
that race, this can lead to a use-after-free as the PrjfsChannel/EdenMount
would be freed but the callback would still have a reference to it.

To solve this, we need to keep track of inflight requests, and in particular
make sure that memory isn't freed before all the pending callbacks have
completed. And that effectively means that we need to refcount the channel used
by these callbacks so we only free the memory when nobody else is using it.

The first step towards this is splitting the channel in 2 halves.

Reviewed By: chadaustin

Differential Revision: D25071422

fbshipit-source-id: 743f38c9b19ba534961d06ea6f2ddc96b685fe19
2020-12-03 10:45:29 -08:00
Xavier Deguillard
8b82dc96cb prjfs: make readdir asynchronous
Summary:
As of right now, opendir is the expensive callback due to fetching the sizes
for all the files in a directory. This strategy however breaks down when
timeouts are added as very large directories would trigger the timeout, not
because EdenFS is having a hard time reaching Mononoke, but because of
bandwidth limitation.

To avoid this issue, we need to have a per-file timeout and thus make opendir
just trigger the futures, but not wait on them. The waiting bit will happen
at readdir time which will also enable having a timeout per file.

The one drawback to this is that the ObjectFetchContext that is passed in by
opendir cannot live long enough to be used in the size future. For now, we can
use a null context, but a proper context will need to be passed in, in the
future.
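
The split between opendir and readdir can be sketched in Python as follows.
This is a simplified illustration with hypothetical helpers (`fetch_pool`,
`fetch_size`, `demo`), not the ProjectedFS callback signatures: opendir only
starts the fetches, and readdir waits on each one with its own timeout.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

fetch_pool = ThreadPoolExecutor(max_workers=4)


def opendir(names, fetch_size):
    """Start one size fetch per file, but do not wait on any of them."""
    return {name: fetch_pool.submit(fetch_size, name) for name in names}


def readdir(handle, per_file_timeout):
    """Wait on each future separately so the timeout applies per file."""
    entries = []
    for name, future in handle.items():
        try:
            entries.append((name, future.result(timeout=per_file_timeout)))
        except TimeoutError:
            entries.append((name, None))  # only this file timed out
    return entries


def demo():
    stuck = threading.Event()  # simulates a fetch that cannot complete

    def fetch_size(name):
        if name == "slow":
            stuck.wait()
        return len(name)

    try:
        return readdir(opendir(["a", "slow", "bb"], fetch_size), 0.1)
    finally:
        stuck.set()  # release the blocked worker so the pool can exit
```

In `demo()`, the entry for "slow" times out on its own while "a" and "bb"
still report their sizes, so a large directory no longer trips a single
directory-wide timeout.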

Reviewed By: wez

Differential Revision: D24895089

fbshipit-source-id: e10ceae2f7c49b4a006b15a34f85d06a2863ae3a
2020-11-13 14:27:26 -08:00
Chad Austin
f6fcff3151 move strace logging into FuseChannel
Summary:
Instead of logging in the Dispatcher, move strace logging to
FuseChannel where it can be standardized for all FUSE request types.

Reviewed By: wez

Differential Revision: D24035838

fbshipit-source-id: c84d8c27b62f9944e2d26a35a7ed7bbbeeb5bf0e
2020-10-20 09:34:03 -07:00
Xavier Deguillard
18a313cba0 inodes: make invalidating inodes fallible
Summary:
While on Linux these can't fail (or, to be more precise: it doesn't matter),
they can on Windows. One such example is when a user locks a file and triggers
an update that modifies this file. The invalidation will fail, and thus the
update should keep track of that file not being updated properly.

Previously, the invalidation would raise an exception, but that proved to be
the wrong approach as some state would need to be rolled back, which the
exception didn't help with. For that, let's just return a Try and make sure that
we handle all the cases properly.
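
As a rough illustration of returning a result value instead of raising, here
is a Python sketch. The `Try` class below is a hand-written stand-in, not
folly::Try, and `invalidate` and `update` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Try:
    """Holds either success (error is None) or the exception that occurred."""
    error: Optional[Exception] = None

    def has_exception(self):
        return self.error is not None


def invalidate(path, locked):
    """Fails without raising when the file is locked by the user."""
    if path in locked:
        return Try(PermissionError(f"{path} is locked"))
    return Try()


def update(paths, locked):
    """Invalidate every path and keep track of the ones that failed."""
    failed = []
    for path in paths:
        if invalidate(path, locked).has_exception():
            failed.append(path)
    return failed
```

Because the failure is a value, the caller sees it at the exact point where
state has to be recorded or rolled back, rather than in a distant handler.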

Reviewed By: chadaustin

Differential Revision: D24163672

fbshipit-source-id: ac881984138eefa65c053478a160e2a653fd3fdf
2020-10-15 17:31:13 -07:00
Xavier Deguillard
1fef8cbc1a prjfs: remove FsChannel.h
Summary: This is not needed, we can use the PrjfsChannel class directly where needed.

Reviewed By: chadaustin

Differential Revision: D23946259

fbshipit-source-id: eafcd38c0927fa282d62ada0986a7ef8b612174b
2020-09-28 18:14:30 -07:00
Xavier Deguillard
4c2019197a prjfs: add TARGETS file
Summary:
This enables autodeps, and brings us one step closer to building EdenFS with
Buck on Windows.

Reviewed By: fanzeyi

Differential Revision: D23857794

fbshipit-source-id: c8587a6f7b9e4d9575a62f592c1d0737dff2a8f0
2020-09-23 09:43:35 -07:00
Xavier Deguillard
3291feac91 prjfs: add an abstract Dispatcher class
Summary:
If we are to look at the dependency graph for EdenFS on Windows, we would
notice that the eden_prjfs target depends on eden_inodes, and vice versa,
causing a cycle. While CMake is perfectly happy with that, Buck doesn't like
that. The solution to removing this cycle is to move the code that needs the
dependency to eden_inodes into the eden_inodes target, and that's the
EdenDispatcher. However, since PrjfsChannel needs to hold a dispatcher to call
into it, it needs to know the methods exposed by the dispatcher. To achieve
this, a simple abstract class is added; this is the same as what is done for
FUSE.
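
The dependency inversion described above looks roughly like this, shown in
Python for brevity. The class names come from the text, but the `lookup` and
`on_lookup` methods are hypothetical. `Dispatcher` and `PrjfsChannel` would
live in the prjfs target, and `EdenDispatcher` in the inodes target.

```python
from abc import ABC, abstractmethod


class Dispatcher(ABC):
    """Abstract interface: the only thing the channel needs to know about."""

    @abstractmethod
    def lookup(self, path):
        ...


class PrjfsChannel:
    """Depends on the interface only, never on the concrete dispatcher."""

    def __init__(self, dispatcher: Dispatcher):
        self._dispatcher = dispatcher

    def on_lookup(self, path):
        return self._dispatcher.lookup(path)


class EdenDispatcher(Dispatcher):
    """Concrete implementation that is free to depend on inode code."""

    def lookup(self, path):
        return f"inode:{path}"
```

With this layout, the channel side no longer imports anything from the inode
side, which is what removes the cycle.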

Reviewed By: wez

Differential Revision: D23857540

fbshipit-source-id: c495c67d43724f648e5ffa17776e4d5d4513698a
2020-09-23 09:43:34 -07:00
Xavier Deguillard
a4f6a1abe0 prjfs: move win/mount into prjfs
Summary:
Now that the win directory only contains the mount directory, we can rename it
to be more faithful to its intent. Since this is about ProjectedFS, let's
rename it "prjfs".

Reviewed By: chadaustin

Differential Revision: D23828561

fbshipit-source-id: cb31fe4652fd4356dc2579028d3ae2c7935371a7
2020-09-22 09:09:56 -07:00