Commit Graph

40 Commits

Author SHA1 Message Date
Saurabh Singh
0b9c2fb148 telemetry: switch to using quantile stats instead of histograms
Summary:
Histograms are memory intensive and require specifying fixed buckets for
instantiation.

Reviewed By: chadaustin

Differential Revision: D26315631

fbshipit-source-id: 6ce5459b8c1c6c25d84285baf96df55ce9119b1a
2021-02-14 16:37:08 -08:00
Xavier Deguillard
62076b545e inodes: move dispatchers around
Summary:
Instead of having one "Dispatcher" type that the various backend overload,
let's simply have a per-mount type dispatcher type. The previous model worked
fine when EdenFS supported only one way of mounting a repository, but with NFS
coming, unix platform will support both FUSE and NFS, making the Dispatcher
overload nonsensical.

As a behavioral change, the dispatcher lifetime and ownership is changed a bit.
It used to live for the duration of the EdenMount object, but is now tied to
the channel lifetime, as it is now owned by it.

Reviewed By: kmancini

Differential Revision: D26329477

fbshipit-source-id: 3959b90a4909e3ab0898caa308f54686f59a943c
2021-02-10 11:52:06 -08:00
Andres Suarez
21c95391ca Apply clang-format update fixes
Reviewed By: igorsugak

Differential Revision: D25861960

fbshipit-source-id: e3c39c080429058a58cdc66d45350e5d1420f98c
2021-01-10 10:06:29 -08:00
Xavier Deguillard
adf6aa8d4f inodes: move strace logging to PrjfsChannel
Summary:
All of the strace logging was done in PrjfsChannel except for the notification
callbacks, let's remediate this.

Reviewed By: kmancini

Differential Revision: D25643491

fbshipit-source-id: 7eaed2503557b0e486d7d1b0637c68287ee9df90
2021-01-05 12:17:33 -08:00
Xavier Deguillard
0855712a77 prjfs: refactor some common code
Summary:
In a previous diff, chadaustin noted that there was a bunch of duplicated code
prior to calling into the PrjfsChannel, let's use template to solve this.

One of the non-refactored piece is the BAIL_ON_RECURSIVE_CALL, and I'm not sure
of a good way to move it into runCallback while still being able to understand
what callback is recursive. Previously, the line number from XLOG was
sufficient, moving it into the runCallback function would lose that.

Reviewed By: chadaustin

Differential Revision: D25576860

fbshipit-source-id: 619ed0c9fecf05cda2263dfcdf2fbcbaec85e45a
2021-01-05 12:17:33 -08:00
Xavier Deguillard
1106d40eae prjfs: replace Synchronized<shared_ptr> with RcuPtr
Summary:
The RcuPtr abstraction allows us to use RCU instead of the significantly more
expensive Synchronized<shared_ptr>. This should reduce the cost of all the
callbacks while not sacrificing the guarantee that unmounting a repository
needs to wait for all the pending callbacks to complete.

A new rcu_domain is used as the pending callbacks may sleep and take a long
time to complete when the servers aren't reachable. To avoid penalizing all the
other RCU clients, it's best to be isolated in its own domain.

Reviewed By: kmancini

Differential Revision: D25351535

fbshipit-source-id: bd40d59056e3e710c28c42d651b79876be496bc3
2021-01-05 12:17:32 -08:00
Xavier Deguillard
bc56270443 prjfs: only send a notification on future timeout
Summary:
When compiling with Buck, it tries to create hardlink which are rightfully
failing, but that error then triggers a spurious notification. For now, let's
only notify the user on timeout, as these are very likely to be network issues.

Reviewed By: kmancini

Differential Revision: D25500364

fbshipit-source-id: 95b609ae901fa6207c8edba26cd8e6a21ddfe3ac
2020-12-15 18:13:01 -08:00
Xavier Deguillard
4001bb11b9 win: silence a handful of warnings.
Summary: These show up when compiling with Buck, let's silence them.

Reviewed By: chadaustin

Differential Revision: D25513672

fbshipit-source-id: 277afae30059114f3646cdf4feedac442a4ee1b6
2020-12-15 08:07:49 -08:00
Xavier Deguillard
34edb7b618 win: re-use guid for the lifetime of the checkout
Summary:
On Windows, the GUID of the mount point identifies the virtualization instance,
that GUID is then propagated automatically to the created placeholders when
these are created as a response to a getPlaceholderInfo callback.

When the placeholders are created by EdenFS when invalidating directories we
have to pass GUID. The documentation isn't clear about whether that GUID needs
to be identical to the mount point GUID, but for a very long time these have
been mismatching due to the mount point GUID being generated at startup time
and not re-used.

One of the most common issue that users have reported is that sometimes
operations on the repository start failing with the error "The provider that
supports file system virtualization is temporarily unavailable". Looking at the
output of `fsutil reparsepoint query` for all the directories from the file
that triggers the error to the root of the repositories, shows that one of the
folder and its descendant don't share the same GUID, removing it solves the
issue.

It's not clear to me why this issue doesn't always reproduce when restarting
EdenFS, but a simple step that we can take to solve this is to always re-use
the GUID, and that hopefully will lead to the GUID always being the same and
the error to go away.

Reviewed By: fanzeyi

Differential Revision: D25513122

fbshipit-source-id: 0058dedbd7fd8ccae1c9527612ac220bc6775c69
2020-12-15 08:07:49 -08:00
Xavier Deguillard
754d132783 Back out "prjfs: handle concurrent file/directory removal"
Summary: Something is wrong with this which causes Unity to freak out.

Reviewed By: fanzeyi

Differential Revision: D25453230

fbshipit-source-id: 89f61fd97817403fa65071ddac022a226b775e53
2020-12-10 07:42:38 -08:00
Xavier Deguillard
2bfdf6f481 prjfs: handle concurrent file/directory removal
Summary:
A while back, we saw that concurrent directory creation would lead to EdenFS
being confused and failing to record some of the created directories. This then
caused EdenFS to no longer being in sync with what was on disk. To handle this
case, we've had to manually creating these directories recursively.

What I didn't realize at the time was that these concurrent notifications could
also happen on removal this time, and if a directory removal notification wins
the race against the removal of its last children, that directory wouldn't be
removed and EdenFS would once again be confused about the state of the
repository.

Fixing this is a bit trickier than directory creation as it's more racier.
Consider a directory that is being removed, and then immediately recreated with
a file in it in a different process. The naive approach of simply force
removing all of the children of a directory when handling the removal
notification would clash with the file creation. We could argue that nobody
should be doing this, but there would be an unhandled race, and thus a bug
where data would potentially be lost[0].

We can however fix this bug slightly differently. For file/directory removal,
we can actually hook onto the pre-callback, ie: one that happens before the
file/directory is no longer visible on disk. This inherently eliminate the race
altogether as the callback will be guaranteed to run when none of its children
are present, and if a race happens with a file creation in it, we can simply
fail the removal properly.

The only tricky bit is for the renaming logic, as renaming a file is logically
a removal followed by a creation. For that reason, I've moved part of the
renaming bits to the pre-callback too.

In theory, this change may negatively affect workloads that do concurrent
directory removal as the duration during which a file/directory is visible
ondisk now includes the EdenFS callback while it didn't before. Such workflows
should be fairly rare and/or redirected to avoid EdenFS altogether if
performance matters.

[0]: This left-over file that EdenFS wouldn't be aware of would also later
cause the checkout code to fail due to invalidation failures triggered when
trying to invalidate that directory. This would be fairly hard to debug.

Reviewed By: fanzeyi

Differential Revision: D25112381

fbshipit-source-id: 9300499ce872ad93d0a687f0e61b7e2a9caf9556
2020-12-04 14:25:44 -08:00
Xavier Deguillard
446ae54b71 inodes: increment and decrement the fs refcount on Windows
Summary:
On Windows, the FS refcount is used to indicate that ProjectedFS knows about
this inode and either has a placeholder on disk, or a plain file. The first
event only occurs on lookup (similarly to Linux/macOS), while the second one
happens when files are created by the user and we receive a notification about
it.

In order to avoid races and to miss necessary invalidation, the refcount has to
be incremented after the placeholder has been created, and the refcount is
decremented before the invalidation is performed. This is straightforward to
achieve for notifications, but requires passing a callback to the PrfsChannel.

Reviewed By: chadaustin

Differential Revision: D24800819

fbshipit-source-id: 0e7ea7ed3a9ca0414e3e727fba975045546d82d1
2020-12-04 09:13:38 -08:00
Xavier Deguillard
dceab3479b prjfs: thread notifications in PrjfsChannel
Summary:
EdenFS can now show notifications to the user in case something wrong is
happening.

Reviewed By: chadaustin

Differential Revision: D24864354

fbshipit-source-id: fabc30f14bc022b4367af562481235fe984df458
2020-12-03 10:45:30 -08:00
Xavier Deguillard
6d90ceea25 prjfs: add timeout to ProjectedFS callbacks
Summary:
One of the main sub-par user experience on Windows is the lack of notification
of any kind when EdenFS can't reach the Mercurial servers. Prior to this diff,
the callbacks would never return, causing commands to simply hangs for the
user.

As a first step, let's add a timeout, a later step will hook the notification
mechanism used on macOS/Linux to display a notification when timeouts occurs.

The only callback that doesn't have a proper timeout is the notification one,
as timing out on these would mean that EdenFS won't have registered that some
files/directories have been materialized which will lead to inconsistencies
later.

Reviewed By: kmancini

Differential Revision: D24809645

fbshipit-source-id: 0ddd9d443a17db405a3edbaa8edecf3764c31d37
2020-12-03 10:45:29 -08:00
Xavier Deguillard
faf0985885 prjfs: wait for pending requests before unmounting
Summary:
As described in the previous diff, unmounting a repo while a request is pending
would lead to a use after free. To solve this, we can wrap the inner channel
with a shared_ptr, and set it to NULL whenever unmount is in progress.

While this solution has a fairly large overhead due to requiring at least 2
atomics per callbacks (one for the lock, the second one for the shared_ptr
copy), it is correct. A future improvement will swap these with an RCU pointer
to reduce the callback cost to almost nothing.

Reviewed By: chadaustin

Differential Revision: D25071423

fbshipit-source-id: 77d14a38403bef3e276d3e5e48e6fd95dd641964
2020-12-03 10:45:29 -08:00
Xavier Deguillard
54f6aae49c prjfs: add an inner channel
Summary:
There is currently a race condition where unmounting a repo can happen
concurrently with a ProjectedFS notification/callback. Depending on who wins
that race, this can lead to a use-after-free as the PrjfsChannel/EdenMount
would be freed but the callback would still have reference to it.

To solve this, we need to keep track of inflight requests, and in particular
make sure that memory isn't freed before all the pending callbacks have
completed. And that effectively means that we need to refcount the channel used
by these callbacks so we only free the memory when nobody else is using it.

The first step towards this is splitting the channel in 2 halves.

Reviewed By: chadaustin

Differential Revision: D25071422

fbshipit-source-id: 743f38c9b19ba534961d06ea6f2ddc96b685fe19
2020-12-03 10:45:29 -08:00
Xavier Deguillard
8b82dc96cb prjfs: make readdir asynchronous
Summary:
As of right now, opendir is the expensive callbacks due to fetching the sizes
for all the files in a directory. This strategy however breaks down when
timeouts are added as very large directories would trigger the timeout, not
because EdenFS is having a hard time reaching Mononoke, but because of
bandwidth limitation.

To avoid this issue, we need to have a per-file timeout and thus makes opendir
just trigger the futures, but not wait on them. The waiting bit will happen
at readdir time which will also enable having a timeout per file.

The one drawback to this is that the ObjectFetchContext that is passed in by
opendir cannot live long enough to be used in the size future. For now, we can
use a null context, but a proper context will need to be passed in, in the
future.

Reviewed By: wez

Differential Revision: D24895089

fbshipit-source-id: e10ceae2f7c49b4a006b15a34f85d06a2863ae3a
2020-11-13 14:27:26 -08:00
Xavier Deguillard
596d54adfe prjfs: properly format the placeholder warning
Summary:
The XLOG_EVERY_MS doesn't use the 3rd argument as a format string, it just
prints it verbatim. To format it, we need to use fmt::format.

Reviewed By: genevievehelsel

Differential Revision: D24906819

fbshipit-source-id: 7d45787301086fb87dd8f5d478af8007df82c0b6
2020-11-11 21:46:35 -08:00
Xavier Deguillard
41ed1bb0d2 prjfs: silence warning in Enumerator
Summary:
The move constructor needs to be noexcept and should also initialize the
members in the right order.

Reviewed By: genevievehelsel

Differential Revision: D24874304

fbshipit-source-id: a3db5dcdab1397b872b8f13ec5c7fd45baad5e6f
2020-11-11 20:00:39 -08:00
Xavier Deguillard
34598d4337 remove dependency on glog
Summary:
The EdenFS codebase uses folly/logging/xlog to log, but we were still relying
on glog for the various CHECK macros. Since xlog also contains equivalent CHECK
macros, let's just rely on them instead.

This is mostly codemodded + arc lint + various fixes to get it compile.

Reviewed By: chadaustin

Differential Revision: D24871174

fbshipit-source-id: 4d2a691df235d6dbd0fbd8f7c19d5a956e86b31c
2020-11-10 16:31:15 -08:00
Xavier Deguillard
f3b0124f01 prjfs: use a shared_ptr for the context
Summary:
In order to allow request to timeout to display notifications to the user, the
`within` future method will need to be called on the various callback futures.
Unfortunately, once the timeout expires, the underlying future isn't cancelled
and stopped, but the unique pointer holding the context will be reclaimed.
Whenever the future actually completes, it will try to use an invalid pointer,
crashing EdenFS.

To solve this, switch to using a shared_ptr and copy it in the right places so
it will only be freed once all futures holding a reference to it will be gone.

I also took the opportunity to reduce the nesting a bit to make the code more
readable.

Reviewed By: kmancini

Differential Revision: D24809647

fbshipit-source-id: 987d6e5763106fabc6bed3ea00d28b129b5285a1
2020-11-10 09:59:25 -08:00
Xavier Deguillard
d42a8c971d inodes: move path into the Dispatcher callbacks
Summary:
Dealing with non-owned path in futures is mind bending and can lead to use
after free fairly easily if not careful. It turns out that this code is
actually a bit buggy and in some cases will attempt to use a path that is
already freed. Since the callsite doesn't need to hold onto the paths, let's
just move them, which resolves the issuue.

Reviewed By: chadaustin

Differential Revision: D24657423

fbshipit-source-id: 47bbaccf18cd86e53860491e3cbfeadb4363499c
2020-11-02 12:26:56 -08:00
Xavier Deguillard
13236d6f8b prjfs: ignore access denied errors on invalidation
Summary:
We've had a handful of reports of access denied errors seen during an `hg
update`, ignore them due to:

```
// TODO(T78476916): The access denied are coming from
// PrjMarkDirectoryAsPlaceholder recursively calling into EdenFS, which
// is denied by the BAIL_ON_RECURSIVE_CALL macro.
//
// In theory this means that EdenFS is invalidating a directory that
// isn't materialized, ie: doing useless work. Despite having a negative
// performance impact, this doesn't affect correctness, so ignore for now.
//
// A long term fix will need to not issue invalidation on directories
// that aren't materialized.
```

Reviewed By: wez

Differential Revision: D24544622

fbshipit-source-id: 4f421836b6d8c7b527e860dcf55103e0e239e3ae
2020-10-26 14:02:16 -07:00
Chad Austin
f6fcff3151 move strace logging into FuseChannel
Summary:
Instead of logging in the Dispatcher, move strace logging to
FuseChannel where it can be standardized for all FUSE request types.

Reviewed By: wez

Differential Revision: D24035838

fbshipit-source-id: c84d8c27b62f9944e2d26a35a7ed7bbbeeb5bf0e
2020-10-20 09:34:03 -07:00
Xavier Deguillard
18a313cba0 inodes: make invalidating inodes fallible
Summary:
While on Linux these can't fail (or, to be more precise: it doesnt' matter),
they can on Windows. One such exemple is when a user lock a file and triggers
an update that modifies this file. The invalidation will fail, and thus the
update should keep track of that file not being updated properly.

Previously, the invalidation would raise an exception, but that proved to be
the wrong approach as some state would need to be rolled back which the
exception didn't help in. For that, let's just return a Try and make sure that
we handle all the cases properly.

Reviewed By: chadaustin

Differential Revision: D24163672

fbshipit-source-id: ac881984138eefa65c053478a160e2a653fd3fdf
2020-10-15 17:31:13 -07:00
Xavier Deguillard
25228797ca prjfs: ignore ERROR_PATH_NOT_FOUND during invalidation
Summary:
Similarly to how we could try invalidating a file that isn't cached, we could
also be trying to invalidate a file whose path isn't cached. Both are
legitimate, and thus we need to ignore both.

Reviewed By: chadaustin

Differential Revision: D24125225

fbshipit-source-id: e8abe5cde5aa3602bb48258abb64aa0cdf60241d
2020-10-05 17:50:57 -07:00
Xavier Deguillard
d8d841ae80 prjfs: add partial support for debug processfetch on Windows
Summary:
This will enable to gather a bit more debugging regarding what processes are
fetching data. The one missing bit on Windows is to collect the process name,
for now, a "NOT IMPLEMENTED" placeholder is put in place.

Reviewed By: wez

Differential Revision: D23946258

fbshipit-source-id: 9f7642c7b9207c5b48ffff0f4eb0333af00bc7d5
2020-10-05 15:46:02 -07:00
Xavier Deguillard
9640f62409 prjfs: don't silently ignore invalidation errors
Summary:
For the longest time, invalidation errors were simply ignored, and while some
of them are valid to ignore, others just aren't and should be surfaced. One
exemple of a valid error is when a file is opened with no sharing, attempts to
invalidate the file would fail, but not inform the users, who would then be
surprised that the file wouldn't be updated on disk properly.

Instead of ignoring all errors, I've special cased the ones that we may be
expecting to receive, and will treat all the others as errors. A future diff
will attempt to properly not update the parent TreeInode so that a failed
invalidation would then be visible during an `hg update`.

Reviewed By: genevievehelsel

Differential Revision: D23980547

fbshipit-source-id: 2e67bfe817951fd0b497214a6474ff28b953b6e6
2020-09-30 12:28:31 -07:00
Xavier Deguillard
29922672ee prjfs: add a missing {} in a format string
Summary: The logs would never show what directory was changed to be a placeholder.

Reviewed By: fanzeyi

Differential Revision: D23977719

fbshipit-source-id: b1f7f539457956510638a31ca3e49c4cee641fd8
2020-09-28 18:49:49 -07:00
Xavier Deguillard
1fef8cbc1a prjfs: remove FsChannel.h
Summary: This is not needed, we can use the PrjfsChannel class directly where needed.

Reviewed By: chadaustin

Differential Revision: D23946259

fbshipit-source-id: eafcd38c0927fa282d62ada0986a7ef8b612174b
2020-09-28 18:14:30 -07:00
Xavier Deguillard
4c2019197a prjfs: add TARGETS file
Summary:
This enables autodeps, and brings us one step closer to building EdenFS with
Buck on Windows.

Reviewed By: fanzeyi

Differential Revision: D23857794

fbshipit-source-id: c8587a6f7b9e4d9575a62f592c1d0737dff2a8f0
2020-09-23 09:43:35 -07:00
Xavier Deguillard
eaa270730f inodes: move prjfs/EdenDispatcher to inodes/
Summary:
Now that prjfs/EdenDispatcher is no longer directly tied to ProjectedFS (sort
of, this is still a bit WIP), we can move it out of the prjfs directory onto
the inodes one. This allows breaking the circular dependency cycle mentioned in
the previous diff where prjfs and inodes depend on each other.

For now, this is merely a copy/paste of the code enclosed in big #ifdef _WIN32,
we might be able to do better later, but for now this is properly good enough.

Reviewed By: wez

Differential Revision: D23857539

fbshipit-source-id: 77c620bac1656d01d7daee4dbf8b10694a589751
2020-09-23 09:43:34 -07:00
Xavier Deguillard
3291feac91 prjfs: add an abstract Dispatcher class
Summary:
If we are to look at the dependency graph for EdenFS on Windows, we would
notice that the eden_prjfs target depends on eden_inodes, and vice versa,
causing a cycle. While CMake is perfectly happy with that, Buck doesn't like
that. The solution to removing this cycle is to move the code that needs the
dependency to eden_inodes into the eden_inodes target, and that's the
EdenDispatcher. However, since PrjfsChannel needs to hold a dispatcher to call
into it, it needs to know the methods exposed by the dispatcher. To achieve
this, a simple abstract class is added, this is the same as what is done for
FUSE.

Reviewed By: wez

Differential Revision: D23857540

fbshipit-source-id: c495c67d43724f648e5ffa17776e4d5d4513698a
2020-09-23 09:43:34 -07:00
Xavier Deguillard
d75fdba6ae win: remove folly/Format.h dependency
Summary: The Windows code is now free of sformat, and only depends on fmt::format.

Reviewed By: fanzeyi

Differential Revision: D23786940

fbshipit-source-id: 4c9d4a8c30e5a3d5eb7971cdc3139539c5893a8d
2020-09-23 09:43:34 -07:00
Xavier Deguillard
ec9241415b win: remove unecessary PrjfsRequestContext.h from EdenDispatcher.cpp
Summary:
Now that all the logic is in PrjfsChannel.cpp, this header no longer needs to
be included. This helps in making this file as free as possible from Windows
specific bits which will allow it to be moved into the inodes top level
directory.

Reviewed By: fanzeyi

Differential Revision: D23786941

fbshipit-source-id: a4a3a47ea00808ca7a839709068efab06436167f
2020-09-23 09:43:34 -07:00
Xavier Deguillard
36f4a1246a inodes: thread an ObjectFetchContext in getInode()
Summary:
Most of the non-tests callers have an ObjectFetchContext available, let's use
it instead of a static null context. This will help in understanding why some
files/trees are being fetched.

Reviewed By: chadaustin

Differential Revision: D23745752

fbshipit-source-id: b2e145d9e559bde0542adbc5b20ff56ccfc59ece
2020-09-23 09:43:34 -07:00
Xavier Deguillard
77ab04daf2 win: thread the request context to readdir
Summary:
This allows removal of a NullContext which will provide more details as to why
fetches are triggered.

Reviewed By: chadaustin

Differential Revision: D23745755

fbshipit-source-id: c26f648b8695b86caf8fe15c4bc6c4128d345aa1
2020-09-23 09:43:34 -07:00
Xavier Deguillard
436a847a79 win: make read callback fully asynchronous
Summary:
Similarly to the other callbacks, this makes the main function return to
ProjectedFS as soon as the future is created which will allow for it to be
interrupted in a subsequent diff.

Reviewed By: fanzeyi

Differential Revision: D23745754

fbshipit-source-id: 2d77d0eacfe0d37eb9075bf9f0660e4f4af77e8f
2020-09-23 09:43:34 -07:00
Xavier Deguillard
495cc257b4 win: make EdenDispatcher::read asynchronous
Summary:
This merely moves Windows bits outside of the EdenDispatcher and into the
PrjfsChannel, making the former less dependant on Windows, and paving the way
to handling this callback fully asynchronous.

One caveat currently is that while the callback supports specifying an offset
and length, the underlying backing store only allows reading the entire file at
once, thus these arguments are simply ignored in the dispatcher.

Reviewed By: fanzeyi

Differential Revision: D23745753

fbshipit-source-id: 266e1f448f9db536d746da1462a2a590ffad19a6
2020-09-23 09:43:34 -07:00
Xavier Deguillard
a4f6a1abe0 prjfs: move win/mount into prjfs
Summary:
Now that the win directory only contains the mount directory, we can rename it
to be more faithful to its intent. Since this is about ProjectedFS, let's
rename it "prjfs".

Reviewed By: chadaustin

Differential Revision: D23828561

fbshipit-source-id: cb31fe4652fd4356dc2579028d3ae2c7935371a7
2020-09-22 09:09:56 -07:00