Summary: To improve build times, switched to forward declarations to remove a dependency on EdenStats.h in InodeMap.h.
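A minimal sketch of the forward-declaration technique, shown in a single file for brevity. The names `Stats` and `InodeMapSketch` are hypothetical stand-ins and are not the real EdenStats or InodeMap classes; the point is only that a header holding a `std::shared_ptr` to a type needs a declaration of that type, not its full definition.

```cpp
#include <memory>
#include <utility>

// "Header" part: a forward declaration is enough for pointer, reference,
// and shared_ptr members, so the header of the real type is not included.
class Stats;

class InodeMapSketch {
 public:
  explicit InodeMapSketch(std::shared_ptr<Stats> stats);
  void recordLookup();
  long lookups() const;

 private:
  std::shared_ptr<Stats> stats_;
};

// "Source file" part: only here is the full definition required.
class Stats {
 public:
  long lookups = 0;
};

InodeMapSketch::InodeMapSketch(std::shared_ptr<Stats> stats)
    : stats_(std::move(stats)) {}

void InodeMapSketch::recordLookup() {
  ++stats_->lookups;
}

long InodeMapSketch::lookups() const {
  return stats_->lookups;
}
```

Files that include only the "header" part no longer need to be rebuilt when the definition of the stats type changes.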
Reviewed By: chadaustin
Differential Revision: D49546029
fbshipit-source-id: 1306d28465527e7d6109d413f7e9f4ecb2e7e74e
Summary: InodeMap, in subsequent diffs, will report metrics when looking up inodes (hits, misses, errors). Adding EdenStats to its initialization to enable this.
Reviewed By: chadaustin
Differential Revision: D49477891
fbshipit-source-id: 86f7c357d5cf011897d1ccb57e42f80f55890ba0
Summary:
Now that Windows is going to support NFS, we need a reliable way to check at
runtime whether a file should be read from disk (Prjfs) or from
the overlay (NFS).
We read inodes during fsck which is before the mount is fully initialized, so
we need to be able to detect the mount type earlier than how we do right now.
Here I am moving the NFS detection before eden mount initialization.
While moving this I realized that the InodeMap's view of NFS is incorrect
in the takeover case, so I am also fixing that here.
Reviewed By: chadaustin
Differential Revision: D45020849
fbshipit-source-id: b0a8fd431a37174c81b0d053d92b8bac026bd0f1
Summary:
During invalidation on Windows, we need to know if the inode is known by the
InodeMap. This is currently achieved by testing the loaded inode set and the
unloaded one, taking the InodeMap lock twice, and constructing an InodePtr to
destroy it immediately.
The InodePtr destruction is potentially deadlock inducing as that may require
taking the parent lock, which is already taken. To avoid this, the newly added
method merely tests both sets without constructing an InodePtr.
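As a rough illustration of the idea (not the actual InodeMap method), the sketch below checks both sets under a single lock acquisition and returns a plain bool, so no smart pointer is created or destroyed while the lock is held. The class `InodeSets` and its member names are hypothetical.

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_set>

class InodeSets {
 public:
  void addLoaded(std::uint64_t ino) {
    std::lock_guard<std::mutex> guard(mutex_);
    loaded_.insert(ino);
  }

  void addUnloaded(std::uint64_t ino) {
    std::lock_guard<std::mutex> guard(mutex_);
    unloaded_.insert(ino);
  }

  // Takes the lock once and only tests membership. Nothing with a
  // non-trivial destructor is created here, so no other lock can be
  // needed on the way out.
  bool isKnown(std::uint64_t ino) const {
    std::lock_guard<std::mutex> guard(mutex_);
    return loaded_.count(ino) != 0 || unloaded_.count(ino) != 0;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<std::uint64_t> loaded_;
  std::unordered_set<std::uint64_t> unloaded_;
};
```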
Reviewed By: genevievehelsel
Differential Revision: D40956445
fbshipit-source-id: 662c139634e9ce79afacabcff2aeb9a682fc33b2
Summary:
After releasing a new feature to trace inode loads by using the InodeTraceBus, some users reported accounts of VS Code hanging. We were able to trace this down to a deadlock occurring where inode load events would be published to tracebus while holding the data_ lock. Then, if capacity is reached, tracebus will block while still holding the data_ lock. Finally, when subscribers then run and try to calculate paths, they would try to acquire the data lock, deadlocking both threads.
This change prevents this deadlock by ensuring we always release the data_ lock before publishing an inode trace event to tracebus. To do this, I separate the locations where an InodeTraceEvent is created and where it is later published.
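The pattern can be sketched as follows, assuming a hypothetical `TracedMap` class with a plain callback standing in for the trace bus: the event is built while the lock is held, but it is published only after the lock has been released, so a subscriber that calls back into the map cannot deadlock.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>

struct TraceEvent {
  std::uint64_t ino;
  std::string kind;
};

class TracedMap {
 public:
  explicit TracedMap(std::function<void(const TraceEvent&)> publish)
      : publish_(std::move(publish)) {}

  void startLoad(std::uint64_t ino) {
    std::optional<TraceEvent> event;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      // Create the event under the lock, but do not publish it yet.
      if (loading_.insert(ino).second) {
        event = TraceEvent{ino, "load-start"};
      }
    } // The lock is released here.
    if (event) {
      publish_(*event); // May block or re-enter the map safely.
    }
  }

  bool isLoading(std::uint64_t ino) const {
    std::lock_guard<std::mutex> guard(mutex_);
    return loading_.count(ino) != 0;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<std::uint64_t> loading_;
  std::function<void(const TraceEvent&)> publish_;
};
```

If `publish_` were called inside the locked scope, a subscriber calling `isLoading()` would try to take the same non-recursive mutex again.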
Reviewed By: MichaelCuevas
Differential Revision: D38708193
fbshipit-source-id: ab9783b9f18c36e84be4396279cdbce981d4615a
Summary:
So far the ActivityBuffer in EdenFS has only supported storing inode materialization events, which are the only type of event currently being displayed in the eden trace inode and eden trace inode --retroactive commands. In this change and the previous diff, we add support for storing and displaying inode load events as well.
This change specifically modifies the InodeMap code to publish events when inodes first begin loading, finish loading, or fail loading, as well as adds a new attribute on UnloadedInode objects to record the timestamp for when the inode first started loading.
Reviewed By: kmancini
Differential Revision: D38092233
fbshipit-source-id: 4a16e811d596db0553b3ca4d503e4843d82fc6c4
Summary:
When we migrated from Hash20 to ObjectId, we didn't fix the #include structure.
Clean that up.
Reviewed By: genevievehelsel
Differential Revision: D32977635
fbshipit-source-id: 202b02f01f22bc174c7559c22af081deb2945caa
Summary:
With Facebook having been renamed Meta Platforms, we need to change the license
headers.
Reviewed By: fanzeyi
Differential Revision: D33407812
fbshipit-source-id: b11bfbbf13a48873f0cea75f212cc7b07a68fb2e
Summary:
The goal of this stack is to remove the Proxy Hash type, but to achieve that we first need to address some tech debt in the Eden codebase.
For a long time EdenFS had a single Hash type that was used for many different use cases.
One of the major uses of the Hash type is identifying internal EdenFS objects such as blobs, trees, and others.
We seem to have reached agreement that we need a different type for those identifiers, so this diff introduces a separate ObjectId type to denote the new identifier type and replaces _some_ usages of Hash with ObjectId.
We still retain the original Hash type for other use cases.
Roughly speaking, this is how this diff separates between Hash and ObjectId:
**ObjectId**:
* Everything that is stored in the local store (blobs, trees, commits)
**Hash20**:
* Explicit hashes (Sha1 of the blob)
* Hg identifiers: manifest id and blob hg id
For now, in this diff, ObjectId has exactly the same content as Hash, but this will change in future diffs. Doing it this way keeps the diff size manageable, while migrating to the new ObjectId right away would produce an insanely large diff that would be hard to both write and review.
There are a few more things that need to be done before we can get to the meat of removing proxy hashes:
1) Replace includes of Hash.h with ObjectId.h where needed
2) Remove the Hash type, explicitly renaming the rest of the Hash usages to Hash20
3) Modify the content of ObjectId to support new use cases
4) Modify serialized metadata and possibly other places that assume ObjectId size is fixed and equal to Hash20 size
Reviewed By: chadaustin
Differential Revision: D31316477
fbshipit-source-id: 0d5e4460a461bcaac6b9fd884517e129aeaf4baf
Summary:
We run periodic inode unloading for unlinked inodes on NFS because we get no
information from the client on when inodes are no longer needed, and we have to
clean them up at some point for memory and disk reasons. See previous commit
summaries for more details on this (D30144901 (ffa558bf84)).
Let's add some counters on this so we have a bit more visibility into the
process. This counter is meant to mimic the PeriodicUnloadCounter counter.
Reviewed By: chadaustin
Differential Revision: D30966688
fbshipit-source-id: cfc8d769b53073d9f4c0c27b6bee20e222c6c8d2
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This means the NFS kernel might have references to inodes after we delete them.
An unknown inode number is not a bug on NFS. It's just stale, so the error should
reflect that.
Reviewed By: xavierd
Differential Revision: D30144898
fbshipit-source-id: 3d448e94aea5acb02908ea443bcf3adae80eb975
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
It can be disruptive to a user's workflow because open files that were rm'ed
or removed on checkout will no longer have their old content. (On a native
filesystem or FUSE, applications that had the file open prior to the removal
would still be able to access the files.) For most editors this is not a problem
because they read the file on open (seems fine for vim and vscode from testing).
However, folks could theoretically have a workflow this does not jibe with.
Let's make it configurable how often this runs, so users can control how
much we disrupt their workflow.
Reviewed By: xavierd
Differential Revision: D30144899
fbshipit-source-id: 59cf5faea70b3aea216ca2bcb45b96e34f5e72b5
Summary:
NFSv3 has no inode invalidation flow built into the protocol. The kernel does not
send us forget messages like we get in FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS cannot easily tell when
all handles to a file have been closed.
As it stands, we never clean up inodes. This is bad for memory & disk usage.
We will never unload an inode, so we always keep it in memory once it's created.
Additionally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/
We need to clean up these inodes at some point. There are a couple of high-level
options:
1. Support NFSv4. NFSv4 sends us a close message when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.
NFSv4 is probably the right long-term solution, but for now we should be able to
get by with periodic unloads.
I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes).
Option 1. feels a bit too hostile to applications that hold files open.
Option 3. means we will build up a lot of cruft over the course of the day, but it is
probably the most application friendly.
I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below), so I am going with it.
This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/remove dir as well, but this would be more expensive.
Batch unloading on checkout seems better to me and should happen
frequently enough to clean up space for people.
There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or has bugs and unlinks legitimate inodes. Normally we let those live in the
overlay so we could go in and recover them. My plan to fix this is to mark inodes
for unloading instead of just unloading all unlinked inodes.
Reviewed By: xavierd
Differential Revision: D30144901
fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
Summary: This is unused, no need to keep it around.
Reviewed By: genevievehelsel
Differential Revision: D30046503
fbshipit-source-id: 1d20d9b4ce672d5d79410203807dbc93b4bce31a
Summary:
On Windows, the working copy doesn't go away on unmount; instead, placeholders
and full files[0] are still present on disk. For this reason, EdenFS needs to
either over-invalidate files and directories at update time, or
remember what is present on disk so the state can be recovered.
For now, this diff simply focuses on reloading all the inodes that are present on
disk into the InodeMap.
[0]: https://docs.microsoft.com/en-us/windows/win32/projfs/cache-state
Reviewed By: chadaustin
Differential Revision: D28889082
fbshipit-source-id: 90170c1291da563bea455c8032dc8282a093c9b3
Summary:
The InodeMap can be extremely hot when issuing tons of requests to EdenFS.
Unfortunately, it still needs to do memory allocation due to its use of
folly::Future. By switching to ImmediateFuture we can avoid the memory
allocation, speeding it up slightly.
For the NFS dispatcher, this moves the call to `semi()` inward, allowing us to
target specific inode methods to convert next. For the Fuse dispatcher, I've
simply converted the ImmediateFuture into a Future directly, keeping the rest
of the code unchanged; a subsequent change will convert the dispatcher code to
ImmediateFuture.
Reviewed By: chadaustin
Differential Revision: D28302480
fbshipit-source-id: 4e097a721443f0d52f34a337a96f8a63a9a7cd7c
Summary:
This is a preparatory phase to make the refcount usable on Windows. For more
details, see D24716801 (e50725d2cb).
Reviewed By: chadaustin
Differential Revision: D24764568
fbshipit-source-id: 1e8c6ab00d4c1ec79c347fd5ae7167b2ce1dff68
Summary: This is unnecessary, remove it.
Reviewed By: chadaustin
Differential Revision: D24743519
fbshipit-source-id: 5e10eafcd3f84d9ad053be35798df86b21f97d4f
Summary:
One of the issues that EdenFS on Windows is currently facing is around
invalidation during an update. In effect, EdenFS is over-invalidating, which
causes updates to be slower than they should be, as well as EdenFS recursively
triggering ProjectedFS callbacks during invalidation. Both of these are a
sub-par UX.
The reason this issue exists is multi-faceted. First, the update code follows
the "kPreciseInodeNumberMemory" path which enforces that a directory that is
present in the overlay needs to be invalidated, even if it isn't materialized.
The second reason is that no reclamation is done for the overlay. Combine the
two and you get an update that gets slower over time and issues
significantly more invalidations than needed.
Solving this is a bit involved. We could for instance start by reclaiming
inodes from the overlay, but this wouldn't be effective as we use the fact that
an inode is present in the overlay as a way to know that the file is cached in
the overlay. If we reclaim from the overlay we simply won't be invalidating
enough and some files will be out of date.
It turns out that we already have a mechanism to track what is cached by the
kernel: the fuse refcount. On Linux/macOS, every time an inode is returned to
the kernel, this refcount is incremented, and the kernel then notifies us when it
forgets about it, at which point the refcount can be decremented. On Windows,
the rules are a bit different, and a simple flag is sufficient: set when we
write a placeholder on disk (either during a directory listing, or when
ProjectedFS asks for it), and unset at invalidation time during update. There
is however a small snag in this plan. On Linux, the refcount starts at 0 when
EdenFS starts as a mount/unmount will clear all the kernel references on the
inodes. On Windows, the placeholders don't disappear when EdenFS dies or is
stopped, so we need a way to scan the working copy when EdenFS starts to know
which inodes should be loaded (an UnloadedInode really).
The astute reader will have noticed that this last part is effectively an
O(materialized) operation that needs to happen at startup, which would be
fairly expensive in itself. It turns out that we really don't have a choice and
we need to do it regardless due to Windows not disallowing writes to the
working copy when EdenFS is stopped, and thus for EdenFS to be aware of the
actual state of the working copy, it needs to scan it at startup...
The first step in doing all of this is to simply rename the various places that
use "fuse refcount" to "fs refcount", which is what this diff does.
Reviewed By: chadaustin
Differential Revision: D24716801
fbshipit-source-id: e9e6ccff14c454e9f2626fab23daeb3930554b1a
Summary:
Since the Stub.h now only contains NOT_IMPLEMENTED, let's move it to its own
header outside of the win directory.
Reviewed By: genevievehelsel
Differential Revision: D23696244
fbshipit-source-id: 2dfc3204707e043ee6c89595668c484e0fa8c0d0
Summary:
While the code isn't compiled, this makes the thrift definition available to
the rest of the code, eliminating the need for having a stub for
SerializedInodeMap on Windows.
Reviewed By: genevievehelsel
Differential Revision: D23696242
fbshipit-source-id: 8a42dd2ed16887f3b7d161511e07aaa35fd1b968
Summary: This file is not fuse specific, therefore, let's move it to a non-fuse folder.
Reviewed By: chadaustin
Differential Revision: D23464460
fbshipit-source-id: f70e94bb0ecc37bd74798fd230dee2058484f31b
Summary:
Next step in unifying the mount path: let's make the initialization the same on
Windows and unices. The only difference is now limited to the .eden directory,
which we will be able to implement once regular users can create symlinks.
For the takeover code, the #ifdef is pushed down to the actual code that does
it; this allows the rest of the code to not have to bother about Windows vs
other platforms.
Reviewed By: wez
Differential Revision: D21517478
fbshipit-source-id: d40ca2694d23031ff98e319071e610efa306008f
Summary:
From looking at the code, it appears that the FUSE refcount is always a
uint32_t, except when serialized. Let's do the signedness conversion and
narrowing then.
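A minimal sketch of doing the conversion only at the serialization boundary. The helper names are hypothetical, and the use of `int64_t` as the serialized representation is an assumption made for illustration; the summary only says that the serialized type differs from `uint32_t`.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// In-memory refcounts are uint32_t; widening to the (assumed) signed
// serialized type is always lossless.
inline std::int64_t toSerializedRefcount(std::uint32_t refcount) {
  return static_cast<std::int64_t>(refcount);
}

// Narrowing happens in exactly one place and is range-checked there.
inline std::uint32_t fromSerializedRefcount(std::int64_t value) {
  const auto max =
      static_cast<std::int64_t>(std::numeric_limits<std::uint32_t>::max());
  if (value < 0 || value > max) {
    throw std::out_of_range("serialized refcount does not fit in uint32_t");
  }
  return static_cast<std::uint32_t>(value);
}
```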
Reviewed By: simpkins
Differential Revision: D21240161
fbshipit-source-id: 877c6cb6881cb36346c64cf92d99b1e588aed580
Summary: This diff ports TreeInode, FileInode, InodeMap and related classes to Windows. We don't build or test it here, as there are more dependencies we need to port. The build script and the tests are part of other diffs in this stack.
Reviewed By: simpkins
Differential Revision: D19956266
fbshipit-source-id: 9eb754233bca3d5a336f465c2400512a8593ca4f
Summary:
Each entry in InodeMap::unloadedInodes_ stored the InodeNumber
twice. Remove one of them.
Reviewed By: wez
Differential Revision: D18651007
fbshipit-source-id: be85c34cb2b38fc0b2875d0874cecd1ef274aca4
Summary:
Fix the `getStatInfo()` code to avoid walking all loaded inodes. That
behavior was okay early on when we were first developing EdenFS and this
API wasn't used in many places. However, `eden rage` and `eden stats` both
exercise this code path, and it seems pretty bad to walk all loaded inodes in
order to return some stats.
I removed the counts about number of materialized inodes for now, since I
don't think we're really using this metric much.
Reviewed By: chadaustin
Differential Revision: D17578306
fbshipit-source-id: 55ab0209745869b160e91167d6cff7d95f39a95a
Summary:
We are probably going to add more inodemap counters, so add a prefix
now.
Reviewed By: fanzeyi
Differential Revision: D17142015
fbshipit-source-id: 4bd3cd4fd9234d8766864f364fef0b0d963f03b6
Summary:
The definition of RLockedPtr should use a non-const Subclass type, and not a
const one.
Reviewed By: ot
Differential Revision: D15356827
fbshipit-source-id: b8ad41e263f0e15ffa25b0698aa85eab8ca2ccb8
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary:
Some newer versions of `clang` (such as Apple's version 11) will warn/error out if a constructor or assignment operator
marked `default` is implicitly deleted (e.g., if the object contains a non-moveable/non-copyable member). This diff
removes all such defaulted constructors/assignment operators, which I ran into while building `edenfs` on my Macbook Pro.
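For illustration only, here is a hypothetical type (`Guarded`, not taken from the EdenFS sources) with a non-movable, non-copyable member. Declaring its move constructor as `= default` would leave it implicitly deleted, which is the situation the newer compilers complain about, so the declaration is simply left out.

```cpp
#include <mutex>
#include <type_traits>

struct Guarded {
  Guarded() = default;

  // `Guarded(Guarded&&) = default;` would be implicitly deleted here,
  // because std::mutex can be neither moved nor copied. Omitting the
  // declaration states the same thing without a misleading `= default`.

  std::mutex lock;
  int value = 0;
};

static_assert(
    !std::is_move_constructible<Guarded>::value,
    "Guarded cannot be moved because of its std::mutex member");
static_assert(
    !std::is_copy_constructible<Guarded>::value,
    "Guarded cannot be copied because of its std::mutex member");
```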
Reviewed By: chadaustin, strager
Differential Revision: D15901794
fbshipit-source-id: 794ed8377693a6735bb567635dc919bc678751a4
Summary: Small things I noticed while working on other stuff.
Reviewed By: strager
Differential Revision: D10055671
fbshipit-source-id: de8c3b04928567a821172e6fa7ee0e056958e1e7
Summary:
Sandcastle has several cases where we chown the entire
repository which performs terribly on Eden. As a workaround we have a
command to do this in eden without loading all the files.
Reviewed By: chadaustin
Differential Revision: D12857956
fbshipit-source-id: 36cebcc710fbcf4e1eb265df901513cf50a227b9
Summary: This function is never used.
Reviewed By: strager
Differential Revision: D9998991
fbshipit-source-id: 27a8f5180d7516c3bf61b11192672142f77abccc
Summary:
The Overlay is the natural home for nextInodeNumber_ now that every
directory allocates inode numbers for its children right away. This
also simplifies serializing nextInodeNumber_ to disk in the following
diff.
Reviewed By: simpkins
Differential Revision: D8192442
fbshipit-source-id: 9b776a73c8d7653002b55985d592b1746e52f878
Summary:
Once a mount point has been unmounted we no longer need to care about
outstanding FUSE reference counts--we can treat them as if they are all zero.
This updates EdenMount to tell the InodeMap when the mount point is unloaded,
and changes InodeMap::unloadInode() to make use of this information when
deciding if it needs to remember the inode information.
Previously InodeMap would save information for inodes with outstanding FUSE
reference counts. Writing all of this state to the overlay could take a
non-trivial amount of time.
Reviewed By: chadaustin
Differential Revision: D7555998
fbshipit-source-id: 0896f867ce850ab3e61c262776d536de003685ff
Summary:
Update InodeMap::onInodeUnreferenced() to take a pointer to a non-const
InodeBase. This allows the methods it calls, including
InodeBase::updateOverlayHeader() to be non-const.
Using non-const InodeBase objects seems to make sense here since the inode is
in the process of being destroyed.
In a future diff I plan to update FileInode::updateOverlayHeader() to share the
same code as normal file methods to ensure that the overlay file is open. This
modifies the FileInode state's open refcount, so it is useful to have this
method be non-const for that purpose.
Reviewed By: chadaustin
Differential Revision: D7407424
fbshipit-source-id: 541656c7b9b283c5e5650445de5bbdbaae3fc57f
Summary:
Decouple inode number assignment from materialization status.
The idea is that we will always assign entries an inode number and
track whether an entry is materialized otherwise. This is necessary
to give consistent inode values across remounts.
Reviewed By: simpkins
Differential Revision: D7052470
fbshipit-source-id: 80d3f2a2938463198a3132182537e6223c79d509
Summary: Cleaning up the takeover initialization code for EdenMount and InodeMap.
Reviewed By: simpkins
Differential Revision: D7294419
fbshipit-source-id: 58506f04259bb1017e6cd2e80e40e5820de6200f
Summary:
Instead of having a rule that save() must be called after
InodeMap::shutdown, just have InodeMap::shutdown return the
SerializedInodeMap if it's desired.
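One way to picture the new shape of the API, using hypothetical names: `shutdown()` itself hands back the serialized state when the caller asks for it, so there is no ordering rule between two separate calls. The `bool` parameter and the use of `std::optional` are assumptions of this sketch, not a description of the real signature.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

struct SerializedSketch {
  std::vector<std::uint64_t> unloadedInodes;
};

class ShutdownSketch {
 public:
  void remember(std::uint64_t ino) {
    unloaded_.push_back(ino);
  }

  // Returns the serialized state only when a takeover is requested;
  // otherwise the remembered state is dropped.
  std::optional<SerializedSketch> shutdown(bool doTakeover) {
    shutDown_ = true;
    if (!doTakeover) {
      unloaded_.clear();
      return std::nullopt;
    }
    return SerializedSketch{std::move(unloaded_)};
  }

  bool isShutDown() const {
    return shutDown_;
  }

 private:
  std::vector<std::uint64_t> unloaded_;
  bool shutDown_ = false;
};
```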
Reviewed By: simpkins
Differential Revision: D7292773
fbshipit-source-id: 2ba35fc729e19af122fe5d6c5a3287ad6b8af946
Summary:
To make it clearer to me why all the calls to newPtrLocked were safe, and to
eliminate some duplication, I captured the newPtrLocked call patterns into
member functions on LoadedInode and TreeInode::Entry.
Reviewed By: simpkins
Differential Revision: D7207542
fbshipit-source-id: 25de77e72c0898be43b3fbdddab835d64101755e
Summary:
If the root TreeInode wants to allocate inode numbers, the inode
allocator must be initialized first. But complete InodeMap
initialization requires the root TreeInode. So split this into two
parts.
Also, I changed the inode allocator to a single atomic increment instead
of a lock acquisition.
Finally, the extra assertions in this diff uncovered what looks like a
bug in the takeover logic where nextInodeNumber_ could end up being
smaller than the value in the takeover data, since the max inode
number from the overlay was assigned after loading from takeover data.
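A small sketch of an allocator built on a single atomic increment, with a helper that only ever raises the next number. The class and method names are hypothetical; `raiseTo()` illustrates one way to combine two sources (such as an overlay maximum and takeover data) so that the later assignment cannot lower the counter.

```cpp
#include <atomic>
#include <cstdint>

class InodeNumberAllocator {
 public:
  explicit InodeNumberAllocator(std::uint64_t first) : next_(first) {}

  // One atomic increment per allocation; no mutex is taken.
  std::uint64_t allocate() {
    return next_.fetch_add(1, std::memory_order_relaxed);
  }

  // Ensures the next allocated number is at least `floor`. Calling this
  // once per source yields the maximum of all sources, regardless of
  // the order of the calls.
  void raiseTo(std::uint64_t floor) {
    std::uint64_t current = next_.load(std::memory_order_relaxed);
    while (current < floor &&
           !next_.compare_exchange_weak(
               current, floor, std::memory_order_relaxed)) {
      // `current` was refreshed by the failed exchange; retry.
    }
  }

 private:
  std::atomic<std::uint64_t> next_;
};
```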
Reviewed By: simpkins
Differential Revision: D7107706
fbshipit-source-id: ec43cc81c11d709261598739c622609b372433a2
Summary:
I spent way too long trying to figure out why my refactorings were
causing invariant errors inside InodeMap. It turns out that we
initialize the root TreeInode before InodeMap::initialize is called,
which I suspect resulted in duplicate inode numbers being handed out.
Reviewed By: simpkins
Differential Revision: D7106302
fbshipit-source-id: b459734fb96bfbb6b4b27a1d23de8b6406d30ca4
Summary:
I'm seeing test failures that I have not yet understood and I
thought they might be caused by an implicit conversion from
fusell::InodeNumber to bool. Well, they're not, but this is how I
discovered that. I'm not sure I want to land this change, but I'm
going to leave it around until I figure out what's happening with my
other diffs.
Reviewed By: simpkins
Differential Revision: D7077635
fbshipit-source-id: 50bf67026d2d0da0220c4709e3db24d841960f4b
Summary:
I am working on a stack of diffs that changes how we allocate inode
numbers to tree entries. I was hitting test failures I could not
understand, so in the process of trying to understand the flows
through InodeMap, I found newChildLoadStarted to be redundant with
shouldLoadChild.
Note: Today, allocateInodeNumber() acquires the InodeMap's lock, but
in a later diff, inode numbers will be assigned en masse during
TreeInode construction.
Reviewed By: simpkins
Differential Revision: D7059719
fbshipit-source-id: 624b861040d585d2cae41d7ec2aae7d528ff8336