sapling/eden
Katie Mancini ffa558bf84 implement inode unloading after checkout
Summary:
NFSv3 has no inode invalidation flow built into the protocol. The kernel does not
send us forget messages like it does with FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS cannot easily tell when
all handles to a file have been closed.

As it stands, we never clean up inodes. This is bad for memory and disk usage.
We never unload an inode, so it stays in memory once it is created.
Additionally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/

We need to clean up these inodes at some point. There are a couple of high-level
options:
1. Support NFSv4. NFSv4 sends us a close message when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.

NFSv4 is probably the right long-term solution. But for now we should be able to
get by with periodic unloads.

I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes).

Option 1 feels a bit too hostile to applications that hold files open.
Option 3 means we will build up a lot of cruft over the course of the day, but it is
probably the most application friendly.

I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below) so I am going with it.

This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/rmdir as well, but this would be more expensive.
Batch unloading on checkout seems better to me and should happen
frequently enough to clean up space for people.
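
As a rough illustration of option 2 restricted to checkout, here is a minimal
Python sketch. It is a toy model, not the EdenFS implementation: `InodeTable`,
`on_checkout`, `run_due_cleanups`, and `delay_seconds` are hypothetical names,
and time is passed in explicitly instead of using a real timer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class InodeTable:
    """Toy model of an inode map; all names here are hypothetical."""

    delay_seconds: float
    loaded: Dict[int, str] = field(default_factory=dict)  # inode number -> path
    unlinked: Set[int] = field(default_factory=set)
    pending_cleanups: List[float] = field(default_factory=list)  # due times

    def create(self, ino: int, path: str) -> None:
        self.loaded[ino] = path

    def unlink(self, ino: int) -> None:
        # The inode stays loaded: an NFSv3 client may still hold a handle,
        # and the server is never told when that handle is closed.
        self.unlinked.add(ino)

    def on_checkout(self, now: float) -> None:
        # Schedule one batch cleanup `delay_seconds` after the checkout.
        self.pending_cleanups.append(now + self.delay_seconds)

    def run_due_cleanups(self, now: float) -> List[int]:
        """Unload every unlinked inode if a scheduled cleanup is due."""
        if not any(due <= now for due in self.pending_cleanups):
            return []
        self.pending_cleanups = [d for d in self.pending_cleanups if d > now]
        unloaded = sorted(self.unlinked)
        for ino in unloaded:
            self.loaded.pop(ino, None)
        self.unlinked.clear()
        return unloaded
```

In this model, an inode unlinked before a checkout at time 100 with a 10 second
delay stays loaded at time 105 and is unloaded by the cleanup that runs at or
after time 110.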

There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or hits a bug and unlinks legit inodes. Normally we let those live in the
overlay so we can go in and recover them. My plan to fix this is to mark inodes
for unloading instead of just unloading all unlinked inodes.
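
The planned fix is not part of this diff. The sketch below only illustrates
the idea of marking, under the assumption that deliberate removals are marked
and anything unlinked by a crash or bug is not. `Overlay`, `unlink_by_user`,
`unlink_by_bug`, and `unload_marked` are hypothetical names.

```python
from typing import List, Set


class Overlay:
    """Toy model: only inodes explicitly marked for unloading are dropped."""

    def __init__(self) -> None:
        self.unlinked: Set[int] = set()
        self.marked_for_unload: Set[int] = set()

    def unlink_by_user(self, ino: int) -> None:
        # A deliberate unlink/rename/rmdir/checkout marks the inode.
        self.unlinked.add(ino)
        self.marked_for_unload.add(ino)

    def unlink_by_bug(self, ino: int) -> None:
        # Unlinked by a crash or bug: left unmarked so it can be recovered.
        self.unlinked.add(ino)

    def unload_marked(self) -> List[int]:
        """Unload marked inodes only; unmarked unlinked inodes survive."""
        unloaded = sorted(self.marked_for_unload)
        self.unlinked -= self.marked_for_unload
        self.marked_for_unload.clear()
        return unloaded
```

With this split, an inode that was unlinked without being marked stays in the
`unlinked` set after a cleanup pass and remains available for recovery.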

Reviewed By: xavierd

Differential Revision: D30144901

fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
2021-09-16 14:35:04 -07:00
..
fs implement inode unloading after checkout 2021-09-16 14:35:04 -07:00
hg-server third-party/rust: Update pin-project 0.4.24 to 0.4.28 2021-09-15 23:01:30 -07:00
integration prefetch option to only list files 2021-09-14 10:02:33 -07:00
locale
mononoke mononoke: add a test that shows a weird behaviour of derive_manifest 2021-09-16 13:58:03 -07:00
scm third-party/rust: Update pin-project 0.4.24 to 0.4.28 2021-09-15 23:01:30 -07:00
test_support test_support: canonicalize the temporary directory path 2021-08-16 16:08:45 -07:00
test-data fix fsck snapshot integration tests 2021-07-14 16:20:04 -07:00
.gitignore
Eden.project.toml