Summary:
In order to get Watchman to recognize EdenFS mounted as NFS, we previously set
the f_mntfromname to be "edenfs". Unfortunately, Apple decided that when the
NFS server is accessed via a unix socket it would overwrite that and instead
use the path to the socket.
Digging in the source code, I did find that the f_fstypename can be overwritten
by calling fsctl with the right set of arguments. Thus, let's do just that.
Reviewed By: kmancini
Differential Revision: D28270605
fbshipit-source-id: f6be4e394d814806aa03ec3e82b8bc2faf364ea7
Summary:
Due to NFS being designed as a network filesystem, it default to binding on a
TCP socket. For EdenFS, since we're not expecting to mount an actual remote
filesystem, we bind these sockets to localhost. Unfortunately, TCP sockets have
some inherent overhead due to being designed to be reliable over a non-reliable
medium.
On macOS, Apple provides a way to mount an NFS server that is listening on a
unix domain socket. Thanks for unix socket being reliable, the TCP overhead
goes away leading to some higher throughput and lower latency for the NFS
server. For EdenFS, timing `rg foobar` over a directory containing 27k files gives:
NFS over TCP:
rg foobar > /dev/null 0.80s user 5.44s system 567% cpu 1.100 total
NFS over UDS:
rg foobar > /dev/null 0.77s user 5.27s system 679% cpu 0.888 total
osxfuse:
rg foobar > /dev/null 0.87s user 5.59s system 662% cpu 0.975 total
The main drawback of going through a unix socket is that D27717945 (bcf6aa465c) appears to
no longer be effective due to
8f02f2a044/bsd/nfs/nfs_vfsops.c (L3739)
Reviewed By: kmancini
Differential Revision: D28261422
fbshipit-source-id: 25dc1dc78cdb50d6c6550a86ef01ea2c894c110f
Summary:
macOS supports NFS servers that can be reached via a unix socket as a way to
improve performance by reducing the TCP cost. To support this, let's first
allow the socket to bind to to be passed to the RpcServer, and then pass it
through to the privhelper code.
Reviewed By: kmancini
Differential Revision: D28261423
fbshipit-source-id: 78c60aac26353d1da76a67897429b964332df8b3
Summary: Looks like this is causing files to not be immediately updated after an update.
Reviewed By: kmancini
Differential Revision: D28384151
fbshipit-source-id: 61159db64a9f3c9c1f9e54ee0462ca870ff2aecc
Summary:
While ImmediateFuture are expected to be used on values that are mostly
immediate, there are cases where it won't be. In these cases we need a way to
wait for the computation/IO to complete. In order to achieve this, we need to
transform an ImmediateFuture onto a SemiFuture.
Reviewed By: fanzeyi
Differential Revision: D28293941
fbshipit-source-id: 227c0acf1e22e4f23a948ca03f2c92ccc160c862
Summary:
When a T can be default constructed, make an ImmediateFuture default
constructible.
Reviewed By: fanzeyi
Differential Revision: D28292874
fbshipit-source-id: 4c239cc9c3f448652b2bcdc103ea1a81ace46402
Summary: This should help the compiler generate even better code.
Reviewed By: chadaustin
Differential Revision: D28153979
fbshipit-source-id: b1d84c92af4fa760c92624c53d7f57330d7706fa
Summary: This is the same change as D27137328 (a9a1b73418) but for macFUSE.
Reviewed By: kmancini
Differential Revision: D28328029
fbshipit-source-id: c58e146dba2e7e3bdb320f2b5e80946e4a7b3afe
Summary: With the addition of the ability to "background" the prefetches in the daemon itself, we can remove the subprocess backgrounding in the python layer and just depend on the internal backgrounding.
Reviewed By: chadaustin
Differential Revision: D27825274
fbshipit-source-id: aa01dc24c870704272186476be34d668dfff6de5
Summary: getTreeForManifest is no longer called, so remove it.
Reviewed By: genevievehelsel
Differential Revision: D28306796
fbshipit-source-id: e51a32fa7d75c54b2e3525e88c162247b4496560
Summary:
In an NFS mount, the InodeNumber are sent to the client as the unique
identifier for a file. For caching purposes, the client will issue a GETATTR
call on that InodeNumber prior to opening it, to see if the file changed and
thus whether its cache needs to be invalidated.
In EdenFS, the checkout process does unfortunately replace file inodes
entirely, causing new InodeNumber to be created, and thus after an update, an
NFS client would not realize that the content changed, and would thus return
the old content to the application. To solve this, we could approach it in 2
different ways:
- Build a different kind of handle to hand over to the NFS client
- Keep InodeNumber constant during checkout.
After trying the first option, it became clear that this would effectively need
to duplicate a lot of functionality from the InodeMap, but with added memory
consumption. This diff attempts to do the second one.
Reviewed By: chadaustin
Differential Revision: D28132721
fbshipit-source-id: 94d470e33174bb9ffd7db00e1b37924096aac8e9
Summary:
In very hot code path, EdenFS is spending a very large amount of time creating
and destroying folly::Future objects. This is due to the required memory
allocation, as well as the handful of atomics that are happening at creation
time, and these are showing up in EdenFS profiles.
In the steady state, EdenFS actually doesn't need futures, as it often times is
able to service requests from its in-memory caches, in which case we should
ideally just return the value itself and not wrap it in a folly::Future. The
added ImmediateFuture is a step in this direction, as it can hold either an
immediate value, or a folly::SemiFuture, and allow the same API to be used
transparently between these 2.
Reviewed By: genevievehelsel
Differential Revision: D28006802
fbshipit-source-id: 89eaa32e7fa82c44844c4b23c4cb30dbeea46ca8
Summary:
Currently running `edenfsctl top` will crash on Windows:
```
Traceback (most recent call last):
File "C:\Python38\Lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python38\Lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\tools\eden\bin\edenfsctl.exe\__main__.py", line 3, in <module>
File "C:\tools\eden\bin\edenfsctl.exe\eden\fs\cli\main.py", line 2253, in zipapp_main
File "C:\tools\eden\bin\edenfsctl.exe\eden\fs\cli\main.py", line 2236, in main
File "C:\Python38\Lib\asyncio\runners.py", line 43, in run
return loop.run_until_complete(main)
File "C:\Python38\Lib\asyncio\base_events.py", line 616, in run_until_complete
return future.result()
File "C:\tools\eden\bin\edenfsctl.exe\eden\fs\cli\main.py", line 2212, in async_main
File "C:\tools\eden\bin\edenfsctl.exe\eden\fs\cli\main.py", line 1059, in run
File "C:\tools\eden\bin\edenfsctl.exe\eden\fs\cli\top.py", line 395, in __init__
File "C:\Python38\Lib\curses\__init__.py", line 13, in <module>
from _curses import *
ModuleNotFoundError: No module named '_curses'
```
This diff will let it prints an error message.
Reviewed By: xavierd
Differential Revision: D28207330
fbshipit-source-id: a465fe5941b469f4a1ef964f1d4dc8a593639e7c
Summary:
It looks like argparse's exit was not able to handle asyncio event loop well,
causing edenfsctl to generate a long ugly stack trace when the command line
flag does not parse.
Let's just move the arguments parsing outside the asyncio runloop to avoid this
problem as a whole. In theory it should improve our `--help` time a little bit.
Reviewed By: chadaustin
Differential Revision: D28206622
fbshipit-source-id: 881eefaea73b244eadff0165965085e64dad935f
Summary:
Some user reported to see `edenfsctl restart` crashes due to this call to
os.getuid() since it does not available on Windows. P410914264
https://docs.python.org/3.9/library/os.html#os.getuid
Reviewed By: chadaustin
Differential Revision: D28204262
fbshipit-source-id: 077bf207d8b1b6c014fface63ea93e66057629cd
Summary:
This applies the formatting changes from black v21.4b2 to all covered
projects in fbsource. Most changes are to single line docstrings, as black
will now remove leading and trailing whitespace to match PEP8. Any other
formatting changes are likely due to files that landed without formatting,
or files that previously triggered errors in black.
Any changes to code should be AST identical. Any test failures are likely
due to bad tests, or testing against the output of pyfmt.
Reviewed By: thatch
Differential Revision: D28204910
fbshipit-source-id: 804725bcd14f763e90c5ddff1d0418117c15809a
Summary:
Context: https://fb.workplace.com/groups/edenfswindows/permalink/828914994691047/
Even with D27872753 it doesn't really make sense to have `eden fsck` running on Windows since it requires EdenFS repository to **be unmounted**.
This diff changes it to generate a warning to redirect users to run `eden doctor` instead (which is likely what they need).
Reviewed By: kmancini
Differential Revision: D28203778
fbshipit-source-id: ae105678876903bcf6514252bf07189775f9b187
Summary: This just makes it more obvious _where_ `--force` should be passed.
Reviewed By: genevievehelsel
Differential Revision: D28119590
fbshipit-source-id: 1fbdb4428e9b89e7b66c959f874067485a91d534
Summary:
From what I can see, this was added when EdenFS had a Mononoke store, which is
now long gone, thus we should be able to remove the Curl dependency altogether.
Reviewed By: fanzeyi
Differential Revision: D28037816
fbshipit-source-id: 834f7db64bab5dda1748ad2f033c27a2854b0ba4
Summary: Looks like these aren't needed since these files are owned by a TARGETS file.
Reviewed By: genevievehelsel
Differential Revision: D28101197
fbshipit-source-id: d790530227641bf25e48bd96c8a95dd31f08a954
Summary:
Now that autodeps knows where to find cpptoml.h, we no longer need these
manual annotation.
Reviewed By: kmancini
Differential Revision: D28100956
fbshipit-source-id: 463b73834c500c1d16a4a769af3655938124d49d
Summary:
This diff defines symlink type in `DirType`.
Even though it is not directly used in the FSCK diff. This will allow us to support symlink in EdenFS Windows in the future.
Reviewed By: genevievehelsel
Differential Revision: D28016305
fbshipit-source-id: 67c1aa22e39198f9c91845129695f27b8303a5f1
Summary:
We were ignoring the return value of runWhileMaterialized, and thus we were
returning to FUSE before fallocate returned.
Reviewed By: fanzeyi
Differential Revision: D28081991
fbshipit-source-id: f398942ddb2432e48e80c148abc8edb7e5ada71d
Summary:
folly::via is a Future API, and thus it creates one, which requires allocating
it and then attaching it to the Executore. Since the code to dispatch a request
isn't Future based, we don't need to use folly::via, and we can simply add the
lambda to the Executor directly. This removes expensive memory allocations from
the EventBase.
Reviewed By: kmancini
Differential Revision: D27976674
fbshipit-source-id: 8fa9724a94ba69b071ab894cdbbad0d33733c098
Summary:
Neither macOS, nor Linux are sending multi-fragment requests to the NFS server.
Since supporting these means calling into memmove, which can be expensive for
large requests, let's just remove support for them for now. If somehow macOS
and/or Linux start sending these, the XCHECK(isLast) will catch this and we can
fix the code by then.
Reviewed By: kmancini
Differential Revision: D27976671
fbshipit-source-id: 77c758b2bb36517d22d5b637e6f0ebf84cc19e5b
Summary:
The EventBase is single threaded, and for heavily concurrent client workflows,
it could see a lot activity, thus every cycle saved can be used to drive more
client requests. The construction of the IOBuf doesn't need to be done while in
the EventBase, thus let's build it outside.
Reviewed By: kmancini
Differential Revision: D27976670
fbshipit-source-id: c6c015ef26df1dcb3fc0c5f179e474bafbd71fac
Summary:
Passing 64 to preallocate means that the AsyncSocket code will issue reads of
64 bytes, even though the IOBufQueue has significantly more space available. We
can thus pass a bigger size to preallocate to reduce both the cost of
allocation, and the syscall cost. For heavily concurrent client code, this will
allow us to read more than one request per syscall.
The careful reader may have noticed that for very small requests the code may
reallocate more often that it should as it will always reallocate when falling
under 4KB. This is likely to not be an issue in practice.
Reviewed By: kmancini
Differential Revision: D27976672
fbshipit-source-id: 4c7e3aecc4763ab20854f3c466ce0872332f9b77
Summary:
These are various cleanups that should make the code easier to read, there is
no behavior changes.
Reviewed By: kmancini
Differential Revision: D27976673
fbshipit-source-id: 470eb628ca75bf1712a93c6e9aa3a27c3f314d01
Summary:
Running `rg foobar` in a loop and profiling EdenFS shows that we're spending a
significant amount of time collecting the size of non-materialized files. Since
this will never change, we can easily cache it for much faster access.
Reviewed By: chadaustin
Differential Revision: D27924804
fbshipit-source-id: 8b8af63dcb82664db2ecd81b3fcdc006a3a52d72
Summary: Create a proxy that stored RECAS-> Eden Hash, similar to SCS and HG proxy.
Reviewed By: chadaustin
Differential Revision: D27873498
fbshipit-source-id: 0b3e50e3a74b8f0914547178789cb6684b780866
Summary: Create a RE-CAS backing store with all APIs unimplemented, and Linux only.
Reviewed By: chadaustin
Differential Revision: D27771047
fbshipit-source-id: de00c6e290f924872eae7290b1945e6b3f40d610
Summary:
Chad first noted that deserializing trees from the local store can be expensive.
From the thrift side EdenFS does not have a copy of trees in memory. This
means for glob files each of the trees that have not been materialized will be
read from the local store. Since reading an deserializing trees from the local
store can be expensive lets add an in memory cache so that some of these
reads can be satisfied from here instead.
We collect some cache statistics already, lets expose them through thrift with
the rest of the stats.
Reviewed By: chadaustin
Differential Revision: D27052265
fbshipit-source-id: d7fdf70260599b8df43824e2442471e332c1b0cf
Summary:
Chad first noted that deserializing trees from the local store can be expensive.
From the thrift side EdenFS does not have a copy of trees in memory. This
means for glob files each of the trees that have not been materialized will be
read from the local store. Since reading an deserializing trees from the local
store can be expensive lets add an in memory cache so that some of these
reads can be satisfied from here instead.
Here we actually start to use the cache!
Reviewed By: chadaustin
Differential Revision: D27050310
fbshipit-source-id: e35db193fea0af7f387b6f44c49b5bcc2a902858
Summary:
This introduces some basic unit tests to ensure correctness of the cache.
We are adding tests to cover the simple methods of the object cache since we
are using that code path here. And adding a few sanity check tests to make sure
the cache works with trees.
Reviewed By: chadaustin
Differential Revision: D27050296
fbshipit-source-id: b5f0577c1662483f732bb962c5b40bca8e1dcb40
Summary:
Chad first noted that deserializing trees from the local store can be expensive.
From the thrift side EdenFS does not have a copy of trees in memory. This
means for glob files each of the trees that have not been materialized will be
read from the local store. Since reading an deserializing trees from the local
store can be expensive lets add an in memory cache so that some of these
reads can be satisfied from here instead.
This introduces the class for the in memory cache and is based on the existing
BlobCache. note that we keep the minimum number of entries functionality from
the blob cache. This is unlikely to be needed as trees are much less likely
than blobs to exceed a reasonable cache size limit, but kept since we already
have it.
Reviewed By: chadaustin
Differential Revision: D27050285
fbshipit-source-id: 9dd46419761d32387b6f55ff508b60105edae3af
Summary:
On all the code paths that matter we always acquire the write lock. Since we are
basically just using a simple lock, distributed mutex is a more efficient implementation
that does fancy tricks with cachelines. This improves performance from testing
globs which cause many concurrent reads from the cache.
Reviewed By: chadaustin
Differential Revision: D27810990
fbshipit-source-id: d22470f3f39e2cd3895f5ea772955b62030d154a
Summary:
Now that Object Cache actually does most the work, I am moving the
BlobCache tests to be ObjectCache tests. I am leaving a few blob cache
tests to sanity check that the cache works with blobs.
Reviewed By: chadaustin
Differential Revision: D27776113
fbshipit-source-id: ef58279d93035588beb162ee19173a42e3ca4e5b
Summary:
We would like to use a limited size LRU cache fore trees as well as blobs,
so I am templatizing this to allow us to use this cache for trees.
Trees will not need to use Interest handles, but in the future we could use
this cache for blob metadata, which might want to use interest handles.
Additionally if we at somepoint change the inode tree scheme that would remove
the tree content from the inodes itself, interest handle might be useful for
trees. We could also use this cache proxy hashes which may or may not use
interest handles. Since some caches may want interest handles and others will
not I am creating get/insert functions that work with and without interest
handles.
Reviewed By: chadaustin
Differential Revision: D27797025
fbshipit-source-id: 6db3e6ade56a9f65f851c01eeea5de734371d8f0
Summary:
Thrift setter API is deprecated since it doesn't bring any value over direct assignment. Removing it can reduce build-time and make our codebase more consistent.
If result of `s.set_foo(bar)` is unused, this diff replaces
s.set_foo(bar);
with
s.foo_ref() = bar;
Otherwise, it replaces
s.set_foo(bar)
with
s.foo_ref().emplace(bar)
Reviewed By: xavierd
Differential Revision: D27986185
fbshipit-source-id: d90aaf27f25f2ecfcbbbe7886e0c0d784f607a87
Summary: Allows us to background a prefetch (similar to how prefetch-profile fetches are backgrounded). A thing to note here is that we do not deduplicate fetches for prefetches, but if there is enough busy work between bulk filesystem accesses and the prefetch finishing, this should not be an issue.
Reviewed By: chadaustin
Differential Revision: D27028428
fbshipit-source-id: 5c528fff76719f42151542eaa3499271f7ab6fa3
Summary:
These methods will be used in my later Windows fsck diff as it will need to scan disk state to find changes.
It is a bit unfortunate that we'll need to stick with boost for now. However this should be a fairly easy migration to `std::filesystem` once that is available.
Reviewed By: kmancini
Differential Revision: D27872828
fbshipit-source-id: f6b27a171026aeaaea3db9f17b8f43cfa25004e4
Summary:
Copy from Watchman.
This allows us to show stack trace when EdenFS terminates on Windows.
Reviewed By: chadaustin
Differential Revision: D27896966
fbshipit-source-id: f3238a37a1176f879d5e6bc051ec97031c9a7096
Summary:
When EdenFS is killed, either due to `eden stop` timing out, or when simply
rebooting the host, the edenfs.log becomes filled with fsck errors, which also
slows down the fsck process.
Since we already print the number of errors per mount, limiting ourself to the
first 50 errors is probably good enough.
Reviewed By: fanzeyi
Differential Revision: D27943618
fbshipit-source-id: 2b3e6e3ae4df648d4b1dccf73c71f8dbbded3892
Summary: Mercurial still needs this to work in Python 2 for a few more weeks.
Reviewed By: quark-zju, xavierd
Differential Revision: D27943521
fbshipit-source-id: 2b5106496fbb523cdc97a3dce3ad0cbfab5c17b7