Summary: instead of dynamic casting to find the repo name, all backing stores can return an optional repo name, and callers can check whether the optional is set.
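The shape of the change can be sketched like this. This is a hypothetical Python sketch (the real stores are C++, and the class and method names here are illustrative only): the base interface returns an optional, so callers no longer need to know the concrete store type.

```python
from typing import Optional

class BackingStore:
    """Hypothetical base class; illustrative only."""

    def get_repo_name(self) -> Optional[str]:
        # Default: most stores have no repo name to report.
        return None

class HgBackingStore(BackingStore):
    def __init__(self, repo_name: str) -> None:
        self._repo_name = repo_name

    def get_repo_name(self) -> Optional[str]:
        return self._repo_name

def describe(store: BackingStore) -> str:
    # Callers check the optional instead of dynamic-casting to a
    # concrete store type.
    name = store.get_repo_name()
    return name if name is not None else "<no repo name>"
```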
Reviewed By: zhengchaol
Differential Revision: D31214723
fbshipit-source-id: 9d10114ff6bde13254d3a3caaf2401f87d07ffd7
Summary: add more information to the runtime error thrown by the dynamic cast failure in `eden trace hg` and predictive fetch
Reviewed By: zhengchaol
Differential Revision: D31212247
fbshipit-source-id: 982901dfd2eb05db9ca6e7366277a07b6b29872f
Summary:
VC++ 2019 is pickier about which standard library includes include
each other. Be explicit.
Reviewed By: zhengchaol
Differential Revision: D31186916
fbshipit-source-id: 95cfa8848d0e2e312e2024923fa166db5f68dde0
Summary:
While debugging the unlinked inode unloading for NFS, I have re-added these
logs a couple of times. They seem valuable to keep in eden so that we don't
have to re-add them every time we are debugging, and so we can debug a bit in a
production eden rather than a dev-built eden.
Reviewed By: xavierd
Differential Revision: D30971151
fbshipit-source-id: 58172079dfe4f4e4ba31bae30bf982e2cbe0fd29
Summary:
We run periodic inode unloading for unlinked inodes on NFS because we get no
information from the client on when inodes are no longer needed, and we have to
clean them up at some point for memory and disk reasons. See previous commit
summaries for more details on this (D30144901 (ffa558bf84)).
Let's add some counters on this so we have a bit more visibility into the
process. This counter is meant to mimic the PeriodicUnloadCounter.
Reviewed By: chadaustin
Differential Revision: D30966688
fbshipit-source-id: cfc8d769b53073d9f4c0c27b6bee20e222c6c8d2
Summary:
There is no need to read from the LocalStore twice: the tree is either present
in it, or not.
Reviewed By: chadaustin
Differential Revision: D31187972
fbshipit-source-id: 15bdeef9176b51e6ba3f62ed16550032b0024b94
Summary:
Some of EdenFS's backing stores require EdenFS to cache objects locally to avoid
potentially expensive network fetches, while others already have some form of
local caching. In the past, all backing stores fell in the first category, but
thanks to Mercurial's native backing store implementation, the LocalStore
caching has become pure overhead for it. Previously, this was worked around by
configuring the LocalStore to not cache blobs locally, but this wasn't done for
trees. This config also conflicts with the need to cache blobs and trees
locally for backing stores in the first category (such as ReCas).
Since we know at construction time which backing stores need local caching, we
can simply wrap these in the newly introduced LocalStoreCachedBackingStore.
For now, since the Mercurial backing store always writes a proxy hash to the
LocalStore, bypassing the LocalStore for trees would be a regression due to the
added disk IO. Once proxy hashes are gone for Mercurial, we can remove the
LocalStoreCachedBackingStore wrapper.
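The wrapper idea can be sketched as a write-through cache around an uncached store. This is a hypothetical Python sketch, not the real C++ API; the class and method names are illustrative:

```python
class LocalStoreCachedBackingStore:
    """Hypothetical sketch: wraps a backing store that has no local
    caching and writes every fetched object through to a local store."""

    def __init__(self, backing, local_store):
        self._backing = backing
        self._local_store = local_store  # dict-like: object id -> blob

    def get_blob(self, object_id):
        cached = self._local_store.get(object_id)
        if cached is not None:
            return cached
        blob = self._backing.get_blob(object_id)
        if blob is not None:
            # Cache locally so the next fetch avoids the network.
            self._local_store[object_id] = blob
        return blob

class CountingBackingStore:
    """Fake backing store that counts fetches, for demonstration."""

    def __init__(self, data):
        self.data = data
        self.fetches = 0

    def get_blob(self, object_id):
        self.fetches += 1
        return self.data.get(object_id)
```

A second `get_blob` for the same id is served from the local store, so the backing store is only hit once.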
Reviewed By: chadaustin
Differential Revision: D31118905
fbshipit-source-id: 4a2958eafeeb8144ee4421ec44dbd30cedceee29
Summary:
Now that we might have multiple kernel protocols per mount (i.e. both fuse and
nfs on macOS) let's include them in eden rage.
Reviewed By: xavierd
Differential Revision: D31154042
fbshipit-source-id: 38e7630829d70fe9dd6dbeabacc3b538ee798e0d
Summary:
`ObjectFetchContext::Origin::FromBackingStore` is widely interpreted as
meaning that a network fetch was performed, but for some backing stores, this
isn't true. The Mercurial backing store, for instance, can either read data from
its on-disk cache or from the network. Since the two have very different
characteristics, we shouldn't bundle them in the same enum value.
Since the backing store knows how the data was obtained, let's have the backing
store return that information so the ObjectStore can properly record it.
`FromBackingStore` is also renamed to make its purpose clearer.
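The split can be illustrated with a small sketch. The enum and function names here are hypothetical, not the actual renamed identifiers; the point is that a local-cache hit and a network fetch become distinct values returned alongside the data:

```python
import enum

class FetchedSource(enum.Enum):
    # Hypothetical names: a disk-cache hit and a network fetch are
    # distinct values instead of one catch-all FromBackingStore.
    LOCAL_CACHE = enum.auto()
    REMOTE = enum.auto()

def fetch_blob(object_id, disk_cache, network):
    """Return (blob, source) so the caller can record the true origin."""
    if object_id in disk_cache:
        return disk_cache[object_id], FetchedSource.LOCAL_CACHE
    return network[object_id], FetchedSource.REMOTE
```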
Reviewed By: zhengchaol
Differential Revision: D31118906
fbshipit-source-id: ee42a0c9d221f870742de07c0df7c732bc79d880
Summary:
we are passing bytes into Popen and shlex.quote, but shlex.quote expects a
string, not bytes. fsencode gives us bytes; fsdecode gives us a string. Let's
use fsdecode instead.
Reviewed By: zhengchaol
Differential Revision: D31129335
fbshipit-source-id: 7792bdcd4dd833a4946daf8ec75576cfe4fc24af
Summary:
The duality of this function is a bit awkward, especially for backing stores
that want metadata caching, but not blob caching. This makes the code in
ObjectStore more complicated than it needs to be.
This will also be used in a future diff.
Reviewed By: chadaustin
Differential Revision: D31090782
fbshipit-source-id: cb7d7fc44d8780f86abad166d0f099675d29e5e7
Summary:
Similarly to blobs, fetching trees converts data from HgImportRequest to
individual hash and proxy-hash vectors, making copies of these. This is
inefficient and makes the code harder to read and understand. Passing the
batch of HgImportRequest directly avoids these copies and makes the code easier
to read.
Reviewed By: fanzeyi
Differential Revision: D30583567
fbshipit-source-id: e85952975141c92f9524095c62418baabf8fefcd
Summary:
Both the caller and the function itself are copying data to make sure that the
next function gets the data in the format it desires. This makes the code
complicated to read, as well as inefficient, as each conversion ends up being a
copy.
We can simply pass the HgImportRequest directly to avoid both of these issues,
which removes all the copies.
A similar change will be done for the getTreeBatch function, after which the
HgDatapackStore will be folded into the HgBackingStore code.
Reviewed By: fanzeyi
Differential Revision: D30563706
fbshipit-source-id: bb392f89e691c22ff9ad4df0d365ddb62077e657
Summary:
The intent behind the refresh method is to both read new data from the disk,
but also to flush the in-memory write buffer to disk. The name "flush" is used
in the revisionstore code to mean the latter, thus let's use "flush" in the
rest of the codebase.
Reviewed By: kmancini
Differential Revision: D30947873
fbshipit-source-id: c85a6abe770a47d6ce454d6af1fa73e505194a22
Summary: ACE will have to run in situations where Chef has not run, but we'll need to be able to reliably write to the auth logs so Blackbird can properly build detections. So we need these crates so we can build the somewhat foolproof solution to ensure ACE logs all executions.
Reviewed By: farnz
Differential Revision: D31066559
fbshipit-source-id: 9fa3b5778cd2602bdeaac90a9daa758b117babfe
Summary:
folly::format is deprecated in favor of fmt and std::format. Migrate
most of EdenFS to fmt instead.
Differential Revision: D31025948
fbshipit-source-id: 82ed674d5e255ac129995b56bc8b9731a5fbf82e
Summary:
Backport: https://github.com/briansmith/ring/pull/1334
This will allow us to unpin Rust compiler to 1.53.0 and update to 1.55.0.
Reviewed By: xavierd
Differential Revision: D31039024
fbshipit-source-id: f6a9c918e836d93d03c34c77c12bbe63cf7cbe09
Summary:
EdenFS would never log anything when mounting via NFS, let's make it more
visible and easier to grep.
Reviewed By: chadaustin
Differential Revision: D31022158
fbshipit-source-id: 99fd3a04c90526eedf9951ac7c2bcd9e18ef8953
Summary:
this relies on local changes to make it so cargo metadata ACTUALLY finds this
binary: https://github.com/tokio-rs/console/pull/146 is where I try to upstream
it
Reviewed By: jsgf
Differential Revision: D30944630
fbshipit-source-id: 5d34a32a042f83eff7e7ae7445e23badf10fffe3
Summary: Watchman is a C++17 project now, so we can use std::optional.
Reviewed By: xavierd
Differential Revision: D30917549
fbshipit-source-id: 95d8ac15d4939a70347336ddfb120ab5025db993
Summary:
Having tons of booleans in a function can be very error prone from a caller
perspective, using a structure to pass in the same information can mitigate
some of this issue.
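The pattern can be illustrated with a small sketch (field and function names are hypothetical): a call like `unload(True, False, True)` tells the reader nothing, while an options struct names each flag at the call site.

```python
from dataclasses import dataclass

@dataclass
class UnloadOptions:
    # Hypothetical fields illustrating the options-struct pattern.
    unload_unlinked: bool = True
    unload_linked: bool = False

def unload_inodes(opts: UnloadOptions) -> list:
    """Pretend unloader that just reports which passes it would run."""
    passes = []
    if opts.unload_unlinked:
        passes.append("unlinked")
    if opts.unload_linked:
        passes.append("linked")
    return passes
```

Callers then write `unload_inodes(UnloadOptions(unload_linked=True))`, which is self-documenting and hard to get wrong when arguments are reordered or added.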
Reviewed By: kmancini
Differential Revision: D30883743
fbshipit-source-id: dcf38d29bfe2cb5155879f7ae4eab5cea31f798a
Summary:
During an `eden chown`, EdenFS will try to chown both the repository and the
redirections. In some cases, chowning the redirections can both take a long time
and be unnecessary. Consider the case where some automation temporarily chowns a
repository to a service user that needs to access the repository, and then
chowns it back to the owner of the repository. In that case, changing the
ownership of the redirections is superfluous.
Reviewed By: mrkmndz
Differential Revision: D31010912
fbshipit-source-id: a882948005ac4fe29ff465088f196e0fc2bc10be
Summary:
In D29940980 (2e2b9755cf) we used shlex for a redirect subprocess command line.
The list does not always contain strings, though, which is a requirement for
shlex.quote; my guess is that the non-strings are paths. We should still str()
things before we shlex.quote them.
Differential Revision: D31001622
fbshipit-source-id: 2a270781d7f2d84ad7a9a2f9975500b29306cfa8
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This means the NFS kernel might have references to inodes after we delete them.
An unknown inode number is not a bug on NFS. It's just stale, so the error should
reflect that.
Reviewed By: xavierd
Differential Revision: D30144898
fbshipit-source-id: 3d448e94aea5acb02908ea443bcf3adae80eb975
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This can be disruptive to a user's workflow because open files that were rm'ed
or removed on checkout will no longer have their old content. (On a native
filesystem, or with FUSE, applications that had the file open prior to the
removal would still be able to access it.) For most editors this is not a
problem because they read the file on open (seems fine for vim and vscode from
testing). However, folks could theoretically have a workflow this does not jibe
with.
Let's make it configurable how often this runs, so users can control how
much we disrupt their workflow.
Reviewed By: xavierd
Differential Revision: D30144899
fbshipit-source-id: 59cf5faea70b3aea216ca2bcb45b96e34f5e72b5
Summary:
NFSv3 has no inode invalidation flow built into the protocol. The kernel does not
send us forget messages like we get in FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS cannot easily tell when
all handles to a file have been closed.
As it is now, we never clean up inodes. This is bad for memory & disk usage.
We will never unload an inode, so we always keep it in memory once it's created.
Additionally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/
We need to clean up these inodes at some point. There are a couple of high-level
options:
1. Support NFSv4. NFSv4 sends us a close message when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.
NFSv4 is probably the right long-term solution, but for now we should be able to
get by with periodic unloads.
I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes)
Option 1 feels a bit too hostile to applications that hold files open.
Option 3 means we will build up a lot of cruft over the course of the day, but
is probably the most application-friendly.
I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below), so I am going with it.
This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/remove dir as well, but this would be more expensive.
Batch unloading on checkout seems better to me and should happen frequently
enough to clean up space for people.
There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or bugs out and unlinks legit inodes. Normally we let those live in the
overlay so we could go in and recover them. My plan to fix this is to mark inodes
for unloading instead of just unloading all unlinked inodes.
Reviewed By: xavierd
Differential Revision: D30144901
fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
Summary:
The NFS protocol needs to know if a read reached the end-of-file to avoid a
subsequent read and thus reduce the chattiness of the protocol.
On top of avoiding RPC calls, this should also halve the amount of data read
from Mercurial due to the BlobCache freeing the in-memory cached blob when the
FS has read the file in its entirety. This meant that the second READ would
always force the blob to be reloaded from the Mercurial store, which would also
force that blob to be kept in memory until being evicted (due to it not being
fully read).
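The idea can be sketched as a read handler whose reply carries an eof flag alongside the data. This is a minimal illustrative sketch, not the actual NFS server code:

```python
def nfs_read(data: bytes, offset: int, count: int):
    """Return (chunk, eof). eof=True tells the client it reached
    end-of-file, so it can skip the extra READ it would otherwise
    issue just to confirm there is no more data."""
    chunk = data[offset:offset + count]
    eof = offset + len(chunk) >= len(data)
    return chunk, eof
```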
Reviewed By: kmancini
Differential Revision: D30871422
fbshipit-source-id: 8acf4e21ea22b2dfd7f81d2fdd1b137a6e90cc8f
Summary:
The tree metadata fetching evolution goes as follows:
(1) (commit, path) scs query
(2) tree manifest scs query [we are here]
(3) eden api manifest query [in development]
Option (1) is no longer used and is the only place that required the scs proxy hash.
Removing it will simplify the transition from (2) to (3) and also cleans up a bunch of unused code.
It also comes with a minor performance improvement, saving about 5% on file access time.
To be precise, this is measured by running fsprobe [this is probably too little to measure in a high-noise benchmark like running arc focus]:
```
fsprobe.sh run cat.targets --parallel 24
```
Results:
```
W/ scshash:
P24: 0.1044 0.1007 0.1005 (hot) 0.1019 avg
W/o scshash:
P24: 0.0954 0.0964 0.1008 (hot) 0.0975 avg
```
This performance improvement comes from the fact that even though the scs hash was never created or used, we still attempted to load it from the scs table, and even though this load always failed, it contributed to execution time.
Reviewed By: xavierd
Differential Revision: D30942663
fbshipit-source-id: af84f1e5658e7d8d9fb6853cbb88f02b49cd050b
Summary: This adds inode number to NFS trace event so that we can use it in ActivityRecorder to show the filename of the FS request.
Reviewed By: xavierd
Differential Revision: D30849770
fbshipit-source-id: 580faf5fccb1a225399d9aec843e23eae1874e87
Summary:
We have an option on GlobFiles for listing hidden files, but we don't have a
cli option; we default to false in the cli. Let's pipe this option all the way
through so that we can control this flag from the cli.
Reviewed By: xavierd
Differential Revision: D30915118
fbshipit-source-id: 28b91d4fd2dd4bdf9e342929f570f64db14e8ad0
Summary:
`eden prefetch` and `eden glob` return lists that, despite being called
"matching files", actually contain both files and directories.
In some cases, we only want the list of files, and it introduces unnecessary
overhead on our clients for them to have to stat all the files in the list to
filter out the dirs. Let's add an option to just list files.
Reviewed By: chadaustin
Differential Revision: D30816193
fbshipit-source-id: 6e264142162ce03e560c969a0c0dbbc2f418d7b9
Summary: The error message that currently exists here does not correspond to the command that was run; it's just missing the "redirect" part.
Reviewed By: xavierd
Differential Revision: D30914616
fbshipit-source-id: 866ab7d494b728af13fbb3656edb8740a399755f
Summary: This is a similar diff to D30915090, but for EdenFS.
Differential Revision: D30915126
fbshipit-source-id: 9a718e47237924ebe20176c522a1b1193224236c
Summary:
To eliminate the need for proxy hashes, we need variable-width object
IDs. Introduce an ObjectId type much like RootId.
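The shape of the new type can be sketched like this. This is a hypothetical Python sketch of the idea (the real ObjectId is C++): an opaque, hashable byte string of any length, rather than a fixed 20-byte hash.

```python
class ObjectId:
    """Hypothetical sketch of a variable-width object ID."""

    __slots__ = ("_raw",)

    def __init__(self, raw: bytes) -> None:
        self._raw = bytes(raw)

    def __eq__(self, other) -> bool:
        return isinstance(other, ObjectId) and self._raw == other._raw

    def __hash__(self) -> int:
        # Hashable so it can key caches and maps, like RootId.
        return hash(self._raw)

    def __len__(self) -> int:
        return len(self._raw)
```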
Reviewed By: genevievehelsel
Differential Revision: D30819412
fbshipit-source-id: 07a185ba6b866b475c92f811e70aa00a8a9f895f
Summary: the rage summary is getting hard to quickly parse, so this underlines each section header and unifies the underline look (with `eden stats`). This adopts the underline code from `eden du` and makes it a util function for shareability.
Differential Revision: D30857773
fbshipit-source-id: 66b5b06f5b0125304d45d3465a8bc2248693b791
Summary: I think it would be helpful to see the path of the inode that causes this check to fail
Reviewed By: kmancini
Differential Revision: D30880645
fbshipit-source-id: 08cad2277484568a6e325b1db7a89a9cf0fe1d3f
Summary: it would be helpful to see a user's or sandcastle job's eden config, especially in the case of a gated feature rollout / staged feature rollout.
Differential Revision: D30857763
fbshipit-source-id: ee2a311ee643fc9db5acef1b02017564c51d2362
Summary:
just a typo fix
Created from CodeHub with https://fburl.com/edit-in-codehub
Reviewed By: fanzeyi
Differential Revision: D30849172
fbshipit-source-id: 9779832870c909d080548ec71ecf86aa53767dbc
Summary:
Add impls for Layer for Box/Arc<L: Layer> and <dyn Layer>. Also a pile of other
updates in git which haven't been published to crates.io yet, including proper
level filtering of trace events being fed into log.
Reviewed By: dtolnay
Differential Revision: D30829927
fbshipit-source-id: c01c9369222df2af663e8f8bf59ea78ee12f7866
Summary:
Bump all the versions on crates.io to highest to make migration to github
versions in next diff work.
Reviewed By: dtolnay
Differential Revision: D30829928
fbshipit-source-id: 09567c26f275b3b1806bf8fd05417e91f04ba2ef
Summary: This adds the support for FS events logging for NFS. For context, each type of event is assigned a sampling group that determines its sampling rate. In TraceBus subscription callback, events are sent to `FsEventLogger` to be sampled and logged through `HiveLogger`.
Reviewed By: xavierd
Differential Revision: D30843863
fbshipit-source-id: 65394d31b1197efd69c7fd4c1b24562f5abd5785
Summary:
In certain situations, users may cause EdenFS to falsely return a path-does-not-exist result while the path is actually available. Windows will cache that, causing subsequent accesses to that file to automatically return a file-does-not-exist error.
We currently only invalidate this negative cache during checkout and when rebooting the machine, as the cache is even kept across EdenFS restarts. In this diff, we start to invalidate the negative path cache at startup, so if the user ever hits this issue, an `eden restart` will be sufficient to fix it.
Reviewed By: xavierd
Differential Revision: D30814059
fbshipit-source-id: 53283f471702762b2eed0c5d0f6a9cc49f4db739
Summary:
Put code using the usage service behind an `EDEN_HAVE_USAGE_SERVICE` macro.
Previously the C++ code was simply guarded by a `__linux__` check, and the
CMake code did not have a guard at all. This caused builds from the GitHub
repository to fail on Linux, since the code attempted to use the usage service
client which was not available.
Reviewed By: xavierd
Differential Revision: D30797846
fbshipit-source-id: 32a0905d0e1d594c3cfb04a466aea456d0bd6ca1
Summary:
LocalStore no longer special-cases Tree objects with kZeroHash
ids. Instead, unconditionally write into LocalStore with the Tree's
hash.
Reviewed By: xavierd
Differential Revision: D29155470
fbshipit-source-id: aee3840fe8dfd7aa46305b6db6f7950efb2e41d2
Summary:
In preparation for expanding to variable-width hashes, rename the
existing hash type to Hash20.
Reviewed By: genevievehelsel
Differential Revision: D28967365
fbshipit-source-id: 8ca8c39bf03bd97475628545c74cebf0deb8e62f
Summary:
As title, the sampling group determines the sampling rate at which an FS event is logged. The higher the sampling group, the more heavily its events are dropped; thus, more frequent events are assigned to the higher sampling groups.
I ran activity recorders on a few workflows (buck build, getdeps, and vscode editing) and came up with the following assignment. Note that only a subset of events are assigned to a sampling group (so events not included will not be logged), as we are just starting to tune the sampling rates and these events should be good for a start.
```
Group1 (1/10)
FUSE_MKDIR
FUSE_RMDIR
FUSE_CREATE
FUSE_RENAME
Group2 (1/100)
FUSE_WRITE
FUSE_LISTXATTR
FUSE_SETATTR
Group3 (1/1000)
FUSE_GETXATTR
FUSE_GETATTR
FUSE_READ
FUSE_READDIR
Group4 (1/10000)
FUSE_LOOKUP
```
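The assignment above can be sketched as a lookup from event to group to sampling rate. This is an illustrative sketch, not the actual FsEventLogger code; the function name and table shapes are hypothetical, but the rates and group membership mirror the table above:

```python
import random

# Sampling denominators per group, per the table above (1/10, 1/100, ...).
GROUP_RATE = {1: 10, 2: 100, 3: 1_000, 4: 10_000}

# Event-to-group assignment mirroring the summary.
EVENT_GROUP = {
    "FUSE_MKDIR": 1, "FUSE_RMDIR": 1, "FUSE_CREATE": 1, "FUSE_RENAME": 1,
    "FUSE_WRITE": 2, "FUSE_LISTXATTR": 2, "FUSE_SETATTR": 2,
    "FUSE_GETXATTR": 3, "FUSE_GETATTR": 3, "FUSE_READ": 3, "FUSE_READDIR": 3,
    "FUSE_LOOKUP": 4,
}

def should_log(event: str, rng=random.random) -> bool:
    group = EVENT_GROUP.get(event)
    if group is None:
        return False  # events without a group are never logged
    return rng() < 1.0 / GROUP_RATE[group]
```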
For reference, here are the counts of FS events of a cold buck build. The frequencies of other workflows are similar.
```
FUSE_LOOKUP 60.09 98733
FUSE_READ 12.80 21037
FUSE_GETXATTR 8.91 14645
FUSE_FORGET 8.01 13162
FUSE_GETATTR 5.55 9116
FUSE_READDIR 3.21 5270
FUSE_LISTXATTR 0.59 969
FUSE_READLINK 0.54 892
FUSE_STATFS 0.21 338
FUSE_WRITE 0.04 64
FUSE_CREATE 0.02 28
FUSE_RENAME 0.01 23
FUSE_SETATTR 0.01 13
FUSE_UNLINK 0.00 6
FUSE_RMDIR 0.00 1
FUSE_MKDIR 0.00 1
FUSE_MKNOD 0.00 1
```
Reviewed By: xavierd
Differential Revision: D30770533
fbshipit-source-id: 90be881ddbeba2113bbb190bdb1e300a68f500a0