Summary: the rage summary is getting hard to parse quickly, so this underlines each section header and unifies the underline style with `eden stats`. It adopts the underline code from `eden du` and turns it into a shared util function.
Differential Revision: D30857773
fbshipit-source-id: 66b5b06f5b0125304d45d3465a8bc2248693b791
Summary: I think it would be helpful to see the path of the inode that causes this check to fail
Reviewed By: kmancini
Differential Revision: D30880645
fbshipit-source-id: 08cad2277484568a6e325b1db7a89a9cf0fe1d3f
Summary: it would be helpful to see a user's or sandcastle job's eden config, especially in the case of a gated feature rollout / staged feature rollout.
Differential Revision: D30857763
fbshipit-source-id: ee2a311ee643fc9db5acef1b02017564c51d2362
Summary:
just a typo fix
Created from CodeHub with https://fburl.com/edit-in-codehub
Reviewed By: fanzeyi
Differential Revision: D30849172
fbshipit-source-id: 9779832870c909d080548ec71ecf86aa53767dbc
Summary:
Add impls of Layer for Box/Arc<L: Layer> and <dyn Layer>. Also pull in a pile
of other updates from git which haven't been published to crates.io yet,
including proper level filtering of trace events being fed into log.
Reviewed By: dtolnay
Differential Revision: D30829927
fbshipit-source-id: c01c9369222df2af663e8f8bf59ea78ee12f7866
Summary:
Bump all the crates.io versions to the highest available, to make the
migration to the github versions in the next diff work.
Reviewed By: dtolnay
Differential Revision: D30829928
fbshipit-source-id: 09567c26f275b3b1806bf8fd05417e91f04ba2ef
Summary: This adds support for FS event logging for NFS. For context, each type of event is assigned a sampling group that determines its sampling rate. In the TraceBus subscription callback, events are sent to `FsEventLogger` to be sampled and logged through `HiveLogger`.
Reviewed By: xavierd
Differential Revision: D30843863
fbshipit-source-id: 65394d31b1197efd69c7fd4c1b24562f5abd5785
Summary:
In certain situations, users may cause EdenFS to falsely return a path-does-not-exist result even though the path is available. Windows will cache that result, causing subsequent accesses to that file to automatically return a file-not-found error.
We currently only invalidate this negative cache during checkout and when the machine reboots, as the cache is kept even across EdenFS restarts. In this diff, we start to invalidate the negative path cache at startup, so if the user ever hits this issue an `eden restart` will be sufficient to fix it.
Reviewed By: xavierd
Differential Revision: D30814059
fbshipit-source-id: 53283f471702762b2eed0c5d0f6a9cc49f4db739
Summary:
Put code using the usage service behind an `EDEN_HAVE_USAGE_SERVICE` macro.
Previously the C++ code was simply guarded by a `__linux__` check, and the
CMake code did not have a guard at all. This caused builds from the GitHub
repository to fail on Linux, since the code attempted to use the usage service
client which was not available.
Reviewed By: xavierd
Differential Revision: D30797846
fbshipit-source-id: 32a0905d0e1d594c3cfb04a466aea456d0bd6ca1
Summary:
LocalStore no longer special-cases Tree objects with kZeroHash
ids. Instead, unconditionally write into LocalStore with the Tree's
hash.
Reviewed By: xavierd
Differential Revision: D29155470
fbshipit-source-id: aee3840fe8dfd7aa46305b6db6f7950efb2e41d2
Summary:
In preparation for expanding to variable-width hashes, rename the
existing hash type to Hash20.
Reviewed By: genevievehelsel
Differential Revision: D28967365
fbshipit-source-id: 8ca8c39bf03bd97475628545c74cebf0deb8e62f
Summary:
As the title says, the sampling group determines the sampling rate at which an FS event is logged. The higher the sampling group, the more heavily its events are dropped; thus, more frequent events are assigned to higher sampling groups.
I ran activity recorders on a few workflows (buck build, getdeps, and vscode editing) and came up with the following assignment. Note that only a subset of events is assigned to a sampling group (events not included will not be logged), as we are just starting to tune the sampling rates and these events should be a good start.
```
Group1 (1/10)
FUSE_MKDIR
FUSE_RMDIR
FUSE_CREATE
FUSE_RENAME
Group2 (1/100)
FUSE_WRITE
FUSE_LISTXATTR
FUSE_SETATTR
Group3 (1/1000)
FUSE_GETXATTR
FUSE_GETATTR
FUSE_READ
FUSE_READDIR
Group4 (1/10000)
FUSE_LOOKUP
```
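As a rough sketch of the grouping logic above (the enum, map, and function names are illustrative, not EdenFS's actual identifiers), each event type maps to a sampling denominator, and an event is logged roughly once per denominator draws:

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <unordered_map>

// Hypothetical sketch: map each FS event type to its sampling group's
// denominator. Group 1 logs ~1/10 events, group 2 ~1/100, and so on.
enum class FsEventType { Mkdir, Rmdir, Create, Rename, Write, Getattr, Lookup, Forget };

const std::unordered_map<FsEventType, uint64_t> kSamplingDenominator = {
    {FsEventType::Mkdir, 10},
    {FsEventType::Rmdir, 10},
    {FsEventType::Create, 10},
    {FsEventType::Rename, 10},
    {FsEventType::Write, 100},
    {FsEventType::Getattr, 1000},
    {FsEventType::Lookup, 10000},
};

// Decide whether a single event should be logged. Event types with no
// assigned group are never logged.
bool shouldLog(FsEventType type, std::mt19937& rng) {
  auto it = kSamplingDenominator.find(type);
  if (it == kSamplingDenominator.end()) {
    return false;
  }
  std::uniform_int_distribution<uint64_t> dist(0, it->second - 1);
  return dist(rng) == 0;
}
```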
For reference, here are the counts of FS events of a cold buck build. The frequencies of other workflows are similar.
```
FUSE_LOOKUP 60.09 98733
FUSE_READ 12.80 21037
FUSE_GETXATTR 8.91 14645
FUSE_FORGET 8.01 13162
FUSE_GETATTR 5.55 9116
FUSE_READDIR 3.21 5270
FUSE_LISTXATTR 0.59 969
FUSE_READLINK 0.54 892
FUSE_STATFS 0.21 338
FUSE_WRITE 0.04 64
FUSE_CREATE 0.02 28
FUSE_RENAME 0.01 23
FUSE_SETATTR 0.01 13
FUSE_UNLINK 0.00 6
FUSE_RMDIR 0.00 1
FUSE_MKDIR 0.00 1
FUSE_MKNOD 0.00 1
```
Reviewed By: xavierd
Differential Revision: D30770533
fbshipit-source-id: 90be881ddbeba2113bbb190bdb1e300a68f500a0
Summary:
Since this method wasn't overridden, EdenFS would never periodically flush data
to disk.
Reviewed By: fanzeyi
Differential Revision: D30784400
fbshipit-source-id: d88e535250a476582868dd82e57137a0ac38f921
Summary: If the bind unmount fails in the privhelper, there's a possibility of infinite recursion in this method. This adds a flag to indicate whether we've already tried the bind unmount.
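The guard pattern can be sketched as follows (the struct and method names are hypothetical, not the privhelper's real API):

```cpp
#include <cassert>

// Illustrative sketch of guarding a retry path against infinite recursion:
// a boolean flag records whether the bind-unmount fallback was already
// attempted, so a persistently failing bind unmount is tried at most once.
struct Unmounter {
  bool triedBindUnmount = false;
  int attempts = 0;

  bool bindUnmount() {
    ++attempts;
    return false;  // simulate a bind unmount that keeps failing
  }

  bool unmount() {
    if (bindUnmount()) {
      return true;
    }
    // Without the flag, a failed bind unmount would re-enter unmount()
    // unconditionally and recurse forever.
    if (!triedBindUnmount) {
      triedBindUnmount = true;
      return unmount();
    }
    return false;
  }
};
```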
Differential Revision: D30732857
fbshipit-source-id: 6ee887d211977ee94c8e66531287f076a7e61a2c
Summary:
Having the same queue for all three makes the dequeue code overly complicated,
as it needs to keep track of the kind of request that needs to be dequeued.
Incidentally, the previous code had a bug where requests that were put back
would be re-queued at the end of the queue, even though they had been at the
beginning of it, if they all had the same priority.
In theory this should also improve dequeue performance when the queue has a
mix of blob/tree requests, but I haven't measured it.
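A minimal sketch of the per-kind-queue idea (hypothetical structure, not EdenFS's actual request queue): with a separate queue per request kind, dequeuing a blob never has to skip over tree requests, and a put-back request returns to the front of its own queue, preserving its position.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Sketch: one deque per request kind instead of one shared queue.
struct KindQueues {
  std::deque<std::string> blobs;
  std::deque<std::string> trees;

  void enqueueBlob(std::string r) { blobs.push_back(std::move(r)); }

  std::string dequeueBlob() {
    std::string r = std::move(blobs.front());
    blobs.pop_front();
    return r;
  }

  // A put-back request returns to the *front* of its queue, so it keeps its
  // original position relative to same-priority requests (the bug fixed here).
  void putBackBlob(std::string r) { blobs.push_front(std::move(r)); }
};
```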
Reviewed By: genevievehelsel
Differential Revision: D30560490
fbshipit-source-id: b27e5429105c07e5f9eab482c12e5699ca3413f7
Summary:
Since the background condition is checked before the actual prefetching of
files, specifying the background option would just glob files but not prefetch
them, which is equivalent to prefetching all the trees.
Reviewed By: genevievehelsel
Differential Revision: D30618753
fbshipit-source-id: 5533b1c78d614342ac3341ce033795be3850750a
Summary: Adds an option to print the path to the eden log file. Similar to `eden pid`, this can be used for shell one-liners.
Reviewed By: chadaustin
Differential Revision: D30558294
fbshipit-source-id: ca70addaef2093e10f0321bae0cff3b1bfc7dc75
Summary: `eden debug log --upload` fits in better with the format of the other cli tools (rather than `eden debug log upload`)
Differential Revision: D30557691
fbshipit-source-id: 32e47e1487703560f2adb5f0f79f1002d29eea93
Summary: Currently, tree imports are queued regardless of whether they are in the `hgcache`. This adds unnecessary delay, especially if the queue is busy (the importer takes a long time and causes the queue to back up). This diff adds logic to check whether the tree is in the `hgcache` before enqueuing a tree import request.
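The fast path can be sketched like this (hypothetical API, not HgQueuedBackingStore's real interface): check the local cache first and only pay queue latency on a miss.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Sketch of "check the local cache before queueing an import".
struct TreeImporter {
  std::set<std::string> hgcache;     // trees already present locally
  std::vector<std::string> queued;   // requests sent to the import queue

  // Returns true if the tree was served from cache, false if it was queued.
  bool getTree(const std::string& id) {
    if (hgcache.count(id)) {
      return true;  // fast path: no queueing, no importer round trip
    }
    queued.push_back(id);  // slow path: enqueue an import request
    return false;
  }
};
```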
Reviewed By: xavierd
Differential Revision: D30514871
fbshipit-source-id: eb23f64b7f059832571f957fb67d18c3821d2844
Summary:
Some code in the HgDatapackStore is overly complicated due to the fact that
revHash returns an owned Hash, which forces the code to copy it into a
temporary vector. By having a method that can directly return a slice of the
hash, this issue disappears, so let's add it.
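The difference between the two accessors can be sketched as follows (the Hash type here is a stand-in for EdenFS's 20-byte hash, and the method names are illustrative):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Sketch: owned-value accessor vs. slice accessor. Returning the bytes by
// value forces a copy; returning a pointer into existing storage does not.
struct Hash {
  std::array<uint8_t, 20> bytes{};

  // Old style: returns an owned copy of the hash bytes.
  std::array<uint8_t, 20> revHash() const { return bytes; }

  // New style: returns a view into the hash's own storage, no copy made.
  const uint8_t* revHashBytes() const { return bytes.data(); }
};
```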
Reviewed By: chadaustin
Differential Revision: D30582458
fbshipit-source-id: dc102117bc82ab72378293c0abfe9acfd862e9e6
Summary:
I noticed that every time we fetch a blob from hg, we calculate its SHA-1 hash and put it into the metadata table.
Both calculating the SHA-1 of the content and writing it to RocksDB are fairly expensive, and it would be nice if we could skip doing so in some cases.
In this diff I use an inexpensive cache check to see whether we have already calculated metadata for a given blob, and skip the recalculation if so.
In terms of performance, it reduces blob access time in the hot case from **0.62 ms to 0.22 ms**.
[still need to do some testing with buck, but I think this should not block the diff since it seems fairly trivial]
This is a short-to-medium term fix; the longer term solution is keeping hashes in Mercurial and fetching them via EdenAPI, but that will take some time to implement.
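The check-before-compute shape looks roughly like this (a minimal sketch; `std::hash` stands in for the SHA-1 computation, and the in-memory map stands in for the RocksDB metadata table):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

// Sketch of skipping expensive metadata recomputation via a cheap cache check.
struct MetadataCache {
  std::unordered_map<std::string, size_t> known;  // blob id -> metadata
  int computations = 0;                           // expensive-path counter

  size_t getMetadata(const std::string& id, const std::string& content) {
    auto it = known.find(id);
    if (it != known.end()) {
      return it->second;  // cheap cache hit: no hashing, no write
    }
    ++computations;  // expensive path: compute the hash and persist it
    size_t meta = std::hash<std::string>{}(content);
    known.emplace(id, meta);
    return meta;
  }
};
```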
Reviewed By: chadaustin, xavierd
Differential Revision: D30587132
fbshipit-source-id: 3b24ec88fb02e1ea514568b4e2c8f9fd784a0f10
Summary: Similarly to the enqueue benchmark, let's have a dequeue benchmark.
Differential Revision: D30560489
fbshipit-source-id: ae18f7e283e4bab228aaa0f58bff2e6f2cfa3021
Summary:
In order to enqueue and find an element in a hash table, the key needs to be
hashed. Hashing an HgProxyHash relies on hashing a string, which is
significantly more expensive than hashing a Hash directly. Note that both
represent the same data, and thus there shouldn't be more collisions.
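A sketch of why hashing the binary hash directly is cheap (illustrative types, not EdenFS's actual Hash/hasher): the 20 bytes are already uniformly distributed, so a word taken straight from them makes a fine hash table key with no string traversal.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Stand-in for a 20-byte content hash.
struct Hash20 {
  std::array<uint8_t, 20> bytes{};
  bool operator==(const Hash20& o) const { return bytes == o.bytes; }
};

// Hash the first 8 bytes as a single word. Since the bytes come from a
// cryptographic hash, they are uniform, so this introduces no extra
// collisions and avoids hashing a longer string key.
struct Hash20Hasher {
  size_t operator()(const Hash20& h) const {
    uint64_t w;
    std::memcpy(&w, h.bytes.data(), sizeof(w));
    return static_cast<size_t>(w);
  }
};
```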
Reviewed By: chadaustin
Differential Revision: D30520223
fbshipit-source-id: 036007c445c28686f777aa170d0344346e7348b0
Summary:
Allocations are expensive, especially when done under a lock, as this
lengthens the critical section and reduces the potential concurrency. While
this yields a 1.25x speedup, it is more of a sideways improvement, as the
allocation is now simply done prior to enqueuing. This also means that
de-duplicating requests is now more expensive, as the allocation happens even
for requests that end up de-duplicated, but de-duplication is the uncommon
code path, so the tradeoff is worthwhile.
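The shape of the change can be sketched like this (hypothetical queue, not the real ImportRequest queue): the heap allocation happens before the lock is taken, so the critical section only covers the cheap pointer push.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

// Sketch of moving an allocation out of a critical section.
struct Queue {
  std::mutex lock;
  std::vector<std::shared_ptr<int>> queue;

  void enqueue(int value) {
    // Allocation happens here, outside the critical section.
    auto request = std::make_shared<int>(value);
    std::lock_guard<std::mutex> g(lock);
    queue.push_back(std::move(request));  // only a cheap push under the lock
  }
};
```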
Reviewed By: chadaustin
Differential Revision: D30520228
fbshipit-source-id: 99dea65e828f9c896fdfca6b308106554c989282
Summary: The F14 hash maps are significantly faster than std::unordered_map.
Reviewed By: chadaustin
Differential Revision: D30520225
fbshipit-source-id: d986908c5eac17f66ae2c7589f134c430a3c656e
Summary:
When turning on native prefetch, EdenFS will enqueue tons of blob requests
to the import request queue. The expectation is then that the threads will
dequeue batches of requests and run them. What is observed is however
vastly different: the dequeued batches are barely bigger than 10, far smaller
than the batch capacity, leading to fetching inefficiencies. The reason for
this is that enqueuing is too costly.
The first step in making enqueuing less costly is to reduce the number of
times the lock needs to be acquired, by moving the de-duplication inside the
enqueue function itself. On top of reducing the number of times the lock is
held, this also halves the number of allocations done under the lock.
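A minimal sketch of de-duplication folded into enqueue (hypothetical structure, not the actual EdenFS queue): one lock acquisition both checks for an in-flight request with the same id and inserts the new one.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Sketch: de-duplication done inside enqueue, under a single lock hold,
// instead of as a separate locked lookup followed by a locked insert.
struct DedupQueue {
  std::mutex lock;
  std::unordered_map<std::string, int> pending;  // id -> waiter count
  std::vector<std::string> queue;

  // Returns true if a new request was queued, false if it was merged into
  // an existing in-flight request for the same id.
  bool enqueue(const std::string& id) {
    std::lock_guard<std::mutex> g(lock);
    auto [it, inserted] = pending.try_emplace(id, 0);
    ++it->second;
    if (inserted) {
      queue.push_back(id);
    }
    return inserted;
  }
};
```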
Reviewed By: chadaustin
Differential Revision: D30520226
fbshipit-source-id: 52f6e3c1ec45caa5c47e3fd122b3a933b0448e7c
Summary:
It turns out that we do want to use a Future to make sure that the tracebus and
watches are completed on the producer and not on the consumer of the future. We
could use a `.via(inline executor)`, but the code becomes less readable, so
let's just revert the diff.
Reviewed By: chadaustin
Differential Revision: D30545721
fbshipit-source-id: 524033ab4dbd16be0c377647f7f81f7cd57c206d
Summary:
This change has the unintended effect of causing any Thrift call to
potentially issue a recursive EdenFS call, due to symlink resolution requiring
running `readlink` on the root of the repo itself.
Fixing this isn't really possible, so let's revert the change altogether; we
can force clients to issue a realpath before issuing EdenFS Thrift calls.
Reviewed By: kmancini
Differential Revision: D30550796
fbshipit-source-id: 9494c8e08c8af2392eeb344879f156cb56f93ea6
Summary:
The documentation allows for not having to test in enqueue whether the queue
is still running: if the queue is only stopped in the destructor of its owner,
no enqueue can logically happen afterwards, and thus we do not need to protect
against it.
Reviewed By: chadaustin
Differential Revision: D30520227
fbshipit-source-id: 9d6280ccd7fe875cd06b0746151a2897d1f98d61
Summary:
When pushing thousands of requests to the queue, the dequeue side only
manages to pull batches of ~10 requests at most. Let's measure the cost of
enqueue so we can optimize it.
Reviewed By: chadaustin
Differential Revision: D30503110
fbshipit-source-id: d06ae6741b13b831fa3711fb2dd0e38c3e54193c
Summary: Added a kill switch to enable/disable predictive prefetch profiles similar to the existing one for regular prefetch profiles (D24803728 (7dccb8a49f)). This can be set manually in a user's config or via the cli `eden prefetch-profile disable-predictive/enable-predictive` commands.
Reviewed By: genevievehelsel
Differential Revision: D30404139
fbshipit-source-id: 01900f4030ef6991124f89a67ea404ff2f07ffeb
Summary:
Added eden prefetch-profile activate-predictive/deactivate-predictive subcommands to activate and deactivate predictive prefetch profiles. This will update the checkout config to indicate if predictive prefetch profiles are currently active or not, and stores the overridden num_dirs if specified on activate (--num-dirs N). If activate is called twice with different num_dirs, the value is updated (only one is stored). Unless --skip-prefetch is specified, a predictive prefetch with num_dirs globs (or the default inferred in the daemon) is run.
Also added fetch-predictive [--num-dirs N], which will:
1. if num_dirs is specified: fetch num_dirs globs predictively
2. if num_dirs is not specified, and predictive fetch is active: get the active num_dirs from the checkout config and fetch globs predictively
3. if num_dirs is not specified, and predictive fetch is not active: fetch the default num_dirs (inferred in the daemon)
Added --if-active to fetch-predictive. If set, fetch will not run if predictive prefetch profiles have not been activated (predictive-prefetch-active in checkout-config). Used for post pull hook.
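The num_dirs resolution order described above can be sketched as follows (a hypothetical helper, not the actual CLI code): an explicit --num-dirs wins, then the value stored in the checkout config when predictive prefetch is active, else 0 meaning "let the daemon infer its default".

```cpp
#include <cassert>
#include <optional>

// Stand-in for the relevant bits of the checkout config.
struct CheckoutConfig {
  bool predictiveActive = false;
  std::optional<int> activeNumDirs;  // stored at activate time, if given
};

int resolveNumDirs(std::optional<int> cliNumDirs, const CheckoutConfig& cfg) {
  if (cliNumDirs) {
    return *cliNumDirs;  // case 1: explicit --num-dirs flag
  }
  if (cfg.predictiveActive && cfg.activeNumDirs) {
    return *cfg.activeNumDirs;  // case 2: value stored in the checkout config
  }
  return 0;  // case 3: 0 here means "daemon infers its default"
}
```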
Reviewed By: genevievehelsel
Differential Revision: D30306235
fbshipit-source-id: ba02c2bc976128704c8ab0c3d567637265b7c95d
Summary:
Made changes to ensure that numResults is always a 32-bit unsigned int, and startTime and endTime are 64-bit unsigned ints. This ensures consistency between the smartservice and the endpoint in the daemon.
Also, updated the scuba query in the smartservice to only consider dirs with more than 1 access (we may later accept a configurable lower bound on the access count, but for now, including access=1 doesn't make sense).
Reviewed By: genevievehelsel
Differential Revision: D30396526
fbshipit-source-id: 10e7bd969928da91ab29d413280a1ff956db438c
Summary:
This is now only used in HgQueuedBackingStore::logBackingStoreFetch, and
manually inlining it allows for the lock to be taken once instead of once per
path, reducing the number of times the lock needs to be acquired.
Differential Revision: D30494771
fbshipit-source-id: 2d59d0343e48051e4d9c4fc196e66bcb79e7ac71
Summary: While `eden trace hg` already prints queue time when it's over 1ms, this diff adds fb303 counters for import tree/block queue time so that we can have the percentiles.
Reviewed By: xavierd
Differential Revision: D30492275
fbshipit-source-id: 3601aeb9b51b2f55f189a0e0a753fd6ef29d7341
Summary: Currently, the store loops through the requests, calls HgImporter, then waits with `getTry`. This diff changes it to kick off all tree imports from HgImporter first, then wait for future fulfillment with `collectAll`.
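The "kick off everything, then collect" shape can be sketched with standard futures standing in for folly futures and a doubling lambda standing in for HgImporter (std::launch::deferred keeps this sketch single-threaded; the real imports run concurrently on importer threads):

```cpp
#include <cassert>
#include <future>
#include <vector>

// Sketch: launch every import before waiting on any of them, then collect
// all the results, akin to collectAll, instead of launch-wait-launch-wait.
std::vector<int> importAll(const std::vector<int>& ids) {
  std::vector<std::future<int>> futures;
  futures.reserve(ids.size());
  for (int id : ids) {
    // First pass: start every import without blocking.
    futures.push_back(
        std::async(std::launch::deferred, [id] { return id * 2; }));
  }
  std::vector<int> results;
  results.reserve(futures.size());
  for (auto& f : futures) {
    results.push_back(f.get());  // second pass: collect every result
  }
  return results;
}
```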
Reviewed By: xavierd
Differential Revision: D30486459
fbshipit-source-id: 918e52be818a2064cf04d24f455d23c1ca618434