Summary:
This will be used in the next diffs to add batch derivations for unode.
It also makes it symmetrical to create_manifest_unode.
Reviewed By: mitrandir77
Differential Revision: D31015719
fbshipit-source-id: 65e12901c6a004375c7c0e3b07f1632ac9c6eaa8
Summary:
In some cases (e.g. when master bookmark moves backwards) there might be
commits in segmented changelog that are not ancestors of master. When reseeding
we still want to build segments for these changesets, and this is what this
diff does (see D30898955 for more details about why we want to build segments
for these changesets).
Reviewed By: quark-zju
Differential Revision: D30996484
fbshipit-source-id: 864aaaacfc04d6169afd3d04ebcb6096ae2514e5
Summary:
In D29940980 (2e2b9755cf) we used shlex for a redirect subprocess command line.
The list does not always contain strings, though, which shlex.quote requires;
my guess is that the non-string items are paths. We should still str things
before we shlex.quote them.
Differential Revision: D31001622
fbshipit-source-id: 2a270781d7f2d84ad7a9a2f9975500b29306cfa8
Summary:
One of the largest contributors to EdenFS memory usage is the internal
IndexedLog buffers, which hold data in memory until a threshold is reached. Since
the main benefit of these buffers is to utilize the disk bandwidth, very large
buffers aren't necessary and much smaller ones will be able to achieve similar
results.
A default 50MB buffer is used which will cap the memory usage to 50MB * 3:
- File IndexedLogDataStore
- Tree IndexedLogDataStore
- File LFS
The aux and history stores are also reduced down to 10MB.
Reviewed By: DurhamG
Differential Revision: D30948343
fbshipit-source-id: 74e789856ac995a5672b6aefe8a68c9580f69613
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This means the NFS kernel might have references to inodes after we delete them.
An unknown inode number is not a bug on NFS. It's just stale, so the error should
reflect that.
Reviewed By: xavierd
Differential Revision: D30144898
fbshipit-source-id: 3d448e94aea5acb02908ea443bcf3adae80eb975
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This can be disruptive to a user's workflow because open files that were rm'ed
or removed on checkout will no longer have their old content. (On a native
filesystem or FUSE, applications that had the file open prior to the removal
would still be able to access it.) For most editors this is not a problem
because they read the file on open (seems fine for vim and vscode from testing).
However, folks could theoretically have a workflow this does not jibe with.
Let's make it configurable how often this runs, so users can control how
much we disrupt their workflow.
Reviewed By: xavierd
Differential Revision: D30144899
fbshipit-source-id: 59cf5faea70b3aea216ca2bcb45b96e34f5e72b5
Summary:
NFSv3 has no inode invalidation flow built into the protocol. The kernel does not
send us forget messages like we get in FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS cannot easily tell when
all handles to a file have been closed.
As it stands, we never clean up inodes. This is bad for memory & disk usage.
We never unload an inode, so we always keep it in memory once it's created.
Additionally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/
We need to clean up these inodes at some point. There are a couple of high-level
options:
1. Support NFSv4. NFSv4 sends us a close message when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.
NFSv4 is probably the right long-term solution, but for now we should be able to
get by with periodic unloads.
I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes).
Option 1 feels a bit too hostile to applications that hold files open.
Option 3 means we will build up a lot of cruft over the course of the day, but it is
probably the most application friendly.
I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below), so I am going with it.
This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/remove dir as well, but this would be more expensive.
Batch unloading on checkout seems better to me and should happen
frequently enough to clean up space for people.
There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or has bugs and unlinks legit inodes. Normally we let those live in the
overlay so we can go in and recover them. My plan to fix this is to mark inodes
for unloading instead of just unloading all unlinked inodes.
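A toy model of option 2 combined with the "mark inodes for unloading" plan, as a rough sketch only. The class and method names are hypothetical and this is not how EdenFS implements it:

```python
import time


class DelayedUnloader:
    """Toy model: unload an inode only after a grace period has passed."""

    def __init__(self, delay_seconds, clock=time.monotonic):
        self.delay_seconds = delay_seconds
        self.clock = clock
        self.pending = {}  # inode number -> time it was marked for unloading

    def mark_unlinked(self, ino):
        # Remember when the inode was removed (unlink, rename, rmdir, checkout).
        # Only explicitly marked inodes are ever unloaded.
        self.pending.setdefault(ino, self.clock())

    def collect(self):
        """Return and forget the inodes whose grace period has expired."""
        now = self.clock()
        expired = [
            ino for ino, marked_at in self.pending.items()
            if now - marked_at >= self.delay_seconds
        ]
        for ino in expired:
            del self.pending[ino]
        return expired
```

Because only marked inodes are collected, inodes that were unlinked by a crash or bug would stay in the overlay and remain recoverable.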
Reviewed By: xavierd
Differential Revision: D30144901
fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
Summary:
Deleting a non-existent file is allowed, and deleting a file when a directory
with the same name exists is silently ignored.
This diff adds tests to cover these cases. Overall it seems like a bug, but I'm
not sure it's worth fixing - who knows if we have bonsai changesets that rely on
that!
Reviewed By: yancouto
Differential Revision: D30990826
fbshipit-source-id: b04992817469abe2fa82056c4fddac3689559855
Summary:
This method allows appending a value instead of just replacing it.
It will be used in the next diff when we derive manifest for a stack of commits
in one go.
Reviewed By: yancouto
Differential Revision: D30989889
fbshipit-source-id: dd9a574609b4d289c01d6eebcc6f5c76a973a96b
Summary:
The NFS protocol needs to know if a read reached the end-of-file to avoid a
subsequent read and thus reduce the chattiness of the protocol.
On top of avoiding RPC calls, this should also halve the amount of data read
from Mercurial due to the BlobCache freeing the in-memory cached blob when the
FS has read the file in its entirety. This meant that the second READ would
always force the blob to be reloaded from the Mercurial store, which would also
force that blob to be kept in memory until being evicted (due to it not being
fully read).
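A minimal sketch of how a READ reply can carry an end-of-file flag, using an in-memory bytes object as the file. The function name `read_with_eof` is hypothetical:

```python
def read_with_eof(blob: bytes, offset: int, size: int):
    """Return (data, eof) for a read of up to `size` bytes at `offset`."""
    data = blob[offset:offset + size]
    # eof is set once this read reaches the end of the file, so the client
    # does not need to issue a second READ just to discover there is no more.
    eof = offset + len(data) >= len(blob)
    return data, eof
```

With the flag set on the read that consumes the last byte, a whole-file read completes in one round trip instead of two.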
Reviewed By: kmancini
Differential Revision: D30871422
fbshipit-source-id: 8acf4e21ea22b2dfd7f81d2fdd1b137a6e90cc8f
Summary:
Changes:
- Limit simultaneous open git-repo objects to the number of CPUs.
- Put a semaphore limit so we wait inside tokio::task domain instead of tokio::blocking domain (the latter is more expensive and has a hard upper limit).
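The same pattern, sketched in Python's asyncio as an analogue of the tokio code (not the code from this diff). The name `open_all` and its parameters are hypothetical:

```python
import asyncio
import os


async def open_all(paths, open_blocking, limit=None):
    """Run open_blocking(path) for every path, at most `limit` at a time."""
    limit = limit or os.cpu_count() or 1
    sem = asyncio.Semaphore(limit)

    async def open_one(path):
        # Wait for a permit as a cheap async task; a worker thread is only
        # taken once the permit has been acquired.
        async with sem:
            return await asyncio.to_thread(open_blocking, path)

    return await asyncio.gather(*(open_one(p) for p in paths))
```

Queued work waits on the semaphore rather than sitting in the blocking thread pool, which is the point of the second change above.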
Reviewed By: mitrandir77
Differential Revision: D30976034
fbshipit-source-id: 3432983b5650bac6aa5178d98d8fd241398aa682
Summary:
This allows the mononoke_api user to choose whether the skiplists
should be used to speed up the ancestry checks or not.
The skiplists crate is already prepared for the situation where skiplist
entries are missing and traverses the graph then.
Reviewed By: yancouto
Differential Revision: D30958909
fbshipit-source-id: 7773487b78ac6641fa2a427c55f679b49f99ac8d
Summary:
Allow the mononoke_api user to choose whether they want
operations to be sped up using WBC or not.
Reviewed By: yancouto
Differential Revision: D30958908
fbshipit-source-id: 038cf77735e7c655f6801d714762e316b6817df5
Summary:
Some crates like mononoke_api depend on warm bookmark cache to speed up the
bookmark operations. This prevents them from being used in cases requiring
quick and low overhead startup like CLIs.
This diff makes it possible to swap out the warm bookmark cache for an
implementation that doesn't cache anything. (See next diffs to see how it's
used in mononoke_api crate).
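A rough illustration of the swap, with a plain dict standing in for the bookmark store. Both class names are hypothetical and are not the Mononoke types:

```python
class CachingBookmarks:
    """Pays an up-front cost: snapshots every bookmark at construction."""

    def __init__(self, store):
        self.snapshot = dict(store)

    def get(self, name):
        return self.snapshot.get(name)


class PassthroughBookmarks:
    """Caches nothing: cheap to construct, every lookup goes to the store."""

    def __init__(self, store):
        self.store = store

    def get(self, name):
        return self.store.get(name)


def resolve(bookmarks, name):
    # Callers depend only on get(), so either implementation can be plugged in.
    return bookmarks.get(name)
```

A CLI that does a single lookup would pick the passthrough variant and skip the warm-up entirely.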
Reviewed By: yancouto
Differential Revision: D30958910
fbshipit-source-id: 4d09367217a66f59539b566e48c8d271b8cc8c8e
Summary:
This method was added before the more generic list method was added. Let's get
rid of it for simplicity and to discourage listing all the bookmarks.
Reviewed By: yancouto
Differential Revision: D30958911
fbshipit-source-id: f4518da3f34591c313657161f69af96d15482e6c
Summary:
0.4.24 is incompatible with crates that use `deny(warnings)` on a compiler 1.55.0 or newer.
Example error:
```
error: unused borrow that must be used
--> common/rust/shed/futures_ext/src/stream/return_remainder.rs:22:1
|
22 | #[pin_project]
| ^^^^^^^^^^^^^^
|
= note: this error originates in the derive macro `::pin_project::__private::__PinProjectInternalDerive` (in Nightly builds, run with -Z macro-backtrace for more info)
```
The release notes for 0.4.28 call out this issue. https://github.com/taiki-e/pin-project/releases/tag/v0.4.28
Reviewed By: krallin
Differential Revision: D30858380
fbshipit-source-id: 98e98bcb5a6b795b93ed1efd706a1711f15c57db
Summary:
Move optional line handling logic into a separate function and simplify.
This diff is intended to be a pure refactoring with no observable changes in behavior. In particular, all the code dealing with the "optional" list appears to be dead code because if the line is optional, linematch will return "retry", so that branch is never reachable.
Reviewed By: DurhamG
Differential Revision: D30849757
fbshipit-source-id: 17283f9217466b3f85d913da66222b9a6779abe4
Summary:
This line was iterating over a list of files and looking in the
manifest for each one. This results in serial manifest reads which can result in
serial network requests.
Let's instead use manifest.matches() to test them all at once via the underlying
BFS, which does bulk fetching.
Differential Revision: D30938359
fbshipit-source-id: 1af7d417288b82efdd537a4afeaf93c1b55eaf49
Summary:
Demonstrate issues with the vertex to path resolution. Basically, the vertex to
path resolution logic did not check if the "parent of merge" being used is
actually valid (is an ancestor of provided heads) or not.
Reviewed By: DurhamG
Differential Revision: D30911150
fbshipit-source-id: 83d215910d5ba67ac0d5749927018a7aefcc6730
Summary:
The tree metadata fetching evolution goes as follows:
(1) (commit, path) scs query
(2) tree manifest scs query [we are here]
(3) eden api manifest query [in development]
Option (1) is no longer used and is the only place that required the scs proxy hash.
Removing it will simplify the transition from (2) to (3) and also cleans up a bunch of unused code.
It also comes with a minor performance improvement, saving about 5% on file access time.
To be precise, this is measured by running fsprobe [this is probably too little to measure in a high-noise benchmark like running arc focus]:
```
fsprobe.sh run cat.targets --parallel 24
```
Results:
```
W/ scshash:
P24: 0.1044 0.1007 0.1005 (hot) 0.1019 avg
W/o scshash:
P24: 0.0954 0.0964 0.1008 (hot) 0.0975 avg
```
This performance improvement comes from the fact that, even though the scs hash was never created or used, we still attempted to load it from the scs table, and even though this load always failed, it contributed to execution time.
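For reference, the averages and the relative saving can be recomputed from the three P24 samples in each run. This is only a sketch of the arithmetic; the variable names are illustrative:

```python
with_scshash = [0.1044, 0.1007, 0.1005]
without_scshash = [0.0954, 0.0964, 0.1008]


def mean(samples):
    return sum(samples) / len(samples)


avg_with = mean(with_scshash)
avg_without = mean(without_scshash)
# Relative saving, measured against the run that still loads the scs hash.
saving = (avg_with - avg_without) / avg_with

print(f"{avg_with:.4f} {avg_without:.4f} {saving:.1%}")  # prints "0.1019 0.0975 4.3%"
```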
Reviewed By: xavierd
Differential Revision: D30942663
fbshipit-source-id: af84f1e5658e7d8d9fb6853cbb88f02b49cd050b
Summary: File access latency can actually be less than 1 ms, so it's good to show more digits
Reviewed By: DurhamG
Differential Revision: D30942905
fbshipit-source-id: 2fc8d48dbc08c55b89d829d1474ae11c2c3df1c3
Summary:
Since fsprobe itself requires a 'plan' to run, we need a separate script to standardize the list of plans we think are relevant.
This script allows generating fsprobe plans and running them.
Reviewed By: DurhamG
Differential Revision: D30908892
fbshipit-source-id: eb722fe1f6d982e42b66614f08bc73345e04f9e6
Summary:
We got errors like:
error.IndexedLogError: "repo/.hg/store/lfs/pointers/meta": when reading LogMetadata
in log::OpenOptions::open(Filesystem("repo/.hg/store/lfs/pointers"))
Caused by 1 errors:
- failed to fill whole buffer
from Sandcastle. There seems to be no easy way to get a sample of the broken `meta`
file content. Let's include the file content to make progress on debugging.
Reviewed By: DurhamG
Differential Revision: D30939737
fbshipit-source-id: ccd77f6b67e4aaf75af2248118845fd5b3434ff1
Summary: This `allow` is no longer needed.
Reviewed By: yancouto
Differential Revision: D30859520
fbshipit-source-id: 36b810a72a28af25513404739bccf471e380cdf1
Summary: Update TreeStore to use CommonFetchState and update TreeStore and BackingStore to use the other utility types already in use for files (`StoreTree`, `FetchResults`, etc).
Reviewed By: andll
Differential Revision: D30739008
fbshipit-source-id: e210b8d76614c762ba127d5f2e26391681da004f
Summary: Adds a utility method for converting a `StoreTree` to a `manifest-tree::Entry`, which wraps an hg manifest blob and provides methods for parsing it as a tree manifest (and a `TryFrom` impl used to convert it to a pre-parsed `manifest::List`, which is used by BackingStore in the next change in this stack).
Reviewed By: andll
Differential Revision: D30859470
fbshipit-source-id: 411e80a14861e0739b0c398290055002b35e59d3
Summary: This change does not add aux data support, so for now the types are a bit useless.
Reviewed By: DurhamG
Differential Revision: D30313314
fbshipit-source-id: 11968199b12c4f870c58c7e939b5c8ed5cd9afea
Summary: More refactoring of scmstore `TreeStore`. Introducing a new `tree` submodule as we'll be adding tree-specific metrics, types, etc. soon (as currently exist for files).
Reviewed By: andll
Differential Revision: D30313460
fbshipit-source-id: f20d3ee62520b1d9ea34ad04eb1880ad9b5a00c3
Summary: Extract out `CommonFetchState` from `FileStore`'s `FetchState`. Currently, direct field access is still used for computing derivations and a few other places, but this will be changed in a later diff.
Reviewed By: DurhamG
Differential Revision: D30308289
fbshipit-source-id: 16d34904412572facc9f51cbd791e30413bfe634
Summary: Don't show progress bars for pending HTTP requests until they actually start running, so that the user always sees progress bars from active transfers.
Reviewed By: quark-zju
Differential Revision: D30914241
fbshipit-source-id: ca2f85af055dc9324123d0f9cc765f42d3b36ad4
Summary: Add a new `first_activity` event to the `Response` event listeners that fires the first time we detect nonzero progress for either uploading or downloading. This is useful for situations where requests are queued and we want to be notified when the request becomes active (e.g., to register progress bars).
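A small sketch of the "fire once on first nonzero progress" behaviour. The class and method names are hypothetical and do not mirror the real `Response` listener API:

```python
class ProgressTracker:
    """Calls registered listeners the first time any progress is observed."""

    def __init__(self):
        self._listeners = []
        self._fired = False

    def on_first_activity(self, callback):
        self._listeners.append(callback)

    def update(self, downloaded, uploaded):
        # Queued requests report zero progress; fire only on the first
        # nonzero upload or download count, and never again after that.
        if not self._fired and (downloaded > 0 or uploaded > 0):
            self._fired = True
            for callback in self._listeners:
                callback()
```

A progress bar can be registered inside the callback, so bars only appear for requests that have actually started transferring.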
Reviewed By: DurhamG
Differential Revision: D30914242
fbshipit-source-id: 83445724ed81e77ac25954b644e6bbafcbe5cadb
Summary: This adds inode number to NFS trace event so that we can use it in ActivityRecorder to show the filename of the FS request.
Reviewed By: xavierd
Differential Revision: D30849770
fbshipit-source-id: 580faf5fccb1a225399d9aec843e23eae1874e87
Summary:
We have an option on GlobFiles for listing hidden files, but we don't have a
cli option; the cli defaults to false. Let's pipe this option all the way through
so that we can control this flag from the cli.
Reviewed By: xavierd
Differential Revision: D30915118
fbshipit-source-id: 28b91d4fd2dd4bdf9e342929f570f64db14e8ad0
Summary:
`eden prefetch` and `eden glob` return lists that, despite being called
"matching files", actually contain both files and directories.
In some cases, we only want the list of files, and it introduces unnecessary
overhead on our clients for them to have to stat all the files in the list to
filter out the dirs. Let's add an option to just list files.
Reviewed By: chadaustin
Differential Revision: D30816193
fbshipit-source-id: 6e264142162ce03e560c969a0c0dbbc2f418d7b9
Summary: The error message that currently exists here does not correspond to the command that was run; it's just missing the "redirect" part.
Reviewed By: xavierd
Differential Revision: D30914616
fbshipit-source-id: 866ab7d494b728af13fbb3656edb8740a399755f
Summary:
There's no real hg changeset equivalent of a snapshot, so let's not derive it.
Closes task T97939172
Reviewed By: liubov-dmitrieva
Differential Revision: D30902073
fbshipit-source-id: 8128597c25e12e40e719cdd7800d4b9b792391c9
Summary:
`hg snapshot info` command will be used to get information about the snapshot (similar to `hg show` for commits)
It's still not easy to do this, as we want to have derived data for snapshots, which is still unimplemented.
For now, this makes the command only check if the snapshot exists or not. In the future more functionality will be added (and likely the edenapi endpoint we query will be different).
Reviewed By: liubov-dmitrieva
Differential Revision: D30900088
fbshipit-source-id: 4dc6915d74694a03496c756f03bc073d1a0819f2
Summary: This is a similar diff to D30915090, but for EdenFS.
Differential Revision: D30915126
fbshipit-source-id: 9a718e47237924ebe20176c522a1b1193224236c
Summary:
To eliminate the need for proxy hashes, we need variable-width object
IDs. Introduce an ObjectId type much like RootId.
Reviewed By: genevievehelsel
Differential Revision: D30819412
fbshipit-source-id: 07a185ba6b866b475c92f811e70aa00a8a9f895f
Summary: As a first step to moving the repo name inside the EdenAPI client itself, add it as a (currently unused) field to the config. Later diffs will use this instead of having each method take a `repo` argument.
Reviewed By: quark-zju
Differential Revision: D30746379
fbshipit-source-id: 07957e53e940ce72f84b2297f506b796117ec46a
Summary: We use it as a unique key for the detector
Reviewed By: ginfung
Differential Revision: D30703470
fbshipit-source-id: cb8e7dae5dc4192402530b2cfe564b86aa23c7c8
Summary:
Edenapi lookup (for file content, filenodes and trees): check all the multiplexed blobstores when we check is_present.
This will help us to avoid undesired behaviour for commit cloud blobs that haven't been replicated to all blobstores. Healer currently doesn't check commit cloud blobs.
Reviewed By: StanislavGlebik
Differential Revision: D30839608
fbshipit-source-id: d13cd4500f7b14731d8b75c763c14a698399ba02
Summary:
The new debugscmstorereplay command replays scmstore fetches given an activity log created previously via the scmstore.activity log config parameter.
Replaying activity logs may help to understand or reproduce performance issues related to file fetching. Currently the replay tool ignores all complications such as concurrent fetches or variable backends.
Differential Revision: D30288701
fbshipit-source-id: c6b24acdbd37b5a51ccba3e74e8f074062e880e5