Summary:
Backport: https://github.com/briansmith/ring/pull/1334
This will allow us to unpin the Rust compiler from 1.53.0 and update to 1.55.0.
Reviewed By: xavierd
Differential Revision: D31039024
fbshipit-source-id: f6a9c918e836d93d03c34c77c12bbe63cf7cbe09
Summary:
Previously repo and peer instantiation were in one unified path. This
allowed treating repos and peers somewhat interchangeably. We're moving to a
world where peers and local repos are quite different, so let's separate these
two paths.
This will be useful in the next diff where we remove the file peer, but want to
keep the ability to instantiate local file non-peer repos.
Reviewed By: quark-zju
Differential Revision: D30975887
fbshipit-source-id: 5e676b522c7cfdd5449aeb6a750947dcb023183f
Summary:
We don't use this at Facebook, and most of the tests don't even touch
it anymore. Let's delete it. This will also help us remove our tests'
dependency on hg having server logic, once we also delete sshpeer and filepeer.
This will mean we can't use FB hg to clone from http bitbucket though, which is
probably fine.
Differential Revision: D30970713
fbshipit-source-id: 76d96edfbcb7db2168b4b11bfaf8b487406d7f3d
Summary:
Switch derivation of `blame` to the `DerivedDataManager`.
This is mostly the same as the existing derivation implementation. The main difference is that `blame` derivation using the backfilling config
will use the backfilling config for the unodes that it depends on, too.
Reviewed By: mitrandir77
Differential Revision: D30974102
fbshipit-source-id: 5f69f8c218806bb7606b2af4b831e2104b8440d6
Summary: Why not, right? Fixes a few build warnings that showed up to me while building.
Reviewed By: kulshrax
Differential Revision: D30933487
fbshipit-source-id: 318fbd2c5697914fd0bfa723e678dc710524dc02
Summary: There were already helpers to make this code less copy-pasty, this diff just uses them.
Reviewed By: markbt
Differential Revision: D30933408
fbshipit-source-id: acc27a0904425eccfc71fee884a8f2035ed0c37f
Summary:
We already have a macro to make it easier to create wire representation of hash types, let's use it on `HgId` to reduce copy-pasting.
Changes:
- Added `Ord` implementations to wire hash types, as `WireHgId` used it.
- Added `From`/`Into` implementations between `HgId` and byte arrays, which were used by the macro.
- Changed the `Debug` implementation so it prints hex instead of an actual array of bytes.
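As a hedged sketch of that last bullet (in Python terms for illustration; the actual change is to the Rust `Debug` impl), printing hex instead of a raw byte array might look like:

```python
# Illustrative only: a hash wrapper whose repr shows hex digits rather
# than a raw array of bytes, mirroring the Debug change described above.
class HgId:
    def __init__(self, raw: bytes):
        self.raw = raw

    def __repr__(self):
        # b"\xab\xab..." renders as 'HgId("abab...")' instead of a byte list.
        return f'HgId("{self.raw.hex()}")'
```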
Reviewed By: krallin
Differential Revision: D30933067
fbshipit-source-id: c88911bfc91e44e07f2f658098036b766495d05f
Summary:
I imagine a pretty common case (especially for automation that's trying to keep two clones in sync) will be that you need to restore a snapshot and then restore another snapshot after that.
Currently, this doesn't work very well, as it fails on (some but not all) cases where there are uncommitted changes. It's kind of tedious because, to handle that, you need to run `hg purge && hg revert -a -C`.
This diff adds a `--clean` option to `hg snapshot restore` that will clean the working copy before updating to the given snapshot. Now the command will also fail if you try to update to a snapshot while you have untracked files.
Reviewed By: markbt
Differential Revision: D30903851
fbshipit-source-id: 387eeeee882093389649dc337c861291c35f4b94
Summary:
The `backfill_batch_dangerous` method requires that the caller ensures
that all dependencies of the batch have been derived, otherwise errors,
such as mappings being written out before the things they map to, can
occur.
When the derived data manager takes over batch derivation, it will enforce this
requirement, so that it is no longer dangerous. However, the backfiller tests
were not ensuring the invariant, so the tests would fail with the new derivation
implementation.
Fix the tests by ensuring the parent commits are always derived before a
batch is started. The test is also extended to expose the failure mode
of accidentally deriving batch parents. This will be fixed in the next
commit.
Reviewed By: yancouto
Differential Revision: D30959132
fbshipit-source-id: 8489a5d0b375692a903854294e3810846c9e13de
Summary:
Implement `DerivedUtils` using the `DerivedDataManager`.
This is just for migration. In the future `DerivedUtils` will be replaced by the manager.
Reviewed By: yancouto
Differential Revision: D30944568
fbshipit-source-id: 32376e3b4aeb959e63f66e989a663c21dee30ba5
Summary:
Implement a new version of data derivation in the derived data manager. This is different from the old version in a few ways:
* `derived_data::BonsaiDerivable` is replaced by `derived_data_manager::BonsaiDerivable`. This trait defines both how to perform derivation and how to store and retrieve mapping values. Derivation is performed with reference to the derived data manager, rather than `BlobRepo`.
* The old `Mapping` structs and traits are replaced with a direct implementation in the derived data manager, using the `BonsaiDerivable` trait to handle the derived-data-type-specific parts.
* The new implementation assumes we will stick with parallel derivation, and doesn't implement serial derivation.
Code is copied from the `derived_data` crate, as it is intended to be a replacement once all the derived data types are migrated, and re-using code would create a circular dependency during migration.
This only covers the basic derivation implementation used during production. The derived data manager will also take over backfilling, but that will happen in a later diff.
Reviewed By: yancouto
Differential Revision: D30805046
fbshipit-source-id: b9660dd957fdf762f621b2cb37fc2eea7bf03074
Summary:
The `find_oldest_underived` method of `DerivedUtils` is used outside tests by
exactly one client (the backfiller in tailing mode). Simplify the
`DerivedUtils` trait by extracting this method from the trait, and replacing
with a more general one that will be easier to implement in terms of the
derived data manager.
Reviewed By: yancouto
Differential Revision: D30944567
fbshipit-source-id: a1d408e091d145297241a5eebc02a87155bc3765
Summary:
Split the `BonsaiDerived` type in two:
* `BonsaiDerived` is now just the interface which is used by callers
who want to derive some derived data type. It will be implemented by
both old and new derivation.
* `BonsaiDerivedOld` is the interface that old derivation uses to
determine the default mapping for derivation. This will not be
implemented by new derivation, and will be removed once migration is
complete.
Reviewed By: yancouto
Differential Revision: D30944566
fbshipit-source-id: 5d30a44da22bcf290ed3123844eb712c7b37dea4
Summary:
The builder pattern turned out to be unnecessary, as mappings don't need to be
stored in the manager after all.
Reviewed By: StanislavGlebik
Differential Revision: D30944565
fbshipit-source-id: 4300cdcc871c89f98e42d5b47600ac640b4b94eb
Summary:
Make the derivation process for mercurial filenodes not depend on `BlobRepo`.
Instead, use the repo attributes (`RepoBlobstore` and `Filenodes`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
The existing use of `changesets` for determining the commit's parents is
changed to use the parents from the bonsai changeset. For normal derivation,
the bonsai changeset is already loaded, so this saves a database round-trip.
For batch derivation we currently need to load the changeset, but it should
be in cache anyway, as other derived data types will also have loaded it.
We still need to keep a `BlobRepo` reference at the moment. This is because
filenodes depend on the mercurial derived data. The recursive derivation is
hidden in the call to `repo.get_hg_from_bonsai_changeset`. When derivation
is migrated to the derived data manager, we can replace this with a direct
derivation.
Reviewed By: StanislavGlebik
Differential Revision: D30765254
fbshipit-source-id: 20cc17c2eb611544869e5f1c15d858663cd60fd1
Summary:
Let's give them more descriptive names so that it's easier to understand
what's going on.
Reviewed By: markbt
Differential Revision: D31022612
fbshipit-source-id: 8e4f516f3d0b1cd661b1a8fceba80a8f85a2ed4f
Summary:
This is a new option in split_batch_in_linear_stacks - it either aggregates
file changes from all ancestors in the stack or it doesn't. Currently all of our
callsites want Aggregate, but in the next diff we'll add a new callsite that
doesn't.
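A rough Python sketch of the two aggregation modes (hypothetical names; the real code lives in split_batch_in_linear_stacks):

```python
# For a linear stack of commits (oldest first), either carry every
# ancestor's file changes forward (Aggregate) or keep them per-commit.
def stack_file_changes(stack, aggregate):
    # stack: list of (commit_id, {path: change}) pairs, oldest first
    result, carried = [], {}
    for commit_id, changes in stack:
        if aggregate:
            carried = {**carried, **changes}  # later commits win on a path
            result.append((commit_id, dict(carried)))
        else:
            result.append((commit_id, dict(changes)))
    return result
```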
Reviewed By: markbt
Differential Revision: D31022444
fbshipit-source-id: ce0613863855163f26ab18c7f35142ae569eb31a
Summary:
EdenFS would never log anything when mounting via NFS; let's make it more
visible and easier to grep.
Reviewed By: chadaustin
Differential Revision: D31022158
fbshipit-source-id: 99fd3a04c90526eedf9951ac7c2bcd9e18ef8953
Summary:
This relies on local changes to make it so cargo metadata ACTUALLY finds this
binary: https://github.com/tokio-rs/console/pull/146 is where I try to upstream
it.
Reviewed By: jsgf
Differential Revision: D30944630
fbshipit-source-id: 5d34a32a042f83eff7e7ae7445e23badf10fffe3
Summary: For the time being we don't have checksums in saved states. As a temporary workaround, add the ability to derive the checksum from the naming table.
Differential Revision: D30967637
fbshipit-source-id: 4ac34d988d08c9af6f08f7ce46206f756cf1cf0c
Summary: Watchman is a C++17 project now, so we can use std::optional.
Reviewed By: xavierd
Differential Revision: D30917549
fbshipit-source-id: 95d8ac15d4939a70347336ddfb120ab5025db993
Summary:
Having tons of booleans in a function can be very error-prone from a caller's
perspective; using a structure to pass in the same information can mitigate
some of this issue.
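A minimal Python sketch of the idea (the real change is in EdenFS C++; all names here are made up):

```python
from dataclasses import dataclass

# A call like mount(True, False) is easy to get wrong; a parameter
# object names each flag at the call site.
@dataclass
class MountOptions:
    read_only: bool = False
    do_takeover: bool = False

def mount(opts: MountOptions) -> str:
    mode = "ro" if opts.read_only else "rw"
    return f"mounting {mode}, takeover={opts.do_takeover}"
```

At the call site, `mount(MountOptions(read_only=True))` reads unambiguously, unlike a bare `mount(True, False)`.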
Reviewed By: kmancini
Differential Revision: D30883743
fbshipit-source-id: dcf38d29bfe2cb5155879f7ae4eab5cea31f798a
Summary: Without this bit of information we can't tell where the sync came from (i.e. from which of two repos) so we can't reliably find a commit "source" for a landed commit.
Reviewed By: StanislavGlebik
Differential Revision: D30902774
fbshipit-source-id: d85d0d028fbd6bfb2d64bce89bc7934bad2e242b
Summary:
During an `eden chown`, EdenFS will try to chown both the repository, and the
redirections. In some cases, chowning the redirection can both take a long time
and be unecessary. Consider the case where some automation temporarily chown a
repository to a service user that needs to access the repository, and then
chown it back to the owner of the repository. In that case, changing the
ownership of the redirection is superfluous and unecessary.
Reviewed By: mrkmndz
Differential Revision: D31010912
fbshipit-source-id: a882948005ac4fe29ff465088f196e0fc2bc10be
Summary:
This is a very basic command that uses debug-printing to display all the
request details. In the future we might want to make it more elaborate, but
as it is, it works.
Reviewed By: StanislavGlebik
Differential Revision: D30965076
fbshipit-source-id: 561c64597b94359843e575550be0ae6f39fad7bf
Summary:
This debug command will allow the user to see and interact with currently
running async requests.
Reviewed By: StanislavGlebik
Differential Revision: D30965077
fbshipit-source-id: 259f1af0eb6ade4a34f6004c7b1ad63cd5f0bc9f
Summary:
It makes it a bit hard to do experiments and compare derivation results.
It's easy to compare these types, so let's do it.
Reviewed By: mitrandir77
Differential Revision: D31017823
fbshipit-source-id: 6173bba53c7ee254198e023dde57564fe9c3efed
Summary:
This will be used in the next diffs to add batch derivations for unode.
Also it makes it symmetrical to create_manifest_unode
Reviewed By: mitrandir77
Differential Revision: D31015719
fbshipit-source-id: 65e12901c6a004375c7c0e3b07f1632ac9c6eaa8
Summary:
In some cases (e.g. when master bookmark moves backwards) there might be
commits in segmented changelog that are not ancestors of master. When reseeding
we still want to build segments for these changesets, and this is what this
diff does (see D30898955 for more details about why we want to build segments
for these changesets).
Reviewed By: quark-zju
Differential Revision: D30996484
fbshipit-source-id: 864aaaacfc04d6169afd3d04ebcb6096ae2514e5
Summary:
In D29940980 (2e2b9755cf) we used shlex for a redirect subprocess command line.
The list does not always contain strings though (my guess is that they are
paths), and being a string is a requirement for shlex.quote. We should str()
things before we shlex.quote them.
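A small sketch of the fix (hypothetical helper name):

```python
import shlex

def quote_args(args):
    # shlex.quote expects str; items such as pathlib.Path objects would
    # raise TypeError, so coerce each item with str() first.
    return " ".join(shlex.quote(str(a)) for a in args)
```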
Differential Revision: D31001622
fbshipit-source-id: 2a270781d7f2d84ad7a9a2f9975500b29306cfa8
Summary:
One of the largest contributors to EdenFS memory usage is the internal
IndexedLog buffers that hold data in memory until a threshold is reached. Since
the main benefit of these buffers is to utilize the disk bandwidth, very large
buffers aren't necessary and much smaller ones will be able to achieve similar
results.
A default 50MB buffer is used which will cap the memory usage to 50MB * 3:
- File IndexedLogDataStore
- Tree IndexedLogDataStore
- File LFS
The aux and history stores are also reduced down to 10MB.
Reviewed By: DurhamG
Differential Revision: D30948343
fbshipit-source-id: 74e789856ac995a5672b6aefe8a68c9580f69613
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This means the NFS kernel might have references to inodes after we delete them.
An unknown inode number is not a bug on NFS. It's just stale, so the error should
reflect that.
Reviewed By: xavierd
Differential Revision: D30144898
fbshipit-source-id: 3d448e94aea5acb02908ea443bcf3adae80eb975
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel on when we should dereference them.
This can be disruptive to a user's workflow because open files that were rm'ed
or removed on checkout will no longer have their old content. (On a native
filesystem, or on FUSE, applications that had the file open prior to the removal
would still be able to access it.) For most editors this is not a problem
because they read the file on open (seems fine for vim and vscode from testing).
However, folks could theoretically have a workflow this does not jive with.
Let's make it configurable how often this runs, so users can control how
much we disrupt their workflow.
Reviewed By: xavierd
Differential Revision: D30144899
fbshipit-source-id: 59cf5faea70b3aea216ca2bcb45b96e34f5e72b5
Summary:
NFSv3 has no inode invalidation flow built into the procall. The kernel does not
send us forget messages like we get in FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS can not easily tell when
all handles to a file have been closed.
As is now we never clean up inodes. This is bad for memory & disk usage.
We will never unload an inode so we always keep it in memory once it's created.
Additonally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/
We need to clean up these inodes at somepoint. There are a couple high level
options:
1. Support NFSv4. NFSv4 sends us close messages when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.
NFSv4 is probably the right long term solution. But for now we should be able to
get by with periodic unloads.
I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes).
Option 1 feels a bit too hostile to applications that hold files open.
Option 3 means we will build up a lot of cruft over the course of the day, but is
probably the most application friendly.
I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below), so I am going with it.
This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/rmdir as well, but this would be more expensive.
Batch unloading on checkout seems better to me and should happen
frequently enough to clean up space for people.
There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or has bugs that unlink legit inodes. Normally we let those live in the
overlay so we can go in and recover them. My plan to fix this is to mark inodes
for unloading instead of just unloading all unlinked inodes.
Reviewed By: xavierd
Differential Revision: D30144901
fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
Summary:
Deleting a non-existing file is fine, and deleting a file when a directory
with the same name exists just ignores the delete.
This diff adds tests to cover these cases. Overall it seems like a bug, but I'm
not sure it's worth fixing - who knows if we have bonsai changesets that rely on
that!
Reviewed By: yancouto
Differential Revision: D30990826
fbshipit-source-id: b04992817469abe2fa82056c4fddac3689559855
Summary:
This method allows appending a value instead of just replacing it.
It will be used in the next diff when we derive manifests for a stack of commits
in one go.
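A hedged sketch of the append-vs-replace distinction (structure and names are invented for illustration):

```python
# When deriving manifests for a whole stack of commits in one go, a path
# may receive several values; append keeps them all, replace keeps one.
class PathValues:
    def __init__(self):
        self._values = {}

    def replace(self, path, value):
        self._values[path] = [value]

    def append(self, path, value):
        self._values.setdefault(path, []).append(value)

    def get(self, path):
        return self._values.get(path, [])
```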
Reviewed By: yancouto
Differential Revision: D30989889
fbshipit-source-id: dd9a574609b4d289c01d6eebcc6f5c76a973a96b
Summary:
The NFS protocol needs to know if a read reached the end-of-file to avoid a
subsequent read and thus reduce the chattiness of the protocol.
On top of avoiding RPC calls, this should also halve the amount of data read
from Mercurial due to the BlobCache freeing the in-memory cached blob when the
FS has read the file in its entirety. This meant that the second READ would
always force the blob to be reloaded from the Mercurial store, which would also
force that blob to be kept in memory until being evicted (due to it not being
fully read).
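The EOF rule can be sketched as follows (a simplification of the actual NFSv3 READ reply):

```python
def read_with_eof(data: bytes, offset: int, count: int):
    # Return the requested slice plus an eof flag, so the client does not
    # need a second READ just to discover it hit the end of the file.
    chunk = data[offset:offset + count]
    eof = offset + len(chunk) >= len(data)
    return chunk, eof
```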
Reviewed By: kmancini
Differential Revision: D30871422
fbshipit-source-id: 8acf4e21ea22b2dfd7f81d2fdd1b137a6e90cc8f
Summary:
Changes:
- Limit simultaneous open git-repo objects by the number of CPUs.
- Put a semaphore limit so we wait inside the tokio::task domain instead of the tokio::blocking domain (the latter is more expensive and has a hard upper limit).
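A hedged Python analogue of the semaphore idea (the real code uses tokio in Rust; names here are invented):

```python
import asyncio
import os

# Size the semaphore to the CPU count so excess callers queue as cheap
# async tasks instead of piling up in the bounded blocking thread pool.
REPO_SEM = asyncio.Semaphore(os.cpu_count() or 1)

async def with_repo(open_repo, work):
    async with REPO_SEM:  # wait here, in async-land
        repo = await asyncio.to_thread(open_repo)  # blocking open, bounded
        return work(repo)
```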
Reviewed By: mitrandir77
Differential Revision: D30976034
fbshipit-source-id: 3432983b5650bac6aa5178d98d8fd241398aa682
Summary:
This allows the mononoke_api user to choose whether the skiplists
should be used to speed up the ancestry checks or not.
The skiplists crate is already prepared for the situation where skiplist
entries are missing, and traverses the graph in that case.
Reviewed By: yancouto
Differential Revision: D30958909
fbshipit-source-id: 7773487b78ac6641fa2a427c55f679b49f99ac8d
Summary:
Allow the mononoke_api user to choose whether they want
operations to be sped up using WBC or not.
Reviewed By: yancouto
Differential Revision: D30958908
fbshipit-source-id: 038cf77735e7c655f6801d714762e316b6817df5
Summary:
Some crates like mononoke_api depend on warm bookmark cache to speed up the
bookmark operations. This prevents them from being used in cases requiring
quick and low overhead startup like CLIs.
This diff makes it possible to swap out the warm bookmark cache for an
implementation that doesn't cache anything. (See next diffs to see how it's
used in mononoke_api crate).
Reviewed By: yancouto
Differential Revision: D30958910
fbshipit-source-id: 4d09367217a66f59539b566e48c8d271b8cc8c8e
Summary:
This method was added before the more generic list method was added. Let's get
rid of it for simplicity and to discourage listing all the bookmarks.
Reviewed By: yancouto
Differential Revision: D30958911
fbshipit-source-id: f4518da3f34591c313657161f69af96d15482e6c
Summary:
0.4.24 is incompatible with crates that use `deny(warnings)` on a compiler 1.55.0 or newer.
Example error:
```
error: unused borrow that must be used
--> common/rust/shed/futures_ext/src/stream/return_remainder.rs:22:1
|
22 | #[pin_project]
| ^^^^^^^^^^^^^^
|
= note: this error originates in the derive macro `::pin_project::__private::__PinProjectInternalDerive` (in Nightly builds, run with -Z macro-backtrace for more info)
```
The release notes for 0.4.28 call out this issue. https://github.com/taiki-e/pin-project/releases/tag/v0.4.28
Reviewed By: krallin
Differential Revision: D30858380
fbshipit-source-id: 98e98bcb5a6b795b93ed1efd706a1711f15c57db
Summary:
Move optional line handling logic into a separate function and simplify.
This diff is intended to be a pure refactoring with no observable changes in behavior. In particular, all the code dealing with the "optional" list appears to be dead code because if the line is optional, linematch will return "retry", so that branch is never reachable.
Reviewed By: DurhamG
Differential Revision: D30849757
fbshipit-source-id: 17283f9217466b3f85d913da66222b9a6779abe4
Summary:
This line was iterating over a list of files and looking in the
manifest for each one. This results in serial manifest reads, which can turn
into serial network requests.
Let's instead use manifest.matches() to test them all at once via the underlying
BFS, which does bulk fetching.
Differential Revision: D30938359
fbshipit-source-id: 1af7d417288b82efdd537a4afeaf93c1b55eaf49
Summary:
Demonstrate issues with the vertex to path resolution. Basically, the vertex to
path resolution logic did not check if the "parent of merge" being used is
actually valid (is an ancestor of provided heads) or not.
Reviewed By: DurhamG
Differential Revision: D30911150
fbshipit-source-id: 83d215910d5ba67ac0d5749927018a7aefcc6730