Summary:
Switch derivation of `blame` to the `DerivedDataManager`.
This is mostly the same as the existing derivation implementation. The main difference is that when `blame` is derived using the backfilling config,
the unodes it depends on are also derived using the backfilling config.
Reviewed By: mitrandir77
Differential Revision: D30974102
fbshipit-source-id: 5f69f8c218806bb7606b2af4b831e2104b8440d6
Summary: Why not, right? Fixes a few build warnings that showed up while building.
Reviewed By: kulshrax
Differential Revision: D30933487
fbshipit-source-id: 318fbd2c5697914fd0bfa723e678dc710524dc02
Summary: There were already helpers to make this code less copy-pasty, this diff just uses them.
Reviewed By: markbt
Differential Revision: D30933408
fbshipit-source-id: acc27a0904425eccfc71fee884a8f2035ed0c37f
Summary:
We already have a macro to make it easier to create wire representation of hash types, let's use it on `HgId` to reduce copy-pasting.
Changes:
- Added `Ord` implementations to wire hash types, as `WireHgId` used it.
- Added `From`/`Into` implementations between `HgId` and byte arrays, which are used by the macro.
- Changed the `Debug` implementation so it prints hex instead of a raw array of bytes.
Reviewed By: krallin
Differential Revision: D30933067
fbshipit-source-id: c88911bfc91e44e07f2f658098036b766495d05f
Summary:
I imagine a pretty common case (especially for automation that's trying to keep two clones in sync) will be restoring a snapshot and then restoring another snapshot after that.
Currently, this doesn't work very well, as it fails in some (but not all) cases where there are uncommitted changes. Handling that is tedious because you need to run `hg purge && hg revert -a -C`.
This diff adds a `--clean` option to `hg snapshot restore` that will clean the working copy before updating to the given snapshot. The command will now also fail if you try to update to a snapshot while you have untracked files.
Reviewed By: markbt
Differential Revision: D30903851
fbshipit-source-id: 387eeeee882093389649dc337c861291c35f4b94
Summary:
The `backfill_batch_dangerous` method requires that the caller ensures
that all dependencies of the batch have been derived, otherwise errors,
such as mappings being written out before the things they map to, can
occur.
When the derived data manager takes over batch derivation, it will enforce this
requirement, so that it is no longer dangerous. However, the backfiller tests
were not ensuring this invariant, so they would fail with the new derivation
implementation.
Fix the tests by ensuring the parent commits are always derived before a
batch is started. The test is also extended to expose the failure mode
of accidentally deriving batch parents. This will be fixed in the next
commit.
Reviewed By: yancouto
Differential Revision: D30959132
fbshipit-source-id: 8489a5d0b375692a903854294e3810846c9e13de
Summary:
Implement `DerivedUtils` using the `DerivedDataManager`.
This is just for migration. In the future `DerivedUtils` will be replaced by the manager.
Reviewed By: yancouto
Differential Revision: D30944568
fbshipit-source-id: 32376e3b4aeb959e63f66e989a663c21dee30ba5
Summary:
Implement a new version of data derivation in the derived data manager. This is different from the old version in a few ways:
* `derived_data::BonsaiDerivable` is replaced by `derived_data_manager::BonsaiDerivable`. This trait defines both how to perform derivation and how to store and retrieve mapping values. Derivation is performed with reference to the derived data manager, rather than `BlobRepo`.
* The old `Mapping` structs and traits are replaced with a direct implementation in the derived data manager, using the `BonsaiDerivable` trait to handle the derived-data-type-specific parts.
* The new implementation assumes we will stick with parallel derivation, and doesn't implement serial derivation.
Code is copied from the `derived_data` crate, as it is intended to be a replacement once all the derived data types are migrated, and re-using code would create a circular dependency during migration.
This only covers the basic derivation implementation used during production. The derived data manager will also take over backfilling, but that will happen in a later diff.
Reviewed By: yancouto
Differential Revision: D30805046
fbshipit-source-id: b9660dd957fdf762f621b2cb37fc2eea7bf03074
Summary:
The `find_oldest_underived` method of `DerivedUtils` is used outside tests by
exactly one client (the backfiller in tailing mode). Simplify the
`DerivedUtils` trait by extracting this method from the trait, and replacing
it with a more general one that will be easier to implement in terms of the
derived data manager.
Reviewed By: yancouto
Differential Revision: D30944567
fbshipit-source-id: a1d408e091d145297241a5eebc02a87155bc3765
Summary:
Split the `BonsaiDerived` type in two:
* `BonsaiDerived` is now just the interface which is used by callers
who want to derive some derived data type. It will be implemented by
both old and new derivation.
* `BonsaiDerivedOld` is the interface that old derivation uses to
determine the default mapping for derivation. This will not be
implemented by new derivation, and will be removed once migration is
complete.
Reviewed By: yancouto
Differential Revision: D30944566
fbshipit-source-id: 5d30a44da22bcf290ed3123844eb712c7b37dea4
Summary:
The builder pattern turned out to be unnecessary, as mappings don't need to be
stored in the manager after all.
Reviewed By: StanislavGlebik
Differential Revision: D30944565
fbshipit-source-id: 4300cdcc871c89f98e42d5b47600ac640b4b94eb
Summary:
Make the derivation process for mercurial filenodes not depend on `BlobRepo`.
Instead, use the repo attributes (`RepoBlobstore` and `Filenodes`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
The existing use of `changesets` for determining the commit's parents is
changed to use the parents from the bonsai changeset. For normal derivation,
the bonsai changeset is already loaded, so this saves a database round-trip.
For batch derivation we currently need to load the changeset, but it should
be in cache anyway, as other derived data types will also have loaded it.
We still need to keep a `BlobRepo` reference at the moment. This is because
filenodes depend on the mercurial derived data. The recursive derivation is
hidden in the call to `repo.get_hg_from_bonsai_changeset`. When derivation
is migrated to the derived data manager, we can replace this with a direct
derivation.
Reviewed By: StanislavGlebik
Differential Revision: D30765254
fbshipit-source-id: 20cc17c2eb611544869e5f1c15d858663cd60fd1
Summary:
Let's give them more descriptive names so that it's easier to understand
what's going on.
Reviewed By: markbt
Differential Revision: D31022612
fbshipit-source-id: 8e4f516f3d0b1cd661b1a8fceba80a8f85a2ed4f
Summary:
This is a new option in split_batch_in_linear_stacks - it controls whether
file changes are aggregated from all ancestors in the stack. Currently all of
our callsites want Aggregate, but in the next diff we'll add a new callsite
that doesn't.
Reviewed By: markbt
Differential Revision: D31022444
fbshipit-source-id: ce0613863855163f26ab18c7f35142ae569eb31a
Summary:
EdenFS would never log anything when mounting via NFS, let's make it more
visible and easier to grep.
Reviewed By: chadaustin
Differential Revision: D31022158
fbshipit-source-id: 99fd3a04c90526eedf9951ac7c2bcd9e18ef8953
Summary:
This relies on local changes to make it so cargo metadata actually finds this
binary; https://github.com/tokio-rs/console/pull/146 is where I'm trying to
upstream it.
Reviewed By: jsgf
Differential Revision: D30944630
fbshipit-source-id: 5d34a32a042f83eff7e7ae7445e23badf10fffe3
Summary: For the time being we don't have checksums in saved states. As a temporary workaround, add the ability to derive the checksum from the naming table.
Differential Revision: D30967637
fbshipit-source-id: 4ac34d988d08c9af6f08f7ce46206f756cf1cf0c
Summary: Watchman is a C++17 project now, so we can use std::optional.
Reviewed By: xavierd
Differential Revision: D30917549
fbshipit-source-id: 95d8ac15d4939a70347336ddfb120ab5025db993
Summary:
Having tons of booleans in a function signature can be very error prone from
the caller's perspective; passing the same information in a structure can
mitigate some of this.
Reviewed By: kmancini
Differential Revision: D30883743
fbshipit-source-id: dcf38d29bfe2cb5155879f7ae4eab5cea31f798a
Summary: Without this bit of information we can't tell where the sync came from (i.e. from which of two repos) so we can't reliably find a commit "source" for a landed commit.
Reviewed By: StanislavGlebik
Differential Revision: D30902774
fbshipit-source-id: d85d0d028fbd6bfb2d64bce89bc7934bad2e242b
Summary:
During an `eden chown`, EdenFS will try to chown both the repository and the
redirections. In some cases, chowning the redirections can both take a long
time and be unnecessary. Consider the case where some automation temporarily
chowns a repository to a service user that needs to access it, and then chowns
it back to the owner of the repository. In that case, changing the ownership
of the redirections is superfluous.
Reviewed By: mrkmndz
Differential Revision: D31010912
fbshipit-source-id: a882948005ac4fe29ff465088f196e0fc2bc10be
Summary:
This is a very basic command that uses debug-printing to display all the
request details. In the future we might want to make it more elaborate, but
as it is, it works.
Reviewed By: StanislavGlebik
Differential Revision: D30965076
fbshipit-source-id: 561c64597b94359843e575550be0ae6f39fad7bf
Summary:
This debug command will allow the user to see and interact with currently
running async requests.
Reviewed By: StanislavGlebik
Differential Revision: D30965077
fbshipit-source-id: 259f1af0eb6ade4a34f6004c7b1ad63cd5f0bc9f
Summary:
Not being able to compare these types makes it a bit hard to do experiments
and compare derivation results. They are easy to compare, so let's do it.
Reviewed By: mitrandir77
Differential Revision: D31017823
fbshipit-source-id: 6173bba53c7ee254198e023dde57564fe9c3efed
Summary:
This will be used in the next diffs to add batch derivation for unodes.
It also makes it symmetrical to `create_manifest_unode`.
Reviewed By: mitrandir77
Differential Revision: D31015719
fbshipit-source-id: 65e12901c6a004375c7c0e3b07f1632ac9c6eaa8
Summary:
In some cases (e.g. when master bookmark moves backwards) there might be
commits in segmented changelog that are not ancestors of master. When reseeding
we still want to build segments for these changesets, and this is what this
diff does (see D30898955 for more details about why we want to build segments
for these changesets).
Reviewed By: quark-zju
Differential Revision: D30996484
fbshipit-source-id: 864aaaacfc04d6169afd3d04ebcb6096ae2514e5
Summary:
In D29940980 (2e2b9755cf) we started using shlex for a redirect subprocess
command line. However, the list does not always contain strings, which
`shlex.quote` requires; my guess is that they are paths. We should `str()`
things before we `shlex.quote` them.
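The failure mode can be sketched in a few lines (the paths and argument list are illustrative, not the actual redirect command):

```python
import shlex
from pathlib import PurePosixPath

# Illustrative argument list: one element is a Path, not a str.
args = ["run", PurePosixPath("/tmp/dir with spaces/out.log")]

# Passing a Path-like object directly raises, since shlex.quote
# only accepts strings.
try:
    shlex.quote(args[1])
    raised = False
except TypeError:
    raised = True

# str()-ing each element first, as this diff does, works for all of them.
quoted = " ".join(shlex.quote(str(a)) for a in args)
```

With the `str()` call in place, `quoted` comes out shell-safe.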
Differential Revision: D31001622
fbshipit-source-id: 2a270781d7f2d84ad7a9a2f9975500b29306cfa8
Summary:
One of the largest contributors to EdenFS memory usage is the internal
IndexedLog buffers that hold data in memory until a threshold is reached. Since
the main benefit of these buffers is to utilize the disk bandwidth, very large
buffers aren't necessary and much smaller ones will be able to achieve similar
results.
A default 50MB buffer is used which will cap the memory usage to 50MB * 3:
- File IndexedLogDataStore
- Tree IndexedLogDataStore
- File LFS
The aux and history stores are also reduced down to 10MB.
Reviewed By: DurhamG
Differential Revision: D30948343
fbshipit-source-id: 74e789856ac995a5672b6aefe8a68c9580f69613
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel about when we should dereference them.
This means the NFS kernel might have references to inodes after we delete them.
An unknown inode number is not a bug on NFS. It's just stale, so the error should
reflect that.
Reviewed By: xavierd
Differential Revision: D30144898
fbshipit-source-id: 3d448e94aea5acb02908ea443bcf3adae80eb975
Summary:
We periodically need to dereference inodes on NFS because we get no other info
from the kernel about when we should dereference them.
This can be disruptive to a user's workflow because open files that were rm'ed
or removed on checkout will no longer have their old content. (On a native
filesystem or FUSE, applications that had the file open prior to the removal
would still be able to access it.) For most editors this is not a problem
because they read the file on open (vim and vscode seem fine from testing).
However, folks could theoretically have a workflow this does not play well with.
Let's make it configurable how often this runs, so users can control how
much we disrupt their workflow.
Reviewed By: xavierd
Differential Revision: D30144899
fbshipit-source-id: 59cf5faea70b3aea216ca2bcb45b96e34f5e72b5
Summary:
NFSv3 has no inode invalidation flow built into the protocol. The kernel does not
send us forget messages like we get in FUSE. The kernel also does not send us
notifications when a file is closed. Thus EdenFS cannot easily tell when
all handles to a file have been closed.
As it is now, we never clean up inodes. This is bad for memory & disk usage.
We will never unload an inode, so we always keep it in memory once it's created.
Additionally, we never remove a materialized inode from the overlay. This means
we have unbounded memory and disk usage :/
We need to clean up these inodes at some point. There are a couple of high-level
options:
1. Support NFSv4. NFSv4 sends us a close message when a file handle is closed.
This would allow us to actually keep track of reference counts on an inode.
However, this is a lot of work. There are a lot of other things we would have to
support before we can move to NFSv4.
2. Run background inode cleanups.
NFSv4 is probably the right long-term solution, but for now we should be able to
get by with periodic unloads.
I considered a couple of options for unloads:
1. Unload inodes immediately when files are removed.
2. Delay cleaning up inodes until a while after they are removed. (i.e. clean
up inodes n seconds after an `unlink`, `rename`, `rmdir`, or checkout)
3. Run periodic inode unloading. (i.e. once a day unload inodes)
Option 1 feels a bit too hostile to applications that hold files open.
Option 3 means we will build up a lot of cruft over the course of the day, but is
probably the most application friendly.
I decided to try out option 2 first and see if it works well with the common
developer tools. It seems to work (see below), so I am going with it.
This diff only does inode cleanup after checkout. We might want to run inode
cleanup after unlink/rmdir as well, but this would be more expensive. Batch
unloading on checkout seems better to me and should happen frequently enough
to clean up space for people.
There is one known "broken" behavior in this diff. We unload all unlinked
inodes, which means we will erase more inodes than we should. Sometimes EdenFS
crashes or bugs out and unlinks legitimate inodes. Normally we let those live
in the overlay so we can go in and recover them. My plan to fix this is to mark
inodes for unloading instead of just unloading all unlinked inodes.
Reviewed By: xavierd
Differential Revision: D30144901
fbshipit-source-id: 345d0c04aa386e9fb2bd40906d6f8c41569c1d05
Summary:
Deleting a non-existent file is fine, and deleting a file when a directory
with the same name exists just ignores the delete.
This diff adds tests to cover these cases. Overall it seems like a bug, but I'm
not sure it's worth fixing - who knows if we have bonsai changesets that rely on
that!
Reviewed By: yancouto
Differential Revision: D30990826
fbshipit-source-id: b04992817469abe2fa82056c4fddac3689559855
Summary:
This method allows appending a value instead of just replacing it.
It will be used in the next diff when we derive manifest for a stack of commits
in one go.
Reviewed By: yancouto
Differential Revision: D30989889
fbshipit-source-id: dd9a574609b4d289c01d6eebcc6f5c76a973a96b
Summary:
The NFS protocol needs to know if a read reached the end-of-file to avoid a
subsequent read and thus reduce the chattiness of the protocol.
On top of avoiding RPC calls, this should also halve the amount of data read
from Mercurial due to the BlobCache freeing the in-memory cached blob when the
FS has read the file in its entirety. This meant that the second READ would
always force the blob to be reloaded from the Mercurial store, which would also
force that blob to be kept in memory until being evicted (due to it not being
fully read).
Reviewed By: kmancini
Differential Revision: D30871422
fbshipit-source-id: 8acf4e21ea22b2dfd7f81d2fdd1b137a6e90cc8f
Summary:
Changes:
- Limit simultaneous open git-repo objects to the number of CPUs.
- Put a semaphore limit in place so we wait inside the tokio::task domain instead of the tokio::blocking domain (the latter is more expensive and has a hard upper limit).
Reviewed By: mitrandir77
Differential Revision: D30976034
fbshipit-source-id: 3432983b5650bac6aa5178d98d8fd241398aa682
Summary:
This allows the mononoke_api user to choose whether the skiplists
should be used to speed up the ancestry checks or not.
The skiplists crate is already prepared for the situation where skiplist
entries are missing and traverses the graph then.
Reviewed By: yancouto
Differential Revision: D30958909
fbshipit-source-id: 7773487b78ac6641fa2a427c55f679b49f99ac8d
Summary:
Allow the mononoke_api user to choose whether they want
operations to be sped up using WBC or not.
Reviewed By: yancouto
Differential Revision: D30958908
fbshipit-source-id: 038cf77735e7c655f6801d714762e316b6817df5
Summary:
Some crates like mononoke_api depend on warm bookmark cache to speed up the
bookmark operations. This prevents them from being used in cases requiring
quick and low overhead startup like CLIs.
This diff makes it possible to swap out the warm bookmark cache for an
implementation that doesn't cache anything. (See next diffs to see how it's
used in mononoke_api crate).
Reviewed By: yancouto
Differential Revision: D30958910
fbshipit-source-id: 4d09367217a66f59539b566e48c8d271b8cc8c8e
Summary:
This method was added before the more generic list method was added. Let's get
rid of it for simplicity and to discourage listing all the bookmarks.
Reviewed By: yancouto
Differential Revision: D30958911
fbshipit-source-id: f4518da3f34591c313657161f69af96d15482e6c
Summary:
0.4.24 is incompatible with crates that use `deny(warnings)` on a compiler 1.55.0 or newer.
Example error:
```
error: unused borrow that must be used
--> common/rust/shed/futures_ext/src/stream/return_remainder.rs:22:1
|
22 | #[pin_project]
| ^^^^^^^^^^^^^^
|
= note: this error originates in the derive macro `::pin_project::__private::__PinProjectInternalDerive` (in Nightly builds, run with -Z macro-backtrace for more info)
```
The release notes for 0.4.28 call out this issue. https://github.com/taiki-e/pin-project/releases/tag/v0.4.28
Reviewed By: krallin
Differential Revision: D30858380
fbshipit-source-id: 98e98bcb5a6b795b93ed1efd706a1711f15c57db
Summary:
Move optional line handling logic into a separate function and simplify.
This diff is intended to be a pure refactoring with no observable changes in behavior. In particular, all the code dealing with the "optional" list appears to be dead code because if the line is optional, linematch will return "retry", so that branch is never reachable.
Reviewed By: DurhamG
Differential Revision: D30849757
fbshipit-source-id: 17283f9217466b3f85d913da66222b9a6779abe4
Summary:
This line was iterating over a list of files and looking each one up in the
manifest. This results in serial manifest reads, which can turn into
serial network requests.
Let's instead use manifest.matches() to test them all at once via the underlying
BFS, which does bulk fetching.
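The difference can be illustrated with a toy model (not the real Mercurial manifest API; `lookup`/`matches` are stand-ins that count simulated round-trips):

```python
fetch_count = 0  # counts simulated network round-trips

TOY_MANIFEST = {"a/1.txt": "n1", "a/2.txt": "n2", "b/3.txt": "n3"}

def lookup(path):
    # Per-file lookup: one round-trip per file.
    global fetch_count
    fetch_count += 1
    return TOY_MANIFEST.get(path)

def matches(paths):
    # Bulk match: a single BFS walk answers all paths at once.
    global fetch_count
    fetch_count += 1
    wanted = set(paths)
    return {p: n for p, n in TOY_MANIFEST.items() if p in wanted}

files = ["a/1.txt", "b/3.txt"]
serial = {p: lookup(p) for p in files}  # len(files) round-trips
bulk = matches(files)                   # 1 round-trip
```

Both approaches return the same mapping; the bulk path just does it in one walk instead of one fetch per file.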
Differential Revision: D30938359
fbshipit-source-id: 1af7d417288b82efdd537a4afeaf93c1b55eaf49
Summary:
Demonstrate issues with the vertex to path resolution. Basically, the vertex to
path resolution logic did not check whether the "parent of merge" being used is
actually valid (i.e. an ancestor of the provided heads) or not.
Reviewed By: DurhamG
Differential Revision: D30911150
fbshipit-source-id: 83d215910d5ba67ac0d5749927018a7aefcc6730
Summary:
The tree metadata fetching evolution goes as follow
(1) (commit, path) scs query
(2) tree manifest scs query [we are here]
(3) eden api manifest query [in development]
Option (1) is no longer used and is the only place that required the scs proxy hash.
Removing it will simplify the transition from (2) to (3) and also cleans up a bunch of unused code.
It also comes with a minor performance improvement, saving about 5% on file access time.
To be precise, this is measured by running fsprobe [this is probably too little to measure in a high-noise benchmark like running arc focus]:
```
fsprobe.sh run cat.targets --parallel 24
```
Results:
```
W/ scshash:
P24: 0.1044 0.1007 0.1005 (hot) 0.1019 avg
W/o scshash:
P24: 0.0954 0.0964 0.1008 (hot) 0.0975 avg
```
This performance improvement comes from the fact that even though the scs hash was never created or used, we still attempted to load it from the scs table, and even though this load always failed it contributed to execution time.
Reviewed By: xavierd
Differential Revision: D30942663
fbshipit-source-id: af84f1e5658e7d8d9fb6853cbb88f02b49cd050b
Summary: File access latency can actually be less than 1 ms, so it's good to show more digits.
Reviewed By: DurhamG
Differential Revision: D30942905
fbshipit-source-id: 2fc8d48dbc08c55b89d829d1474ae11c2c3df1c3
Summary:
Since fsprobe itself requires a 'plan' to run, we need a separate script to standardize the list of plans we think are relevant.
This script allows generating fsprobe plans and running them.
Reviewed By: DurhamG
Differential Revision: D30908892
fbshipit-source-id: eb722fe1f6d982e42b66614f08bc73345e04f9e6
Summary:
We got errors like:
```
error.IndexedLogError: "repo/.hg/store/lfs/pointers/meta": when reading LogMetadata
in log::OpenOptions::open(Filesystem("repo/.hg/store/lfs/pointers"))
Caused by 1 errors:
- failed to fill whole buffer
```
from Sandcastle. There seems to be no easy way to get a sample of the broken `meta`
file content. Let's include the file content to make progress on debugging.
Reviewed By: DurhamG
Differential Revision: D30939737
fbshipit-source-id: ccd77f6b67e4aaf75af2248118845fd5b3434ff1
Summary: This `allow` is no longer needed.
Reviewed By: yancouto
Differential Revision: D30859520
fbshipit-source-id: 36b810a72a28af25513404739bccf471e380cdf1
Summary: Update TreeStore to use CommonFetchState and update TreeStore and BackingStore to use the other utility types already in use for files (`StoreTree`, `FetchResults`, etc).
Reviewed By: andll
Differential Revision: D30739008
fbshipit-source-id: e210b8d76614c762ba127d5f2e26391681da004f
Summary: Adds a utility method for converting a `StoreTree` to a `manifest-tree::Entry`, which wraps an hg manifest blob and provides methods for parsing it as a tree manifest (and a `TryFrom` impl used to convert it to a pre-parsed `manifest::List`, which is used by BackingStore in the next change in this stack).
Reviewed By: andll
Differential Revision: D30859470
fbshipit-source-id: 411e80a14861e0739b0c398290055002b35e59d3
Summary: This change does not add aux data support, so for now the types are a bit useless.
Reviewed By: DurhamG
Differential Revision: D30313314
fbshipit-source-id: 11968199b12c4f870c58c7e939b5c8ed5cd9afea
Summary: More refactoring of scmstore `TreeStore`. Introducing a new `tree` submodule as we'll be adding tree-specific metrics, types, etc. soon (as currently exist for files).
Reviewed By: andll
Differential Revision: D30313460
fbshipit-source-id: f20d3ee62520b1d9ea34ad04eb1880ad9b5a00c3
Summary: Extract out `CommonFetchState` from `FileStore`'s `FetchState`. Currently, direct field access is still used for computing derivations and a few other places, but this will be changed in a later diff.
Reviewed By: DurhamG
Differential Revision: D30308289
fbshipit-source-id: 16d34904412572facc9f51cbd791e30413bfe634
Summary: Don't show progress bars for pending HTTP requests until they actually start running, so that the user always sees progress bars from active transfers.
Reviewed By: quark-zju
Differential Revision: D30914241
fbshipit-source-id: ca2f85af055dc9324123d0f9cc765f42d3b36ad4
Summary: Add a new `first_activity` event to the `Response` event listeners that fires the first time we detect nonzero progress for either uploading or downloading. This is useful for situations where requests are queued and we want to be notified when the request becomes active (e.g., to register progress bars).
Reviewed By: DurhamG
Differential Revision: D30914242
fbshipit-source-id: 83445724ed81e77ac25954b644e6bbafcbe5cadb
Summary: This adds inode number to NFS trace event so that we can use it in ActivityRecorder to show the filename of the FS request.
Reviewed By: xavierd
Differential Revision: D30849770
fbshipit-source-id: 580faf5fccb1a225399d9aec843e23eae1874e87
Summary:
We have an option on GlobFiles for listing hidden files, but we don't have a
CLI option; we default to false in the CLI. Let's pipe this option all the way
through so that we can control this flag from the CLI.
Reviewed By: xavierd
Differential Revision: D30915118
fbshipit-source-id: 28b91d4fd2dd4bdf9e342929f570f64db14e8ad0
Summary:
`eden prefetch` and `eden glob` return lists that, despite being called
"matching files", actually contain both files and directories.
In some cases, we only want the list of files, and it introduces unnecessary
overhead on our clients for them to have to stat all the files in the list to
filter out the dirs. Let's add an option to just list files.
Reviewed By: chadaustin
Differential Revision: D30816193
fbshipit-source-id: 6e264142162ce03e560c969a0c0dbbc2f418d7b9
Summary: The error message that currently exists here does not correspond to the command that was run; it's just missing the "redirect" part.
Reviewed By: xavierd
Differential Revision: D30914616
fbshipit-source-id: 866ab7d494b728af13fbb3656edb8740a399755f
Summary:
There's no real equivalent of an hg changeset for a snapshot, so let's not derive it.
Closes task T97939172
Reviewed By: liubov-dmitrieva
Differential Revision: D30902073
fbshipit-source-id: 8128597c25e12e40e719cdd7800d4b9b792391c9
Summary:
`hg snapshot info` command will be used to get information about the snapshot (similar to `hg show` for commits)
It's still not easy to do this, as we want to have derived data for snapshots, which is still unimplemented.
For now, this makes the command only check if the snapshot exists or not. In the future more functionality will be added (and likely the edenapi endpoint we query will be different).
Reviewed By: liubov-dmitrieva
Differential Revision: D30900088
fbshipit-source-id: 4dc6915d74694a03496c756f03bc073d1a0819f2
Summary: This is a similar diff to D30915090, but for EdenFS.
Differential Revision: D30915126
fbshipit-source-id: 9a718e47237924ebe20176c522a1b1193224236c
Summary:
To eliminate the need for proxy hashes, we need variable-width object
IDs. Introduce an ObjectId type much like RootId.
Reviewed By: genevievehelsel
Differential Revision: D30819412
fbshipit-source-id: 07a185ba6b866b475c92f811e70aa00a8a9f895f
Summary: As a first step to moving the repo name inside the EdenAPI client itself, add it as a (currently unused) field to the config. Later diffs will use this instead of having each method take a `repo` argument.
Reviewed By: quark-zju
Differential Revision: D30746379
fbshipit-source-id: 07957e53e940ce72f84b2297f506b796117ec46a
Summary: We use it as a unique key for the detector.
Reviewed By: ginfung
Differential Revision: D30703470
fbshipit-source-id: cb8e7dae5dc4192402530b2cfe564b86aa23c7c8
Summary:
Edenapi lookup (for file content, filenodes and trees): check all the multiplexed blobstores when we check is_present.
This will help us avoid undesired behaviour for commit cloud blobs that haven't been replicated to all blobstores. The healer currently doesn't check commit cloud blobs.
Reviewed By: StanislavGlebik
Differential Revision: D30839608
fbshipit-source-id: d13cd4500f7b14731d8b75c763c14a698399ba02
Summary:
The new debugscmstorereplay command replays scmstore fetches given an activity log created previously via the scmstore.activitylog config parameter.
Replaying activity logs may help to understand or reproduce performance issues related to file fetching. Currently the replay tool ignores all complications such as concurrent fetches or variable backends.
Differential Revision: D30288701
fbshipit-source-id: c6b24acdbd37b5a51ccba3e74e8f074062e880e5
Summary:
The new scmstore.activitylog config knob optionally specifies a file for scmstore to record fetch activity. Currently it only records file fetches, but it is intended to also record tree fetches once that is fully baked.
The purpose of the log is to record file access patterns to help debug command performance. The following commit will include a tool to replay scmstore activity from the log file.
Activity is stored in the log as newline delimited JSON objects. In addition to fetched keys, we also record the start time and duration of each fetch.
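A minimal sketch of that log format (field names here are illustrative; the real record layout may differ):

```python
import io
import json
import time

def record_fetch(fh, keys, fetch):
    # Write one newline-delimited JSON object per fetch, recording the
    # fetched keys plus the start time and duration of the fetch.
    start = time.time()
    fetch(keys)
    fh.write(json.dumps({
        "keys": keys,
        "start": start,
        "duration": time.time() - start,
    }) + "\n")

buf = io.StringIO()  # stands in for the configured log file
record_fetch(buf, ["a/file1", "a/file2"], lambda ks: None)
record_fetch(buf, ["b/file3"], lambda ks: None)

# A replay tool can read the log back one JSON object per line.
events = [json.loads(line) for line in buf.getvalue().splitlines()]
```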
Differential Revision: D30288715
fbshipit-source-id: c40177e95b1f613ebed41e50a476cbf39e6d9364
Summary:
Make it more detailed, especially about corner cases. Avoid ambiguous words
like "valid" etc.
Reviewed By: farnz
Differential Revision: D30876339
fbshipit-source-id: a45ca643c6454645f7729053a7ea5dd78016fc68
Summary:
Same fix as D30874167 (9edb2cafe7), but for hg-server. This was broken in a recent
update.
Reviewed By: yancouto
Differential Revision: D30882520
fbshipit-source-id: 7e23556f619e3ead585e9e756456f30578ff7cab
Summary:
Some time ago (see D25910464 (fca761e153)) we've started using Background session class
while deriving data. This was done to avoid overloading blobstore sync queue - if Background
session class is set then multiplex blobstore waits for all blobstores to
finish instead of writing to the blobstore sync queue right away. However if any of the
blobstores fails then we start writing to the blobstore sync queue. In theory it should have avoided the problem of overloading blobstore sync queue while having the same multiplex reliability (i.e. if only a single blobstore fails the whole multiplex put doesn't fail)
Unfortunately there was a flaw - if the put to a single blobstore wasn't
failing but was just too slow, then the whole multiplexed put operation became
too slow. This diff fixes the flaw by adding a timeout - if a multiplexed put is
taking too long then we fall back to writing entries to the blobstore sync
queue.
Note that I added a new session class - BackgroundUnlessTooSlow -
because I figured that in some cases we are OK with waiting a long time but not
writing to the sync queue. The skiplist builder might be a good example of that -
since it does overwrites, we don't want to write to the blobstore sync
queue at all, because the healer doesn't process overwrites correctly.
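The timeout-fallback idea, sketched in Python with asyncio (the real implementation is Rust; function names and timings here are made up):

```python
import asyncio

sync_queue = []  # stand-in for the blobstore sync queue

async def multiplexed_put(key, delay):
    # Stand-in for waiting on *all* blobstores to acknowledge the write.
    await asyncio.sleep(delay)
    return "all-acked"

async def put_background_unless_too_slow(key, delay, timeout=0.1):
    # Wait for the full multiplex; if it takes too long, fall back to
    # recording the write in the sync queue instead of blocking forever.
    try:
        return await asyncio.wait_for(multiplexed_put(key, delay), timeout)
    except asyncio.TimeoutError:
        sync_queue.append(key)
        return "queued"

fast = asyncio.run(put_background_unless_too_slow("blob1", delay=0.0))
slow = asyncio.run(put_background_unless_too_slow("blob2", delay=5.0))
```

Fast puts still complete without touching the queue; only the slow one falls back.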
Reviewed By: farnz
Differential Revision: D30892377
fbshipit-source-id: 69ac1795002b124e11daac13d8bfe59895191168
Summary: When specifying `HGPLAIN`, only the hash is output, which is easier for automation.
Reviewed By: StanislavGlebik
Differential Revision: D30899254
fbshipit-source-id: 32457c6b92d14305c5b0bafb1217d574ec83a85c
Summary:
I added logging in D30805504 (d5e2624fbb), however it wasn't really logging anything,
because I forgot to pass scuba sample builder to CoreContext (facepalm).
This diff fixes it.
Reviewed By: HarveyHunt
Differential Revision: D30899642
fbshipit-source-id: 6e20f1e84fc96175be8ca7a6f91c0fc61caf8e49
Summary:
It looks like the comment is misleading (we don't really derive anything in
this block, just find underived commits), and this CoreContext override
doesn't seem necessary anymore. Let's remove it.
Reviewed By: farnz
Differential Revision: D30899641
fbshipit-source-id: 2850905891a9bd8b01f3f6fa9ef15c572fc2f07a
Summary:
`createremote` is a slightly inconsistent name.
The reasoning behind it was that this command creates the snapshot on the server side only.
But since actually making the client snapshot-aware is pretty far away, I prefer to name it "create".
Reviewed By: StanislavGlebik
Differential Revision: D30871026
fbshipit-source-id: fde5d65e38249998f71e51b76ccb7d7b6b9bf24d
Summary:
This was a very well-hidden bug that I failed to notice in the integration tests.
It turns out that in serde, enum unit variants are serialized to strings, while struct variants are serialized to dictionaries.
My code assumed it was always a dictionary. And because Python is Python, it worked, as the `in` operator works for both strings and dicts. But `"Deletion" in fc` for strings checks whether "Deletion" is a substring of `fc`, which is also true for `UntrackedDeletion`, so all untracked deletions were treated like normal deletions.
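The pitfall translates directly to Rust's `str::contains`, which - like Python's `in` on strings - is substring containment. A minimal illustration (the helper names here are hypothetical):

```rust
// Buggy check, mirroring Python's `"Deletion" in fc` when fc is a string:
// substring containment matches "UntrackedDeletion" too.
fn is_deletion_buggy(fc: &str) -> bool {
    fc.contains("Deletion")
}

// Fixed check: require the exact variant name.
fn is_deletion_fixed(fc: &str) -> bool {
    fc == "Deletion"
}

fn main() {
    assert!(is_deletion_buggy("UntrackedDeletion")); // false positive
    assert!(!is_deletion_fixed("UntrackedDeletion"));
    assert!(is_deletion_fixed("Deletion"));
}
```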
Reviewed By: StanislavGlebik
Differential Revision: D30868534
fbshipit-source-id: e574de6493bcd8e8d42d6e22da4dc482d083f22d
Summary: the rage summary is getting hard to quickly parse, so this underlines each section header and unifies the underline style (with `eden stats`). This adopts the underline code from `eden du` and makes it a util function so it can be shared.
Differential Revision: D30857773
fbshipit-source-id: 66b5b06f5b0125304d45d3465a8bc2248693b791
Summary: I think it would be helpful to see the path of the inode that causes this check to fail
Reviewed By: kmancini
Differential Revision: D30880645
fbshipit-source-id: 08cad2277484568a6e325b1db7a89a9cf0fe1d3f
Summary: Refresh doesn't have any special meaning outside of packfiles, but in some contexts (like BackingStore) it is used to trigger a flush. Previously I'd implemented refresh as a no-op. With this change, refresh will trigger a flush, and if a contentstore fallback is available, will also forward the refresh call to the contentstore for packfile-specific refresh behavior.
Reviewed By: andll
Differential Revision: D30851223
fbshipit-source-id: 893b4256fd8edc0f60612e61c177d885252cb85b
Summary: This test can fail for not having the `USERNAME` environment variable when the test is run as root. Let's just skip the test when this happens, because it doesn't make sense to drop privileges as root.
Reviewed By: xavierd
Differential Revision: D30868518
fbshipit-source-id: 14ff6db218b1477f5905f2df3ad075a5ca186117
Summary:
Add an endpoint to provide repo configuration information, such as whether
segmented changelog is supported by the server or not. This helps the client
make decisions without hitting actual (expensive) endpoints, and distinguishes
this case from unrelated server errors. It would allow us to remove the
error-prone client-side config that decides whether to use segmented clone.
Reviewed By: krallin
Differential Revision: D30831346
fbshipit-source-id: 872e20a32879e075c75481f622b2a49000059d04
Summary:
In a future diff, we want an endpoint to test if segmented changelog is
supported for a repo without doing any real computation using segmented
changelog. This would be useful for the client to decide whether it can
use segmented changelog clone or not, instead of relying on fragile
per-repo configuration.
Reviewed By: farnz
Differential Revision: D30825920
fbshipit-source-id: 16dc5bf762da2d2b9cd808c129e1830285023f3d
Summary: it would be helpful to see a user's or sandcastle job's eden config, especially in the case of a gated or staged feature rollout.
Differential Revision: D30857763
fbshipit-source-id: ee2a311ee643fc9db5acef1b02017564c51d2362
Summary:
just a typo fix
Created from CodeHub with https://fburl.com/edit-in-codehub
Reviewed By: fanzeyi
Differential Revision: D30849172
fbshipit-source-id: 9779832870c909d080548ec71ecf86aa53767dbc
Summary:
This logic (a Wire and an Api object that are the same object) is used ad hoc in more places (for Vec<u8> and u32).
This diff makes it simpler by using the macro introduced before, and derives it for a lot of basic types (integers and bytes).
Reviewed By: kulshrax
Differential Revision: D30605781
fbshipit-source-id: 7520b529e52cfde0a5c5d17d91f5f85b0633fa7f
Summary:
It's nice to have these functions to open source and target repos.
Previously we always had to get the repo id first, and then call
`open_repo_internal_with_repo_id`.
Reviewed By: yancouto
Differential Revision: D30866314
fbshipit-source-id: dd74822da755de232f4701f8523088e0bb612cb9
Summary:
D30829928 (fd03bff2e2) updated some of Mononoke's integration tests to take into
account whitespace changes. However, it removed the globs from some parts of
the tests.
As the assigned port changes on each test run, the globs are required. Add them
back in, and fix up some whitespace in a test.
Reviewed By: markbt
Differential Revision: D30866884
fbshipit-source-id: 1557eee2143a2459a6412b8649e7e3dce5a607a4
Summary:
It's nice to have a tool that can quickly count and print stats about a commit. I'm
using it now to understand the performance of derived data.
Reviewed By: ahornby
Differential Revision: D30865267
fbshipit-source-id: 26b91c3c05a1c417015b5be228796589348bf064
Summary:
`rust_include_srcs` is supported on `thrift_library` as a way of including other Rust code in the generated crate, generally used to implement other traits on the generated types.
Add support for this in autocargo by copying these files into the output dir and making sure the option is specified to the thrift compiler.
Reviewed By: ahornby
Differential Revision: D30789835
fbshipit-source-id: 325cb59fdf85324dccfff20a559802c11816769f
Summary:
The default Windows encoding can't handle some unicode characters
apparently, so let's use utf-8 by default.
Reviewed By: quark-zju
Differential Revision: D30850982
fbshipit-source-id: 51a7fdf5464d075549afe4f0bcd307c0f2eb7fa0
Summary:
Add Layer impls for Box/Arc<L: Layer> and <dyn Layer>. There is also a pile of other
updates in git which haven't been published to crates.io yet, including proper
level filtering of trace events being fed into log.
Reviewed By: dtolnay
Differential Revision: D30829927
fbshipit-source-id: c01c9369222df2af663e8f8bf59ea78ee12f7866
Summary:
Bump all the versions on crates.io to highest to make migration to github
versions in next diff work.
Reviewed By: dtolnay
Differential Revision: D30829928
fbshipit-source-id: 09567c26f275b3b1806bf8fd05417e91f04ba2ef
Summary:
We don't need to pass the bubble id to the server, it can find it from the changeset id.
This fixes a TODO I added previously, and should make the `restore` command complete.
Reviewed By: ahornby
Differential Revision: D30609423
fbshipit-source-id: d1c8eb0e0556069fa408520a0aea91a0f865fbe1
Summary:
Uses the endpoint added on previous diffs to download the snapshot files to the repo, adding them correctly to the snapshot restore.
This almost completes the `snapshot restore` command; the missing piece is getting the bubble id from the snapshot hash.
Reviewed By: StanislavGlebik
Differential Revision: D30583038
fbshipit-source-id: 6549a52f767c50444c316b358d9704bc4a136934
Summary:
This adds the `downloadfiles` method to the python EdenApi wrapper.
It uses multiple calls to the endpoint added on previous diffs to download each file and place it in the repo. It also deduplicates downloads.
Reviewed By: StanislavGlebik
Differential Revision: D30582638
fbshipit-source-id: 34e864d03c0e48a7605ee8e4e92376881dbb2de9
Summary:
When using hashbinary with a removed/moved file, hg throws `TypeError: object supporting the buffer API required`. This is because we are trying to call `sha1(None)`.
This diff falls back to the `Binary file %s has changed` message when we have a removed file.
Reviewed By: quark-zju
Differential Revision: D30845897
fbshipit-source-id: a3d2b7d11d9c1ca3855140c9abd7550cf7076bbc
Summary: This adds support for FS event logging for NFS. For context, each type of event is assigned a sampling group that determines its sampling rate. In the TraceBus subscription callback, events are sent to `FsEventLogger` to be sampled and logged through `HiveLogger`.
Reviewed By: xavierd
Differential Revision: D30843863
fbshipit-source-id: 65394d31b1197efd69c7fd4c1b24562f5abd5785
Summary:
Previously, it was only possible to register event listeners for request completion on the `HttpClient` itself, rather than on individual `Request`s. This diff adds similar event listeners to `Request`s themselves, so that it's possible to register a callback that fires when any request completes, regardless of whether it was sent via an `HttpClient` or as a one-off.
This is similar to `RequestCreationEventListeners`, which run for the creation of every request, whether or not the request is associated with a client.
Notably, to avoid circular references the new event listeners take a `RequestInfo` argument instead of a `RequestContext` (since the listeners are themselves stored inside the `RequestContext`). In practice, the `RequestInfo` should contain all of the information one might want to access about the request.
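A minimal sketch of why the listeners take a plain info struct: they are stored inside the context, so handing them the context itself would create a reference cycle. All types and fields here are illustrative stand-ins, not the real `http-client` API.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Plain data about a request; safe to hand to listeners.
struct RequestInfo {
    url: String,
    status: u16,
}

struct RequestContext {
    info: RequestInfo,
    // The listeners live inside the context, so they receive only the plain
    // RequestInfo, never the RequestContext that owns them: no cycle.
    on_complete: Vec<Box<dyn Fn(&RequestInfo)>>,
}

impl RequestContext {
    fn new(url: &str) -> Self {
        RequestContext {
            info: RequestInfo { url: url.to_string(), status: 0 },
            on_complete: Vec::new(),
        }
    }

    fn listen(&mut self, f: impl Fn(&RequestInfo) + 'static) {
        self.on_complete.push(Box::new(f));
    }

    fn complete(&mut self, status: u16) {
        self.info.status = status;
        for f in &self.on_complete {
            f(&self.info);
        }
    }
}

// Register a listener, complete the request, return the status it observed.
fn run_demo() -> u16 {
    let seen = Rc::new(Cell::new(0u16));
    let seen2 = seen.clone();
    let mut ctx = RequestContext::new("https://example.com");
    ctx.listen(move |info| seen2.set(info.status));
    ctx.complete(200);
    seen.get()
}

fn main() {
    assert_eq!(run_demo(), 200);
}
```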
Reviewed By: quark-zju
Differential Revision: D30831840
fbshipit-source-id: 77ca9dc5fd9f8fc5ee60319baabd77171af70d45
Summary:
The content store repair binding ate the entire repair message, making
it hard to debug when it wasn't working.
Reviewed By: quark-zju
Differential Revision: D30824740
fbshipit-source-id: 52dbfe79f2dd1568285cda63fb54cacf532aa4a1
Summary:
Make `verify` check the lazy changelog properties:
- Universal id <-> name mappings are known locally.
- Segments are sane (e.g. high-level segments are built from low-level ones and there
are no cycles).
With `--dag`, also check the graph with a newly cloned remote graph.
This just calls the verification logic added in Rust `dag` crate to do the
heavy lifting.
Differential Revision: D30820773
fbshipit-source-id: 8f62f41738c3c8e3fe88442860a83fdb4944f178
Summary:
In certain situations, users may cause EdenFS to falsely return a path-not-found result while the path is actually available. Windows will cache that, causing subsequent accesses to that file to automatically return a file-not-found error.
We currently only invalidate this negative cache during checkout and on rebooting the machine, as the cache is kept even across EdenFS restarts. In this diff, we start invalidating the negative path cache at startup, so if the user ever hits this issue, an `eden restart` will be sufficient to fix it.
Reviewed By: xavierd
Differential Revision: D30814059
fbshipit-source-id: 53283f471702762b2eed0c5d0f6a9cc49f4db739
Summary:
This adds the plumbing to download a file using the endpoint from the previous diff via the EdenApi trait, which does the actual http request.
It concats the stream into a Bytes object and returns it.
Reviewed By: StanislavGlebik
Differential Revision: D30582422
fbshipit-source-id: ed0fe5e34e3fecc6c1b26d2dceb322dfcf5f8e37
Summary:
This diff adds an endpoint `/download/file` that allows downloading a file given an upload token.
This will be used for snapshots, as we need to download the snapshot changes, and there's no way to do that right now.
Other options, and why I didn't do them:
- Using the existing `/files` endpoint: Not possible, as it needs hg filenodes and we don't have those.
- Returning the file contents in the fetch_snapshot request: Might make the response too big
- Returning just a single Bytes instead of a stream: I thought streaming would be preferred, and more future proof. In the stack I still put everything in memory in the client, but maybe in the future it should be possible to stream it directly to the file. I'm happy to remove if preferred, though.
Reviewed By: StanislavGlebik
Differential Revision: D30582411
fbshipit-source-id: f9423bc42867402d380e831bc45d3ce3b3825434
Summary: This proved useful a couple of times when folks experienced problems with the agent.
Reviewed By: ahornby
Differential Revision: D30837676
fbshipit-source-id: aec769f60a09ecb83857e6e60d49a5662b4ce0b2
Summary:
Add back the octopus merge support for revlog.
This recommits D30686451 (b13579fdf9) and D30686450 (7eb11cb392) as-is, with updates to test files.
Original commit changeset: 9f213766e7c4
Reviewed By: StanislavGlebik
Differential Revision: D30784681
fbshipit-source-id: ace0c317652ad8b657c8edd9a0130532dad53078
Summary:
As far as I could tell, this was legacy from some refactorings.
It was only used in one place, and it was easy to fix.
Also, if we really need it in the future, we can probably use `#[auto_impl]` instead of doing it manually.
Reviewed By: StanislavGlebik
Differential Revision: D30574803
fbshipit-source-id: 20715364713775818fe0e83844637f48b310d87f
Summary: createremote only worked from the root of the repo. This fixes it and adds coverage in the integration test.
Reviewed By: StanislavGlebik
Differential Revision: D30546582
fbshipit-source-id: 84aa304d346e448b44e5d7fb9e9607d84a67da25
Summary:
This adds basic logic for `snapshot restore` command.
- It updates to the parent of the snapshot
- It loads the snapshot changes
For now I did not handle changes/tracked changes, as that will need downloading the file contents, which requires a new edenapi endpoint, so I'll leave it for a future diff. It just restores your deleted files for now.
Reviewed By: StanislavGlebik
Differential Revision: D30543507
fbshipit-source-id: 080588ceff0ecd595ce739044f0d4118fb8e1a3f
Summary:
Log the sync reason for `hg cloud sync`.
This will help us investigate issues better and measure the impact of the new Eden API uploads case by case (after amend, rebase, etc.) on different platforms.
Reviewed By: yancouto
Differential Revision: D30775519
fbshipit-source-id: 696e954ec8db19226fb67ad0952e23e2b67e9931
Summary:
Put code using the usage service behind an `EDEN_HAVE_USAGE_SERVICE` macro.
Previously the C++ code was simply guarded by a `__linux__` check, and the
CMake code did not have a guard at all. This caused builds from the GitHub
repository to fail on Linux, since the code attempted to use the usage service
client which was not available.
Reviewed By: xavierd
Differential Revision: D30797846
fbshipit-source-id: 32a0905d0e1d594c3cfb04a466aea456d0bd6ca1
Summary:
In the v1 sparse config arrangement, if all rules were excludes then we
would include a default "**" rule. This was always a little confusing and caused
some weird behavior. Let's remove it from the v2 world.
This actually bit us because the fbsource_exclude profile only has excludes,
which caused it to insert a ** include, which pulled in all of fbsource. We
could fix it to only check if a profile is excludes-only once all the transitive
profiles have been loaded, but I think the cleaner fix is to remove this logic
since it's confusing and never actually used in production.
Differential Revision: D30824082
fbshipit-source-id: adcf4c820cc9f7636f79759d03fc0b387b9f55fa
Summary:
Any error inside the decode stream was being propagated up as a decoder
error. This caused higher level code to not handle certain errors appropriately.
For instance, the lfs retry logic only retries for certain classes of curl
errors. So let's propagate up HttpClientErrors as is.
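The fix can be illustrated with toy error types (these enums and names are hypothetical, not the actual crate's): the transport error variant is passed through unchanged, so retry classification still sees it.

```rust
// Hypothetical error types. The retry logic inspects the transport error,
// so the decode stream must not wrap it into a decode error.
#[derive(Debug, PartialEq)]
enum HttpClientError {
    Curl(u32), // a curl error code
}

#[derive(Debug, PartialEq)]
enum FetchError {
    Http(HttpClientError),
    Decode(String),
}

fn classify(inner: Result<Vec<u8>, HttpClientError>) -> Result<Vec<u8>, FetchError> {
    // Before the fix this arm would have been something like
    // `Err(e) => Err(FetchError::Decode(format!("{:?}", e)))`,
    // hiding the curl error code from the retry logic.
    inner.map_err(FetchError::Http)
}

fn is_retryable(err: &FetchError) -> bool {
    // Only certain transport errors are retried, e.g. a recv error.
    matches!(err, FetchError::Http(HttpClientError::Curl(56)))
}

fn main() {
    let err = classify(Err(HttpClientError::Curl(56))).unwrap_err();
    assert!(is_retryable(&err));
    assert!(!is_retryable(&FetchError::Decode("truncated frame".to_string())));
}
```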
Reviewed By: kulshrax
Differential Revision: D30798108
fbshipit-source-id: 7316f6cdc47de090c202ff6a1f28d0fba60f7a15
Summary:
The previous version had two issues:
1. It's UB to cast uninit away, as the memory may be actually uninitialized.
2. Because of the cast, the buffer was not actually written to, nor advanced after being written, causing the caller to think nothing was read.
https://docs.rs/tokio/1.11.0/tokio/io/struct.ReadBuf.html
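The second issue can be demonstrated with a tiny stand-in for `ReadBuf` (an illustrative model, not tokio's actual type): bytes copied into the backing storage stay invisible to the caller until the filled region is advanced past them.

```rust
// Minimal stand-in for tokio's ReadBuf, tracking only the filled region.
struct MiniReadBuf {
    buf: Vec<u8>,
    filled: usize,
}

impl MiniReadBuf {
    fn with_capacity(n: usize) -> Self {
        MiniReadBuf { buf: vec![0; n], filled: 0 }
    }

    // Correct pattern: copy bytes, then advance the filled cursor.
    fn put_slice(&mut self, src: &[u8]) {
        self.buf[self.filled..self.filled + src.len()].copy_from_slice(src);
        self.filled += src.len();
    }

    // Buggy pattern: write into the storage but never advance.
    fn write_without_advance(&mut self, src: &[u8]) {
        self.buf[self.filled..self.filled + src.len()].copy_from_slice(src);
    }

    fn filled(&self) -> &[u8] {
        &self.buf[..self.filled]
    }
}

fn main() {
    let mut buggy = MiniReadBuf::with_capacity(8);
    buggy.write_without_advance(b"data");
    assert_eq!(buggy.filled(), b""); // caller thinks nothing was read

    let mut fixed = MiniReadBuf::with_capacity(8);
    fixed.put_slice(b"data");
    assert_eq!(fixed.filled(), b"data");
}
```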
Reviewed By: dtolnay
Differential Revision: D30823808
fbshipit-source-id: d5f67e4c03f1d63f2241421dd35082ee96b5afd8
Summary: For some reason it got broken; we need to call `as_ref()` to properly cast the type.
Reviewed By: quark-zju
Differential Revision: D30740629
fbshipit-source-id: f49275caae9d360859e97c03709a720dabc22e9e
Summary:
LocalStore no longer special-cases Tree objects with kZeroHash
ids. Instead, unconditionally write into LocalStore with the Tree's
hash.
Reviewed By: xavierd
Differential Revision: D29155470
fbshipit-source-id: aee3840fe8dfd7aa46305b6db6f7950efb2e41d2
Summary:
In preparation for expanding to variable-width hashes, rename the
existing hash type to Hash20.
Reviewed By: genevievehelsel
Differential Revision: D28967365
fbshipit-source-id: 8ca8c39bf03bd97475628545c74cebf0deb8e62f
Summary:
Do not assume `changelog.parents` returns 2 items.
This changes the behavior for root commits. `parents()` used to return
`[repo[nullid]]`, now it returns `[]`.
Reviewed By: andll
Differential Revision: D30784684
fbshipit-source-id: 73f58c85457391fb74b96b88dc4dcb69a25e81ac
Summary:
In a future change, `ctx.parents()` returns `[]` instead of `[repo[nullid]]`
for root commits. Make the change to preserve absorb behavior.
Differential Revision: D30816385
fbshipit-source-id: afded91a6e72d4eb54faf87dcdfc52a81ea1d66f
Summary:
In a future change, `ctx.parents()` returns `[]` instead of `[repo[nullid]]`
for root commits. Make the change to preserve rebase behavior.
Differential Revision: D30816386
fbshipit-source-id: ca7c489991ae149c9640b7da0e6e54f76afbc250
Summary:
We're going to change parents() to return an empty list instead of `[nullctx]`
for roots. This change makes it more compatible with upcoming changes.
Reviewed By: andll
Differential Revision: D30787305
fbshipit-source-id: 1de523964faa64a6496a7bb0197af597e393d859
Summary: This will be used by the next change.
Reviewed By: andll
Differential Revision: D30784683
fbshipit-source-id: 59a37c5f428eaf5950584d8f17471d358bfefee7
Summary: Integrate http hash prefix lookup into the pull operation. One unfortunate change here is that if the prefix is ambiguous, we're only able to output possible full hashes as suggestions. Previously we'd also print commit log information. To retain that we'd need to add an error option to the response and have the server send back an error message with the log information or send another request to download the extra information.
Reviewed By: andll
Differential Revision: D30716050
fbshipit-source-id: 33f8bc38b0bfe7fce4ec11cd8def7feda3b3d3da
Summary:
As title, sampling group determines the sampling rate at which an FS event is logged. The higher the sampling group the more heavily its events are dropped, thus, more frequent events are assigned to the higher sampling groups.
I ran activity recorders on a few workflows, buck build, getdepts, and vscode editing and came up with the following assignment. Note that only a subset of events are assigned to a sampling group (so events not included will not be logged) as we just start to tune the sampling rates and these events should be good for a start.
```
Group1 (1/10)
FUSE_MKDIR
FUSE_RMDIR
FUSE_CREATE
FUSE_RENAME
Group2 (1/100)
FUSE_WRITE
FUSE_LISTXATTR
FUSE_SETATTR
Group3 (1/1000)
FUSE_GETXATTR
FUSE_GETATTR
FUSE_READ
FUSE_READDIR
Group4 (1/10000)
FUSE_LOOKUP
```
For reference, here are the counts of FS events of a cold buck build. The frequencies of other workflows are similar.
```
FUSE_LOOKUP 60.09 98733
FUSE_READ 12.80 21037
FUSE_GETXATTR 8.91 14645
FUSE_FORGET 8.01 13162
FUSE_GETATTR 5.55 9116
FUSE_READDIR 3.21 5270
FUSE_LISTXATTR 0.59 969
FUSE_READLINK 0.54 892
FUSE_STATFS 0.21 338
FUSE_WRITE 0.04 64
FUSE_CREATE 0.02 28
FUSE_RENAME 0.01 23
FUSE_SETATTR 0.01 13
FUSE_UNLINK 0.00 6
FUSE_RMDIR 0.00 1
FUSE_MKDIR 0.00 1
FUSE_MKNOD 0.00 1
```
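The per-group sampling decision can be sketched as follows; the counter-based 1-in-N sampler and the names here are illustrative assumptions, not the actual `FsEventLogger` implementation.

```rust
use std::collections::HashMap;

// Illustrative sampler: each event type maps to a group rate N, and the
// sampler keeps 1 out of every N occurrences. Events without an assigned
// group are dropped entirely.
struct Sampler {
    rates: HashMap<&'static str, u64>,
    counters: HashMap<&'static str, u64>,
}

impl Sampler {
    fn new(rates: &[(&'static str, u64)]) -> Self {
        Sampler {
            rates: rates.iter().cloned().collect(),
            counters: HashMap::new(),
        }
    }

    // Returns true when this occurrence of `event` should be logged.
    fn should_log(&mut self, event: &'static str) -> bool {
        let n = match self.rates.get(event) {
            Some(&n) => n,
            None => return false, // not assigned to any sampling group
        };
        let c = self.counters.entry(event).or_insert(0);
        *c += 1;
        (*c - 1) % n == 0
    }
}

fn main() {
    // Group 1 (1/10) and group 4 (1/10000), as in the table above.
    let mut s = Sampler::new(&[("FUSE_MKDIR", 10), ("FUSE_LOOKUP", 10000)]);
    let kept = (0..100).filter(|_| s.should_log("FUSE_MKDIR")).count();
    assert_eq!(kept, 10);
    assert!(!s.should_log("FUSE_FORGET")); // unassigned, never logged
}
```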
Reviewed By: xavierd
Differential Revision: D30770533
fbshipit-source-id: 90be881ddbeba2113bbb190bdb1e300a68f500a0
Summary: The new `EdenApiHandler` framework for defining EdenAPI endpoints provides a common place where responses are encoded. This diff adds automatic content compression at this point, using the received `Accept-Encoding` header from the request to determine what compression, if any, should be used. As a result, all endpoints that implement `EdenApiHandler` will get compression for free.
Reviewed By: yancouto
Differential Revision: D30553242
fbshipit-source-id: 9eda54cbf81dd1e03abec769744c96b16fad64ea
Summary:
It can sometimes be difficult to work out from the logging which commit cloud
requests came from which client repo. Previously you could often infer it from
the client identities; however, if the request is proxied, the originating hostname can be
lost, and that still doesn't handle the case where the host contains multiple
repos.
This diff adds a new `ClientInfo` struct, which is included by the client
on every `get_references` and `update_references` request. This is logged
by the service, allowing us to correlate which client it came from, and what
workspace version the client had at that time.
Reviewed By: StanislavGlebik
Differential Revision: D30697839
fbshipit-source-id: 8fe2e03f0be2f443f8ae1814f083c04ba5d1805e
Summary: It was not used, and it was hard to implement retries with it.
Reviewed By: yancouto
Differential Revision: D30716647
fbshipit-source-id: a90b629f7758486c9e526d1eaf3fd29da305f2e7
Summary:
D30704344 (5704ad51f6) upgraded tokio for the buck build. We need to do the same for
the non-buck build. This unbreaks hgbuild.
Also clean up some compiler warnings while I'm here.
Reviewed By: fanzeyi
Differential Revision: D30798315
fbshipit-source-id: 47005c7674d87196aab42b3ddf2194acced3bb6c
Summary:
We have two mode of deriving data: the "normal" way and using backfilling.
Backfilling is different from "normal" mode in that it derives a few commits at
once, and saves them all to blobstore at once.
Backfilling mode seemed to have helped us when we need to derive a lot of data
(e.g. backfill the whole repo). But
a) We don't know how much it helps, and we don't know if it depends on the repo
b) We don't know if it helps when we derive data for newly landed commits (i.e.
we use "backfill" mode in derived data tailer to derive data for latest public
commits)
So this diff adds a bit of logging to a separate scuba table so that we can get
an idea about things like:
1) How long does it take to derive a stack of commits?
2) Where do we spend most of the time (e.g. deriving, saving the blobs, saving
the mapping).
Reviewed By: mitrandir77
Differential Revision: D30805504
fbshipit-source-id: d82c905cafa87459990d74769a0dddcc91fac174
Summary:
It allows us to do 3 things:
1) Remove derive function
2) Add support for backfill mode so that we can compare perf with and without
it
3) Use all derived data types, and not just 3 of them
Reviewed By: krallin
Differential Revision: D30804258
fbshipit-source-id: 604723a3d845a60cfd94b4e090a121f5b5191536
Summary: This command can be useful to split a large bonsai commit into smaller ones.
Reviewed By: mitrandir77
Differential Revision: D30776789
fbshipit-source-id: dc56d7c51eb0e9e0988dcba868c6008ebf488927
Summary:
While we don't really need it, mononoke matches fail if it is not
present. Let's just enable it here - it's not a bad thing to initialize it.
Reviewed By: mitrandir77
Differential Revision: D30780463
fbshipit-source-id: c4199c6711ae7bd9641e9f51643b94d020051dbd
Summary: The code for accessing config fields had a lot of repetitive boilerplate. Let's move that to a helper function.
Reviewed By: andll
Differential Revision: D30785932
fbshipit-source-id: fb4d47337a27bd6e75eeb38d5a9d1de5b1fac6ce
Summary: Implement serverside graph endpoint for fetching the mapping of commits to commit parents for the missing segment of a commit graph. This implementation uses the find_commits_to_send method from the get_bundle_response library. What may be missing from pull and the old bundle protocol now is mutation markers.
Reviewed By: yancouto
Differential Revision: D30485672
fbshipit-source-id: ba3a30d9e482d60831cbe7a8e89f20dab947d9a1
Summary:
Since the find_commits_to_send method was added, common is already a
hashset, not a vector, so it doesn't need to be converted to a hashset.
Reviewed By: quark-zju
Differential Revision: D30622028
fbshipit-source-id: e5d1b6c60115d13c906b25142043652ba9e89d70
Summary:
Not flushing the data to disk makes studying performance almost impossible due
to not being able to avoid fetching from the network. By forcing a flush to
disk, we can ensure that data will always be on disk, making performance
measurement easier. This will also prevent users from re-fetching the same data
multiple times.
Reviewed By: fanzeyi
Differential Revision: D30784399
fbshipit-source-id: 0250c209b5f49f95cf2f43873573cacc661a4989
Summary:
Since this method wasn't overridden, EdenFS would never periodically flush data
to disk.
Reviewed By: fanzeyi
Differential Revision: D30784400
fbshipit-source-id: d88e535250a476582868dd82e57137a0ac38f921
Summary: Previously, there were special variants for missing and invalid URLs (since the server URL is presently the only required config option). In order to support other required config options, let's simplify the enum to just have variants for missing and invalid config fields respectively.
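The simplified shape might look like this (an illustrative sketch with made-up names; the real enum and accessor live in the EdenAPI config code and differ in detail):

```rust
// Illustrative sketch of the simplified config error enum: instead of
// URL-specific variants, any required field can be reported as missing
// or invalid.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    Invalid(&'static str, String),
}

// Hypothetical accessor for a required config field.
fn require(fields: &[(&str, &str)], name: &'static str) -> Result<String, ConfigError> {
    match fields.iter().find(|(k, _)| *k == name) {
        Some((_, v)) if !v.is_empty() => Ok(v.to_string()),
        Some(_) => Err(ConfigError::Invalid(name, "empty value".to_string())),
        None => Err(ConfigError::Missing(name)),
    }
}

fn main() {
    let fields = [("server-url", "https://example.com")];
    assert_eq!(require(&fields, "server-url").unwrap(), "https://example.com");
    assert_eq!(require(&fields, "cert"), Err(ConfigError::Missing("cert")));
    assert_eq!(
        require(&[("cert", "")], "cert"),
        Err(ConfigError::Invalid("cert", "empty value".to_string()))
    );
}
```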
Reviewed By: yancouto
Differential Revision: D30745971
fbshipit-source-id: e414ec2fadc5d04e9c788bf290a70f6cf52dbe58
Summary: Saw this when reading related code.
Reviewed By: kulshrax
Differential Revision: D30783665
fbshipit-source-id: f9b598b9301619346972bd0abf893f089d902022
Summary:
I'm not sure what this was for, but it doesn't seem necessary, and removing it simplifies the code a lot, enabling us to make other improvements later.
This is an alternate, less ambitious version of https://www.internalfb.com/diff/D30620443.
Reviewed By: DurhamG
Differential Revision: D30674016
fbshipit-source-id: 17dee50b82c78d31e45492dc23826d8c3fe838e5
Summary:
This test relies on Mononoke, so it fails for make local build/test,
which breaks hgbuild. Let's only enable it for buck tests.
Reviewed By: quark-zju
Differential Revision: D30782799
fbshipit-source-id: 4b543beeb248715702b9072d84cdb8211dcd4a9b
Summary:
Currently there are two things preventing us from running add_sync_target
on an existing target:
* already existing bookmark
* already existing config
Both need to be deleted to create a new target. This diff removes the second
one to simplify the code and make it easier to recreate the target (it's easy to
forget to manually remove the config, as configs otherwise don't need
human intervention).
Reviewed By: StanislavGlebik
Differential Revision: D30767613
fbshipit-source-id: f951c0e1ef9bde69d805dc911331fcdb220123f2
Summary:
This logic scans all the ancestors of the working copy that are not
ancestors of the graft source and checks their extras. With lazy changelog this
is extremely expensive. Let's just drop this logic.
Reviewed By: quark-zju
Differential Revision: D30734017
fbshipit-source-id: ca5606cea08efe10f29847970379d6bff4eb4aee
Summary:
Update the `Filenodes` trait so that it doesn't require the repository id to be
passed in every method invocation. In practice a filenodes instance can only
be used for a single repo, so it is safer for the implementation to store the
repository id.
At the same time, update the trait to use new futures and async-trait.
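The shape of the change can be sketched like this (simplified to synchronous methods with made-up types; the real trait is async and deals in filenode data, not strings):

```rust
// Sketch of the trait change: the repository id moves from every method's
// parameter list into the implementing struct.
#[derive(Clone, Copy, Debug, PartialEq)]
struct RepositoryId(i32);

// Before: every call had to thread the repo id through.
#[allow(dead_code)]
trait FilenodesOld {
    fn get_filenode(&self, repo_id: RepositoryId, path: &str) -> Option<String>;
}

// After: the instance is bound to a single repo at construction time,
// which is safer since a filenodes instance only ever serves one repo.
trait Filenodes {
    fn get_filenode(&self, path: &str) -> Option<String>;
}

struct SqlFilenodes {
    repo_id: RepositoryId,
}

impl Filenodes for SqlFilenodes {
    fn get_filenode(&self, path: &str) -> Option<String> {
        // A real implementation would query storage keyed by self.repo_id.
        Some(format!("filenode for {} in {:?}", path, self.repo_id))
    }
}

fn main() {
    let filenodes = SqlFilenodes { repo_id: RepositoryId(1) };
    assert!(filenodes.get_filenode("a/b.txt").is_some());
}
```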
Reviewed By: yancouto
Differential Revision: D30729630
fbshipit-source-id: a1f80a299d9b0a99ddb267d1f7093f27cf21f1af
Summary:
Make the derivation process for mercurial changesets and manifests not depend
on `BlobRepo`, but instead use the repo attribute (`RepoBlobstore`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
Reviewed By: yancouto
Differential Revision: D30729629
fbshipit-source-id: cf478ffb97a919c78c7e6e574580218539eb0fdf
Summary:
Make the derivation process for blame and deleted files manifest not depend
on `BlobRepo`, but instead use the repo attribute (`RepoBlobstore`) directly.
This will allow us to migrate to using `DerivedDataManager` in preparation
for removing `BlobRepo` from derivation entirely.
A `BlobRepo` reference is still needed at the moment for derivation of the
unodes that these depend on. That will be removed when `DerivedDataManager`
takes over co-ordination of derivation.
Reviewed By: yancouto
Differential Revision: D30729628
fbshipit-source-id: 4504abbe63c9bf036d69cb4341c75b13061fae18
Summary:
Make the derivation process for fsnodes and skeleton manifests not depend on
`BlobRepo`, but instead take the `DerivedDataManager` from the `BlobRepo` and
use that instead. This is in preparation for removing `BlobRepo` from
derivation entirely.
Reviewed By: yancouto
Differential Revision: D30301855
fbshipit-source-id: a2ed1639526aad9ddbe8429988043f0499f7629c
Summary:
Make the derivation process for unodes not depend on `BlobRepo`, but instead
take the `DerivedDataManager` from the `BlobRepo` and use that instead. This
is in preparation for removing `BlobRepo` from derivation entirely.
Reviewed By: yancouto
Differential Revision: D30300408
fbshipit-source-id: c35e9e21366de74338f453aaf6be476e7305556d
Summary:
The derived data manager also has a reference to the repo blobstore. This must
also be overridden when we override the blobstore.
Reviewed By: yancouto
Differential Revision: D30738354
fbshipit-source-id: b0e16ef810c5244cd056a3c9e5b9ceaaddb5ecea
Summary:
Modernise tests (removing `disable treemanifest`).
treemanifest is the default now, so it shouldn't be disabled unless absolutely necessary.
Also, if we would like to switch some of the tests to Mononoke, it shouldn't be disabled.
Only two tests in commitcloud are left with it. Those two are a bit harder to fix.
```
devvm1006.cln0 {emoji:1f440} ~/fbsource/fbcode/eden/scm/tests
[139] → rg 'disable treemanifest' | grep cloud
test-commitcloud-backup-bundlestore-short-hash.t: $ disable treemanifest
test-commitcloud-backup.t: $ disable treemanifest
```
Reviewed By: kulshrax
Differential Revision: D30278754
fbshipit-source-id: cf450084669c2b6b361cd34952bf986e913de1a8
Summary:
I want to use the `ReplicaFirst` read connection type since `ReplicaOnly` is a bit too restrictive.
We've had 2 MySQL SEVs this year when all the replicas went down, crashing our services despite the primary instance working normally.
There was also a case when I deleted too many rows at once and all replicas went down due to replication lag (I know better now).
RFC:
- Yay or nay?
- Should I expand `ReadConnectionType` to mirror all options of `InstanceRequirement`?
- Perhaps it's worth moving it into the `common/rust/shed/sql` crate?
I kept the cleanup of all the usages out of this diff to keep the changes minimal for the RFC.
Differential Revision: D30574326
fbshipit-source-id: 1462b238305d47557372afe7763096c53df55f10
Summary:
The segmented changelog seeder spends a significant chunk of time fetching
changesets. By saving them to a file we can make reseeding significantly faster.
Reviewed By: farnz
Differential Revision: D30765374
fbshipit-source-id: 0e6adf12e334ad70486145173ae87c810880988a
Summary:
In backfill_derived_data we had a way to prefetch a lot of commits at once, so
that backfill_derived_data doesn't have to do it on every startup.
I'd like to use the same functionality in the segmented changelog seeder, so let's
move it to a separate binary.
Reviewed By: mitrandir77, farnz
Differential Revision: D30765375
fbshipit-source-id: f6930965b270cbaae95c3ac4390b3d367eaab338
Summary: ContentDataStore is meant to be local-only. Fetching remotely seems to cause the issue observed in https://fb.workplace.com/groups/scm/permalink/4192991577417097/ (though I'm not quite sure why yet).
Reviewed By: kmancini
Differential Revision: D30744817
fbshipit-source-id: 68875a4912905f9b8f88cf4be804c5d988c3905d
Summary: If the bind unmount fails in the privhelper, there's a possibility of infinite recursion in this method. This adds a flag to indicate whether we've tried the bind unmount before.
Differential Revision: D30732857
fbshipit-source-id: 6ee887d211977ee94c8e66531287f076a7e61a2c
Summary:
It sounds like macOS has a bug where an APFS subvolume may be falsely created.
Let's retry with the hope that the retry will succeed.
Differential Revision: D30657706
fbshipit-source-id: 60bc74f789a0d34b2be53073103b95474a9a18e6
Summary: This is regenerated rust lib using the latest compiler
Reviewed By: krallin
Differential Revision: D30720130
fbshipit-source-id: 3d3389ec8504568fc356dda1577e1f7801cb7e96
Summary:
~~Also enable the `derive` feature so it isn't necessary to separately
depend on `strum_macros`.~~
This turns out to break a lot.
Reviewed By: dtolnay
Differential Revision: D30709976
fbshipit-source-id: a9181070b8d7a8489eebc9e94fa24f334cd383d5
Summary: Move `edenapi::Client`'s internals to an `Arc<ClientInner>`. This makes the client `Clone`-able while sharing the same underlying state. This is particularly useful for scenarios where `Future`s or `Stream`s returned by the client need to hold a reference to the client itself (e.g., in order to issue subsequent HTTP requests).
Differential Revision: D30729803
fbshipit-source-id: c97e700c9e3702f818eb86ded1a46f920a55cfd1
Summary: The `Fetch<T>` type has basically turned into the canonical type for all EdenAPI responses. Originally, this type was merely an implementation detail (essentially just a named tuple returned by the `fetch()` method, hence the name), but given its prominence in the API, the name is confusing. As we add more functionality and usage to this type, it makes sense to give it a more suitable name.
Differential Revision: D30730573
fbshipit-source-id: 7acd2a86b55bdfc186bd9110f6a99333df9d29d3
Summary:
Some of the method names used internally by `edenapi::Client` are a bit terse.
This was OK back when there were only a handful of private methods, which were used by a small number of API methods that were doing more or less the same thing (sending concurrent POST requests for a set of keys).
Today, there are way more API methods, most of which set up requests in different ways. As such, it makes sense to give these older private methods more explicit and descriptive names so that their intended usage is clear.
Differential Revision: D30729802
fbshipit-source-id: 5adfd8e7ba153df8c036e4dbb312f95b9b1d7335
Summary: Allow repack to be called on treescmstore via the ContentStore shim, just as is already supported for filescmstore.
Reviewed By: andll
Differential Revision: D30687145
fbshipit-source-id: 7559af08e98cfb22da6dbf45dc1746312b1e6d28
Summary:
Provide a basic implementation of the LegacyStore trait for TreeStore to allow repack calls to be forwarded to the fallback ContentStore for trees.
Repack will be removed entirely before contentstore is deleted, and the `unimplemented` methods are never called, so this should be safe.
Reviewed By: andll
Differential Revision: D30687136
fbshipit-source-id: d238d70fbf6be5c25c2e1c9610430a53d031bf3b
Summary: Looks like it was lost during the last refactoring, let's add it back.
Reviewed By: farnz
Differential Revision: D30728456
fbshipit-source-id: 20c638b3c5a8664f2367f871cd29a793fd897de3
Summary:
Some users have reported errors of the form:
```
error.HttpError: [65] Send failed since rewinding of the data stream failed (seek callback returned error 2)
```
These are caused by the fact that we're passing the HTTP request body directly to libcurl in memory rather than via a file, but we haven't implemented the `seek()` method necessary for libcurl to retransmit the data if needed. This diff implements the method.
Reviewed By: DurhamG
Differential Revision: D30654625
fbshipit-source-id: f21a067ad02ee540b86cf2e6eff2c6f08f45a3e4
Summary:
Like it says in the title, this updates us to use Daemonize 0.5, though from
GitHub and not Crates.io, because it hasn't been released to the latter yet.
The main motivation here is to pull in
https://github.com/knsd/daemonize/pull/39 to avoid leaking PID files to
children of the daemon.
This required some changes in `hphp/hack/src/facebook/hh_decl` and `xplat/rust/mobium` since the way to
run code after daemonization has changed (and become more flexible).
Reviewed By: ndmitchell
Differential Revision: D30694946
fbshipit-source-id: d99768febe449d7a079feec78ab8826d0e29f1ef
Summary:
At the moment, when segmented changelog is updated and/or reseeded, mononoke
servers can pick it up only once an hour (this is the current reload schedule)
or when the mononoke server is restarted. However, during production issues (see
attached task for an example) it would be great to have a way to force all
servers to reload segmented changelog.
This diff makes it possible to do so with a tunable. Once the tunable changes its
value, mononoke servers reload it almost immediately (subject to jitter).
This implementation adds a special loop that polls tunables value and reloads
if it changes. Note that in theory it could avoid polling and watch for configerator
changes instead, but it would be harder to implement and I decided that it's
not worth it.
Reviewed By: farnz
Differential Revision: D30725095
fbshipit-source-id: da90ea06715c4b763d0de61e5899dfda8ffe2067