Summary:
This adds the configuration `doctor.ignored-problem-class-names` so that we can
make doctor ignore individual problem classes via config rollout.
This doesn't actually stop the checks from running, but it stops reporting (or
attempting fixes for) ignored problems. Ignored tests will still be logged in the
edenfs_events table's
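A minimal sketch of how filtering by class name might look. The `Problem` classes, `report_problems`, and the `log_event` callback are hypothetical names for illustration and are not taken from the EdenFS source:

```python
from typing import Callable, Iterable, List, Set


class Problem:
    """Hypothetical base class for doctor problems."""


class StaleMountProblem(Problem):
    pass


class LowDiskSpaceProblem(Problem):
    pass


def report_problems(
    problems: Iterable[Problem],
    ignored_class_names: Set[str],
    log_event: Callable[[str, bool], None],
) -> List[Problem]:
    """Return the problems to report; ignored ones are logged but dropped."""
    reported = []
    for problem in problems:
        name = type(problem).__name__
        ignored = name in ignored_class_names
        # Every problem is logged, including the ignored ones.
        log_event(name, ignored)
        if not ignored:
            reported.append(problem)
    return reported
```

The checks still run in this sketch; only the reporting step consults the ignore list.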
Reviewed By: xavierd
Differential Revision: D44720584
fbshipit-source-id: 954d1131dfbbf6a3264abf55b5e0f01ab61836d0
Summary:
On Windows, this should always be set to true. Unfortunately, we've rolled out
a bad EdenFS release that overwrote all of these configs and set them to false,
breaking several users.
Reviewed By: fanzeyi
Differential Revision: D44683911
fbshipit-source-id: 4d8efb3402f967b2e35fd333c858fe939307e6f0
Summary:
When this check was previously enabled, users reported 2 issues:
- The check failing because status took more than 5s.
- The check reporting false positives.
The first one can be fixed by increasing the timeout on the Thrift client; the
second one is anything but clear at first sight. Digging a bit deeper, one
issue became apparent. In:
if modified_file not in diff:
The type of `modified_file` is a `Path`, while the type of `diff` was a
`Set[str]` (the type annotation was wrong). If we manually test this, here is
what we get:
% python3
Python 3.8.6 (default, Feb 10 2023, 17:15:29)
[GCC 11.x 20221024 (Facebook) 11.2.1+] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> a = {"a/b.c", "c/d.c"}
>>> a
{'c/d.c', 'a/b.c'}
>>> Path('c/d.c') in a
False
>>> a = {Path("a/b.c"), Path("c/d.c")}
>>> Path('c/d.c') in a
True
>>>
The check would have thus failed, which would have led to false positives.
Lastly, this also fixes a potential false negative: the number of modified
files could equal the number of files in the diff when the latter includes
an added/removed file but is missing one of the modified files. Removing the
check comparing both counts is sufficient to avoid this issue.
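A minimal sketch of the comparison once both sides use `Path`. The helper name `missing_from_diff` and its signature are assumptions for illustration, not the actual doctor code:

```python
from pathlib import Path
from typing import Iterable, List, Set


def missing_from_diff(
    modified_files: Iterable[Path], diff_files: Iterable[str]
) -> List[Path]:
    """Return modified files that do not appear in the diff output."""
    # Convert the diff entries to Path so membership tests compare like with like.
    diff: Set[Path] = {Path(name) for name in diff_files}
    return [path for path in modified_files if path not in diff]
```

With two modified files and a two-entry diff that contains one added file, the counts match, yet the per-file membership test still flags the missing file.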
Reviewed By: chadaustin
Differential Revision: D44558090
fbshipit-source-id: 0cc83a87758a5feeff78c38b210ebd91fa5d58f5
Summary:
Some Python versions don't return the correct macOS version from
`platform.mac_ver()`, so let's use `/usr/bin/sw_vers` instead.
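A minimal sketch of reading the version from `sw_vers`. The use of the `-productVersion` flag and the injectable `run` parameter are assumptions made for illustration; the actual change may parse the output differently:

```python
import subprocess
from typing import Callable, List


def get_macos_version(
    run: Callable[[List[str]], bytes] = subprocess.check_output,
) -> str:
    """Return the macOS product version reported by sw_vers, e.g. "13.2.1"."""
    # sw_vers prints the version followed by a newline.
    output = run(["/usr/bin/sw_vers", "-productVersion"])
    return output.decode("utf-8").strip()
```

Passing a fake `run` callable makes the helper testable on hosts that do not have `sw_vers`.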
Reviewed By: xavierd
Differential Revision: D44594569
fbshipit-source-id: ee52f11aad76361b780845de8218b2f365f0ecfe
Summary:
`eden top` would previously crash if its terminal window was resized too small,
due to incorrectly specifying a negative padding width in a format specifier.
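A minimal sketch of one way to guard against this, assuming the width is computed from the terminal size. `pad_column` is a hypothetical helper, not the function used by `eden top`:

```python
def pad_column(text: str, available_width: int) -> str:
    """Left-align text in a column, tolerating a non-positive width."""
    # A negative width such as f"{text:<{-3}}" raises ValueError for strings,
    # so clamp to zero before building the format specifier.
    width = max(0, available_width)
    return f"{text:<{width}}"
```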
Reviewed By: xavierd
Differential Revision: D44564714
fbshipit-source-id: 936f346ce7e7f5cb0f18cdbffa269cb0fd06be91
Summary:
This config is already hardcoded to default to true on Windows in the C++ where it is consumed, and is set to true during clone, but in the case where the config is not set on disk, we should also default it to true in the Python code.
```
def _get_enable_sqlite_overlay(
    self, instance: EdenInstance, overlay_type: Optional[str]
) -> bool:
    if overlay_type is None:
        # The sqlite backed overlay is default only on Windows
        return sys.platform == "win32"
    return overlay_type == "sqlite"
```
```
auto enableSqliteOverlay =
    repository->get_as<bool>(kEnableSqliteOverlay.str());
// SqliteOverlay is default on Windows
config->enableSqliteOverlay_ =
    enableSqliteOverlay.value_or(folly::kIsWindows);
```
Reviewed By: chadaustin
Differential Revision: D44544561
fbshipit-source-id: 78f3b7c72934a377ffedcd6503b6948d282d2f49
Summary: This rollout (the initial rollout from sqlite -> tree) finished long ago. However, I recently renamed tree -> sqlite, which can incorrectly trigger this check, since we assume `enable-tree-overlay/enable-sqlite-overlay` is false if it's not in the config, but the config is only set to true upon clone. This behavior does not affect the C++ because we only write the config during clone.
Reviewed By: xavierd
Differential Revision: D44523497
fbshipit-source-id: ef19af0932fb8f7747bd7b90b8eabe92c1ddc13c
Summary:
We've largely migrated everyone off of macFUSE. Now we can remove the
doctor check.
Reviewed By: fanzeyi
Differential Revision: D44479361
fbshipit-source-id: 6332c5d47a043797ce7e6b823f6c81e0f6ae33a7
Summary: Add this so we know where events in edenfs_events came from (Python CLI, Rust CLI, or the daemon).
Reviewed By: mshroyer
Differential Revision: D44044474
fbshipit-source-id: e23af5e186121657dfabf25c5c50882cc9aec923
Summary: As far as I can tell, we no longer have any callsites using `--overlay-type`, so this diff changes "tree" -> "sqlite" since that more accurately describes the difference between the different overlay options.
Reviewed By: chadaustin
Differential Revision: D44105266
fbshipit-source-id: 40d036222ca3e07c14f3f78aec374d4b6d54eb8a
Summary:
Currently, EdenFS doesn't perform any aux data fetching; instead, it falls back
to fetching blobs. In some cases, this can be extremely expensive when the aux
data store is emptied/flushed/rotated, leading to non-deterministic performance.
On Windows in particular, directory listing always needs blob sizes, and thus
getting aux data reliably is critical.
As a first step, let's add some watches through the stack and expose these to
Thrift. This will allow `eden top` and `eden trace hg` to display these
fetches correctly.
Reviewed By: chadaustin
Differential Revision: D44105497
fbshipit-source-id: a3dc5cce1bc3115a2a4effcece6fa0cf0b16f6c8
Summary: The check appears to be flaky for an unknown reason.
Reviewed By: chadaustin
Differential Revision: D44104911
fbshipit-source-id: 390a0d11e16d09c5204d829727ee20db37665815
Summary:
We are unable to run any integration tests on Apple silicon because we fail to
start the eden daemon.
We expect to be able to find `arch`, but with execve we need to pass the
full path. According to the Python docs, adding a "p" should allow the command
to be resolved using the path from the current environment.
This allows the tests to get farther on my M1.
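A minimal sketch of the lookup difference, assuming only documented standard library behavior: `os.execve` needs a usable path to the program, while the "p" variants such as `os.execvpe` search `PATH`. `resolve_like_execvp` is a hypothetical helper that approximates that search with `shutil.which`; it is not the code from this diff:

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_like_execvp(program: str, env: Mapping[str, str]) -> Optional[str]:
    """Approximate the PATH lookup that os.execvp()/os.execvpe() perform."""
    if os.sep in program:
        # A name containing a path separator is used as-is, with no search.
        return program
    # Otherwise search the PATH of the environment handed to the new process.
    return shutil.which(program, path=env.get("PATH", os.defpath))
```

A `None` result here corresponds to the case where a bare program name cannot be found in the environment's `PATH`.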
Reviewed By: mshroyer
Differential Revision: D43923320
fbshipit-source-id: eb737eb561659bd8c9d3cd6bc214f8860b657294
Summary:
The debugInodeStatus can be used to get a mapping between paths and objectid
and/or inode numbers. Unfortunately, the Thrift call is recursive by default,
so calling it on the root of the repository yields a potentially extremely long
Thrift call.
By adding a non-recursive flag, we can mitigate this and force EdenFS to merely
return data for a single directory. This will be used in doctor to obtain the
mapping between paths and objectid in an efficient way. In the future, the
Windows-only filesystem checker in doctor will also be migrated to use the
non-recursive API to reduce races during doctor.
Reviewed By: genevievehelsel
Differential Revision: D43859190
fbshipit-source-id: e175238c2a169b5cdfcf9d4dbcb058578d0c2efe
Summary:
We have an ongoing bug in EdenFS where somehow `hg status` shows a dirty
working copy, while `hg diff` shows no changes to these files. To understand
how widespread this issue is, let's start by adding a doctor check which will
give us telemetry on it.
Reviewed By: genevievehelsel
Differential Revision: D43804391
fbshipit-source-id: ef1e1b8bdd9221ff867747de4957277d3fac5538
Summary: Some environments are still somehow using the Python version of edenfsctl even though they should be fully on the Rust version. That means some environments are experiencing failures due to `eden prefetch-profile` not being a valid command. This diff makes the Python version of edenfsctl available again to help fix this.
Reviewed By: fanzeyi
Differential Revision: D43589012
fbshipit-source-id: eb529b48bcadd734f80e809d2f7b41ef081d7442
Summary:
Over the past few weeks, we discovered that invalidating the working copy
without looking at the atime of files can lead to undesirable behavior due to
races between invalidation and placeholders being laid on disk (D42694759 (f308c20680)). We
also learned that invalidation can take a really long time when very large
parts of the repository are loaded, during which `eden doctor` is stuck
waiting for the invalidation to complete.
To solve the first one, we simply need to have `eden doctor` pass a non-zero
age. Since atime's granularity is 1h, we use 1h as the age. For
the second one, the Thrift handler is modified to allow backgrounding the
invalidation, and `eden doctor` uses this new flag.
Reviewed By: kmancini
Differential Revision: D42785599
fbshipit-source-id: 9d86686f791e124b016685e3669004338ca33359
Summary:
In order for the command to show up in our help text, we must add a stub and print out "not-implemented" if someone somehow manages to invoke the Python version of `prefetch-profile`.
Instead of adding stubs for each sub-subcommand for prefetch-profile, we'll simply redirect the user to the Rust help text.
Reviewed By: fanzeyi
Differential Revision: D43482266
fbshipit-source-id: f29d511bfb06a6d1cc74386cc88e685024384ece
Summary:
Replaces the dashed and solid arrow emoji, used to represent start and finish
events respectively, with an arrow and checkmark.
With my terminal font, I never noticed there were actually two different arrow
types in the output...
Reviewed By: kmancini
Differential Revision: D42971357
fbshipit-source-id: 9e569a461d77b6df9993554b0edccd85fe77d9f2
Summary:
We've had Rust prefetch-profiles for a while now. It's pretty safe at this point to delete the Python version.
quiet_delete
Reviewed By: genevievehelsel
Differential Revision: D43375605
fbshipit-source-id: f3670434b9884e2c7dc2c1429cf90b07334611af
Summary: We used to use disk images for redirections, but the code had long since gone stale. This diff fixes some of the logic so that the disk image code path will work again.
Reviewed By: fanzeyi
Differential Revision: D42467863
fbshipit-source-id: 12927aad617ffb3fb805dcbb91e1a671943fd1c0
Summary:
Fixes a crash in `trace_stream` in the event that `getRetroactiveInodeEvents`
returns an empty list.
Reviewed By: kmancini
Differential Revision: D43104090
fbshipit-source-id: a662ffda29c6133a02f522b50614aef1535a4f6c
Summary:
Eden doctor considers eden still starting as an "issue". Let's recommend looking
at `eden status --wait` here too, so users have a bit more encouragement to be
patient.
While I might at some point remove the hint in regular `eden status`, I intend to
leave this recommendation in the code forever.
Reviewed By: fanzeyi
Differential Revision: D43101809
fbshipit-source-id: 523bf9be1961db77916a773d68188407c7e3e450
Summary:
Some IDEs are running `eden doctor --dry-run` in the background to preemptively
find issues and display them to engineers. Unfortunately, some of the doctor
checks are fairly expensive and take a long time. These expensive checks are
also racy, as they look at the working copy while it is being read/written,
which can cause problems to be reported for nonexistent issues.
To bypass both issues, a `--fast` option is introduced, which IDEs should use
to avoid the expensive checks.
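A minimal sketch of how a `--fast` flag might gate expensive checks. The `Check` tuple, its `expensive` field, and `select_checks` are hypothetical and only illustrate the idea; the real doctor code is organized differently:

```python
import argparse
from typing import Callable, List, NamedTuple


class Check(NamedTuple):
    name: str
    run: Callable[[], None]
    expensive: bool  # hypothetical marker for slow, racy checks


def select_checks(checks: List[Check], fast: bool) -> List[Check]:
    """Drop expensive checks when --fast is requested."""
    return [check for check in checks if not (fast and check.expensive)]


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="doctor")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--fast", action="store_true")
    return parser.parse_args(argv)
```

An IDE would then invoke the command with both `--dry-run` and `--fast`.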
Reviewed By: jdonald
Differential Revision: D43051918
fbshipit-source-id: 094c8db32ead99c9bfcd40cf1aad1e57c524b48a
Summary:
Sometimes EdenFS has corrupt data, and it's helpful to be able to see where that
corrupt data is stored. I updated `eden debug blob` a while back to show
blob data from multiple places: D41165544 (224cde1d5c).
In this stack I am going to do the same for blob metadata, so we can
understand where corrupt data is coming from.
This diff implements that new endpoint and adds some tests for it.
Reviewed By: chadaustin
Differential Revision: D42283192
fbshipit-source-id: 5042ee81798ffb4a80c2fa13e14080c4a3ed00ca
Summary:
When investigating the `QueueTimeout` issue with `globFiles` in EdenFS, it is handy to understand which thread a code block is being executed on. `folly::Future/SemiFuture` is subtle, and it is easy to make mistakes when it comes to thread scheduling.
This diff introduces a new type of `TraceBus` that generates events based on `TaskTraceBlock` -- a scope guard style tracing utility that will report which thread the given block is executed on, along with some other interesting information.
The trace_stream binary will produce Chrome JSON trace format so you can easily analyze the result with tools like `chrome://tracing` or Perfetto. (Note: `task` mode produces JSON for inspection, while `task-chrome-trace` produces Chrome JSON trace format.)
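For readers unfamiliar with the Chrome JSON trace format, here is a minimal sketch of emitting "complete" events (`ph: "X"`). The input field names (`name`, `start_us`, `duration_us`, `thread_id`) are assumptions for illustration and do not reflect the actual `TaskTraceBlock` event layout or the trace_stream implementation:

```python
import json
from typing import Any, Dict, List


def to_chrome_trace(events: List[Dict[str, Any]], pid: int = 1) -> str:
    """Convert task events to Chrome trace "complete" (ph="X") events."""
    trace_events = [
        {
            "name": event["name"],
            "ph": "X",  # complete event: start timestamp plus duration
            "ts": event["start_us"],  # timestamps are in microseconds
            "dur": event["duration_us"],
            "pid": pid,
            "tid": event["thread_id"],  # lets the viewer group by thread
        }
        for event in events
    ]
    return json.dumps({"traceEvents": trace_events})
```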
Reviewed By: xavierd
Differential Revision: D41390587
fbshipit-source-id: 7fdc19b9ec87318f4c6dc8b196153ffb2568c8ba
Summary: The Mercurial functions that Eden doctor uses want `str`s to be passed in instead of `bytes`. We should change the Eden doctor checks/fixes to match this requirement.
Reviewed By: xavierd
Differential Revision: D42456187
fbshipit-source-id: 312d53d98154e79215e18030598fcee915f76b3c
Summary:
When starting developer instances of EdenFS, we're forced to use `sudo` to start the developer instances' privhelper daemons. This is an issue because `sudo` is not available on macOS sandcastle hosts, and therefore we cannot run integration tests on macOS (all integration tests for macOS are currently disabled).
Instead, we can use the system privhelper by default (instead of using developer instances of privhelper). The system privhelper is installed with SUID root, which means we can leverage its privileges to run integration tests on macOS without `sudo`!
Reviewed By: chadaustin, xavierd
Differential Revision: D41020919
fbshipit-source-id: ef26f95d673b9290c62c7d0755d580f30eb43645
Summary:
EdenFS is sometimes displaying truncated file content through the filesystem. This is because the size EdenFS is reporting is incorrect.
This adds a checker to scan the local store for corrupted file sizes. It's too slow to put in eden doctor right now, but I will use it to stress test and see if I can repro the corruption.
Reviewed By: xavierd
Differential Revision: D41012331
fbshipit-source-id: b74ce536d0ced1a01c19a92759258594ffd49e68
Summary:
EdenFS now supports fetching blobs from multiple locations. It certainly would
have been helpful to be able to see blob contents from all locations all at
once while debugging reports of corrupted file content recently.
Let's add an option to `eden debug blob` so that we can see data from multiple
places at once.
Reviewed By: chadaustin
Differential Revision: D41165544
fbshipit-source-id: 421fc8841a531894715ff5cbb3786e1003782666
Summary:
The daemon now supports specifying different locations where data can be
fetched from. We can support these in the eden debug blob CLI so that it is
easy to inspect which stores have a blob and what contents those stores have
for the blob.
This diff does break the API of the `eden debug blob` command (no more --load
option). But I don't see any uses of it in code, and it is only intended for
use by the eden team, so this seems ok.
Reviewed By: xavierd
Differential Revision: D41165545
fbshipit-source-id: 7f35de59ad8073173e917f01eeae693500f6540c
Summary:
The --all-sources flag doesn't really make sense, and it hasn't been used internally within the past 90 days...
I'm open to killing this flag altogether, but for now we can just change the logic so that it makes sense.
Now we have renamed the flag to --only-repo-source (more accurate), default this flag to false, and require users to pass it in if they want to change the behavior of redirect fixup.
Reviewed By: fanzeyi
Differential Revision: D41488196
fbshipit-source-id: 5f36c2199e9eb849cdc3db6ce163462b3e295b47
Summary:
In the last diff I introduced a new endpoint, debugGetScmBlobV2, which will
allow us to fetch blobs from more (and multiple) locations. This diff migrates
the CLI to the new endpoint.
It looks like this is the only place in code where we are using the old endpoint:
https://www.internalfb.com/code/search?q=repo%3Afbcode%20debugGetScmBlob
Reviewed By: mshroyer
Differential Revision: D41069917
fbshipit-source-id: dfeb9983c6c5f009180653d3bfd96cd928b13bcf
Summary:
One of the top issues that engineers face is slow update/rebase, along with
slow startup. Both are a consequence of having tons of files populated in the
repo, which often comes from tools crawling the repo. As a mitigation, EdenFS
recently gained the ability to invalidate all non-materialized
files/directories. Let's start by using this in `eden doctor` when the user
has a large number (over 1M) of inodes in their repo.
A future step will run this invalidation as a periodic background task, taking
into consideration how frequently a file was accessed.
Reviewed By: kmancini
Differential Revision: D41010836
fbshipit-source-id: 14f4c5322f3879d830048ba924d5769631809bb1
Summary:
We have a concept in Hargow called Volumettes, which are directories of data. The backend storage for them is the RE CAS, and we would like to be able to mount volumettes using eden.
However, we have our own use-cases, which have their own storage (Manifold buckets and ZippDBs). So to make this work, we need some way to be able to specify the use-case.
We pass the use-case in using a new `--re-use-case` parameter, and store it in a `CheckoutConfig` option under `recas/use-case`.
Reviewed By: miaoyipu
Differential Revision: D41001049
fbshipit-source-id: a4b3d5c602e29a71560d17d3d3802b4b0a5922c0
Summary:
std::string_view has noexcept accessors and folly::Range doesn't, so
this allows us to make Path and PathPiece noexcept.
Reviewed By: kmancini
Differential Revision: D41145426
fbshipit-source-id: 046f6f6a532d8d0da8508ccf7896c914e19a25ec