Summary:
eden doctor can discover when an inode is missing for a file, but can't
remediate the issue. A restart usually remediates the issue, but it would be
better to have doctor remediate since a restart can be very slow.
We could do something similar to our remediation for phantom inodes (performing
a filesystem operation). However, messing with the filesystem leaves us open to
races with concurrent modifications. The point of the filesystem IO is to make
eden see a notification about a certain path and match its state to the
filesystem, so we can do that directly instead by introducing a thrift call
that makes eden match its internal state to the filesystem.
We could replace the other remediation with this thrift call. I'll leave that
for a follow up.
Reviewed By: xavierd
Differential Revision: D46243633
fbshipit-source-id: a1df5929428dc4f6c8fd71d826fe1e7371ebf283
Summary:
We've got a couple sets of prjfs tests now and they have some utility functions.
Let's move this all into a base class so it can easily be reused.
Reviewed By: xavierd
Differential Revision: D46379827
fbshipit-source-id: c4b708a1effebe5f11c97a63519295be625c8018
Summary: Adding a method to retrieve blake3 hash for a list of files.
Reviewed By: chadaustin
Differential Revision: D46268718
fbshipit-source-id: 59cb3d25a1d059a7e9b6a4da784a820945ffbd32
Summary: Expose blake3 via file attributes and update some integration tests accordingly. Added blake3_sum so that it can be used to verify blake3 hashes, and updated the tests to work with blake3.
Reviewed By: chadaustin
Differential Revision: D46307686
fbshipit-source-id: 6f2a4e8e25757862ef17d56f92b90a95c7f5a474
Summary: This utility binary can be used from the integration tests to compute blake3 hashes, which is easier than adding the Python PyPI package with its native dependencies.
Reviewed By: chadaustin
Differential Revision: D46307685
fbshipit-source-id: 1c48c689312cd9a04a19a62ad02ae3e9185041e5
Summary: v1 and v2 have both been written out forever, so I think we can drop the v1 format.
Reviewed By: zzl0
Differential Revision: D46075828
fbshipit-source-id: 373dae9cf6a53534489022716986aa85b5de5552
Summary:
We've seen a case where a large `hg update` was taking an absurdly long time in
`ObjectStore::getTree` but the telemetry was showing us that most of the time
wasn't spent fetching trees from Mercurial! The suspicion is that most of the
time was spent in the LocalStore but with no evidence to prove it.
Let's thus add some timing telemetry to various LocalStore read requests to
fill this gap.
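As an illustration only (the real telemetry lives in the C++ LocalStore and is not shown here), timing a read request could be sketched in Python as below. The `timed` helper, the `durations` dict and the `local_store.get_tree` label are hypothetical names, not EdenFS APIs.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Hypothetical in-memory sink: request name -> list of durations in seconds.
durations: Dict[str, List[float]] = {}


@contextmanager
def timed(name: str) -> Iterator[None]:
    """Record how long the wrapped block took, even if it raises."""
    start = time.monotonic()
    try:
        yield
    finally:
        durations.setdefault(name, []).append(time.monotonic() - start)


# Usage: wrap each read so time spent in the store shows up separately
# from time spent fetching from the backing store.
with timed("local_store.get_tree"):
    pass  # the actual LocalStore read would go here
```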
Reviewed By: mshroyer
Differential Revision: D46154456
fbshipit-source-id: b439ac48889ed3db71db136ff6c1cc91f48c926a
Summary:
For tools that want to take advantage of the same fast-path logic when
directories don't change across updates, expose a semistable ID they
can use to cache derived data or get a rough understanding of when a
directory has changed its contents.
Reviewed By: kmancini, xavierd
Differential Revision: D45974142
fbshipit-source-id: 7b2b482876b07e73514a936e198de2dc31ed1597
Summary:
thrift_test is pretty general. For something like getFileAttributes,
getFileAttributesV2, and readdir, it's nice to have them in one file.
Reviewed By: kmancini
Differential Revision: D45974095
fbshipit-source-id: ca5ecd4795f1d278f4de5b83cb8c9af94c111902
Summary:
StringConv.h showed up as an expensive header in my build-time
benchmarking on Windows. That's because its implementation was a
template. Switching to explicit template instantiations allows us to
move the template into the .cpp.
Reviewed By: genevievehelsel
Differential Revision: D45718855
fbshipit-source-id: 66347227108c21c9d8a22456243d0fc53c4a8edf
Summary:
One of the main objectives of the LocalStoreCachedBackingStore is the ability
to configure its policy depending on the underlying backing store. For
instance, while some backing stores want to cache everything in the LocalStore,
others may only want to cache BlobMetadata. Configuring the policy per backing
store allows the removal of the blob caching special case inside the
RocksDbLocalStore: this is now done entirely inside the
LocalStoreCachedBackingStore.
Reviewed By: chadaustin
Differential Revision: D45550480
fbshipit-source-id: 386e384f0148e38fad8d4008b4e312f360edb676
Summary:
We don't use the Cython-based Thrift client because it never worked in
the open source build. Removing it fixes the tests when run with
nix2rpm Python binaries.
Reviewed By: xavierd
Differential Revision: D45508963
fbshipit-source-id: d6f0e162177082a78504a3a6014a8a4d32670ccc
Summary:
mshroyer suggested renaming run_with_fault in D45135531 as the previous name
was confusing and he expected IO to fail, not to be blocked.
Reviewed By: fanzeyi
Differential Revision: D45459918
fbshipit-source-id: c39e5d414a6c0fda4eeff8314e6a266fa8eee185
Summary:
When unmount is called, the mount is kept alive until all the pending
ProjectedFS callbacks have completed (D45135531) to avoid use-after-free.
However, the notification callback is handled entirely in the background and
would thus not prevent the mount from being unmounted and freed.
In order to fix this, we need to keep the PrjfsChannelInner alive for the
duration of the notification handling, which as a side effect will keep the
mount alive.
Reviewed By: mshroyer
Differential Revision: D45251808
fbshipit-source-id: e78afbc388cdf75f6d8d4e774bced8fdf841db5e
Summary:
Per Microsoft (Christian Allred), calling any Prj functions after
`PrjStopVirtualizing` will lead to use-after-free due to the internal
virtualization context being freed on `PrjStopVirtualizing`.
In EdenFS, there is a subtle case where this may happen: if a callback is
ongoing and detached to a background thread while `PrjStopVirtualizing` is
called as a result of `PrjfsChannel::stop`. In that case, the callback will
attempt to use a freed virtualization context and crash EdenFS!
To prevent this, we need to make sure that `PrjStopVirtualizing` is only called
when no in-flight callbacks exist. This can be achieved by simply moving the
`PrjStopVirtualizing` into the `PrjfsChannelInner` destructor: a `shared_ptr` of
`PrjfsChannelInner` is held alive for the duration of callbacks.
After `PrjfsChannel::stop` is called, no new callbacks will be accepted due to
the `PrjfsChannelInner` field being `nullptr`, thus failing IO early and
ensuring that the lifetime of the mount won't be extended indefinitely.
Reviewed By: mshroyer
Differential Revision: D45135531
fbshipit-source-id: 15b25efc121ab1577c4355e07357148d122cc39f
Summary:
Since EdenFS doesn't explicitly set the timestamps of files/directories when
writing them on disk, ProjectedFS sets these to the current time. Prior to
running the background GC, this was an appropriate behavior: the timestamps of
files would only get changed on checkout when the file changed. However this
behavior changed slightly since background GC got introduced, as placeholders
will get invalidated when not accessed for a while. The next time the
placeholder is written on disk, its timestamps would now be different from
what they were before GC.
To avoid this issue, we need to consistently write the same timestamp before
and after GC. This is more or less the last checkout time. The one current
downside is that the last checkout time is reset across restarts. To fix this,
we could write the last checkout time to disk and restore it at startup time.
This is left as a future improvement.
Reviewed By: MichaelCuevas
Differential Revision: D44936789
fbshipit-source-id: a21ef3f2f0ef1c0d7ecb57658ae99647dc2bd99b
Summary:
It's been a long time since EdenFS has managed bind mounts in the
daemon. We no longer need to populate the bindmounts field in the
takeover protocol.
This is a small simplification for later changes.
Reviewed By: kmancini
Differential Revision: D44737123
fbshipit-source-id: d9127445083d83626d214efbbf271cd58df96ac6
Summary:
fbcode_builder doesn't currently know how to build a Python extension module,
which means the Windows oss build fails on trying to load ntapi if we don't
pass the `--no-tests` flag to getdeps. Ideally we add extension module support
to the builder, but for now let's just disable the test case if the extension
module is unavailable.
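A minimal sketch of that kind of guard, assuming the extension module is importable under the top-level name `ntapi` (the real import path and test class may differ; `module_available` and `NtQueryDirectoryFileTest` are hypothetical names):

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


# Assumption: the extension module is importable as "ntapi".
@unittest.skipIf(
    not module_available("ntapi"), "ntapi extension module is unavailable"
)
class NtQueryDirectoryFileTest(unittest.TestCase):
    def test_module_loads(self) -> None:
        # Only runs when the extension module was actually built.
        self.assertTrue(module_available("ntapi"))
```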
Reviewed By: chadaustin
Differential Revision: D44804035
fbshipit-source-id: 10da54f6ea7d3699ac42421ee5cf57d883509bdd
Summary:
Eden's readdir and getFileAttributes endpoints return an error when an entry in
a directory has a type that is not: regular file, directory, or symlink.
This causes issues for Buck2 because it propagates this error to the user. For
Buck2 a directory having a file that isn't a regular file, directory, or
symlink isn't an error case, it's just a file Buck2 wants to skip over. Buck2
would like to be able to differentiate real errors getting the filetype (like
say a network error) and having a weird file in some directory.
From chatting with Thomas, Buck2 is unlikely to ever care what type the file
is (if it's not a file, dir or symlink). So it's sufficient to just let Buck2
know it's some "other" type of file. I think it makes sense to just add a non
source control type here. I also considered adding dtype as an attribute. I
don't think we need it, but we could add that too.
In some cases it can be dangerous to add values to a thrift enumeration
(the SourceControlType enum we change below)
(reference post: https://fb.workplace.com/groups/thriftusers/permalink/785884732120941/).
But in our case, Rust + Buck2 handle new enum types gracefully
(and with exactly the behavior we want):
https://our.intern.facebook.com/intern/diffusion/FBS/browse/master/fbcode/buck2/app/buck2_common/src/io/eden.rs?lines=157
so adding a value to the enum is safe (for Buck2).
Hack is our other client. They are going to handle it less gracefully:
https://www.internalfb.com/code/fbsource/[65673fd318750984372aeb5b44036a259a0d85d2]/fbcode/hphp/hack/src/facebook/hh_distc/package/package.rs?lines=441
but from what I can tell Hack would also error if they tried to list a
directory with a socket in it without this change. I will confirm with them
that this change is OK with them.
Reviewed By: chadaustin
Differential Revision: D44794698
fbshipit-source-id: 4e3ab7964fa2c0932b0363fb9ad62f24af74480c
Summary: Buck1 is EOL and these tests are thus no longer necessary.
Reviewed By: chadaustin
Differential Revision: D44717980
fbshipit-source-id: bfe0d9977243c35405e1b5cc988b687369488d0c
Summary:
On Windows, the `NtQueryDirectoryFile` low-level API exposes some behaviors—such as restarting partway through a directory enumeration, setting filename filters, and requesting single entries at a time—that are implemented by our ProjFS provider, but were not previously covered by our integration tests.
This change adds coverage by exposing `NtQueryDirectoryFile` via a Python extension module, then exercising it on an EdenFS mount in a new test.
Reviewed By: xavierd
Differential Revision: D44356956
fbshipit-source-id: 4114a0be95092b8276156ba7fd895f64d9e64c3a
Summary:
Now that EdenFS can fetch blob metadata from the server, let's make sure to
plug it to the `eden debug blobmeta` command.
Reviewed By: kmancini
Differential Revision: D44186943
fbshipit-source-id: a64f1384cf312e3c677505c330cfc82469fb83f3
Summary:
The external OSS build is broken because thrift has a really long path and
cargo/git on Windows do not like such long paths.
I'm not fixing that. But the external OSS build is the only one we run in CI any
more. In the meantime the internal build has broken because no one has been
watching.
There were 4 different breakages.
Reviewed By: chadaustin
Differential Revision: D44189633
fbshipit-source-id: 2eedbc2b3bbf5d1def075d99f11f2273dbb1f4ab
Summary:
Fixing the asciitransform issue unblocks a lot more tests.
This enables all the ones we know were blocked by it.
Reviewed By: mshroyer
Differential Revision: D43966515
fbshipit-source-id: dfd988f81ec9b931f2d3bc6f34ad4ad82c2f2a61
Summary:
These tests seem fine? Maybe someone already fixed them?
But even with stress runs I can't get them to fail. I'll try on my Intel
machine too to make sure it's not that.
Reviewed By: genevievehelsel
Differential Revision: D43986058
fbshipit-source-id: 7e3702a7efbd4637efbb235e626b421e9a3d28a1
Summary:
The takeover tests are failing in a couple of ways.
First, there are failures:
multiprocessing seems to behave differently on Mac than on Linux.
The process calls cause locking issues when "pickling". multiprocessing seems
kind of unreliable, and we don't really need it in either of the places it is
used.
Second, there are timeouts:
accessing an fd that was open before takeover seems to hang sometimes.
I cannot manually repro on my M1, and I don't have time to dig in right now, so
I will just leave a comment with some info on the issue and leave these
disabled for now.
Reviewed By: mshroyer
Differential Revision: D44000288
fbshipit-source-id: 76ef085967a495ffd3ab0a8aae337960368d75e0
Summary: In the next diff, we'll use this util function to acquire an exclusive handle to a file for a prjfs integration test.
Reviewed By: xavierd
Differential Revision: D44349708
fbshipit-source-id: ded7037266f539c681569f0c90f875717760fed5
Summary:
This one slipped through D44263797 and allows for integration tests to be run
with Buck2 on macOS.
Reviewed By: fanzeyi
Differential Revision: D44315942
fbshipit-source-id: d2de0773ba68f13fca9e8d5c067b82653646c757
Summary:
We've seen several cases where the sqlite database is corrupted, causing EdenFS
to fail to start and requiring manual remediation. On Windows, FSCK is able to
reconstruct the sqlite database from scratch. Thus, we can simply delete the
database on disk and continue starting up.
Reviewed By: chadaustin
Differential Revision: D44155034
fbshipit-source-id: de05c814796ab8f76fd3cd9a3e98df438431c657
Summary:
It turned out `split_test` was failing for the same reason as the other hg tests.
The assertion that was raised hid the actual error, so it wasn't very clear. This diff adds some better reporting for that as well.
Reviewed By: kmancini
Differential Revision: D43962607
fbshipit-source-id: 137016260f286ee9577576d90e9e4372d4db960e
Summary: Most of these tests are already passing, except for a few that are having issues with `_asciitransform`.
Reviewed By: mshroyer
Differential Revision: D43960522
fbshipit-source-id: b3bae8d3df1acc6f9b32057367309aff44c93de3
Summary: After the `._` fix, these tests are passing and we can enable them.
Reviewed By: chadaustin
Differential Revision: D43931347
fbshipit-source-id: c568eeaff5802901a20036c81faac58e44b74820
Summary: Some of these stats counters don't exist for NFS/PrjFS mounts. I filed tasks (T147665665, T147669123) to add these counters in the future, but until then we will disable some tests.
Reviewed By: kmancini
Differential Revision: D43963755
fbshipit-source-id: 4fd9ab9fa5bc123a6d24c669b655bfb18cf3a0a5
Summary:
This is another case where we had issues with the long default temp path on
macOS. Here, fake_edenfs was failing with SIGABRT on startup due to trying to
open a unix-domain socket with too long of a path.
D43925614 addressed this for tests based on EdenTestCase, but this doesn't help
with StartFakeEdenFSTestBase which only inherits from EdenTestCaseBase. So
this diff moves the environment variable override one level up in the type
hierarchy to cover more test cases.
Reviewed By: kmancini
Differential Revision: D43988120
fbshipit-source-id: 9eca299c42aea819d3e93ed5bb408b4a4783d8ee
Summary:
On macOS/NFS, for whatever reason, calling "unlink" on a directory will always
give an EPERM error. EdenFS is not involved in the error returned.
Reviewed By: fanzeyi
Differential Revision: D43962767
fbshipit-source-id: d0834b472be29657c1e37062a955337091a58be4
Summary: These are passing with no changes, likely fixed as a result of another diff.
Reviewed By: chadaustin
Differential Revision: D43952630
fbshipit-source-id: 9de8d375f21e33939339e72c7c08d3ba9abef070
Summary:
These were failing due to in_proc_mounts trying to open /proc/mounts. On macOS,
this file doesn't exist. We can however use the `mount` command instead to
achieve a similar goal.
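A rough sketch of reading mount points from `mount` output, assuming the macOS line shape `<device> on <path> (<options>)`. `parse_mount_points` is a hypothetical helper and the sample lines in the usage are made up for illustration:

```python
import re
from typing import List

# Assumed macOS `mount` line shape: "<device> on <path> (<options>)".
_MOUNT_LINE = re.compile(r"^(?P<device>.+?) on (?P<path>.+) \((?P<options>[^()]*)\)$")


def parse_mount_points(mount_output: str) -> List[str]:
    """Return the mount point of every line that matches the expected shape."""
    points = []
    for line in mount_output.splitlines():
        match = _MOUNT_LINE.match(line.strip())
        if match:
            points.append(match.group("path"))
    return points


# Illustrative input only; real output would come from running `mount`.
sample = (
    "/dev/disk1s1 on / (apfs, local, journaled)\n"
    "localhost:/repo on /Users/me/repo (nfs, nodev, nosuid)\n"
)
mount_points = parse_mount_points(sample)
```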
Reviewed By: MichaelCuevas
Differential Revision: D43932260
fbshipit-source-id: dcdc707732196f2dcf4466acffba34deb3a86d25
Summary:
`hg grep` under the hood is `hg st | xargs grep`. The return code is the xargs
return code. xargs on Mac and Linux behave a little differently.
According to the Linux man page:
```
EXIT STATUS
xargs exits with the following status:
0 if it succeeds
123 if any invocation of the command exited with status 1-125
```
According to the Mac man page:
```
EXIT STATUS
The xargs utility exits with a value of 0 if no error occurs. If utility cannot be found, xargs
exits with a value of 127, otherwise if utility cannot be executed, xargs exits with a value of
126. If any other error occurs, xargs exits with a value of 1.
```
So a failure in grep will result in a 123 exit code on Linux, while on Mac it
will result in a 1 exit code. I wish Mac behaved like Linux, but it doesn't. So
let's teach the tests to accept the xargs return code based on the platform.
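For example, the platform check could be sketched as follows (`expected_xargs_failure_code` is a hypothetical helper name, not necessarily what the tests use):

```python
import sys


def expected_xargs_failure_code(platform: str = sys.platform) -> int:
    """Exit code xargs reports when the invoked command (e.g. grep) fails.

    Per the man pages quoted above: Linux xargs exits 123 when any
    invocation exits with status 1-125, while macOS xargs exits 1 for
    "any other error".
    """
    return 1 if platform == "darwin" else 123
```

A test would then compare the `hg grep` return code against `expected_xargs_failure_code()` instead of a hard-coded 123.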
Reviewed By: mshroyer
Differential Revision: D43924480
fbshipit-source-id: 0e6c2ec2c6a7553a1aed7c072f0464e5d1fa64c7
Summary:
This is theoretically fixable, but it seems like it would require a bunch of work and changes to test teardown.
For now, let's document why it doesn't work and provide a starting point if someone wants to fix it in the future.
Reviewed By: fanzeyi
Differential Revision: D43962277
fbshipit-source-id: b90f832a305ac7019f9031ddf3bee78a2e1e7817
Summary: This test passes on both architectures.
Reviewed By: genevievehelsel
Differential Revision: D43953611
fbshipit-source-id: 8c3477e6bc74a34cc437b17083967ba295ed4a6d