Summary:
Fix `EdenFS.shutdown()` to call `edenfsctl stop` with a timeout of 0 seconds,
telling it not to wait for EdenFS to exit. This code then performs its own
wait with a timeout.
Previously the code called `edenfsctl stop` asking it to wait for EdenFS to
exit with a 30 second timeout. However, since the integration test could be
the immediate parent process of EdenFS, the `edenfs` process may not actually
go away until the test called `wait()` on this process, which wouldn't happen
until `edenfsctl stop` returned. This only caused problems for cases where
the test could run `edenfs` directly without needing to run it through `sudo`:
when run through `sudo` the edenfs process would get cleaned up since `sudo`
was the immediate parent and it would wait on the process.
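A minimal sketch of the pattern described above, not the actual `EdenFS.shutdown()` code: request the stop without waiting, then reap the child ourselves with a timeout. `stop_and_reap` and `request_stop` are hypothetical names; `request_stop` stands in for invoking `edenfsctl stop` with a 0 second timeout, and a sleeping Python child stands in for `edenfs`.

```python
import subprocess
import sys
from typing import Callable


def stop_and_reap(
    proc: subprocess.Popen,
    request_stop: Callable[[], None],
    timeout: float = 30.0,
) -> int:
    """Ask the daemon to stop without waiting, then reap it ourselves."""
    # Stand-in for `edenfsctl stop` with a 0 second timeout: it must return
    # promptly instead of waiting for the process to disappear.
    request_stop()
    # As the immediate parent we are the only process that can reap the
    # child, so the wait (with its own timeout) has to happen here.
    return proc.wait(timeout=timeout)


# Stand-in child process for the demonstration.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_and_reap(child, child.terminate, timeout=10.0)
```

If the stop request itself waited for the process to vanish, it could never succeed here, because the child stays in the process table until `wait()` is called.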
Reviewed By: genevievehelsel
Differential Revision: D20434081
fbshipit-source-id: 513fd2ebb5fc24a54c546a76e94827c81a4ab754
Summary:
Add a counter to report the number of mounts that we failed to remount during
startup. Mount failures do not prevent EdenFS startup from proceeding. It is
useful to have a metric to report if these errors did occur even though the
start-up as a whole still proceeded otherwise.
Reviewed By: chadaustin
Differential Revision: D20319512
fbshipit-source-id: fd503a1ccc91b476cc9dc2bc6323501bbbeaf2c5
Summary: expose the counters for number of pending imports (blobs, trees, prefetches) to allow use in tooling
Reviewed By: chadaustin
Differential Revision: D20269853
fbshipit-source-id: d2b7e2110520290751699c4a891d41ebd5b374cf
Summary:
Remove a failing integration test that was testing behavior we don't really
care about.
My changes in D20210708 made this test start failing. This integration test
was initially added to exercise the code I reverted in D20210708.
This test fails when EdenFS is invoked in the foreground and under sudo. If
you send SIGSTOP to the EdenFS process sudo happens to notice this and send
the same signal to itself too. This results in a state where the `sudo`
command is stopped and is never resumed so it never wakes up to reap its child
EdenFS process when EdenFS exits. The behavior I reverted in D20210708 caused
the edenfsctl CLI code to simply ignore the fact that EdenFS was stuck in a
zombie state, and proceed anyway. This allowed EdenFS to at least restart,
but it left old zombies stuck forever on the system.
This problem is arguably an issue with how sudo operates, and it's sort of
hard for us to work around. To solve the problem you need to send SIGCONT to
the sudo process, but since it is running with root privileges you don't
normally have permission to send a signal to it. It is understandable why
sudo behaves this way, since normally it is desirable for sudo to background
itself when the child is stopped.
In practice this isn't really ever a situation that we care much about
handling. Normal users shouldn't ever get into this situation (they don't run
EdenFS in the foreground, and they generally don't run it under sudo either).
Reviewed By: genevievehelsel
Differential Revision: D20268924
fbshipit-source-id: d61d0a10ee1e132f00dbd2e4dc135808b7c79345
Summary:
Update some of the systemd tests that were using
`eden.cli.daemon.wait_for_process_exit()` and were relying on it to return for
zombie processes that had not been reaped. This test would spawn a subprocess
and then wait for it using `wait_for_process_exit()` instead of actually just
using `subprocess.Popen.wait()`.
The `wait_for_process_exit()` function is only intended to be used for
non-child processes. For immediate child processes it is always better to
simply use `wait()`.
This refactors the code so that it uses `subprocess.Popen.wait()` where
appropriate. This is needed to make these tests work even after D20210708
lands.
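The distinction can be illustrated with a small sketch, assuming Linux (it relies on `os.waitid()` with `WNOWAIT`). `pid_exists()` is a hypothetical PID probe of the kind a wait on a non-child process has to fall back on; it is not a copy of `wait_for_process_exit()`.

```python
import os
import subprocess
import sys


def pid_exists(pid: int) -> bool:
    """Return True if the process table still has an entry for pid."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


child = subprocess.Popen([sys.executable, "-c", "pass"])
# Block until the child has exited, but leave it unreaped (WNOWAIT).
os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
exists_before_wait = pid_exists(child.pid)  # the zombie is still visible
child.wait()  # reaps the child
exists_after_wait = pid_exists(child.pid)
```

An exited but unreaped child still looks alive to a PID probe, which is why an immediate parent should call `wait()` rather than poll.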
Reviewed By: wez
Differential Revision: D20242891
fbshipit-source-id: 0afd3d3d7ee1d733099ea74f7b9b19cbe48b22d4
Summary: I was looking in the `edenfs_events` table and saw that sandcastle was logging to this table. Rice was able to identify that the reason was that the integration tests were logging. So if we're running integration tests, we should return a `NullTelemetryLogger`. The daemon currently does not log on sandcastle AFAIK.
Reviewed By: simpkins
Differential Revision: D20203556
fbshipit-source-id: e09175347631478cb366d4fa2c6092d976504dd8
Summary:
- added logging only around the import blob call to capture non-queue related wait time
- added to `test_reading_file_gets_file_from_hg` in `integration.stats_test.HgBackingStoreStatsTest` to test import blob logging in addition to the get blob logging
(not yet done for importing trees, will do in next diff)
Reviewed By: chadaustin
Differential Revision: D20201215
fbshipit-source-id: c89281fe7d3d6e89d111ac8cce9014adff44ac40
Summary:
D17135557 added a bunch of `pyre-fixme` comments to the EdenFS integration
tests for cases where Pyre cannot detect that some attributes are initialized
by the test case `setUp()` method.
It looks like Pyre's handling of `setUp()` is somewhat incorrect: it looks
like if a class has a `setUp()` method this currently suppresses all
uninitialized attribute errors (even if some attributes really are never
initialized). However, Pyre does not detect `setUp()` methods inherited from
parent classes, and always warns about uninitialized attributes in this case
even if they are initialized.
Let's change these comments from `pyre-fixme` to `pyre-ignore` since this
appears to be an issue with Pyre rather than with this code. T62487924 is
open to track adding support for annotating custom constructor methods, which
might help here. I've also posted in Pyre Q&A about incorrect handling of
`setUp()` in derived classes.
Reviewed By: grievejia
Differential Revision: D19963118
fbshipit-source-id: 9fd13fc8665367e0780f871a5a0d9a8fe50cc687
Summary: It seems to be stable and not causing issues. Let's make it default everywhere.
Reviewed By: wez
Differential Revision: D19896738
fbshipit-source-id: cf6abe8f536e570017742b3a0674213a932a6a4d
Summary: This should get rid of the extraneous uninitialized attribute errors related to `setUp` and abstract classes.
Reviewed By: simpkins
Differential Revision: D19964487
fbshipit-source-id: 52d5a6496e372d99d4398473f9ed7672228a76f5
Summary: Fork, exec, and wait in `daemon.dameon_exec` so that we can get the exit code of the child process and log it.
Reviewed By: simpkins
Differential Revision: D19861810
fbshipit-source-id: 85fce52b2e2d252bb4dec779f5f975e3712b6bb5
Summary:
When checking if a commit is valid explicitly check against the backing
repository rather than the Eden checkout. This makes the commit work
correctly if the Eden checkout's `.hg` directory has been corrupted but the
backing repository is still fine.
Reviewed By: genevievehelsel
Differential Revision: D19629959
fbshipit-source-id: 57992260332cbc1d6868813263fb3768b50db07e
Summary:
Sadly, we didn't derive as much value from these as we would
have liked, and now they are a source of flakiness in our tests.
Reviewed By: genevievehelsel
Differential Revision: D19460792
fbshipit-source-id: 48c82cc2d1fdbd81bece4057e93799fbcc4f4725
Summary:
Instead of clearing every single cached object when the total size
exceeds the ephemeral storage limit, keep a limit per object type and
only clear those that exceed their quota.
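A rough sketch of the per-type check, not the actual C++ implementation. The function name, the type names, and the byte figures are all made up for illustration.

```python
from typing import Dict, List


def types_over_quota(sizes: Dict[str, int], quotas: Dict[str, int]) -> List[str]:
    """Return the object types whose cached size exceeds their own quota."""
    return sorted(
        name
        for name, size in sizes.items()
        if name in quotas and size > quotas[name]
    )


# Hypothetical sizes and quotas, in bytes.
sizes = {"blob": 900, "tree": 120, "blobmeta": 40}
quotas = {"blob": 500, "tree": 200, "blobmeta": 50}
to_clear = types_over_quota(sizes, quotas)
```

Only the types returned here would be cleared; types that are under their quota keep their cached objects.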
Reviewed By: simpkins
Differential Revision: D19358312
fbshipit-source-id: 6918d6f4cc2931aed79a9025d0e0f357ede515e0
Summary: Adds a CLI debug command to inspect the working copy parent. By default it just returns Eden's snapshot contents, but an optional `--hg` flag prints Mercurial's dirstate information.
Reviewed By: chadaustin
Differential Revision: D19167518
fbshipit-source-id: b65e112df6abe4e0e7a8a528a90b2e3d17297e66
Summary:
Remove some half-baked, unnecessary logic for caching sizes separately
from SHA-1. Eden's backing stores do not support chunking large files
yet, so there's no value in caching content SHA-1 and size
separately. This fixes a scenario where fetching blob size and then
SHA-1 would result in two backing store imports.
Reviewed By: fanzeyi
Differential Revision: D19169096
fbshipit-source-id: dc32f3313e5f4230c06a5bbaa67da7bf0febaba8
Summary: There is one instance of `getScmStatusBetweenRevisions` in use: it is used in the eden cli in a hacky way to check if a commit hash is valid. Since this is not used anywhere else in a meaningful way, this replaces that use case with a hg call and deprecates `getScmStatusBetweenRevisions`.
Reviewed By: simpkins
Differential Revision: D18690026
fbshipit-source-id: 02bd2c20a0f631ec41116f9fd4e18d14369298ef
Summary:
A spike in automatic GCs usually implies something has gone wrong. Log
an event for each one, recording the cache size prior to the GC and
the cache size after.
Reviewed By: simpkins
Differential Revision: D18902580
fbshipit-source-id: 158b2635733a415a9fcc7c412b2c0f44ed04aa01
Summary:
Two bugs conspired to cause edenfs after a graceful restart to think
the kernel supported FUSE_NO_OPENDIR_SUPPORT when it didn't: the
connection info struct wasn't zeroed, and FUSE connection capabilities
weren't properly mirrored into the Dispatcher upon graceful
restart. Fix both and add an integration test.
Reviewed By: simpkins
Differential Revision: D18903761
fbshipit-source-id: 23f4db3e240ee7d035f707820072c606a45f1138
Summary: This updates the hg and telemetry wrapper callsites of getScmStatus to first try running getScmStatusV2() with fallback option. This does not retry `hg status` while a checkout is in progress.
Reviewed By: simpkins
Differential Revision: D18209899
fbshipit-source-id: e7a77b902f5a0ee624e4ea3185a1901bdac090e6
Summary:
Now that we've transitioned to the newer redirections
configuration we can remove the older bind mounts configuration
parsing, and that is what this diff does.
This is helpful because in situations where the user has run
`hg co null` as part of some troubleshooting advice, the legacy
bind mounts would remain mounted and obstruct some of the
steps that would have fixed up the users repo.
Reviewed By: pkaush
Differential Revision: D18337246
fbshipit-source-id: 23f27787d609e1c38a9c98b8b6596bb40743b9ca
Summary:
The purpose of this command is to unmount/unlink any configured
redirections without removing their configuration.
The intent is to call this for a mount when we are unmounting; I'll do
that bit in a follow on diff.
Reviewed By: pkaush
Differential Revision: D18801872
fbshipit-source-id: 096d9595091da72aa85f4259cbab022a1fe0c01f
Summary: Even if blobs have different hashes, they could have the same contents. For example, a file may have been changed and then later reverted between the two revisions being compared. In that case, the contents would be the same but the blobs would have different hashes. Currently, `getScmStatusBetweenRevisions()` would report false positives in this case. This is also needed so we do not report false positives in `getScmStatus()` when we hit this code path.
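A hedged sketch of the comparison idea: treat differing IDs as "possibly different" and only report a change when the contents differ. `contents_differ` and `load_blob` are hypothetical names, and the in-memory `store` stands in for a real object store.

```python
import hashlib
from typing import Callable


def contents_differ(
    id_a: str, id_b: str, load_blob: Callable[[str], bytes]
) -> bool:
    """Report whether two blobs differ in content, not merely in ID."""
    if id_a == id_b:
        return False  # identical IDs imply identical contents; skip the load
    # Differing IDs are not conclusive, so compare the content hashes.
    sha_a = hashlib.sha1(load_blob(id_a)).digest()
    sha_b = hashlib.sha1(load_blob(id_b)).digest()
    return sha_a != sha_b


# Two IDs for the same bytes (e.g. a change that was later reverted).
store = {"id1": b"hello\n", "id2": b"hello\n", "id3": b"goodbye\n"}
reverted_differs = contents_differ("id1", "id2", store.__getitem__)
changed_differs = contents_differ("id1", "id3", store.__getitem__)
```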
Reviewed By: simpkins
Differential Revision: D18647086
fbshipit-source-id: 66e12648a24fd7e5612eee5e599a5b81c7c5f2d1
Summary: This reads `enforceParents` from a config instead of always assuming true for `getScmStatusV2()`. This will allow an easy kill switch in case throwing errors from this thrift call causes issues with something that calls hg status.
Reviewed By: simpkins
Differential Revision: D18258164
fbshipit-source-id: 1ae421a941c01a678d25d5453c771262b03558d0
Summary: make the error message returned in the case of out of date parents during a new status call more user friendly and provide possible remediation instructions
Reviewed By: simpkins
Differential Revision: D18328835
fbshipit-source-id: b214f45bb055d008db8b233ddd2a1843332db838
Summary:
Merge the fb-mercurial code into the Eden repository, under the
`eden/scm` subdirectory.
Reviewed By: quark-zju
Differential Revision: D18445774
fbshipit-source-id: fc3307f9937e0c7e1c8f7d03c5102c4fe5dedb10
Summary:
Add a new thrift API for computing the difference between the working
directory and a given source control commit.
This has the following differences from the old getScmStatus() call:
- The parameters are accepted in a GetScmStatusParams structure now.
This makes it easier for the server-side C++ implementation to tell which
parameters have actually been specified by the caller. This will make it
easier to extend this API in the future without having to replace it with a
new function call again.
- The return value is a GetScmStatusResult, which includes both the ScmStatus
and the EdenFS version number. This will allow code like `hg status` to get
both the status results and the EdenFS version in a single call, without
needing to make multiple separate thrift calls.
- This new call will return an error if the caller requests the status against
a commit that disagrees with EdenFS's view of the current commit. Because
the individual `hg` command line processes do not perform any
synchronization of their own when reading the working directory parent,
they can often call EdenFS with stale parent information, or while a
checkout is currently in progress. This new behavior will reject the
request with an error, rather than having EdenFS perform a potentially very
expensive status computation when the results probably aren't actually
useful to the caller anyway.
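The shape of the new call can be sketched in Python as follows. This is an illustration only: the real API is defined in Thrift, and every field name, the `OutOfDateParentError` class, and the `get_scm_status_v2` signature below are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class OutOfDateParentError(Exception):
    """Raised when the caller's commit disagrees with the daemon's view."""


@dataclass
class GetScmStatusParams:
    mount_point: str  # hypothetical field names throughout
    commit: str
    list_ignored: bool = False


@dataclass
class GetScmStatusResult:
    status: Dict[str, str] = field(default_factory=dict)
    version: str = ""


def get_scm_status_v2(
    params: GetScmStatusParams,
    current_commit: str,
    daemon_version: str,
    compute_status: Callable[[GetScmStatusParams], Dict[str, str]],
) -> GetScmStatusResult:
    # Reject stale parents before doing any expensive status computation.
    if params.commit != current_commit:
        raise OutOfDateParentError(
            f"requested {params.commit}, but the checkout is at {current_commit}"
        )
    # Return the status and the version together in a single result.
    return GetScmStatusResult(
        status=compute_status(params), version=daemon_version
    )
```

Passing a single params structure means new optional fields can be added later without changing the function signature.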
Reviewed By: chadaustin
Differential Revision: D15110218
fbshipit-source-id: ebc2f74dafc090d4fd245de8e4d62e2b086500dd
Summary: Adds a non-optional EdenErrorType struct for EdenError. This can be used for special-case handling of errors without parsing error messages. Currently it is just passed along and not consumed anywhere in the client, but later in the stack it is used for a specific retry of checkout on "CHECKOUT_IN_PROGRESS" on the consuming side.
Reviewed By: chadaustin
Differential Revision: D18139917
fbshipit-source-id: b3f2ec4c480fc5246ff2f46d09c436021bad8b61
Summary: Fastmanifest is going away, remove it from the test.
Reviewed By: chadaustin
Differential Revision: D18145000
fbshipit-source-id: ee75ebe4eda19caca92fd0a84bf0ae9f48112167
Summary: Formatting had diverged in a few places. Fix that up.
Reviewed By: fanzeyi
Differential Revision: D18123219
fbshipit-source-id: 832cdd70789642f665a029196998928a9173be81
Summary:
A recent change removed the revision number from the rebase output. Fix the
EdenFS test so that it no longer expects it either.
Reviewed By: genevievehelsel
Differential Revision: D17954310
fbshipit-source-id: 6c1db48086af4b7b138e6c3f4ef0bb362d2256f8
Summary:
Instead of having accessors for every config setting in EdenConfig,
just expose the ConfigSettings directly.
Reviewed By: fanzeyi
Differential Revision: D17847805
fbshipit-source-id: 8c6c1010c010113cf859677449797ea916f2a2a5
Summary:
Some Unix applications (notably, nfsd) create regular files using vfs_create, which ends up invoking the `mknod` system call rather than `open`, which for historical reasons only supported socket creation with Eden. However, since Eden supports regular files, we can broaden the FUSE mknod handler to support regular files as well.
For context, see https://github.com/GoogleCloudPlatform/gcsfuse/issues/137#issuecomment-155273363
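For illustration, this is what creating a regular file through `mknod` looks like from user space. The sketch assumes Linux, where an unprivileged process may call `mknod` with `S_IFREG`; it exercises the system call on a temporary directory and says nothing about Eden's FUSE handler itself.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "regular_file")
    # S_IFREG asks mknod() for a regular file rather than a socket or device.
    os.mknod(path, stat.S_IFREG | 0o644)
    st = os.stat(path)
    is_regular = stat.S_ISREG(st.st_mode)
    size = st.st_size
```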
Reviewed By: chadaustin
Differential Revision: D17792424
fbshipit-source-id: 466fcbcb3bcb587e731bc8b2a3e0f1508ff1f4e4
Summary: D17766371 added an `update --merge` state. Teach eden tests about it.
Reviewed By: wez
Differential Revision: D17837836
fbshipit-source-id: a95ed326bf435f7340d7910307c8c5c761812514
Summary:
## Backstory
Pyre was throwing errors in my diff (D17747558) regarding an extraneous fixme. Turns out PyreBot has been adding and removing these fixmes during version updates (see D17135557 and D16183608), so I suspect it's something to do with the Pyre version. Anyways, I figured it'd be easier to do the annotation than to remove the fixme and risk Pyre throwing the same error in a later diff.
## What I did
I added the key type (`Path`) and value type (`ExpectedFileBase`). mypy then started throwing an error regarding `__iter__` returning the wrong type because it wanted an iterator over keys rather than values. Fixed that and added `.values()` to the for loop.
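A small sketch of the typing issue, using stand-in classes rather than the real test helpers: the `ExpectedFileBase` body and the `ExpectedFiles` container below are hypothetical. A `Mapping[Path, ExpectedFileBase]` must iterate over its keys, so callers that want the files iterate over `.values()`.

```python
from pathlib import Path
from typing import Dict, Iterator, List, Mapping


class ExpectedFileBase:
    """Stand-in for the real class; only carries a path here."""

    def __init__(self, path: Path) -> None:
        self.path = path


class ExpectedFiles(Mapping[Path, ExpectedFileBase]):
    def __init__(self) -> None:
        self._files: Dict[Path, ExpectedFileBase] = {}

    def add(self, file: ExpectedFileBase) -> None:
        self._files[file.path] = file

    def __getitem__(self, key: Path) -> ExpectedFileBase:
        return self._files[key]

    def __iter__(self) -> Iterator[Path]:
        # The Mapping protocol requires iteration over keys, not values.
        return iter(self._files)

    def __len__(self) -> int:
        return len(self._files)


files = ExpectedFiles()
files.add(ExpectedFileBase(Path("a.txt")))
files.add(ExpectedFileBase(Path("dir/b.txt")))
keys: List[Path] = list(files)
paths_from_values: List[Path] = [f.path for f in files.values()]
```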
Reviewed By: genevievehelsel
Differential Revision: D17806135
fbshipit-source-id: c07feee33db78a9bff19ba9856a7047657b8c63e
Summary:
Update the CMakeLists.txt files to support building the hg integration tests.
At the moment this only includes one of the test files (`status_test.py`).
I have not verified if tests from the other modules pass yet or if they need
any additional tweaks to work in CMake-based builds.
Reviewed By: pkaush, fanzeyi
Differential Revision: D17678991
fbshipit-source-id: 4a5ee5a8d6039d9d2a635c7027897bbeed14f8c0
Summary:
Add initial support for building and running some of the integration tests
with CMake. For now this just runs the tests from basic_test.py, just to
confirm that most of the framework code works in CMake-based builds.
Many of the other tests should also work as well, but a few of them we may
want to disable for CMake-based builds. e.g., a couple of the tests depend on
hypothesis, and we would need to include hypothesis as a dependency. Some of
the tests that use systemd might also require a little more work to get
working.
Reviewed By: fanzeyi
Differential Revision: D17659026
fbshipit-source-id: 67420fda9e1021a0cddee2d385fd21e34fb2fd70
Summary:
Force a reference to the edenfsctlPath flag, otherwise the
linker will discard it and a large number of tests will fail.
Reviewed By: simpkins
Differential Revision: D17683222
fbshipit-source-id: b7cb29e74af85b544f45a228770ad2613c8e6efc
Summary:
This diff removes the logic that consumes the legacy bind
mount list and mounts them on startup. That functionality has been
replaced with the eden redirect command.
Instead of performing the bind mounts in the server, the server will
now run `eden redirect fixup` to apply that configuration.
This diff also changes the behavior of performBindMounts: previously, if the
bind mount setup failed, we would tear down the entire repo mount. Since we're
now spawning an external process, it is much more likely that something might
fail and result in a bad experience, so we no longer bail out in that case:
we'll continue and leave the bind mounts as-is. The user can then use `eden
doctor` or `eden redirect fixup` to sort things out.
Reviewed By: simpkins
Differential Revision: D17236366
fbshipit-source-id: 8b004551a076216f0e5448942f00b5195ee18803
Summary:
Change the `//scm/hg:hg` target to use an `sh_binary()` rule that invokes the
`:hg_rust` binary with the proper environment so it can find its dependencies,
rather than copying the binary and all of its dependencies into a new
subdirectory.
In dev mode builds the `hg_rust` binary isn't guaranteed to work anywhere
other than its original location, due to the way that dev mode builds use
`$ORIGIN` in the binary's `RPATH` setting. This happened to work up until now
as the hg_rust binary did not have any separate libraries, but I plan to add
one on the `chg` library.
Reviewed By: quark-zju
Differential Revision: D17109104
fbshipit-source-id: ae8bb1126969f012d1d2fb7d04e80867a310b9a8
Summary:
Add a flag to cause `eden start` to exit successfully without doing anything
if EdenFS is already running. This flag makes it slightly easier for
automation to ensure that EdenFS is running, without logging warnings if
EdenFS was already running.
I also cleaned up the error message slightly when `eden start` is used
without this flag and fails if EdenFS was already running. Previously the
exception thrown was unhandled so it also printed a python backtrace. Now the
code throws an exception that is caught by the higher level command line code,
so it is printed in a more user-friendly way.
Reviewed By: wez
Differential Revision: D17440486
fbshipit-source-id: d7661ef7be7159bf5542b20e99a0b5495690e5a2
Summary: This makes it a bit more human friendly
Reviewed By: chadaustin
Differential Revision: D17249465
fbshipit-source-id: 40d5afc77ded34237e1860d5b91e9257a732e480
Summary:
D17236366 will disable the getBindsMount thrift call and
remove the internal source of data about bind mounts. We instead
have a more current set of data from the `redirect` command, so
teach `eden chown` to use that data.
Reviewed By: chadaustin
Differential Revision: D17249433
fbshipit-source-id: 853f24e729814c501768e9834765e1be283d6aac
Summary:
Integration test helpers relied on an implicit gflags include. Make
that explicit so they compile against open source gflags and glog.
Reviewed By: wez
Differential Revision: D17264335
fbshipit-source-id: e336423b71c0f15e29b0e4ad604328b7624080a8
Summary:
Make sure the contents of the special `.eden/` subdirectory are correct each
time we mount a checkout. Before we would generally only set up the contents
of this directory if it didn't previously exist.
Now the code verifies that the contents of each symlink in this directory are
correct, and recreates the symlink if needed.
This allows EdenFS to automatically repair the contents of this directory even
if the checkout or its `clients` directory has been manually moved.
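The verify-and-recreate step might look roughly like this. `ensure_symlink` is a hypothetical helper, and the link name and targets are made up; the real logic lives in the EdenFS C++ code.

```python
import os
import tempfile


def ensure_symlink(link_path: str, target: str) -> bool:
    """Make link_path a symlink to target; return True if it was (re)created."""
    try:
        if os.readlink(link_path) == target:
            return False  # already correct, nothing to do
        os.unlink(link_path)  # a symlink pointing at the wrong place
    except FileNotFoundError:
        pass  # missing entirely; create it below
    except OSError:
        # Present but not a symlink (readlink fails): replace it.
        os.unlink(link_path)
    os.symlink(target, link_path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "root")
    created = ensure_symlink(link, "/old/location")
    unchanged = ensure_symlink(link, "/old/location")
    repaired = ensure_symlink(link, "/new/location")
    final_target = os.readlink(link)
```

Checking the link target on every mount is what lets a moved checkout repair itself, rather than keeping a stale link that merely exists.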
Reviewed By: wez
Differential Revision: D17279413
fbshipit-source-id: e24e7530f44fff94ebb6f67174aaf78c9b498d6b
Summary:
Update the fsck code to save any orphaned symlink inodes that it finds as
symlinks in the repair archive directory, rather than saving the contents as a
regular file.
Reviewed By: wez
Differential Revision: D17170346
fbshipit-source-id: 4cba8b27233b728114a80a327ab519b039297aea
Summary:
Use the new `OverlayChecker` class to automatically scan for errors and
attempt to repair them if the overlay was not shut down cleanly the previous
time it was used.
Reviewed By: wez
Differential Revision: D16596601
fbshipit-source-id: 9923565b101ba953e92909e502be6ef5895c5cbd
Summary:
This was causing problems on macOS where various tools
would enumerate and helpfully try to preserve attributes across
copies. On macOS this would result in AppleDouble metadata files
being created to track the metadata in the destination file,
which clutters up the repo and has surprising secondary effects
such as being picked up by glob operations in cmake build rules.
This diff simply stops enumerating the extended attribute.
Reviewed By: fanzeyi
Differential Revision: D17140414
fbshipit-source-id: 2924657dc75b900baf70595edfa72e5d0521a697
Summary:
The snapshot generation code and many integration tests create
repositories. By default, they were creating flatmanifest
repositories, which are on their way out. Instead, create tree-only
repos.
Reviewed By: strager
Differential Revision: D17066151
fbshipit-source-id: f99a9543440da6fd7cce0065c3cd7f91a59a02d5
Summary:
Update `EdenServer::mount()` to correctly handle errors that occur during the
mount `INITIALIZING` phase. Previously the code did not add error callbacks
to the `Future` result to handle errors during initialization. As a result we
would propagate the exception back to the thrift caller, but the `EdenMount`
object would remain in our mount point list, stuck forever in the
`INITIALIZING` state.
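The fix can be pictured with a Python analogue that uses `concurrent.futures.Future` in place of the C++ futures. The `track_mount` helper, the `mounts` dictionary, and the `RUNNING` state name are assumptions made for this sketch.

```python
from concurrent.futures import Future
from typing import Dict


def track_mount(mounts: Dict[str, str], path: str, init_future: Future) -> None:
    """Register a mount and clean it up if initialization fails."""
    mounts[path] = "INITIALIZING"

    def on_done(fut: Future) -> None:
        if fut.exception() is not None:
            # Without this error path the entry would stay INITIALIZING forever.
            mounts.pop(path, None)
        else:
            mounts[path] = "RUNNING"

    init_future.add_done_callback(on_done)


mounts: Dict[str, str] = {}
failing: Future = Future()
track_mount(mounts, "/mnt/broken", failing)
failing.set_exception(RuntimeError("initialization failed"))

succeeding: Future = Future()
track_mount(mounts, "/mnt/ok", succeeding)
succeeding.set_result(None)
```

The exception remains on the future, so the caller still sees the failure; the callback only ensures the bookkeeping is undone.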
Reviewed By: strager
Differential Revision: D16590032
fbshipit-source-id: 9adbdf05441dad815096b195ece36f3d958c96a9
Summary: Add a dependency from the eden open source build to the fb303 open source build and switch EdenServiceHandler to BaseService.
Reviewed By: simpkins
Differential Revision: D15528156
fbshipit-source-id: 2ca5c31dd9fcc9bac43fd399b27f33b6f2c5ebfc
Summary:
Open source fb303 will not have getPid() or getCommandLine(), so
introduce a new method for Eden's tests.
Reviewed By: fanzeyi
Differential Revision: D16292993
fbshipit-source-id: 5cdc006ec0ee15f50a3e1cebe9b46a3ea275ff78
Summary: The CountersTest would previously fail if the counters prefixed by "thrift" and "thrift_client" happened to be updated between getting "counters" and "counters2". Since these counters should not be modified when mounting/unmounting mounts, we will just filter them out.
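A minimal sketch of the filtering, assuming counters arrive as a name-to-value dictionary. The helper name and the counter names in the example are invented.

```python
from typing import Dict

# Prefixes taken from the description above.
IGNORED_PREFIXES = ("thrift", "thrift_client")


def stable_counters(counters: Dict[str, int]) -> Dict[str, int]:
    """Drop counters that can change independently of mounting/unmounting."""
    return {
        name: value
        for name, value in counters.items()
        if not name.startswith(IGNORED_PREFIXES)
    }


# Hypothetical snapshots taken before and after a mount/unmount cycle.
counters = {"thrift.calls": 10, "thrift_client.calls": 3, "inodes.loaded": 5}
counters2 = {"thrift.calls": 12, "thrift_client.calls": 4, "inodes.loaded": 5}
same_after_filtering = stable_counters(counters) == stable_counters(counters2)
```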
Reviewed By: chadaustin
Differential Revision: D16265511
fbshipit-source-id: 21af0dff345977692785136ca0333d23d5c77e0d
Summary: I found out that the journal stats callbacks that were getting registered were not getting unregistered. This diff fixes that.
Reviewed By: chadaustin
Differential Revision: D16187569
fbshipit-source-id: 8c84e1515e376ccd7036a22c06e2e6b98dc62342
Summary: Turn on logview collecting by default. Also disable logview for Eden integration tests.
Reviewed By: chadaustin
Differential Revision: D15978960
fbshipit-source-id: 623c3be7a461e0bb9bc44924fccfdb006565fad6
Summary: Added the cli command `eden stats object-store` for querying the counts on what part of the object store was responsible for finding the blob or blob size (local store or backing store). This will tell us how well local and in-memory caching works for different workflows.
Reviewed By: chadaustin
Differential Revision: D15934535
fbshipit-source-id: 70345f11a51c3c6996dc001d4101744395a3d182
Summary:
This diff adds a single repo-wide `.eden-redirections` file that is used to
record the redirection configuration for the repo.
The redirection configuration code knows to look at this file and fold it in to
the effective configuration; the legacy bind mounts are applied first, then the
repo redirection configuration and then the user specified redirections.
The intent is that a `post-update` hook will invoke `eden redirect fixup` to
apply the configuration from the repo, and to have the eden daemon also
trigger that after (re-)mounting the repo.
In the future we can extend this to allow different profiles to be enabled
or disabled, but for now this is the minimum viable product.
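The layering order described above can be sketched as a simple merge. This assumes that a later layer overrides an earlier one for the same repo path, which is an interpretation of "applied first ... then"; the function name and the example paths are hypothetical.

```python
from typing import Dict


def effective_redirections(
    legacy_bind_mounts: Dict[str, str],
    repo_config: Dict[str, str],
    user_config: Dict[str, str],
) -> Dict[str, str]:
    """Fold the three sources together; later layers win on conflicts."""
    merged: Dict[str, str] = {}
    for layer in (legacy_bind_mounts, repo_config, user_config):
        merged.update(layer)
    return merged


# Maps a repo-relative path to a redirection type (hypothetical entries).
legacy = {"buck-out": "bind"}
repo = {"buck-out": "symlink", "node_modules": "bind"}
user = {"node_modules": "symlink"}
effective = effective_redirections(legacy, repo, user)
```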
Reviewed By: strager
Differential Revision: D15867225
fbshipit-source-id: b0a95936dd28283de6c7439ca8e503caef4e7247
Summary:
This is part of the effort to make our bind-mount configuration more
visible and easier to change.
The idea is to generalize the concept of redirection and add a command to help
manage it.
The `eden redirect add` subcommand allows creating one of two different kinds
of redirection:
* `bind` - allocate some space using `mkscratch` and mount it into the repo
* `symlink` - allocate some space using `mkscratch` and create a symlink
that points to it from the repo
On Linux we use bind mounts to implement `bind` but on macOS, which doesn't
have a bind mount concept, we create a sparse disk image file that can grow
to match the size of the disk on which it is created (in practice these are
7-15MB in size to start and grow as the user stores data into them).
The `eden redirect del` subcommand allows removing a redirection, including
the legacy `bind-mounts` configuration from `.eden/client/config.toml`.
The `eden redirect list` subcommand lists the effective set of redirections,
both from the new redirections configuration and the legacy `bind-mounts`
configuration, along with their state.
The `eden redirect fixup` subcommand iterates over the effective set of
redirections and can remove and reinstate any that are in a broken state.
Reviewed By: strager
Differential Revision: D15707319
fbshipit-source-id: a5dd8c44c9f748482d7b48855b1305d44267885c
Summary: This diff takes care of importing blobs from Mononoke and Mercurial at the same time, and also improves the naming of the statistics counters.
Reviewed By: strager
Differential Revision: D15768557
fbshipit-source-id: 10cf831b1ae6dc9e6b91f1e96508c4fa92583743
Summary:
If edenfs was started using `sudo`, the `$USER` environment variable will be
set to `root` rather than the actual user. When we drop privileges make sure
we restore the value of `$USER` as well.
The `$USER` variable isn't checked anywhere else in edenfs itself, but it
matters for subprocesses we spawn, like `hg debugedenimporthelper`.
I also changed the code to clear the `SUDO_*` variables as well, mostly
just for good measure.
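The environment cleanup amounts to something like the following, shown here in Python rather than the daemon's C++. `env_after_privilege_drop` is a hypothetical name, and the function works on a copy of the environment instead of modifying the process.

```python
from typing import Dict


def env_after_privilege_drop(env: Dict[str, str], username: str) -> Dict[str, str]:
    """Return the environment that subprocesses should see after dropping root."""
    # Drop every SUDO_* variable left behind by sudo.
    cleaned = {k: v for k, v in env.items() if not k.startswith("SUDO_")}
    # sudo set USER to root; restore the real user's name.
    cleaned["USER"] = username
    return cleaned


sudo_env = {
    "USER": "root",
    "SUDO_USER": "alice",
    "SUDO_UID": "1000",
    "PATH": "/usr/bin",
}
child_env = env_after_privilege_drop(sudo_env, "alice")
```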
Reviewed By: kulshrax
Differential Revision: D15929539
fbshipit-source-id: e022c7ae762e2a5e86d0227058bb476aff17cf55
Summary:
Add a periodic task for performing LocalStore management tasks. For now only
the RocksDBLocalStore class implements this management task.
When this periodic task runs the RocksDBLocalStore object computes how much
space each of the column families is using and publishes this as fb303
counters. If the total size of the ephemeral column families exceeds a
configurable limit it then triggers a background garbage collection task.
I also added a new `edenfsctl stats local_store` command that reports the new
counters added by this diff.
Reviewed By: chadaustin, strager
Differential Revision: D15798505
fbshipit-source-id: 25ca4ba80f5a9c4a1a09dc08633c7b3af363d7ff
Summary:
Update the copyright & license headers in Python files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487088
fbshipit-source-id: 9f2138dff41048d2c35f15e09a04ae5a9c9c80dd
Summary:
Update the copyright & license headers in C++ files to reflect the
relicensing to GPLv2
Reviewed By: wez
Differential Revision: D15487078
fbshipit-source-id: 19f24c933a64ecad0d3a692d0f8d2a38b4194b1d
Summary:
Add a periodic task to reload the configuration file from disk. By default
this runs once every 5 minutes, but this interval can be controlled from the
config file.
At the moment reloading the config file does not do much other than update the
interval for how frequently the config file is reloaded. However, I plan to
add additional periodic tasks shortly that are controlled by this config
setting.
This will also make it possible for other parts of the code to
access the config settings in the `ServerState` and use them as-is without
checking to see if they reloaded. Currently all of the code that accesses
config values performs a check to see if the config needs to be reloaded. If
we want to switch to Mercurial-style configs in the future that check will be
substantially more expensive.
This diff also includes a new thrift call to force the config file to be
reloaded immediately. This can be used to restart automatic config reloading
if it is ever disabled in the config file.
Reviewed By: wez
Differential Revision: D15756357
fbshipit-source-id: 1999f4730903633ce838842932a6ae6a65eda4e6
Summary: Fix a few more issues raised by our Python lint checks.
Reviewed By: wez
Differential Revision: D15776717
fbshipit-source-id: 621960579c4567c4fb9395ae14cd7a8666726c1c
Summary: Remove a number of unused imports detected by the linter.
Reviewed By: wez
Differential Revision: D15776268
fbshipit-source-id: 221f45d275664d037bbabcac9858b40266b4833e
Summary:
If the systemd user manager is not running (e.g. it crashed or was manually stopped), 'eden stop' fails with an unhelpful error message:
> pystemd.dbusexc.DBusConnectionRefusedError: [err -111]: Could not open a bus to DBus
or
> pystemd.dbusexc.DBusBaseError: [err -2]: Could not open a bus to DBus
Provide a better experience: tell the user the most likely cause, and suggest a solution:
> error: The systemd user manager is not running. Run the following command to start it, then try again:
> sudo systemctl start user@strager.service
Reviewed By: wez
Differential Revision: D13791023
fbshipit-source-id: 5172df0a52d21c311b27b8a527cad934f9882154
Summary:
Change `ConfigTest` to derive directly from `EdenTestCase` rather than using
the `eden_repo_test` decorator. The configuration test code doesn't really
need a repository, and so we don't need to run it twice (for both Mercurial
and Git repositories).
Reviewed By: wez
Differential Revision: D15756359
fbshipit-source-id: 90d5011ae1ff7d2a251c9e7bb776045fbe2fdfe1
Summary:
Clean up the `EdenFS` class construction.
Previously it accepted the `eden_dir`, `etc_eden_dir`, and `home_dir`
arguments as separate parameters. If `etc_eden_dir` or `home_dir` were not
specified it would not pass these arguments to `edenfs`, allowing the default
values to be used. This is undesirable for most tests.
Now it accepts a `base_dir` argument. Explicit values for the `eden_dir`,
`etc_eden_dir`, and `home_dir` parameters can still be specified (this is used
for the snapshot tests), but if they aren't specified, default locations
inside the `base_dir` will be used instead.
This also cleans up some of the code to use `pathlib.Path` values instead of
plain `str` objects in more places.
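A minimal sketch of the defaulting behavior described above, assuming hypothetical subdirectory names (`eden`, `etc_eden`, `home`) and a hypothetical helper `resolve_dirs`; the real `EdenFS` class may lay things out differently.

```python
from pathlib import Path
from typing import NamedTuple


class EdenDirs(NamedTuple):
    eden_dir: Path
    etc_eden_dir: Path
    home_dir: Path


def resolve_dirs(base_dir, eden_dir=None, etc_eden_dir=None, home_dir=None):
    """Use explicit directories when given, else defaults under base_dir."""
    base = Path(base_dir)

    def pick(explicit, default_name):
        # The default subdirectory names are assumptions for illustration.
        return Path(explicit) if explicit is not None else base / default_name

    return EdenDirs(
        eden_dir=pick(eden_dir, "eden"),
        etc_eden_dir=pick(etc_eden_dir, "etc_eden"),
        home_dir=pick(home_dir, "home"),
    )
```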
Reviewed By: strager
Differential Revision: D15756358
fbshipit-source-id: 3e87ddc98d15fcb7f60c6c3116d4fcc8e49432ea
Summary:
Add a thrift call to get the current config settings.
My primary use case for this method at the moment is to make it possible to
build integration tests that check the config behavior. However in the future
this will probably also be useful for building CLI commands to report the
current config values to allow debugging if there are ever issues. This API
can also be used to force EdenFS to immediately reload the config from disk.
Reviewed By: strager
Differential Revision: D15572124
fbshipit-source-id: da3bc982f9c419b3314a8b0560c9bd327760d429
Summary:
With systemd integration enabled, if edenfs is running, or if edenfs' systemd service is active, `edenfsctl start` does nothing. This behavior differs from `edenfsctl start` with systemd integration disabled, and can cause `edenfsctl restart` to think that it successfully started edenfs.
Make `edenfsctl start` fail if edenfs is running and healthy, or if edenfs' systemd service is active (yet edenfs is unhealthy).
Reviewed By: chadaustin
Differential Revision: D15703310
fbshipit-source-id: ce0a13780ee03de1f896a938d002901023e5bdd3
Summary:
systemctl has some problems for Eden. For example:
* With Restart=on-failure, 'systemctl start' reports that the job failed when the first failure occurs. 'systemctl start' does not wait for retries to finish. This means 'eden start' can fail despite edenfs starting successfully.
* If the service fails to start, 'systemctl start' prints a suggestion to use journalctl, even though journalctl is broken and is not even used by fb-edenfs@.service.
* If 'systemctl' can't connect to systemd, it prints a generic message such as "Failed to connect to bus: Connection refused" which the Eden CLI can't easily detect and customize.
For 'eden start', instead of using systemctl, talk to systemd using its D-Bus API (via pystemd [1]). This automatically solves the journalctl message problem, makes it trivial to customize certain errors, and will let us solve the Restart=on-failure problem in the future.
Aside from changing some error messages, this diff should not change behavior.
[1] https://github.com/facebookincubator/pystemd
Reviewed By: simpkins
Differential Revision: D13533184
fbshipit-source-id: 7fedc8ad4a094a2d04b14c2f6e82b51a0ed348a6
Summary: We want to have just one entry point to Mercurial, namely the Rust binary. Getting rid of the `hg` Python script means that we finally can do things that only exist in Rust and not in Python.
Reviewed By: simpkins
Differential Revision: D13186374
fbshipit-source-id: f3c8cfe4beb7bf764172a8af04fd25202eca9af2
Summary:
test_reading_committed_file_bumps_read_counter is flaky with optimized builds of edenfs. I think it's flaky because FuseChannel bumps counters *after* responding to the kernel, so the test can call get_counters before the counters are bumped.
Fix the flakiness by making the test wait a while for the counters to change.
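One way to express "wait a while for the counters to change" is a small polling helper. This is only a sketch: `wait_for_change` and its parameters are hypothetical names, and the timeout values are arbitrary.

```python
import time


def wait_for_change(read_value, initial, timeout=5.0, interval=0.05):
    """Poll read_value() until it differs from initial, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value != initial:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"value did not change from {initial!r}")
        time.sleep(interval)
```

A test could read the counter once before the file access, then call `wait_for_change` with a function that re-reads it.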
Reviewed By: chadaustin
Differential Revision: D15550972
fbshipit-source-id: 891e5d0a9748b43eb0ef1089ef6bc0a547c47d4d
Summary:
In the past few months, test_no_units_are_active started failing. It looks like 'systemctl list-units' is now listing devices. For example:
```
$ systemctl list-units --all --full --no-pager
UNIT LOAD ACTIVE SUB JOB DESCRIPTION
● boot.automount not-found inactive dead boot.automount
proc-sys-fs-binfmt_misc.automount loaded active running Arbitrary Executable File Formats File System Automount Point
dev-disk-by\x2dlabel-\x5cx2f.device loaded active plugged /dev/disk/by-label/\x2f
[snip]
dev-getty.device loaded inactive dead /dev/getty
dev-loop0.device loaded active plugged /dev/loop0
dev-ram0.device loaded active plugged /dev/ram0
dev-ram1.device loaded active plugged /dev/ram1
[snip]
```
I don't know if systemctl changed or if systemd changed or if my machine's configuration changed. Either way, the test is failing now due to these systemd units.
Teach test_no_units_are_active to ignore these unimportant device units, since they don't represent running services or timers. This causes the test to pass on my machine.
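A sketch of the kind of filtering described above, assuming the test already has a list of unit names; `non_device_units` is a hypothetical helper, and the actual test may filter differently.

```python
def non_device_units(unit_names):
    """Drop systemd device units, which do not represent services or timers."""
    return [name for name in unit_names if not name.endswith(".device")]
```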
Reviewed By: wez
Differential Revision: D15548072
fbshipit-source-id: 4f49c72d88b836aba37ec5ea7b5ee5b7cb8172f6
Summary:
EdenFS' systemd tests detect if the running system is managed by systemd, and choose different strategies for creating a sandboxed systemd user manager depending on this detection. Sandcastle (Facebook's continuous integration system) recently started running systemd as the container's init system.
EdenFS' systemd tests correctly detect that Sandcastle's container is managed by systemd. Unfortunately, the tests cannot communicate with systemd. For example, `XDG_RUNTIME_DIR=/run systemctl --user status` in a Sandcastle job reports the following error:
> Failed to connect to bus: No data available
This error happens because systemd and EdenFS' tests are running in different process namespaces. The filesystem shows that the system is managed by systemd, but the process table shows otherwise! This causes all of EdenFS' systemd tests to fail.
Work around this issue by making Sandcastle use the old "unmanaged" code path. (The unmanaged code path is run if the system is not managed by systemd. Before Sandcastle started running systemd in its container, Sandcastle used this code path when running EdenFS' tests.)
(Ideally we would figure out why we need both the "managed" and "unmanaged" code paths in the first place. This diff is just meant to fix tests on Sandcastle for now, not implement a long-term solution.)
Possibly related upstream systemd issue: https://github.com/systemd/systemd/issues/11300
Reviewed By: wez
Differential Revision: D15530685
fbshipit-source-id: b65b568e660310c50a4e25e0aa143f9388f1ad45
Summary:
I am refactoring edenfs' EdenStats class. In doing so, I accidentally removed a call to `aggregate`, causing `flushStatsNow` to not expose accurate counters. This wasn't caught by any existing tests, only by manual testing.
Add some tests to prevent FUSE and HgBackingStore statistics collection from completely regressing.
Reviewed By: simpkins
Differential Revision: D15274275
fbshipit-source-id: c8a9c9848dd60aee7f252a93f10ddce6d7560799
Summary:
In another diff, Adam Simpkins noticed that HgImporterStatsTest.get_counters was clunky and that the logic belongs in the EdenTestCase base class.
Hoist HgImporterStatsTest.get_counters into EdenTestCase. Also avoid reusing the Thrift client between calls to get_counters because edenfs might restart between calls (in which case the old Thrift client won't work).
Reviewed By: wez
Differential Revision: D15514537
fbshipit-source-id: 0ae25106baa0e5b2d857b0bb2552d884b9b270ef
Summary: Some pyre-fixme directives are on the wrong line. Move them to the correct line to fix "Unused ignore" errors from Pyre.
Reviewed By: wez
Differential Revision: D15507418
fbshipit-source-id: b8d1163080b1c64868c37e7581411be31f495141
Summary: This function isn't used anywhere. Delete it.
Reviewed By: pkaush
Differential Revision: D9695388
fbshipit-source-id: 1ac702c98ee63d09c15c8a7b8a9c8d44fcec630d
Summary:
In EdenFS' tests, when systemd-run fails, subprocess.check_output raises an exception and causes the test to fail. The exception's message does not include any output of systemd-run, so systemd-run's error messages are hidden. This makes it hard to debug systemd-run failures.
When systemd-run fails, print the captured output. This surfaces error messages printed by systemd-run, improving the debugging experience.
In the normal case where systemd-run succeeds, this diff should not change behavior.
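The idea can be sketched with a thin wrapper around `subprocess.check_output`. The wrapper name `check_output_verbose` is hypothetical; the real test helper may be structured differently.

```python
import subprocess
import sys


def check_output_verbose(cmd, out=sys.stderr):
    """Like subprocess.check_output, but surface captured output on failure."""
    try:
        # Merge stderr into stdout so error messages are captured too.
        return subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        # The exception message omits the output, so print it before re-raising.
        out.write(e.output.decode(errors="replace"))
        raise
```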
Reviewed By: wez
Differential Revision: D15466610
fbshipit-source-id: 2b2db80b989308967e13499fcfadd37b44ca878f
Summary:
Update the integration tests to avoid explicitly using the old
`hg_import_helper.py` script.
Reviewed By: pkaush
Differential Revision: D15223982
fbshipit-source-id: 6e2310d5a9e6e0b95690a07d61d295cc3b1bca92
Summary: Flatmanifest is on its way out. Remove support for falling back to it if a tree import fails.
Reviewed By: pkaush
Differential Revision: D15056459
fbshipit-source-id: a4df820322ee354d77f50a0ec92e9705d0f152ec
Summary:
Move some of the argument parsing and config setup code out into a new
EdenInit.h header file. This makes it possible to re-use this logic for other
standalone utilities that want to be able to find the Eden state directory and
config information.
For now I have updated the `fake_edenfs` helper tool used by the integration
tests to use this. This may also be useful for writing standalone tools that
can perform garbage collection of the LocalStore or checking of the overlay
state.
Reviewed By: chadaustin
Differential Revision: D14889616
fbshipit-source-id: b0b193a42cb2f52177d0c44592426b42e27242aa
Summary:
On my system this test was failing because the `systemctl start exit.target`
command was exiting with an error. It looks like this command exits because
systemd closes its D-Bus connection as it is shutting down. This modifies the
test to ignore the return code from this systemctl command.
I also updated the log message when shutting down systemd to no longer include
the exception backtrace. This message is always logged when running this test
(since systemd is not actually running), and the backtrace information simply
makes the test logs harder to read.
Reviewed By: wez
Differential Revision: D14886191
fbshipit-source-id: 87c996b2579a9920a72ee5b57608c263ca080d6e
Summary:
Sometimes, EdenFS goes bonkers and talks to 'hg debugedenimporthelper' a lot for seemingly no reason.
Make these situations easier to debug by counting how many requests EdenFS makes to 'hg debugedenimporthelper'. These counters let us answer questions such as the following:
* When performance sucks, is EdenFS making a lot of requests or only a few requests? I.e. is Hg slow at responding or is EdenFS very demanding?
* Does a recent performance issue correlate with EdenFS communicating with 'hg debugedenimporthelper'?
* Which engineers are outliers having orders of magnitude more 'hg debugedenimporthelper' requests than p50 engineers?
We could get fancier with these counters and include the number of bytes received, the duration of the request, etc. For now, just having a request count is useful.
Reviewed By: simpkins
Differential Revision: D14677339
fbshipit-source-id: 7f8f394fb0096aef65d6a8a45d7da5936db539a0
Summary:
After the kernel added readdir caching, my testing uncovered that Eden
was invalidating TreeInode entries incorrectly when new entries were
added. Change TreeInode to distinguish between directory entry changes
and removals (FUSE_NOTIFY_INVAL_ENTRY) and additions
(FUSE_NOTIFY_INVAL_INODE).
Reviewed By: strager
Differential Revision: D13870422
fbshipit-source-id: 2a6f25bfd9e77436a5aae639fedbfd8a445b2e05
Summary:
Update `RocksHandles` to call `RepairDB()` if we get an error opening the
database.
We have seen errors opening the DB in some cases after hard server reboots.
In every case so far `ldb repair` has been able to repair it with no adverse
effects. This simply makes `edenfs` automatically attempt to perform this DB
repair step.
Reviewed By: chadaustin, strager
Differential Revision: D14452216
fbshipit-source-id: 10c0cb0ff9cea3c3bbe485a4e489e4a6df640803
Summary:
Sometimes, the XDG_RUNTIME_DIR environment variable isn't set. If this happens, 'eden start' fails because systemctl uses XDG_RUNTIME_DIR to talk to systemd. We still want 'eden start' to work in these cases, so guess what XDG_RUNTIME_DIR should be and use that guess if the variable isn't set.
If XDG_RUNTIME_DIR is set in the environment, its value should still be used.
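A minimal sketch of the fallback, assuming the guess is the conventional `/run/user/<uid>` location; the message above does not say which path is guessed, and `xdg_runtime_dir` is a hypothetical helper name.

```python
import os


def xdg_runtime_dir(env=None, uid=None):
    """Return XDG_RUNTIME_DIR from env, or a guess if it is unset."""
    env = os.environ if env is None else env
    value = env.get("XDG_RUNTIME_DIR")
    if value:
        # An explicitly set value always wins.
        return value
    uid = os.getuid() if uid is None else uid
    # Assumption: the per-user runtime directory lives under /run/user.
    return f"/run/user/{uid}"
```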
Reviewed By: chadaustin
Differential Revision: D13811813
fbshipit-source-id: bb44d99e585bbe7a4341087c5cb4644c606fc441
Summary:
Sometimes, Facebook's CI servers might not have a /run/systemd directory. This causes EdenFS' systemd tests to fail, because daemon-respawn can't access that directory [1].
Fix the tests on CI by creating /run/systemd.
Why did the tests only start failing recently? I'm not sure. I think we were just lucky in the past; tests in other projects seem to create /run/systemd (e.g. using the systemd-nspawn command), and it looks like this state persists across CI jobs.
[1] https://github.com/systemd/systemd/blob/v239/src/core/dbus-manager.c#L1277
Reviewed By: simpkins
Differential Revision: D14436098
fbshipit-source-id: eb48abeb1ce38ea4ae760192db37bb1910efff99
Summary:
If sanity_check_enabled_unit_fragment fails, the error message is unhelpful:
```
Exception: Enabled unit's FragmentPath does not match unit file
Expected: {repr(expected_unit_file)}
Actual: {repr(actual_unit_file)}
```
Use format strings as was originally intended, and drop repr to make the paths easier to read:
```
Exception: Enabled unit's FragmentPath does not match unit file
Expected: /data/users/strager/fbsource/fbcode/eden/fs/service/fb-edenfs@.service
Actual: /usr/lib/systemd/user/fb-edenfs@.service
```
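The unhelpful message is what Python produces when a string containing `{...}` placeholders lacks the `f` prefix. A small sketch of the corrected formatting follows; `mismatch_message` is a hypothetical name, not the actual function.

```python
def mismatch_message(expected_unit_file, actual_unit_file):
    """Build the error text with real f-strings and without repr()."""
    # Without the f prefix, "{repr(expected_unit_file)}" is emitted verbatim.
    return (
        "Enabled unit's FragmentPath does not match unit file\n"
        f"Expected: {expected_unit_file}\n"
        f"Actual: {actual_unit_file}"
    )
```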
Reviewed By: simpkins
Differential Revision: D13372758
fbshipit-source-id: 0f12cc7a6f63fc53d72ce92b265e0ccbcc26d394
Summary: We don't run this binary anymore, no reason to build and ship it.
Reviewed By: quark-zju
Differential Revision: D14437317
fbshipit-source-id: dd6da521783f18a2a518a7aa042be98950894e89
Summary:
If TreeInode::startLoadingInode() is in progress, and EdenServer::startTakeoverShutdown() is called, edenfs can deadlock:
1. Thread A: A FUSE request calls TreeInode::readdir() -> TreeInode::prefetch() -> TreeInode::startLoadingInode() on the children TreeInode-s -> RocksDbLocalStore::getFuture().
2. Thread B: A takeover request calls EdenServer::performTakeoverShutdown() -> InodeMap::shutdown().
3. Thread C: RocksDbLocalStore::getFuture() (called in step 1) completes -> TreeInode::inodeLoadComplete(). (The inodeLoadComplete continuation was registered by TreeInode::registerInodeLoadComplete().)
4. Thread C: After TreeInode::inodeLoadComplete() returns, the TreeInode's InodePtr is destructed, dropping the reference count to 0.
5. Thread C: InodeMap::onInodeUnreferenced() -> InodeMap::shutdownComplete() -> EdenMount::shutdown() (called in step 2) completes -> EdenServer::performTakeoverShutdown().
6. Thread C: EdenServer::performTakeoverShutdown() -> localStore_.reset() -> RocksDbLocalStore::~RocksDbLocalStore().
7. Thread C: RocksDbLocalStore::~RocksDbLocalStore() signals the thread pool to exit and waits for the pool's threads to exit. Because thread C is one of the threads managed by RocksDbLocalStore's thread pool, the signal is never handled and RocksDbLocalStore::~RocksDbLocalStore() never finishes.
Fix this deadlock by executing EdenServer::shutdown()'s callback (in EdenServer::performTakeoverShutdown()) on a different thread.
Reviewed By: simpkins
Differential Revision: D14337058
fbshipit-source-id: 1d63b4e7d8f5103a2dde31e329150bf763be3db7
Summary:
The feature was completed by Phil in D9816270. It's handy and can probably
reduce user support burden like: https://fb.intern.facebook.com/groups/scm/permalink/2039619916087618/
Therefore let's enable it.
Reviewed By: DurhamG
Differential Revision: D14293405
fbshipit-source-id: 54e934e0bf495c090109462e4f743d427df39380
Summary:
Update most of the `eden/cli/config.py` to use `Path` instead of `str` where
appropriate. This also updates several of the APIs in `util.py` that were
affected as well.
Reviewed By: chadaustin
Differential Revision: D14356543
fbshipit-source-id: a8f6d15b8870bf689eeb78f9fc0e9a0c65c97218
Summary:
If no mounts are configured `eden fsck` previously threw an exception when
trying to compute the return value. It called `max(return_codes)` on an empty
return codes list, which would fail. This changes the code to handle that
code specially and report a warning that there was nothing to check.
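The failure comes from `max()` raising `ValueError` on an empty sequence. A sketch of the guarded version, where the function name, the warning text, and the choice of returning 0 for the empty case are all assumptions:

```python
import sys


def overall_return_code(return_codes):
    """Combine per-checkout fsck results into a single exit code."""
    if not return_codes:
        # max([]) would raise ValueError, so handle this case explicitly.
        print("warning: no checkouts are configured; nothing to check", file=sys.stderr)
        return 0
    return max(return_codes)
```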
Reviewed By: chadaustin
Differential Revision: D14352112
fbshipit-source-id: 3815ef34a12834d642f3eee867dda6dc1117c2ef
Summary:
This updates `edenfs` to automatically create the mount point directory
if it does not exist.
Previously the `eden mount` CLI command would automatically create the mount
directory in the Python logic. This adds similar logic to the C++ code, which
handles more situations. In particular this makes it so that `eden start`
will now automatically create missing mount point directories.
Note that the C++ code does not create the `README_EDEN.txt` symlink inside
the mount point if it is missing. We could move that functionality into the
C++ code in the future if needed.
Reviewed By: strager
Differential Revision: D14254699
fbshipit-source-id: bad5634f57fba6e7af3b6a3830eb51ac099b435e
Summary:
This updates the `EdenServer` class so that the existing `getMount()` and
`getMountPoints()` APIs only return mounts that have finished initializing.
These APIs are primarily used by the thrift interfaces. In most cases the
callers did not intend to operate on mounts that were still initializing, and
doing so was unsafe. The code could potentially dereference a null pointer if
it tried to access the mount's root inode before the root inode object had
been created.
New `getMountUnsafe()` and `getAllMountPoints()` APIs have been added for call
sites that explicitly want to be able to access mounts that may still be
initializing. Currently the `listMounts()` thrift API is the only location
that needs this.
Reviewed By: strager
Differential Revision: D13981139
fbshipit-source-id: e6168d7a15694c79ca2bcc129dda46f82382e8e9
Summary:
Add a flag to tell edenfs to report successful start-up as soon as the thrift
server is running, without waiting for all mount points to finish being
remounted.
In the future I plan to have edenfs automatically perform an fsck scan of the
overlay for checkouts that were not shut down cleanly. This may cause the
remount to take a significant amount of extra start-up time in some cases.
(This is already true today in some cases even with the simpler scan we do to
re-compute the max inode number.)
I think we will probably want to have systemd invoke edenfs with this option,
so that we do not time out during system start up if some mount points need to
be rescanned.
Reviewed By: strager
Differential Revision: D13522040
fbshipit-source-id: 6f183770c25efee34c4805c9bad42a9cce51039e
Summary:
Update `edenfs` to automatically create the bind mount source directories if
they are missing. Previously Eden would report an error and would not be able
to mount the checkout if some of the bind mount source directories were
missing.
Reviewed By: strager
Differential Revision: D14253771
fbshipit-source-id: 87ad091ccf2c0f0f72aebb50437fd7680ddbfd1c
Summary:
Update `EdenMount::initialize()` to perform a fault injection check. This
allows test code to inject delays and errors into the mount initialization
flow.
Reviewed By: strager
Differential Revision: D14079491
fbshipit-source-id: be80135b0833c8f0300104524473cc3e949fec34
Summary:
The Eden CLI tool is really a control program for `edenfs`. Rename it to
`edenfsctl` to free up the `eden` name for future use.
The Eden daemon shouldn't really be on the user's path, and instead belongs in
`libexec`.
For transition compatibility, `eden` is symlinked to `edenfsctl`.
Reviewed By: simpkins
Differential Revision: D13888875
fbshipit-source-id: 435cc63e92b85b1f28b8691e4846fbcb05bc450e
Summary:
Now that `hg debugedenimporthelper` has been released for
a little while, we can remove the bundled implementation of it from
the eden release.
However, we cannot remove the script itself as there are users
with long running edenfs instances that pre-date the knowledge
of `hg debugedenimporthelper`. So, as a compatibility shim,
this diff redirects `hg_import_helper.py` and has it exec the
command in mercurial.
For extra fun, our own integration tests rely on being able
to import `hg_import_helper.py` and override portions of it,
so we cannot remove its implementation from the codebase just
yet either.
So this diff:
* Introduces `proxy_import_helper.py` which execs `hg debugedenimporthelper`
* Installs `proxy_import_helper.py` as `hg_import_helper.py` in the rpm
* Leaves `hg_import_helper.py` as-is in the tree for now.
Reviewed By: simpkins
Differential Revision: D13970332
fbshipit-source-id: 717dc86a880fbbbe4a7e801a8b748abd053c7f7c
Summary: Branches are going away. Remove the use of them.
Reviewed By: strager
Differential Revision: D14062107
fbshipit-source-id: 00f6d3666eb3cb6900cd570fa3fcf12ba75c2ae0
Summary: using upgrade script to clear out all remaining version-set configs
Reviewed By: dark
Differential Revision: D13832474
fbshipit-source-id: 52c280cbd79b1410821ed829465b1c0907b50a86
Summary:
Update the `eden list` command to also report the current state for each
checkout if it is not running normally. Also added a `--json` flag to
print information as JSON so it can be consumed programmatically.
Reviewed By: strager
Differential Revision: D13503053
fbshipit-source-id: 4ef366f5bf4a1157036fdfd7ff1056079588e802
Summary:
TemporarySystemdUserServiceManagerTest.test_exit_kills_manager is flaky due to a race condition. SystemdUserServiceManager.exit does not wait for the systemd process to exit; I think it only waits for systemd to close its socket. This means the process can still be alive, and `did_process_exit` can return false.
Fix the race condition by making SystemdUserServiceManager.exit block until the systemd process exits.
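The general pattern of blocking until a child process has really exited can be sketched with `subprocess`. This is an illustration of the idea only; `stop_and_wait` is a hypothetical helper and does not reflect how SystemdUserServiceManager.exit talks to systemd.

```python
import subprocess


def stop_and_wait(proc, timeout=10.0):
    """Ask a child process to exit, then block until it is actually gone."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Fall back to a hard kill so the caller never sees a live process.
        proc.kill()
        proc.wait()
    return proc.returncode
```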
Reviewed By: chadaustin
Differential Revision: D13791407
fbshipit-source-id: 8422e0101eaea8b4da285dcb0fcf564435b30065
Summary:
Add an option to forcibly kill `edenfs` with SIGKILL without ever attempting
to query it over thrift.
This should provide a way for users to reliably kill edenfs even if it is
hung. This shouldn't be necessary in most cases, but it lets us tell users to
run this command as a last resort if something is wrong.
Reviewed By: chadaustin, strager
Differential Revision: D13744188
fbshipit-source-id: 13378d04b3398e72ed3733d4ebb68b39868007bd
Summary:
When graceful restart was first implemented we forgot to update the
lock file with the new pid, resulting in occasional unexpected output
from tools like eden doctor.
Reviewed By: simpkins
Differential Revision: D13744411
fbshipit-source-id: cdc758ed6ac1201fd2ff3e9d7805bb5ab6f83e8a
Summary:
On a systemd-managed system, the `XDG_RUNTIME_DIR` environment variable is set on login. `systemctl` uses this variable to know how to talk to the systemd user manager. If `XDG_RUNTIME_DIR` is not set in the environment, `systemctl` (and thus `eden start`) fails with an unhelpful message:
Failed to connect to bus: No such file or directory
Improve this message by explicitly checking for the absence of `XDG_RUNTIME_DIR`.
Reviewed By: simpkins
Differential Revision: D13728111
fbshipit-source-id: a7f60fc29561acd05fbc1bf52d7968ae0e64d0c2
Summary:
The failure messages printed by 'eden start' are kinda crappy with systemd integration enabled. Add some tests for these messages so we can easily iterate on them.
In particular, test the following cases:
* The systemd user manager is no longer running
* The XDG_RUNTIME_DIR environment variable, needed to talk to systemd, is not set
* edenfs fails to start
This diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13723440
fbshipit-source-id: abae5c0e4a9f0bc6b8d0d606e8f5f36760aad5fa
Summary:
A bug in Pyre causes the properties of FindEXE to have an incorrect type. We currently work around this bug by silencing type errors. Unfortunately, this might silence legitimate errors too.
Instead of silencing type errors, use `typing.cast` to tell Pyre the correct type. This should expose legitimate errors if they exist.
This diff should not change behavior.
Reviewed By: chadaustin
Differential Revision: D13709138
fbshipit-source-id: 55f47f47062a35911c6bbe03ffd7b02a90a5107f
Summary:
In our linux deployments it was relatively straightforward
to import the mercurial runtime from a python process running the
system python executable. Our macOS deployments are a lot more
complex because they do not use the system python and do not install
the mercurial python packages in the python path of the target
python executable.
It is simpler to move the import helper functionality into a mercurial
command that we can invoke instead of our own helper program.
This diff moves the script to be a debug command and adjusts its
argument parsing to match the mercurial dispatcher requirements.
There are some stylistic mismatches between this code and the
rest of mercurial; I'm suggesting that we ignore those as the
medium term solution is that this command is replaced by eden
directly consuming the rust config parsing code and by native
rust code to perform the data fetching that we need.
Reviewed By: pkaush
Differential Revision: D13522225
fbshipit-source-id: 28d751c5de4228491924df4df88ab382cfbf146a
Summary:
If edenfs is not running or is unhealthy, 'eden rage' does not run 'eden doctor'. This means 'eden rage' does not include helpful output such as ~/local/.eden/clients/* being on NFS, or the Linux kernel version being unsupported.
Make 'eden rage' run 'eden doctor' regardless of the health of edenfs.
Reviewed By: simpkins
Differential Revision: D13633381
fbshipit-source-id: 2439057ba7a7bbe5041991ddc4ede256e86634f3
Summary:
Update the Eden CLI to use `os.path.realpath()` to resolve symlinks in the
Eden config directory before using it.
In most situations today it is common for the default path
(`$HOME/local/.eden`) to traverse a symlink in the user's home directory.
For users that are still using NFS home directories we can sometimes read this
symlink when running as the user, but we cannot read the symlink as root (for
instance, from inside the privhelper process).
Resolving symlinks in the CLI code ensures that the `edenfs` daemon will see
the final resolved path and will not need to traverse symlinks in the user's
home directory in this situation.
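A minimal sketch of the resolution step, assuming a hypothetical helper `resolve_config_dir`; the actual CLI code may apply it at a different point.

```python
import os


def resolve_config_dir(config_dir):
    """Expand ~ and resolve symlinks so the daemon sees the final path."""
    return os.path.realpath(os.path.expanduser(config_dir))
```

The resolved path is what would then be passed on to `edenfs`, so a process running as root never has to follow the symlink itself.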
Reviewed By: strager
Differential Revision: D13515871
fbshipit-source-id: 0602389492afc0b542e089bb002534f3d714882e
Summary:
Update the ServiceTestCaseBase code so that each test case keeps its state in
a single top-level temporary directory. This makes it a little easier to
figure out which directory is which when debugging a test. I also plan to
write a new test soon that creates some additional directories, and having one
top-level temporary directory avoids needing to create new TemporaryDirectory
objects.
Reviewed By: strager
Differential Revision: D13522026
fbshipit-source-id: 95a3d268d267a107bbf5e405839d64afd6afdb03
Summary:
Change some of the integration tests to read back the original command line
arguments from fake_edenfs over thrift rather than by writing it out to a file
on disk.
This shouldn't really change much, it just seemed slightly simpler.
Reviewed By: strager
Differential Revision: D13515855
fbshipit-source-id: 386207c00f28626e2125958895387a870ca87b82
Summary: 'eden clone' starts the EdenFS daemon if it's not already running. If the user opted into systemd integration, make sure the daemon is started via systemd.
Reviewed By: wez
Differential Revision: D13498650
fbshipit-source-id: 8c5da579f9b79363e2d825ea7c85d423cbcc6509
Summary:
Add another suppression for T38220626 that appears to have been missed in
D13502225 when it was rebased before landing.
Reviewed By: strager
Differential Revision: D13526440
fbshipit-source-id: 60f5f6eff36b5f8462286c229836ffcb88f3afc1
Summary:
Update the `eden clone` command to automatically create the `.hg` directory
when creating a checkout for a Mercurial repository.
Previously this logic was performed by a separate post-clone hook that was
invoked by `eden clone`. Having this logic in a separate script made the code
slightly more complicated, and meant that configuring Eden was also more
complicated, as the hook also needed to be installed and configured. Moving
the logic into the Eden CLI will make it easier to re-use this code in
`eden doctor` if the `.hg` directory needs to be repaired.
Reviewed By: wez
Differential Revision: D13447272
fbshipit-source-id: 11c4f8e389aead151dd235eff95c860a326967af
Summary: If edenfs crashes when starting, we don't want systemd to keep trying to restart the service forever. systemd already behaves as we want, but add a test to make sure this feature doesn't regress.
Reviewed By: wez
Differential Revision: D13327803
fbshipit-source-id: df4fb0e5b2d9874fda58bad903087e411efeeefc
Summary:
When run inside the systemd service (fb-edenfs@.service), edenfs' logs are written to `/var/log/messages` (on Facebook dev servers). This is undesirable, since those logs have a bunch of noise.
Make systemd-managed edenfs log to `~/local/.eden/logs/edenfs.log` instead, matching the behavior of custom-managed edenfs.
---
I considered using systemd's StandardOutput= and StandardError= directives [1], but they have limitations:
* **StandardOutput=file:%f/logs/edenfs.log**: When the `logs` directory is missing, systemd does not create it. In this case, systemd fails when it opens the log file, so systemd refuses to start the service.
* **StandardOutput=journal** [2]: journald and journalctl are broken for user services. Logging to journald only works with persistent journal storage [3][4], but Facebook uses volatile journal storage.
* **StandardOutput=syslog** [5]: rsyslog seems designed for system administrators, not users. I didn't investigate much, but I suspect it's impossible to make rsyslog write to a user-controlled path such as `~/local/.eden/logs/edenfs.log`.
* **LogsDirectory=%f/logs and StandardOutput=file:%L/edenfs.log** [6][7]: LogsDirectory= does exactly what we need, except it only supports paths relative to `/var/log` or `~/.config/log/`. `LogsDirectory=%f/logs` does not work, and systemd will ignore such a directive.
* **StandardOutput=file:%f/logs/edenfs.log and a `mkdir` service**: If we create a service which just creates the `logs` directory, and make fb-edenfs@.service depend upon that service, systemd can successfully open the log file [8]. In theory, using StandardOutput= would cause errors like "could not set resource limits" to be logged to `edenfs.log`. In practice, systemd does not respect the service's logging configuration when reporting such errors [9]. Therefore, this solution is no better than the manual redirect.
[1] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#StandardOutput=
[2] https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html#
[3] https://www.freedesktop.org/software/systemd/man/journald.conf.html#SplitMode=
[4] https://lists.freedesktop.org/archives/systemd-devel/2016-October/037554.html
[5] https://www.rsyslog.com/
[6] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#RuntimeDirectory=
[7] https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Specifiers
[8]
```name=fb-edenfs-logs@.service
[Service]
Environment=EDENFS_CONFIG_DIR=%f
ExecStart=/bin/sh -c ' \
set -e; \
set -u; \
\
/bin/mkdir -p -- "$${EDENFS_CONFIG_DIR}/logs"; \
'
```
```name=fb-edenfs@.service
[Unit]
After=fb-edenfs-logs@%i.service
Requires=fb-edenfs-logs@%i.service
```
[9] fd0ec39d38/src/basic/log.c (L560-L639)
Reviewed By: simpkins
Differential Revision: D13422459
fbshipit-source-id: 57c575a6f377812caa2a79168778576c6ccff33e
Summary:
I want to use fake_edenfs to test logging via EdenFS' systemd service. Make fake_edenfs and the real edenfs use similar logic to determine the log file path.
This diff should not change behavior for the real edenfs.
Reviewed By: simpkins
Differential Revision: D13424470
fbshipit-source-id: d0c2e035fdb5884dbd2d9704c7e0244d35e052f2
Summary:
Update some of the test code to use `Path.read_text()`, so we can eliminate
`eden.cli.util.read_all()`
Reviewed By: strager
Differential Revision: D13374245
fbshipit-source-id: 3399923b60ae78a4f7ea57367d097697c8b9c1cb
Summary:
D13422460 made TemporaryDirectoryMixin required for classes deriving from ServiceTestCaseBase, but ServiceTestCaseBase expressed this requirement poorly. Make the dependency explicit by making ServiceTestCaseBase derive from TemporaryDirectoryMixin.
This diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13454141
fbshipit-source-id: e07745cfd2a364da5011371fe8671d94d32c6798
Summary:
I noticed some Managed service tests behaving strangely on my dev server. They behave as if they are being managed by systemd.
I noticed that `systemctl --user status 'fb-edenfs@*.service'` showed a bunch of tmp-eden_test services. Since the Managed tests don't create an isolated systemd instance, they are starting and stopping services on my real systemd!
This behavior is caused by me setting service.experimental_systemd=true in my ~/.edenrc (D13371186).
Fix the odd behavior of these tests by preventing them from reading ~/.edenrc. Also do the same for /etc/eden.
Reviewed By: simpkins
Differential Revision: D13422460
fbshipit-source-id: b8a4cbabe55b75b34729d4122ba804cd7d3297a2
Summary:
The code for invoking 'eden start' is duplicated by test_daemon_command_arguments_should_forward_to_edenfs_without_leading_dashdash. I am going to add more required flags for 'eden start' tests, so the duplication is a problem. Make this test reuse spawn_start.
This diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13453511
fbshipit-source-id: 72293ed56ca90fce1cc1fa33d996edd08ca15767
Summary:
While trying to reproduce a failing test on Sandcastle (Facebook's CI), I encountered a bug: when run as root, StartWithRepoTestGit test_eden_start_with_systemd_mounts_checkouts failed with permission denied errors:
```
$ sudo -Hi
root$ SANDCASTLE=1 EDEN_TEST_FORCE_SYSTEMD_USER_SERVICE_MANAGER_TYPE=unmanaged ./buck-out/gen/eden/integration/integration#binary.par -r StartWithRepoTestGit.test_eden_start_with_systemd_mounts_checkouts
[snip]
fb-edenfs@tmp-eden_test.j62w_nfx-homedir-local-.eden.service: Main process exited, code=exited, status=70/SOFTWARE
[snip]
root$ less /var/log/messages
[snip]
Dec 10 20:05:11 devvm3761.prn2.facebook.com sh[3668152]: I1210 20:05:11.781113 3668152 main.cpp:280] edenfs exiting successfully
Dec 10 20:05:36 devvm3761.prn2.facebook.com sh[3669439]: I1210 20:05:36.908677 3669439 main.cpp:153] Running in experimental systemd mode
Dec 10 20:05:36 devvm3761.prn2.facebook.com sh[3669439]: W1210 20:05:36.910221 3669439 EdenConfig.cpp:362] error accessing config file /tmp/eden_test.j62w_nfx/homedir/.edenrc: Permission denied
Dec 10 20:05:36 devvm3761.prn2.facebook.com sh[3669439]: error creating /tmp/eden_test.j62w_nfx/homedir/local/.eden: boost::filesystem::filesystem_error: boost::filesystem::create_directory: Permission denied: "/tmp/eden_test.j62w_nfx/homedir/local/.eden"
```
edenfs is dropping its permission to my regular user. SUDO_ variables [1] propagate to edenfs, and edenfs calls `setuid($SUDO_UID)`. Clearing SUDO_UID manually fixes the test:
```
$ sudo -Hi
root$ env --unset SUDO_UID SANDCASTLE=1 EDEN_TEST_FORCE_SYSTEMD_USER_SERVICE_MANAGER_TYPE=unmanaged ./buck-out/gen/eden/integration/integration#binary.par -r StartWithRepoTestGit.test_eden_start_with_systemd_mounts_checkouts
[snip]
Ran 1 test in 14.720s
OK
```
According to systemd's documentation, services should have a mostly-empty environment [2]. It looks like the systemd user manager relies on this (because it's normally run via user@.service) and doesn't sanitize its environment before forking service processes.
Fix the bug by cleansing the environment when running systemd manually. This prevents SUDO_UID and other environment variables from propagating to services.
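A minimal Python sketch of the idea, assuming an allowlist approach. The helper name `sanitized_environment` and the allowlist contents are hypothetical and are not taken from the diff itself:

```python
import os
from typing import Dict, Iterable, Mapping

# Hypothetical allowlist: variables a manually started `systemd --user` is
# assumed to need. The real change may keep a different set.
DEFAULT_ALLOWLIST = ("HOME", "PATH", "USER", "XDG_RUNTIME_DIR")


def sanitized_environment(
    env: Mapping[str, str], allowlist: Iterable[str] = DEFAULT_ALLOWLIST
) -> Dict[str, str]:
    """Return a copy of env that contains only allowlisted variables."""
    allowed = set(allowlist)
    return {name: value for name, value in env.items() if name in allowed}


if __name__ == "__main__":
    clean = sanitized_environment(os.environ)
    # SUDO_UID, SUDO_COMMAND, etc. are dropped because they are not allowlisted.
    print(sorted(clean))
```

The resulting dictionary could be passed as the `env=` argument of `subprocess.Popen` when spawning systemd, so that service processes never inherit the SUDO_ variables.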
---
By coincidence, this change fixes the original bug I was trying to reproduce. On Sandcastle, `SUDO_COMMAND` is set to a long string with plenty of "special" characters (spaces, quotes, backslashes, equal signs, etc.). systemd barfs when shuffling the environment block around:
```
Reloading.
Failed to parse environment entry: "env=SUDO_COMMAND=/bin/bash -c SANDCASTLE_INSTANCE_ID=1488557959 [snip]\'\\\'\' \'\\\'\'\\--collection\'\\\'\' \'\\\'
Unknown serialization item '' \'\\\'\'\\--no-stress-run-sub-run\'\\\'\' \'\\\'\'\\--extended-tests\'\\\'\'\''
```
By not setting SUDO_COMMAND, we avoid this systemd bug.
---
[1]
```
root$ env | grep SUDO
SUDO_GID=100
SUDO_COMMAND=/bin/bash
SUDO_USER=strager
SUDO_UID=6986
```
[2] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Environment%20variables%20in%20spawned%20processes
Reviewed By: simpkins
Differential Revision: D13413110
fbshipit-source-id: a91d70f33c93e034bdef5573451d528a255e5fc1
Summary:
Now that FileInode read and write operations are stateless via BlobAccess and OverlayFileAccess,
EdenFileHandle no longer provides any value. Remove it. This also fixes eden's shutdown timeout
when a file handle is open and paves the way for FUSE_NO_OPEN_SUPPORT.
Reviewed By: strager
Differential Revision: D13325137
fbshipit-source-id: 71ed47a7c997f5035b4394ccb311f94332ecd8c2
Summary:
D13366696 made some tests fail. These tests assume stderr doesn't print extra messages.
Make these tests tolerate extra messages on stderr, so extra logging doesn't make them fail.
Reviewed By: simpkins
Differential Revision: D13409510
fbshipit-source-id: 7c80766dd3978bb78c9a6fc8ed23310c2ee744e3
Summary:
I want to allow opting into systemd using a setting in ~/.edenrc. Since should_use_experimental_systemd_mode is a global function, it can't read any configs. Make the config file visible to should_use_experimental_systemd_mode by moving it into EdenInstance.
This diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13370823
fbshipit-source-id: b604db66954d0a08973030daae38bf6b1433e821
Summary:
If the user opted into systemd integration, make 'eden restart' restart the systemd service.
Don't support --graceful yet. --graceful will be implemented in a later diff. Mark it as explicitly unsupported for now.
Reviewed By: chadaustin
Differential Revision: D13271438
fbshipit-source-id: e505f00cbc337a2bf4da77bdea4b8faba063607c
Summary:
This made it through landcastle because of our current ASAN flakiness
issue on sandcastle marking almost all integration tests as disabled.
Reviewed By: strager
Differential Revision: D13385128
fbshipit-source-id: 70dee16fc5c99819fe121f4b96f7307d05ac24e6
Summary:
When you run 'eden start' without systemd integration, edenfs writes startup logs to the terminal to let users know that stuff is happening:
```
$ eden start
Starting edenfs (dev build), pid 2792025
Opening local RocksDB store...
Opened RocksDB store in 0.95 seconds.
Remounting 1 mount points...
Successfully remounted /data/users/strager/fbsource-dev
Started edenfs (pid 2792025)
Logs available at /data/users/strager/.eden-dev/logs/edenfs.log
```
These startup logs are also used by various tests (especially 'eden restart's tests).
Make the same thing happen when running 'eden start' with systemd integration, improving the user experience and making some tests work:
```
$ EDEN_EXPERIMENTAL_SYSTEMD=1 \
./buck-out/gen/eden/cli/eden.par start \
--daemon-binary "${PWD}/buck-out/gen/eden/fs/service/edenfs"
Starting edenfs (dev build), pid 2800760
Opening local RocksDB store...
Opened RocksDB store in 0.693 seconds.
Remounting 1 mount points...
Successfully remounted /data/users/strager/fbsource-dev
Started edenfs (pid 2800760)
```
Reviewed By: wez
Differential Revision: D13241979
fbshipit-source-id: de79b714e42b690fdab7c21d9add46bc2da35328
Summary: Several tests for 'eden start', 'eden stop', and 'eden status' need to pass command-line arguments to fake_edenfs. With systemd support enabled, make 'eden start' forward daemon arguments to fake_edenfs, making these tests pass.
Reviewed By: wez
Differential Revision: D13249891
fbshipit-source-id: 9008a361fce7a5629535cc9d245b86073ee70826
Summary:
getSHA1 is only handling std::system_error. If another kind of exception is thrown, it's never caught and edenfs crashes.
Calling EdenMount::getInode with a path such as "./hello" will cause a std::domain_error to be thrown. Since std::domain_error is not derived from std::system_error, `getSHA1ForPathDefensively(["./hello"])` crashes edenfs.
Fix the crash by forwarding all exceptions over Thrift, not just std::system_error-s.
Reviewed By: simpkins
Differential Revision: D13386450
fbshipit-source-id: 06262dad30a5508ed482c9e8979b61aa9643280a
Summary:
On Sandcastle's continuous builds, many EdenFS tests are flaky. D13366696 added logging, but the logs don't tell us much:
```
==24060==AddressSanitizer: libc interceptors initialized
|| `[0x10007fff8000, 0x7fffffffffff]` || HighMem ||
|| `[0x02008fff7000, 0x10007fff7fff]` || HighShadow ||
|| `[0x00008fff7000, 0x02008fff6fff]` || ShadowGap ||
|| `[0x00007fff8000, 0x00008fff6fff]` || LowShadow ||
|| `[0x000000000000, 0x00007fff7fff]` || LowMem ||
MemToShadow(shadow): 0x00008fff7000 0x000091ff6dff 0x004091ff6e00 0x02008fff6fff
redzone=16
max_redzone=2048
quarantine_size_mb=256M
thread_local_quarantine_size_kb=1024K
malloc_context_size=30
SHADOW_SCALE: 3
SHADOW_GRANULARITY: 8
SHADOW_OFFSET: 0x7fff8000
==24060==Installed the sigaction for signal 11
==24060==Installed the sigaction for signal 7
==24060==Installed the sigaction for signal 8
==24060==T0: FakeStack created: 0x7f2a640b4000 -- 0x7f2a64bbd000 stack_size_log: 20; mmapped 11300K, noreserve=0
==24060==T0: stack [0x7fff12e4e000,0x7fff1364e000) size 0x800000; local=0x7fff1364d298
==24060==AddressSanitizer Init done
AddressSanitizer ignores mlock/mlockall/munlock/munlockall
==24208==Processing thread 24060.
==24208==Stack at 0x7fff12e4e000-0x7fff1364e000 (SP = 0x7fff1364c3f8).
==24208==TLS at 0x7f2a6a7e3600-0x7f2a6a7e46c0.
==24208==DTLS 18 at 0x4f80004920000010-0x5280004820000012.
Tracer caught signal 11: addr=0x0 pc=0x7f2a69e830e0 sp=0x7f2a4ae85cd0
==24060==LeakSanitizer has encountered a fatal error.
==24060==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==24060==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
```
Facebook's test infra is disabling many of EdenFS' tests. This has already caused regressions to land [1]. We need to get these tests passing and enabled ASAP.
The current hypothesis is that LSAN's leak detection is crashing. Disable the leak detection, hopefully making tests pass again. (We still want to find and fix the root cause, of course!)
[1] T37784916: test_rage_output is failing: AssertionError: 'General EdenFS Statistics' not found
Reviewed By: simpkins
Differential Revision: D13385022
fbshipit-source-id: c442146c39ce84c19fc53916aef421cece6d8b40
Summary:
edenfs's privhelper process needs the CAP_SYS_ADMIN capability [1] in order to manage mount points. Give it this capability by invoking edenfs as the `root` user using `sudo`.
Known issues:
* `sudo` must be passwordless in order for fb-edenfs@.service to start.
* systemd can't kill all of fb-edenfs@.service's processes, so `systemctl stop`, for example, is unreliable.
[1] https://manpage.me/index.cgi?q=capabilities&apropos=0&sektion=0&manpath=CentOS+7.1&arch=default&format=html
Reviewed By: chadaustin
Differential Revision: D13113450
fbshipit-source-id: 01b89521cab371b5017fab6fbd38d55eea599c46
Summary:
I want to reuse assert_systemd_service_is_active in another test. Move it (and the related assert_systemd_service_is_stopped function) so it can be reused.
This diff should not change behavior.
Reviewed By: chadaustin
Differential Revision: D13327802
fbshipit-source-id: 022c3ed3b9e8f04ef1156c2bb4b3deda662439e4
Summary: 'eden stop' should behave like 'systemctl stop'; the service should be stopped after 'eden stop' returns. Add a test which verifies this.
Reviewed By: chadaustin
Differential Revision: D13288819
fbshipit-source-id: 5b836c8ac7c5eb97c484195496f38c7cf70c84dc
Summary:
On Sandcastle's continuous builds, many EdenFS tests are flaky:
```
EdenCommandError: eden command [/mnt/btrfs/trunk-hg-fbcode-fbsource-303-1544100804/fbcode/buck-out/dev/gen/eden/cli/eden.par --config-dir /tmp/eden_test.udfpngq0/homedir/local/.eden --etc-eden-dir /tmp/eden_test.udfpngq0/etc-eden --home-dir /tmp/eden_test.udfpngq0/homedir repository main /tmp/eden_test.udfpngq0/repos/main] returned non-zero exit status 1
stderr=b'Tracer caught signal 11: addr=0x0 pc=0x7f65cea990e0 sp=0x7f65afb21cd0\n==1698412==LeakSanitizer has encountered a fatal error.\n==1698412==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1\n==1698412==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)\n'
```
I failed to reproduce this issue on my own dev server, and I failed to reproduce this issue by running 'buck test' on a Sandcastle worker.
As a last resort, make LSAN report more stuff in an effort to track down the cause of this failure.
Aside from extra logging, this diff should not change behavior.
Reviewed By: chadaustin
Differential Revision: D13366696
fbshipit-source-id: e052700dbc4ed9d30f864b8d2dc5ccad27e1a281
Summary:
Add the plumbing necessary to make 'eden start' start a systemd user service. This is only enabled if you opt in using EDEN_EXPERIMENTAL_SYSTEMD.
Currently, only fake_edenfs works. The real edenfs doesn't work yet because it needs root access to configure mount points.
'eden restart', 'eden stop', etc. are not affected by this diff (and are probably broken with EDEN_EXPERIMENTAL_SYSTEMD enabled).
Reviewed By: simpkins
Differential Revision: D10849390
fbshipit-source-id: c087a6498951ff100e5c80bd07ad869b2709e1b3
Summary:
Stop holding a reference count to the TreeInode while a directory
handle is open. This allows eden to shut down while a directory handle
is open.
Reviewed By: strager
Differential Revision: D13287701
fbshipit-source-id: a24f32a1ac40b6c19bc5864aa5f5785f3016361b
Summary:
If a file was partially truncated, it would not always be marked as
materialized. During materialization, the SHA-1 would be cached,
but not invalidated after the truncation.
Write tests that ensure that both ftruncate and O_TRUNC mark files as
modified.
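The two truncation paths can be sketched in Python as follows. This only illustrates the system calls involved; the helper names are hypothetical, and the sketch does not reproduce the actual tests or check EdenFS' materialization state:

```python
import os
import tempfile
from typing import Tuple


def truncate_with_ftruncate(path: str, new_size: int) -> None:
    """Resize an existing file through an open file descriptor."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.ftruncate(fd, new_size)
    finally:
        os.close(fd)


def truncate_with_o_trunc(path: str) -> None:
    """Empty an existing file as a side effect of opening it."""
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.close(fd)


def demo() -> Tuple[str, int]:
    """Partially truncate a file, then empty it, and report both results."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello")
        with open(path, "w") as f:
            f.write("hello world")
        truncate_with_ftruncate(path, 5)  # partial truncation
        with open(path) as f:
            partial = f.read()
        truncate_with_o_trunc(path)  # full truncation on open
        emptied_size = os.path.getsize(path)
    return partial, emptied_size
```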
Reviewed By: simpkins
Differential Revision: D13329102
fbshipit-source-id: f09fdc5f11f1da25e1b4453de1b29d1390b3dc71
Summary:
Add some skipped tests that illustrate that Eden cannot shut down when
file handles are open. They will be enabled later in this stack as the
root cause is fixed.
Reviewed By: strager
Differential Revision: D13287587
fbshipit-source-id: cd8d79896127676853c183bb4b86c0d586ce511e
Summary: I'm about to add some more integration tests for unmounting.
Reviewed By: strager
Differential Revision: D13279068
fbshipit-source-id: e84580497f22b9dc6d5a04835dc4beede52a07fd
Summary: CLI tests use shutil.rmtree to clean up temporary directories. Reuse TemporaryDirectoryMixin which is more robust against errors and supports EDEN_TEST_NO_CLEANUP.
Reviewed By: chadaustin
Differential Revision: D13268108
fbshipit-source-id: d77e95a2def0dceb34cf14e19c0c0c0e3aeef3f2
Summary:
systemd uses cgroups to contain a service's processes.
When EdenFS' tests run systemd on Sandcastle, the systemd process creates its cgroups as children of Sandcastle's cgroup. For example, the test-SystemdServiceTest.service service created in test_running_simple_service_is_active has the following cgroup on Sandcastle:
/sandcastle-job/command/test-SystemdServiceTest.service
Unfortunately, systemd uses predictable cgroup names based on the service name. A service managed by one systemd instance shares the same cgroup as a same-named service managed by a different systemd instance. For example, if multiple systemd tests run concurrently on Sandcastle (e.g. during stress-testing), `systemctl --user stop foo.service` from one run will kill the processes of a different run's active foo.service. This causes systemd tests to be flaky on Sandcastle.
Fix the conflict by creating a unique cgroup per systemd instance. For example, separate runs of test_running_simple_service_is_active might use the following cgroups for the test service:
/sandcastle-job/command/edenfs_test.1_qat2xv/test-SystemdServiceTest.service
/sandcastle-job/command/edenfs_test.bbd3gj_o/test-SystemdServiceTest.service
Reviewed By: simpkins
Differential Revision: D13016626
fbshipit-source-id: 8535dc14a06bdb403c926b111cad4aed6c8ec3e3
Summary:
Check that the next-inode-number file exists and that the inode number it
contains is actually larger than all existing inode numbers. Replace it with
correct data if the file does not exist, is corrupt, or contains an incorrect
inode number.
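The check can be sketched as below. This is an illustration only: the function name is hypothetical, the on-disk file format is not modeled, and the choice of "largest existing inode number plus one" as the replacement value is an assumption:

```python
from typing import Iterable, Optional


def repaired_next_inode_number(
    stored: Optional[int], existing_inodes: Iterable[int]
) -> int:
    """Return the value the next-inode-number file should contain.

    stored is the number read from the file, or None if the file is
    missing or could not be parsed.
    """
    # Assumption: the smallest acceptable value is one past the largest
    # inode number already in use.
    minimum = max(existing_inodes, default=0) + 1
    if stored is None or stored < minimum:
        return minimum
    # The stored value is already larger than every existing inode number.
    return stored
```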
Reviewed By: chadaustin
Differential Revision: D12955093
fbshipit-source-id: 3d26fb475535577d9a2697bbd575fba350766d01
Summary:
Update fsck to extract data for orphan inodes to a lost+found directory in the
fsck log directory, and remove them from the overlay. This will allow users
to recover the orphan file data if they want, and remove it otherwise.
Reviewed By: chadaustin
Differential Revision: D12955094
fbshipit-source-id: 9783452fce4060b9c5c48b3d48dd1f70294211c6
Summary:
Add several more files to the basic snapshot, so we can test more cases
in the fsck tests:
- Materialized, new, and unmodified symlinks
- A deeper directory tree of directory inodes that are not materialized (still
have a source control tree hash) but have children inodes allocated and are
therefore present in the overlay.
- A socket in a slightly deeper directory so we can test behavior of sockets
inside directories that have been corrupted.
As before I have replaced the older basic snapshot instead of adding a new
one, since the Eden data storage formats have not changed since the last
snapshot was created.
Reviewed By: chadaustin
Differential Revision: D13164658
fbshipit-source-id: d117c9cc336709044de212637c03140dfadd9a96
Summary:
This updates the basic snapshot code to include a couple slightly deeper
directories. I plan to use this in the fsck tests to verify handling of
orphan directories that contain subdirectories.
Normally it would be preferable to keep the old `basic-20181030` snapshot, and
simply add this new snapshot rather than replacing the old one. However, I
don't think we have made any meaningful changes to our on-disk storage formats
since the previous snapshot was generated, so it seems okay to just delete the
old snapshot.
Reviewed By: strager
Differential Revision: D13151861
fbshipit-source-id: e6b7583beecb5d9cc55271ad2dea8d36980542d1
Summary:
Add a helper class for maintaining the list of expected files.
I plan to use this for the fsck tests, so I can more easily modify the
snapshot's expected contents based on how we expect fsck to repair various
types of overlay corruption. This also helps slightly simplify the code that
constructs the expected file list.
Reviewed By: chadaustin
Differential Revision: D13095918
fbshipit-source-id: 57686e82d1bf7f23a92eda0ed4d66623a3f58840
Summary:
Add basic high level logic to fsck to begin fixing problems that are found.
This adds basic checks to decide if we should fix errors or not.
If errors are found and need to be fixed, this creates a new directory inside
the checkout state directory in `.eden` to record the actions taken by this
fsck run. This directory will contain a log file that records the actions
taken. In the future the fsck logic will also use this directory to store
copies of the corrupted inode data, and can store extracted orphan inode data
here as well.
Reviewed By: wez
Differential Revision: D12955044
fbshipit-source-id: 06c1e17a0a51fa5e2c0f2aab83b367b9358fd004
Summary:
After sending SIGTERM to systemd, increase the wait timeout from 3 seconds to
15 seconds. The previous 3 second timeout was easy to hit in practice on our
continuous build infrastructure.
If systemd does not exit within 15 seconds after we send SIGTERM, send SIGKILL
and then try waiting on it again for up to 3 seconds. Forcefully killing it
seems preferable to leaving the process hanging around after the tests exit.
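A minimal sketch of this escalation using `subprocess`. The function name `stop_process` is hypothetical, and the real test code may be structured differently:

```python
import subprocess
import sys


def stop_process(
    proc: subprocess.Popen, term_timeout: float = 15.0, kill_timeout: float = 3.0
) -> int:
    """Terminate proc, escalating to SIGKILL if it does not exit in time."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=term_timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        # May raise subprocess.TimeoutExpired again if the process is stuck.
        return proc.wait(timeout=kill_timeout)


if __name__ == "__main__":
    # Stand-in child process; the tests would pass the systemd process here.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit status:", stop_process(child))
```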
Reviewed By: strager
Differential Revision: D13159491
fbshipit-source-id: debce21f2f202fb7cfa4a53120dcb2b2b35ccbe3
Summary:
Replace some member variables in
`_TransientUnmanagedSystemdUserServiceManager` with locals, to make pyre
happy.
Previously pyre complained that these member variables are not initialized in
`__init__()`. These variables are only needed temporarily during
`__enter__()` (and in some clean up closures), so just use them as local
variables instead.
Reviewed By: strager
Differential Revision: D13135257
fbshipit-source-id: 76f2bdc4b7b36d2102ad8dab4a60722a03197fab
Summary:
If invoked with `--log-target=console` systemd will log to stderr even if it
is not a tty.
This changes the tests to pass in `--log-target=console` rather than creating
a pty and forwarding I/O from it in a separate background thread.
Reviewed By: strager
Differential Revision: D13135258
fbshipit-source-id: 11dfe0711adaa62cedba2882045d8088e0df5499
Summary: test_processes_of_forking_service_includes_all_child_processes is failing on some machines. The test assumes /sys/fs/cgroup/ is a cgroups v2 mount point, so it fails if /sys/fs/cgroup/ is a directory containing cgroups v1 mount points. Disable the test on machines where cgroups v1 is detected.
Reviewed By: simpkins
Differential Revision: D13112836
fbshipit-source-id: 7921604707a0c1fe81a82c87e767a6a99cdd6206
Summary: Fix the remaining set of errors reported by pyre and mypy.
Reviewed By: strager
Differential Revision: D13086855
fbshipit-source-id: 4c2b21352f94ef225a5555aef0f6b95b92e56f6d
Summary:
Our code in testcase.py mucks around with test case classes in several ways
that mypy doesn't like. In particular, mypy does not currently support
replacing methods on classes, and it also does not understand dynamic base
classes. pyre also trips up on some of these changes, although in different
ways.
This updates testcase.py so that mypy and pyre no longer report errors on it,
often just by suppressing the errors. Also fix similar errors in
service_test_case.py around replicating test classes.
Reviewed By: strager
Differential Revision: D13086856
fbshipit-source-id: af446dd13791f5da50b09657012db95c2bcf0e39
Summary:
This updates the integration tests to add type annotations to most functions
that were missing annotations.
In particular this is needed to make pyre happy, as it complains if subclasses
override methods from their parent class and do not specify type annotations
if the parent class did have annotations.
This diff also contains some minor changes to hg_extension_test_base.py to
explicitly declare some abstract methods that it uses. This was also
necessary to make pyre happy about this code.
Reviewed By: strager
Differential Revision: D13051097
fbshipit-source-id: 77567ed2f4d3050f93acefb52e688932d276d587
Summary:
Drop the `stdout` and `stderr` arguments, so that this method always returns a
string. Change callers that were previously calling this method with
`stderr=None` to use the `run_hg()` method instead of `hg()`. `run_hg()`
returns a `subprocess.CompletedProcess` object.
This change simplifies the python type checking, and fixes several existing
type checking errors in the code. Even though most call sites could be
guaranteed that this function would return a `str`, the type checker wasn't
smart enough to tell that the return type would be fixed based on the argument
values, and so it assumed the result always needed to be checked for `None`.
This also updates the `GitRepository.git()` method in a similar fashion.
However, that was a simpler change since it already returned a `str` in all
cases.
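The split can be sketched as follows. The `Repository` class and its `run()` and `output()` methods are hypothetical stand-ins for `run_hg()` and `hg()`; the bodies are an assumption about the general shape, not the actual implementation:

```python
import subprocess
import sys
from typing import List


class Repository:
    """Hypothetical stand-in for the test repository wrappers."""

    def __init__(self, command: List[str]) -> None:
        self.command = command

    def run(self, *args: str) -> "subprocess.CompletedProcess[str]":
        """Full-control variant: callers inspect stdout, stderr, returncode."""
        return subprocess.run(
            self.command + list(args),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
            encoding="utf-8",
        )

    def output(self, *args: str) -> str:
        """Convenience variant: the return type is always str, never None."""
        return self.run(*args).stdout


if __name__ == "__main__":
    # Use the Python interpreter as the command so the sketch runs anywhere.
    repo = Repository([sys.executable, "-c"])
    print(repo.output("print('hi')"), end="")
```

Because each method has a single fixed return type, the type checker no longer has to infer the result from argument values.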
Reviewed By: strager
Differential Revision: D13078095
fbshipit-source-id: a8def2a33edc865ac40279bbcb3ada4dade68374
Summary:
Restructure storage_engine_test so that the base class derives from
EdenTestCase.
Reviewed By: strager
Differential Revision: D13051096
fbshipit-source-id: e89c4b56e361460b2457d1c2e6a22727a25d7646
Summary:
I want to use Hypothesis to fuzz some CLI code. Move EdenFS' Hypothesis configuration out of eden.integration.lib.testcase and into a place where both CLI tests and integration tests can use it.
This diff should not change behavior.
Reviewed By: wez
Differential Revision: D12813285
fbshipit-source-id: 3a1badd1e18b0e070295ea03dcb24be166cd42c1
Summary:
_transient_unmanaged_systemd_user_service_manager has a few inner functions and a few variables shared between these inner functions. The function is pretty long and hard to follow.
I think a class is more familiar than closures. Refactor _transient_unmanaged_systemd_user_service_manager into a class: turn inner functions into methods, and shared variables into instance attrs.
This diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13017642
fbshipit-source-id: d20f7476142fa5d7ba0ae09291228ec63e127338
Summary:
If you enable type checking of util.py, mypy complains about _remove_readonly having the wrong type:
eden/integration/lib/util.py:87: error: Argument "onerror" to "rmtree" has incompatible type "Callable[[Callable[[str], Any], str, Tuple[Type[Any], BaseException, TracebackType]], None]"; expected "Optional[Callable[[Any, _PathLike[str], Any], Any]]"
Fix the error by making _remove_readonly compatible with onerror's signature in typeshed [1]:
overload
def rmtree(path: _AnyPath, ignore_errors: bool = ...,
onerror: Optional[Callable[[Any, _AnyPath, Any], Any]] = ...) -> None: ...
This diff should not change behavior.
[1] 4dc21f04dd/stdlib/2and3/shutil.pyi (L90-L92)
Reviewed By: simpkins
Differential Revision: D13086137
fbshipit-source-id: 222e5fa2e06a26464483a0f09545089a7ecc5234
Summary:
While debugging flakiness in test_running_simple_service_is_active, I noticed that sometimes ActiveState=active despite the service process being dead. This can happen if the service process is killed outside systemd. When this happens, SubState=exited instead of the expected SubState=running.
Prevent false positives in SystemdServiceTest tests by checking SubState in addition to ActiveState.
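A minimal sketch of such a check, assuming the properties come from `systemctl show` in its `Key=Value` output format. The function names are hypothetical:

```python
from typing import Dict


def parse_systemctl_show(output: str) -> Dict[str, str]:
    """Parse the Key=Value lines printed by `systemctl show`."""
    properties = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' separator
            properties[key] = value
    return properties


def is_service_running(properties: Dict[str, str]) -> bool:
    """True only if the unit is active and its process is still running."""
    return (
        properties.get("ActiveState") == "active"
        and properties.get("SubState") == "running"
    )
```

With this check, a unit reporting `ActiveState=active` and `SubState=exited` is treated as not running.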
Reviewed By: simpkins
Differential Revision: D13032619
fbshipit-source-id: 2d5754291a19290d29a817115923e9cc5efc90ab
Summary:
A few times, I've needed to manually make _is_system_booted_with_systemd return False in order to emulate how Sandcastle behaves. Make this easier by introducing an environment variable, EDEN_TEST_FORCE_SYSTEMD_USER_SERVICE_MANAGER_TYPE, which allows choosing how to start `systemd --user` when running tests.
When EDEN_TEST_FORCE_SYSTEMD_USER_SERVICE_MANAGER_TYPE is unset, this diff should not change behavior.
Reviewed By: simpkins
Differential Revision: D13031404
fbshipit-source-id: ecbd5f90ff55f4dffa47ba797686db5c25a7198c