Summary:
Although this is not the type of behavior we want to encourage, we should make
it possible. It turns out that this was throwing an exception because
`make_glob_list()` was erroneously mapping the pattern to `/**/*` instead of
`**/*` in this case.
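
As a rough illustration of the failure mode (not the actual `make_glob_list()`
code), the sketch below assumes the problematic case is an empty directory
prefix. The function name `make_glob_pattern` and its signature are
hypothetical.

```python
def make_glob_pattern(relative_dir: str) -> str:
    """Return a recursive glob for everything under relative_dir.

    relative_dir is assumed to be relative to the repo root; '' means the
    root itself. (Hypothetical helper, not Eden's make_glob_list().)
    """
    prefix = relative_dir.strip("/")
    if not prefix:
        # Naively joining an empty prefix with '/**/*' would yield '/**/*',
        # which looks like an absolute pattern. Return a relative one.
        return "**/*"
    return prefix + "/**/*"
```

The point is only that the empty prefix needs its own branch; a plain string
join cannot tell "no directory" apart from "the root directory".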
Reviewed By: wez
Differential Revision: D5826753
fbshipit-source-id: 659d67c13cdcda39abb7d6893a57ef046804da73
Summary:
It turns out that we had a small bug with our matcher code that did not account
for pattern normalization. I discovered this while dogfooding Eden and using
`hg grep <pattern> <directory>` from a subdirectory in my working copy. Given
that the fix was to patterns, in general, this likely fixes other `hg` commands
that take a file pattern when used someplace other than the repo root.
Reviewed By: wez
Differential Revision: D5825483
fbshipit-source-id: 0d639cbb2fc678c5459e02e965bf6fc6d7c10959
Summary:
Following on from D5798659, this diff pulls the mount flow into
EdenServer. Previously the flow of control would bounce back and forth between
the EdenServiceHandler and EdenServer and this made it (IMO) more difficult to
follow and understand where to add things into the flow.
Now `EdenServer::mount` is the main entry point for the mounting flow.
I've simplified the stat registration and broken that out into helper methods
to avoid cluttering up the mount logic.
Reviewed By: bolinfest
Differential Revision: D5806393
fbshipit-source-id: 7c858a2a580332ce82c2600e9dc3537af1d734d1
Summary:
This moves the bind mounting and post-clone script running
functionality to methods of the EdenMount class and makes the whole
mount flow return a `Future<>`. The higher level goal is to make it
easier to see where and how we want to tweak this flow to support
graceful restart.
This is mostly straightforward, but care is required to avoid deadlocks; there
are two scenarios:
# We fulfill the fuse start promise in the context of the fuse thread that is
handling the fuse initialization packet before it has signalled to the kernel
that it has come up. This can be solved by using `via(mainEventBase_)`, but...
# When remounting all the mounts on startup, we're running in the
`mainEventBase_` thread so the simplistic solution to 1. would cause us to
deadlock on startup (visible in the remount integration tests).
So to avoid this, we shunt the completion of the future via a CPU pool.
Also worth noting: the way we were setting up the global CPU pool with
wangle wasn't correct; it takes a weak reference to the pool which was
then getting destroyed when our prepare method returned. It just happened
to work for us in the facebook specific build because something else was
setting up a different CPU executor.
I've reconciled this by just setting up a thread pool of our own and
using that explicitly.
Reviewed By: bolinfest
Differential Revision: D5798659
fbshipit-source-id: f1c48730f283f6962f6cd706c02d82ea2952e369
Summary:
This is reasonably straightforward, although a little
more fiddly than I'd hoped because the timer wheel stuff doesn't
offer a convenient way to set up a recurring timer.
I've also made the inode unloading code get run globally for all
mounts; it was previously scheduling one timer per mount point.
This nets out the same; the function scheduler was just a single
thread anyway, so there is no change in the level of concurrency.
I believe that this tidies up the unload counter too; it looked
like we'd set the counter to be the result of the last mount
point that we processed rather than the aggregate of all mounts.
Having the unload timer be associated with the server rather
than the mount points means that we don't have to do anything
special to coordinate with the timer management when the mount
point is being torn down.
Reviewed By: bolinfest
Differential Revision: D5792938
fbshipit-source-id: 1a14bb7b7f4952139e684fe6b52f64bd1ba70dd0
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.
Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.
This also updates the log messages to communicate when there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.
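
A minimal sketch of the Python-side conversion, assuming a stand-in exception
class. The real `NoValueForKeyError` is generated from `eden.thrift`;
`get_dirstate_value` and `client_get` are hypothetical names used only for
this example.

```python
class NoValueForKeyError(Exception):
    """Stand-in for the Thrift-generated exception (assumption)."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def get_dirstate_value(client_get, key):
    """Look up key via client_get, mapping only the 'no value' case to KeyError.

    client_get is a hypothetical callable standing in for the Thrift call.
    """
    try:
        return client_get(key)
    except NoValueForKeyError:
        # The one expected failure: translate it for dict-like callers.
        raise KeyError(key) from None
    # Any other exception type propagates unchanged rather than being swallowed.
```

Catching only the explicit exception type is what keeps a genuine logic error
on the Eden side from masquerading as a missing key.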
Reviewed By: wez
Differential Revision: D5800833
fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
Summary:
Previously, we were clearing entries in `hgDirstateTuples` for which:
```
mergeState == NotApplicable
```
but we should have been checking for:
```
mergeState == NotApplicable AND status == Normal
```
The previous logic was causing us to erroneously clear entries in a state like:
```
mergeState == NotApplicable AND status == MarkedForRemoval
```
This bug manifested itself when grafting a change that removed a file.
The file was removed from disk, but Eden did not know that it had been
`MarkedForRemoval`, so it would report the removed file as "missing" in
`hg status`.
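
The corrected condition can be sketched in Python as follows. The enums are
simplified stand-ins that list only the values mentioned in these summaries,
and `should_clear_entry` is a hypothetical name; the real check lives in the
C++ `Dirstate` code.

```python
from enum import Enum


class MergeState(Enum):
    # Only the values named in the summaries; the real enum may have more.
    NotApplicable = 0
    BothParents = 1


class Status(Enum):
    Normal = 0
    MarkedForRemoval = 1


def should_clear_entry(merge_state: MergeState, status: Status) -> bool:
    """Clear an entry only when it carries no information beyond the default."""
    return merge_state is MergeState.NotApplicable and status is Status.Normal
```

With this predicate an entry that is `NotApplicable` but `MarkedForRemoval`
survives, so the removal is still reported correctly.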
Reviewed By: wez
Differential Revision: D5797270
fbshipit-source-id: 29740dfaa8102db868b95e932716773787f317ac
Summary: This looks like it ended up getting done together with the original diff.
Reviewed By: simpkins
Differential Revision: D5796901
fbshipit-source-id: 24ab05c50b13a37eefe903de5fd3f2dac3d462da
Summary:
We'll need to gate portions of our shutdown so that we don't
tear down the database until after in-flight requests have completed.
This seems like the easiest way to go about it.
Reviewed By: simpkins
Differential Revision: D5796593
fbshipit-source-id: 49e695826ae68cc2b1d724a8da53ce5d884ff9ff
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.
I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out. It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.
The second of the promises is associated with the fuse thread pool
winding down. The attached future can be extracted and used by the
client of the EdenMount class. This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close. In the current integration, since we ignore
the result of that future, the handle is implicitly closed.
These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.
Reviewed By: bolinfest
Differential Revision: D5778214
fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.
Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.
Reviewed By: bolinfest
Differential Revision: D5778212
fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
Summary:
This is leading up to folding the MountPoint code into
the EdenMount class.
There's still a mention of the MountPoint in Dispatcher.h; that is
being dealt with in the following diff.
Reviewed By: bolinfest
Differential Revision: D5778215
fbshipit-source-id: 996640b3773988a4738ad55bb13de45e1ffe1880
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.
This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel. The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure. Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.
The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads. The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished. A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.
I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.
Reviewed By: bolinfest
Differential Revision: D5751521
fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
Summary:
I have found it valuable to dump data about the most recent Buck build.
Given the layout of the buck-out directory, this can be hard to audit by hand,
so I created a script to help. Features:
- With no arguments, prints a report of the last `buck build`.
- Specifying `--list` will enumerate the paths to the build traces in your
`buck-out` and gives each one an index starting with 1.
- Can take a variable number of "trace" arguments, each can be either a path to
a build trace or the index from `--list`. Prints the report for each trace.
This means that `buck-history 2` will print the report for your penultimate
build.
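
The "path or index" handling of trace arguments could look roughly like the
sketch below. `resolve_trace` is a hypothetical helper, and it assumes that
`traces` is the list produced by `--list`, ordered most recent first so that
index 1 is the last build.

```python
def resolve_trace(arg: str, traces: list) -> str:
    """Map a command-line trace argument to a build trace path.

    arg is either a 1-based index into traces or a literal path.
    (Hypothetical helper; the real script may differ.)
    """
    if arg.isdigit():
        index = int(arg)
        if not 1 <= index <= len(traces):
            raise ValueError("no build trace with index %d" % index)
        # Indexes shown by --list start at 1, so shift to 0-based.
        return traces[index - 1]
    # Anything that is not a plain number is treated as a path.
    return arg
```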
Here is an example of what an individual report looks like:
```
$ ./buck-out/dev/gen/eden/facebook/analytics/buck_history.par
buck build --keep-going eden/...
Used daemon? True
Action Graph Cache Event: {'hit': False, 'cacheWasEmpty': True}
Important duration events:
overall command: 19.976s
parse: 2.287s
target_node_parse_pipeline: 3.042s
action_graph: 6.525s
DefaultProjectFilesystemDelegate{root=/data/users/mbolin/fbsource/fbcode}
SUBSET OF FILES reported by check_watchman (10+):
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.meta.json.3e6392cb522ceeb7', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.meta.json', 'exists': True, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.data.json', 'exists': True, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.meta.json.47c363721c5c4939', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.meta.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.data.json.53827f06b4873bc1', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.data.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.meta.json.568b4b993bed4a74', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.meta.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.data.json', 'exists': True, 'new': False}
```
It contains:
- The command that the user ran (without `@` args expanded).
- Whether `buckd` was used.
- Whether there was an action graph cache hit.
- Duration for some key events, such as build file parsing.
- The type of `ProjectFilesystemDelegate` used. (We want to be sure it uses the
Eden one when we use Buck in Eden.)
- Any Watchman change events it processed at the start of the build.
I'm happy to include more information in this report: I'm sure the output will
evolve as we figure out what else we're interested in.
Reviewed By: wez
Differential Revision: D5696560
fbshipit-source-id: 97348dd984c6058e136501f0a532d45359ce2f10
Summary:
Add a `--strace <FILE>` flag to the `eden daemon` CLI command to run eden
under strace, saving the strace output to the specified path.
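
For illustration, wrapping the daemon command might be sketched as below. The
helper names are hypothetical and the real CLI may pass additional strace
options; the only strace flag relied on here is `-o`, which writes the trace
to a file.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser with an optional --strace FILE flag (sketch only)."""
    parser = argparse.ArgumentParser(prog="eden daemon")
    parser.add_argument("--strace", metavar="FILE",
                        help="run the daemon under strace, writing output to FILE")
    return parser


def build_daemon_command(base_cmd, strace_file=None):
    """Prefix base_cmd with strace when a trace file was requested."""
    if strace_file is None:
        return list(base_cmd)
    return ["strace", "-o", strace_file] + list(base_cmd)
```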
Reviewed By: wez
Differential Revision: D5771462
fbshipit-source-id: fe4bf18f372f3276400bee624e906ed4f3569735
Summary:
This partially fixes up a perf problem when performing status when a large
number of inodes have been loaded but not materialized (eg: by `find /edenfs
-ls`).
For the FileInode case we'd end up requesting the SHA1 from the store
twice in parallel only to compare it and decide that the file has not
been changed(!)
The remediation is to cut this code over to calling `FileInode::isSameAs` so that
we can short-circuit some of this work. In addition, we can avoid loading
subtrees if we haven't materialized them and the hash matches up.
Reviewed By: simpkins
Differential Revision: D5783044
fbshipit-source-id: f40da3fadfcf8d9e19221d41e3a5a980454717db
Summary:
We're running down a performance problem with hg status that we believe
is something happening at a higher level in the code but noticed that there
were a lot of reads of the rocksdb SST files. In the strace output for those
we observed file content data being read. The status operation shouldn't
need file contents; what's happening is that we're over-fetching some metadata
but happen to be scooping up the file contents from the SST file because we
use the same key prefixes and differentiate the keyspace with key suffixes.
This diff implements the use of rocksdb column families to partition things
more effectively and results in a speed up of around 6x in this scenario.
Furthermore, applying point lookup optimization options yields an additional
2x performance improvement to our rocksdb performance.
As part of this diff, I've removed the hash set that we were using to allow
checking whether a key was present in the store; it wasn't very useful
and would have had to be split into one set per keyspace with this diff;
easier to just remove it.
Reviewed By: bolinfest
Differential Revision: D5781906
fbshipit-source-id: 97f068ade546fd09f391e60a7a57fec0e9081e67
Summary:
This should help us audit the source of the slow path when we hit it.
I took a look at `eden/integration/hg/rebase_test.py`, which we know exercises
the slow path. With this change, I manually rebased a short stack of two commits
onto another stack of two commits with the `--debug` flag and saw two instances
of this message:
```
falling back to non-eden update code path: branchmerge is "truthy:" True.
```
so it seems like we should work to update the `branchmerge` case to take the
fast path, when possible.
Reviewed By: simpkins
Differential Revision: D5779633
fbshipit-source-id: a76d72408d6115aa37ae563d3f7165f404fc8332
Summary:
Before this change, `hg split` crashed complaining that `node` was a
`changectxwrapper` instead of a 20-byte hash when it was sent as `parent1`
of `WorkingDirectoryParents` in `resetParentCommits()`. Now we use `node()` to
get the hash from the `destctx` that we have already extracted via this line
earlier in `merge_update()`:
    destctx = repo[node]
The change to `eden/hg/eden/__init__.py` eliminated the crash, but was
not sufficient on its own to make `hg split` work correctly. There was also a fix
required in `Dirstate.cpp` where the `onSnapshotChanged()` callback was clearing out
entries of both `NotApplicable` and `BothParents` from `hgDirstateTuples`.
It appears that only `NotApplicable` entries should be cleared. (I tried leaving
`NotApplicable` entries in there, but that broke `eden/integration/hg/graft_test.py`.)
I suspected that the logic to clear out `hgDestToSourceCopyMap` in
`Dirstate::onSnapshotChanged` was also wrong, so I deleted it and all of the
integration tests still pass. Admittedly, we are pretty weak in our test coverage
for use cases that write to the `hgDestToSourceCopyMap`. In general, we should
rely on Mercurial to explicitly remove entries from `hgDestToSourceCopyMap`.
We have a Thrift API, `hgClearDirstate()`, that `eden_dirstate` can use to categorically
clear out `hgDirstateTuples` and `hgDestToSourceCopyMap`, if necessary.
Finally, creating a proper integration test for `hg split` required creating a value for
`HGEDITOR` that could write different commit messages for different commits.
To that end, I added a `create_editor_that_writes_commit_messages()` utility as a
method of `HgExtensionTestBase` and updated its `hg()` method to take `hgeditor`
as an optional parameter.
Reviewed By: wez
Differential Revision: D5758236
fbshipit-source-id: 5cb8bf4207d4e802726cd93108fae4a6d48f45ec
Summary:
This is the dumb and obvious refactor of this method to
propagate and wait on the Future from the EdenMount::create call.
Reviewed By: simpkins
Differential Revision: D5750372
fbshipit-source-id: fb7ce595de3bacab99ce8af6ef597ef6f0417c12
Summary:
The on-disk treemanifest store doesn't contain an empty tree for the null
manifest ID. We have to explicitly check for this ID in software and generate
an empty tree in this case.
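
A small Python sketch of the special case. Mercurial's null ID is 20 zero
bytes; `get_tree` and `store_lookup` are hypothetical names, and the empty
list merely stands in for whatever empty-tree object the importer builds.

```python
NULL_ID = b"\0" * 20  # Mercurial's null node: 20 zero bytes


def get_tree(manifest_id: bytes, store_lookup):
    """Return the tree entries for manifest_id.

    store_lookup is a hypothetical callable that reads the on-disk store.
    """
    if manifest_id == NULL_ID:
        # The store has no entry for the null manifest, so synthesize
        # an empty tree instead of asking it.
        return []
    return store_lookup(manifest_id)
```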
Reviewed By: wez
Differential Revision: D5750053
fbshipit-source-id: d1a58df45f9025ff5a4757f0a814f92dd58798e8
Summary: Add a method to get the EventBase used to drive the main thread.
Reviewed By: wez
Differential Revision: D5750054
fbshipit-source-id: ad2eba021a6200ed28e39a60b16d90aabfaee5b4
Summary:
The serialized data for each file handle needs to be enough
to re-construct the handle when we load it into a new process later
on. We need the inode number, the file handle number that we communicated
to the kernel and a flag to let us know whether it is a file or a dir.
Note that the file handle allocation strategy already accommodates the
idea of migrating to a new process; we don't need to serialize anything
like a next file handle id number.
This doesn't implement instantiating the handles from the loaded state,
it is just the plumbing for saving and loading that state information.
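
To make the three fields concrete, here is a round-trip sketch in Python. The
fixed-width binary layout is purely an assumption for the example; the summary
does not say how the data is actually encoded, and `FileHandleRecord` is a
hypothetical name.

```python
import struct
from dataclasses import dataclass

# Assumed layout: two little-endian unsigned 64-bit integers plus one bool.
_FORMAT = "<QQ?"


@dataclass(frozen=True)
class FileHandleRecord:
    inode_number: int    # inode the handle refers to
    handle_number: int   # file handle number that was given to the kernel
    is_dir: bool         # True for a directory handle, False for a file

    def serialize(self) -> bytes:
        return struct.pack(_FORMAT, self.inode_number,
                           self.handle_number, self.is_dir)

    @classmethod
    def deserialize(cls, data: bytes) -> "FileHandleRecord":
        return cls(*struct.unpack(_FORMAT, data))
```

Note that nothing like a "next handle id" counter appears in the record, which
matches the observation above about the allocation strategy.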
Reviewed By: bolinfest
Differential Revision: D5733079
fbshipit-source-id: 8fb8afb8ae9694d013ce7a4a82c31bc876ed33c9
Summary:
We're not doing anything with this today. It's not
clear whether we should be doing sanity checks (eg: block attempts
to write to a handle that was opened only for reading) or whether
the kernel is going to do that for us, so I've broken this out
as a separate diff from the removal of FileData.
Reviewed By: bolinfest
Differential Revision: D5723064
fbshipit-source-id: b73452dfb4edf88b57fef1ad604bb2bde93bacc1
Summary: These don't exist any more, so remove them
Reviewed By: bolinfest
Differential Revision: D5722861
fbshipit-source-id: 7db112dfab1dfdcf517452b314bd912ec8760bd1
Summary:
The fixes to the fastmanifest code should be landing in prod
tomorrow, so this diff turns on trees in anticipation of that.
Folks working in eden will need to also use hg-dev until this ships out.
Reviewed By: simpkins
Differential Revision: D5740960
fbshipit-source-id: fac3b59183ceb63b6af704715fb5a5b9daed013d
Summary:
Make the CLI read and write a SNAPSHOT file that is
consistent with the C++ server implementation.
Ideally we'd only ever write this file from the C++ side.
Reviewed By: simpkins
Differential Revision: D5740079
fbshipit-source-id: 2057df0ee2b0b271a4734d58e1b6d1334a28020b
Summary:
This moves logic for running the server from main.cpp into the EdenServer
class.
This will make it easier to refactor some of the start-up and running process
in the future, and makes EdenServer the owner of this entire workflow. This
will help as we start splitting the startup code into two separate code paths:
one for a new, fresh start, and one for graceful restart taking over mounts
from an existing eden process.
Reviewed By: bolinfest
Differential Revision: D5732656
fbshipit-source-id: 63f05eb1105078764f4e4931d770416dd5f6d6dc
Summary:
This is required together with D5711177 to successfully perform an `hg
amend`. The changes in D5711177 cause the pending pack files to get written
out and the amend command will then call into eden to change the parents of the
commit.
We need to be able to resolve the tree from the packfiles when this happens,
but since this happens within the default refresh interval in the store code
(which is ~100ms) we need to explicitly refresh the set of pack files.
This is most easily accomplished by forcing a refresh on a tree miss.
This is probably fine if you assume that we won't legitimately be asked
to resolve non-existent trees very frequently.
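
The refresh-on-miss idea, sketched in Python. `PackStore` and `load_packs` are
hypothetical; a dict stands in for the set of pack files.

```python
class PackStore:
    """Toy store that caches a snapshot of the available trees."""

    def __init__(self, load_packs):
        # load_packs is a hypothetical callable that rescans the pack files
        # and returns a dict of tree id -> tree.
        self._load_packs = load_packs
        self._trees = load_packs()
        self.refresh_count = 0

    def refresh(self):
        self._trees = self._load_packs()
        self.refresh_count += 1

    def get_tree(self, tree_id):
        try:
            return self._trees[tree_id]
        except KeyError:
            # A miss may just mean our snapshot is stale, so rescan once
            # and retry; a second miss raises KeyError to the caller.
            self.refresh()
            return self._trees[tree_id]
```

The cost is one extra rescan for every lookup of a tree that truly does not
exist, which is the trade-off described above.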
Reviewed By: simpkins
Differential Revision: D5712623
fbshipit-source-id: 4d0034affcc276f1ae29caac36aa5596e52cd746
Summary:
I had this set and it broke some of the integration tests.
Force it to be unset before running the tests.
Reviewed By: simpkins
Differential Revision: D5712624
fbshipit-source-id: 7d4aef86ef56f5880180b417e356e8a85abf11d7
Summary:
Added a new tool to report edenfs statistics, such as fuse counters, memory counters, latencies, and inode status for all the mount points.
- `eden stat`: prints general information about eden, such as the list of mount points and the loaded, unloaded, and materialized inodes in each mount point. It also reports how well the periodic unload job is doing by showing the number of inodes that job has unloaded.
- `eden stat io`: prints the number of calls made to each system call in edenfs.
- `eden stat memory`: prints the memory stats for edenfs.
- `eden stat latency`: reports the latencies of system calls in edenfs.
Reviewed By: bolinfest
Differential Revision: D5660345
fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
Summary:
This test was supposed to be a part of D5627411, but it was causing strange behaviour, so it was moved to a separate diff for further investigation.
After investigating, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix was to implement a way to get the relative path of the file even after the inode is unloaded.
Reviewed By: simpkins
Differential Revision: D5646929
fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
Summary:
Fortunately, this passed on the first try: it did not require any bug fixes in
Eden!
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5698953
fbshipit-source-id: c5ce39725f8d14b5ea93bd3cafeb5e566f92d326
Summary:
Also modified the `getpath` handler to defer the `os.getcwd()` call.
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5698232
fbshipit-source-id: 9483907771fd1fd2918f62120664bb0d8c431cf3
Summary:
Fortunately, this passed on the first try: it did not require any bug fixes in
Eden! Though admittedly, most of the relevant fixes were presumably done in
D5686114.
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5696055
fbshipit-source-id: 0099db501ae1a5d72528d222dee0176fc1fc4332
Summary:
Update the integration test framework so that we can run the hg integration
tests with several different hg config settings, using different sets of
mercurial extensions.
This adds code to test using flat manifest, treemanifest in hybrid mode, and
treemanifest in tree only mode. However, the treeonly configuration is
disabled at the moment due to some bugs in treeonly behavior preventing it from
being able to create test repositories in treeonly mode.
Reviewed By: bolinfest
Differential Revision: D5685880
fbshipit-source-id: 081ead4e77cd14a7feb03381783395bd5a8fef4f
Summary:
The most recent mercurial release updated the 'successors()' revset to be the
same as 'allsuccessors()', and it always includes the argument itself in the
output now. This updates the revset to exclude the input commit.
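
One way to express "successors, minus the commit itself" is the revset
difference operator. This is only a sketch and not necessarily the exact
revset used in the diff; `successors_excluding_self` is a hypothetical helper.

```python
def successors_excluding_self(rev: str) -> str:
    """Build a revset for the successors of rev, without rev itself."""
    # 'x - y' selects changesets in x that are not in y; the parentheses
    # keep a compound rev expression from binding to the '-' incorrectly.
    return "successors({0}) - ({0})".format(rev)
```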
Reviewed By: bolinfest
Differential Revision: D5694826
fbshipit-source-id: 3e931a39675262f33a5298701b4559e0d9906490
Summary:
This makes some minor tweaks to the behavior of the HgRepository.log() helper
function in the integration tests.
Previously this command did not take a revset argument, and instead relied on
the Facebook tweakdefaults extension to use the `--follow` behavior when no
revset was specified. (Without tweakdefaults mercurial uses `tip:0` by
default, which is not what the histedit tests expect.)
I added a revset parameter now, and updated it to default to `::.`. This is
close to the previous behavior, although I intentionally left it reporting
commits from oldest to newest now.
I also updated the log code to add its own delimiter to the template, rather
than requiring callers to always append an escaped nul byte to the template.
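
Roughly, the helper now behaves like the sketch below. The function names are
hypothetical, and the sketch assumes the delimiter is an escaped nul (`\0`)
appended to the template, as described above.

```python
def build_log_args(template: str, revset: str = "::.") -> list:
    """Return hg arguments for a log call; the helper adds the delimiter."""
    # The template gets a literal backslash-zero, which hg is assumed to
    # expand to a nul byte after each entry.
    return ["log", "-T", template + "\\0", "-r", revset]


def split_log_output(output: str) -> list:
    """Split nul-delimited log output into one string per commit."""
    entries = output.split("\0")
    # The delimiter follows every entry, so the last element is empty.
    if entries and entries[-1] == "":
        entries.pop()
    return entries
```

Callers pass a plain template such as `{node}` and no longer need to know
which delimiter is in use.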
Reviewed By: bolinfest
Differential Revision: D5685876
fbshipit-source-id: 01578f62d553be1cd8002b5718d7f12a2f41d4d8
Summary:
Currently treemanifest import fails when a mercurial transaction is in progress
(if the tree in question was created by the pending transaction), and this
breaks many mercurial workflows.
Turning treemanifest back off by default, to fall back to the slower but
functional flat manifest import.
D5685880 adds framework to run the eden integration tests with treemanifest
enabled; those tests can be used to tell when treemanifest is working well
enough that we can turn it back on.
Reviewed By: bolinfest
Differential Revision: D5692894
fbshipit-source-id: 5ee84cf73db4cd87bbdaae1edd92c74058fa00e2
Summary:
Update the integration tests to no longer enable the evolve or fsmonitor
extensions in the test repositories.
evolve has been deprecated at Facebook for a while now and isn't even shipped
as part of our mercurial installation any more. This setting was just causing
a warning to be printed that this extension could not be found.
The fsmonitor extension also didn't have any real effect, even in the backing
repository. We don't create .watchmanconfig files in the test repositories, so
watchman won't watch them. Therefore fsmonitor simply printed warnings that
watchman wasn't watching this repository.
Reviewed By: bolinfest
Differential Revision: D5685879
fbshipit-source-id: 85b8a725bd17890a93be5c71dd5a0f3f1d744598
Summary:
Fix the integration tests to store hg config settings in the .hg/hgrc file in
the backing repository. Previously the tests saved settings to a temporary
file, and then always invoked hg with HGRCPATH pointing at this temporary file.
Unfortunately this resulted in the integration test code using different hg
settings than edenfs, since edenfs was never aware of this temporary file.
Defining the settings in the backing repository's normal .hg/hgrc file means
that edenfs will be able to see these settings as well. The eden post-clone
hooks will also automatically copy these settings in to the mount point, so
that we do not need to use a custom HGRCPATH setting inside the eden mount
either.
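
A sketch of writing the settings into the backing repository's `.hg/hgrc`
using Python's standard `configparser`. `write_hgrc` and the shape of
`settings` are assumptions for this example; the test framework's real helper
may look different.

```python
import configparser
import os


def write_hgrc(repo_path: str, settings: dict) -> str:
    """Write settings (section -> {key: value}) to <repo_path>/.hg/hgrc."""
    hgrc_path = os.path.join(repo_path, ".hg", "hgrc")
    os.makedirs(os.path.dirname(hgrc_path), exist_ok=True)

    # Disable interpolation so '%' in values is written verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    for section, values in settings.items():
        parser[section] = values

    with open(hgrc_path, "w") as f:
        parser.write(f)
    return hgrc_path
```

Because the file lives in the repository itself, no `HGRCPATH` override is
needed for either the test code or edenfs to pick up the same settings.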
Reviewed By: bolinfest
Differential Revision: D5685877
fbshipit-source-id: 1857554d0cf1a585fe55577eb48a87686f9476ca