Summary:
Previously, we were clearing entries in `hgDirstateTuples` for which:
```
mergeState == NotApplicable
```
but we should have been checking for:
```
mergeState == NotApplicable AND status == Normal
```
The previous logic was causing us to erroneously clear entries in a state like:
```
mergeState == NotApplicable AND status == MarkedForRemoval
```
This bug manifested itself when grafting a change that removed a file.
The file was removed from disk, but Eden did not know that it had been
`MarkedForRemoval`, so it would report the removed file as "missing" in
`hg status`.
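As a rough illustration of the corrected condition (the real logic lives in C++ in `Dirstate.cpp`; the Python names `should_clear` and `prune` here are hypothetical), the check amounts to:

```python
from enum import Enum


class MergeState(Enum):
    NOT_APPLICABLE = "NotApplicable"
    BOTH_PARENTS = "BothParents"


class Status(Enum):
    NORMAL = "Normal"
    MARKED_FOR_REMOVAL = "MarkedForRemoval"


def should_clear(merge_state, status):
    # Hypothetical helper: an entry may be dropped only when BOTH fields are
    # unremarkable. Checking merge_state alone was the bug.
    return merge_state is MergeState.NOT_APPLICABLE and status is Status.NORMAL


def prune(tuples):
    # tuples maps path -> (merge_state, status); returns the surviving entries.
    return {path: t for path, t in tuples.items() if not should_clear(*t)}
```

With this predicate, an entry that is `NotApplicable` but `MarkedForRemoval` survives the pruning, which is what `hg status` needs in the graft case.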
Reviewed By: wez
Differential Revision: D5797270
fbshipit-source-id: 29740dfaa8102db868b95e932716773787f317ac
Summary: This looks like it ended up getting done together with the original diff
Reviewed By: simpkins
Differential Revision: D5796901
fbshipit-source-id: 24ab05c50b13a37eefe903de5fd3f2dac3d462da
Summary:
We'll need to gate portions of our shutdown so that we don't
tear down the database until after in-flight requests have completed.
This seems like the easiest way to go about it.
Reviewed By: simpkins
Differential Revision: D5796593
fbshipit-source-id: 49e695826ae68cc2b1d724a8da53ce5d884ff9ff
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.
I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out. It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.
The second of the promises is associated with the fuse thread pool
winding down. The attached future can be extracted and used by the
client of the EdenMount class. This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close. In the current integration, since we ignore
the result of that future, the handle is implicitly closed.
These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.
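The shape of the two promises can be sketched in Python with `concurrent.futures.Future` standing in for the C++ promise/future pair. This is only an analogy under that assumption: `MountSketch` and all of its method names are hypothetical, not the actual EdenMount interface.

```python
from concurrent.futures import Future


class MountSketch:
    """Hypothetical stand-in for a mount object that exposes two futures."""

    def __init__(self):
        # Completed when the fuse mount comes up, or failed if it errors out.
        self._init_done = Future()
        # Completed with the fuse device descriptor once the workers stop.
        self._fuse_complete = Future()

    def start(self):
        # The caller waits on (or chains from) this instead of a cond wait.
        return self._init_done

    def fuse_completion(self):
        return self._fuse_complete

    def on_mount_ready(self):
        self._init_done.set_result(None)

    def on_mount_failed(self, error):
        self._init_done.set_exception(error)

    def on_thread_pool_stopped(self, fuse_fd):
        # Hand the descriptor to whoever holds the future; they can pass it
        # on during graceful restart or simply close it.
        self._fuse_complete.set_result(fuse_fd)
```

Because the mount only fulfils futures and never stores a callback that points back at its owner, there is no reference cycle of the kind `onStop` created.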
Reviewed By: bolinfest
Differential Revision: D5778214
fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.
Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.
Reviewed By: bolinfest
Differential Revision: D5778212
fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
Summary:
This is leading up to folding the MountPoint code into
the EdenMount class.
There's still a mention of the MountPoint in Dispatcher.h; that is
being dealt with in the following diff.
Reviewed By: bolinfest
Differential Revision: D5778215
fbshipit-source-id: 996640b3773988a4738ad55bb13de45e1ffe1880
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.
This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel. The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure. Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.
The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads. The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished. A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.
I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.
Reviewed By: bolinfest
Differential Revision: D5751521
fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
Summary:
I have found it valuable to dump data about the most recent Buck build.
Given the layout of the buck-out directory, this can be hard to audit by hand,
so I created a script to help. Features:
- With no arguments, prints a report of the last `buck build`.
- Specifying `--list` will enumerate the paths to the build traces in your
  `buck-out` and give each one an index starting with 1.
- Can take a variable number of "trace" arguments, each of which can be either
  a path to a build trace or the index from `--list`. Prints the report for
  each trace. This means that `buck-history 2` will print the report for your
  penultimate build.
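A minimal sketch of how a "trace" argument could be resolved, assuming the list from `--list` is ordered most recent first (which is what `buck-history 2` meaning "penultimate" implies). The function name `resolve_trace` is hypothetical and is not taken from the script itself.

```python
def resolve_trace(arg, traces):
    """Return the trace path named by arg.

    traces is the list of trace paths as printed by --list, assumed to be
    ordered most recent first. arg is either a 1-based index into that list
    or a path to a build trace.
    """
    if arg.isdigit():
        index = int(arg)
        if not 1 <= index <= len(traces):
            raise ValueError("no build trace with index %d" % index)
        return traces[index - 1]
    # Anything that is not a plain number is treated as a path.
    return arg
```

For example, with `traces = ["latest.trace", "previous.trace"]`, the argument `"2"` resolves to `"previous.trace"`, while a path argument is returned unchanged.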
Here is an example of what an individual report looks like:
```
$ ./buck-out/dev/gen/eden/facebook/analytics/buck_history.par
buck build --keep-going eden/...
Used daemon? True
Action Graph Cache Event: {'hit': False, 'cacheWasEmpty': True}
Important duration events:
overall command: 19.976s
parse: 2.287s
target_node_parse_pipeline: 3.042s
action_graph: 6.525s
DefaultProjectFilesystemDelegate{root=/data/users/mbolin/fbsource/fbcode}
SUBSET OF FILES reported by check_watchman (10+):
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.meta.json.3e6392cb522ceeb7', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.meta.json', 'exists': True, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/11787-2040418-1ex1lzr/i28g2e29.data.json', 'exists': True, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.meta.json.47c363721c5c4939', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.meta.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.data.json.53827f06b4873bc1', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/os/__init__.data.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.meta.json.568b4b993bed4a74', 'exists': False, 'new': True}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.meta.json', 'exists': True, 'new': False}
{'name': 'eden/facebook/analytics/.mypy_cache/3.6/posix.data.json', 'exists': True, 'new': False}
```
It contains:
- The command that the user ran (without `@` args expanded).
- Whether `buckd` was used.
- Whether there was an action graph cache hit.
- Duration for some key events, such as build file parsing.
- The type of `ProjectFilesystemDelegate` used. (We want to be sure it uses the
  Eden one when we use Buck in Eden.)
- Any Watchman change events it processed at the start of the build.
I'm happy to include more information in this report: I'm sure the output will
evolve as we figure out what else we're interested in.
Reviewed By: wez
Differential Revision: D5696560
fbshipit-source-id: 97348dd984c6058e136501f0a532d45359ce2f10
Summary:
Add a `--strace <FILE>` flag to the `eden daemon` CLI command to run eden
under strace, saving the strace output to the specified path.
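A minimal sketch of how such a flag could be wired up, assuming strace's standard `-o FILE` option for the output path. The helper names `make_parser` and `build_command` are hypothetical, and the real CLI may pass additional strace options.

```python
import argparse


def make_parser():
    # Hypothetical parser covering only the new flag.
    parser = argparse.ArgumentParser(prog="eden daemon")
    parser.add_argument(
        "--strace",
        metavar="FILE",
        default=None,
        help="run eden under strace, writing the strace output to FILE",
    )
    return parser


def build_command(edenfs_cmd, strace_file=None):
    # Prefix the daemon command with strace only when a file was requested.
    if strace_file is None:
        return list(edenfs_cmd)
    return ["strace", "-o", strace_file] + list(edenfs_cmd)
```

Keeping the prefixing in one small function means the rest of the daemon start-up code does not need to know whether strace is in use.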
Reviewed By: wez
Differential Revision: D5771462
fbshipit-source-id: fe4bf18f372f3276400bee624e906ed4f3569735
Summary:
This partially fixes a perf problem when performing status after a large
number of inodes have been loaded but not materialized (e.g. by
`find /edenfs -ls`).
For the FileInode case we'd end up requesting the SHA1 from the store
twice in parallel, only to compare the two values and decide that the file
has not been changed(!)
The remediation is to cut this code over to calling `FileInode::isSameAs` so that
we can short-circuit some of this work. In addition, we can avoid loading
subtrees if we haven't materialized them and the hash matches up.
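The short-circuit can be sketched as follows. This is a simplified model rather than the actual `FileInode::isSameAs` code: `FileInodeSketch`, `is_same_as`, and the loader callables are hypothetical, and the sketch assumes that a non-materialized inode remembers the source control hash it was created from.

```python
class FileInodeSketch:
    """Hypothetical model of a file inode for illustration only."""

    def __init__(self, source_hash, materialized, load_sha1):
        self.source_hash = source_hash    # hash of the backing object
        self.materialized = materialized  # True once modified locally
        self.load_sha1 = load_sha1        # expensive: asks the store


def is_same_as(inode, other_hash, load_other_sha1):
    # Cheap path: an untouched inode backed by the very same object cannot
    # differ, so no SHA1 needs to be fetched at all.
    if not inode.materialized and inode.source_hash == other_hash:
        return True
    # Slow path: fall back to comparing content hashes.
    return inode.load_sha1() == load_other_sha1()
```

The same idea extends to trees: if a subtree has not been materialized and its hash matches, there is no need to load it to know that nothing underneath has changed.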
Reviewed By: simpkins
Differential Revision: D5783044
fbshipit-source-id: f40da3fadfcf8d9e19221d41e3a5a980454717db
Summary:
While running down a performance problem with `hg status` that we believe
originates at a higher level in the code, we noticed a lot of reads of the
rocksdb SST files. The strace output for those reads showed file content
data being read. The status operation shouldn't
need file contents; what's happening is that we're over-fetching some metadata
but happen to be scooping up the file contents from the SST file because we
use the same key prefixes and differentiate the keyspace with key suffixes.
This diff implements the use of rocksdb column families to partition things
more effectively and results in a speed up of around 6x in this scenario.
Furthermore, applying point lookup optimization options yields an additional
2x improvement in our rocksdb performance.
As part of this diff, I've removed the hash set that we were using to allow
checking whether a key was present in the store; it wasn't very useful
and would have had to be split into one set per keyspace with this diff;
easier to just remove it.
Reviewed By: bolinfest
Differential Revision: D5781906
fbshipit-source-id: 97f068ade546fd09f391e60a7a57fec0e9081e67
Summary:
This should help us audit the source of the slow path when we hit it.
I took a look at `eden/integration/hg/rebase_test.py`, which we know exercises
the slow path. With this change, I manually rebased a short stack of two commits
onto another stack of two commits with the `--debug` flag and saw two instances
of this message:
```
falling back to non-eden update code path: branchmerge is "truthy:" True.
```
so it seems like we should work to update the `branchmerge` case to take the
fast path, when possible.
Reviewed By: simpkins
Differential Revision: D5779633
fbshipit-source-id: a76d72408d6115aa37ae563d3f7165f404fc8332
Summary:
Before this change, `hg split` crashed complaining that `node` was a
`changectxwrapper` instead of a 20-byte hash when it was sent as `parent1`
of `WorkingDirectoryParents` in `resetParentCommits()`. Now we use `node()` to
get the hash from the `destctx` that we have already extracted via this line
earlier in `merge_update()`:
    destctx = repo[node]
The change to `eden/hg/eden/__init__.py` eliminated the crash, but was
not sufficient on its own to make `hg split` work correctly. There was also a fix
required in `Dirstate.cpp` where the `onSnapshotChanged()` callback was clearing out
entries of both `NotApplicable` and `BothParents` from `hgDirstateTuples`.
It appears that only `NotApplicable` entries should be cleared. (I tried leaving
`NotApplicable` entries in there, but that broke `eden/integration/hg/graft_test.py`.)
I suspected that the logic to clear out `hgDestToSourceCopyMap` in
`Dirstate::onSnapshotChanged` was also wrong, so I deleted it and all of the
integration tests still pass. Admittedly, we are pretty weak in our test coverage
for use cases that write to the `hgDestToSourceCopyMap`. In general, we should
rely on Mercurial to explicitly remove entries from `hgDestToSourceCopyMap`.
We have a Thrift API, `hgClearDirstate()`, that `eden_dirstate` can use to categorically
clear out `hgDirstateTuples` and `hgDestToSourceCopyMap`, if necessary.
Finally, creating a proper integration test for `hg split` required creating a value for
`HGEDITOR` that could write different commit messages for different commits.
To that end, I added a `create_editor_that_writes_commit_messages()` utility as a
method of `HgExtensionTestBase` and updated its `hg()` method to take `hgeditor`
as an optional parameter.
Reviewed By: wez
Differential Revision: D5758236
fbshipit-source-id: 5cb8bf4207d4e802726cd93108fae4a6d48f45ec
Summary:
This is the dumb and obvious refactor of this method to
propagate and wait on the Future from the EdenMount::create call.
Reviewed By: simpkins
Differential Revision: D5750372
fbshipit-source-id: fb7ce595de3bacab99ce8af6ef597ef6f0417c12
Summary:
The on-disk treemanifest store doesn't contain an empty tree for the null
manifest ID. We have to explicitly check for this ID in software and generate
an empty tree in this case.
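A minimal sketch of the check, using Mercurial's null ID of 20 zero bytes. The dict-backed `store` and the function name `get_tree` are hypothetical stand-ins for the real treemanifest store and importer code.

```python
NULL_ID = b"\0" * 20  # Mercurial's null node: 20 zero bytes


def get_tree(manifest_id, store):
    # The on-disk store has no entry for the null manifest, so synthesize
    # an empty tree instead of looking it up.
    if manifest_id == NULL_ID:
        return {}
    return store[manifest_id]
```

Without the explicit check, a lookup of the null ID would fail with a missing-key error even though the correct answer (an empty tree) is known in advance.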
Reviewed By: wez
Differential Revision: D5750053
fbshipit-source-id: d1a58df45f9025ff5a4757f0a814f92dd58798e8
Summary: Add a method to get the EventBase used to drive the main thread.
Reviewed By: wez
Differential Revision: D5750054
fbshipit-source-id: ad2eba021a6200ed28e39a60b16d90aabfaee5b4
Summary:
The serialized data for each file handle needs to be enough
to re-construct the handle when we load it into a new process later
on. We need the inode number, the file handle number that we communicated
to the kernel and a flag to let us know whether it is a file or a dir.
Note that the file handle allocation strategy already accommodates the
idea of migrating to a new process; we don't need to serialize anything
like a next file handle id number.
This doesn't implement instantiating the handles from the loaded state;
it is just the plumbing for saving and loading that state information.
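The three fields can be pictured as a small record that round-trips through bytes. This is only a sketch: the `HandleState` name and the `struct` layout are assumptions made for illustration, and the diff's actual serialization format is not shown here.

```python
import struct
from dataclasses import dataclass

# Assumed layout: two unsigned 64-bit integers followed by one bool.
_FORMAT = "<QQ?"


@dataclass(frozen=True)
class HandleState:
    inode_number: int  # which inode the handle refers to
    handle_id: int     # the file handle number we gave the kernel
    is_dir: bool       # directory handle vs. file handle

    def pack(self):
        return struct.pack(_FORMAT, self.inode_number, self.handle_id, self.is_dir)

    @classmethod
    def unpack(cls, data):
        return cls(*struct.unpack(_FORMAT, data))
```

Note that nothing like a "next handle id" counter appears in the record, matching the observation that the allocation strategy does not need one.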
Reviewed By: bolinfest
Differential Revision: D5733079
fbshipit-source-id: 8fb8afb8ae9694d013ce7a4a82c31bc876ed33c9
Summary:
We're not doing anything with this today. It's not
clear whether we should be doing sanity checks (eg: block attempts
to write to a handle that was opened only for reading) or whether
the kernel is going to do that for us, so I've broken this out
as a separate diff from the removal of FileData.
Reviewed By: bolinfest
Differential Revision: D5723064
fbshipit-source-id: b73452dfb4edf88b57fef1ad604bb2bde93bacc1
Summary: These don't exist any more, so remove them
Reviewed By: bolinfest
Differential Revision: D5722861
fbshipit-source-id: 7db112dfab1dfdcf517452b314bd912ec8760bd1
Summary:
The fixes to the fastmanifest code should be landing in prod
tomorrow, so this diff turns on trees in anticipation of that.
Folks working in eden will need to also use hg-dev until this ships out.
Reviewed By: simpkins
Differential Revision: D5740960
fbshipit-source-id: fac3b59183ceb63b6af704715fb5a5b9daed013d
Summary:
Make the CLI read and write a SNAPSHOT file that is
consistent with the C++ server implementation.
Ideally we'd only ever write this file from the C++ side.
Reviewed By: simpkins
Differential Revision: D5740079
fbshipit-source-id: 2057df0ee2b0b271a4734d58e1b6d1334a28020b
Summary:
This moves logic for running the server from main.cpp into the EdenServer
class.
This will make it easier to refactor some of the start-up and running process
in the future, and makes EdenServer the owner of this entire workflow. This
will help as we start splitting the startup code into two separate code paths:
one for a new, fresh start, and one for graceful restart taking over mounts
from an existing eden process.
Reviewed By: bolinfest
Differential Revision: D5732656
fbshipit-source-id: 63f05eb1105078764f4e4931d770416dd5f6d6dc
Summary:
This is required together with D5711177 to successfully perform an `hg
amend`. The changes in D5711177 cause the pending pack files to get written
out and the amend command will then call into eden to change the parents of the
commit.
We need to be able to resolve the tree from the packfiles when this happens,
but since this happens within the default refresh interval in the store code
(which is ~100ms) we need to explicitly refresh the set of pack files.
This is most easily accomplished by forcing a refresh on a tree miss.
This is probably fine if you assume that we won't legitimately be asked
to resolve non-existent trees very frequently.
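The refresh-on-miss strategy can be sketched like this. `PackStoreSketch` and the `scan_packs` callable are hypothetical; the real store is the treemanifest pack store and its refresh mechanism, not a Python dict.

```python
class PackStoreSketch:
    """Hypothetical store that caches the set of known pack contents."""

    def __init__(self, scan_packs):
        # scan_packs() returns a dict of tree id -> tree for all packs
        # currently on disk.
        self._scan_packs = scan_packs
        self._trees = scan_packs()

    def refresh(self):
        self._trees = self._scan_packs()

    def get_tree(self, tree_id):
        tree = self._trees.get(tree_id)
        if tree is None:
            # The pack may have been written since the last scan, so force
            # a refresh and retry once before reporting a miss.
            self.refresh()
            tree = self._trees.get(tree_id)
        return tree
```

The cost is one extra rescan per genuine miss, which is why the approach relies on lookups of non-existent trees being rare.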
Reviewed By: simpkins
Differential Revision: D5712623
fbshipit-source-id: 4d0034affcc276f1ae29caac36aa5596e52cd746
Summary:
I had this set and it broke some of the integration tests.
Force it to be unset before running the tests.
Reviewed By: simpkins
Differential Revision: D5712624
fbshipit-source-id: 7d4aef86ef56f5880180b417e356e8a85abf11d7
Summary:
Added a new tool to report edenfs statistics such as fuse counters, memory counters, latencies, and inode status for all mount points.
- `eden stat`: prints general information about eden, such as the list of mount points and the loaded, unloaded, and materialized inodes in each mount point. It also reports how well the periodic unload job is doing by showing the number of inodes that job has unloaded.
- `eden stat io`: prints the number of calls made to each system call in edenfs.
- `eden stat memory`: prints the memory statistics for edenfs.
- `eden stat latency`: reports the latencies of system calls in edenfs.
Reviewed By: bolinfest
Differential Revision: D5660345
fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
Summary:
This test was supposed to be a part of D5627411, but it was causing strange behaviour so it was split into a separate diff for further investigation.
On investigation, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix for this was to implement a way to get the relative path of the file even after the inode is unloaded.
Reviewed By: simpkins
Differential Revision: D5646929
fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
Summary:
Fortunately, this passed on the first try: it did not require any bug fixes in
Eden!
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5698953
fbshipit-source-id: c5ce39725f8d14b5ea93bd3cafeb5e566f92d326
Summary:
Also modified the `getpath` handler to defer the `os.getcwd()` call.
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5698232
fbshipit-source-id: 9483907771fd1fd2918f62120664bb0d8c431cf3
Summary:
Fortunately, this passed on the first try: it did not require any bug fixes in
Eden! Though admittedly, most of the relevant fixes were presumably done in
D5686114.
(Note: this ignores all push blocking failures!)
Reviewed By: simpkins
Differential Revision: D5696055
fbshipit-source-id: 0099db501ae1a5d72528d222dee0176fc1fc4332
Summary:
Update the integration test framework so that we can run the hg integration
tests with several different hg config settings, using different sets of
mercurial extensions.
This adds code to test using flat manifest, treemanifest in hybrid mode, and
treemanifest in tree only mode. However, the treeonly configuration is
disabled at the moment due to some bugs in treeonly behavior preventing it from
being able to create test repositories in treeonly mode.
Reviewed By: bolinfest
Differential Revision: D5685880
fbshipit-source-id: 081ead4e77cd14a7feb03381783395bd5a8fef4f
Summary:
The most recent mercurial release updated the 'successors()' revset to be the
same as 'allsuccessors()', and it always includes the argument itself in the
output now. This updates the revset to exclude the input commit.
Reviewed By: bolinfest
Differential Revision: D5694826
fbshipit-source-id: 3e931a39675262f33a5298701b4559e0d9906490
Summary:
This makes some minor tweaks to the behavior of the HgRepository.log() helper
function in the integration tests.
Previously this command did not take a revset argument, and instead relied on
the Facebook tweakdefaults extension to use the `--follow` behavior when no
revset was specified. (Without tweakdefaults mercurial uses `tip:0` by
default, which is not what the histedit tests expect.)
I added a revset parameter now, and updated it to default to `::.`. This is
close to the previous behavior, although I intentionally left it reporting
commits from oldest to newest now.
I also updated the log code to add its own delimiter to the template, rather
than requiring callers to always append an escaped nul byte to the template.
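A minimal sketch of the two behaviours described above: defaulting the revset to `::.` and appending the delimiter inside the helper. The function names are hypothetical, and the sketch assumes the template escape `\0` renders as a nul byte in the command output.

```python
def build_log_args(template, revset="::."):
    # Append the delimiter here so callers only pass the fields they want.
    # "\\0" is a literal backslash-zero, left for hg to expand (assumed).
    return ["log", "--template", template + "\\0", "--rev", revset]


def split_log_output(output):
    # Each entry is terminated by a nul byte, so drop the empty tail that
    # split() leaves after the final delimiter.
    entries = output.split("\0")
    if entries and entries[-1] == "":
        entries.pop()
    return entries
```

A caller would then pass just `"{node}"` as the template and get back one list element per commit.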
Reviewed By: bolinfest
Differential Revision: D5685876
fbshipit-source-id: 01578f62d553be1cd8002b5718d7f12a2f41d4d8
Summary:
Currently treemanifest import fails when a mercurial transaction is in progress
(if the tree in question was created by the pending transaction), and this
breaks many mercurial workflows.
This turns treemanifest back off by default, falling back to the slower but
functional flat manifest import.
D5685880 adds framework to run the eden integration tests with treemanifest
enabled; those tests can be used to tell when treemanifest is working well
enough that we can turn it back on.
Reviewed By: bolinfest
Differential Revision: D5692894
fbshipit-source-id: 5ee84cf73db4cd87bbdaae1edd92c74058fa00e2
Summary:
Update the integration tests to no longer enable the evolve or fsmonitor
extensions in the test repositories.
evolve has been deprecated at Facebook for a while now and isn't even shipped
as part of our mercurial installation any more. This setting was just causing
a warning to be printed that this extension could not be found.
The fsmonitor extension also didn't have any real effect, even in the backing
repository. We don't create .watchmanconfig files in the test repositories, so
watchman won't watch them. Therefore fsmonitor simply printed warnings that
watchman wasn't watching this repository.
Reviewed By: bolinfest
Differential Revision: D5685879
fbshipit-source-id: 85b8a725bd17890a93be5c71dd5a0f3f1d744598
Summary:
Fix the integration tests to store hg config settings in the .hg/hgrc file in
the backing repository. Previously the tests saved settings to a temporary
file, and then always invoked hg with HGRCPATH pointing at this temporary file.
Unfortunately this resulted in the integration test code using different hg
settings than edenfs, since edenfs was never aware of this temporary file.
Defining the settings in the backing repository's normal .hg/hgrc file means
that edenfs will be able to see these settings as well. The eden post-clone
hooks will also automatically copy these settings in to the mount point, so
that we do not need to use a custom HGRCPATH setting inside the eden mount
either.
Reviewed By: bolinfest
Differential Revision: D5685877
fbshipit-source-id: 1857554d0cf1a585fe55577eb48a87686f9476ca
Summary: This seems a little more user-friendly.
Reviewed By: bradenwatling
Differential Revision: D5686562
fbshipit-source-id: 8142fb9105a3a44823f935fc04187cf0ed2258d7
Summary:
Because `DBG2` seems to be the level we are using to log thrift calls in
`EdenServiceHandler`, this seems like a reasonable default.
Reviewed By: simpkins
Differential Revision: D5686115
fbshipit-source-id: 2d3e0173df37919b6936f73e641f880d16dc539f
Summary:
Note that this feature was mostly implemented before this commit, but never
tested. Unsurprisingly, there were bugs.
This change also introduces a new `eden debug hg_copy_map_get_all` subcommand
because that was a straightforward way to verify the internal state of the copy
map on the server side from an integration test.
Adding this test uncovered a key copy/paste bug in `EdenThriftClient.py`
(`hgCopyMapGet` was being invoked instead of `hgCopyMapPut`.)
It also uncovered a bug in `LameThriftClient` because the `compile()` and
`eval()` calls on the output are not appropriate when the return type of the
Thrift endpoint is `string`.
Reviewed By: simpkins
Differential Revision: D5686114
fbshipit-source-id: f0093d2b67062c01982dc5bc1f0db2774b3a9356
Summary:
1. Modified `TreeInode::unloadChildrenNow()` to return the number of inodes that have been unloaded.
2. Modified `EdenServiceHandler::unloadInodeForPath()` to return the number of inodes that have been unloaded.
Reviewed By: simpkins
Differential Revision: D5627539
fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
Summary:
Added a test case, `test_unload_free_inodes_age`, to verify the behaviour of `unloadChildrenNow` with the age parameter.
Added a new parameter, `age`, to `unloadInodeForPath` in eden.thrift and in `EdenServiceHandler`.
Modified the `do_unload_inodes` function in `debug.py` to support the new behaviour.
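A rough model of age-based unloading that also returns the number of inodes unloaded. The units and exact comparison used by the real `unloadChildrenNow` are not stated here, so the `>=` check, the timestamp representation, and the name `unload_children` are all assumptions.

```python
def unload_children(children, now, age):
    """Unload entries not accessed within `age` seconds; return the count.

    children maps name -> last access time in seconds (assumed model).
    """
    stale = [name for name, last_access in children.items()
             if now - last_access >= age]
    for name in stale:
        del children[name]
    return len(stale)
```

Returning the count lets a caller such as a debug command report how many inodes a given age threshold actually freed.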
Reviewed By: simpkins
Differential Revision: D5565859
fbshipit-source-id: a35053725be26bc906cf158969cbe21db1cbadde
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.
As it stands, given the way we subclass `dirstate`, it does not appear that
`clear()` would be called in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate`.
Reviewed By: wez
Differential Revision: D5675298
fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a
Summary: This will make subsequent changes to these files cleaner.
Reviewed By: wez
Differential Revision: D5675296
fbshipit-source-id: 06b14d55485415e3ec8a59a4bcc50e6189464b7d
Summary:
In `hg/eden/__init__.py`, we wrap `match()` in Mercurial's `match.py` in an
attempt to annotate every `basematcher` created in the system with a special
`_eden_match_info` property that we can use in `_eden_walk_helper()` to perform
walks more efficiently. Unfortunately, we missed a case where `scmutil.py`
has a `matchfiles()` function that calls `exact()` in `match.py` directly rather
than going through the generic `match()` function.
This was causing a failure when running `hg revert <filename>` in Eden because
the matcher that was created via `exact()` did not have an `_eden_match_info`.
This commit wraps `exact()` to add the property.
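The wrapping pattern can be sketched against a stand-in module. Everything here is illustrative: `fake_match` replaces Mercurial's `match.py`, `Matcher` replaces `basematcher`, and the contents stored in `_eden_match_info` are an assumption, since the real property's structure is not described above.

```python
import functools
import types


class Matcher:
    """Stand-in for a Mercurial matcher object."""

    def __init__(self, files):
        self.files = list(files)


# Stand-in for the match module, with an unwrapped exact() factory.
fake_match = types.SimpleNamespace(exact=lambda files: Matcher(files))


def wrap_with_match_info(module, name):
    """Replace module.<name> with a wrapper that annotates its result."""
    original = getattr(module, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        matcher = original(*args, **kwargs)
        # Record how the matcher was built so a walk helper can use it.
        matcher._eden_match_info = {"factory": name, "args": args}
        return matcher

    setattr(module, name, wrapper)


wrap_with_match_info(fake_match, "exact")
```

Because the wrapper is installed on the module attribute, callers that invoke `exact()` directly (as `matchfiles()` does) pick up the annotation without any change on their side.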
Reviewed By: wez
Differential Revision: D5674660
fbshipit-source-id: 16d1e7648ebd7a23b43b9b1200d3e284e5bc07b0