Summary:
Following on from D5798659, this diff pulls the mount flow into
EdenServer. Previously the flow of control would bounce back and forth between
the EdenServiceHandler and EdenServer and this made it (IMO) more difficult to
follow and understand where to add things into the flow.
Now `EdenServer::mount` is the main entry point for the mounting flow.
I've simplified the stat registration and broken that out into helper methods
to avoid cluttering up the mount logic.
Reviewed By: bolinfest
Differential Revision: D5806393
fbshipit-source-id: 7c858a2a580332ce82c2600e9dc3537af1d734d1
Summary:
This moves the bind mounting and post-clone script running
functionality to methods of the EdenMount class and makes the whole
mount flow return a `Future<>`. The higher level goal is to make it
easier to see where and how we want to tweak this flow to support
graceful restart.
This is mostly straightforward, but care is required to avoid deadlocks; there
are two scenarios:
# We fulfill the fuse start promise in the context of the fuse thread that is
handling the fuse initialization packet before it has signalled to the kernel
that it has come up. This can be solved by using `via(mainEventBase_)`, but...
# When remounting all the mounts on startup, we're running in the
`mainEventBase_` thread so the simplistic solution to 1. would cause us to
deadlock on startup (visible in the remount integration tests).
So to avoid this, we shunt the completion of the future via a CPU pool.
Also worth noting: the way we were setting up the global CPU pool with
wangle wasn't correct; it takes a weak reference to the pool which was
then getting destroyed when our prepare method returned. It just happened
to work for us in the facebook specific build because something else was
setting up a different CPU executor.
I've reconciled this by just setting up a thread pool of our own and
using that explicitly.
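
As a rough Python analogy only (the real code uses folly futures and executors in C++), the idea of shunting completion onto a separate pool can be sketched as below. The names `via`, `demo`, `fuse-init` and `cpu-pool` are hypothetical and chosen for illustration; the point is that the continuation runs on a pool thread rather than on the thread that fulfils the promise.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def via(source, executor):
    """Return a future that completes on `executor`, not on the fulfilling thread.

    Hypothetical helper: a loose Python analogue of hopping executors.
    """
    out = Future()

    def hop(done):
        def finish():
            try:
                out.set_result(done.result())
            except BaseException as exc:  # propagate failures to the new future
                out.set_exception(exc)

        # Only a cheap submit happens on the fulfilling thread.
        executor.submit(finish)

    source.add_done_callback(hop)
    return out


def demo():
    pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="cpu-pool")
    started = Future()  # stands in for the "fuse start" promise
    seen = {}

    shunted = via(started, pool)
    # This continuation records which thread it runs on.
    shunted.add_done_callback(
        lambda f: seen.setdefault("thread", threading.current_thread().name)
    )

    # Fulfil the promise from a thread standing in for the fuse thread.
    fuse = threading.Thread(
        target=lambda: started.set_result("mounted"), name="fuse-init"
    )
    fuse.start()
    fuse.join()

    result = shunted.result(timeout=5)
    pool.shutdown(wait=True)  # ensures the continuation has finished
    return result, seen["thread"]
```

Because the continuation never runs on the fulfilling thread, that thread is free to finish its own work (here, returning from `set_result`) without waiting on anything the continuation does.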
Reviewed By: bolinfest
Differential Revision: D5798659
fbshipit-source-id: f1c48730f283f6962f6cd706c02d82ea2952e369
Summary:
This is reasonably straightforward, although a little
more fiddly than I'd hoped because the timer wheel stuff doesn't
offer a convenient way to set up a recurring timer.
I've also made the inode unloading code get run globally for all
mounts; it was previously scheduling one timer per mount point.
This nets out the same; the function scheduler was just a single
thread anyway, so there is no change in the level of concurrency.
I believe that this tidies up the unload counter too; it looked
like we'd set the counter to be the result of the last mount
point that we processed rather than the aggregate of all mounts.
Having the unload timer be associated with the server rather
than the mount points means that we don't have to do anything
special to coordinate with the timer management when the mount
point is being torn down.
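
A minimal Python sketch of the counter difference described above, assuming a hypothetical `unload_one(mount)` callable that returns how many inodes it unloaded for one mount; the real code is C++ and its names differ.

```python
def unload_last_wins(mounts, unload_one):
    """Suspected old behaviour: the counter holds only the last mount's result."""
    counter = 0
    for mount in mounts:
        counter = unload_one(mount)  # overwrites earlier results
    return counter


def unload_aggregate(mounts, unload_one):
    """Server-wide behaviour: the counter is the sum across all mounts."""
    counter = 0
    for mount in mounts:
        counter += unload_one(mount)
    return counter
```

With three mounts that unload 5, 0 and 2 inodes respectively, the first function reports 2 while the second reports 7.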
Reviewed By: bolinfest
Differential Revision: D5792938
fbshipit-source-id: 1a14bb7b7f4952139e684fe6b52f64bd1ba70dd0
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.
Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.
This also updates the log messages to communicate when there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.
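
The Python-side conversion can be sketched roughly as follows. The `NoValueForKeyError` class and `FakeClient` here are hypothetical stand-ins written for illustration (the real exception is generated from `eden.thrift`, and its fields may differ); only the catch-and-convert pattern is the point.

```python
class NoValueForKeyError(Exception):
    """Stand-in for the Thrift-defined exception (fields are assumed)."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


class FakeClient:
    """Hypothetical stand-in for the Thrift client."""

    def __init__(self, copy_map):
        self._copy_map = copy_map

    def hgCopyMapGet(self, dest):
        if dest not in self._copy_map:
            raise NoValueForKeyError(dest)
        return self._copy_map[dest]


def copy_map_get(client, dest):
    """Convert only the explicit 'no value' error into KeyError.

    Any other exception type propagates unchanged instead of being swallowed.
    """
    try:
        return client.hgCopyMapGet(dest)
    except NoValueForKeyError as exc:
        raise KeyError(exc.key) from exc
```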
Reviewed By: wez
Differential Revision: D5800833
fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
Summary:
Previously, we were clearing entries in `hgDirstateTuples` for which:
```
mergeState == NotApplicable
```
but we should have been checking for:
```
mergeState == NotApplicable AND status == Normal
```
The previous logic was causing us to erroneously clear entries in a state like:
```
mergeState == NotApplicable AND status == MarkedForRemoval
```
This bug manifested itself when grafting a change that removed a file.
The file was removed from disk, but Eden did not know that it had been
`MarkedForRemoval`, so it would report the removed file as "missing" in
`hg status`.
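
As an illustration in Python (the actual logic lives in C++ in `Dirstate.cpp`), the corrected predicate looks roughly like this. The enum classes and the `entries_to_keep` helper are hypothetical; only the member names come from the description above.

```python
from enum import Enum


class MergeState(Enum):
    NotApplicable = 0
    BothParents = 1


class Status(Enum):
    Normal = 0
    MarkedForRemoval = 1


def should_clear(status, merge_state):
    # Both conditions must hold; checking merge_state alone was the bug.
    return merge_state is MergeState.NotApplicable and status is Status.Normal


def entries_to_keep(tuples):
    """tuples maps path -> (status, merge_state); returns the surviving entries."""
    return {
        path: entry for path, entry in tuples.items() if not should_clear(*entry)
    }
```

An entry that is `MarkedForRemoval` with `NotApplicable` now survives, so a removed file keeps its removal marker.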
Reviewed By: wez
Differential Revision: D5797270
fbshipit-source-id: 29740dfaa8102db868b95e932716773787f317ac
Summary: This looks like it ended up getting done together with the original diff
Reviewed By: simpkins
Differential Revision: D5796901
fbshipit-source-id: 24ab05c50b13a37eefe903de5fd3f2dac3d462da
Summary:
We'll need to gate portions of our shutdown so that we don't
tear down the database until after in-flight requests have completed.
This seems like the easiest way to go about it.
Reviewed By: simpkins
Differential Revision: D5796593
fbshipit-source-id: 49e695826ae68cc2b1d724a8da53ce5d884ff9ff
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.
I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out. It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.
The second of the promises is associated with the fuse thread pool
winding down. The attached future can be extracted and used by the
client of the EdenMount class. This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close. In the current integration, since we ignore
the result of that future, the handle is implicitly closed.
These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.
Reviewed By: bolinfest
Differential Revision: D5778214
fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.
Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow-on diff.
Reviewed By: bolinfest
Differential Revision: D5778212
fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
Summary:
This is leading up to folding the MountPoint code into
the EdenMount class.
There's still a mention of the MountPoint in Dispatcher.h; that is
being dealt with in the following diff.
Reviewed By: bolinfest
Differential Revision: D5778215
fbshipit-source-id: 996640b3773988a4738ad55bb13de45e1ffe1880
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.
This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel. The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure. Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.
The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads. The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished. A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.
I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.
Reviewed By: bolinfest
Differential Revision: D5751521
fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
Summary:
This partially fixes up a perf problem when performing status when a large
number of inodes have been loaded but not materialized (eg: by `find /edenfs
-ls`).
For the FileInode case we'd end up requesting the SHA1 from the store
twice in parallel only to compare it and decide that the file has not
been changed(!)
The remediation is to cut this code over to calling `FileInode::isSameAs` so that
we can short-circuit some of this work. In addition, we can avoid loading
subtrees if we haven't materialized them and the hash matches up.
Reviewed By: simpkins
Differential Revision: D5783044
fbshipit-source-id: f40da3fadfcf8d9e19221d41e3a5a980454717db
Summary:
We're running down a performance problem with hg status that we believe
is something happening at a higher level in the code but noticed that there
were a lot of reads of the rocksdb SST files. In the strace output for those
we observed file content data being read. The status operation shouldn't
need file contents; what's happening is that we're over-fetching some metadata
but happen to be scooping up the file contents from the SST file because we
use the same key prefixes and differentiate the keyspace with key suffixes.
This diff implements the use of rocksdb column families to partition things
more effectively and results in a speed up of around 6x in this scenario.
Furthermore, applying point lookup optimization options yields an additional
2x performance improvement to our rocksdb performance.
As part of this diff, I've removed the hash set that we were using to allow
checking whether a key was present in the store; it wasn't very useful
and would have had to be split into one set per keyspace with this diff;
easier to just remove it.
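
A toy Python model of why sharing one keyspace can drag file contents into a metadata read. This is not RocksDB code and makes no claim about the measured numbers above: it assumes that a point lookup reads one block of neighbouring sorted entries, and the `.meta` key suffix and the sizes are invented for illustration.

```python
def block_bytes_for_lookup(entries, key, block_size=2):
    """Bytes touched when reading the sorted block that contains `key`."""
    items = sorted(entries.items())
    keys = [k for k, _ in items]
    start = (keys.index(key) // block_size) * block_size
    return sum(len(value) for _, value in items[start:start + block_size])


# One shared keyspace: blob and metadata keys share a prefix and sort together.
mixed = {
    b"aa": b"x" * 1000,       # file contents
    b"aa.meta": b"m" * 20,    # metadata (hypothetical suffix)
    b"bb": b"x" * 1000,
    b"bb.meta": b"m" * 20,
}

# Separate keyspace for metadata, in the spirit of a column family.
metadata_only = {b"aa": b"m" * 20, b"bb": b"m" * 20}

mixed_cost = block_bytes_for_lookup(mixed, b"aa.meta")
split_cost = block_bytes_for_lookup(metadata_only, b"aa")
```

In this model the shared-keyspace lookup touches the neighbouring blob as well, while the partitioned lookup touches only metadata.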
Reviewed By: bolinfest
Differential Revision: D5781906
fbshipit-source-id: 97f068ade546fd09f391e60a7a57fec0e9081e67
Summary:
Before this change, `hg split` crashed complaining that `node` was a
`changectxwrapper` instead of a 20-byte hash when it was sent as `parent1`
of `WorkingDirectoryParents` in `resetParentCommits()`. Now we use `node()` to
get the hash from the `destctx` that we have already extracted via this line
earlier in `merge_update()`:
`destctx = repo[node]`
The change to `eden/hg/eden/__init__.py` eliminated the crash, but was
not sufficient on its own to make `hg split` work correctly. There was also a fix
required in `Dirstate.cpp` where the `onSnapshotChanged()` callback was clearing out
entries of both `NotApplicable` and `BothParents` from `hgDirstateTuples`.
It appears that only `NotApplicable` entries should be cleared. (I tried leaving
`NotApplicable` entries in there, but that broke `eden/integration/hg/graft_test.py`.)
I suspected that the logic to clear out `hgDestToSourceCopyMap` in
`Dirstate::onSnapshotChanged` was also wrong, so I deleted it and all of the
integration tests still pass. Admittedly, we are pretty weak in our test coverage
for use cases that write to the `hgDestToSourceCopyMap`. In general, we should
rely on Mercurial to explicitly remove entries from `hgDestToSourceCopyMap`.
We have a Thrift API, `hgClearDirstate()`, that `eden_dirstate` can use to categorically
clear out `hgDirstateTuples` and `hgDestToSourceCopyMap`, if necessary.
Finally, creating a proper integration test for `hg split` required creating a value for
`HGEDITOR` that could write different commit messages for different commits.
To that end, I added a `create_editor_that_writes_commit_messages()` utility as a
method of `HgExtensionTestBase` and updated its `hg()` method to take `hgeditor`
as an optional parameter.
Reviewed By: wez
Differential Revision: D5758236
fbshipit-source-id: 5cb8bf4207d4e802726cd93108fae4a6d48f45ec
Summary:
This is the dumb and obvious refactor of this method to
propagate and wait on the Future from the EdenMount::create call.
Reviewed By: simpkins
Differential Revision: D5750372
fbshipit-source-id: fb7ce595de3bacab99ce8af6ef597ef6f0417c12
Summary:
The on-disk treemanifest store doesn't contain an empty tree for the null
manifest ID. We have to explicitly check for this ID in software and generate
an empty tree in this case.
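
A minimal Python sketch of the check, assuming the null manifest ID is Mercurial's usual 20 zero bytes. The `get_tree` helper and the dict-based store are hypothetical; the real importer is C++.

```python
NULL_ID = b"\0" * 20  # Mercurial's null node: 20 zero bytes


def get_tree(store, manifest_id):
    """Return the tree for manifest_id, synthesizing an empty tree for the null ID."""
    if manifest_id == NULL_ID:
        # The on-disk store has no entry for this ID, so build one in software.
        return {}
    return store[manifest_id]
```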
Reviewed By: wez
Differential Revision: D5750053
fbshipit-source-id: d1a58df45f9025ff5a4757f0a814f92dd58798e8
Summary: Add a method to get the EventBase used to drive the main thread.
Reviewed By: wez
Differential Revision: D5750054
fbshipit-source-id: ad2eba021a6200ed28e39a60b16d90aabfaee5b4
Summary:
The serialized data for each file handle needs to be enough
to re-construct the handle when we load it into a new process later
on. We need the inode number, the file handle number that we communicated
to the kernel and a flag to let us know whether it is a file or a dir.
Note that the file handle allocation strategy already accommodates the
idea of migrating to a new process; we don't need to serialize anything
like a next file handle id number.
This doesn't implement instantiating the handles from the loaded state,
it is just the plumbing for saving and loading that state information.
Reviewed By: bolinfest
Differential Revision: D5733079
fbshipit-source-id: 8fb8afb8ae9694d013ce7a4a82c31bc876ed33c9
Summary:
We're not doing anything with this today. It's not
clear whether we should be doing sanity checks (eg: block attempts
to write to a handle that was opened only for reading) or whether
the kernel is going to do that for us, so I've broken this out
as a separate diff from the removal of FileData.
Reviewed By: bolinfest
Differential Revision: D5723064
fbshipit-source-id: b73452dfb4edf88b57fef1ad604bb2bde93bacc1
Summary: These don't exist any more, so remove them
Reviewed By: bolinfest
Differential Revision: D5722861
fbshipit-source-id: 7db112dfab1dfdcf517452b314bd912ec8760bd1
Summary:
The fixes to the fastmanifest code should be landing in prod
tomorrow, so this diff turns on trees in anticipation of that.
Folks working in eden will need to also use hg-dev until this ships out.
Reviewed By: simpkins
Differential Revision: D5740960
fbshipit-source-id: fac3b59183ceb63b6af704715fb5a5b9daed013d
Summary:
This moves logic for running the server from main.cpp into the EdenServer
class.
This will make it easier to refactor some of the start-up and running process
in the future, and makes EdenServer the owner of this entire workflow. This
will help as we start splitting the startup code into two separate code paths:
one for a new, fresh start, and one for graceful restart taking over mounts
from an existing eden process.
Reviewed By: bolinfest
Differential Revision: D5732656
fbshipit-source-id: 63f05eb1105078764f4e4931d770416dd5f6d6dc
Summary:
This is required together with D5711177 to successfully perform an `hg
amend`. The changes in D5711177 cause the pending pack files to get written
out and the amend command will then call into eden to change the parents of the
commit.
We need to be able to resolve the tree from the packfiles when this happens,
but since this happens within the default refresh interval in the store code
(which is ~100ms) we need to explicitly refresh the set of pack files.
This is most easily accomplished by forcing a refresh on a tree miss.
This is probably fine if you assume that we won't legitimately be asked
to resolve non-existent trees very frequently.
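
The refresh-on-miss approach can be sketched in Python as follows. `PackStore` is a hypothetical stand-in that only sees pack files as of its last refresh; the real store is the C++ treemanifest code.

```python
class PackStore:
    """Hypothetical store whose view of the pack files can go stale."""

    def __init__(self, disk):
        self._disk = disk            # dict simulating pack files on disk
        self._visible = dict(disk)   # snapshot taken at the last refresh
        self.refreshes = 0

    def refresh(self):
        self._visible = dict(self._disk)
        self.refreshes += 1

    def get(self, key):
        return self._visible.get(key)


def get_tree_with_refresh(store, key):
    """Look up a tree; on a miss, force one refresh and retry before failing."""
    tree = store.get(key)
    if tree is None:
        store.refresh()
        tree = store.get(key)
    if tree is None:
        raise KeyError(key)
    return tree
```

The cost is one extra refresh per genuinely missing tree, which is the trade-off noted above.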
Reviewed By: simpkins
Differential Revision: D5712623
fbshipit-source-id: 4d0034affcc276f1ae29caac36aa5596e52cd746
Summary:
Added a new tool to report EdenFS stat information such as fuse counters, memory counters, latencies, and inode status for all the mount points.
`eden stat`: Prints general information about eden, such as the list of mount points and the loaded, unloaded, and materialized inodes in each mount point. It also reports how well the periodic unload job is doing by printing the number of inodes that job has unloaded.
`eden stat io`: Prints the number of calls made to each system call in EdenFS.
`eden stat memory`: Prints the memory stats for EdenFS.
`eden stat latency`: Prints the latencies of system calls in EdenFS.
Reviewed By: bolinfest
Differential Revision: D5660345
fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
Summary:
This test was supposed to be part of D5627411, but it was behaving strangely, so it was split into a separate diff for further investigation.
The investigation showed that the test failed because the UnloadedInodeData struct contained only the name of the file, not the path to it. The fix was to implement a way to get the relative path of the file even after the inode is unloaded.
Reviewed By: simpkins
Differential Revision: D5646929
fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
Summary:
Currently treemanifest import fails when a mercurial transaction is in progress
(if the tree in question was created by the pending transaction), and this
breaks many mercurial workflows.
Turning treemanifest back off by default, to fall back to the slower but
functional flat manifest import.
D5685880 adds framework to run the eden integration tests with treemanifest
enabled; those tests can be used to tell when treemanifest is working well
enough that we can turn it back on.
Reviewed By: bolinfest
Differential Revision: D5692894
fbshipit-source-id: 5ee84cf73db4cd87bbdaae1edd92c74058fa00e2
Summary:
Because `DBG2` seems to be the level we are using to log thrift calls in
`EdenServiceHandler`, this seems like a reasonable default.
Reviewed By: simpkins
Differential Revision: D5686115
fbshipit-source-id: 2d3e0173df37919b6936f73e641f880d16dc539f
Summary:
Note that this feature was mostly implemented before this commit, but never
tested. Unsurprisingly, there were bugs.
This change also introduces a new `eden debug hg_copy_map_get_all` subcommand
because that was a straightforward way to verify the internal state of the copy
map on the server side from an integration test.
Adding this test uncovered a key copy/paste bug in `EdenThriftClient.py`
(`hgCopyMapGet` was being invoked instead of `hgCopyMapPut`.)
It also uncovered a bug in `LameThriftClient` because the `compile()` and
`eval()` calls on the output are not appropriate when the return type of the
Thrift endpoint is `string`.
Reviewed By: simpkins
Differential Revision: D5686114
fbshipit-source-id: f0093d2b67062c01982dc5bc1f0db2774b3a9356
Summary:
1. Modified `TreeInode::unloadChildrenNow()` to return the number of inodes that have been unloaded.
2. Modified `EdenServiceHandler::unloadInodeForPath()` to return the number of inodes that are unloaded.
Reviewed By: simpkins
Differential Revision: D5627539
fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
Summary:
Added a test case, `test_unload_free_inodes_age`, to verify the behaviour of `unloadChildrenNow` with the age parameter.
Added a new `age` parameter to `unloadInodeForPath` in `eden.thrift` and `EdenServiceHandler`.
Modified the `do_unload_inodes` function in `debug.py` to support the new behaviour.
Reviewed By: simpkins
Differential Revision: D5565859
fbshipit-source-id: a35053725be26bc906cf158969cbe21db1cbadde
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.
As it stands, given the way we subclass `dirstate`, it does not appear that
`clear()` would be called in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate`.
Reviewed By: wez
Differential Revision: D5675298
fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a
Summary: This will make subsequent changes to these files cleaner.
Reviewed By: wez
Differential Revision: D5675296
fbshipit-source-id: 06b14d55485415e3ec8a59a4bcc50e6189464b7d
Summary: Provide a thrift interface to invalidate the cache for an inode denoted by path.
Reviewed By: simpkins
Differential Revision: D5655387
fbshipit-source-id: 887aa4963d216a0d8eed93b6fb8721632cc31d19
Summary:
Previously, we were overloading `hgSetDirstateTuple()` to also make it possible
to delete an entry from the internal `hgDirstateTuples` map. Now we have an
explicit method to do this, which enables us to remove some hacks/TODOs.
Reviewed By: simpkins
Differential Revision: D5665170
fbshipit-source-id: bc0753d4990c8966fd9e6c50b29a954d5023292f
Summary:
This was helpful when debugging interactions between Mercurial's
dirstate and the related Thrift calls to Eden server.
Reviewed By: simpkins
Differential Revision: D5664872
fbshipit-source-id: 4e6ef3b034f4fc81f0d467974311a58f54b6e47b
Summary: Modified `TreeInode::unloadChildrenNow` so that it unloads inodes whose age is greater than a specified age.
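A simplified Python sketch of age-based unloading, assuming each child records a last-access time. The flat dict, the parameter names and the returned count are illustrative choices; the real method walks the inode tree in C++.

```python
def unload_children_now(children, now, min_age):
    """Unload children whose age exceeds min_age and return how many were unloaded.

    children maps name -> last access time, in the same units as `now`.
    """
    stale = [name for name, last_access in children.items()
             if now - last_access > min_age]
    for name in stale:
        del children[name]
    return len(stale)
```

For example, with `now=100` and `min_age=30`, a child last accessed at time 50 is unloaded while one accessed at time 90 is kept.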
Reviewed By: simpkins
Differential Revision: D5526137
fbshipit-source-id: 91e2364d55e31befedcf43d98c26467e1a472ef9
Summary:
Update the EdenServer code to correctly aggregate the thread local stats data
from all threads. Previously it only aggregated stats from the
FunctionScheduler thread that it was running in. This thread never updates any
stats, so it never had any actual stats data.
This also adds a thrift call to trigger stats flushing immediately. This can
be called from integration tests to ensure that they get up-to-date stats
information.
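
The aggregation problem can be illustrated with a small Python analogue. The `ThreadStats` class is hypothetical and stands in for the thread-local stats object; the real code is C++ and uses folly's `ThreadLocal`.

```python
import threading


class ThreadStats:
    """Per-thread counters plus a registry so one thread can sum them all."""

    def __init__(self):
        self._local = threading.local()
        self._all = []                 # every thread's counter dict
        self._lock = threading.Lock()

    def _counters(self):
        counters = getattr(self._local, "counters", None)
        if counters is None:
            counters = {}
            self._local.counters = counters
            with self._lock:
                self._all.append(counters)
        return counters

    def record(self, name, amount=1):
        counters = self._counters()
        counters[name] = counters.get(name, 0) + amount

    def aggregate_current_thread(self):
        """The old behaviour: only the calling thread's (often empty) stats."""
        return dict(self._counters())

    def aggregate(self):
        """Sum counters from every thread (call while writers are quiescent)."""
        total = {}
        with self._lock:
            for counters in self._all:
                for name, value in counters.items():
                    total[name] = total.get(name, 0) + value
        return total
```

A thread that never records anything, like the scheduler thread described above, sees nothing from `aggregate_current_thread()` but the full totals from `aggregate()`.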
Reviewed By: bolinfest
Differential Revision: D5657267
fbshipit-source-id: 060a24c00a19568b09ae8795477d73a3baab9a3c
Summary:
Update all of the code using ThreadLocal<EdenStats> to pass in a non-default
Tag parameter to the ThreadLocal template.
A non-default tag parameter is required to use the accessAllThreads() method on
the ThreadLocal object. We need to use accessAllThreads() to perform stats
aggregation correctly. Currently the EdenServer code is only performing
aggregation for stats in the FunctionScheduler.
This diff only updates the ThreadLocal<EdenStats> type, but does not contain
any behavior changes. I will fix the stats aggregation in a subsequent diff.
Reviewed By: bolinfest
Differential Revision: D5657268
fbshipit-source-id: bc4b6f56324eb8d3052c023fd3f6f64f99b1d4e0
Summary:
Change the default value for --use_hg_tree_manifest to true, to re-enable
treemanifest import by default. This should work much better now that we are
using the latest treemanifest code and now support fetching treemanifest data
from the remote server when it is not found locally.
Reviewed By: bolinfest
Differential Revision: D5647204
fbshipit-source-id: 424baf4de1d3b247c1e04a838040baeb5976404b
Summary:
This refactors the treemanifest union store initialization code, and fixes a
crash introduced in D5560787 if --use_hg_tree_manifest is enabled but the
underlying repository does not support treemanifest.
This moves the treemanifest initialization code into a new importTreeManifest()
helper function, and separates it from parsing the CMD_STARTED response. We
now only initialize unionStore_ if FLAGS_use_hg_tree_manifest is enabled and
the repository supports treemanifest. Other places in the code now only check
if unionStore_ is non-null to see if they should try importing via
treemanifest.
Splitting importTreeManifest() from the CMD_STARTED response parsing will
potentially make things cleaner in the future for using multiple
hg_import_helper.py processes. Even if we start multiple of these processes,
we only need one instance of the C++ union store, and we don't need to
re-initialize the treemanifest separately each time.
Reviewed By: bolinfest
Differential Revision: D5647203
fbshipit-source-id: 71e14156f1d0ebc0880dd819c5b041b7c1146818
Summary:
A removal followed by a create should be treated as a change rather than a
no-op. Previously the code was merging this into a no-op, and was also
ignoring any subsequent modifications in the same journal period being queried
(since it processes transactions in reverse order).
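
A Python sketch of the merging rule, under stated assumptions: events are folded oldest-first here for readability (the real code walks the journal in reverse), and every rule other than "removed then created is a change" is an illustrative guess rather than something taken from the implementation.

```python
CREATED, REMOVED, CHANGED = "created", "removed", "changed"


def net_changes(events):
    """Fold (path, kind) events, oldest first, into one net change per path."""
    net = {}
    for path, kind in events:
        prev = net.get(path)
        if prev is None:
            net[path] = kind
        elif prev == REMOVED and kind == CREATED:
            net[path] = CHANGED   # the fix: a change, not a no-op
        elif prev == CREATED and kind == REMOVED:
            del net[path]         # assumed: create then remove cancels out
        elif prev == CREATED:
            net[path] = CREATED   # assumed: still new relative to the start
        elif kind == REMOVED:
            net[path] = REMOVED
        else:
            net[path] = CHANGED   # later modifications are not ignored
    return net
```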
Reviewed By: bolinfest
Differential Revision: D5647669
fbshipit-source-id: 29389206f0e818d6b6248fbb6697b85e93064b1f
Summary:
Recently, autodeps switched to requiring simple types for the various dep
parameters in order to support proper annotation parsing (D5387184), which
started breaking on `eden/fs/fuse/TARGETS`.
Reviewed By: simpkins
Differential Revision: D5620249
fbshipit-source-id: 16fa2c73421ff7e9929c71290a662babde1289ec