Summary:
Building with gcc/opt mode complains that `strlen()` is not a constant
expression, so I'm dropping the `constexpr` to fix the opt build.
Reviewed By: chadaustin
Differential Revision: D6272580
fbshipit-source-id: a8e3a93ea04e878353c6cab31cb0b1b4705276fe
Summary:
In Eden, some Thrift endpoints get a lot of hits whereas others are used less
frequently. By giving each endpoint its own logging category, we can toggle the
logging for each one independently.
Each of our Thrift endpoint methods should start with the following macro:
INSTRUMENT_THRIFT_CALL();
then within the method, the macro `TLOG()` should be used everywhere
`XLOG(LEVEL)` would normally used. `LOG()` ensures the logger with the category
specific to the method will be used.
For an endpoint named `XXX`, the logging category will be `eden.thrift.XXX`.
Reviewed By: simpkins
Differential Revision: D6108311
fbshipit-source-id: 23a34179811359b0819434de31a3601d29c3b4f0
Summary:
Opening the RocksDB store can take a long time if there is a fair amount of
data in the store. In my current .eden directory I have around 10GB of data in
the local store, and it takes RocksDB nearly 60 seconds to open the database.
These log messages help provide a little more information about what edenfs is
doing during this startup delay.
Reviewed By: bolinfest
Differential Revision: D6263603
fbshipit-source-id: a0945aa1cc020b95944b365b17869756dcc27407
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.
Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.
This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.
Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.
In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.
Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shutdown Eden, delete the old mounts, restart Eden, and re-clone.
In the end, this change deletes a number of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.
Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.
Reviewed By: simpkins
Differential Revision: D6179950
fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.
This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`
Reviewed By: bolinfest
Differential Revision: D6232695
fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
Summary:
D5067763 introduced a potential deadlock. The issue is that all of the FUSE threads were blocked
on the HgImporter thread pool, which completes its futures back on serverEventThread_. The
FUSE threads were blocked on Future::get() in ensureDataLoaded().
Eventually, the right fix is some combination of eliminating ensureDataLoaded() and
replacing it with an explicitly-asynchronous API.
Reviewed By: bolinfest
Differential Revision: D6212858
fbshipit-source-id: 42b17d3e20a200f26b87588784edb5ee51e96a4a
Summary:
The result of this call was never used, and the call does not appear to have
any side-effects.
Reviewed By: chadaustin
Differential Revision: D6212076
fbshipit-source-id: e87858b75d9c0482e5f544411a85ebf1a66d63ba
Summary:
For a while, the recommended approach (e.g. from folks on the thrift team, from folks on the javafoundations team) has been to use `languages = ["java-swift"]`. This renaming brings the code in line with the recommendation. There are no actual changes to any builds, this is just a rename.
This is the second diff in a small stack. It's a fully automated change that uses buildozer to rename `*-java` rules to `*-javadeprecated`.
Differential Revision: D6189950
fbshipit-source-id: a08ba86cb92293a07eceb4b6b29385c7bf647b72
Summary:
For a while, the recommended approach (e.g. from folks on the thrift team, from
folks on the javafoundations team) has been to use `languages = ["java-swift"]`.
This renaming brings the code in line with the recommendation. There are no
actual changes to any builds, this is just a rename.
This is the first diff in a small stack. It edits the `JavaThriftConverter` to
use the `-javadeprecated` naming, but with a `java_library` shim that retains
the `-java` naming temporarily (to ease the transition). There's also a little
shim when processing the `languages` parameter, to accept either `java` or
`javadeprecated`.
Reviewed By: yfeldblum
Differential Revision: D6204449
fbshipit-source-id: 3e7f3ca7a4d5e1976176479ff4298da72af9e685
Summary:
There was a lock ordering violation in the combination of StreamingSubscriber and Journal. I changed StreamingSubscriber to always acquire the Journal lock before StreamingSubscriber's State lock.
I also simplified StreamingSubscriber's API to a single static function to isolate all of the lifetime and thread safety concerns in one implementation file.
This fixes a regression caused by D6162717
Reviewed By: simpkins
Differential Revision: D6202770
fbshipit-source-id: 326269db15bf3200bd6321edf372daf286784fb5
Summary:
We have a couple of issues with the watchman/eden integration:
1. If you "win" the race, you can cause a segfault during shutdown
by changing files during unmount. This causes the journal updating
code to trigger a send to the client, but the associated eventBase
has already been destroyed.
2. We don't proactively send any signal to the subscriber (in practice: watchman)
when we unmount. Watchman detects the eden shutdown by noticing that its
socket has closed but has no way to detect an unmount.
This diff tries to connect the unmount with the set of subscribers and tries
to cause the thrift socket to close out.
Reviewed By: bolinfest
Differential Revision: D6162717
fbshipit-source-id: 42d4a005089cd9cddf204997b1643570488f04c3
Summary:
Previously, the `savebackup()` and `restorebackup()` methods in `eden_dirstate`
only retained the parent commit hashes. With this change, now the dirstate tuples
and entries in the copymap for the dirstate are also included as part of the saved
state.
Failing to restore all of the state caused issues when doing things like aborting
an `hg split`, as observed by one of our users. Although this fix works, we ultimately
plan to move the responsibility for persisting dirstate data out of Eden and into the
Hg extension. Then the data will live in `.hg/dirstate` like it would for the default
dirstate implementation.
Reviewed By: simpkins
Differential Revision: D6145420
fbshipit-source-id: baa077dee73847a47cc171cd980cdd272b3a3a99
Summary: This diff helps some common pitfalls when using set_log_level.
Reviewed By: simpkins
Differential Revision: D6142849
fbshipit-source-id: 7fa35392dda148af90d0aefdb872b6e8a8b770db
Summary:
Thess flags are not used (neither in code, nor passed in on command line). List of flags that are getting killed:
thriftMaxNumMessagesInQueue
max_queue_message_num
thrift_max_unprocessed_message
thrift_set_max_num_messages_in_queue
decorator_thriftSetMaxNumMessagesInQueue
atlas_pinger_imps_thriftSetMaxNumMessagesInQueue
tesseract_ocr_max_queue_msgs_per_thread
max_msgs_in_queue
max_num_messages_in_queue
maxNumMessagesInQueue
thrift_max_messages
thrift_queue_len
max_accepts_in_pipe
max_pending_connections_per_io_thread
thriftMaxNumPendingConnectionsPerWorker
lookup_thriftSetMaxNumMessagesInQueue
thriftSetMaxNumPendingConnectionsPerWorker
max_queue_msgs_per_thread
ds_thrift_max_num_messages_in_queue
zdb_thrift_max_num_messages_in_queue
update_control_proxy_queue_length
safety_enforcer_proxy_queue_length
maxRequestsInQueue
max_num_messages_in_pipe
Reviewed By: yfeldblum
Differential Revision: D6143500
fbshipit-source-id: 541507d50100817590b91cdd48e39a29e7c465ea
Summary:
Have HgBackingStore hold multiple HgImporters in its own thread pool. Incoming
requests are processed by threads in the thread pool.
Reviewed By: wez
Differential Revision: D5067763
fbshipit-source-id: d666b1026455d13442367673010b5934ff4cdb48
Summary:
While debugging SRProxy issue I found a bug in ThriftServer initialization (more details in D6115379), due to that bug we always have accept queue of 1k and the value passed to `setMaxNumPendingConnectionsPerWorker` is ignored, thus making all call sites bogus.
We have ~350 calls to this function in fbcode, all of them pass random numbers such as 50, 100, 50k, etc. I’m removing all of them before landing the fix for Thrift to avoid causing production issues.
Unfortunately almost every such caller has a gflag for it (that are usually have confusing names, e.g. FLAGS_thriftSetMaxNumMessagesInQueue). I’m not touching these gflags, because they may be used in different configs and removing them would cause breakages.
Reviewed By: yfeldblum
Differential Revision: D6129569
fbshipit-source-id: 1550b5073bac42d0c6fb7bdffa1835bf523609c8
Summary:
During the initial phases of a large Buck build, we expect to get a lot of
`glob()` calls during the parse phase (assuming the Buck project is configured
with `glob_handler = watchman`) and a lot of `getSHA1()` calls during the
building phase. Being able to dump this to the log file will make this easier to
audit.
Reviewed By: chadaustin
Differential Revision: D6097764
fbshipit-source-id: 4cb1bb4f6b21830c4d830fcdcf87023eab859f57
Summary:
We have encountered cases where `eden health` reported
`"edenfs not healthy: edenfs not running"` even though the `edenfs` process is
still running. Because the existing implementation of `eden health` bases its
health check on the output of a `getStatus()` Thrift call, it will erroneously
report `"edenfs not running"` even if Eden is running but its Thrift server is
not running. This type of false negative could occur if `edenfs` has shutdown
the Thrift server, but not the rest of the process (quite possibly, its
shutdown is blocked on calls to `umount2()`).
This is further problematic because `eden daemon` checks `eden health`
before attempting to start the daemon. If it gets a false negative, then
`eden daemon` will forge ahead, trying to launch a new instance of the daemon,
but it will fail with a nasty error like the following:
```
I1017 11:59:25.188414 3064499 main.cpp:81] Starting edenfs. UID=5256, GID=100, PID=3064499
terminate called after throwing an instance of 'std::runtime_error'
what(): another instance of Eden appears to be running for /home/mbolin/local/.eden
*** Aborted at 1508266765 (Unix time, try 'date -d 1508266765') ***
*** Signal 6 (SIGABRT) (0x1488002ec2b3) received by PID 3064499 (pthread TID 0x7fd0d3787d40) (linux TID 3064499) (maybe from PID 30644
99, UID 5256), stack trace: ***
@ 000000000290d3cd folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
@ 00007fd0d133cacf (unknown)
@ 00007fd0d093e7c8 __GI_raise
@ 00007fd0d0940590 __GI_abort
@ 00007fd0d1dfeecc __gnu_cxx::__verbose_terminate_handler()
@ 00007fd0d1dfcdc5 __cxxabiv1::__terminate(void (*)())
@ 00007fd0d1dfce10 std::terminate()
@ 00007fd0d1dfd090 __cxa_throw
@ 00000000015fe8ca facebook::eden::EdenServer::acquireEdenLock()
@ 000000000160f27b facebook::eden::EdenServer::prepare()
@ 00000000016107d5 facebook::eden::EdenServer::run()
@ 000000000042c4ee main
@ 00007fd0d0929857 __libc_start_main
@ 0000000000548ad8 _start
Aborted
```
By providing more accurate information to `eden daemon`, if the user tries to
run it while the daemon is already running, they will get a more polite error
like the following:
```
error: edenfs is already running (pid 274205)
```
This revision addresses this issue by writing the PID of `edenfs` in the
lockfile. It updated the implementation of `eden health` to use the PID in the
lockfile to assess the health of Eden if the call to `getStatus()` fails. It
does this by running:
```
ps -p PID -o comm=
```
and applying some heuristics on the output to assess whether the command
associated with that process is the `edenfs` command. If it is, then
`eden health` reports the status as `STOPPED` whereas previously it would report
it as `DEAD`.
Reviewed By: wez
Differential Revision: D6086473
fbshipit-source-id: 825421a6818b56ddd7deea257a92c070c2232bdd
Summary:
Our existing values for these resources appear to be too conservative for
massively parallel builds. Although we expected Eden to behave more like a small
application, there are times when it has to be able to respond more like a
well-provisioned service (particularly during the parse phase of a large Buck
build).
Reviewed By: simpkins
Differential Revision: D6075340
fbshipit-source-id: 7c26e9b0f785358968430527115c63c6d8cdedc8
Summary:
We were previously generating a simple JournalDelta consisting of
just the from/to snapshot hashes. This is great from a `!O(repo)` perspective
when recording what changed but makes it difficult for clients downstream
to reason about changes that are not tracked in source control.
This diff adds a concept of `uncleanPaths` to the journal; these are paths
that we think are/were different from the hashes in the journal entry.
Since JournalDelta needs to be able to be merged I've opted for a simple
list of the paths that have a differing status; I'm not including all of
the various dirstate states for this because it is not obvious how to
reconcile the state across successive snapshot change events.
The `uncleanPaths` set is populated with an initial set of different paths as
the first part of the checkout call (prior to changing the hash), and then is
updated after the hash has changed to capture any additional differences.
Care needs to be taken to avoid recursively attempting to grab the parents lock
so I'm replicating just a little bit of the state management glue in the
`performDiff` method.
The Journal was not setting the from/to snapshot hashes when merging deltas.
This manifested in the watchman integration tests; we'd see the null revision
as the `from` and the `to` revision held the `from` revision(!).
On the watchman side we need to ask source control to expand the list of
files that changed when the from/to hashes are different; I've added code
to handle this. This doesn't do anything smart in the case that the
source control aware queries are in use. We'll look at that in a following
diff as it isn't strictly eden specific.
`watchman clock` was returning a basically empty clock unconditionally,
which meant that most since queries would report everything since the start
of time. This is most likely contributing to poor Buck performance, although
I have not investigated the performance aspect of this. It manifested itself
in the watchman integration tests.
Reviewed By: simpkins
Differential Revision: D5896494
fbshipit-source-id: a88be6448862781a1d8f5e15285ca07b4240593a
Summary:
I noticed that when running `hg push --to master`, the implicit
pull took about a minute to collect all the commits from the remote, then
we tried to make a thrift call to set the parents only to fail with a
python EPIPE stack trace.
I opted to simply make a new client for each call; this is similar to how
things were running with the lame thrift client but doesn't have as terrible
an impact on rebase performance as lame thrift client did.
It feels like things might be better if we made the client retry on transport
errors, but this is quite fiddly so let's defer that until we really need
to cut out the unix domain socket connection overheads.
```
$ hg ci
note: commit message saved in ../.hg/last-message.txt
Traceback (most recent call last):
File "/usr/bin/hg.real", line 47, in <module>
dispatch.run()
File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 81, in run
status = (dispatch(req) or 0) & 255
File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 163, in dispatch
ret = _runcatch(req)
File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 314, in _runcatch
return _callcatch(ui, _runcatchfunc)
File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 365, in _callcatch
if not handlecommandexception(ui):
File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 993, in handlecommandexception
ui.log("commandexception", "%s\n%s\n", warning, traceback.format_exc())
File "/usr/lib64/python2.7/site-packages/hgext3rd/sampling.py", line 79, in log
return super(logtofile, self).log(event, *msg, **opts)
File "/usr/lib64/python2.7/site-packages/hgext/blackbox.py", line 163, in log
parents = ctx.parents()
File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 286, in parents
return self._parents
File "/usr/lib64/python2.7/site-packages/mercurial/util.py", line 862, in __get__
result = self.func(obj)
File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 1546, in _parents
p = self._repo.dirstate.parents()
File "hgext3rd/eden/eden_dirstate.py", line 163, in parents
File "hgext3rd/eden/eden_dirstate.py", line 157, in _getparents
File "/usr/local/fb-mercurial/eden/hgext3rd/eden/EdenThriftClient.py", line 145, in getParentCommits
parents = self._client.getParentCommits(self._root)
File "facebook/eden/EdenService.py", line 7330, in getParentCommits
File "facebook/eden/EdenService.py", line 7339, in send_getParentCommits
File "thrift/transport/THeaderTransport.py", line 411, in flush
File "thrift/transport/THeaderTransport.py", line 516, in flushImpl
File "thrift/transport/TSocket.py", line 315, in write
thrift.transport.TTransport.TTransportException: Socket send failed with error 32 (Broken pipe)
```
Reviewed By: bolinfest
Differential Revision: D5942205
fbshipit-source-id: 464e5035075476f47232ca975e107e165057c912
Summary:
The eden mount may be in a deep tree (for instance, in the
watchman integration tests) and be too long to fit in the unix domain
socket address.
The failure that we see in that case mistakenly informs the user that
eden is not running, rather than that we have a broken address.
Reviewed By: simpkins
Differential Revision: D5948497
fbshipit-source-id: 3bde8b83bf08b82bdd08758382ac0e218cc12829
Summary:
Originally I thought this would help move towards removing a
`future.get()` call from FileInode, but it turned out to not make a difference
to that code.
It does make it a bit less of a chore to deal with the Journal related diff
callbacks added in D5896494 though, and is a move towards a future where we
could potentially return cached and shared instances of these objects.
This diff is a mechanical change to alter the return type so that we can share
instances returned from the object store interface. It doesn't change any
functionality.
Reviewed By: simpkins
Differential Revision: D5919268
fbshipit-source-id: efe4b3af74e80cf1df20e81b4386450b72fa2c94
Summary: Per wez, this makes the MODIFIED case consistent with the other conflict types (e.g. local_remote). Side benefit of avoiding some naming conflicts in the Haskell/Rust thrift tooling.
Reviewed By: wez
Differential Revision: D5882327
fbshipit-source-id: 3ec68c44d8c8a5c5675f1ced3842d29376d46fe2
Summary:
Following on from D5798659, this diff pulls the mount flow into
EdenServer. Previously the flow of control would bounce back and forth between
the EdenServiceHandler and EdenServer and this made it (IMO) more difficult to
follow and understand where to add things into the flow.
Now `EdenServer::mount` is the main entry point for the mounting flow.
I've simplified the stat registration and broken that out into helper methods
to avoid cluttering up the mount logic.
Reviewed By: bolinfest
Differential Revision: D5806393
fbshipit-source-id: 7c858a2a580332ce82c2600e9dc3537af1d734d1
Summary:
This moves the bind mounting and post-clone script running
functionality to methods of the EdenMount class and makes the whole
mount flow return a `Future<>`. The higher level goal is to make it
easier to see where and how we want to tweak this flow to support
graceful restart.
This is mostly straight forward but care is required to avoid deadlocks; there
are two scenarios:
# We fulfill the fuse start promise in the context of the fuse thread that is
handling the fuse initialization packet before it has signalled to the kernel
that it has come up. This can be solved by using `via(mainEventBase_)`, but...
# When remounting all the mounts on startup, we're running in the
`mainEventBase_` thread so the simplistic solution to 1. would cause us to
deadlock on startup (visible in the remount integration tests).
So to avoid this, we shunt the completion of the future via a CPU pool.
Also worth noting: the way we were setting up the global CPU pool with
wangle wasn't correct; it takes a weak reference to the pool which was
then getting destroyed when our prepare method returned. It just happened
to work for us in the facebook specific build because something else was
setting up a different CPU executor.
I've reconciled this by just setting up a thread pool of our own and
using that explicitly.
Reviewed By: bolinfest
Differential Revision: D5798659
fbshipit-source-id: f1c48730f283f6962f6cd706c02d82ea2952e369
Summary:
This is reasonably straightforward, although a little
more fiddly than I'd hoped because the timer wheel stuff doesn't
offer a convenient way to set up a recurring timer.
I've also made the inode unloading code get run globally for all
mounts; it was previously scheduling one timer per mount point.
This nets out the same; the function scheduler was just a single
thread anyway, so there is no change in the level of concurrency.
I believe that this tidies up the unload counter too; it looked
like we'd set the counter to be the result of the last mount
point that we processed rather than the aggregate of all mounts.
Having the unload timer be associated with the server rather
than the mount points means that we don't have to do anything
special to coordinate with the timer management when the mount
point is being torn down.
Reviewed By: bolinfest
Differential Revision: D5792938
fbshipit-source-id: 1a14bb7b7f4952139e684fe6b52f64bd1ba70dd0
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.
Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.
This also updates the log messages to communicate when a there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.
Reviewed By: wez
Differential Revision: D5800833
fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
Summary: This looks like it ended up getting done together with the original diff
Reviewed By: simpkins
Differential Revision: D5796901
fbshipit-source-id: 24ab05c50b13a37eefe903de5fd3f2dac3d462da
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.
I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out. It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.
The second of the promises is associated with the fuse thread pool
winding down. The attached future can be extracted and used by the
client of the EdenMount class. This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close. In the current integration, since we ignore
the result of that future, the handle is implicitly closed.
These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.
Reviewed By: bolinfest
Differential Revision: D5778214
fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.
Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.
Reviewed By: bolinfest
Differential Revision: D5778212
fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.
This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel. The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure. Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.
The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads. The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished. A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.
I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.
Reviewed By: bolinfest
Differential Revision: D5751521
fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
Summary:
this is the dumb and obvious refactor of this method to
propagate and wait on the Future from the EdenMount::create call.
Reviewed By: simpkins
Differential Revision: D5750372
fbshipit-source-id: fb7ce595de3bacab99ce8af6ef597ef6f0417c12
Summary: Add a method to get the EventBase used to drive the main thread.
Reviewed By: wez
Differential Revision: D5750054
fbshipit-source-id: ad2eba021a6200ed28e39a60b16d90aabfaee5b4
Summary:
This moves logic for running the server from main.cpp into the EdenServer
class.
This will make it easier to refactor some of the start-up and running process
in the future, and makes EdenServer the owner of this entire workflow. This
will help as we start splitting the startup code into two separate code paths:
one for a new, fresh start, and one for graceful restart taking over mounts
from an existing eden process.
Reviewed By: bolinfest
Differential Revision: D5732656
fbshipit-source-id: 63f05eb1105078764f4e4931d770416dd5f6d6dc
Summary:
Added new tool to report stat information of EdenFs like fuse counters, Memory counters, latencies, Inode status for all the mount points etc.
eden stat : Prints the general information about eden like list of mount points, loaded unloaded and materialized inodes in each mount point. Also this reports how well periodic unload job is doing by reporting the number of unloaded inodes by periodic job.
eden stat io : Prints how many number of calls made to a system call in Edenfs.
eden stat memory : returns the memory stat for edenfs.
eden stat latency : reports the latencies of system calls in Edenfs.
Reviewed By: bolinfest
Differential Revision: D5660345
fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
Summary:
This test was supposed to be a part of D5627411 but it was causing strange behaviour so was brought to a separate diff for further investigation.
After investigating, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix for this was to implement a way to get the relative path of the file even after the inode is unloaded.
Reviewed By: simpkins
Differential Revision: D5646929
fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
Summary:
Because `DBG2` seems to be the level we are using to log thrift calls in
`EdenServiceHandler`, this seems like a reasonable default.
Reviewed By: simpkins
Differential Revision: D5686115
fbshipit-source-id: 2d3e0173df37919b6936f73e641f880d16dc539f
Summary:
Note that this feature was mostly implemented before this commit, but never
tested. Unsurprisingly, there were bugs.
This change also introduces a new `eden debug hg_copy_map_get_all` subcommand
because that was a straightforward way to verify the internal state of the copy
map on the server side from an integration test.
Adding this test uncovered a key copy/paste bug in `EdenThriftClient.py`
(`hgCopyMapGet` was being invoked instead of `hgCopyMapPut`.)
It also uncovered a bug in `LameThriftClient` because the `compile()` and
`eval()` calls on the output are not appropriate when the return type of the
Thrift endpoint is `string`.
Reviewed By: simpkins
Differential Revision: D5686114
fbshipit-source-id: f0093d2b67062c01982dc5bc1f0db2774b3a9356
Summary:
1.Modified `TreeInode::unloadChildrenNow()` to return number of inodes that have been unloaded.
2.Modified `EdenServiceHandler::unloadInodeForPath()` to return number of inodes that are unloaded.
Reviewed By: simpkins
Differential Revision: D5627539
fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
Summary:
Added a test case `test_unload_free_inodes_age` to verify the behaviour of unloadChildrenNow with age parameter.
Added new parameter age to `unloadInodeForPath` in eden.thrift, and `EdenServiceHandler`.
Modified `do_unload_inodes` function in `debug.py` to support the new behaviour.
Reviewed By: simpkins
Differential Revision: D5565859
fbshipit-source-id: a35053725be26bc906cf158969cbe21db1cbadde
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.
As it stands, the way we subclass `dirstate`, it does not appear like `clear()`
should be called, in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate.`
Reviewed By: wez
Differential Revision: D5675298
fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a