Commit Graph

197 Commits

Author SHA1 Message Date
Michael Bolin
41f4127e8d Fix INSTRUMENT_THRIFT_CALL macro so it builds in opt mode.
Summary:
Building with gcc/opt mode complains that `strlen()` is not a constant
expression, so I'm dropping the `constexpr` to fix the opt build.

Reviewed By: chadaustin

Differential Revision: D6272580

fbshipit-source-id: a8e3a93ea04e878353c6cab31cb0b1b4705276fe
2017-11-08 10:49:35 -08:00
Michael Bolin
489e9071c4 Give each Thrift endpoint its own logging category.
Summary:
In Eden, some Thrift endpoints get a lot of hits whereas others are used less
frequently. By giving each endpoint its own logging category, we can toggle the
logging for each one independently.

Each of our Thrift endpoint methods should start with the following macro:

    INSTRUMENT_THRIFT_CALL();

then within the method, the macro `TLOG()` should be used everywhere
`XLOG(LEVEL)` would normally used. `LOG()` ensures the logger with the category
specific to the method will be used.

For an endpoint named `XXX`, the logging category will be `eden.thrift.XXX`.

Reviewed By: simpkins

Differential Revision: D6108311

fbshipit-source-id: 23a34179811359b0819434de31a3601d29c3b4f0
2017-11-07 20:56:12 -08:00
Adam Simpkins
c0e048148e log around opening the RocksDB local store
Summary:
Opening the RocksDB store can take a long time if there is a fair amount of
data in the store.  In my current .eden directory I have around 10GB of data in
the local store, and it takes RocksDB nearly 60 seconds to open the database.

These log messages help provide a little more information about what edenfs is
doing during this startup delay.

Reviewed By: bolinfest

Differential Revision: D6263603

fbshipit-source-id: a0945aa1cc020b95944b365b17869756dcc27407
2017-11-07 14:17:37 -08:00
Michael Bolin
5d738193e5 Store Hg dirstate data in Hg instead of Eden.
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.

Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.

This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.

Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.

In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.

Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shutdown Eden, delete the old mounts, restart Eden, and re-clone.

In the end, this change deletes a number of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.

Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.

Reviewed By: simpkins

Differential Revision: D6179950

fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
2017-11-06 19:56:49 -08:00
Chad Austin
8b9261f2a1 run clang-format across all C++ files
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.

This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`

Reviewed By: bolinfest

Differential Revision: D6232695

fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
2017-11-03 16:02:03 -07:00
Chad Austin
e5984e48d5 Revert D5067763: [eden] Use a pool of HgImporter workers
Summary:
D5067763 introduced a potential deadlock.  The issue is that all of the FUSE threads were blocked
on the HgImporter thread pool, which completes its futures back on serverEventThread_.  The
FUSE threads were blocked on Future::get() in ensureDataLoaded().

Eventually, the right fix is some combination of eliminating ensureDataLoaded() and
replacing it with an explicitly-asynchronous API.

Reviewed By: bolinfest

Differential Revision: D6212858

fbshipit-source-id: 42b17d3e20a200f26b87588784edb5ee51e96a4a
2017-11-01 14:55:45 -07:00
Adam Simpkins
d6c2f5f9e9 remove unnecessary call to Journal::getLatest()
Summary:
The result of this call was never used, and the call does not appear to have
any side-effects.

Reviewed By: chadaustin

Differential Revision: D6212076

fbshipit-source-id: e87858b75d9c0482e5f544411a85ebf1a66d63ba
2017-11-01 14:55:45 -07:00
Ryan Menezes
22b2ccf616 thrift_library rename java{,deprecated}
Summary:
For a while, the recommended approach (e.g. from folks on the thrift team, from folks on the javafoundations team) has been to use `languages = ["java-swift"]`. This renaming brings the code in line with the recommendation. There are no actual changes to any builds, this is just a rename.

This is the second diff in a small stack. It's a fully automated change that uses buildozer to rename `*-java` rules to `*-javadeprecated`.

Differential Revision: D6189950

fbshipit-source-id: a08ba86cb92293a07eceb4b6b29385c7bf647b72
2017-11-01 14:34:07 -07:00
Ryan Menezes
7955a023b3 thrift_library start renaming java{,deprecated}
Summary:
For a while, the recommended approach (e.g. from folks on the thrift team, from
folks on the javafoundations team) has been to use `languages = ["java-swift"]`.
This renaming brings the code in line with the recommendation. There are no
actual changes to any builds, this is just a rename.

This is the first diff in a small stack. It edits the `JavaThriftConverter` to
use the `-javadeprecated` naming, but with a `java_library` shim that retains
the `-java` naming temporarily (to ease the transition). There's also a little
shim when processing the `languages` parameter, to accept either `java` or
`javadeprecated`.

Reviewed By: yfeldblum

Differential Revision: D6204449

fbshipit-source-id: 3e7f3ca7a4d5e1976176479ff4298da72af9e685
2017-11-01 14:34:07 -07:00
Chad Austin
edf20155c9 fix a deadlock - lock ordering violation in Journal/StreamingSubscriber
Summary:
There was a lock ordering violation in the combination of StreamingSubscriber and Journal.  I changed StreamingSubscriber to always acquire the Journal lock before StreamingSubscriber's State lock.

I also simplified StreamingSubscriber's API to a single static function to isolate all of the lifetime and thread safety concerns in one implementation file.

This fixes a regression caused by D6162717

Reviewed By: simpkins

Differential Revision: D6202770

fbshipit-source-id: 326269db15bf3200bd6321edf372daf286784fb5
2017-11-01 12:24:44 -07:00
Wez Furlong
d3202383c1 tidy up journal subscribers on unmount
Summary:
We have a couple of issues with the watchman/eden integration:

1. If you "win" the race, you can cause a segfault during shutdown
by changing files during unmount.  This causes the journal updating
code to trigger a send to the client, but the associated eventBase
has already been destroyed.

2. We don't proactively send any signal to the subscriber (in practice: watchman)
when we unmount.  Watchman detects the eden shutdown by noticing that its
socket has closed but has no way to detect an unmount.

This diff tries to connect the unmount with the set of subscribers and tries
to cause the thrift socket to close out.

Reviewed By: bolinfest

Differential Revision: D6162717

fbshipit-source-id: 42d4a005089cd9cddf204997b1643570488f04c3
2017-10-27 09:01:57 -07:00
Michael Bolin
ac5b213e92 Include the dirstate tuples and copymap when backing up the dirstate.
Summary:
Previously, the `savebackup()` and `restorebackup()` methods in `eden_dirstate`
only retained the parent commit hashes. With this change, now the dirstate tuples
and entries in the copymap for the dirstate are also included as part of the saved
state.

Failing to restore all of the state caused issues when doing things like aborting
an `hg split`, as observed by one of our users. Although this fix works, we ultimately
plan to move the responsibility for persisting dirstate data out of Eden and into the
Hg extension. Then the data will live in `.hg/dirstate` like it would for the default
dirstate implementation.

Reviewed By: simpkins

Differential Revision: D6145420

fbshipit-source-id: baa077dee73847a47cc171cd980cdd272b3a3a99
2017-10-25 22:36:06 -07:00
Chad Austin
3240f2dff9 warn when creating new logger, allow disabling inheritance for log level
Summary: This diff helps some common pitfalls when using set_log_level.

Reviewed By: simpkins

Differential Revision: D6142849

fbshipit-source-id: 7fa35392dda148af90d0aefdb872b6e8a8b770db
2017-10-25 09:56:16 -07:00
Stepan Palamarchuk
2e7804cc3e Remove more unused flags
Summary:
Thess flags are not used (neither in code, nor passed in on command line). List of flags that are getting killed:
thriftMaxNumMessagesInQueue
max_queue_message_num
thrift_max_unprocessed_message
thrift_set_max_num_messages_in_queue
decorator_thriftSetMaxNumMessagesInQueue
atlas_pinger_imps_thriftSetMaxNumMessagesInQueue
tesseract_ocr_max_queue_msgs_per_thread
max_msgs_in_queue
max_num_messages_in_queue
maxNumMessagesInQueue
thrift_max_messages
thrift_queue_len
max_accepts_in_pipe
max_pending_connections_per_io_thread
thriftMaxNumPendingConnectionsPerWorker
lookup_thriftSetMaxNumMessagesInQueue
thriftSetMaxNumPendingConnectionsPerWorker
max_queue_msgs_per_thread
ds_thrift_max_num_messages_in_queue
zdb_thrift_max_num_messages_in_queue
update_control_proxy_queue_length
safety_enforcer_proxy_queue_length
maxRequestsInQueue
max_num_messages_in_pipe

Reviewed By: yfeldblum

Differential Revision: D6143500

fbshipit-source-id: 541507d50100817590b91cdd48e39a29e7c465ea
2017-10-25 08:18:10 -07:00
Chad Austin
03396f30f7 Use a pool of HgImporter workers
Summary:
Have HgBackingStore hold multiple HgImporters in its own thread pool. Incoming
requests are processed by threads in the thread pool.

Reviewed By: wez

Differential Revision: D5067763

fbshipit-source-id: d666b1026455d13442367673010b5934ff4cdb48
2017-10-24 10:45:33 -07:00
Stepan Palamarchuk
e0227f8c4e Remove call-sites to ThriftServer::setMaxNumPendingConnectionsPerWorker()
Summary:
While debugging SRProxy issue I found a bug in ThriftServer initialization (more details in D6115379), due to that bug we always have accept queue of 1k and the value passed to `setMaxNumPendingConnectionsPerWorker` is ignored, thus making all call sites bogus.

We have ~350 calls to this function in fbcode, all of them pass random numbers such as 50, 100, 50k, etc. I’m removing all of them before landing the fix for Thrift to avoid causing production issues.

Unfortunately almost every such caller has a gflag for it (that are usually have confusing names, e.g. FLAGS_thriftSetMaxNumMessagesInQueue). I’m not touching these gflags, because they may be used in different configs and removing them would cause breakages.

Reviewed By: yfeldblum

Differential Revision: D6129569

fbshipit-source-id: 1550b5073bac42d0c6fb7bdffa1835bf523609c8
2017-10-23 19:32:03 -07:00
Michael Bolin
ed5b3c9aef Add logging for both glob() and getSHA1() in EdenServiceHandler.
Summary:
During the initial phases of a large Buck build, we expect to get a lot of
`glob()` calls during the parse phase (assuming the Buck project is configured
with `glob_handler = watchman`) and a lot of `getSHA1()` calls during the
building phase. Being able to dump this to the log file will make this easier to
audit.

Reviewed By: chadaustin

Differential Revision: D6097764

fbshipit-source-id: 4cb1bb4f6b21830c4d830fcdcf87023eab859f57
2017-10-19 17:20:08 -07:00
Michael Bolin
2c46c59ad6 Write the PID to the lockfile and update eden health to use it.
Summary:
We have encountered cases where `eden health` reported
`"edenfs not healthy: edenfs not running"` even though the `edenfs` process is
still running. Because the existing implementation of `eden health` bases its
health check on the output of a `getStatus()` Thrift call, it will erroneously
report `"edenfs not running"` even if Eden is running but its Thrift server is
not running. This type of false negative could occur if `edenfs` has shutdown
the Thrift server, but not the rest of the process (quite possibly, its
shutdown is blocked on calls to `umount2()`).

This is further problematic because `eden daemon` checks `eden health`
before attempting to start the daemon. If it gets a false negative, then
`eden daemon` will forge ahead, trying to launch a new instance of the daemon,
but it will fail with a nasty error like the following:

```
I1017 11:59:25.188414 3064499 main.cpp:81] Starting edenfs.  UID=5256, GID=100, PID=3064499
terminate called after throwing an instance of 'std::runtime_error'
  what():  another instance of Eden appears to be running for /home/mbolin/local/.eden
*** Aborted at 1508266765 (Unix time, try 'date -d 1508266765') ***
*** Signal 6 (SIGABRT) (0x1488002ec2b3) received by PID 3064499 (pthread TID 0x7fd0d3787d40) (linux TID 3064499) (maybe from PID 30644
99, UID 5256), stack trace: ***
    @ 000000000290d3cd folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
    @ 00007fd0d133cacf (unknown)
    @ 00007fd0d093e7c8 __GI_raise
    @ 00007fd0d0940590 __GI_abort
    @ 00007fd0d1dfeecc __gnu_cxx::__verbose_terminate_handler()
    @ 00007fd0d1dfcdc5 __cxxabiv1::__terminate(void (*)())
    @ 00007fd0d1dfce10 std::terminate()
    @ 00007fd0d1dfd090 __cxa_throw
    @ 00000000015fe8ca facebook::eden::EdenServer::acquireEdenLock()
    @ 000000000160f27b facebook::eden::EdenServer::prepare()
    @ 00000000016107d5 facebook::eden::EdenServer::run()
    @ 000000000042c4ee main
    @ 00007fd0d0929857 __libc_start_main
    @ 0000000000548ad8 _start
Aborted
```

By providing more accurate information to `eden daemon`, if the user tries to
run it while the daemon is already running, they will get a more polite error
like the following:

```
error: edenfs is already running (pid 274205)
```

This revision addresses this issue by writing the PID of `edenfs` in the
lockfile. It updated the implementation of `eden health` to use the PID in the
lockfile to assess the health of Eden if the call to `getStatus()` fails. It
does this by running:

```
ps -p PID -o comm=
```

and applying some heuristics on the output to assess whether the command
associated with that process is the `edenfs` command. If it is, then
`eden health` reports the status as `STOPPED` whereas previously it would report
it as `DEAD`.

Reviewed By: wez

Differential Revision: D6086473

fbshipit-source-id: 825421a6818b56ddd7deea257a92c070c2232bdd
2017-10-18 11:29:43 -07:00
Michael Bolin
288d599fb3 Set flags to use default values for Thrift until we have a reason to do otherwise.
Summary:
Our existing values for these resources appear to be too conservative for
massively parallel builds. Although we expected Eden to behave more like a small
application, there are times when it has to be able to respond more like a
well-provisioned service (particularly during the parse phase of a large Buck
build).

Reviewed By: simpkins

Differential Revision: D6075340

fbshipit-source-id: 7c26e9b0f785358968430527115c63c6d8cdedc8
2017-10-18 11:29:43 -07:00
Wez Furlong
25a9786ca5 augment JournalDelta with unclean paths on snapshot hash change
Summary:
We were previously generating a simple JournalDelta consisting of
just the from/to snapshot hashes.  This is great from a `!O(repo)` perspective
when recording what changed but makes it difficult for clients downstream
to reason about changes that are not tracked in source control.

This diff adds a concept of `uncleanPaths` to the journal; these are paths
that we think are/were different from the hashes in the journal entry.

Since JournalDelta needs to be able to be merged I've opted for a simple
list of the paths that have a differing status; I'm not including all of
the various dirstate states for this because it is not obvious how to
reconcile the state across successive snapshot change events.

The `uncleanPaths` set is populated with an initial set of different paths as
the first part of the checkout call (prior to changing the hash), and then is
updated after the hash has changed to capture any additional differences.

Care needs to be taken to avoid recursively attempting to grab the parents lock
so I'm replicating just a little bit of the state management glue in the
`performDiff` method.

The Journal was not setting the from/to snapshot hashes when merging deltas.
This manifested in the watchman integration tests; we'd see the null revision
as the `from` and the `to` revision held the `from` revision(!).

On the watchman side we need to ask source control to expand the list of
files that changed when the from/to hashes are different; I've added code
to handle this.  This doesn't do anything smart in the case that the
source control aware queries are in use.  We'll look at that in a following
diff as it isn't strictly eden specific.

`watchman clock` was returning a basically empty clock unconditionally,
which meant that most since queries would report everything since the start
of time.  This is most likely contributing to poor Buck performance, although
I have not investigated the performance aspect of this.  It manifested itself
in the watchman integration tests.

Reviewed By: simpkins

Differential Revision: D5896494

fbshipit-source-id: a88be6448862781a1d8f5e15285ca07b4240593a
2017-10-16 22:46:54 -07:00
Chad Austin
73dc6620fc add a debug CLI command to set a log category's level
Reviewed By: simpkins

Differential Revision: D6068539

fbshipit-source-id: 45d0baecc2291c83c48ad22403d44a790b270b9a
2017-10-16 16:37:10 -07:00
Michael Bolin
457f4f52e2 Include the directory that Eden was looking at in the exception message.
Reviewed By: simpkins

Differential Revision: D5997873

fbshipit-source-id: 0f598b873f6cc18ce1cd29c08b546e4884db0a66
2017-10-06 12:28:41 -07:00
Jeremy Fitzhardinge
fe4a7b6765 eden: enable rust Thrift generation
Reviewed By: wez

Differential Revision: D5909038

fbshipit-source-id: a4bdeac30a17682b2a92e182ab53d0413ad54256
2017-10-03 00:54:17 -07:00
Wez Furlong
4218700735 deal with idle timeouts for thrift clients
Summary:
I noticed that when running `hg push --to master`, the implicit
pull took about a minute to collect all the commits from the remote, then
we tried to make a thrift call to set the parents only to fail with a
python EPIPE stack trace.

I opted to simply make a new client for each call; this is similar to how
things were running with the lame thrift client but doesn't have as terrible
an impact on rebase performance as lame thrift client did.

It feels like things might be better if we made the client retry on transport
errors, but this is quite fiddly so let's defer that until we really need
to cut out the unix domain socket connection overheads.

```
$ hg ci
note: commit message saved in ../.hg/last-message.txt
Traceback (most recent call last):
  File "/usr/bin/hg.real", line 47, in <module>
    dispatch.run()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 81, in run
    status = (dispatch(req) or 0) & 255
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 163, in dispatch
    ret = _runcatch(req)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 314, in _runcatch
    return _callcatch(ui, _runcatchfunc)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 365, in _callcatch
    if not handlecommandexception(ui):
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 993, in handlecommandexception
    ui.log("commandexception", "%s\n%s\n", warning, traceback.format_exc())
  File "/usr/lib64/python2.7/site-packages/hgext3rd/sampling.py", line 79, in log
    return super(logtofile, self).log(event, *msg, **opts)
  File "/usr/lib64/python2.7/site-packages/hgext/blackbox.py", line 163, in log
    parents = ctx.parents()
  File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 286, in parents
    return self._parents
  File "/usr/lib64/python2.7/site-packages/mercurial/util.py", line 862, in __get__
    result = self.func(obj)
  File "/usr/lib64/python2.7/site-packages/mercurial/context.py", line 1546, in _parents
    p = self._repo.dirstate.parents()
  File "hgext3rd/eden/eden_dirstate.py", line 163, in parents
  File "hgext3rd/eden/eden_dirstate.py", line 157, in _getparents
  File "/usr/local/fb-mercurial/eden/hgext3rd/eden/EdenThriftClient.py", line 145, in getParentCommits
    parents = self._client.getParentCommits(self._root)
  File "facebook/eden/EdenService.py", line 7330, in getParentCommits
  File "facebook/eden/EdenService.py", line 7339, in send_getParentCommits
  File "thrift/transport/THeaderTransport.py", line 411, in flush
  File "thrift/transport/THeaderTransport.py", line 516, in flushImpl
  File "thrift/transport/TSocket.py", line 315, in write
thrift.transport.TTransport.TTransportException: Socket send failed with error 32 (Broken pipe)
```

Reviewed By: bolinfest

Differential Revision: D5942205

fbshipit-source-id: 464e5035075476f47232ca975e107e165057c912
2017-10-02 18:21:44 -07:00
Wez Furlong
9c634a9ce2 readlink .eden/socket before connecting
Summary:
The eden mount may be in a deep tree (for instance, in the
watchman integration tests) and be too long to fit in the unix domain
socket address.

The failure that we see in that case mistakenly informs the user that
eden is not running, rather than that we have a broken address.

Reviewed By: simpkins

Differential Revision: D5948497

fbshipit-source-id: 3bde8b83bf08b82bdd08758382ac0e218cc12829
2017-10-02 11:30:11 -07:00
Wez Furlong
364398cf94 ObjectStore now returns shared_ptr<const>
Summary:
Originally I thought this would help move towards removing a
`future.get()` call from FileInode, but it turned out to not make a difference
to that code.

It does make it a bit less of a chore to deal with the Journal related diff
callbacks added in D5896494 though, and is a move towards a future where we
could potentially return cached and shared instances of these objects.

This diff is a mechanical change to alter the return type so that we can share
instances returned from the object store interface.  It doesn't change any
functionality.

Reviewed By: simpkins

Differential Revision: D5919268

fbshipit-source-id: efe4b3af74e80cf1df20e81b4386450b72fa2c94
2017-09-29 16:54:05 -07:00
Chad Austin
3fb5680152 Rename ConflictType::MODIFIED to ConflictType::MODIFIED_MODIFIED
Summary: Per wez, this makes the MODIFIED case consistent with the other conflict types (e.g. local_remote).  Side benefit of avoiding some naming conflicts in the Haskell/Rust thrift tooling.

Reviewed By: wez

Differential Revision: D5882327

fbshipit-source-id: 3ec68c44d8c8a5c5675f1ced3842d29376d46fe2
2017-09-21 16:54:37 -07:00
Chad Austin
24c0fe9e8f prefetch inode contents when they're looked up
Reviewed By: wez

Differential Revision: D5817915

fbshipit-source-id: 3820f635cc6251ae5e13a4e214ba66df25ab6687
2017-09-19 11:10:11 -07:00
Christopher Dykes
4d602516ca Shift folly:synchronized out of folly:folly
Summary:
```
foundation/dependency_management/ensure-explicit-dependencies.sh folly:synchronized
```

Reviewed By: yfeldblum

Differential Revision: D5802850

fbshipit-source-id: 7587e6352953879ed193b4931617db7028f730b1
2017-09-13 16:53:47 -07:00
Wez Furlong
8135820893 EdenServiceHandler::mountImpl -> EdenServer::mount
Summary:
Following on from D5798659, this diff pulls the mount flow into
EdenServer.  Previously the flow of control would bounce back and forth between
the EdenServiceHandler and EdenServer and this made it (IMO) more difficult to
follow and understand where to add things into the flow.

Now `EdenServer::mount` is the main entry point for the mounting flow.

I've simplified the stat registration and broken that out into helper methods
to avoid cluttering up the mount logic.

Reviewed By: bolinfest

Differential Revision: D5806393

fbshipit-source-id: 7c858a2a580332ce82c2600e9dc3537af1d734d1
2017-09-13 08:42:22 -07:00
Wez Furlong
fab0c4071c centralize the post-mount functionality
Summary:
This moves the bind mounting and post-clone script running
functionality to methods of the EdenMount class and makes the whole
mount flow return a `Future<>`.  The higher level goal is to make it
easier to see where and how we want to tweak this flow to support
graceful restart.

This is mostly straight forward but care is required to avoid deadlocks; there
are two scenarios:

  # We fulfill the fuse start promise in the context of the fuse thread that is
  handling the fuse initialization packet before it has signalled to the kernel
  that it has come up.  This can be solved by using `via(mainEventBase_)`, but...
  # When remounting all the mounts on startup, we're running in the
  `mainEventBase_` thread so the simplistic solution to 1. would cause us to
  deadlock on startup (visible in the remount integration tests).

So to avoid this, we shunt the completion of the future via a CPU pool.

Also worth noting: the way we were setting up the global CPU pool with
wangle wasn't correct; it takes a weak reference to the pool which was
then getting destroyed when our prepare method returned.  It just happened
to work for us in the facebook specific build because something else was
setting up a different CPU executor.

I've reconciled this by just setting up a thread pool of our own and
using that explicitly.

Reviewed By: bolinfest

Differential Revision: D5798659

fbshipit-source-id: f1c48730f283f6962f6cd706c02d82ea2952e369
2017-09-13 08:42:21 -07:00
Wez Furlong
fe1a376e0f migrate functionScheduler to eventBase timer
Summary:
This is reasonably straightforward, although a little
more fiddly than I'd hoped because the timer wheel stuff doesn't
offer a convenient way to set up a recurring timer.

I've also made the inode unloading code get run globally for all
mounts; it was previously scheduling one timer per mount point.
This nets out the same; the function scheduler was just a single
thread anyway, so there is no change in the level of concurrency.
I believe that this tidies up the unload counter too; it looked
like we'd set the counter to be the result of the last mount
point that we processed rather than the aggregate of all mounts.

Having the unload timer be associated with the server rather
than the mount points means that we don't have to do anything
special to coordinate with the timer management when the mount
point is being torn down.

Reviewed By: bolinfest

Differential Revision: D5792938

fbshipit-source-id: 1a14bb7b7f4952139e684fe6b52f64bd1ba70dd0
2017-09-13 08:42:21 -07:00
Michael Bolin
e837848da5 Introduce a special NoValueForKeyError for hgGetDirstateTuple() and hgCopyMapGet().
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.

Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.

This also updates the log messages to communicate when a there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.

Reviewed By: wez

Differential Revision: D5800833

fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
2017-09-11 10:52:09 -07:00
Wez Furlong
1e679a2400 remove stale TODO comment
Summary: This looks like it ended up getting done together with the original diff

Reviewed By: simpkins

Differential Revision: D5796901

fbshipit-source-id: 24ab05c50b13a37eefe903de5fd3f2dac3d462da
2017-09-08 19:25:34 -07:00
Wez Furlong
d910ab6594 unify state and fuseStatus in EdenMount
Summary:
After performing the dumb merge of EdenMount and MountPoint in
the prior commit, this one tidies up the state tracking and the interface
by which clients of the object can be notified of state changes.

I've introduced two Promises; the first of these can be used to wait
for the fuse mount to come up or error out.  It logically replaces
the cond wait in the `start` method and is exposed to the caller
as a Future, allowing them to wait and react to the outcome.

The second of the promises is associated with the fuse thread pool
winding down.  The attached future can be extracted and used by the
client of the EdenMount class.  This future yields the fuse device
descriptor which we can then choose to pass on during graceful
restart or simply close.  In the current integration, since we ignore
the result of that future, the handle is implicitly closed.

These promises allow us to remove the reference cycle that we had with the
`onStop` function and to potentially make the mount/unmount sequence more
async.

Reviewed By: bolinfest

Differential Revision: D5778214

fbshipit-source-id: 00b293009b7251ddd8bfb10795a115188e97aa3a
2017-09-08 19:25:34 -07:00
Wez Furlong
3da68e5adc dumb merge of MountPoint into EdenMount
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.

Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.

Reviewed By: bolinfest

Differential Revision: D5778212

fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
2017-09-08 19:25:34 -07:00
Wez Furlong
467c417ccb re-organize the fuse Channel and Session code
Summary:
The higher level goal is to make it easier to deal
with the graceful restart scenario.

This diff removes the SessionDeleter class and effectively renames
the Channel class to FuseChannel.  The FuseChannel represents
the channel to the kernel; it can be constructed from a fuse
device descriptor that has been obtained either from the privhelper
at mount time, or from the graceful restart procedure.  Importantly
for graceful restart, it is possible to move the fuse device
descriptor out of the FuseChannel so that it can be transferred
to a new eden process.

The graceful restart procedure requires a bit more control over
the lifecycle of the fuse event loop so this diff also takes over
managing the thread pool for the worker threads.  The threads
are owned by the MountPoint class which continues to be responsible
for starting and stopping the fuse session and notifying EdenServer
when it has finished.  A nice side effect of this change is that
we can remove a couple of inelegant aspects of the integration;
the stack size env var stuff and the redundant extra thread
to wait for the loop to finish.

I opted to expose the dispatcher ops struct via an `extern` to
simplify the code in the MountPoint class and avoid adding special
interfaces for passing the ops around; they're constant anyway
so this doesn't feel especially egregious.

Reviewed By: bolinfest

Differential Revision: D5751521

fbshipit-source-id: 5ba4fff48f3efb31a809adfc7787555711f649c9
2017-09-08 19:25:34 -07:00
Wez Furlong
87ff487820 EdenServiceHandler::mountImpl returns a Future
Summary:
this is the dumb and obvious refactor of this method to
propagate and wait on the Future from the EdenMount::create call.

Reviewed By: simpkins

Differential Revision: D5750372

fbshipit-source-id: fb7ce595de3bacab99ce8af6ef597ef6f0417c12
2017-09-06 13:04:54 -07:00
Christopher Dykes
1276d581a8 Kill folly/experimental:experimental
Summary:
```
foundation/dependency_management/kill-dead-target.sh folly/experimental:experimental
```

Reviewed By: yfeldblum

Differential Revision: D5762711

fbshipit-source-id: fbe05120b1ce47347a6a08ebe44fb1e8b66ee9fd
2017-09-03 14:24:26 -07:00
Adam Simpkins
25c7041589 add an EdenServer::getMainEventBase() method
Summary: Add a method to get the EventBase used to drive the main thread.

Reviewed By: wez

Differential Revision: D5750054

fbshipit-source-id: ad2eba021a6200ed28e39a60b16d90aabfaee5b4
2017-08-31 19:05:23 -07:00
Adam Simpkins
15c3de2dc1 add an EdenServer::run() method
Summary:
This moves logic for running the server from main.cpp into the EdenServer
class.

This will make it easier to refactor some of the start-up and running process
in the future, and makes EdenServer the owner of this entire workflow.  This
will help as we start splitting the startup code into two separate code paths:
one for a new, fresh start, and one for graceful restart taking over mounts
from an existing eden process.

Reviewed By: bolinfest

Differential Revision: D5732656

fbshipit-source-id: 63f05eb1105078764f4e4931d770416dd5f6d6dc
2017-08-30 13:50:08 -07:00
Christopher Dykes
053347e747 Shift folly:network_address out of folly:folly
Summary:
```
foundation/dependency_management/ensure-explicit-dependencies.sh folly:network_address
```

Reviewed By: yfeldblum

Differential Revision: D5715989

fbshipit-source-id: c512b648c7a3968a47bef4c20db58e11938f68b7
2017-08-28 01:30:40 -07:00
Christopher Dykes
f6b84b74c8 Shift folly:thread_local out of folly:base
Summary:
```
foundation/dependency_management/ensure-explicit-dependencies.sh folly:thread_local
```

Reviewed By: yfeldblum

Differential Revision: D5715635

fbshipit-source-id: b1e770e59de7406aa2a2f669ce974b8d87ccc1a5
2017-08-27 16:55:10 -07:00
Jyothsna Konisa
8fb37c1ada Diagnostic tool to report Stat information of EdenFs
Summary:
Added new tool to report stat information of EdenFs like fuse counters, Memory counters, latencies, Inode status for all the mount points etc.

eden stat : Prints the general information about eden like list of mount points, loaded unloaded and materialized inodes in each mount point. Also this reports how well periodic unload job is doing by reporting the number of unloaded inodes by periodic job.

eden stat io : Prints how many number of calls made to a system call in Edenfs.

eden stat memory : returns the memory stat for edenfs.

eden stat latency : reports the latencies of system calls in Edenfs.

Reviewed By: bolinfest

Differential Revision: D5660345

fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
2017-08-25 12:49:35 -07:00
Braden Watling
ab43c66a8d Add test to verify that eden debug getpath indicates when inodes are unloaded
Summary:
This test was supposed to be a part of D5627411 but it was causing strange behaviour so was brought to a separate diff for further investigation.

After investigating, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix for this was to implement a way to get the relative path of the file even after the inode is unloaded.

Reviewed By: simpkins

Differential Revision: D5646929

fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
2017-08-25 08:34:31 -07:00
Michael Bolin
29805661ee Set the default logging to DBG2 for the eden category instead of INFO.
Summary:
Because `DBG2` seems to be the level we are using to log thrift calls in
`EdenServiceHandler`, this seems like a reasonable default.

Reviewed By: simpkins

Differential Revision: D5686115

fbshipit-source-id: 2d3e0173df37919b6936f73e641f880d16dc539f
2017-08-22 21:39:14 -07:00
Michael Bolin
303655c4b1 Add integration test for hg copy.
Summary:
Note that this feature was mostly implemented before this commit, but never
tested. Unsurprisingly, there were bugs.

This change also introduces a new `eden debug hg_copy_map_get_all` subcommand
because that was a straightforward way to verify the internal state of the copy
map on the server side from an integration test.

Adding this test uncovered a key copy/paste bug in `EdenThriftClient.py`
(`hgCopyMapGet` was being invoked instead of `hgCopyMapPut`.)

It also uncovered a bug in `LameThriftClient` because the `compile()` and
`eval()` calls on the output are not appropriate when the return type of the
Thrift endpoint is `string`.

Reviewed By: simpkins

Differential Revision: D5686114

fbshipit-source-id: f0093d2b67062c01982dc5bc1f0db2774b3a9356
2017-08-22 21:06:07 -07:00
Jyothsna Konisa
72b61a5ddc Changes to return unloaded inode count for TreeInode::unloadChildrenNow
Summary:
1.Modified `TreeInode::unloadChildrenNow()` to return number of inodes that have been unloaded.
2.Modified `EdenServiceHandler::unloadInodeForPath()` to return number of inodes that are unloaded.

Reviewed By: simpkins

Differential Revision: D5627539

fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
2017-08-22 19:50:00 -07:00
Jyothsna Konisa
371cfa097d TestCase to verify unloadChildrenNow with age
Summary:
Added a test case `test_unload_free_inodes_age` to verify the behaviour of unloadChildrenNow with age parameter.
Added new parameter age to `unloadInodeForPath` in eden.thrift, and `EdenServiceHandler`.
Modified `do_unload_inodes` function in `debug.py` to support the new behaviour.

Reviewed By: simpkins

Differential Revision: D5565859

fbshipit-source-id: a35053725be26bc906cf158969cbe21db1cbadde
2017-08-22 19:50:00 -07:00
Michael Bolin
5cf1692782 Add new Thrift API: hgClearDirstate(mountPoint)
Summary:
When Hg tells the `dirstate` to `clear()`, we should also clear out any data we
have on the server for the `Dirstate`.

As it stands, the way we subclass `dirstate`, it does not appear like `clear()`
should be called, in practice, though one thing that could call it is
`hg debugrebuilddirstate`. It is probably good for us to have an RPC lying
around that we can use to reset the `Dirstate.`

Reviewed By: wez

Differential Revision: D5675298

fbshipit-source-id: 38926cfd93f4f83e4c28910f812a693cb32e423a
2017-08-22 16:50:24 -07:00