Summary:
Update the C++ edenfs code to ensure that the .eden and
.eden/storage/rocks-db directories exist, rather than requiring the Python CLI
code to create these directories as part of `eden start`.
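A minimal sketch of the idea, written in Python for brevity rather than taken from the edenfs C++ sources. The `ensure_state_dirs` helper and its `eden_dir` parameter are hypothetical; only the `storage/rocks-db` path components come from the description above.

```python
import os


def ensure_state_dirs(eden_dir: str) -> str:
    """Create eden_dir and its storage/rocks-db subdirectory if missing.

    Hypothetical helper: eden_dir stands in for the .eden state directory.
    """
    rocks_dir = os.path.join(eden_dir, "storage", "rocks-db")
    # exist_ok=True makes this safe to repeat on every start-up.
    os.makedirs(rocks_dir, exist_ok=True)
    return rocks_dir
```

Because the call is idempotent, it does not matter whether the CLI or an earlier run already created the directories.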
Reviewed By: strager
Differential Revision: D8508488
fbshipit-source-id: 358521b4f5eed1d19bf37903900ca50718e2c35c
Summary:
Update edenfs to print version information in its startup messages.
More detailed version information will be printed to the log file, but it
seems nice to also include this in user-facing messages that will be printed
to the user's terminal in most situations.
Reviewed By: chadaustin
Differential Revision: D8508486
fbshipit-source-id: 9364290ed470375120acd74cfaef1ccde41fd746
Summary:
This updates edenfs to be able to daemonize itself, and moves the
daemonization logic from the python CLI code into C++.
The main benefit of this is that we can now do a better job of reporting
messages to the user during start-up. We can log around potentially slow
operations (opening the RocksDB local store), and we can print messages
directly to the user if startup fails. Previously most failure messages would
go only to the eden log and would not be printed to the user's terminal.
This also fixes some issues where stdin and stdout were not closed properly
when daemonization was performed by the CLI. sudo needed access to these file
descriptors in case it needed to prompt for a password, and it would then hold
the descriptors open until edenfs exited.
Reviewed By: wez
Differential Revision: D8373672
fbshipit-source-id: 3272bff2208596f41d26e479c82c700d6c1efe11
Summary:
We provide average values for vmRSS and memory (dirty bytes) on Linux systems
by parsing the proc file system. This change allows the current values
(non-average) to be queried and calculated from the thrift API. The cli has
been updated to present these values (and no longer consumes the parsed smaps
files from EdenServer).
Reviewed By: chadaustin
Differential Revision: D8549305
fbshipit-source-id: 77f6838f39784e7ebeda11d8c66dba1fa9f10591
Summary:
This adds a small helper class for printing startup status messages to the
user, and for signaling startup status over a pipe.
This will make it possible in a future diff for edenfs to daemonize, but still
communicate messages back to the parent process to report what it is doing
until it finishes the start-up steps.
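The pipe-based signaling can be sketched roughly as follows. This is an illustration in Python, not the helper class added by this diff: `start_and_wait` and the one-line "ok" / "error: ..." protocol are assumptions made for the example. A real daemon would keep running after closing the pipe; here the child exits so the sketch is easy to run.

```python
import os


def start_and_wait(startup):
    """Fork a child, run startup() in it, and report the outcome to the parent.

    Returns (succeeded, status_text) as seen by the parent process.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: run the start-up steps, then report the result over the pipe.
        os.close(read_fd)
        try:
            startup()
            message = b"ok\n"
        except Exception as ex:
            message = f"error: {ex}\n".encode()
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)
    # Parent: block until the child closes its end of the pipe.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        status = pipe.read().decode().strip()
    os.waitpid(pid, 0)
    return status == "ok", status
```

The parent can print `status` directly to the user's terminal, which is the point of keeping the pipe open until start-up has finished.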
Reviewed By: strager
Differential Revision: D8372250
fbshipit-source-id: 53a897944beeb1582a090a2b69afbc2b41408d52
Summary:
We calculate vmRSS bytes on Linux systems by parsing the /proc/self/status
file. The value in that file is in kB so we need to convert it to bytes.
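For illustration, the conversion looks roughly like this in Python. This is a sketch, not the edenfs parser; `parse_vm_rss_bytes` is a hypothetical name, and it takes the file contents as a string so it can be exercised without a /proc file system.

```python
def parse_vm_rss_bytes(status_text: str):
    """Return VmRSS in bytes from /proc/self/status contents, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            fields = line.split()
            # Expected shape: ["VmRSS:", "<number>", "kB"]
            if len(fields) == 3 and fields[2] == "kB":
                # The kernel reports this value in units of 1024 bytes.
                return int(fields[1]) * 1024
    return None
```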
Reviewed By: chadaustin
Differential Revision: D8549009
fbshipit-source-id: 88b18543cb561372dc5eee84293e79fddca8efcb
Summary:
Expand the data we collect in the fb303 collector. Currently we extract data from
procprint, but it is only reliable for a day (or so).
Here we add:
- memory vm rss bytes (from /proc/self/status)
- memory private bytes (from /proc/self/smaps)
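The smaps entry could be computed along these lines. This is a sketch, not the collector's code: `parse_private_dirty_bytes` is a hypothetical name, and summing only the `Private_Dirty` fields is an assumption, since the list above does not say which smaps fields are used.

```python
def parse_private_dirty_bytes(smaps_text: str) -> int:
    """Sum the Private_Dirty field over all mappings, in bytes."""
    total_kb = 0
    for line in smaps_text.splitlines():
        if line.startswith("Private_Dirty:"):
            fields = line.split()
            # Expected shape: ["Private_Dirty:", "<number>", "kB"]
            if len(fields) == 3 and fields[2] == "kB":
                total_kb += int(fields[1])
    return total_kb * 1024
```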
Reviewed By: chadaustin
Differential Revision: D8380917
fbshipit-source-id: dca6fac7af44321c7a6615edb0fde0cb7c8827d0
Summary:
We've seen what appears to be phantom calls to shutdown() so we'd like
to add some degree of auditing. This diff adds a new method with some
context; this will allow us to distinguish between `eden stop`, `eden restart`
and eden server internal calls to the `shutdown` method. It may still
be possible that something else is calling our shutdown method, but it
seems unlikely as we're only accessible to our own code via a unix domain
socket.
Reviewed By: chadaustin
Differential Revision: D8341595
fbshipit-source-id: 50d58ea0b56e5f42cd37c404048d710bde0d13a3
Summary:
Update edenfs to fork the privhelper process as the first thing we do, before
calling folly::init(). This allows us to drop privileges before processing
command line arguments and doing any other startup work.
Reviewed By: wez
Differential Revision: D8212778
fbshipit-source-id: d67e3700305fdb01cb6188645b37875ceb53d21f
Summary:
Add a PrivHelper::detachEventBase() method, and rename PrivHelper::start() to
attachEventBase().
This makes it possible to detach a running PrivHelper from its EventBase and
re-attach it to an EventBase later to restart it. This will be useful in
upcoming diffs to allow performing calls to the PrivHelper before the main
thrift server EventBase has started.
Reviewed By: wez
Differential Revision: D8212777
fbshipit-source-id: d5a9bf672afa8b16e53201ac747d77337e1cc307
Summary:
Profiling revealed that we spend a lot of time spookyhashing things
during a big `eden prefetch '**' --silent --no-prefetch` operation, so this
does the obvious and dumb thing to avoid it.
Reviewed By: simpkins
Differential Revision: D8373604
fbshipit-source-id: 16772c0680949792045560f168294239f4cd513b
Summary:
This updates the privhelper code to use the UnixSocket class for performing
I/O. This reduces the number of separate implementations of code we have for
sending file descriptors across Unix domain sockets, and also makes the
privhelper APIs non-blocking.
This will make it easier to clean up some of the initialization ordering in
the future. It will also make it easier to send file descriptors to the
privhelper server, instead of just receiving them. This may be helpful for
passing a file descriptor to use for logging to the privhelper process, which
will make it easier to fork the privhelper before logging redirection has
occurred.
Reviewed By: bolinfest
Differential Revision: D8053422
fbshipit-source-id: 1f8fdf22afc797eead0213be1352ea530762140d
Summary:
Up until now all of the privhelper APIs have been blocking calls. This
changes the privhelper functions to return Futures, and updates all users of
these APIs to be able to handle the results using Futures.
One benefit of this change is that all existing mount points are remounted in
parallel now during startup, rather than being mounted serially. The old code
performed a blocking `get()` call on the future returned by
`EdenServer::mount()`.
The privhelper calls themselves are still blocking for now--they block until
complete and always return completed Future objects. I will update the
privhelper code in a subsequent diff to actually make it asynchronous.
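The difference between the old and new start-up behavior can be illustrated with a small Python sketch (the real code uses folly Futures in C++). `remount_all` and `mount_one` are hypothetical names; `mount_one` stands in for a blocking mount operation.

```python
from concurrent.futures import ThreadPoolExecutor


def remount_all(mount_paths, mount_one):
    """Start every mount before waiting on any of them.

    mount_one is a hypothetical blocking callable; it is not part of
    Eden's API.
    """
    if not mount_paths:
        return {}
    with ThreadPoolExecutor(max_workers=len(mount_paths)) as pool:
        # Submit everything first so the mounts can overlap...
        futures = {path: pool.submit(mount_one, path) for path in mount_paths}
        # ...and only then collect the results.
        return {path: future.result() for path, future in futures.items()}
```

Calling `result()` inside the submit loop instead would reproduce the old serial behavior, where each mount had to finish before the next one started.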
Reviewed By: bolinfest
Differential Revision: D8053421
fbshipit-source-id: 342d38697f67518f6ca96a37c12dd9812ddb151d
Summary:
This file contained one helper function that didn't really save us much code.
This simply removes it for now.
Reviewed By: wez
Differential Revision: D8329592
fbshipit-source-id: 5172ac0746fc051288c76522c6c3d5ac4097f588
Summary:
When testing D8108649 I accidentally deleted all of my trees
but didn't delete my commit2tree mapping. This diff allows Eden to
recover from that situation.
Reviewed By: wez
Differential Revision: D8108728
fbshipit-source-id: 94a9393294ca259303026c297683dac4b3ecfac4
Summary: This parameter was only supported for fbbuild.
Reviewed By: yfeldblum
Differential Revision: D8246482
fbshipit-source-id: 95db878a34dce5694639364f2838bb4cccd723d3
Summary:
1. Enabled a number of additional C++ compiler warnings in Eden.
2. Fixed warnings-turned-errors that resulted from this change.
Reviewed By: simpkins
Differential Revision: D8132543
fbshipit-source-id: 2290ffaaab55024d582e29201a1bcaa1152e6b3e
Summary:
Like D7867399, split TreeInode's synchronized state into a top-level
class. This is a step towards using the type system to perform
lock-safe metadata updates.
Reviewed By: simpkins
Differential Revision: D7882648
fbshipit-source-id: 27262df8ed9137c8478c68ebf4c4f13878655754
Summary:
Have getBlobMetadata always return a Future. It's a little unfortunate
that this will always allocate, but it sounds like we might decide to
put all RocksDB access on a background thread to increase CPU
parallelism.
Reviewed By: bolinfest
Differential Revision: D8101464
fbshipit-source-id: 6e9ec95050c366c7c57519e3f68b311470b2addd
Summary: We are changing `folly::collectAll` to return `SemiFuture` rather than `Future` and this is needed as an interim step. After all calls to `collectAll` are changed to `collectAllSemiFuture`, we'll be renaming it back to `collectAll`.
Reviewed By: yfeldblum
Differential Revision: D8210974
fbshipit-source-id: e4a7464f4a1c3ede157b8377a4df97d943001f60
Summary:
Remove getTreeFuture and have getTree always return a Future. It's a
little unfortunate that this will always allocate, but it sounds like
we might decide to put all RocksDB access on a background thread to
increase CPU parallelism.
Reviewed By: bolinfest
Differential Revision: D8101430
fbshipit-source-id: e12b7ab07b3468114a58753768655c107265b8af
Summary:
Remove getBlobFuture and have getBlob always return a Future. It's a
little unfortunate that this will always allocate, but it sounds like
we might decide to put all RocksDB access on a background thread to
increase CPU parallelism.
Reviewed By: bolinfest
Differential Revision: D8101402
fbshipit-source-id: d6cbbd7fe4fe55bad661c9158297db2f03f7d352
Summary:
I kept running into issues trying to get graceful restart and
flush_cache to work together in the hg integration suite, so add a
test to ensure flush_cache succeeds after a graceful restart in the
main integration suite.
Also, to make the test's output easier to follow, add logging when
invalidating inodes.
Reviewed By: simpkins
Differential Revision: D8215961
fbshipit-source-id: 33db4292af3969ae23940c3027ba513ed20c53fb
Summary:
This adds a debug command to blow away all RocksDB information that
can be reproduced from Mercurial. We will use it to help an Eden user
recover from a corrupted blob.
Reviewed By: bolinfest
Differential Revision: D8108649
fbshipit-source-id: 056dec19d51b9e430b3c2a249747b26830cfc875
Summary: Mostly empty lines removed and added. A few bugfixes on excessive line splitting.
Reviewed By: cooperlees
Differential Revision: D8198776
fbshipit-source-id: 4361faf4a2b9347d57fb6e1342c494575f2beb67
Summary: We are changing `folly::collectAll` to return `SemiFuture` rather than `Future` and this is needed as an interim step. After all calls to `collectAll` are changed to `collectAllSemiFuture`, we'll be renaming it back to `collectAll`.
Reviewed By: yfeldblum
Differential Revision: D8157548
fbshipit-source-id: 27b768ac7ff0d6572bde57f01601045a1fd5d5e5
Summary:
This removes the main point of contention for eden prefetch
in two ways:
1. We batch up the complete list of blobs so that they can be processed
in bulk rather than stalling the tree walk
2. We can ask remotefilelog to check and fetch that list to the local
hgcache, again as a batch, rather than by forcing the data to be
loaded through into the local store
The goal of this prefetch is to bulk load data from the mercurial server
so that a subsequent file access doesn't have to make a one-off ssh session
for each one, rather than making sure that all the data is loaded into
the local store.
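As a rough illustration of the batching idea (not the actual implementation), the Python sketch below walks a tree, collects every blob id, and then makes one batched request. The nested-dict tree shape and the `fetch_batch` callable are assumptions made for the example.

```python
def collect_blob_ids(tree: dict) -> list:
    """Walk a nested dict and return every blob id it references.

    Directories are dicts; files map a name to a blob id string.
    """
    blob_ids = []
    stack = [tree]
    while stack:
        node = stack.pop()
        for name in sorted(node):
            entry = node[name]
            if isinstance(entry, dict):
                stack.append(entry)
            else:
                blob_ids.append(entry)
    return blob_ids


def prefetch(tree: dict, fetch_batch) -> int:
    """Collect all blob ids first, then issue a single batched fetch."""
    blob_ids = collect_blob_ids(tree)
    if blob_ids:
        fetch_batch(blob_ids)
    return len(blob_ids)
```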
Reviewed By: chadaustin
Differential Revision: D7965818
fbshipit-source-id: 753400460d633b5467c5110e3f5608ce06106e00
Summary:
This is a first pass at a prefetcher. The idea is simple,
but the execution is impeded by some unfortunate slowness in different
parts of mercurial.
The idea is that you pass a list of glob patterns and we'll do something
to make accessing files that match those patterns ideally faster than
if you didn't give us the prefetch hint.
In theory we could run `hg prefetch -I PATTERN` for this, but prefetch
takes several minutes materializing and walking the whole manifest to
find matches, checking outgoing revs and various other overheads.
There is a revision flag that can be specified to try to reduce this
effort, but it still takes more than a minute.
This diff:
* Removes a `Future::get()` call in the GlobNode code
* Makes `globFiles` use Futures directly rather than `Future::get()`
* Adds a `prefetchFiles` parameter to `globFiles`
* Adds `eden prefetch` to the CLI and makes it call `globFiles` with
`prefetchFiles=true`
* Adds the ability to glob over `Tree` as well as the existing `TreeInode`.
This means that we can avoid allocating inodes for portions of the
tree that have not yet been loaded.
When `prefetchFiles` is set we'll ask ObjectStore to load the blob for
matching files. I'm not currently doing this in the `TreeInode` case
on the assumption that we already did this earlier when its `TreeInode::prefetch`
method was called.
The glob executor joins the blob prefetches at each GlobNode level. It may
be possible to observe higher throughput if we join the complete set at the
end.
Reviewed By: chadaustin
Differential Revision: D7825423
fbshipit-source-id: d2ae03d0f62f00090537198095661475056e968d
Summary:
This is used to dump the raw `JournalDelta` entries in the journal.
Hopefully this will help us figure out what's happening in T28686395, or more
generally, why we get those merge errors that appear in the logs of the
form:
```
Journal for .hg/rebasestate holds invalid Created, Created sequence
```
from `eden/fs/journal/JournalDelta.cpp`.
(Note: this ignores all push blocking failures!)
Reviewed By: wez
Differential Revision: D7855071
fbshipit-source-id: f195813695bec7426329a9aacd84a9b1613feec9
Summary:
The calls to `getFilesChangedSince()` seem to fill up our logs without providing
a ton of value, so let's drop the default log level to `DBG3` so they do not
make it into the logs by default.
Reviewed By: chadaustin
Differential Revision: D8107917
fbshipit-source-id: 069673e9740487d8e4b62597beb13d5779971a79
Summary:
Update the folly::Init code to define a `--logging` command line flag, and call
`folly::initLoggingOrDie()` with the value of this flag during
initialization.
This is similar to the existing code that initializes the glog library.
(Programs can use both glog and folly logging together in the same program, and
I expect that many programs will do so as parts get converted to folly::logging
and parts remain using glog.)
Reviewed By: yfeldblum
Differential Revision: D7827344
fbshipit-source-id: 8aa239fbad43bc0b551cbe40cad7b92fa97fcdde
Summary:
I got tired of typing PathComponentPiece{"..."} in tests so here are
some operator literals.
Reviewed By: simpkins
Differential Revision: D7956732
fbshipit-source-id: 85d9f3fd725853a54da9e70fc659bd7eb9e0862c
Summary:
We need to introduce a new `includeDotfiles` option to `glob()`. As with all of
our Thrift APIs to date, we defined `glob()` with individually specified
parameters rather than a single struct, so we can no longer add new params to
`glob()`.
In particular, we need to support `includeDotfiles` because we often configure
Buck to use Watchman to implement `glob()` in `BUCK` files, and when Watchman is
used in Eden, it leverages Eden's Thrift API to implement `glob()`. Because
Buck's `glob()` has an `include_dotfiles` option, we must be able to honor it
and pass it all the way through to Eden's `glob()` implementation.
Rather than name the new API `glob2()`, I'm electing to go with `globFiles()`.
(Perhaps once we eliminate all known users of `glob()` in the wild, which
requires turning over the current version of Watchman we have deployed, we can
redefine `glob()` in `eden.thrift` to be the same as `globFiles()` and then
update everyone to use `glob()` again so it has the more intuitive name.)
Reviewed By: wez
Differential Revision: D7748870
fbshipit-source-id: 92438f9c41e4fbdbd6cdccca5fce0e41cc3e9b07
Summary:
To provide the ability to ignore dotfiles, we update `GlobNode()` to take a
`bool includeDotfiles` that is used to determine the options used to create
the `GlobMatcher` associated with the node. Whereas the patterns
`**` and `*` were assumed to always match, that is now true only when
`includeDotfiles` is true, as well.
Finally, this adds a special case when `**` is used as the `pattern` for a `GlobNode`
and `includeDotfiles` is `false`. Because `GlobMatcher` does not accept `**` as
input, we specify the pattern as `**/*` to `GlobMatcher`, which it accepts and is
functionally equivalent in our case.
This is safe because we only do this trick when `**` is specified as a leaf `GlobNode`.
Reviewed By: wez
Differential Revision: D7741508
fbshipit-source-id: 9e6a50cb4dab09be2497393c641f176c84316a07
Summary:
Although we have not seen any ASAN failures or anything as a result of the
existing implementation, it certainly seems that it should be possible to
use a `StringPiece pattern` that is passed to `GlobNode.parse()` and not require
the `pattern` to outlive the `GlobNode`.
Reviewed By: wez
Differential Revision: D7795497
fbshipit-source-id: cefdd3fc69f8095c3aef3306b45f60ab4f82c737
Summary:
When `GlobMatcher` is used to implement `glob()` for Eden, `**` should not
include dotfiles by default (at least when it is used to implement `glob()` in Buck),
so we need to make this configurable. To this end, this adds a `GlobOptions`
parameter to `GlobMatcher::create()`. The key option this revision introduces is
`GlobOptions::IGNORE_DOTFILES`.
We implement this new functionality by associating a `matchCanStartWithDot`
boolean with the following opcodes in `GlobMatcher`:
* `GLOB_STAR`
* `GLOB_STAR_STAR_END`
* `GLOB_STAR_STAR_SLASH`
* `GLOB_ENDS_WITH`
The value of `matchCanStartWithDot` is largely determined by
`GlobOptions::IGNORE_DOTFILES`, though some extra checking is done
when assigning this for `GLOB_STAR`.
Originally, `GLOB_ENDS_WITH` required some funny business in how it
manipulated the `result` vector. This revision introduces some new funny
business to preserve the desired optimization.
Most of the work in this revision is new logic to ensure `matchCanStartWithDot`
is honored appropriately for each opcode.
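The dotfile rule for a single path component can be illustrated with a short Python sketch. This is not how `GlobMatcher` is implemented (it compiles patterns to opcodes); the sketch uses Python's `fnmatch` module instead, and `component_matches` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def component_matches(pattern: str, name: str, include_dotfiles: bool) -> bool:
    """Match one path component against a glob pattern.

    When include_dotfiles is false, a name starting with "." only matches
    if the pattern itself starts with a literal ".".
    """
    if not include_dotfiles and name.startswith(".") and not pattern.startswith("."):
        return False
    return fnmatchcase(name, pattern)
```

With this rule `*` stops matching `.git` when dotfiles are ignored, while an explicit `.*` pattern still does.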
Reviewed By: simpkins
Differential Revision: D7787621
fbshipit-source-id: f2c42e0f0948db74d48dc163d40aa3b13bbb4c3d
Summary:
Promote the folly logging code out of the experimental subdirectory.
We have been using this for several months in a few projects and are pretty
happy with it so far.
After moving it out of the experimental/ subdirectory I plan to update
folly::Init() to automatically support configuring it via a `--logging` command
line flag (similar to the initialization it already does today for glog).
Reviewed By: yfeldblum, chadaustin
Differential Revision: D7755455
fbshipit-source-id: 052db34c97f7516728f7cbb1a5ad959def2f6efb
Summary:
Update the Eden mercurial extension to read the `.eden/root` symlink to
determine what Eden thinks the mount path is. This might be different from
what directory mercurial thinks it is in if a parent directory of the Eden
mount has been bind-mounted to an alternate location.
Maybe in the future we should update thrift clients to pass in the client ID
(currently readable via `.eden/client`) rather than the mount path. That would
make it less likely for clients to accidentally forget to read `.eden/root` and
pass in the wrong mount path.
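A client-side lookup might look like the following Python sketch. `eden_mount_path` is a hypothetical name, and falling back to the given directory when the symlink cannot be read is an assumption of this example rather than documented behavior.

```python
import os


def eden_mount_path(repo_dir: str) -> str:
    """Return the mount path recorded in repo_dir/.eden/root.

    Falls back to repo_dir if the symlink cannot be read (an assumption
    of this sketch).
    """
    try:
        return os.readlink(os.path.join(repo_dir, ".eden", "root"))
    except OSError:
        return repo_dir
```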
Reviewed By: wez
Differential Revision: D7705655
fbshipit-source-id: 7bd1e8013b99a52ff06dd45f63d6669b66bdf577
Summary:
This updates the code to store ServerState using a shared_ptr rather than
having it be an inlined member variable of EdenServer.
Previously EdenMount objects contained a raw pointer to the ServerState, with
the reasoning that the EdenServer object should outlive the EdenMount objects.
It turns out that this is not quite true in practice--EdenMount::destroy() will
normally be called before the EdenServer is destroyed, but this may not
actually destroy the EdenMount object immediately.
This fixes a race condition in the FuseTest.initMount() test that could cause
this test to occasionally fail when run on a heavily loaded system.
Reviewed By: chadaustin
Differential Revision: D7720509
fbshipit-source-id: 056ff5985587c8d8c32c11d17ba637ebd7598677
Summary:
D7512338 has been deployed for well over a week, so let's update edenfs to
reject the request if any clients still try to call getScmStatus() without
specifying the hash.
Reviewed By: chadaustin
Differential Revision: D7658022
fbshipit-source-id: ebd49034e076720f156f1ff2c551417ef058167a
Summary:
Remove the EDEN_HAS_COMMON_STATS checks now that the common/stats stubs have
the required APIs needed by Eden.
Reviewed By: wez
Differential Revision: D7479593
fbshipit-source-id: cc3db50288bfea7aefd6c91391ab800628b7978f
Summary:
Add a macro to help users define the `getBaseLoggingConfig()` function.
While I would prefer to avoid macros if possible, this seems worthwhile. This
saves 4 or 5 lines of boilerplate code in each program that sets a custom base
logger setting. It also reduces the likelihood of a developer accidentally
having a typo in the function name, which would still build successfully but
not have the desired results.
Reviewed By: chadaustin
Differential Revision: D7457652
fbshipit-source-id: 1c316c7ea6949c16bd7b61c0440cc1ee69ecb83e
Summary:
Make several performance improvements to INSTRUMENT_THRIFT_CALL:
- Avoid heap-allocating the ThriftLogHelper objects.
- Avoid evaluating the log arguments if the log level is not enabled.
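The second point is a general logging pattern, sketched here in Python rather than with the C++ macro. `log_call` and `describe_args` are hypothetical names; the only library calls used are `Logger.isEnabledFor()` and `Logger.debug()` from the standard `logging` module.

```python
import logging

logger = logging.getLogger("thrift_calls")


def log_call(name: str, describe_args) -> None:
    """Log a call at DEBUG level, building the argument text only if needed.

    describe_args is a zero-argument callable, so potentially expensive
    formatting is skipped entirely when DEBUG logging is disabled.
    """
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("%s(%s)", name, describe_args())
```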
Reviewed By: chadaustin
Differential Revision: D7556846
fbshipit-source-id: e111e24e44499c5cf9725ded2b958a7dcb2c3e26
Summary:
Add a helper function to make sure we log commit hash arguments as hexadecimal
rather than binary data. This is similar to `hashFromThrift()`, except that it
always returns a string result and does not throw on invalid input.
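A Python sketch of such a helper is shown below. It is not the C++ function added here: the name `log_hash`, the 20-byte raw hash size, and the rule of passing through input that is already 40 hex characters are all assumptions made for illustration.

```python
import string

RAW_HASH_SIZE = 20  # assumed size in bytes of a binary commit hash


def log_hash(thrift_arg: bytes) -> str:
    """Render a hash argument for logging without ever raising."""
    if len(thrift_arg) == RAW_HASH_SIZE:
        # Binary hash: show it as 40 hex characters.
        return thrift_arg.hex()
    if len(thrift_arg) == RAW_HASH_SIZE * 2 and all(
        chr(b) in string.hexdigits for b in thrift_arg
    ):
        # Already hexadecimal text: log it unchanged.
        return thrift_arg.decode("ascii")
    # Anything else is invalid input; hex-encode it so the log stays readable.
    return thrift_arg.hex()
```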
Reviewed By: chadaustin
Differential Revision: D7556916
fbshipit-source-id: 87422de3f178700d378f5ddc45172efd38a13799
Summary:
Change getScmStatus() so that callers must explicitly specify the commit to
diff against. This should help avoid race conditions around commit or checkout
operations where the parent commit has just changed and eden returns status
information against a commit that wasn't what the client was expecting.
This should still maintain backwards compatibility with older clients that do
not send this parameter yet: we will simply receive the hash as an empty string
in this case, and we still provide the old behavior in this case.
Reviewed By: wez
Differential Revision: D7512338
fbshipit-source-id: 1fb4645dda13b9108c66c2daaa802ea3445ac5f2