Summary:
This is a first pass at a prefetcher. The idea is simple,
but the execution is impeded by some unfortunate slowness in different
parts of mercurial.
The idea is that you pass a list of glob patterns and we'll do something
to make accessing files that match those patterns ideally faster than
if you didn't give us the prefetch hint.
In theory we could run `hg prefetch -I PATTERN` for this, but prefetch
takes several minutes materializing and walking the whole manifest to
find matches, checking outgoing revs and various other overheads.
There is a revision flag that can be specified to try to reduce this
effort, but it still takes more than a minute.
This diff:
* Removes a `Future::get()` call in the GlobNode code
* Makes `globFiles` use Futures directly rather than `Future::get()`
* Adds a `prefetchFiles` parameter to `globFiles`
* Adds `eden prefetch` to the CLI and makes it call `globFiles` with
`prefetchFiles=true`
* Adds the ability to glob over `Tree` as well as the existing `TreeInode`.
This means that we can avoid allocating inodes for portions of the
tree that have not yet been loaded.
When `prefetchFiles` is set we'll ask ObjectStore to load the blob for
matching files. I'm not currently doing this in the `TreeInode` case
on the assumption that we already did this earlier when its `TreeInode::prefetch`
method was called.
The glob executor joins the blob prefetches at each GlobNode level. It may
be possible to observe higher throughput if we join the complete set at the
end.
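For illustration only, here is a rough sketch of the "join the complete set at
the end" alternative mentioned above. It is not the GlobNode code: the nested
dict standing in for a `Tree`, the `glob_and_prefetch` name and the `fetch_blob`
callback are all hypothetical, and `fnmatch` lets `*` cross `/`, which the real
matcher may not do.

```python
import fnmatch
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a Tree: file name -> blob id, dir name -> subtree.
Tree = Dict[str, Any]


def glob_and_prefetch(
    tree: Tree,
    patterns: List[str],
    fetch_blob: Callable[[str], Any],
    pool: ThreadPoolExecutor,
) -> List[str]:
    """Walk `tree`, start a blob fetch per matching file, join once at the end."""
    matches: List[str] = []
    pending = []

    def walk(node: Tree, prefix: str) -> None:
        for name, child in sorted(node.items()):
            path = prefix + name
            if isinstance(child, dict):
                walk(child, path + "/")
            elif any(fnmatch.fnmatchcase(path, p) for p in patterns):
                matches.append(path)
                # Start the fetch but do not wait for it at this level.
                pending.append(pool.submit(fetch_blob, child))

    walk(tree, "")
    # Join the complete set of prefetches after the whole walk has finished.
    for future in pending:
        future.result()
    return matches
```

Whether this is actually faster than joining per level would need measuring.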
Reviewed By: chadaustin
Differential Revision: D7825423
fbshipit-source-id: d2ae03d0f62f00090537198095661475056e968d
Summary:
When calling mercurial from inside the Eden CLI, use `$EDEN_HG_BINARY` as the
path to hg if this variable is set. If it is not set, continue the current
behavior of using `hg` from the user's `$PATH`. This is primarily used during
the `eden clone` command.
This makes sure the Eden integration tests use `hg` built from the local
repository rather than the version of `hg` installed on the system. Most
locations in the integration tests were already doing so, but `eden clone` was
one place that still ended up using the system hg binary.
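A minimal sketch of the lookup described above. The helper name `get_hg_binary`
and the injectable `env` argument are assumptions made for illustration, not
the actual CLI code.

```python
import os
from typing import Mapping, Optional


def get_hg_binary(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the hg executable to invoke (hypothetical helper name)."""
    if env is None:
        env = os.environ
    # Use $EDEN_HG_BINARY when it is set; otherwise keep the old behavior
    # of running whichever `hg` is found on the user's $PATH.
    return env.get("EDEN_HG_BINARY", "hg")
```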
Reviewed By: wez
Differential Revision: D7839850
fbshipit-source-id: da801fad0767a111b3e3dfd393d82da8e2147e22
Summary:
Relax the restriction on changing uid/gid on inodes. We'll see what
cans of worms this opens I guess. (Landing this is low priority, but
might be important for making some of the existing tooling in fbsource
and www work.)
Reviewed By: simpkins
Differential Revision: D7768655
fbshipit-source-id: 95fe02fe7ddc001335dbdb34e16a989a85820240
Summary:
Add methods to UnixSocket and FutureUnixSocket to attach and detach from an
EventBase. This makes it possible to construct a UnixSocket object without
having an EventBase yet and then attach it to an EventBase later.
Reviewed By: bolinfest
Differential Revision: D8053423
fbshipit-source-id: c4de00166dbc0e075b4e4cd81c3dd5b377ea9a52
Summary:
Fix the code in FuseChannel::processSession() to look at the errno value
instead of the return value from `read()` when checking for ENODEV.
I accidentally broke this in D7436867. This wasn't causing any major issues,
since we would still break out of the loop, but it caused us to incorrectly
print warning messages about receiving an unexpected error. We would also
print this error message once per FUSE thread rather than just once for the
mount point.
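The same distinction, sketched in Python rather than the C++ used in
FuseChannel. The `read_request` name, the injectable `read` argument and the
choice to return `None` on ENODEV are assumptions for illustration only.

```python
import errno
import os
from typing import Callable, Optional


def read_request(
    fd: int,
    size: int,
    read: Callable[[int, int], bytes] = os.read,
) -> Optional[bytes]:
    """Read one request, or return None once the device has gone away."""
    try:
        return read(fd, size)
    except OSError as ex:
        # The error code is carried in errno (ex.errno here). In C, the
        # return value of read() is only -1 on failure, so comparing that
        # return value against ENODEV can never match.
        if ex.errno == errno.ENODEV:
            return None  # expected during unmount; no warning needed
        raise
```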
Reviewed By: bolinfest
Differential Revision: D8109889
fbshipit-source-id: 9a53ca47b436ccf6731144ee2b829131339b6445
Summary: Noticed this while checking something for a new hire yesterday.
Reviewed By: bolinfest
Differential Revision: D8119130
fbshipit-source-id: 8a27797061b2316c62f72e34c9f20130d97dc2b1
Summary:
This is used to dump the raw `JournalDelta` entries in the journal.
Hopefully this will help us figure out what's happening in T28686395, or more
generally, why we see those merge errors that appear in the logs of the
form:
```
Journal for .hg/rebasestate holds invalid Created, Created sequence
```
from `eden/fs/journal/JournalDelta.cpp`.
(Note: this ignores all push blocking failures!)
Reviewed By: wez
Differential Revision: D7855071
fbshipit-source-id: f195813695bec7426329a9aacd84a9b1613feec9
Summary:
The calls to `getFilesChangedSince()` seem to fill up our logs without providing
a ton of value, so let's drop the default log level to `DBG3` so they do not
make it into the logs by default.
Reviewed By: chadaustin
Differential Revision: D8107917
fbshipit-source-id: 069673e9740487d8e4b62597beb13d5779971a79
Summary:
Add a `remove` command to the Eden CLI. This behaves like
`eden unmount --destroy`, but calling this "remove" is hopefully a more
intuitive UI. If stdin is a TTY this command also prompts the user for
confirmation before removing the checkout.
I plan to deprecate the `eden unmount --destroy` command in a subsequent
diff.
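A rough sketch of the confirmation behavior. The function name, the prompt
text and the default-to-no answer handling are assumptions; only the "prompt
when stdin is a TTY" rule comes from the description above.

```python
import sys
from typing import Callable, Optional


def confirm_remove(
    path: str,
    is_tty: Optional[bool] = None,
    ask: Callable[[str], str] = input,
) -> bool:
    """Return True if the checkout at `path` should be removed."""
    if is_tty is None:
        is_tty = sys.stdin.isatty()
    if not is_tty:
        # Non-interactive callers (scripts, tests) are not prompted.
        return True
    # Hypothetical prompt wording; anything other than y/yes means "no".
    answer = ask(f"Remove checkout {path}? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```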
Reviewed By: wez
Differential Revision: D8086823
fbshipit-source-id: 562cf0f998eea416b80589b188eee255a10b9699
Summary: Now that permissions on directories work, verify umask works as intended.
Reviewed By: simpkins
Differential Revision: D7783743
fbshipit-source-id: 635221cd3255cc20e9ffa26b6838922c4a4110f3
Summary:
Unify how inode metadata is modified across inode types. This allows
changing the permission bits on a directory.
Reviewed By: simpkins
Differential Revision: D7767254
fbshipit-source-id: 35e9cf652c84c7d8680cc22dec7942e94e9f5af1
Summary:
This moves most inode metadata management into InodeBase and
persists permission bits (and eventually uid/gid) across Eden runs.
Reviewed By: simpkins
Differential Revision: D7035163
fbshipit-source-id: 50145449b56aad1662d53156e6e4960c5f7b6166
Summary: Store tree and file timestamps in the InodeTable so they persist across runs.
Reviewed By: simpkins
Differential Revision: D6891479
fbshipit-source-id: 1c9e6266375aceeaf293a81e73cf7f5334dbc32d
Summary:
This is stacked on top of Black 18.5b0.
allow-large-files
Reviewed By: carljm
Differential Revision: D8061834
fbshipit-source-id: 92e3645e159b60d77cf7e0bec64a8262ca4e88c2
Summary:
This is not at all clear from cppreference.com, but per
https://www.youtube.com/watch?v=dTeKf5Oek2c, it sounds to me like
recommended practice is to either:
`using namespace std::chrono_literals` (or string_literals or
whatever) to pull in a focused set of literals.
or
`using namespace std::literals` to pull in all standard literals.
or
`using namespace std` to pull in everything.
`using namespace std::literals::chrono_literals` is unnecessarily
verbose.
Adopt those standards in Eden.
Reviewed By: simpkins
Differential Revision: D8060944
fbshipit-source-id: 4d9dd4329698b7ff5e5c81b5b28780ca4d81a2a1
Summary:
I'd misunderstood the point of SharedMutex's upgrade locks -
unless they're used in rare paths, they don't allow for increased
concurrency. This diff and D7885245 remove all of Eden's ulocks,
replacing them with a helper which checks once with an rlock held, and
if the check fails, switches to a wlock (and checks again).
Reviewed By: yfeldblum
Differential Revision: D7886046
fbshipit-source-id: 545bb0dbb4898cbb71412efc6222ef12e4ee374e
Summary:
This allows multiple inodes to update their metadata simultaneously
without contending on InodeTable's locks.
Reviewed By: simpkins
Differential Revision: D7885245
fbshipit-source-id: cc8ab6cd90b7424beec314a115852e08b64026ae
Summary:
igorsugak reported an LSAN failure in Eden. Relatively straightforward to
track down - pending requests could sustain an Inode which ultimately
sustains EdenMount which owns the BackingStore, resulting in a
reference cycle for the duration of a fetch request. During EdenMount
shutdown in general we may want to consider aborting or discarding any
pending fetch requests.
Reviewed By: bolinfest, igorsugak
Differential Revision: D8047650
fbshipit-source-id: 32dafaff60570cf54a74ca4f57da61b6657d8ccb
Summary:
Some tools/scripts may be blindly assuming the presence of this file,
so rather than fight them, just create it.
Reviewed By: simpkins
Differential Revision: D8020274
fbshipit-source-id: 712e3bf31a0aefe27cc20f5361a0edb59c7deb9f
Summary:
Update the folly::Init code to define a `--logging` command line flag, and call
`folly::initLoggingOrDie()` with the value of this command line during
initialization.
This is similar to the existing code that initializes the glog library.
(Programs can use both glog and folly logging together in the same program, and
I expect that many programs will do so as parts get converted to folly::logging
and parts remain using glog.)
Reviewed By: yfeldblum
Differential Revision: D7827344
fbshipit-source-id: 8aa239fbad43bc0b551cbe40cad7b92fa97fcdde
Summary:
This updates the Eden mercurial extension to no longer invoke the Eden
`resetParentCommits()` thrift call when `setparents()` is called on the
dirstate map. Instead we now defer the call to `resetParentCommits()` until
`write()` is called to write the dirstate data to disk.
Informing edenfs of the parent change as soon as `setparents()` was called was
problematic, as this made edenfs reflect the change before the transaction was
committed. Some mercurial commands, notably `hg status`, also call
`setparents()` on the dirstate but never write this back to disk at all. This
is problematic since `hg status` calls `setparents()` without holding any
mercurial locks. As a result it may call `setparents()` with the "wrong"
parent if another mercurial process is running and is in the middle of a
transaction.
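The deferral can be sketched as follows. Both classes are hypothetical
stand-ins: the real dirstate map and the real Thrift `resetParentCommits()`
signature are more involved than shown here.

```python
class FakeEdenClient:
    """Stand-in for the Thrift client; just records the calls it receives."""

    def __init__(self):
        self.calls = []

    def resetParentCommits(self, p1, p2):
        # Simplified signature, assumed for this sketch only.
        self.calls.append((p1, p2))


class DirstateMapSketch:
    """Hypothetical, heavily simplified dirstate map."""

    def __init__(self, client, parents):
        self._client = client
        self._parents = parents
        self._parents_dirty = False

    def setparents(self, p1, p2):
        # Only update in-memory state; edenfs is not told yet.
        self._parents = (p1, p2)
        self._parents_dirty = True

    def write(self):
        # Inform edenfs only when the dirstate is actually written out.
        if self._parents_dirty:
            self._client.resetParentCommits(*self._parents)
            self._parents_dirty = False
```

With this shape, a command that calls `setparents()` but never calls `write()`
never reaches edenfs at all.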
Reviewed By: bolinfest, chadaustin
Differential Revision: D7980375
fbshipit-source-id: 4f5e4391fd291d4ea5fc93bb9d49ed0380fc1721
Summary:
While running the secfs filesystem validation tests against Eden, I
discovered a test that caused the eden process to abort. I bisected
and found that D7451330 regressed renaming a directory onto an empty
one. This fixes that case.
Reviewed By: simpkins
Differential Revision: D7945727
fbshipit-source-id: 592ede1b391528c02cd12b2b6ebbf3733fe8f503
Summary:
TestMount used to call through the EdenDispatcher when creating nodes
in the filesystem which had the unfortunate side effect of
incrementing the FUSE refcount. This diff asserts the FUSE refcounts
remain zero after inodes are created.
Reviewed By: simpkins
Differential Revision: D7957177
fbshipit-source-id: 7b0865a37ebbf39fdb34db409edd70d606295a0f
Summary:
I got tired of typing PathComponentPiece{"..."} in tests so here are
some operator literals.
Reviewed By: simpkins
Differential Revision: D7956732
fbshipit-source-id: 85d9f3fd725853a54da9e70fc659bd7eb9e0862c
Summary:
This moves some things around in order to facilitate adding the migration
command in a separate file.
Reviewed By: bolinfest
Differential Revision: D7946842
fbshipit-source-id: 54a554fb02e83a12f1d626b81377bc042fac41aa
Summary:
This allows using multiple cores when supported by
the BackingStore, and improves the throughput of prefetches.
Reviewed By: chadaustin
Differential Revision: D7888343
fbshipit-source-id: 1747f4ec4edf9ace02d54a4fb0ea3e8f509f51e5
Summary:
This overrides `LocalStore::getFuture` to use its own
thread pool.
Reviewed By: chadaustin
Differential Revision: D7888344
fbshipit-source-id: 76b18d9417b28dc0ab72af8d070bc9e037c73bc3
Summary:
Adds a dumb `getFuture` implementation in the LocalStore
base class that simply calls `get`. Different store implementations
may choose to override this to allow making use of multiple cores
if appropriate.
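A Python sketch of the base-class default and a pooled override. The class
names are hypothetical and this is not the C++ LocalStore; it only shows the
shape of the idea.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class LocalStoreSketch:
    """Hypothetical base store with a blocking get()."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data.get(key)

    def get_future(self, key):
        # Default: run the blocking get() inline and wrap the outcome.
        future = Future()
        try:
            future.set_result(self.get(key))
        except Exception as ex:
            future.set_exception(ex)
        return future


class PooledStoreSketch(LocalStoreSketch):
    """Hypothetical subclass that serves lookups from its own thread pool."""

    def __init__(self, data, workers=4):
        super().__init__(data)
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def get_future(self, key):
        # Override: submit get() to the pool so several lookups can overlap.
        return self._pool.submit(self.get, key)
```

Callers only ever see a future, so a store can switch between the two
strategies without any call site changing.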
Reviewed By: chadaustin
Differential Revision: D7888345
fbshipit-source-id: 20ba2db91cd7d62e5594f7d3bc3fca594dd107aa
Summary: This old Overlay code is no longer necessary.
Reviewed By: simpkins
Differential Revision: D7903912
fbshipit-source-id: 4a39d6ce7d1f6f81eb13715f2d5d17b22c10d413
Summary:
A persistent (but notably non-durable) mapping from inode
number to a fixed-size record stored in a memory-mapped file. The two
primary goals here are:
1. efficiently (and lazily) reify timestamps for inodes that aren't in the overlay
2. allow the kernel's page cache to drop pages under memory pressure
Reviewed By: simpkins
Differential Revision: D6877361
fbshipit-source-id: a4366b12e21e2bf483c83069cd93ef150829b2ac
Summary:
Make it clear (especially for the upcoming InodeMetadata struct) which
operations with EdenTimestamp and InodeTimestamps will never throw.
Reviewed By: simpkins
Differential Revision: D7920219
fbshipit-source-id: 5917da51b8128455893a1480def6f2c1c8de13d4
Summary:
simpkins was curious how data format migrations would be handled in
the upcoming InodeTable. This diff implements the bulk of the logic
which is largely at the MappedDiskVector level. The existing file
format supported record version negotiation and this diff hooks it up
with some type-level operations.
Reviewed By: simpkins
Differential Revision: D7836249
fbshipit-source-id: 00e36bc67068c7524956e908b3872c80a79241c0
Summary:
Per yfeldblum's comment in D7886046, we can use folly::unit instead
of folly::Unit{}. We weren't using folly::unit anywhere, so this diff
replaces folly::Unit{} elsewhere in the Eden code.
Reviewed By: yfeldblum
Differential Revision: D7913462
fbshipit-source-id: fa6ab44ceb406d38713e0f4649224a74e6e51abd
Summary:
Historically, we have seen a number of messages like the following in the Eden
logs:
```
Journal for .hg/blackbox.log holds invalid Created, Created sequence
```
Apparently we were getting these invalid sequences because we were not always
recording a "rename" correctly. The "rename" constructor for a `JournalDelta`
assumed that the source path should be included in the list of "removed" files
while the destination path should be included in the list of "created" files.
However, that is not accurate if the destination path already existed before
the user ran `mv`.
Fortunately, we already check whether the destination file exists in
`TreeInode::doRename()`, so it is straightforward to determine whether the
action is a "rename" (destination does not exist) or a "replace" (destination
already exists) and then classify the destination path accordingly.
As demonstrated by the new test introduced in this commit
(`JournalUpdateTest::moveFileReplace`), in the old implementation,
a file that was removed after it was overwritten would not show up as
removed in the merged `JournalDelta`. Because Watchman relies on
`JournalDelta::merge()` via the Thrift method `getFilesChangedSince()`,
this would cause Watchman to report such a file as still existing even
though it was removed.
This definitely caused bugs in Nuclide. It is likely that other tools that rely
on Watchman in Eden (such as Buck) may have also done incorrect things
because of this bug, so this could explain past reported issues.
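The rename/replace classification can be sketched like this. `DeltaSketch` and
`delta_for_move` are hypothetical names, and recording a replaced destination
as "changed" is an assumption about how the real `JournalDelta` classifies it.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DeltaSketch:
    """Hypothetical, simplified stand-in for a JournalDelta."""

    created: Set[str] = field(default_factory=set)
    changed: Set[str] = field(default_factory=set)
    removed: Set[str] = field(default_factory=set)


def delta_for_move(src: str, dst: str, dst_existed: bool) -> DeltaSketch:
    # The source path goes away in both cases.
    delta = DeltaSketch(removed={src})
    if dst_existed:
        # "replace": the destination was already there, so it is not new.
        delta.changed.add(dst)
    else:
        # "rename": the destination did not exist before the move.
        delta.created.add(dst)
    return delta
```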
Reviewed By: simpkins
Differential Revision: D7888249
fbshipit-source-id: 3e57963f27c5421a6175d1a759db8d9597ed76f3
Summary:
Before I fixed an issue with `nuclide-connections` in D7833410, it could throw
an error. Although there was a try/except around the `check_output` call in
`eden doctor`, it caught `OSError` rather than `CalledProcessError`, which
seems like a mistake (or maybe an inadvertent evolution of the code).
While there, I also extended it to catch `ValueError` in case the stdout of the
subprocess is not valid JSON, in which case `json.loads()` will raise a
`ValueError`.
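A sketch of the error handling described above. The `run_json_command` helper
is hypothetical, and returning `None` on failure is an assumption about how a
caller might want to treat it.

```python
import json
import subprocess
from typing import Any, List, Optional


def run_json_command(argv: List[str]) -> Optional[Any]:
    """Run `argv` and parse its stdout as JSON; return None on failure."""
    try:
        output = subprocess.check_output(argv)
        return json.loads(output.decode("utf-8"))
    except subprocess.CalledProcessError:
        # The command ran but exited with a non-zero status.
        return None
    except ValueError:
        # stdout was not valid JSON (json.JSONDecodeError is a ValueError).
        return None
```

Catching only `OSError` would cover a missing binary but neither of the two
cases handled here.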
Reviewed By: chadaustin
Differential Revision: D7890571
fbshipit-source-id: 184f6f669e9d62a5fb04db29bcbab450defc226e
Summary:
For sure, there is still more we can do to improve the output of `eden doctor`
to make it easier to scan, but I thought some color to classify the status
would be helpful. With this change, `eden doctor`:
* Prints green when no issues are encountered.
* Prints red when there are issues that could not be fixed.
* Prints yellow when there were issues that were fixed.
* Prints yellow when there were issues that were not fixed because it was a dry run.
In making this change, I took `StdoutPrinter` from `eden/cli/debug.py` and moved
it into its own file, `eden/cli/stdout_printer.py`. While there, I introduced an
`AnsiEscapeCodes` class that can be passed to the constructor of `StdoutPrinter`
so the client can specify its own ANSI escape codes. The unit test uses these to
ensure the test output will be the same independent of the value of
`sys.stdout.isatty()` when the test is run.
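A rough sketch of the shape of this. The field names, the `colorize_summary`
helper and its arguments are assumptions for illustration and do not
reproduce the real `StdoutPrinter`.

```python
from typing import NamedTuple


class AnsiEscapeCodes(NamedTuple):
    # Field names are hypothetical.
    green: str
    yellow: str
    red: str
    reset: str


# Standard ANSI SGR sequences for a real terminal.
TTY_CODES = AnsiEscapeCodes("\033[32m", "\033[33m", "\033[31m", "\033[0m")


class StdoutPrinter:
    def __init__(self, codes: AnsiEscapeCodes) -> None:
        self._codes = codes

    def green(self, text: str) -> str:
        return self._codes.green + text + self._codes.reset

    def yellow(self, text: str) -> str:
        return self._codes.yellow + text + self._codes.reset

    def red(self, text: str) -> str:
        return self._codes.red + text + self._codes.reset


def colorize_summary(
    out: StdoutPrinter, message: str, problems: int, unfixed: int, dry_run: bool
) -> str:
    if problems == 0:
        return out.green(message)  # no issues encountered
    if dry_run or unfixed == 0:
        return out.yellow(message)  # fixed, or skipped because of a dry run
    return out.red(message)  # issues that could not be fixed
```

A test can pass visible markers such as `AnsiEscapeCodes("<G>", "<Y>", "<R>", "</>")`
so the expected output does not depend on whether stdout is a TTY.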
Reviewed By: chadaustin
Differential Revision: D7890525
fbshipit-source-id: a95ff8c1685b48c2d239923cf08456ec6de757fe
Summary:
This diff works a little harder to be able to successfully
stop buck in a repo. It does so by performing a single level glob
to find the main buckconfig files and then invoking buck kill in
each of those locations.
The output from buck is suppressed as we've had reports that it
was confusing.
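A sketch of the approach, assuming "single level" means the repo root plus its
immediate subdirectories and that the config file is named `.buckconfig`. The
function names are hypothetical.

```python
import glob
import os
import subprocess
from typing import List


def find_buck_roots(repo_root: str) -> List[str]:
    """Directories at the root, or one level below it, holding a .buckconfig."""
    base = glob.escape(repo_root)
    patterns = [
        os.path.join(base, ".buckconfig"),
        os.path.join(base, "*", ".buckconfig"),
    ]
    roots = set()
    for pattern in patterns:
        for config in glob.glob(pattern):
            roots.add(os.path.dirname(config))
    return sorted(roots)


def stop_buck(repo_root: str) -> None:
    for root in find_buck_roots(repo_root):
        # Discard buck's output; the exit status is ignored in this sketch.
        subprocess.call(
            ["buck", "kill"],
            cwd=root,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
```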
I've removed the code that shut down chg; it's been causing us
problems in our integration tests, and the problematic behavior
will soon be addressed in chg itself.
Reviewed By: chadaustin
Differential Revision: D7874975
fbshipit-source-id: e9755099b1d22f2b4e3684280eb95cb9c9d11a41
Summary:
I have an upcoming change that touches this file and I don't want to pull in
unrelated edits. (I thought we had already autoformatted all of this code?)
Reviewed By: chadaustin
Differential Revision: D7855072
fbshipit-source-id: 2d2ab2ce19d438af73c30471199d15db98fa4e3a
Summary:
We need to introduce a new `includeDotfiles` option to `glob()`. However, as
with all of our Thrift APIs to date, we defined `glob()` to take its parameters
individually rather than as a single struct, so we can no longer add new
parameters to it.
In particular, we need to support `includeDotfiles` because we often configure
Buck to use Watchman to implement `glob()` in `BUCK` files, and when Watchman is
used in Eden, it leverages Eden's Thrift API to implement `glob()`. Because
Buck's `glob()` has an `include_dotfiles` option, we must be able to honor it
and pass it all the way through to Eden's `glob()` implementation.
Rather than name the new API `glob2()`, I'm electing to go with `globFiles()`.
(Perhaps once we eliminate all known users of `glob()` in the wild, which
requires turning over the current version of Watchman we have deployed, we can
redefine `glob()` in `eden.thrift` to be the same as `globFiles()` and then
update everyone to use `glob()` again so it has the more intuitive name.)
Reviewed By: wez
Differential Revision: D7748870
fbshipit-source-id: 92438f9c41e4fbdbd6cdccca5fce0e41cc3e9b07
Summary:
To provide the ability to ignore dotfiles, we update `GlobNode()` to take a
`bool includeDotfiles` that is used to determine the options used to create
the `GlobMatcher` associated with the node. Whereas the patterns
`**` and `*` were assumed to always match, that is now true only when
`includeDotfiles` is true, as well.
Finally, this adds a special case when `**` is used as the `pattern` for a `GlobNode`
and `includeDotfiles` is `false`. Because `GlobMatcher` does not accept `**` as
input, we specify the pattern as `**/*` to `GlobMatcher`, which it accepts and is
functionally equivalent in our case.
This is safe because we only do this trick when `**` is specified as a leaf `GlobNode`.
Reviewed By: wez
Differential Revision: D7741508
fbshipit-source-id: 9e6a50cb4dab09be2497393c641f176c84316a07
Summary:
Although we have not seen any ASAN failures or anything as a result of the
existing implementation, it certainly seems that it should be possible to
use a `StringPiece pattern` that is passed to `GlobNode.parse()` and not require
the `pattern` to outlive the `GlobNode`.
Reviewed By: wez
Differential Revision: D7795497
fbshipit-source-id: cefdd3fc69f8095c3aef3306b45f60ab4f82c737
Summary:
When `GlobMatcher` is used to implement `glob()` for Eden, `**` should not
include dotfiles by default (at least when it is used to implement `glob()` in Buck),
so we need to make this configurable. To this end, this adds a `GlobOptions`
parameter to `GlobMatcher::create()`. The key option this revision introduces is
`GlobOptions::IGNORE_DOTFILES`.
We implement this new functionality by associating a `matchCanStartWithDot`
boolean with the following opcodes in `GlobMatcher`:
* `GLOB_STAR`
* `GLOB_STAR_STAR_END`
* `GLOB_STAR_STAR_SLASH`
* `GLOB_ENDS_WITH`
The value of `matchCanStartWithDot` is largely determined by
`GlobOptions::IGNORE_DOTFILES`, though some extra checking is done
when assigning this for `GLOB_STAR`.
Originally, `GLOB_ENDS_WITH` required some funny business in how it
manipulated the `result` vector. This revision introduces some new funny
business to preserve the desired optimization.
Most of the work in this revision is new logic to ensure `matchCanStartWithDot`
is honored appropriately for each opcode.
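The intended dotfile semantics, sketched for a single path component using
Python's `fnmatch` rather than `GlobMatcher`'s opcodes. The function name is
hypothetical and this does not model `**`.

```python
import fnmatch


def matches_component(pattern: str, name: str, include_dotfiles: bool) -> bool:
    """Match one path component, optionally hiding dotfiles from wildcards."""
    if (
        not include_dotfiles
        and name.startswith(".")
        and not pattern.startswith(".")
    ):
        # A wildcard at the start of the pattern may not match a leading
        # dot; only a pattern that spells out the dot can match a dotfile.
        return False
    return fnmatch.fnmatchcase(name, pattern)
```

For example, `*.txt` matches `.a.txt` only when `include_dotfiles` is true,
while `.*` matches `.hidden` either way.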
Reviewed By: simpkins
Differential Revision: D7787621
fbshipit-source-id: f2c42e0f0948db74d48dc163d40aa3b13bbb4c3d
Summary:
I frequently find myself forgetting how to make the compiler see that
non-moved-from EDEN_BUG's destructor is noreturn, so add a simple
throwException function to it.
Reviewed By: simpkins
Differential Revision: D7834182
fbshipit-source-id: f279b9ca24f90efb4ad3ac318606dbd2dd002665
Summary:
When comparing two source control blob hashes, identical hashes can be assumed
to mean that the file contents are equal. However, differing hashes do not
necessarily mean that the file contents differ. In particular, mercurial
hashes history metadata in addition to the file contents when computing the
blob hash.
This updates Eden to always compare the file contents when the source control
blob hashes differ, rather than assuming that the file contents are different.
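A minimal sketch of the comparison. The `blobs_equal` name and the `load`
callback are hypothetical; the real code loads blobs asynchronously.

```python
from typing import Callable


def blobs_equal(
    hash_a: bytes,
    hash_b: bytes,
    load: Callable[[bytes], bytes],
) -> bool:
    """Return True if the two blobs have identical file contents."""
    if hash_a == hash_b:
        # Identical hashes imply identical contents; no need to load anything.
        return True
    # Differing hashes prove nothing, since the hash may also cover history
    # metadata, so fall back to comparing the actual contents.
    return load(hash_a) == load(hash_b)
```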
Reviewed By: wez
Differential Revision: D7825900
fbshipit-source-id: e611124a66cdd5c44589f20d1d4665a603286530
Summary:
Promote the folly logging code out of the experimental subdirectory.
We have been using this for several months in a few projects and are pretty
happy with it so far.
After moving it out of the experimental/ subdirectory I plan to update
folly::Init() to automatically support configuring it via a `--logging` command
line flag (similar to the initialization it already does today for glog).
Reviewed By: yfeldblum, chadaustin
Differential Revision: D7755455
fbshipit-source-id: 052db34c97f7516728f7cbb1a5ad959def2f6efb
Summary: Eden also depends on cpptoml for config files these days.
Reviewed By: wez
Differential Revision: D7786127
fbshipit-source-id: d220773a6a09959a4a1007f4c246dcb0ef371aa9