Summary:
To make it clearer to me why all the calls to newPtrLocked were safe, and to
eliminate some duplication, I captured the newPtrLocked call patterns into
member functions on LoadedInode and TreeInode::Entry.
Reviewed By: simpkins
Differential Revision: D7207542
fbshipit-source-id: 25de77e72c0898be43b3fbdddab835d64101755e
Summary:
To make it clearer why newPtrLocked is safe during the Inode class
construction process, add a takeOwnership function (better name
suggestions welcome) that converts from the newly-instantiated
unique_ptr to a managed InodePtr.
Reviewed By: simpkins
Differential Revision: D7204818
fbshipit-source-id: 8a40189588442490120623da86195c6fc99c9c51
Summary:
In some cases we could call `delete` with the wrong size in
`UnixSocket::SendQueueDestructor` when `__cpp_sized_deallocation` is available.
In particular, if the input message data contained some empty buffers in the
IOBuf chain we would allocate room for these elements when initially performing
the allocation in `createSendQueueEntry()`, but we would skip over them in the
`SendQueueEntry` constructor, so `iovCount` did not include them. The
`SendQueueDestructor` code used `iovCount` to calculate how much space had been
allocated, and so it would undercount the amount of allocated space in this
case.
This updates `createSendQueueEntry()` to also avoid allocating an iovec entry
for the empty buffers, so that `iovCount` should always accurately reflect how
many entries were originally allocated.
Reviewed By: chadaustin
Differential Revision: D7190579
fbshipit-source-id: 422cc737f146adeb1c133b9f3b500038e05bad10
Summary:
This diff is mostly preparation. It's submitted separately to
remove mechanical refactoring noise from the following diff, where
every entry is assigned an inode number upon construction.
Reviewed By: simpkins
Differential Revision: D7116832
fbshipit-source-id: 2943c45340a9a751eb52bf13e19d233d829494c0
Summary:
If the root TreeInode wants to allocate inode numbers, the inode
allocator must be initialized first. But complete InodeMap
initialization requires the root TreeInode. So split this into two
parts.
Also, I changed the inode allocator to a single atomic increment instead
of a lock acquisiton.
Finally, the extra assertions in this diff uncovered what looks like a
bug in the takeover logic where nextInodeNumber_ could end up being
smaller than the value in the takeover data, since the max inode
number from the overlay was assigned after loading from takeover data.
Reviewed By: simpkins
Differential Revision: D7107706
fbshipit-source-id: ec43cc81c11d709261598739c622609b372433a2
Summary:
We were curious why we needed `sudo gdb` to debug Eden even
though it drops privileges. The process dumpable bit is the trick.
Reviewed By: simpkins
Differential Revision: D7182742
fbshipit-source-id: 48e3dd778bf07ec1bfadace4edeb64668b936b9b
Summary:
This was an in-person code review item I forgot to enable
with the original diff.
Reviewed By: wez
Differential Revision: D7016624
fbshipit-source-id: 91d729772aa3c0b476f6bf8f6ee7e46cdac54626
Summary:
This integration test verifies that observed timestamps should persist
across runs. D6891479 makes it pass.
Reviewed By: wez
Differential Revision: D6930208
fbshipit-source-id: b8c95bce00933b9ae0de101a0bd8b6abfbfa1177
Summary:
Today, attaching gdb to edenfs requires sudo. Our hypothesis is that,
even though we drop uid and gid, the SELinux context remains elevated,
preventing ptrace from attaching.
Reviewed By: simpkins
Differential Revision: D7128692
fbshipit-source-id: 0f9b9a2ebef5a438bf0f4dcdc00a7b703a7aea02
Summary:
1. Moved the elapsed time calculation and logging for getScmStatus() request to the future completion.
2. Added a helper class "ThriftLogHelper" to attach the time logging to the future completion.
Reviewed By: simpkins
Differential Revision: D7118876
fbshipit-source-id: 5f8212296cd8725a3556c7f5c364ddfa66dbdc95
Summary:
I spent way too long trying to figure out why my refactorings were
causing invariant errors inside InodeMap. It turns out that we
initialize the root TreeInode before InodeMap::initialize is called,
which I suspect resulted in duplicate inode numbers being handed out.
Reviewed By: simpkins
Differential Revision: D7106302
fbshipit-source-id: b459734fb96bfbb6b4b27a1d23de8b6406d30ca4
Summary:
The getpath_unloaded_inode tests failed on my machine quite
regularly. The two possible races here I can imagine are racing the
system clock and FUSE not having released its refcount on an inode by
the time unload is called.
Reviewed By: wez
Differential Revision: D7118883
fbshipit-source-id: c3708f14a860f5ad04ddec988fc67a683b7dcfde
Summary:
`DCHECK_NE()` unfortunately is not constexpr, so it results in a build error
when used in a constexpr function. Fortunately `assert()` is allowed in
constexpr functions as of C++14, so use it instead.
Reviewed By: wez
Differential Revision: D7108887
fbshipit-source-id: 2be9181929a357c23f48c09173646683060aad28
Summary:
This adds a FakeFuse class to simulate a FUSE connection to the kernel, but
that is actually communicating with our userspace test harness code. This also
adds a FakePrivHelper class which implements the PrivHelper interface but
returns FakeFuse connections rather than real FUSE connections to the kernel.
This will enable us to write unit tests that exercise more of the FUSE and
EdenMount logic, and will also enable us to test error conditions and ordering
behaviors that are difficult to reliably reproduce with a real FUSE mount.
This also includes some very basic tests using this new code. The code in
fuse/test/FuseChannelTest.cpp creates a FuseChannel using a FakeFuse object,
and the code in inodes/test/FuseTest.cpp creates a full EdenMount object using
FakePrivHelper and FakeFuse. The tests are pretty similar for now, and only
exercise the FUSE initialization code. I will expand these tests in subsequent
diffs.
Reviewed By: wez
Differential Revision: D7050826
fbshipit-source-id: 4f82375d65664ca3a5a228a577caa4d1d47454fe
Summary:
I'm seeing test failures that I have not yet understood and I
thought they might be caused by an implicit conversion from
fusell::InodeNumber to bool. Well, they're not, but this is how I
discovered that. I'm not sure I want to land this change, but I'm
going to leave it around until I figure out what's happening with my
other diffs.
Reviewed By: simpkins
Differential Revision: D7077635
fbshipit-source-id: 50bf67026d2d0da0220c4709e3db24d841960f4b
Summary:
I am working on a stack of diffs that changes how we allocate inode
numbers to tree entries. I was hitting test failures I could not
understand, so in the process of trying to understand the flows
through InodeMap, I found newChildLoadStarted to be redundant with
shouldLoadChild.
Note: Today, allocateInodeNumber() acquires the InodeMap's lock, but
in a later diff, inode numbers will be assigned en masse during
TreeInode construction.
Reviewed By: simpkins
Differential Revision: D7059719
fbshipit-source-id: 624b861040d585d2cae41d7ec2aae7d528ff8336
Summary:
Previously we returned an ENOENT error in response to a FUSE lookup() call for
a name that does not exist. However, this does not allow FUSE to cache the
result, so we will continue to receive lookup() calls for this path in the
future.
This changes EdenDispatcher to return a successful response with an inode
number of 0 instead. This tells the kernel that the name does not exist, but
allows the kernel to cache this negative lookup result (for as long as
specified in the entry_valid field in the response).
Reviewed By: wez
Differential Revision: D7076811
fbshipit-source-id: a2b9977e58d6b6eecb584699b9d93b5ad29ad5ad
Summary:
Update the TreeInode code to always update the FUSE cache whenever we add a new
entry to the directory.
Up to now we have been fine without this since the kernel never cached negative
lookup responses for us. In order to turn on FUSE caching of negative lookup()
responses we need to invalidate the cache whenever we create a new entry.
- Have create(), symlink(), mknod() and mkdir() invalidate the FUSE entry cache
if they are not being invoked from a FUSE request. These are almost always
invoked from a FUSE request, but flushing the cache is the correct thing to
do if we add code in the future that can trigger these from a thrift call or
via some other non-FUSE code path.
- Change the checkout code to invalidate the FUSE cache whenever it creates new
children entries.
Reviewed By: wez
Differential Revision: D7076812
fbshipit-source-id: d6489995b415260e84b3701c49713b0ef514f85d
Summary:
Add a test that exercises `hg pull`. This confirms that eden can see new
commits created on the server after Eden and its hg_import_helper processes
have started. This test gets run in flatmanifest, hybrid treemanifest, and
treeonly mode.
This currently performs pulls using a local peer repository rather than over
SSH. This does exercise a different code path in mercurial than what typically
occurs in production. In the future we should perhaps also add a test that
uses a fake SSH helper program to exercise mercurial's sshpeer code paths as
well.
Reviewed By: chadaustin
Differential Revision: D6993788
fbshipit-source-id: 40628c0b3faac0dc8622b605a29b084979b8c089
Summary:
Whenever we tell Eden to change the working directory
parents, we need to make sure the appropriate objects are written to
disk. This fixes hg fold in Eden.
Reviewed By: simpkins
Differential Revision: D7045299
fbshipit-source-id: cbd51be59cf943a843b77c2abe66a84b745bce22
Summary:
Update the integration tests to avoid specifying an explicit path to the eden
extension. This way they use the version that we now package into hg.par
during the build.
This avoids issues with hg not being able to find and load native .so libraries
from the eden extension. Mercurial is able to find these libraries correctly
when they are packaged into hg.par (since the par start-up script sets
LD_LIBRARY_PATH to point to the par unpack directory). When using eden from an
external directory mercurial was not able to find these libraries.
Reviewed By: chadaustin
Differential Revision: D7047245
fbshipit-source-id: d56bffa953c178949c866efec507298a1f40da8b
Summary:
This removes the TARGETS files from the eden github repository. The
open source buck build has been failing for several months, since buck
removed support for the thrift_library() rule.
I will potentially take a stab at adding CMake build support for Eden
at some point in the future.
Reviewed By: chadaustin
Differential Revision: D6893233
fbshipit-source-id: e6023094a807cf481ac49998c6f21b213be6c288
Summary:
Most uses of `size_t` in `eden` are unqualified, but a few are qualified.
As discussed ad nauseum in
https://stackoverflow.com/questions/5813700/difference-between-size-t-and-stdsize-t
it is totally safe to use unqualified `size_t` with all compilers/platforms.
Since this saves 5 chars per use, and to improve uniformity, I ran:
```
$ find ~/fbsource/fbcode/eden -type f \
| egrep '\.(h|cpp)$' \
| xargs sed -i 's/std::size_t/size_t/g'
```
Reviewed By: chadaustin
Differential Revision: D7021980
fbshipit-source-id: da268e62a9a93d2a5168a40b6878795ae7516b7f
Summary: Per code review comments on D6983198, this simplifies the way we check if mode bits have changed in a meaningful-to-source-control way.
Reviewed By: simpkins
Differential Revision: D7015339
fbshipit-source-id: 548ead337fbea1c1dcb72b880921671e9b6188ac
Summary:
mode_t isn't really part of a TreeEntry and I also wanted to see all
the places where we convert an entry type from source control into
mode bits.
Reviewed By: simpkins
Differential Revision: D6983198
fbshipit-source-id: ce1d0976f5fc5130c34a8c93c07a4e26a7cdaf71
Summary:
This is the type of a tree entry, which may be another tree, so
FileType is not an accurate name.
Reviewed By: simpkins
Differential Revision: D6981168
fbshipit-source-id: 997eb8a27f599310ed678ce221c8083722db8bff
Summary:
Add a command line flag to control whether or not Eden should ever try falling
back to import tree data using flatmanifest if an error occurs trying to import
it directly via treemanifest, in repositories that support treemanifest.
This is particularly useful for tests, where we usually do not want to fall
back to flatmanifest import if an error occurs during treemanifest import. The
fallback can otherwise mask real issues that should trigger test failures.
This is probably also a good thing to have in general. Supporting
flatmanifest+treemanifest data in a single Eden repository has some unfortunate
problems today: we compute hashes differently for flatmanifest trees vs
treemanifest trees. As a result, we can end up with identical trees that have
different hashes. This can result in unfortunate performance consequences in
some cases where Eden assumes it must scan a directory for differences if the
hashes are different.
I have left flatmanifest import enabled by default for now, but we may want to
disable it by default in the future. I would be more inclined to disable it by
default if we did added a thrift method to explicitly re-enable it (or to
import a single commit using flatmanifest), so that users could work around
this setting if necessary without having to fully restart edenfs.
Reviewed By: wez
Differential Revision: D6993791
fbshipit-source-id: 6e091a426cf1e7c973df5a641d2f8a1101011346
Summary:
Update the integration tests to build `hg_import_helper` into a python archive
that includes the current mercurial sources from the local repository. This
way hg_import_helper will use the local mercurial code rather than whatever
mercurial modules are installed on the system. This will help ensure that we
detect any breakages caused by changes in the mercurial source when the
mercurial changes are made rather than when they are deployed.
Reviewed By: wez
Differential Revision: D6993790
fbshipit-source-id: f3ad404583cadcf07156bac1ce6bc869bd1160e1
Summary:
While we're changing how timestamps are stored, we might as well
implement this small-ish efficiency. Instead of storing timestamps as
a timespec (16 bytes), store them as 64-bit nanoseconds with a range
slightly larger than what ext4 supports. Assuming a million inodes,
this saves 24 MB.
This diff introduces the EdenTimestamp type. The next diff will start
using it.
Reviewed By: simpkins
Differential Revision: D6957659
fbshipit-source-id: 4551af6f5b8c1ff610ba88795f69e7d69d7f605d
Summary:
I want to rename FileType to TreeEntryType so I removed this one first
and replaced all of its uses with an isTree() method.
Reviewed By: simpkins
Differential Revision: D6980501
fbshipit-source-id: 105b8c599585e63efd44043e761db40e2824e77e
Summary:
Our Model TreeEntry code was a bit too general - in reality, both git
and hg only support a handful of specific tree entries: regular files,
executable files, symlinks, and trees. (git also supports
submodules.) This diff delays the expansion of a TreeEntry's type
into a full mode_t.
Reviewed By: simpkins
Differential Revision: D6980003
fbshipit-source-id: 73729208000668078a180b728d7e0bb9169c6f3c
Summary:
Update the `hg_test` decorator to accept additional parameters specifying the
list of test modes. e.g., `hg_test('Treemanifest')` asks to only run the test
in the Treemanifest configuration. With no arguments tests are still run with
both the Flatmanifest and Treemanifest configurations by default.
This also enables the TreeOnly configuration mode, which appears to work now.
(It was previously disabled since `hg init` would fail in treeonly
repositories.)
This new changes allows tests to explicitly opt-in to running in `TreeOnly`
mode.
Reviewed By: wez
Differential Revision: D6993789
fbshipit-source-id: 9ee51318d0f661038fe29f246b2b14eebbb1c3d9
Summary:
Update the integration tests so that they do not delete the temporary test
directories if the environment variable `EDEN_TEST_NO_CLEANUP` is set.
This makes it easier to manually examine the repository state after a test
fails.
Reviewed By: chadaustin
Differential Revision: D6986217
fbshipit-source-id: 727321c2c3da4d19d9edf8ed20b2aca3449779de
Summary:
Update the logic for how the Eden integration tests find the hg binary:
- Use the contents of the EDEN_HG_BINARY environment variable if set. When
running tests via `buck test` buck will pass the hg.par output location in
this variable.
- If EDEN_HG_BINARY is not set, use libfb.py.pathutils to find the location of
the //scm/hg:hg rule output. This makes sure the integration tests still
prefer this par path even when run manually without EDEN_HG_BINARY set. This
is convenient when running individual tests not through buck.
If for some reason the hg python_binary() output cannot be found then we still
search through $PATH for hg.real or hg as usual. For internal fbsource builds
we generally shouldn't hit this fallback case, though.
Reviewed By: wez, quark-zju
Differential Revision: D6986221
fbshipit-source-id: 982cb99112405a674dbc45df4ada73a990536489
Summary: I should have caught this before...
Reviewed By: wez
Differential Revision: D6995395
fbshipit-source-id: 08efd0aacb051f2b1754930b311e3a42afc3c538
Summary:
Remove the declaration of HgRepo::initHg(), which is never used or defined
anywhere. (There is a public HgRepo::hgInit() method.)
Reviewed By: chadaustin
Differential Revision: D6996812
fbshipit-source-id: 1ad15f624b3839ccb8a2005f3bcf0897563f82f5
Summary:
Several parts of the mercurial remotefilelog and treemanifest code can
incorrectly throw KeyErrors when accessing legitimate objects that do exist.
This appears to occur if the data was created since the mercurial code on the
server side started.
These KeyErrors are often generated on the server-side of a pull or
prefetchtrees operation. We can't workaround these by invalidating caches on
our local repository object. The only way that I've been able to work around
these bogus KeyErrors is by completely recreating the repo object before
retrying.
Our code now generally attempts to load data 3 times: if the initial attempt
fails we call `repo.invalidate()` and then retry. If that retry fails we then
try destroying and recreating the entire repo object and then try again.
Ideally it would be nicer to fix mercurial itself not to throw KeyErrors for
data that does exist. However that seems somewhat more complicated.
this_is_fine
Reviewed By: wez
Differential Revision: D6986220
fbshipit-source-id: 57905dd25e11c4858822020b44185a6f83ecd363
Summary:
Explicitly mark the UnionDatapackStore as needing a refresh after we have
fetched data from the server. Otherwise we may not be able to find the tree
data that we just fetched.
Reviewed By: wez
Differential Revision: D6986219
fbshipit-source-id: c50b92ee4242665c7f5770f87a5dbab17698b8b9
Summary:
Recent changes to the treemanifest extension removed the
treemanifest.prefetchtrees() function, and replaced moved this functionality to
a repo.prefetchtrees() function instead.
Update the Eden extension to call `repo.prefetchtrees()` if it is available.
Reviewed By: wez
Differential Revision: D6986218
fbshipit-source-id: 02e07ab5194f311e49d8ecb4755d79d55acb080c
Summary:
Another small refactoring that I don't want mixed into my
bigger diffs.
Reviewed By: wez
Differential Revision: D6927482
fbshipit-source-id: 28cc60dfdffb50921a5ef9cca4e2814b90d3b701
Summary:
Introduces a persistent non-durable storage mechanism backed
by a memory-mapped file with fixed-length records.
Reviewed By: simpkins
Differential Revision: D6877217
fbshipit-source-id: 0ddacb4137cfe43e67c822dce4064356cdf515b5
Summary:
There were places we were acquiring a lock unnecessarily. In
addition, I'm looking at reducing the number of places where we store
the full mode_t to see if we can get away with dirtype_t or something
similar.
Reviewed By: wez
Differential Revision: D6972140
fbshipit-source-id: bb29a4473f3056e39596600d22e67374ca484735
Summary:
We would immediately satisfy the health check and
tell the user that the system is healthy and show the pid of
the prior incarnation rather than the one for the instance
that we just launched.
This diff refactors the health checking code so that we can
share the implementation between the cli and the integration
tests; the integration tests already had code to do the right
thing for this.
Reviewed By: simpkins
Differential Revision: D6944989
fbshipit-source-id: 7c0f02c875b1b81f8f1b7521add67928200b27ed
Summary:
chadaustin is going to think about how to test this
in unit tests intestead :-p
Reviewed By: chadaustin
Differential Revision: D6951788
fbshipit-source-id: 137f285f3a1f080ce43392a621c73640ce3a9bf7
Summary:
This makes the default `memory` for speed and minimal
flakiness, but allows a test to select a different engine where
appropriate (eg: restart and remount tests).
Reviewed By: chadaustin
Differential Revision: D6944207
fbshipit-source-id: 1fb11387beda02d059a796dad5a42d56ddcf6e88