Summary:
I'm seeing test failures that I have not yet understood and I
thought they might be caused by an implicit conversion from
fusell::InodeNumber to bool. Well, they're not, but this is how I
discovered that. I'm not sure I want to land this change, but I'm
going to leave it around until I figure out what's happening with my
other diffs.
Reviewed By: simpkins
Differential Revision: D7077635
fbshipit-source-id: 50bf67026d2d0da0220c4709e3db24d841960f4b
Summary:
I am working on a stack of diffs that changes how we allocate inode
numbers to tree entries. I was hitting test failures I could not
understand, so in the process of trying to understand the flows
through InodeMap, I found newChildLoadStarted to be redundant with
shouldLoadChild.
Note: Today, allocateInodeNumber() acquires the InodeMap's lock, but
in a later diff, inode numbers will be assigned en masse during
TreeInode construction.
Reviewed By: simpkins
Differential Revision: D7059719
fbshipit-source-id: 624b861040d585d2cae41d7ec2aae7d528ff8336
Summary:
Previously we returned an ENOENT error in response to a FUSE lookup() call for
a name that does not exist. However, this does not allow FUSE to cache the
result, so we will continue to receive lookup() calls for this path in the
future.
This changes EdenDispatcher to return a successful response with an inode
number of 0 instead. This tells the kernel that the name does not exist, but
allows the kernel to cache this negative lookup result (for as long as
specified in the entry_valid field in the response).
Reviewed By: wez
Differential Revision: D7076811
fbshipit-source-id: a2b9977e58d6b6eecb584699b9d93b5ad29ad5ad
Summary:
Update the TreeInode code to always update the FUSE cache whenever we add a new
entry to the directory.
Up to now we have been fine without this since the kernel never cached negative
lookup responses for us. In order to turn on FUSE caching of negative lookup()
responses we need to invalidate the cache whenever we create a new entry.
- Have create(), symlink(), mknod() and mkdir() invalidate the FUSE entry cache
if they are not being invoked from a FUSE request. These are almost always
invoked from a FUSE request, but flushing the cache is the correct thing to
do if we add code in the future that can trigger these from a thrift call or
via some other non-FUSE code path.
- Change the checkout code to invalidate the FUSE cache whenever it creates new
children entries.
Reviewed By: wez
Differential Revision: D7076812
fbshipit-source-id: d6489995b415260e84b3701c49713b0ef514f85d
Summary:
This removes the TARGETS files from the eden github repository. The
open source buck build has been failing for several months, since buck
removed support for the thrift_library() rule.
I will potentially take a stab at adding CMake build support for Eden
at some point in the future.
Reviewed By: chadaustin
Differential Revision: D6893233
fbshipit-source-id: e6023094a807cf481ac49998c6f21b213be6c288
Summary:
Most uses of `size_t` in `eden` are unqualified, but a few are qualified.
As discussed ad nauseum in
https://stackoverflow.com/questions/5813700/difference-between-size-t-and-stdsize-t
it is totally safe to use unqualified `size_t` with all compilers/platforms.
Since this saves 5 chars per use, and to improve uniformity, I ran:
```
$ find ~/fbsource/fbcode/eden -type f \
| egrep '\.(h|cpp)$' \
| xargs sed -i 's/std::size_t/size_t/g'
```
Reviewed By: chadaustin
Differential Revision: D7021980
fbshipit-source-id: da268e62a9a93d2a5168a40b6878795ae7516b7f
Summary: Per code review comments on D6983198, this simplifies the way we check if mode bits have changed in a meaningful-to-source-control way.
Reviewed By: simpkins
Differential Revision: D7015339
fbshipit-source-id: 548ead337fbea1c1dcb72b880921671e9b6188ac
Summary:
mode_t isn't really part of a TreeEntry and I also wanted to see all
the places where we convert an entry type from source control into
mode bits.
Reviewed By: simpkins
Differential Revision: D6983198
fbshipit-source-id: ce1d0976f5fc5130c34a8c93c07a4e26a7cdaf71
Summary:
This is the type of a tree entry, which may be another tree, so
FileType is not an accurate name.
Reviewed By: simpkins
Differential Revision: D6981168
fbshipit-source-id: 997eb8a27f599310ed678ce221c8083722db8bff
Summary:
Add a command line flag to control whether or not Eden should ever try falling
back to import tree data using flatmanifest if an error occurs trying to import
it directly via treemanifest, in repositories that support treemanifest.
This is particularly useful for tests, where we usually do not want to fall
back to flatmanifest import if an error occurs during treemanifest import. The
fallback can otherwise mask real issues that should trigger test failures.
This is probably also a good thing to have in general. Supporting
flatmanifest+treemanifest data in a single Eden repository has some unfortunate
problems today: we compute hashes differently for flatmanifest trees vs
treemanifest trees. As a result, we can end up with identical trees that have
different hashes. This can result in unfortunate performance consequences in
some cases where Eden assumes it must scan a directory for differences if the
hashes are different.
I have left flatmanifest import enabled by default for now, but we may want to
disable it by default in the future. I would be more inclined to disable it by
default if we did added a thrift method to explicitly re-enable it (or to
import a single commit using flatmanifest), so that users could work around
this setting if necessary without having to fully restart edenfs.
Reviewed By: wez
Differential Revision: D6993791
fbshipit-source-id: 6e091a426cf1e7c973df5a641d2f8a1101011346
Summary:
While we're changing how timestamps are stored, we might as well
implement this small-ish efficiency. Instead of storing timestamps as
a timespec (16 bytes), store them as 64-bit nanoseconds with a range
slightly larger than what ext4 supports. Assuming a million inodes,
this saves 24 MB.
This diff introduces the EdenTimestamp type. The next diff will start
using it.
Reviewed By: simpkins
Differential Revision: D6957659
fbshipit-source-id: 4551af6f5b8c1ff610ba88795f69e7d69d7f605d
Summary:
I want to rename FileType to TreeEntryType so I removed this one first
and replaced all of its uses with an isTree() method.
Reviewed By: simpkins
Differential Revision: D6980501
fbshipit-source-id: 105b8c599585e63efd44043e761db40e2824e77e
Summary:
Our Model TreeEntry code was a bit too general - in reality, both git
and hg only support a handful of specific tree entries: regular files,
executable files, symlinks, and trees. (git also supports
submodules.) This diff delays the expansion of a TreeEntry's type
into a full mode_t.
Reviewed By: simpkins
Differential Revision: D6980003
fbshipit-source-id: 73729208000668078a180b728d7e0bb9169c6f3c
Summary: I should have caught this before...
Reviewed By: wez
Differential Revision: D6995395
fbshipit-source-id: 08efd0aacb051f2b1754930b311e3a42afc3c538
Summary:
Remove the declaration of HgRepo::initHg(), which is never used or defined
anywhere. (There is a public HgRepo::hgInit() method.)
Reviewed By: chadaustin
Differential Revision: D6996812
fbshipit-source-id: 1ad15f624b3839ccb8a2005f3bcf0897563f82f5
Summary:
Several parts of the mercurial remotefilelog and treemanifest code can
incorrectly throw KeyErrors when accessing legitimate objects that do exist.
This appears to occur if the data was created since the mercurial code on the
server side started.
These KeyErrors are often generated on the server-side of a pull or
prefetchtrees operation. We can't workaround these by invalidating caches on
our local repository object. The only way that I've been able to work around
these bogus KeyErrors is by completely recreating the repo object before
retrying.
Our code now generally attempts to load data 3 times: if the initial attempt
fails we call `repo.invalidate()` and then retry. If that retry fails we then
try destroying and recreating the entire repo object and then try again.
Ideally it would be nicer to fix mercurial itself not to throw KeyErrors for
data that does exist. However that seems somewhat more complicated.
this_is_fine
Reviewed By: wez
Differential Revision: D6986220
fbshipit-source-id: 57905dd25e11c4858822020b44185a6f83ecd363
Summary:
Explicitly mark the UnionDatapackStore as needing a refresh after we have
fetched data from the server. Otherwise we may not be able to find the tree
data that we just fetched.
Reviewed By: wez
Differential Revision: D6986219
fbshipit-source-id: c50b92ee4242665c7f5770f87a5dbab17698b8b9
Summary:
Recent changes to the treemanifest extension removed the
treemanifest.prefetchtrees() function, and replaced moved this functionality to
a repo.prefetchtrees() function instead.
Update the Eden extension to call `repo.prefetchtrees()` if it is available.
Reviewed By: wez
Differential Revision: D6986218
fbshipit-source-id: 02e07ab5194f311e49d8ecb4755d79d55acb080c
Summary:
Another small refactoring that I don't want mixed into my
bigger diffs.
Reviewed By: wez
Differential Revision: D6927482
fbshipit-source-id: 28cc60dfdffb50921a5ef9cca4e2814b90d3b701
Summary:
Introduces a persistent non-durable storage mechanism backed
by a memory-mapped file with fixed-length records.
Reviewed By: simpkins
Differential Revision: D6877217
fbshipit-source-id: 0ddacb4137cfe43e67c822dce4064356cdf515b5
Summary:
There were places we were acquiring a lock unnecessarily. In
addition, I'm looking at reducing the number of places where we store
the full mode_t to see if we can get away with dirtype_t or something
similar.
Reviewed By: wez
Differential Revision: D6972140
fbshipit-source-id: bb29a4473f3056e39596600d22e67374ca484735
Summary:
refactor this test so that we can apply it to any LocalStore
implementation, and have it run against the Memory, Rocks and SQLite
implementations.
Reviewed By: chadaustin, simpkins
Differential Revision: D6919455
fbshipit-source-id: cc93042b95833b175955e6395c84cf41238a90d2
Summary:
Adds a SQLite storage implementation and makes it the
default engine for integration tests; this requires fewer resources
to run and the integration tests thus run faster and more reliably.
In the future we may add a configuration option to remember the
storage engine that was used as it is currently not "safe" to switch
from one to the other because the hgproxyhash data cannot be
recreated without re-importing a revision.
Reviewed By: simpkins
Differential Revision: D6919456
fbshipit-source-id: 3afbfafb190cca0e3c797cd9b7cd051768575a8c
Summary:
Update FileInode so that getattr() and setattr() both return a reasonable value
in st_blocks.
Previously we always returned 0 in st_blocks, which caused applications like
`du` to always report files as using 0 space in Eden mounts. Now we compute
st_blocks based on st_size, so that `du` will report reasonable estimates for
when scanning the size of subdirectories inside an Eden mount.
Reviewed By: chadaustin
Differential Revision: D6932098
fbshipit-source-id: bd29e46821176e510f420e6e2b6ce480b80d50ff
Summary:
Rename the existing `state_` variable to `runState_` to help distinguish it
from the new `serverState_` variable.
The information in `runState_` is all related to whether the EdenServer object
is starting, shutting down, or running normally.
Reviewed By: wez
Differential Revision: D6929864
fbshipit-source-id: ad7af381a8a291b12db9308668c7616ebd9b7f39
Summary:
Move all of the privhelper functionality into a PrivHelper class. The
ServerState object now stores the PrivHelper object to use, rather than having
a global singleton.
This will make it easier to stub out the PrivHelper functionality during unit
tests.
Reviewed By: wez
Differential Revision: D6929862
fbshipit-source-id: e3edcb0a03ba9afdf34554cb961fd74557cdd6e3
Summary:
Drop the --thrift_address flag, and always create the thrift socket using a
fixed name under the `.eden` directory. The location of the `.eden` directory
is still configurable with the `--edenDir` argument.
There isn't really much benefit to making the socket path be configurable
separately from the .eden directory path, and it adds to the code complexity.
For instance, while you can tell eden to listen on a TCP socket instead of a
Unix domain socket, that functionality has been broken since D4637285
introduced a `CHECK()` statement that crashes the code when using a TCP socket.
Reviewed By: wez
Differential Revision: D6929863
fbshipit-source-id: ee5f7341d01d3ce522cae936ef3c133bba3f18f7
Summary:
Update EdenMount::diff() to use the UserInfo object stored in the shared
ServerState rather than calling UserInfo::lookup() on each diff operation.
Reviewed By: wez
Differential Revision: D6929865
fbshipit-source-id: a68ab1fa9eb345b59972e67c3aac258b4dbcdab5
Summary:
Add a new ServerState class to store process-wide state that is shared across
multiple mounts. Up until now we have passed around the shared state data as
separate variables.
This is intentionally separate from the existing EdenServer class to allow unit
tests to create EdenMount objects without an EdenServer object.
Reviewed By: wez
Differential Revision: D6929861
fbshipit-source-id: 5f22efb6d79dfd70031be1dc37f494c2ad8af902
Summary:
We no longer need this and I believe that this was contributing
to this source of flakiness in our CI; this stack trace triggers when
we get an aborted or short read from the kernel:
```
*** Aborted at 1517827659 (Unix time, try 'date -d 1517827659') ***
*** Signal 11 (SIGSEGV) (0x2d0) received by PID 153706 (pthread TID 0x7f63ea37d700) (linux TID 160573) (maybe from PID 720, UID 0) (code: address not mapped to object), stack trace: ***
@ 0000000002abfa2d folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
./folly/experimental/symbolizer/SignalHandler.cpp:413
@ 00007f646e96dacf (unknown)
@ 00007f646e96aa20 __pthread_kill
@ 0000000001873748 facebook::eden::fusell::FuseChannel::requestSessionExit()
./eden/fs/fuse/FuseChannel.cpp:483
@ 000000000187d2f7 facebook::eden::fusell::FuseChannel::processSession()
./eden/fs/fuse/FuseChannel.cpp:656
@ 000000000187dab9 facebook::eden::fusell::FuseChannel::fuseWorkerThread(unsigned long)
./eden/fs/fuse/FuseChannel.cpp:493
@ 00007f646f267170 execute_native_thread_routine
@ 00007f646e9637a8 start_thread
@ 00007f646e047a7c __clone
```
my theory is that we're allowing shutdownImpl to free things out from under other
threads before they've all seen this signal and wound down fully. This is slightly
speculative in that I haven't managed to reproduce this stack trace on my devserver.
We don't really need this additional signal any longer.
Reviewed By: simpkins
Differential Revision: D6907734
fbshipit-source-id: 0f0138b631a7201fc9a4a1c93c2cde846e869cbd
Summary: Small refactoring I should have done with the previous diff.
Reviewed By: simpkins
Differential Revision: D6927152
fbshipit-source-id: 1dcda01134c3d63c62169c5728dba24ca0eebd68
Summary:
This enables dropping in alternative implementations
of LocalStore and adds a MemoryLocalStore implementation for
use in our tests.
This diff doesn't change the default storage option for the
eden server. I'll look at adding such an option in a follow up diff.
Reviewed By: chadaustin
Differential Revision: D6910413
fbshipit-source-id: 018bf04e0bff101e1f0ab35e8580ca2a2622e5ef
Summary:
Change `LoggerDB::get()` to a reference instead of a pointer since this
function can never return null.
Reviewed By: yfeldblum
Differential Revision: D6893206
fbshipit-source-id: af47063918a79c851fd39b838d6c63755166e033
Summary:
Update hg_import_helper.py to parse the repository's .hg/hgrc file, but then
create the repository object with a fresh UI object that has not parsed that
config yet.
This more closely mimic's the behavior of mercurial's dispatch code invoked
when starting the `hg` command line. This behavior is required to ensure that
secondary repository objects that get created end up with the correct
configuration, and do not have settings from the original repository's hgrc
file.
The mercurial behavior of parsing the original repository's hgrc file twice
dates back to rHG741f64dfc04d1.
Reviewed By: quark-zju
Differential Revision: D6909449
fbshipit-source-id: 85073ab6ade4ab70247d48bc670c9924e9e6841f
Summary:
Update `TreeInode::diff()` to check if its hash matches the source control tree
it is being compared to, and return early if they are identical.
I'm surprised that I forgot to include this initially when implementing
`TreeInode::diff()`
This makes `hg status` faster when a large number of unmodified directories
have been loaded.
Reviewed By: chadaustin
Differential Revision: D6890615
fbshipit-source-id: 561630d0220b4875dbf3678161cdb41a8aa4fc82
Summary:
This re-orders some of the code in `TreeInode::diff()` slightly. This should
not affect the behavior of the code.
This moves the `isIgnored` check inside the main `contents_.wlock()` block.
This reduces the number of places where we grab the lock, and will help keep
things simple for an upcoming diff where I need to add some more checks in this
code with the lock held.
This also changes `inodeFuture` to use the new `Future::makeEmpty()`
constructor rather than having to use an `Optional<Future>`
Reviewed By: chadaustin
Differential Revision: D6890616
fbshipit-source-id: 354bbf6a6be6d356fd23e6c0fb6b534679bbe0bb
Summary: We don't think this is happening, but let's test it!
Reviewed By: simpkins
Differential Revision: D6888038
fbshipit-source-id: 754b2ec8f78ff513fd350a74505915e2a1e9ba3e
Summary: Small things I've needed in later diffs.
Reviewed By: wez
Differential Revision: D6877755
fbshipit-source-id: c9002eb0b92dbd8fe9c4f636d2ca79b25cde331f
Summary:
While working on timestamp storage, the fact that
InodeTimestamps was a member of InodeBase kept getting in the way.
Make it its own type.
Reviewed By: simpkins
Differential Revision: D6862835
fbshipit-source-id: 91d8984764f0586b9fa52e961eb5606a530e0416
Summary: This API seems like it should be const, as it does not modify the clock.
Reviewed By: chadaustin, zhupanov
Differential Revision: D6869719
fbshipit-source-id: c8bf4ccab34538b59e6baeedd0b0ff88b328236e
Summary:
This diff moves the mount-time initialization handling
out of the main loop. This rationale for this is:
* We don't (and shouldn't!) need to process FUSE_INIT for takeover
processing, and this structure allows us to make stronger assertions
about our state.
* we can avoid spinning up multiple threads in the (rare!) case that
the FUSE_INIT fails
* It is now a little harder for exceptions during initialization to
escape our notice.
In rearranging this stuff, I found a race condition in the worker thread
shutdown; we could erroneously emit a completion event before all of
the threads had been torn down and this resulted in sporadic integration
test failures hitting the assertion for the number of joined threads
in the destructor.
Reviewed By: simpkins
Differential Revision: D6766330
fbshipit-source-id: 32afb5a7c739c75aebfdb0a8f896eec5f41ad33f
Summary:
This is the spiritual successor to D3302706 which originally
wanted to solve this by adding a python extension. That would prove
to be too painful for the opensource build so it was shelved.
We now need to be able to run our tests in an environment that doesn't
have the `attr` rpm installed so this is a good time to fix this
in a more portable way.
This diff adds a little wrapper around the functions that we already
have for consuming extended attribute information and augments them
with another to list attributes.
The utility emits output in json format and is intended to be fed
directly into the helper functions we have in `fs.py`.
Reviewed By: chadaustin
Differential Revision: D6851182
fbshipit-source-id: 3d1d1a351f2e01405645d45658d1c8bc61a659a4
Summary:
Dir's contents were represented as a vector of 64-bit
pointers to 48-byte structs. This change removes that layer of
indirection, reducing memory usage and slightly pessimizing insertion.
The diff is mostly mechanical outside of the TreeInode.h changes and
calls to emplace..
I'll run memory tests tomorrow, though it's a gamble as to whether
private bytes will show a difference. I may need to shrink the Entry
struct too.
Reviewed By: wez
Differential Revision: D6804957
fbshipit-source-id: b126656dbc7951565e74b6401adde6353e809056
Summary:
This adds version 3 nee 2 of the takeover serialization,
which uses thrift to represent the data. Even though it is logically
version 2 I'm naming it version 3 because we're taking advantage of
the MessageType values from Version 1 which allowed numerical values
1 or 2 to represent different data types.
In this diff we now formalize that first word as the protocol version
and to avoid ambiguity are starting the new version value at 3. I
did briefly consider using the name Version2 to refer to this and setting
the value in the enum to 3, but I didn't want to become known for
API hate crimes against my fellow engineers.
Reviewed By: simpkins
Differential Revision: D6733406
fbshipit-source-id: e2067365e4e8b388490440fd73ab504544011846
Summary:
Whilst chatting with simpkins we realized that we lost
the handshake portion of the takeover protocol during a refactor.
The handshake is important for a couple of reasons:
1. It prevents unmounting and loosing all the mounts in the case
that sometime decides to netcat or otherwise connect to the
socket
2. It gives us an opportunity to short circuit any heavy lifting
if we know that it will be impossible to succeed.
3. It allows us to rollback to earlier builds with older versions
of the takeover protocol.
This diff adds a little bit of machinery to enable passing a set of supported
takeover protocol version numbers. The intent is to retain support for
the two of these at a time; any time we change the encoding/protocol
for takeover we'll bump the version number and add supporting code
to handle the new format, retaining support for the prior version.
Retaining the ability to handle the prior version allows us to downgrade
to an earlier build gracefully if/when the need arises.
I opted to do this here rather than by bumping the `kProtocolID`
constant in `UnixSocket.h` becase we're not really changing the
lowest level of the protocol; just the takeover specific portions.
I haven't actually changed the takeover serialization in this diff,
but do have some work on that happening in D6733406; that diff will
be amended to take advantage and demonstrate how this versioning
scheme works.
A key thing to note about the implementation of this diff is that
the client sends the version number to the server, but doesn't
add any explicit version encoding in the response we receive.
This is deliberate and allows us to upgrade prior builds to
this new scheme. I'll add a more definitive check for this
situation when I actually rev the format in the following diff.
Reviewed By: simpkins
Differential Revision: D6743065
fbshipit-source-id: c991cebfee918daad098105ca6bcfef76374c0ff
Summary:
Tiny thing I noticed when reading code. Keep the entry name
as a StringPiece rather than bouncing through char*.
Reviewed By: zhupanov
Differential Revision: D6820080
fbshipit-source-id: 884e55f74094f44012efbe44b86d8e5903300967
Summary:
isSameAs calls getSha1 which was failing on symlinks. The
original concern was that asking for the SHA-1 of a symlink is
ambiguous: do you want the hash of the symlink or the target? But we
already check for whether you are requesting the SHA-1 of a symlink in
EdenServiceHandler, so it's redundant and incorrect to check in
FileInode too.
Reviewed By: simpkins
Differential Revision: D6847489
fbshipit-source-id: 13966da06bcde75c5c568e09fef14e735de47cfb