Summary:
These are new helper methods we need to create test scenarios.
They will be used in upcoming revisions.
Reviewed By: wez
Differential Revision: D4046981
fbshipit-source-id: 9c66c456be57006173e4a65eed603de4a426a438
Summary: This should be useful for my upcoming unit tests for the Hg dirstate.
Reviewed By: simpkins
Differential Revision: D4013464
fbshipit-source-id: 46460186abfa104aa026894068cd160e52c94729
Summary: This will make it easier to compare a `TreeEntry` with a `TreeInode::Entry`.
Reviewed By: simpkins
Differential Revision: D4034298
fbshipit-source-id: 29674e2902661bf46394ea71b81537b35bd4b107
Summary: This should make some of the upcoming test harness work a little easier.
Reviewed By: simpkins
Differential Revision: D4011747
fbshipit-source-id: 87ee80a6d641a29be9027b163b1adee496f4452f
Summary:
I need this for the upcoming test harness so I can avoid creating a
`ClientConfig`, which is currently a huge pain to do from a unit test.
Reviewed By: simpkins
Differential Revision: D4010842
fbshipit-source-id: 03d1e1de9c3047340a6f26202d4b432f4a8620b4
Summary:
This was reported by ASAN.
The major issue was that `FakeObjectStore` was returning a copy of a `Tree`,
so it was not the case that the `TreeEntry*` returned by `getEntryForFile()`
was guaranteed to be "owned by" the `Tree* root` that was passed in. To address
this, we change `getEntryForFile()` to now return a copy of the `TreeEntry`
pointed to by the `TreeEntry*` that it gets back from `getEntryPtr()`. It
really comes down to this line:
```
auto entry = currentDirectory->getEntryPtr(piece.basename());
```
because we cannot guarantee that `currentDirectory` will live past the end of
`getEntryForFile()`, so we cannot guarantee that the return value of
`currentDirectory->getEntryPtr()` will, either.
Special thanks to meyering and yfeldblum for helping me debug this.
Reviewed By: simpkins
Differential Revision: D4024627
fbshipit-source-id: 6295e6f2b1d2f544271b2aebad27a4ad3ae04563
Summary:
Utility function that given a `Tree` and a `RelativePathPiece`, returns the
corresponding `TreeEntry` in the `ObjectStore`, if it exists.
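A rough Python sketch of this lookup, for illustration only. The real helper is C++; the `Tree`, `TreeEntry`, and `FakeObjectStore` classes below are simplified stand-ins, and their fields and method names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TreeEntry:
    name: str
    hash: str       # hypothetical: id of the blob or subtree this entry points at
    is_tree: bool


@dataclass
class Tree:
    entries: Dict[str, TreeEntry] = field(default_factory=dict)


class FakeObjectStore:
    """In-memory stand-in for an object store, keyed by hash."""

    def __init__(self) -> None:
        self._trees: Dict[str, Tree] = {}

    def put_tree(self, tree_hash: str, tree: Tree) -> None:
        self._trees[tree_hash] = tree

    def get_tree(self, tree_hash: str) -> Tree:
        return self._trees[tree_hash]


def get_entry_for_file(store, root, relative_path) -> Optional[TreeEntry]:
    """Walk from `root` one path component at a time; None if anything is missing."""
    parts = [p for p in relative_path.split("/") if p]
    if not parts:
        return None
    current = root
    for name in parts[:-1]:
        entry = current.entries.get(name)
        # Every intermediate component must exist and must be a directory.
        if entry is None or not entry.is_tree:
            return None
        current = store.get_tree(entry.hash)
    return current.entries.get(parts[-1])
```

Each intermediate directory is fetched from the store by hash, which is why a fake in-memory store is enough to exercise the function.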
Reviewed By: wez
Differential Revision: D3980261
fbshipit-source-id: 2808a4ca45be84e3a6bb91b0cf2db19a3bf88798
Summary:
In an upcoming revision, I am going to introduce a utility function that takes
an `ObjectStore` (well, now an `IObjectStore`) as a parameter and I want to be
able to test it. Having a `FakeObjectStore` should make this considerably easier
without having to resort to mocks.
Reviewed By: simpkins
Differential Revision: D3980580
fbshipit-source-id: 5886e2055c893e749cc898226e1baade776c3ea7
Summary:
Adds a very basic example of testing eden functionality with hypothesis.
We'll be building on this with stateful testing in a follow-on diff tomorrow.
There's some prep/setup work in the base test class that can be removed when an updated version of hypothesis ships and is updated in our third-party repo.
Reviewed By: simpkins
Differential Revision: D3968250
fbshipit-source-id: 46382c3bf2d6a0edbd60ac2b048b1bae26ca2572
Summary: This is necessary so that it can be used as the key in an `unordered_map`.
Reviewed By: simpkins
Differential Revision: D3980575
fbshipit-source-id: d225a98f957f9aae2f2f50a6cc365011d953c92e
Summary:
Apparently we did not have an existing unit test for `Tree`, so this adds one.
The other methods should be tested, as well, but I'm about to use `getEntryPtr()`
elsewhere, which is why I just focused on this one for the moment.
Reviewed By: simpkins
Differential Revision: D3980150
fbshipit-source-id: 33456fd621a1894606605af4fee06ba42d124752
Summary:
We want to use these with Eden.
Depends on D3961190
Depends on D3961193
Depends on D3961196
Depends on D3961208
Reviewed By: rhysparry
Differential Revision: D3961232
fbshipit-source-id: 56f5a1811625303514e4398a6d47ea90ba348724
Summary:
The default `casecollisionauditor` reads all of `dirstate._map`. This is
(1) expensive for a large repo and (2) accesses a private property that we would
prefer not to expose via `edendirstate`. Setting `portablefilenames = ignore` by
default avoids this check.
This means that Eden users are currently responsible for not creating directory
entries that would cause a case-insensitive collision. Ideally, we would just do
this check on the server.
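For reference, the kind of check that is being skipped can be sketched as follows. This is not Mercurial's `casecollisionauditor`; `find_case_collisions` is a hypothetical helper that only shows what a case-insensitive collision among directory entries looks like.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_case_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that differ only by case; return the groups with more than one member."""
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```

A server-side version would presumably run something like this per directory at the time an entry is created, rather than over the whole dirstate.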
Reviewed By: wez
Differential Revision: D3964461
fbshipit-source-id: f351bdeaad0fc06cd70cc637ca1b6fde249dde9c
Summary:
`getMaterializedEntries()` would previously try to dereference a null pointer
if the input mount path did not refer to a valid mount point.
Reviewed By: bolinfest, wez
Differential Revision: D3942600
fbshipit-source-id: 2a8c9aa87d2bd8175f7bc77f3d6293ad25e9c198
Summary:
Add some helper functions for constructing EdenError objects from a few
different types of arguments. Also update eden.thrift to indicate that most
functions can throw EdenErrors on failure.
Reviewed By: bolinfest, wez
Differential Revision: D3942588
fbshipit-source-id: 1b561c5310a8a218f88c38c70499e087fe47bbe0
Summary:
Python 2.x requires the current class name be passed into super().
Add arguments to super so that we can use this inside a mercurial extension.
(Mercurial only supports python 2.x.)
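A minimal illustration of the two spellings (`Base` and `Child` are made-up names, not classes from this diff):

```python
class Base(object):
    def describe(self):
        return "base"


class Child(Base):
    def describe(self):
        # Explicit arguments are accepted by both Python 2 and Python 3.
        # The zero-argument form, super().describe(), is Python 3 only.
        return "child of " + super(Child, self).describe()
```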
Reviewed By: bolinfest
Differential Revision: D3942573
fbshipit-source-id: 06df55f217631a398004c0d25448d3a612f772e9
Summary:
The keys in the config directory map are normalized, absolute paths to the
mount point. When trying to look up a mount point make sure we also always use
a normalized absolute path.
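A sketch of the idea, assuming the map is a plain dict keyed by mount path (`lookup_mount` and the sample paths are hypothetical):

```python
import os


def lookup_mount(config_dirs, path):
    """Normalize `path` the same way the keys were normalized before looking it up."""
    # os.path.abspath makes the path absolute and collapses ".", "..", and
    # duplicate or trailing slashes. It does not resolve symlinks.
    return config_dirs.get(os.path.abspath(path))
```

With keys such as `/tmp/mnt`, lookups for `/tmp/./mnt/` or `/tmp/other/../mnt` then resolve to the same entry on a POSIX system.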
Reviewed By: bolinfest
Differential Revision: D3942565
fbshipit-source-id: 63db838ffc7139d779925adf07c50f849d73bcc5
Summary:
We were hitting an assertion in the case where we did a `mkdir`
followed by a `rename` followed by `getMaterializedEntries`.
The issue is that our in-memory representation has a boolean to indicate
whether a dir inode is materialized, but our serialization format does
not have this bit. When we loaded the data we were not setting the
field to true and this was caught by the DCHECK.
If we have serialized data for a dir then it is, by definition, materialized
and we should just set that field to true.
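In sketch form (the real code is C++ and uses a different serialization format; the JSON encoding and the `DirInode` / `load_dir` names here are assumptions for illustration):

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DirInode:
    entries: Dict[str, str] = field(default_factory=dict)
    materialized: bool = False


def load_dir(serialized: str) -> DirInode:
    """Rebuild a directory from its saved form."""
    entries = json.loads(serialized)
    # The saved form carries no "materialized" bit. The fact that saved data
    # exists at all implies the directory was materialized, so set it here.
    return DirInode(entries=entries, materialized=True)
```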
Reviewed By: bolinfest
Differential Revision: D3900795
fbshipit-source-id: 62d8281e7a1009056d274888c9aff87664d2e09f
Summary:
This design is inspired by that of Git hooks:
https://git-scm.com/docs/githooks
By default, `/etc/eden/hooks` should be the place where Eden looks for
hooks; however, this can be overridden in `~/.edenrc` on a per-`repository` basis.
This directory should be installed as part of installing Eden.
There is information in `eden/hooks/README.md` about this.
The first hook that is supported is for post-clone logic for a repository.
This change demonstrates the need for an `eden config --get <value>`
analogous to what Git has, as hooks should be able to leverage this in their
own scripts. This revision introduces a `TODO` in `post-clone.py` where such a
feature would be useful, so that I could add the following to my `~/.edenrc`
to develop the Eden extension for Hg:
```
[hooks]
hg.edenextension = /data/users/mbolin/fbsource/fbcode/eden/hg/eden
[repository fbsource]
path = /data/users/mbolin/fbsource
type = hg
hooks = /data/users/mbolin/eden-hooks
```
Note that this revision also introduces a `generate-hooks-dir` script that can be
used to generate the standard `/etc/eden/hooks` directory that we intend to
distribute with Eden. This is also useful in creating the basis for a custom `hooks`
directory that can be specified as shown above in an `~/.edenrc` file.
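The lookup order described above (a per-repository `hooks` setting if present, otherwise `/etc/eden/hooks`) could be sketched like this. `hooks_dir_for` is a hypothetical helper, and parsing the file with Python's `configparser` is an assumption about the format, not a description of the actual implementation.

```python
import configparser

DEFAULT_HOOKS_DIR = "/etc/eden/hooks"


def hooks_dir_for(config_text, repo_name):
    """Return the hooks directory for a repository, falling back to the default."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "repository " + repo_name
    # `fallback` covers both a missing section and a missing `hooks` key.
    return parser.get(section, "hooks", fallback=DEFAULT_HOOKS_DIR)
```

For the sample `~/.edenrc` above, this would resolve `fbsource` to `/data/users/mbolin/eden-hooks` and any repository without a `hooks` key to the default directory.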
Reviewed By: simpkins
Differential Revision: D3858635
fbshipit-source-id: 215ca26379a4b3b0a07d50845fd645b4d9ccf0f2
Summary:
simpkins spotted this; we were passing the wrong path down to the overlay saving dir.
This adds a test to prove that the source and destination directory contents
are correct both immediately after performing the rename and after remounting,
where we just read the serialized data.
Reviewed By: simpkins
Differential Revision: D3888694
fbshipit-source-id: 7f5fb5be417db5c693ac8a07b85abbffdbfe0fff
Summary:
This is pretty straightforward; we just walk back until we hit the
boundary with the requested JournalPosition.sequenceNumber.
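The walk can be sketched as below. This is a simplified Python model, not the C++ code: the `previous` link, the field names, and the choice to treat the requested sequence number as an exclusive boundary are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class JournalDelta:
    sequence: int
    changed_paths: Set[str]
    previous: Optional["JournalDelta"] = None


def changes_since(latest, since_sequence):
    """Collect every path touched by deltas newer than `since_sequence`."""
    changed = set()
    delta = latest
    # Walk from the newest delta back toward the oldest, stopping at the boundary.
    while delta is not None and delta.sequence > since_sequence:
        changed |= delta.changed_paths
        delta = delta.previous
    return changed
```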
Reviewed By: simpkins
Differential Revision: D3872970
fbshipit-source-id: 1405f05957346d7ac513070f0407a477548aff1d
Summary:
Populate the position from the latest journal delta.
To facilitate this, we also define the mountGeneration value to be a
combination of the pid and the time at which we created the EdenMount object,
as well as a global counter that we bump for each mount.
The precise value and meaning of these bits really don't matter, just that we
are unlikely to pick the same value for the same mountPoint path again if we
were to remount in the future.
Since we are now in a position to report JournalPosition values to clients, now
is also a good time to fill out the `currentPosition` field for the
`getMaterializedEntries` thrift call, and to check that this value is
consistent with the value we return via `getCurrentJournalPosition`.
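One way to combine those three ingredients, purely as an illustration: the bit layout below is invented for this sketch (the actual layout does not matter, as noted above), and `make_mount_generation` is a hypothetical name.

```python
import itertools
import os
import time

# Process-wide counter, bumped once per mount.
_mount_counter = itertools.count()


def make_mount_generation(pid=None, now=None):
    """Build a 64-bit value that is unlikely to repeat across remounts."""
    pid = os.getpid() if pid is None else pid
    now = int(time.time()) if now is None else now
    count = next(_mount_counter)
    # Hypothetical layout: 16 bits of pid, 32 bits of time, 16 bits of counter.
    return ((pid & 0xFFFF) << 48) | ((now & 0xFFFFFFFF) << 16) | (count & 0xFFFF)
```

The counter is what keeps two mounts created by the same process within the same second from colliding.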
Reviewed By: simpkins
Differential Revision: D3872952
fbshipit-source-id: 2fbc25d2e9711035b66ab1bf5d746507b72de265
Summary:
This just populates the initial snapshot hash in the journal.
The `addDelta` method will propagate this into subsequent deltas whenever the
delta being added still has the default 0-filled hash values, i.e. its hashes
have not been set explicitly.
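Roughly, in Python terms (a sketch under assumptions: the 40-character zero hash, the `previous` link, and the idea of seeding the journal with an initial delta are stand-ins, not the actual C++ structures):

```python
from dataclasses import dataclass
from typing import Optional

ZERO_HASH = "0" * 40


@dataclass
class JournalDelta:
    from_hash: str = ZERO_HASH
    to_hash: str = ZERO_HASH
    previous: Optional["JournalDelta"] = None


class Journal:
    def __init__(self, initial_hash):
        # Seed the journal with the snapshot hash we mounted.
        self.latest = JournalDelta(from_hash=initial_hash, to_hash=initial_hash)

    def add_delta(self, delta):
        # A delta whose hashes were never set inherits the current snapshot.
        if delta.from_hash == ZERO_HASH:
            delta.from_hash = self.latest.to_hash
        if delta.to_hash == ZERO_HASH:
            delta.to_hash = self.latest.to_hash
        delta.previous = self.latest
        self.latest = delta
```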
Reviewed By: simpkins
Differential Revision: D3872936
fbshipit-source-id: d0014ded40488a2be04d5a381e1d9815c7f0a638
Summary:
This diff adds a couple more things to our thrift interface:
1. Introduces JournalPosition
2. Adds methods to query the current JournalPosition and obtain a
delta since a given JournalPosition
3. Augments getMaterializedFiles to also return the current JournalPosition
4. Adds a method to evaluate a `glob` against Eden
5. Adds a method using thrift streaming to subscribe to realtime changes
Could probably finesse the naming a little bit.
The JournalPosition allows reasoning about changes to files that are not part
of an Eden snapshot. Internally the journal position is just the
SequenceNumber from the journal data structures, but when we expose it to
clients we need to be able to distinguish between a sequence number from the
current instance of the eden service and one from a prior incarnation. For
example, if the process has been restarted and we have no way to recreate the
journal, we need to be able to indicate this to the client if they ask about
changes in that range. For the convenience of the client we also include the
`toHash` (the most recent hash from the journal entry), which is likely useful
for the `hg` dirstate operations: it tells the client that the snapshot may
have changed since the last query about the dirstate.
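To make the "prior incarnation" check concrete, here is a hedged sketch. The field names (`mount_generation`, `sequence_number`, `snapshot_hash`) and the `can_compute_delta` helper are assumptions for illustration and may not match the thrift definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalPosition:
    mount_generation: int   # identifies one incarnation of the mount
    sequence_number: int    # the journal's internal sequence number
    snapshot_hash: str      # the `toHash` of the most recent journal entry


def can_compute_delta(current, since):
    """True if `since` refers to a range the current journal still covers."""
    # A position from a different incarnation cannot be compared: the
    # in-memory journal it referred to no longer exists.
    return (
        since.mount_generation == current.mount_generation
        and since.sequence_number <= current.sequence_number
    )
```

A client that gets a negative answer here would have to fall back to a full rescan instead of asking for an incremental delta.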
The `getFileInformation` method returns the instantaneously available
`stat()`-like information about the requested list of files. Since we simply don't
have historical data on how files in the overlay looked (only how they look
now), this method does not allow passing in a JournalPosition. When it comes
to comparing historical data, we will need to add an API that accepts two
snapshot hashes and generates the results from there. This particular method
is geared toward understanding the current state of the world; the obvious use
case is plugging in the file list from `getFilesChangedSince` into this
function to figure out what's what.
* Do we want a function that combines `getFilesChangedSince` + `getFileInformation` into a single RPC?
Why is there a glob method? It's to support a use-case in the watchman/buck
integration. I'm just sketching it out in the thrift interface at this stage.
In the future we also need to be able to express how to carry out a tree walk,
but that will require some query predicates that I don't want to get hung up on
specifying immediately.
Why is the streaming stuff in its own thrift file? We can't generate code for
it in Java, and perhaps not in Python either. It's only needed to plumb data into
watchman so it's broken out into its own definition. Nothing depends on that
file yet, so it's probably not specified quite right. The important thing is
how the subscribe method looks: it's essentially the same as the method to
query a delta, but it keeps emitting deltas as they are produced. This is
another API that will benefit from query predicates when we get around to
specifying them.
I've added `JournalDelta::fromHash` and `JournalDelta::toHash` to hold the
appropriate snapshot ids in the journal entry; this will allow us to indicate
when we've checked out a new snapshot, or created a new snapshot. We have
no way to populate these yet; I commented on D3762646 about storing the
`snapshotID` that we have during `EdenServiceHandler::mountImpl` into either
the `EdenMount` or the proposed `RootInode` class. Once we have that we
can simply sample it and store it as we generate `JournalDelta`s.
Reviewed By: simpkins
Differential Revision: D3860804
fbshipit-source-id: 896c24c354e6f58328fb45c24b16915d9e937108
Summary:
This is pretty simplistic: we just wlock and add a delta for the set
of file(s) that were changed in a given fuse operation (this is typically 1
file, but rename affects 2).
To reduce boilerplate very slightly, I've added an initializer_list constructor
for JournalDelta that makes it less cumbersome to create a JournalDelta for a
list of files.
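A Python analogue of the pattern, for illustration only: `threading.Lock` stands in for the write lock, variadic arguments stand in for the `initializer_list` constructor, and the file names are made up.

```python
import threading


class JournalDelta:
    def __init__(self, *paths):
        # Variadic arguments play the role of the initializer_list constructor.
        self.changed_paths = set(paths)


class Journal:
    def __init__(self):
        self._lock = threading.Lock()   # exclusive lock; no reader/writer split here
        self._deltas = []

    def add_delta(self, delta):
        with self._lock:
            self._deltas.append(delta)

    def latest(self):
        with self._lock:
            return self._deltas[-1] if self._deltas else None


journal = Journal()
journal.add_delta(JournalDelta("docs/readme.txt"))              # e.g. a write: one file
journal.add_delta(JournalDelta("src/old.cpp", "src/new.cpp"))   # e.g. a rename: two files
```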
Reviewed By: simpkins
Differential Revision: D3866053
fbshipit-source-id: cd918e2c98c022d5ef79430cd8ab4aef88875239
Summary:
This implements a pretty simple change Journal and associated
JournalDelta.
The Journal is intended to be held in memory and not persisted to disk.
The idea is that we'll hold a `Synchronized<Journal>` along with the
other mount data and grab a `wlock` on it each time we want to add
a change record.
This diff doesn't change any other existing functionality.
Reviewed By: simpkins
Differential Revision: D3660162
fbshipit-source-id: a6b6fa28dd12e4d34718956167ee87f8cb2d89ca
Summary:
Adds a thrift call that returns the list of materialized entries from the whole tree.
This is intended to be plugged into the mercurial dirstate extension.
Reviewed By: simpkins
Differential Revision: D3851805
fbshipit-source-id: 8429fdb4eeccc32928e8abc154d4e6fd49343556
Summary:
Was chatting with simpkins the other day and he mentioned that our
instrumented hg wrappers are quite CPU intensive. This diff switches us to
running the underlying `hg.real` or `git.real` in our integration tests when we
find them in the path.
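The selection logic amounts to something like the following sketch (`pick_binary` is a hypothetical name; the `which` parameter exists only so the lookup can be faked in a test):

```python
import shutil


def pick_binary(name, which=shutil.which):
    """Prefer `<name>.real` when it is on PATH; otherwise use the plain name."""
    real = which(name + ".real")
    return real if real else name
```

So `pick_binary("hg")` yields the full path to `hg.real` on machines that have the wrapper setup, and plain `hg` everywhere else.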
Reviewed By: bolinfest
Differential Revision: D3865996
fbshipit-source-id: d047749356f0c1c0662774e25801f3578f9f9243
Summary:
hg_import_helper.py imports mercurial python modules, which are GPLv2+, so this
code also needs to be licensed under GPLv2+, rather than the BSD-style license
used for the bulk of the eden code base.
Reviewed By: wez
Differential Revision: D3833508
fbshipit-source-id: eb2a8969a5a88c12444a3778875609f24e145e6b
Summary:
Annoying that gcc and clang behave differently here. The compilation
error is due to gcc not seeing the implicit this pointer for some of these
method calls, so we need to explicitly use it.
Reviewed By: simpkins
Differential Revision: D3846973
fbshipit-source-id: 3d5b8b8b8c9bbab1e7935cff0e65677f76d116fb
Summary:
Buck needs this API so that it knows which paths under a project
root it should exclude when deciding whether it can ask Eden for its
SHA-1 or if it must compute it on its own.
Reviewed By: simpkins
Differential Revision: D3840658
fbshipit-source-id: 5eddc0bef423d3b3ee165d2a4b0bbf193f94f61a
Summary:
We now serialize the overlay data for each directory independently.
When we mount, we try to load the root overlay data. The children are lazy
loaded as the inodes are instantiated.
Structural changes cause the overlay data for the impacted dirs to get saved out.
I need to make a pass over this to fix up comments and so on; I just wanted to get this diff out first.
I moved the overlay stuff from `eden/fs/overlay` -> `eden/fs/inodes` since most
of the overlay-ness is handled in `TreeInode` now; the `Overlay` class is
really just for carrying around the paths and providing the serialization
helpers.
Reviewed By: simpkins
Differential Revision: D3787108
fbshipit-source-id: f0e089a829defd953535b9d0a96b102ac729261b
Summary:
It was starting to get pretty complex to manage locking across the
inodes, filedata, overlay and soon the journal, so as a simplifying step, this
folds data that was tracked by the overlay into the TreeInode itself.
This is the first diff in a short series for this. This one:
1. Breaks the persistent overlay information, so shutting down eden and
bringing it back up will lose your changes (to be restored in the
following diff)
2. Allows deferring materialization of file data in more cases
3. Allows renaming dirs.
The approach here is now to keep just one source of information about the
directory contents; when we construct a TreeInode we import this data from the
Tree and then apply mutations to it locally.
Each inode can be mutated independently from others; we only need to lock the 1,
2 or 3 participating inodes in the various mutation operations.
I'll tackle persistence of the mutations in the following diff, but the high
level plan for that (to help understand this diff) is to always keep the
directory inodes for mutations alive as inode objects. We make use of the
canForget functionality introduced by D3774269 to ensure that these don't
get evicted early. On startup we'll load this information from the overlay
area.
This model simplifies some of the processing around reading dirs and looking up
children.
Since the overlay data now tracks the appropriate tree or content hash
we can be much more lazy at materializing data, especially in the rename
case. For example, renaming "fbcode" to "fbcod" doesn't require us to
recursively materialize the "fbcode" tree.
Depends on D3653706
Reviewed By: simpkins
Differential Revision: D3657894
fbshipit-source-id: d4561639845ca93b93487dc84bf11ad795927b1f
Summary:
We can't allow ~EdenServer to delete the memory until we're sure that
the other threads are done. To ensure that, we need to notify the condition
variable while the aux thread still holds the lock. This makes sure that the
thread destroying the EdenServer waits for the aux thread to release the lock
before we check the predicate and proceed to deleting the memory.
```
SUMMARY ThreadSanitizer: data race /
/common/concurrency/Event.cpp:107 in facebook::common::concurrency::Event::set() const
==================
I0909 14:51:18.543072 4147554 main.cpp:173] edenfs performing orderly shutdown
I0909 14:51:18.555794 4148654 Channel.cpp:177] session completed
I0909 14:51:18.556011 4148654 EdenServer.cpp:192] mount point "/tmp/eden_test.0ostuc90/mounts/main" stopped
==================
WARNING: ThreadSanitizer: data race (pid=4147554)
Write of size 8 at 0x7fff9e182d90 by main thread:
#0 pthread_cond_destroy <null> (edenfs+0x00000007671a)
#1 facebook::eden::EdenServer::~EdenServer() /
/eden/fs/service/EdenServer.cpp:93 (libeden_fs_service_server.so+0x0000000b96cd)
#2 main /
/eden/fs/service/main.cpp:176 (edenfs+0x000000018515)
Previous read of size 8 at 0x7fff9e182d90 by thread T73:
#0 pthread_cond_broadcast <null> (edenfs+0x0000000765b7)
#1 __gthread_cond_broadcast /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/include/x86_64-facebook-linux/bits/gthr-default.h:852 (libstdc++.so.6+0x0000000e14f8)
#2 std::condition_variable::notify_all() /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/condition_variable.cc:72 (libstdc++.so.6+0x0000000e14f8)
#3 facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0::operator()() const /
/
/eden/fs/service/EdenServer.cpp:145 (libeden_fs_service_server.so+0x0000000bcdb5)
#4 std::_Function_handler<void (), facebook::eden::EdenServer::mount(std::shared_ptr<facebook::eden::EdenMount>, std::unique_ptr<facebook::eden::ClientConfig, std::default_delete<facebook::eden::ClientConfig> >)::$_0>::_M_invoke(std::_Any_data const&) /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2039 (libeden_fs_service_server.so+0x0000000bcab0)
#5 std::function<void ()>::operator()() const /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:2439 (libeden_fuse_fusell.so+0x00000020fbb9)
#6 facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0::operator()() const /
/eden/fuse/MountPoint.cpp:69 (libeden_fuse_fusell.so+0x000000237447)
#7 void std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::_M_invoke<>(std::_Index_tuple<>) /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1699 (libeden_fuse_fusell.so+0x000000237048)
#8 std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()>::operator()() /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/functional:1688 (libeden_fuse_fusell.so+0x000000236ff8)
#9 std::thread::_Impl<std::_Bind_simple<facebook::eden::fusell::MountPoint::start(bool, std::function<void ()> const&)::$_0 ()> >::_M_run() /
/third-party-buck/gcc-4.9-glibc-2.20-fb/build/libgcc/include/c++/trunk/thread:115 (libeden_fuse_fusell.so+0x000000236d8c)
#10 execute_native_thread_routine /home/engshare/third-party2/libgcc/4.9.x/src/gcc-4_9/x86_64-facebook-linux/libstdc++-v3/src/c++11/../../../.././libstdc++-v3/src/c++11/thread.cc:84 (libstdc++.so.6+0x0000000e6ec0)
```
Reviewed By: simpkins
Differential Revision: D3844846
fbshipit-source-id: 545474bc1aff8621dbeb487dcd6b54c82828ff3b
Summary:
This did not work because the dict-like object read from
a `ConfigParser` was not JSON-serializable by Python.
I had to add some methods to the `Repository` that we use in our
integration test harness in order to verify everything I wanted to
in my new integration test. I implemented these methods in both
`HgRepository` and `GitRepository`.
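The serialization problem can be reproduced with Python 3's `configparser`, which may differ in detail from the module version this diff touched. The sketch below shows the general workaround of copying sections into plain dicts first; `config_to_json` is a hypothetical helper, not the function changed here.

```python
import configparser
import json


def config_to_json(config_text):
    """Serialize a parsed config by first copying it into plain dicts."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Section objects are mappings but not real dicts, so json.dumps() rejects
    # them. Copy each section into a dict before serializing.
    plain = {section: dict(parser.items(section)) for section in parser.sections()}
    return json.dumps(plain, sort_keys=True)
```

Passing `parser["some section"]` straight to `json.dumps()` raises `TypeError` in Python 3, which is the failure mode described above.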
Reviewed By: simpkins
Differential Revision: D3837879
fbshipit-source-id: e0bfb5f1bd3add192ef9bdf561591ac8e52bc002
Summary: I did what the linter told me to do.
Reviewed By: wez
Differential Revision: D3836659
fbshipit-source-id: a5d3fc8974cf6cb7c7e2d88a6215ac5c54479780
Summary:
This is useful for inodes that maintain the source of truth about
something and that must not be deleted from our tracking data structures.
Reviewed By: simpkins
Differential Revision: D3774269
fbshipit-source-id: 623c77769cbb30c35b413a443d0413e57d6c17b3
Summary: Open a file in a subdirectory in append mode and update it.
Reviewed By: wez
Differential Revision: D3802815
fbshipit-source-id: fe1726529d017345dfee530ae5ec84cfb7531602
Summary:
test_utime() seems to fail a reasonable amount of the time in my experience:
the atime returned by lstat() occasionally ends up being 1 or 2 milliseconds
earlier than the saved timestamp. With the existing behavior python and eden
are independently computing the current time, although I haven't investigated
why eden sometimes gets a timestamp slightly earlier than python did.
This fixes the test by asking eden to set an explicit timestamp value, rather
than letting eden independently get the current timestamp.
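The shape of the fix, as a standalone sketch rather than the actual test code (`set_and_read_times` and `demo` are hypothetical names, and the timestamp is arbitrary):

```python
import os
import tempfile


def set_and_read_times(path, timestamp):
    """Set atime and mtime to an explicit value, then read them back."""
    # Passing an explicit (atime, mtime) tuple means only one clock is
    # involved: the value we chose. Passing None would make the filesystem
    # pick "now" on its own, which is what made the comparison racy.
    os.utime(path, (timestamp, timestamp))
    st = os.lstat(path)
    return st.st_atime, st.st_mtime


def demo():
    with tempfile.NamedTemporaryFile() as f:
        return set_and_read_times(f.name, 1000000000)
```

Because the test now compares against a value it supplied, it no longer depends on two processes agreeing about the current time.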
Reviewed By: wez
Differential Revision: D3802100
fbshipit-source-id: fecfd1d68b8c95d119099ef39f17004be13d3dff
Summary:
`buck-out` now has an extra level of depth due to its new CPU-specific directories,
so we have to change the place we look for `hg_import_helper.py` in development.
Reviewed By: simpkins
Differential Revision: D3807012
fbshipit-source-id: 24d1fc1fa22f3003580f59cfdd46b9b3917f6eeb
Summary: As noted in the comments, these dependencies were making this harder to use in Buck.
Reviewed By: simpkins
Differential Revision: D3805861
fbshipit-source-id: e04898d0e1a3ccc5e38a9629b1d30791853224a5
Summary:
This codemods `TARGETS` under `[a-d]*` directories in fbcode to make
the `headers` parameter explicitly refer to `AutoHeaders.RECURSIVE_GLOB`.
Reviewed By: yfeldblum
Differential Revision: D3801845
fbshipit-source-id: 715c753b6d4ca3a9779db1ff0a0e6632c56c0655
Summary:
Update DirList::add() to accept entry names as a folly::StringPiece instead of
just a null-terminated string. This makes it possible to use PathComponent
objects without having to make a copy of the name just to ensure that it is
null terminated.
The libfuse APIs unfortunately only accept null terminated strings, so we now
simply manually populate the fuse_dirent structs in our buffer rather than
using the libfuse helper methods.
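For readers unfamiliar with the record format, here is a Python sketch of populating one entry by hand. It assumes the Linux `struct fuse_dirent` layout as I understand it from the kernel header (64-bit `ino`, 64-bit `off`, 32-bit `namelen`, 32-bit `type`, then the name bytes, padded to an 8-byte boundary). `pack_dirent` is a hypothetical helper and not the C++ code in this diff.

```python
import struct

# ino (u64), off (u64), namelen (u32), type (u32); native byte order, no padding.
FUSE_DIRENT_HEADER = struct.Struct("=QQII")


def pack_dirent(ino, off, dtype, name):
    """Pack one directory entry; `name` is bytes and needs no trailing NUL."""
    header = FUSE_DIRENT_HEADER.pack(ino, off, len(name), dtype)
    unpadded = len(header) + len(name)
    padded = (unpadded + 7) & ~7          # round up to a multiple of 8
    return header + name + b"\0" * (padded - unpadded)
```

Since the record carries an explicit `namelen`, the name can come from any length-delimited slice, which is what makes a `StringPiece`-style argument workable without a copy.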
Reviewed By: wez
Differential Revision: D3762623
fbshipit-source-id: d4132b354912e0e003090bddcad0ce912f4ed401