Commit Graph

96 Commits

Author SHA1 Message Date
Chad Austin
b99d81654a introduce an EdenTimestamp type
Summary:
While we're changing how timestamps are stored, we might as well
implement this small-ish efficiency.  Instead of storing timestamps as
a timespec (16 bytes), store them as 64-bit nanoseconds with a range
slightly larger than what ext4 supports.  Assuming a million inodes,
this saves 24 MB.

This diff introduces the EdenTimestamp type.  The next diff will start
using it.
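
A rough sketch of the arithmetic behind the 24 MB figure, assuming three timestamps (atime, mtime, ctime) per inode and a plain 64-bit nanosecond count. The helper names are hypothetical, and the real EdenTimestamp may use a different epoch or range.

```python
import struct
from typing import Tuple

NSEC_PER_SEC = 1_000_000_000


def timespec_to_ns(tv_sec: int, tv_nsec: int) -> int:
    """Collapse a (seconds, nanoseconds) pair into one nanosecond count."""
    return tv_sec * NSEC_PER_SEC + tv_nsec


def ns_to_timespec(ns: int) -> Tuple[int, int]:
    """Split a nanosecond count back into (seconds, nanoseconds)."""
    return divmod(ns, NSEC_PER_SEC)


# A timespec on 64-bit Linux is two 8-byte fields; the compact form is one.
TIMESPEC_BYTES = struct.calcsize("=qq")
COMPACT_BYTES = struct.calcsize("=q")


def bytes_saved(inodes: int, timestamps_per_inode: int = 3) -> int:
    """Memory saved by the compact form (timestamps_per_inode is an assumption)."""
    return inodes * timestamps_per_inode * (TIMESPEC_BYTES - COMPACT_BYTES)
```

With these assumptions, `bytes_saved(1_000_000)` gives 24,000,000 bytes, matching the estimate above.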

Reviewed By: simpkins

Differential Revision: D6957659

fbshipit-source-id: 4551af6f5b8c1ff610ba88795f69e7d69d7f605d
2018-02-15 16:31:42 -08:00
Chad Austin
895a0196d9 compute permission bits based on FileType in TreeEntry
Summary:
Our Model TreeEntry code was a bit too general - in reality, both git
and hg only support a handful of specific tree entries: regular files,
executable files, symlinks, and trees.  (git also supports
submodules.)  This diff delays the expansion of a TreeEntry's type
into a full mode_t.
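
A minimal sketch of the idea, not Eden's actual code: keep only a small entry type and expand it to a full mode on demand. The enum names and the specific permission bits below are assumptions chosen for illustration.

```python
import enum
import stat


class TreeEntryType(enum.Enum):
    # Hypothetical names for the four kinds of entry listed above.
    REGULAR_FILE = "regular"
    EXECUTABLE_FILE = "executable"
    SYMLINK = "symlink"
    TREE = "tree"


def mode_from_type(entry_type: TreeEntryType) -> int:
    """Expand a compact entry type into a full mode_t-style value."""
    if entry_type is TreeEntryType.TREE:
        return stat.S_IFDIR | 0o755
    if entry_type is TreeEntryType.SYMLINK:
        return stat.S_IFLNK | 0o777
    if entry_type is TreeEntryType.EXECUTABLE_FILE:
        return stat.S_IFREG | 0o755
    return stat.S_IFREG | 0o644
```

Storing the type rather than the mode keeps each entry small and makes unsupported combinations unrepresentable.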

Reviewed By: simpkins

Differential Revision: D6980003

fbshipit-source-id: 73729208000668078a180b728d7e0bb9169c6f3c
2018-02-15 14:46:33 -08:00
Adam Simpkins
58fa81ebf2 report a reasonable value in stat.st_blocks for files
Summary:
Update FileInode so that getattr() and setattr() both return a reasonable value
in st_blocks.

Previously we always returned 0 in st_blocks, which caused applications like
`du` to always report files as using 0 space in Eden mounts.  Now we compute
st_blocks based on st_size, so that `du` will report reasonable estimates
when scanning the size of subdirectories inside an Eden mount.
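
One plausible way to derive the value, assuming the Linux convention that st_blocks counts 512-byte units and that sizes are rounded up. The function name is hypothetical and the exact rounding FileInode uses is not stated here.

```python
ST_BLOCK_SIZE = 512  # st_blocks is reported in 512-byte units on Linux


def blocks_for_size(st_size: int) -> int:
    """Estimate st_blocks from st_size by rounding up to whole 512-byte units."""
    return (st_size + ST_BLOCK_SIZE - 1) // ST_BLOCK_SIZE
```

An empty file still reports 0 blocks under this scheme, while any non-empty file reports at least 1.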

Reviewed By: chadaustin

Differential Revision: D6932098

fbshipit-source-id: bd29e46821176e510f420e6e2b6ce480b80d50ff
2018-02-08 19:36:03 -08:00
Chad Austin
20f7a10bfd split InodeTimestamps into its own file
Summary:
While working on timestamp storage, the fact that
InodeTimestamps was a member of InodeBase kept getting in the way.
Make it its own type.

Reviewed By: simpkins

Differential Revision: D6862835

fbshipit-source-id: 91d8984764f0586b9fa52e961eb5606a530e0416
2018-02-01 12:34:15 -08:00
Chad Austin
697eb8a6fd run clang-format across eden
Summary:
```
find . \( -iname '*.cpp' -o -iname '*.h' \) -exec arc lint --apply-patches {} +
```

Differential Revision: D6820436

fbshipit-source-id: 173c0e3b5c023c1c9276f34e17d732f1dd161892
2018-01-26 11:20:31 -08:00
Wez Furlong
014789b4ca open file handles now survive graceful restart
Summary:
I'm so-so on a bit of the implementation here, but it works!

I had to change the `takeoverPromise` from the `pair<fuseDevice, connInfo>`
to a new helper struct because we now have three distinct pieces of data
to pass out of EdenMount to build up the overall TakeoverData.

The key change in this diff is that we have to release all of the file handles
we're maintaining in the `FileHandleMap` prior to shutting down the `InodeMap`,
otherwise the `InodeMap` will never complete (it waits for all inodes to be
unreferenced, and that cannot happen while there are open file handles).  I've
made the `FileHandleMap` serialization and clearing contingent on performing a
takeover shutdown because that feels like the safest thing to do wrt. not
losing any pending writes.

Reviewed By: simpkins

Differential Revision: D6672437

fbshipit-source-id: 7b1f0f8e7ff09dbed850c7737383ecdf1e5ff0c7
2018-01-09 22:23:11 -08:00
Andrew Gallagher
ebab5e09b9 Run autodeps on stale opt-in TARGETS
Reviewed By: luciang, meyering

Differential Revision: D6685618

fbshipit-source-id: 9e0e01a8cbc47f00225fd8445dcc4c35b4e43ffb
2018-01-09 15:12:08 -08:00
Sergey Zhupanov
0d6c3a31cd Changed TreeInode::loadGitIgnoreThenDiff() to properly handle symlinks to gitignore.
Summary: Added TreeInode::loadGitIgnoreThenDiff() symlink handling with tests.

Reviewed By: simpkins

Differential Revision: D6659654

fbshipit-source-id: 293c913ea56b5b770a051efd78e1e57497c360bd
2018-01-05 15:51:04 -08:00
Sergey Zhupanov
a3ef94a011 Added EdenMount::resolveSymlink(InodePtr pInode) with tests.
Summary: EdenMount::resolveSymlink(InodePtr pInode) resolves a symlink to an InodePtr.

Reviewed By: simpkins

Differential Revision: D6659644

fbshipit-source-id: 8be9ca06b08bf9730ff961e55c3ee747d5f45707
2018-01-05 15:35:42 -08:00
Chad Austin
0f0b0b6b4d remove unnecessary code and a faulty assertion
Summary:
It is no longer correct to assert that state->file is set if O_TRUNC happened
before blob import from hg finished.  It surprises me we never saw a crash
because of that.  Also, the O_TRUNC path after blob import finishes can never
complete a future, so don't try.

Reviewed By: wez

Differential Revision: D6656699

fbshipit-source-id: 5e245fc46185714e5f5d81c2680835a3497747ff
2018-01-03 17:38:48 -08:00
Wez Furlong
6ff492d11c remove dep on libfuse
Summary:
This serves a few purposes:

1. We can avoid some conditional code inside eden if we know that
   we have a specific fuse_kernel.h header implementation.
2. We don't have to figure out a way to propagate the kernel
   capabilities through the graceful restart process.
3. libfuse3 removed the channel/session hooks that we've been
   using thus far to interject ourselves for mounting and
   graceful restarting, so we were already effectively the
   walking dead here.
4. We're now able to take advantage of the latest aspects of
   the fuse kernel interface without being tied to the implementation
   of libfuse2 or libfuse3.  We're interested in the readdirplus
   functionality and will look at enabling that in a future diff.

This may make some things slightly harder for the more immediate
macOS port but I believe that we're in a much better place overall.

This diff is relatively mechanical and sadly is (unavoidably) large.

The main aspects of this diff are:

1. The `fuse_ino_t` type was provided by libfuse so we needed to
   replace it with our own definition.  This has decent penetration
   throughout the codebase.
2. The confusing `fuse_file_info` type that was multi-purpose and
   had fields that were sometimes *in* parameters and sometimes *out*
   parameters has been removed and replaced with a simpler *flags*
   parameter that corresponds to the `open(2)` flags parameter.
   The *out* portions are subsumed by existing file handle metadata
   methods.
3. The fuse parameters returned from variations of the `LOOKUP` opcode
   now return the fuse kernel type for this directly.  I suspect
   that we may need to introduce a compatibility type when we revisit
   the macOS port, but this at least makes this diff slightly simpler.
   You'll notice that some field and symbol name prefixes vary as
   a result of this.
4. Similarly for `setattr`, libfuse separated the kernel data into
   two parameters that were a little awkward to use; we're now just
   passing the kernel data through and this, IMO, makes the interface
   slightly more understandable.
5. The bulk of the code from `Dispatcher.cpp` that shimmed the
   libfuse callbacks into the C++ virtual methods has been removed
   and replaced by a `switch` statement based dispatcher in
   `FuseChannel`.   I'm not married to this being `switch` based
   and may revise this to be driven by an `unordered_map` of
   opcode -> dispatcher method defined in `FuseChannel`.  Regardless,
   `Dispatcher.cpp` is now much slimmer and should be easier to
   replace by rolling it together into `EdenDispatcher.cpp` in
   the future should we desire to do so.
6. This diff disables dispatching `poll` and `ioctl` calls.  We
   didn't make use of them and their interfaces are a bit fiddly.
7. `INTERRUPT` is also disabled here.  I will re-enable it in
   a follow-up diff where I can also revise how we track outstanding
   requests for graceful shutdown.
8. I've imported `fuse_kernel.h` from libfuse.  This is included
   under the permissive 2-clause BSD license that it allows for
   exactly this integration purpose.
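
The map-driven alternative floated in item 5 could look roughly like the sketch below. It is written in Python for brevity rather than C++, the handler class and return values are hypothetical, and only two opcodes are shown (their numeric values follow `fuse_kernel.h`).

```python
import errno

# Opcode values as defined in fuse_kernel.h.
FUSE_LOOKUP = 1
FUSE_GETATTR = 3


class ExampleDispatcher:
    # Hypothetical stand-in for the C++ virtual methods.
    def lookup(self, request):
        return ("lookup", request)

    def getattr(self, request):
        return ("getattr", request)


# Table of opcode -> handler method, replacing a large switch statement.
HANDLERS = {
    FUSE_LOOKUP: ExampleDispatcher.lookup,
    FUSE_GETATTR: ExampleDispatcher.getattr,
}


def dispatch(dispatcher, opcode, request):
    """Route a request by opcode; unknown opcodes get ENOSYS."""
    handler = HANDLERS.get(opcode)
    if handler is None:
        return ("error", errno.ENOSYS)
    return handler(dispatcher, request)
```

Opcodes that are deliberately disabled, such as `poll` and `ioctl` in item 6, would simply be left out of the table.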

Reviewed By: simpkins

Differential Revision: D6576472

fbshipit-source-id: 7cb088af5e06fe27bf22a1bed295c18c17d8006c
2018-01-02 16:36:16 -08:00
Philip Jameson
8604b8f5b0 Migrate TARGETS files from @/ to //
Summary:
This is a codemod to change from using @/ to // in basic cases.
- TARGETS files with lines starting with @/ (but excluding @/third-party:)
- autodeps lines in source and TARGETS files ( (dep|manual)=@/ ), excluding @/third-party
- Targets in string macros

The only thing left of the old format should be @/third-party:foo:bar

drop-conflicts

Reviewed By: ttsugriy

Differential Revision: D6605465

fbshipit-source-id: ae50de2e1edb3f97c0b839d4021f38d77b7ab64c
2017-12-20 16:57:41 -08:00
Sergey Zhupanov
58b472b9d0 Added type identification capability to InodeBase.
Summary: Added type identification capability to InodeBase and its descendants FileInode and TreeInode.

Reviewed By: simpkins

Differential Revision: D6564902

fbshipit-source-id: ce9300102d6d6d1c42616eb1e32042f21f6e6cce
2017-12-14 16:41:39 -08:00
Sergey Zhupanov
b6394ac357 Added user and general system level gitignore
Summary:
Added to Eden capability to incorporate default user and general system level gitignore files.
NOTE: Work in progress, sending the review out to calibrate/ensure I am on right track.

Reviewed By: simpkins

Differential Revision: D6482863

fbshipit-source-id: 9834ca1a577a9599a1f8cb2243dca4e714866be8
2017-12-08 12:52:51 -08:00
Chad Austin
d112d63870 remove all direct calls to clock_gettime and system_clock::now
Summary:
Follow-up to comments in D6466209.  All access to the clock goes
through the Clock interface, making time deterministic in unit tests.
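
A minimal illustration of the pattern, written in Python rather than Eden's C++. The class and method names are hypothetical; the point is only that code under test asks an injected clock for the time instead of calling the system clock directly.

```python
import time


class Clock:
    """Interface: everything that needs the time asks a Clock."""

    def now_ns(self) -> int:
        raise NotImplementedError


class SystemClock(Clock):
    def now_ns(self) -> int:
        return time.time_ns()


class FakeClock(Clock):
    """Deterministic clock for unit tests; time moves only when told to."""

    def __init__(self, start_ns: int = 0) -> None:
        self._now = start_ns

    def now_ns(self) -> int:
        return self._now

    def advance(self, delta_ns: int) -> None:
        self._now += delta_ns


def new_file_timestamps(clock: Clock) -> dict:
    """Stamp a newly created file using the injected clock."""
    now = clock.now_ns()
    return {"atime": now, "mtime": now, "ctime": now}
```

A test can construct a `FakeClock`, advance it by a known amount, and assert exact timestamp values without sleeping.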

Reviewed By: simpkins

Differential Revision: D6477973

fbshipit-source-id: 24e51bdb52d0d079b34d91598d2e787d361f2525
2017-12-05 10:06:50 -08:00
Chad Austin
70f1bba3d6 test that asserts a new file inode's timestamps match the creation time, not the last checkout time
Summary: Follow-up from D6366189. First use of the new FakeClock!

Reviewed By: simpkins

Differential Revision: D6466209

fbshipit-source-id: 4d4d8a9a83df2bee11149e7a0cbddaaf734d0e04
2017-12-05 10:06:50 -08:00
Chad Austin
dcba28b47f only call materializeInParent() when the inode state actually transitions to materialized
Summary:
open() called materializeInParent unconditionally, and setattr never
called it, implying it was possible to truncate a file without
materializing the parent.  This change makes sure to precisely call
materializeInParent whenever the state transitions to materialized.

Reviewed By: wez

Differential Revision: D6389794

fbshipit-source-id: 1e740e133a83d5090a6b9801154b7eaeccb07f22
2017-12-05 10:06:47 -08:00
Adam Simpkins
1b86627d43 logging: update initialization code to use the new LogConfig logic
Summary:
Replace the initLoggingGlogStyle() function with a more generic initLogging()
function that accepts a log config string to be parsed with parseLogConfig().

Reviewed By: bolinfest, yfeldblum

Differential Revision: D6342086

fbshipit-source-id: fb1bffd11f190b70e03e2ccbf2b30be08d655242
2017-12-01 17:07:56 -08:00
Michael Bolin
5e2afa735f Change how the UNTRACKED_ADDED conflict and merges are handled.
Summary:
Previously, we used the Mercurial code `g` when faced with an `UNTRACKED_ADDED`
file conflict, but that was allowing merges to silently succeed that should not
have. This revision changes our logic to use the code `m` for merge, which
unearthed that we were not honoring the user's `update.check` setting properly.

Because we use `update.check=noconflict` internally at Facebook, we changed the
Eden integration tests to default to verifying Hg running with this setting. To
support it properly, we had to port this code from `update.py` in Mercurial to
our own `_determine_actions_for_conflicts()` function:

```
if updatecheck == 'noconflict':
    for f, (m, args, msg) in actionbyfile.iteritems():
        if m not in ('g', 'k', 'e', 'r', 'pr'):
            msg = _("conflicting changes")
            hint = _("commit or update --clean to discard changes")
            raise error.Abort(msg, hint=hint)
```

However, this introduced an interesting issue where the `checkOutRevision()`
Thrift call from Hg would update the `SNAPSHOT` file on the server, but
`.hg/dirstate` would not get updated with the new parents until the update
completed on the client. With the new call to `raise error.Abort` on the client,
we could get in a state where the `SNAPSHOT` file had the hash of the commit
assuming the update succeeded, but `.hg/dirstate` reflected the reality where it
failed.

To that end, we changed `checkOutRevision()` to take a new parameter,
`checkoutMode`, which can take on one of three values: `NORMAL`, `DRY_RUN`, and
`FORCE`. Now if the user tries to do an ordinary `hg update` with
`update.check=noconflict`, we first do a `DRY_RUN` and examine the potential
conflicts. Only if the conflicts should not block the update do we proceed with
a call to `checkOutRevision()` in `NORMAL` mode.
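
The two-phase flow can be sketched as follows. This is an illustration under assumptions, not the extension's real code: `check_out_revision` and `action_for_conflict` are hypothetical callables, the enum values are made up, and the allowed action codes are taken from the `update.py` check quoted above.

```python
import enum


class CheckoutMode(enum.Enum):
    # Numeric values are placeholders, not the real Thrift values.
    NORMAL = 0
    DRY_RUN = 1
    FORCE = 2


# Action codes that do not block an update under update.check=noconflict.
NON_BLOCKING_ACTIONS = ("g", "k", "e", "r", "pr")


def update_with_noconflict(check_out_revision, dest, action_for_conflict):
    """Dry-run first; only do the real checkout if no conflict blocks it."""
    conflicts = check_out_revision(dest, CheckoutMode.DRY_RUN)
    for conflict in conflicts:
        if action_for_conflict(conflict) not in NON_BLOCKING_ACTIONS:
            # Abort before the server has changed anything.
            raise RuntimeError("conflicting changes")
    return check_out_revision(dest, CheckoutMode.NORMAL)
```

Because the abort happens after the `DRY_RUN` but before the `NORMAL` call, the server-side `SNAPSHOT` file and `.hg/dirstate` cannot drift apart in the way described above.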

To make this work, we had to make a number of changes to `CheckoutAction`,
`CheckoutContext`, `EdenMount`, and `TreeInode` to keep track of the
`checkoutMode` and ensure that no changes are made to the working copy when a
`DRY_RUN` is in effect.

One minor issue (for which there is a `TODO`) is that a `DRY_RUN` will not
report any `DIRECTORY_NOT_EMPTY` conflicts that may exist. As `TreeInode` is
implemented today, it is a bit messy to report this type of conflict without
modifying the working copy along the way.

Finally, any `UNTRACKED_ADDED` conflict should cause an update to
abort to match the behavior in stock Mercurial if the user has the following
config setting:

```
[commands]
update.check = noconflict
```

Though the original name for this setting was:

```
[experimental]
updatecheck = noconflict
```

Although I am on Mercurial 4.4.1, the `update.check` setting does not seem to
take effect when I run the integration tests, but the `updatecheck` setting
does, so for now, I set both in `hg_extension_test_base.py` with a `TODO` to
remove `updatecheck` once I can get `update.check` to do its job.

Reviewed By: simpkins

Differential Revision: D6366007

fbshipit-source-id: bb3ecb1270e77d59d7d9e7baa36ada61971bbc49
2017-11-29 21:50:34 -08:00
Adam Simpkins
71981cc504 fix state handling in EdenMount::destroy()
Summary:
This fixes a crash in EdenMount::destroy() if EdenMount::create() failed to
load the root inode.  Previously the code called shutdownImpl() in this case
which tried to unload all inodes and crashed since the root inode was null.

This also fixes EdenMount::destroy() to properly handle the FUSE_ERROR and
FUSE_DONE cases.

Reviewed By: wez

Differential Revision: D6434355

fbshipit-source-id: 39c5f4472d6ebbcf881b4c9c8c8fd67686032ec1
2017-11-29 14:36:39 -08:00
Wez Furlong
28e74f1ba6 add scmGetStatusBetweenRevisions thrift call
Summary:
The goal is to provide a fast path for watchman to flesh
out the total set of changed files when it needs to relay that information
on to consumers.

We choose not to include the full list in the Journal when checking out
between revisions because it will not always be needed and may be an
expensive `O(repo)` operation to compute.  This means that watchman
needs to expand that information for itself, and that is currently
a fairly slow query to invoke through mercurial.

Since watchman is responding to journal events from eden we know that
we have tree data for the old and new hashes and thus we should be
able to efficiently compute that diff.

This implementation is slightly awful because it will instantiate an
unlinked TreeInode object for one side of the query, and will in
turn populate any children that differ as it walks down the tree.
A follow on diff will look at making a flavor of the diff code that
can diff raw Tree objects instead.

Reviewed By: bolinfest

Differential Revision: D6305844

fbshipit-source-id: 7506c9ba1f4febebcdc283c414261810a3951588
2017-11-28 19:36:32 -08:00
Michael Bolin
33b31c4062 Remove excludes from a build rule that no longer exist.
Summary:
This should have been removed as part of D6179950.

Ideally, Buck would error when this happens, but apparently `glob()` does not
complain when patterns do not match any files, even when the pattern does not
contain any wildcards. There appears to be some code at Facebook that is
exploiting this behavior.

Reviewed By: simpkins

Differential Revision: D6421529

fbshipit-source-id: c6f982624e0e12a911bc12ab1e8239ba4358ea56
2017-11-28 01:35:56 -08:00
Michael Bolin
9ab42310c5 Directory was showing up as UNTRACKED_ADDED instead of its contents.
Summary:
This bug is part of a bigger issue in our Mercurial integration where
`UNTRACKED_ADDED` conflicts are being silently swallowed in our Hg extension
whereas stock Mercurial presents these as conflicts and forces the user to deal
with them. The Mercurial issues will be addressed in a follow-up change.

Reviewed By: simpkins

Differential Revision: D6365580

fbshipit-source-id: 831e27ce1da90ea605033b2b9988fe400ba404aa
2017-11-20 15:56:37 -08:00
Adam Simpkins
fbeb35cbdc fix issues tracking the last checkout time
Summary:
Eden attempts to initialize timestamps of newly loaded inodes with the time of
the last checkout operation performed in this mount.  Unfortunately it had a
number of bugs in this logic:

EdenMount had two separate fields attempting to track the last checkout time:
`lastCheckoutTime_` and `parentInfo_.lastCheckoutTime`.

Unfortunately neither field was actually updated on checkout operations.
Additionally, `lastCheckoutTime_` did not have any locking to allow it to be
updated.  `parentInfo_.lastCheckoutTime` did have locking, but it used the
mount point's checkout lock, so it could not be accessed during checkout
operations.

This diff removes `parentInfo_.lastCheckoutTime`, keeping only
`lastCheckoutTime_`.  It also converts `lastCheckoutTime_` to a
`struct timespec` since this is most often needed as a `timespec`.  It also
adds a new mountpoint-wide lock for synchronizing access to this variable.

Reviewed By: bolinfest

Differential Revision: D6356698

fbshipit-source-id: db54f9bb297b5febe4642e2b3fcc8055a6afc199
2017-11-19 15:47:13 -08:00
Chad Austin
38360c8563 make materializeForWrite private
Summary: More work towards encapsulating a FileInode's internal state machine.

Reviewed By: wez

Differential Revision: D6316013

fbshipit-source-id: 9c8303b35a0de1ba69207c7f59be88c5fb037ad8
2017-11-16 09:58:15 -08:00
Adam Simpkins
c8c1ba5eab remove eden/fs/utils/test/TestChecks.h
Summary:
The gtest macros in this file were moved to folly/test/TestUtils.h
Update everything to just use folly/test/TestUtils.h directly.

Reviewed By: chadaustin

Differential Revision: D6301759

fbshipit-source-id: 7f2841c12d5bea15376f782fb3bf3bfef16039c7
2017-11-15 12:53:55 -08:00
Michael Bolin
2d1eade7a7 Run the autodeps script on our build files.
Summary:
I created this change by running:

```
find eden -name TARGETS | grep -v eden/fs/fuse/TARGETS | grep -v eden/fs/service/TARGETS | xargs autodeps
```

Apparently `eden/fs/fuse/TARGETS` and `eden/fs/service/TARGETS` have some
constructions that `autodeps` does not understand, so I filtered those out.

Reviewed By: StanislavGlebik

Differential Revision: D6319982

fbshipit-source-id: 7b3683d1507409dde6d6570e9b13811168aa6859
2017-11-14 11:18:19 -08:00
Chad Austin
587d0ad1ee Move Journal locking inside of Journal itself
Summary:
This is follow-up to the lock ordering issues in
StreamingSubscriber.  The Journal locks are now finer-grained and no
locks are held while the subscribers are invoked.  This change
prevents future deadlocks.
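
The general pattern, sketched in Python with hypothetical names: take the lock only long enough to mutate state and snapshot the subscriber list, then invoke the callbacks after releasing it.

```python
import threading


class Journal:
    def __init__(self):
        self._lock = threading.Lock()  # deliberately non-reentrant
        self._entries = []
        self._subscribers = {}
        self._next_id = 1

    def subscribe(self, callback):
        with self._lock:
            sub_id = self._next_id
            self._next_id += 1
            self._subscribers[sub_id] = callback
            return sub_id

    def latest(self):
        with self._lock:
            return self._entries[-1] if self._entries else None

    def add_delta(self, delta):
        with self._lock:
            self._entries.append(delta)
            # Copy the callbacks so they can run without the lock held.
            callbacks = list(self._subscribers.values())
        for callback in callbacks:
            callback()
```

A subscriber that calls back into the journal (for example `latest()`) would deadlock if `add_delta` still held the lock; with the snapshot approach it does not.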

Reviewed By: wez

Differential Revision: D6281410

fbshipit-source-id: 797c164395831752f61cc15928b6d4ce4dab1b68
2017-11-10 19:57:49 -08:00
Michael Bolin
c6f59d25b8 Fix a crash that could occur when doing hg update .^ --merge.
Summary:
The underlying issue is that we were reporting a `MODIFIED_REMOVED`
conflict as a `MODIFIED_MODIFIED` conflict. This put us in a state where
Mercurial expected to find a file in the new manifest, but failed because the
file was not present in that revision, so no such file could be found.

Somewhat surprisingly, the appropriate handler for a `MODIFIED_REMOVED`
conflict already existed in our Mercurial extension, but there was no logic on
the server that would generate a `MODIFIED_REMOVED` conflict previous to
this change.

Like D6204916, this was an issue I ran into when trying to create a repro case
for the issue that was fixed in D6199215.

Reviewed By: wez

Differential Revision: D6270272

fbshipit-source-id: 6604eea00b0794cd44b01d2ba6b9ea10db32d556
2017-11-09 16:29:55 -08:00
Michael Bolin
5d738193e5 Store Hg dirstate data in Hg instead of Eden.
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.

Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.

This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.

Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.

In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.
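
A sketch of reading and writing just the header described above. It assumes the 40 bytes are two 20-byte binary parent hashes (as in stock Mercurial) and that the version is a big-endian unsigned 32-bit integer; the byte order and function names are assumptions, and the real logic lives in `eden/py/dirstate.py`.

```python
import struct

# Two 20-byte parent hashes followed by a 4-byte version number.
# Big-endian is an assumption made for this sketch.
HEADER = struct.Struct(">20s20sI")


def pack_header(parent1: bytes, parent2: bytes, version: int) -> bytes:
    """Serialize the 44-byte header."""
    return HEADER.pack(parent1, parent2, version)


def unpack_header(data: bytes):
    """Return (parent1, parent2, version) from the start of the file contents."""
    return HEADER.unpack_from(data)
```

Checking the version right after the parents lets a reader reject a file written in a newer format before touching the rest of the data.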

Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shut down Eden, delete the old mounts, restart Eden, and re-clone.

In the end, this change deletes a lot of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.

Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.

Reviewed By: simpkins

Differential Revision: D6179950

fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
2017-11-06 19:56:49 -08:00
Chad Austin
8b9261f2a1 run clang-format across all C++ files
Summary:
Per discussion with bolinfest, this brings Eden in line with clang-format.

This diff was generated with `find . \( -iname '*.cpp' -o -iname '*.h' \) -exec bash -c "yes | arc lint {}" \;`

Reviewed By: bolinfest

Differential Revision: D6232695

fbshipit-source-id: d54942bf1c69b5b0dcd4df629f1f2d5538c9e28c
2017-11-03 16:02:03 -07:00
Yedidya Feldblum
b2f5376cea Move folly/Array.h
Summary: [Folly] Move `folly/Array.h` to `folly/container/`.

Reviewed By: luciang

Differential Revision: D6182858

fbshipit-source-id: 59340b96058cc6d0c7a0289e316bbde98c15d724
2017-10-29 03:42:55 -07:00
Wez Furlong
25a9786ca5 augment JournalDelta with unclean paths on snapshot hash change
Summary:
We were previously generating a simple JournalDelta consisting of
just the from/to snapshot hashes.  This is great from a `!O(repo)` perspective
when recording what changed but makes it difficult for clients downstream
to reason about changes that are not tracked in source control.

This diff adds a concept of `uncleanPaths` to the journal; these are paths
that we think are/were different from the hashes in the journal entry.

Since JournalDelta needs to be able to be merged I've opted for a simple
list of the paths that have a differing status; I'm not including all of
the various dirstate states for this because it is not obvious how to
reconcile the state across successive snapshot change events.

The `uncleanPaths` set is populated with an initial set of different paths as
the first part of the checkout call (prior to changing the hash), and then is
updated after the hash has changed to capture any additional differences.

Care needs to be taken to avoid recursively attempting to grab the parents lock
so I'm replicating just a little bit of the state management glue in the
`performDiff` method.

The Journal was not setting the from/to snapshot hashes when merging deltas.
This manifested in the watchman integration tests; we'd see the null revision
as the `from` and the `to` revision held the `from` revision(!).
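
One way to picture the intended merge behavior, as a hedged sketch: the class and field names are hypothetical, and the real JournalDelta carries more state than shown here. The merged delta keeps the older entry's `from` hash, the newer entry's `to` hash, and the union of unclean paths.

```python
from dataclasses import dataclass, field


@dataclass
class JournalDelta:
    # Hypothetical, simplified shape of a journal entry.
    from_hash: str
    to_hash: str
    unclean_paths: set = field(default_factory=set)


def merge(older: JournalDelta, newer: JournalDelta) -> JournalDelta:
    """Combine two successive deltas into one covering the whole span."""
    return JournalDelta(
        from_hash=older.from_hash,
        to_hash=newer.to_hash,
        unclean_paths=older.unclean_paths | newer.unclean_paths,
    )
```

Using a plain set of paths, rather than per-path dirstate states, is what makes the union well defined across successive snapshot changes.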

On the watchman side we need to ask source control to expand the list of
files that changed when the from/to hashes are different; I've added code
to handle this.  This doesn't do anything smart in the case that the
source control aware queries are in use.  We'll look at that in a following
diff as it isn't strictly eden specific.

`watchman clock` was returning a basically empty clock unconditionally,
which meant that most since queries would report everything since the start
of time.  This is most likely contributing to poor Buck performance, although
I have not investigated the performance aspect of this.  It manifested itself
in the watchman integration tests.

Reviewed By: simpkins

Differential Revision: D5896494

fbshipit-source-id: a88be6448862781a1d8f5e15285ca07b4240593a
2017-10-16 22:46:54 -07:00
Chad Austin
3fb5680152 Rename ConflictType::MODIFIED to ConflictType::MODIFIED_MODIFIED
Summary: Per wez, this makes the MODIFIED case consistent with the other conflict types (e.g. local_remote).  Side benefit of avoiding some naming conflicts in the Haskell/Rust thrift tooling.

Reviewed By: wez

Differential Revision: D5882327

fbshipit-source-id: 3ec68c44d8c8a5c5675f1ced3842d29376d46fe2
2017-09-21 16:54:37 -07:00
Wez Furlong
3da68e5adc dumb merge of MountPoint into EdenMount
Summary:
This is a mechanical and dumb move of the code from MountPoint
and into the EdenMount class.

Of note, it doesn't merge together the two different state/status fields
into a unified thing; that will be tackled in a follow on diff.

Reviewed By: bolinfest

Differential Revision: D5778212

fbshipit-source-id: 6e91a90a5cc760429d87a475ec12f81b93f87be0
2017-09-08 19:25:34 -07:00
Wez Furlong
9f07485239 add code to serialize FileHandleMap
Summary:
The serialized data for each file handle needs to be enough
to re-construct the handle when we load it into a new process later
on.  We need the inode number, the file handle number that we communicated
to the kernel and a flag to let us know whether it is a file or a dir.
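
The three pieces of per-handle data could be modeled as below. This is only an illustration: the record and function names are hypothetical, and JSON is used for readability rather than whatever encoding Eden actually uses.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SerializedFileHandle:
    inode_number: int  # which inode the handle refers to
    handle_id: int     # the file handle number given to the kernel
    is_dir: bool       # whether this is a directory handle or a file handle


def save_handles(handles) -> str:
    """Serialize the handle records (JSON is an assumption of this sketch)."""
    return json.dumps([asdict(h) for h in handles])


def load_handles(blob: str):
    """Rebuild the handle records in the new process."""
    return [SerializedFileHandle(**entry) for entry in json.loads(blob)]
```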

Note that the file handle allocation strategy already accommodates the
idea of migrating to a new process; we don't need to serialize anything
like a next file handle id number.

This doesn't implement instantiating the handles from the loaded state,
it is just the plumbing for saving and loading that state information.

Reviewed By: bolinfest

Differential Revision: D5733079

fbshipit-source-id: 8fb8afb8ae9694d013ce7a4a82c31bc876ed33c9
2017-08-30 19:20:23 -07:00
Jyothsna Konisa
3f046593a8 Wrapper for TimeStamps & helper function to set timestamps in setattr.
Summary:
1. Added a new structure `InodeBase::InodeTimestamps` to wrap atime, ctime, and mtime together. This new structure helps avoid using `struct stat` for timestamps.
2. Modified the functions `Overlay::openFile`, `Overlay::updateTimestampToHeader`, `Overlay::deserializeOverlayDir`, and `Overlay::parseHeader` to use this new structure for timestamps instead of `struct stat`. Also modified code in the places affected by this change.
3. Added new helper methods `FileInode::setattrTimes` and `TreeInode::setattrTimes` to set timestamps in FileInode and TreeInode during setattr. The implementation of setattr for FileInode and TreeInode is in the diffs stacked above this diff.
4. Replaced atime, ctime, and mtime in `FileInode::State` and `TreeInode::Dir` with `FileInode::State::timeStamps` and `TreeInode::State::timeStamps`. Made other necessary changes to support this change.

Reviewed By: simpkins

Differential Revision: D5596854

fbshipit-source-id: 2786b7b695508a62fdf8f7829f1ce76054b61c52
2017-08-11 11:36:07 -07:00
Jyothsna Konisa
6aa6e547d6 Reading and writing timestamps into overlay files
Summary:
Added a new function `InodeBase::updateOverlayHeader` and implemented `FileInode::updateOverlayHeader` and `TreeInode::updateOverlayHeader` to write in-memory timestamps to the overlay header when an inode is unreferenced.

Added helper functions in the `Overlay` class to read and update timestamps in the overlay file. Also modified `Overlay::loadOverlayDir` to read timestamps from the overlay header and populate them into the TreeInode.

Modified the constructor of `FileInode::state` to read timestamps from the overlay file and populate the inode timestamps.

Added a test case to check that timestamps are updated and read correctly on remount.

Fixed a lint warning in the TARGETS file.

Reviewed By: simpkins

Differential Revision: D5535429

fbshipit-source-id: f6b758f70101c65d316a35101aacc9a3363f7aed
2017-08-04 20:19:20 -07:00
Jyothsna Konisa
e4fefa3e69 Adding Timestamps to TreeInode class and initializing timestamps to lastcheckout time
Summary: Added atime, ctime, and mtime for tracking directory timestamps in memory, and initialized them to the last checkout time during the creation of TreeInode.

Reviewed By: bolinfest

Differential Revision: D5440950

fbshipit-source-id: 639cf1ce6722f80dde35d33849aa712aa30301a8
2017-07-27 18:25:01 -07:00
Jyothsna Konisa
19df19d994 Adding lastCheckoutTime to EdenMount and initializing timestamps of FileInode with lastCheckoutTime
Summary:
Added a new data member lastCheckoutTime to the EdenMount class to store the time when a checkout operation is performed. Also added a method to get the last checkout time, which returns the lastCheckoutTime stored in the EdenMount class.

Added new fields atime, mtime, and ctime to the FileInode::State structure to keep track of timestamps in memory. Initialized the timestamps in the FileInode::State constructor by calling getLastCheckOutTime from the EdenMount class.

Still have to add timestamp tracking for directories and initialize them with lastCheckoutTime. This will probably be done in a separate diff.

Reviewed By: bolinfest

Differential Revision: D5437682

fbshipit-source-id: e3d6b1bc0c2192538dd3b0d9a6017ceb3ca0843d
2017-07-27 11:52:31 -07:00
Jyothsna Konisa
20cd12b31a Moving FileData methods to FileInode
Summary:
Moved all the member functions from `FileData` class to `FileInode` class
and made `FileInode` methods independent of shared `FileData` object.
Removed `FileData.h` and `FileData.cpp` files as they are not needed anymore.

Modified functions `FileInode::getSHA1()` and `FileInode::isSameAsFast` and
modified few testcases which are currently using `FileData` class and made
sure that all the test cases are passing.

Reviewed By: bolinfest

Differential Revision: D5430128

fbshipit-source-id: 3e8e6c490e92e4e602355e4ce39b67c450ec53f8
2017-07-26 23:39:35 -07:00
Adam Simpkins
8be3b57eed fix issues updating TreeInode materialization status during checkout
Summary:
This updates the TreeInode code to remove the redundant materialized flag.
A TreeInode should have a Tree Hash if and only if it is dematerialized, so
there is no need for an extra `materialized` boolean.
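
The invariant can be illustrated with a small sketch, in Python rather than the C++ of the real TreeInode. The class name `TreeState` and the field `tree_hash` are hypothetical; the point is only that the materialization status is derived from whether a hash is present, so a separate boolean cannot drift out of sync with it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeState:
    """Hypothetical stand-in for the relevant TreeInode state."""
    # Source control hash; present only while the tree is dematerialized.
    tree_hash: Optional[str] = None

    @property
    def materialized(self) -> bool:
        # Derived from the hash instead of being stored separately.
        return self.tree_hash is None
```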

This diff also fixes an issue in TreeInode::saveOverlayPostCheckout() where it
was not correctly informing its parent TreeInode of the change if it moved
from one dematerialized state to another (with a different TreeInode hash).
This fixes the code to correctly call `parent->childDematerialized()` when it
needs to inform the parent that it now refers to a different source control
hash.

Reviewed By: wez

Differential Revision: D5336629

fbshipit-source-id: b4d86ecdef2f5faefbc243a09d869c02384ae95c
2017-07-07 18:45:02 -07:00
Jyothsna Konisa
fb50ce0213 adding header to the overlay directory
Summary:
1. Added a new method to create the header.
2. Added the header to the overlay files of directories.
3. Added the test class OverlayTest for Overlay-related tests.

Reviewed By: simpkins

Differential Revision: D5335134

fbshipit-source-id: 31f59e7af70a3eeae6350261ded5d8b1bec2b9d0
2017-07-04 00:23:25 -07:00
Adam Simpkins
6e7e8f9200 add basic unit test for remounting
Summary:
Add a basic unit test that adds a new file, remounts the mount point, and
confirms the materialized contents are loaded from the overlay correctly on
remount.

Differential Revision: D5326024

fbshipit-source-id: d2446030802cc4afe5af09460d590ccf8c43e525
2017-06-30 19:10:53 -07:00
Adam Simpkins
e708dda521 add a main function to the inodes unit test
Summary:
Add a main() function to the inodes unit test, to allow a `--logging` flag to
control log levels.

I will eventually update our default test `main()` function to include this
flag, but for now adding a custom `main()` for the inodes test is the simplest
fix.

Reviewed By: wez

Differential Revision: D5326022

fbshipit-source-id: 36f497658fdb21639408fc599cf75908b9c9acb3
2017-06-30 19:10:53 -07:00
Christopher Dykes
0970c1e12a Merge StringBase.cpp into String.cpp
Summary: It doesn't need to exist anymore

Reviewed By: yfeldblum

Differential Revision: D5318746

fbshipit-source-id: c70b184f4b3fc12ede4632d6b3d43de16ed758c7
2017-06-29 20:20:11 -07:00
Adam Simpkins
429f737816 format eden/fs TARGETS files with autodeps
Summary:
Format all of the TARGETS files under eden/fs with the autodeps tool.

A few rocksdb include statements require comments so that autodeps can
correctly tell which dependency this include comes from.  The rocksdb library's
source file structure unfortunately does not match the layout of how its header
files get installed, so autodeps cannot figure this out automatically.

Reviewed By: wez

Differential Revision: D5316000

fbshipit-source-id: f8163adca79ee4a673440232d6467fb83e56aa10
2017-06-27 21:20:15 -07:00
Andrew Gallagher
03bdaff954 codemod: format TARGETS with buildifier [4/5] (D5092623)
Reviewed By: igorsugak

fbshipit-source-id: 277a9d2bdc1d7e3ff3075bfe2d7307502fd0a507
2017-06-01 17:52:40 -07:00
Michael Bolin
57f5d72a27 Reimplement dirstate used by Eden's Hg extension as a subclass of Hg's dirstate.
Summary:
This is a major change to Eden's Hg extension.

Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.

In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.

However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.

Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).

In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)

That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store as Thrift-serialized data in an ordinary file along with the `_copymap`
data.

In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.

One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, like in the implementation of `eden_dirstate.status()`
in order to make `hg remove` work).

In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!
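
A minimal sketch of the kind of fix described above. This is not the real `EdenMatchInfo.make_glob_list()`, whose signature is not shown here; the free function and its `is_dir` parameter are assumptions made for illustration.

```python
import os
from typing import Callable, List


def make_glob_list(
    paths: List[str],
    is_dir: Callable[[str], bool] = os.path.isdir,
) -> List[str]:
    """Hypothetical stand-in for EdenMatchInfo.make_glob_list()."""
    globs = []
    for path in paths:
        if is_dir(path):
            # A directory expands to everything beneath it.
            globs.append(path.rstrip("/") + "/**/*")
        else:
            # A plain file must match itself; "foo/**/*" never matches "foo".
            globs.append(path)
    return globs
```

The `is_dir` hook exists only so the sketch can be exercised without touching the filesystem.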

I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.
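
The client-side filtering might look roughly like the sketch below. It assumes the status arrives as a mapping from path to status code and that `match` can be called with a path to get a boolean; the function name `filter_status` is hypothetical and is not taken from `eden_dirstate.py`.

```python
from typing import Callable, Dict


def filter_status(
    status: Dict[str, str],
    match: Callable[[str], bool],
) -> Dict[str, str]:
    """Keep only the status entries whose path is accepted by match."""
    return {path: code for path, code in status.items() if match(path)}
```

Doing this in Python means the server still computes the full status, which is why the work would be better done on the server in the long run.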

For the reasons explained above, this revision deletes a lot of code in `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.

Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)

The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.

Reviewed By: simpkins

Differential Revision: D5071778

fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
2017-05-26 12:05:29 -07:00
Adam Simpkins
51df0c0663 refactor Dirstate::addAll() to clean up some old code
Summary:
This updates the Dirstate::addAll() code to use EdenMount::diff() internally,
instead of getStatusForExistingDirectory().

This lets us delete getStatusForExistingDirectory() and several other helper
functions that it used.  The logic in getStatusForExistingDirectory() was based
on the incorrect assumption that files can only be modified from their expected
source control state if they are materialized.  This could result in incorrect
results in a variety of cases (particularly after renames, or "hg reset"
operations).  The EdenMount::diff() code does not have this problem.

That said, this new addAll() code does still have some performance issues--it
currently does a full tree diff for each input path.  We should ideally fix
this in the future to only diff the necessary subtree for each path.  However,
in the short term this trade-off seems worth being able to delete all of this
older, buggy code.  diff() should be cheap enough in most cases that this won't
be a major problem unless a large number of paths are given as input.

Reviewed By: bolinfest

Differential Revision: D4968835

fbshipit-source-id: 1834aa98a26dcaa0e1c06c7ac25c57944fa1b5f7
2017-05-01 18:54:41 -07:00