Commit Graph

88 Commits

Author SHA1 Message Date
Brian Strauch
e2d4362896 live debug journal command
Summary: Running `eden debug journal -f` will print and follow the eden journal in a similar style to the unix `tail -f` command.

Reviewed By: chadaustin

Differential Revision: D16112458

fbshipit-source-id: 5304cd0f857bdbeca41c2591e98920f4f1fc8f42
2019-07-09 09:13:28 -07:00
Jake Crouch
74b514ceac Thrift interface for setting memory limit of Journal
Summary: Sets up a thrift interface to set the size limit of the journal. Until truncation is added, the size field in the journal does nothing other than being viewable from getMemoryLimit.

Reviewed By: chadaustin

Differential Revision: D16042286

fbshipit-source-id: bc0acdf4ac5516cfac66fa0fbd87254d08ad479b
2019-07-02 19:03:35 -07:00
Jake Crouch
30e6c20988 Displaying duration of journal
Summary: Shows the end-to-end duration of the journal in "eden stats"

Reviewed By: chadaustin

Differential Revision: D15993261

fbshipit-source-id: 46471faca17d4f12ccdd8cea55b2722e33519a74
2019-06-28 16:41:33 -07:00
Jake Crouch
e7036c45cd Update "eden debug journal" to use limit instead of generating a range
Summary: Prior to this diff, "eden debug journal" got the latest journal entry and manually looked up the entry "limit" positions before it. It now just passes the limit to the journal. This makes the queries always start from the tip of the journal; previously, in theory, something could be added to the journal after the Python script fetched the latest delta.

Reviewed By: strager

Differential Revision: D15972018

fbshipit-source-id: 0ee0dd88a1e9edef5ccce3b3da2dbc09aa64f8a9
2019-06-26 16:38:33 -07:00
Wez Furlong
f5d9a06dc9 eden: add thrift calls for adding/removing bind mounts
Summary:
These allow the cli to set up and tear down mounts and
have the eden server keep track of them.

Reviewed By: strager

Differential Revision: D15707318

fbshipit-source-id: abdb8eaa28c8c67c8211a8af1647efe3a083e998
2019-06-25 18:42:37 -07:00
Adam Simpkins
92fc1d83d9 update license headers in thrift files
Summary:
Update the copyright & license headers in thrift files to reflect the
relicensing to GPLv2+

Reviewed By: wez

Differential Revision: D15487082

fbshipit-source-id: 33f68617037f36c07075fb962a16a4d8f55bd6a6
2019-06-19 17:02:46 -07:00
Adam Simpkins
aa45fa2cb7 periodically reload the config files
Summary:
Add a periodic task to reload the configuration file from disk.  By default
this runs once every 5 minutes, but this interval can be controlled from the
config file.

At the moment reloading the config file does not do much other than update the
interval for how frequently the config file is reloaded.  However, I plan to
add additional periodic tasks shortly that are controlled by this config
setting.

This will also make it possible for other parts of the code to
access the config settings in the `ServerState` and use them as-is without
checking to see if they reloaded.  Currently all of the code that accesses
config values performs a check to see if the config needs to be reloaded.  If
we want to switch to Mercurial-style configs in the future that check will be
substantially more expensive.

This diff also includes a new thrift call to force the config file to be
reloaded immediately.  This can be used to restart automatic config reloading
if it is ever disabled in the config file.

Reviewed By: wez

Differential Revision: D15756357

fbshipit-source-id: 1999f4730903633ce838842932a6ae6a65eda4e6
2019-06-14 18:14:43 -07:00
Adam Simpkins
7309869981 add a thrift call for getting config values
Summary:
Add a thrift call to get the current config settings.

My primary use case for this method at the moment is to make it possible to
build integration tests that check the config behavior.  However in the future
this will probably also be useful for building CLI commands to report the
current config values to allow debugging if there are ever issues.  This API
can also be used to force EdenFS to immediately reload the config from disk.

Reviewed By: strager

Differential Revision: D15572124

fbshipit-source-id: da3bc982f9c419b3314a8b0560c9bd327760d429
2019-06-11 13:08:28 -07:00
Jake Crouch
0dc6812f33 Print out Journal Info with edenfsctl
Summary: Print out memory usage and entry counts with edenfsctl stats

Reviewed By: chadaustin

Differential Revision: D15607015

fbshipit-source-id: 866960ea1d3b5e9fdbe24df3b57fba419795ec76
2019-06-07 13:37:02 -07:00
Adam Simpkins
799d2d6834 update initiateShutdown() to be able to throw exceptions
Summary:
Update the thrift definition for `initiateShutdown()` to indicate that it may
throw `EdenError` on failure.  Without this all application-level errors that
occur will be translated to generic thrift `TApplicationException` objects on
the wire.

Also move this method declaration out of the "debugging APIs" section of the
file.

Reviewed By: strager

Differential Revision: D15572121

fbshipit-source-id: 7e621a24abd4347cedbb1bcce1ae9c2b70f991fd
2019-06-05 11:50:36 -07:00
Matt Glazar
49f4c37b67 Add missing copyright notices
Summary: Internal Facebook infrastructure is nagging me about some files not having a Facebook copyright notice. Add a notice to these files to make the nagging stop.

Reviewed By: simpkins

Differential Revision: D14173944

fbshipit-source-id: 7234431224fcf4f86ea56ca2f9108f47ef959d87
2019-03-07 19:32:39 -08:00
Wez Furlong
8847a7a061 add dtype as an optional return value from glob
Summary:
This diff adds the dtype field to the glob results;
this will help to reduce the cost of some watchman queries by avoiding a
getFileInformation call that instantiates inodes.

As part of this, I added a bunch of unit test coverage.

Reviewed By: strager

Differential Revision: D8779149

fbshipit-source-id: 3064a3e42be55ec576fed9e0f7112edef426f32d
2019-02-19 11:26:26 -08:00
Adam Simpkins
5d4279567d add thrift APIs to the fault injection framework
Summary: Add some thrift APIs that allow injecting faults into Eden's FaultInjector.

Reviewed By: wez

Differential Revision: D14079490

fbshipit-source-id: a9ec16b560ddcc79d5d819dbbc120ad5da556b4e
2019-02-15 19:27:17 -08:00
Adam Simpkins
798ccc750a track EdenMounts while they are initializing
Summary:
This updates the logic in EdenServer to add the EdenMount to the mountPoints_
map as soon as it is created, so that we track mount points as they are
initializing.

I don't expect this change to have any major impact in functionality yet.  In
a subsequent diff I also plan to have EdenServer keep mount points in the
mountPoints_ map longer while they are shutting down.  I expect that change to
matter a bit more, as that will allow us to do a better job reporting and
debugging when mount points are taking a non-trivial amount of time to become
unreferenced and fully shut down.

Reviewed By: strager

Differential Revision: D13503050

fbshipit-source-id: 2e0e8dfde64c6a005efd6dcf503ad7577f314356
2019-01-16 19:35:09 -08:00
Adam Simpkins
05cb1dcd4f report the mount state in listMounts()
Summary:
Update the `listMounts()` thrift API to also report the current mount point
state.  This will allow us to do a better job of reporting mount points that
are in the process of initializing or shutting down.

This change splits the `MountInfo` thrift type into two distinct types for
the `listMounts()` vs `mount()` APIs.  However this change should be
completely backwards compatible at the wire protocol level for older client
and server code.

Reviewed By: strager

Differential Revision: D13503049

fbshipit-source-id: 68e7ca708b956991c8fd93bbf8973d90650aced9
2019-01-02 12:58:08 -08:00
Chad Austin
3dbad3ac01 show blob cache sizes in eden stats
Summary:
Have `eden stats` print the size of the blob cache if the running
edenfs has information about it.

Reviewed By: strager

Differential Revision: D13349220

fbshipit-source-id: 9f59f4399f2d4283aa80bcb54ba73c51d555d502
2018-12-06 19:43:52 -08:00
Dan Schatzberg
8fe62ce81b Add command to chown a mount
Summary:
Sandcastle has several cases where we chown the entire
repository which performs terribly on Eden. As a workaround we have a
command to do this in eden without loading all the files.

Reviewed By: chadaustin

Differential Revision: D12857956

fbshipit-source-id: 36cebcc710fbcf4e1eb265df901513cf50a227b9
2018-11-07 08:58:31 -08:00
Dan Schatzberg
b2a4204b4d Add thrift interface to dump tracepoints
Summary:
With this the eden cli can dump tracepoints and translate
them to various formats or perform any processing

Reviewed By: chadaustin

Differential Revision: D10384072

fbshipit-source-id: 8b38e7f6b551a2bd98b3e748ba1cceafeceeec8c
2018-11-01 08:09:19 -07:00
Chad Austin
e750ab68fe expose FUSE accesses over Thrift
Summary:
Add a Thrift API for reading the pid access logs from each
EdenMount/FuseChannel. Used in a future diff.

Reviewed By: strager

Differential Revision: D9477867

fbshipit-source-id: 0897a915ca654bca952aecc123ea40105830a75b
2018-09-10 13:52:51 -07:00
Chad Austin
6394450579 restructure JournalDelta and fix Watchman subscription race
Summary:
Watchman's Eden integration has a bug where the combination of
Watchman querying Eden for overlapping delta ranges ("give me changes
between X and Y, now changes between X+1 and Y+1") and Eden eliding
redundant change events ("add-modify-remove" -> []) results in
Watchman sometimes reporting that a file exists in its final
subscription update when it no longer does.

The fix is to never elide events, even for files that were added and
removed in the same sequence. To continue to support Watchman's `new`
flag, track whether a file existed at the beginning and end of a
journal delta.
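
A minimal Python sketch of the idea above, not the actual C++ `JournalDelta` code. The names `PathChange`, `Delta`, and `merge` are hypothetical and only illustrate keeping an entry for every touched path while recording whether it existed at the start and end of the merged range.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PathChange:
    existed_before: bool  # did the path exist at the start of the range?
    existed_after: bool   # does the path exist at the end of the range?


@dataclass
class Delta:
    # Hypothetical stand-in for a journal delta: path -> existence info.
    changes: Dict[str, PathChange] = field(default_factory=dict)


def merge(older: Delta, newer: Delta) -> Delta:
    """Combine two adjacent deltas without ever dropping a path."""
    merged = Delta(dict(older.changes))
    for path, change in newer.changes.items():
        prior = merged.changes.get(path)
        # The start state comes from the oldest delta that mentions the path.
        before = prior.existed_before if prior else change.existed_before
        merged.changes[path] = PathChange(before, change.existed_after)
    return merged
```

With this shape, a file that is created and then removed within one range still appears in the merged result, marked as absent both before and after, instead of being elided.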

Reviewed By: wez

Differential Revision: D9304964

fbshipit-source-id: f34c12b25f2b24e3a0d46fc94aa428528f4c5098
2018-08-15 14:52:06 -07:00
Wez Furlong
cfde0c0717 define paths as binary rather than strings in the thrift interface
Summary:
This prevents `hg status` from blowing up with a UTF-8 decode
error inside the generated thrift code.

Push safety concerns:
* This doesn't change the wire representation of the data
* Existing clients that believe it to be a string will continue to have
  the same behavior
* Buck has its own copy of an older version of the thrift spec, so it will
  continue to work "OK".
* When buck resyncs with our thrift file, some changes will likely be needed
  to convert the byte arrays to strings or paths or whatever is appropriate
  for Buck's internal API

Work "OK" above means that clients that currently believe that `string` is
utf-8 encoded will have a runtime error if we ever send them a path that
is not utf-8.  This is the behavior prior to this diff and will continue
to be the behavior for clients (like buck) that have an older version
of the thrift file.

Reviewed By: simpkins

Differential Revision: D9270843

fbshipit-source-id: b01135aec9152aaf5199e1c654ddd7f61c03717e
2018-08-11 01:35:49 -07:00
Chad Austin
fae4229ff2 add eden gc command
Summary:
Add the beginnings of an eden gc command. Today it's equivalent to
`eden debug clear_local_caches` followed by `eden debug
compact_local_storage`, except that it compacts each column as
they're cleared to minimize peak disk consumption.

Eventually, it will also unload in-memory inodes, flush data from the
overlay, and clear the kernel's VFS cache too.

Reviewed By: wez

Differential Revision: D9138305

fbshipit-source-id: b303a63f601014cf38ca94c9e6f7c04394159ea8
2018-08-10 11:38:20 -07:00
Eamonn Kent
b9f7aa0d95 Provide real-time values for memory statistics
Summary:
We provide average values for vmRSS and memory (dirty bytes) on linux systems
by parsing the proc file system. This change allows the current values
(non-average) to be queried and calculated from the thrift API. The cli has
been updated to present these values (and no longer consumes the parsed smaps
files from EdenServer).

Reviewed By: chadaustin

Differential Revision: D8549305

fbshipit-source-id: 77f6838f39784e7ebeda11d8c66dba1fa9f10591
2018-06-21 15:51:51 -07:00
Wez Furlong
bfad766a21 add initiateShutdown() thrift method with a shutdown reason
Summary:
We've seen what appears to be phantom calls to shutdown() so we'd like
to add some degree of auditing.  This diff adds a new method with some
context; this will allow us to distinguish between `eden stop`, `eden restart`
and eden server internal calls to the `shutdown` method.   It may still
be possible that something else is calling our shutdown method, but it
seems unlikely as we're only accessible to our own code via a unix domain
socket.

Reviewed By: chadaustin

Differential Revision: D8341595

fbshipit-source-id: 50d58ea0b56e5f42cd37c404048d710bde0d13a3
2018-06-19 11:13:59 -07:00
Chad Austin
0e9cc052c8 add compact_local_storage debug command to cli
Summary: Add a debug command to compact the LocalStore's RocksDB.

Reviewed By: bolinfest

Differential Revision: D8108686

fbshipit-source-id: 116a74d4bd70442a4c60e45d551afa60674f121d
2018-05-31 11:23:21 -07:00
Chad Austin
b9f6bf1c14 add clear_local_caches debug command to cli
Summary:
This adds a debug command to blow away all RocksDB information that
can be reproduced from Mercurial. We will use it to help an Eden user
recover from a corrupted blob.

Reviewed By: bolinfest

Differential Revision: D8108649

fbshipit-source-id: 056dec19d51b9e430b3c2a249747b26830cfc875
2018-05-31 11:23:21 -07:00
Wez Furlong
8be54b4a1b prefetch file batch for hg import helper
Summary:
This removes the main point of contention for eden prefetch
in two ways:

1. We batch up the complete list of blobs so that they can be processed
   in bulk rather than stalling the tree walk
2. We can ask remotefilelog to check and fetch that list to the local
   hgcache, again as a batch, rather than by forcing the data to be
   loaded through into the local store

The goal of this prefetch is to bulk load data from the mercurial server
so that a subsequent file access doesn't have to make a one-off ssh session
for each one, rather than making sure that all the data is loaded into
the local store.

Reviewed By: chadaustin

Differential Revision: D7965818

fbshipit-source-id: 753400460d633b5467c5110e3f5608ce06106e00
2018-05-25 13:51:27 -07:00
Wez Furlong
d245de4a41 add eden prefetch command
Summary:
This is a first pass at a prefetcher.  The idea is simple,
but the execution is impeded by some unfortunate slowness in different
parts of mercurial.

The idea is that you pass a list of glob patterns and we'll do something
to make accessing files that match those patterns ideally faster than
if you didn't give us the prefetch hint.

In theory we could run `hg prefetch -I PATTERN` for this, but prefetch
takes several minutes materializing and walking the whole manifest to
find matches, checking outgoing revs and various other overheads.
There is a revision flag that can be specified to try to reduce this
effort, but it still takes more than a minute.

This diff:

* Removes a `Future::get()` call in the GlobNode code
* Makes `globFiles` use Futures directly rather than `Future::get()`
* Adds a `prefetchFiles` parameter to `globFiles`
* Adds `eden prefetch` to the CLI and makes it call `globFiles` with
  `prefetchFiles=true`
* Adds the ability to glob over `Tree` as well as the existing `TreeInode`.
  This means that we can avoid allocating inodes for portions of the
  tree that have not yet been loaded.

When `prefetchFiles` is set we'll ask ObjectStore to load the blob for
matching files.  I'm not currently doing this in the `TreeInode` case
on the assumption that we already did this earlier when its `TreeInode::prefetch`
method was called.

The glob executor joins the blob prefetches at each GlobNode level.  It may
be possible to observe higher throughput if we join the complete set at the
end.

Reviewed By: chadaustin

Differential Revision: D7825423

fbshipit-source-id: d2ae03d0f62f00090537198095661475056e968d
2018-05-25 13:51:27 -07:00
Michael Bolin
1a0260b3d1 Introduce a new debug command: eden debug journal.
Summary:
This is used to dump the raw `JournalDelta` entries in the journal.

Hopefully this will help us figure out what's happening in T28686395, or more
generally, why we get those merge errors that appear in the logs of the
form:

```
Journal for .hg/rebasestate holds invalid Created, Created sequence
```

from `eden/fs/journal/JournalDelta.cpp`.

(Note: this ignores all push blocking failures!)

Reviewed By: wez

Differential Revision: D7855071

fbshipit-source-id: f195813695bec7426329a9aacd84a9b1613feec9
2018-05-23 06:21:41 -07:00
Michael Bolin
e6737d409d Thrift API change: deprecate glob() in favor of globFiles().
Summary:
We need to introduce a new `includeDotfiles` option to `glob()`. [As we have
done for all of our Thrift API, to date], rather than define `glob()` so that it
takes a single struct, we specified the parameters individually, so we can no
longer add new params to `glob()`.

In particular, we need to support `includeDotfiles` because we often configure
Buck to use Watchman to implement `glob()` in `BUCK` files, and when Watchman is
used in Eden, it leverages Eden's Thrift API to implement `glob()`. Because
Buck's `glob()` has an `include_dotfiles` option, we must be able to honor it
and pass it all the way through to Eden's `glob()` implementation.

Rather than name the new API `glob2()`, I'm electing to go with `globFiles()`.
(Perhaps once we eliminate all known users of `glob()` in the wild, which
requires turning over the current version of Watchman we have deployed, we can
redefine `glob()` in `eden.thrift` to be the same as `globFiles()` and then
update everyone to use `glob()` again so it has the more intuitive name.)

Reviewed By: wez

Differential Revision: D7748870

fbshipit-source-id: 92438f9c41e4fbdbd6cdccca5fce0e41cc3e9b07
2018-05-02 15:19:44 -07:00
Adam Simpkins
2fedc3bcea update getScmStatus() to require the commit hash as an argument
Summary:
Change getScmStatus() so that callers must explicitly specify the commit to
diff against.  This should help avoid race conditions around commit or checkout
operations where the parent commit has just changed and eden returns status
information against a commit that wasn't what the client was expecting.

This should still maintain backwards compatibility with older clients that do
not send this parameter yet: we will simply receive the hash as an empty string
in this case, and we still provide the old behavior in this case.

Reviewed By: wez

Differential Revision: D7512338

fbshipit-source-id: 1fb4645dda13b9108c66c2daaa802ea3445ac5f2
2018-04-06 12:51:31 -07:00
Adam Simpkins
65a682a1ff add a function for diffing source control commits
Summary:
Add a function for diffing two source control commits without needing to
instantiate TreeInode objects.

Reviewed By: wez

Differential Revision: D7341604

fbshipit-source-id: 557eef87faa2785ab96d51b09569a46f892a71f6
2018-03-20 16:47:12 -07:00
Adam Simpkins
f6685834de update eden to be more liberal when parsing BinaryHash arguments
Summary:
Update Eden's thrift service handler code to accept BinaryHash arguments either
as 20-byte binary values or as 40-byte hexadecimal values.

This will make it easier to transition APIs like getScmStatusBetweenRevisions()
to use 20-byte binary hash arguments without breaking existing clients.
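
A minimal Python sketch of the liberal parsing described above. The real handler is C++; the function name `parse_binary_hash` and its error messages are hypothetical and chosen only for illustration.

```python
import binascii

HASH_BYTES = 20      # raw binary hash length
HASH_HEX_CHARS = 40  # hexadecimal encoding of the same hash


def parse_binary_hash(value: bytes) -> bytes:
    """Return the 20-byte hash for either accepted encoding."""
    if len(value) == HASH_BYTES:
        return value  # already raw binary
    if len(value) == HASH_HEX_CHARS:
        try:
            return binascii.unhexlify(value)
        except binascii.Error as ex:
            raise ValueError(f"invalid hexadecimal hash: {value!r}") from ex
    raise ValueError(f"expected 20 or 40 bytes, got {len(value)}")
```

Dispatching on length works because the two encodings can never have the same size.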

Reviewed By: wez

Differential Revision: D7341607

fbshipit-source-id: 3e952211900d3ec4b9c2073cf3afd55ae7e253ea
2018-03-20 16:47:12 -07:00
Adam Simpkins
dfe7cac4c2 add an integration test for getScmStatusBetweenRevisions()
Summary:
Add an integration test for the getScmStatusBetweenRevisions() thrift call.

This call apparently gets the ADDED and REMOVED states backwards.  For now the
test checks for the current (incorrect) behavior.

This also fixes the thrift definition for this function to stop using the
BinaryHash typedef.  Unlike most of our other thrift functions this method
appears to require the arguments as 40-byte hexadecimal strings.

Reviewed By: wez

Differential Revision: D7341606

fbshipit-source-id: 73cbd0ecf4445da6b1f0ef9cf6d9dce47e6fb593
2018-03-20 15:07:39 -07:00
Puneet Kaushik
a7f99f7f2c Added thrift request to report outstanding FUSE calls
Summary:
Added a thrift call to return the outstanding FUSE requests.
Cli will call the thrift and print the output.
Added a unit test to test getOutstandingRequests().

Reviewed By: simpkins

Differential Revision: D7314584

fbshipit-source-id: 420790405babdb734f598e19719b487096ec53ca
2018-03-20 10:25:49 -07:00
Chad Austin
8219f5c60a have eden stats show file and tree counts
Summary:
It's interesting to see the total number of loaded files
vs. trees when the loaded inode count is high.

Reviewed By: wez

Differential Revision: D6765874

fbshipit-source-id: 178b30184428bd5cf5e005eb475e4f5a1476c385
2018-01-24 15:29:16 -08:00
Chad Austin
9a3fa8bd60 replace the system memory info in eden stats with process memory
Summary:
`eden stats` used to show system memory usage which was not very
interesting (and can be gleaned from top).  Instead read the contents
of /proc/self/smaps and sum the Private_Dirty fields to get a number
that more accurately reflects impact on the rest of the system.
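
As a rough illustration of the approach, assuming the usual Linux smaps layout where each field line looks like `Private_Dirty:        12 kB`, the summation could be sketched in Python as follows. The function names are hypothetical and this is not the code used by `eden stats`.

```python
def sum_private_dirty(smaps_text: str) -> int:
    """Sum all Private_Dirty fields in smaps-formatted text, in bytes."""
    total_kb = 0
    for line in smaps_text.splitlines():
        if line.startswith("Private_Dirty:"):
            # Second whitespace-separated token is the size in kB.
            total_kb += int(line.split()[1])
    return total_kb * 1024


def read_own_private_dirty() -> int:
    """Linux only: total private dirty memory of the current process."""
    with open("/proc/self/smaps") as f:
        return sum_private_dirty(f.read())
```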

Reviewed By: wez

Differential Revision: D6575595

fbshipit-source-id: 9badc5cd5a1b56d3ccb27edd1a2d20ee74ec34ae
2017-12-18 12:00:58 -08:00
Michael Bolin
5e2afa735f Change how the UNTRACKED_ADDED conflict and merges are handled.
Summary:
Previously, we used the Mercurial code `g` when faced with an `UNTRACKED_ADDED`
file conflict, but that was allowing merges to silently succeed that should not
have. This revision changes our logic to use the code `m` for merge, which
unearthed that we were not honoring the user's `update.check` setting properly.

Because we use `update.check=noconflict` internally at Facebook, we changed the
Eden integration tests to default to verifying Hg running with this setting. To
support it properly, we had to port this code from `update.py` in Mercurial to
our own `_determine_actions_for_conflicts()` function:

```
if updatecheck == 'noconflict':
    for f, (m, args, msg) in actionbyfile.iteritems():
        if m not in ('g', 'k', 'e', 'r', 'pr'):
            msg = _("conflicting changes")
            hint = _("commit or update --clean to discard changes")
            raise error.Abort(msg, hint=hint)
```

However, this introduced an interesting issue where the `checkOutRevision()`
Thrift call from Hg would update the `SNAPSHOT` file on the server, but
`.hg/dirstate` would not get updated with the new parents until the update
completed on the client. With the new call to `raise error.Abort` on the client,
we could get in a state where the `SNAPSHOT` file had the hash of the commit
assuming the update succeeded, but `.hg/dirstate` reflected the reality where it
failed.

To that end, we changed `checkOutRevision()` to take a new parameter,
`checkoutMode`, which can take on one of three values: `NORMAL`, `DRY_RUN`, and
`FORCE`. Now if the user tries to do an ordinary `hg update` with
`update.check=noconflict`, we first do a `DRY_RUN` and examine the potential
conflicts. Only if the conflicts should not block the update do we proceed with
a call to `checkOutRevision()` in `NORMAL` mode.
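
The two-step flow can be sketched roughly as follows. This is an illustration only: `update_noconflict`, the callable signature, and the enum's numeric values are assumptions made for the sketch, not the actual Hg extension code or the values in `eden.thrift`.

```python
from enum import Enum
from typing import Callable, List, Set


class CheckoutMode(Enum):
    # Names come from the commit message; the numeric values are placeholders.
    NORMAL = 0
    DRY_RUN = 1
    FORCE = 2


# Hypothetical signature: (commit hash, mode) -> list of conflict type names.
CheckoutFn = Callable[[str, CheckoutMode], List[str]]


def update_noconflict(
    check_out_revision: CheckoutFn, commit: str, blocking: Set[str]
) -> List[str]:
    """Probe with DRY_RUN first; only do the real checkout if nothing blocks."""
    conflicts = check_out_revision(commit, CheckoutMode.DRY_RUN)
    blockers = [c for c in conflicts if c in blocking]
    if blockers:
        # Nothing has been modified yet, so aborting here leaves the
        # server-side state and .hg/dirstate consistent with each other.
        raise RuntimeError("conflicting changes: " + ", ".join(blockers))
    return check_out_revision(commit, CheckoutMode.NORMAL)
```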

To make this work, we had to make a number of changes to `CheckoutAction`,
`CheckoutContext`, `EdenMount`, and `TreeInode` to keep track of the
`checkoutMode` and ensure that no changes are made to the working copy when a
`DRY_RUN` is in effect.

One minor issue (for which there is a `TODO`) is that a `DRY_RUN` will not
report any `DIRECTORY_NOT_EMPTY` conflicts that may exist. As `TreeInode` is
implemented today, it is a bit messy to report this type of conflict without
modifying the working copy along the way.

Finally, any `UNTRACKED_ADDED` conflict should cause an update to
abort to match the behavior in stock Mercurial if the user has the following
config setting:

```
[commands]
update.check = noconflict
```

Though the original name for this setting was:

```
[experimental]
updatecheck = noconflict
```

Although I am on Mercurial 4.4.1, the `update.check` setting does not seem to
take effect when I run the integration tests, but the `updatecheck` setting
does, so for now, I set both in `hg_extension_test_base.py` with a `TODO` to
remove `updatecheck` once I can get `update.check` to do its job.

Reviewed By: simpkins

Differential Revision: D6366007

fbshipit-source-id: bb3ecb1270e77d59d7d9e7baa36ada61971bbc49
2017-11-29 21:50:34 -08:00
Wez Furlong
28e74f1ba6 add scmGetStatusBetweenRevisions thrift call
Summary:
The goal is to provide a fast path for watchman to flesh
out the total set of changed files when it needs to relay that information
on to consumers.

We choose not to include the full list in the Journal when checking out
between revisions because it will not always be needed and may be an
expensive `O(repo)` operation to compute.  This means that watchman
needs to expand that information for itself, and that is currently
a fairly slow query to invoke through mercurial.

Since watchman is responding to journal events from eden we know that
we have tree data for the old and new hashes and thus we should be
able to efficiently compute that diff.

This implementation is slightly awful because it will instantiate an
unlinked TreeInode object for one side of the query, and will in
turn populate any children that differ as it walks down the tree.
A follow on diff will look at making a flavor of the diff code that
can diff raw Tree objects instead.

Reviewed By: bolinfest

Differential Revision: D6305844

fbshipit-source-id: 7506c9ba1f4febebcdc283c414261810a3951588
2017-11-28 19:36:32 -08:00
Michael Bolin
99e29ed185 Remove getParentCommits() Thrift API.
Summary: It is currently unused. Let's bring it back if/when we need it.

Reviewed By: chadaustin

Differential Revision: D6368867

fbshipit-source-id: 096015ba597a6e04f544273ba9773576429e39ce
2017-11-20 15:56:46 -08:00
Michael Bolin
5d738193e5 Store Hg dirstate data in Hg instead of Eden.
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.

Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.

This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.

Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.

In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.
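
To make the header layout concrete, here is a small Python sketch using `struct`. It is not the code in `eden/py/dirstate.py`: the helper names are hypothetical, and the big-endian unsigned encoding of the version number is an assumption made for this example.

```python
import struct

# Two 20-byte parent hashes (40 bytes) followed by a 4-byte version number.
# ">" (big-endian) and "I" (unsigned) are assumptions for this sketch.
HEADER = struct.Struct(">20s20sI")


def pack_header(parent1: bytes, parent2: bytes, version: int) -> bytes:
    """Build the 44-byte header that precedes the rest of the file."""
    if len(parent1) != 20 or len(parent2) != 20:
        raise ValueError("parent hashes must be exactly 20 bytes")
    return HEADER.pack(parent1, parent2, version)


def unpack_header(data: bytes):
    """Return (parent1, parent2, version) from the start of the file data."""
    return HEADER.unpack_from(data)
```

Reading the version right after the parents lets a reader reject an unknown format before touching the rest of the file.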

Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shutdown Eden, delete the old mounts, restart Eden, and re-clone.

In the end, this change deletes a good deal of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.

Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.

Reviewed By: simpkins

Differential Revision: D6179950

fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
2017-11-06 19:56:49 -08:00
Michael Bolin
ac5b213e92 Include the dirstate tuples and copymap when backing up the dirstate.
Summary:
Previously, the `savebackup()` and `restorebackup()` methods in `eden_dirstate`
only retained the parent commit hashes. With this change, now the dirstate tuples
and entries in the copymap for the dirstate are also included as part of the saved
state.

Failing to restore all of the state caused issues when doing things like aborting
an `hg split`, as observed by one of our users. Although this fix works, we ultimately
plan to move the responsibility for persisting dirstate data out of Eden and into the
Hg extension. Then the data will live in `.hg/dirstate` like it would for the default
dirstate implementation.

Reviewed By: simpkins

Differential Revision: D6145420

fbshipit-source-id: baa077dee73847a47cc171cd980cdd272b3a3a99
2017-10-25 22:36:06 -07:00
Chad Austin
3240f2dff9 warn when creating new logger, allow disabling inheritance for log level
Summary: This diff helps avoid some common pitfalls when using set_log_level.

Reviewed By: simpkins

Differential Revision: D6142849

fbshipit-source-id: 7fa35392dda148af90d0aefdb872b6e8a8b770db
2017-10-25 09:56:16 -07:00
Wez Furlong
25a9786ca5 augment JournalDelta with unclean paths on snapshot hash change
Summary:
We were previously generating a simple JournalDelta consisting of
just the from/to snapshot hashes.  This is great from a `!O(repo)` perspective
when recording what changed but makes it difficult for clients downstream
to reason about changes that are not tracked in source control.

This diff adds a concept of `uncleanPaths` to the journal; these are paths
that we think are/were different from the hashes in the journal entry.

Since JournalDelta needs to be able to be merged I've opted for a simple
list of the paths that have a differing status; I'm not including all of
the various dirstate states for this because it is not obvious how to
reconcile the state across successive snapshot change events.

The `uncleanPaths` set is populated with an initial set of different paths as
the first part of the checkout call (prior to changing the hash), and then is
updated after the hash has changed to capture any additional differences.

Care needs to be taken to avoid recursively attempting to grab the parents lock
so I'm replicating just a little bit of the state management glue in the
`performDiff` method.

The Journal was not setting the from/to snapshot hashes when merging deltas.
This manifested in the watchman integration tests; we'd see the null revision
as the `from` and the `to` revision held the `from` revision(!).

On the watchman side we need to ask source control to expand the list of
files that changed when the from/to hashes are different; I've added code
to handle this.  This doesn't do anything smart in the case that the
source control aware queries are in use.  We'll look at that in a following
diff as it isn't strictly eden specific.

`watchman clock` was returning a basically empty clock unconditionally,
which meant that most since queries would report everything since the start
of time.  This is most likely contributing to poor Buck performance, although
I have not investigated the performance aspect of this.  It manifested itself
in the watchman integration tests.

Reviewed By: simpkins

Differential Revision: D5896494

fbshipit-source-id: a88be6448862781a1d8f5e15285ca07b4240593a
2017-10-16 22:46:54 -07:00
Chad Austin
73dc6620fc add a debug CLI command to set a log category's level
Reviewed By: simpkins

Differential Revision: D6068539

fbshipit-source-id: 45d0baecc2291c83c48ad22403d44a790b270b9a
2017-10-16 16:37:10 -07:00
Chad Austin
3fb5680152 Rename ConflictType::MODIFIED to ConflictType::MODIFIED_MODIFIED
Summary: Per wez, this makes the MODIFIED case consistent with the other conflict types (e.g. local_remote).  Side benefit of avoiding some naming conflicts in the Haskell/Rust thrift tooling.

Reviewed By: wez

Differential Revision: D5882327

fbshipit-source-id: 3ec68c44d8c8a5c5675f1ced3842d29376d46fe2
2017-09-21 16:54:37 -07:00
Michael Bolin
e837848da5 Introduce a special NoValueForKeyError for hgGetDirstateTuple() and hgCopyMapGet().
Summary:
Previously, we were generating a bit of disconcerting noise in our logs when
requesting a non-existent key in the dirstate or its copy map. We were also
susceptible to a logical error in the Eden side being silently translated to
a `KeyError` on the Python side.

Now we make things more explicit by converting a `std::out_of_range` on the C++
side to an explicit `NoValueForKeyError` that is defined in `eden.thrift`.
Now the Python side catches a `NoValueForKeyError` explicitly and converts it
into a `KeyError`. Other types of exceptions should pass through rather than be
swallowed.
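
A sketch of the Python-side translation, under stated assumptions: the `NoValueForKeyError` class below is a local stand-in for the generated thrift exception, and `get_dirstate_tuple` with its `fetch` callable is a hypothetical wrapper rather than the extension's real client code.

```python
class NoValueForKeyError(Exception):
    """Stand-in for the exception type defined in eden.thrift."""

    def __init__(self, key):
        super().__init__(key)
        self.key = key


def get_dirstate_tuple(fetch, path):
    """Call the (hypothetical) thrift fetcher, mapping only the 'missing' case."""
    try:
        return fetch(path)
    except NoValueForKeyError as ex:
        # Expected case: the key is simply absent, so behave like a dict.
        raise KeyError(ex.key) from ex
    # Any other exception propagates unchanged, so real server-side
    # errors are not mistaken for a missing key.
```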

This also updates the log messages to communicate when there is no value for a
key. The messaging is improved so that it no longer appears to be a logical
error.

Reviewed By: wez

Differential Revision: D5800833

fbshipit-source-id: c44f2caf04622475d218593037cc6616bbb1c701
2017-09-11 10:52:09 -07:00
Jyothsna Konisa
8fb37c1ada Diagnostic tool to report Stat information of EdenFs
Summary:
Added a new tool to report stat information for EdenFS, such as FUSE counters, memory counters, latencies, and inode status for all mount points.

eden stat : Prints general information about Eden, such as the list of mount points and the loaded, unloaded, and materialized inodes in each mount point. It also reports how well the periodic unload job is doing by showing the number of inodes that job has unloaded.

eden stat io : Prints the number of calls made to each system call in EdenFS.

eden stat memory : Prints the memory stats for EdenFS.

eden stat latency : Prints the latencies of system calls in EdenFS.

Reviewed By: bolinfest

Differential Revision: D5660345

fbshipit-source-id: 97a1c2b83a6d8df0cd1b82c4d54b52d7ebd126bd
2017-08-25 12:49:35 -07:00
Braden Watling
ab43c66a8d Add test to verify that eden debug getpath indicates when inodes are unloaded
Summary:
This test was supposed to be a part of D5627411 but it was causing strange behaviour so was brought to a separate diff for further investigation.

After investigating, the test didn't pass because the UnloadedInodeData struct only contained the name of the file, not the path to it. The fix for this was to implement a way to get the relative path of the file even after the inode is unloaded.

Reviewed By: simpkins

Differential Revision: D5646929

fbshipit-source-id: f166398a651e8aea49da7e4474a5ad7fde2eaa4e
2017-08-25 08:34:31 -07:00
Jyothsna Konisa
72b61a5ddc Changes to return unloaded inode count for TreeInode::unloadChildrenNow
Summary:
1. Modified `TreeInode::unloadChildrenNow()` to return the number of inodes that have been unloaded.
2. Modified `EdenServiceHandler::unloadInodeForPath()` to return the number of inodes that are unloaded.

Reviewed By: simpkins

Differential Revision: D5627539

fbshipit-source-id: 4cdb0433dced6bf101158b9e6f8c35de67d9abbe
2017-08-22 19:50:00 -07:00