Commit Graph

97 Commits

Author SHA1 Message Date
Brian Strauch
1556dec257 Total FUSE access time
Summary: Record a rolling sum of the time taken by any FUSE call on a per-process basis

Reviewed By: strager

Differential Revision: D16553149

fbshipit-source-id: 54f1e453916727a40f245b294239dc1b232a8967
2019-08-02 09:44:03 -07:00
Brian Strauch
3abcc3bef1 Record backing store imports with RequestData
Summary: Uses the existing RequestData class to make calls to static functions to set and get the `didImportFromBackingStore` flag.

Reviewed By: simpkins

Differential Revision: D16461868

fbshipit-source-id: e3ed39249f5dd1a842ad06a204b5933014b12f7f
2019-08-01 13:38:31 -07:00
Adam Simpkins
2c7f65c021 fix handling of errors that occur early during mount initialization
Summary:
Update `EdenServer::mount()` to correctly handle errors that occur during the
mount `INITIALIZING` phase.  Previously the code did not add error callbacks
to the `Future` result to handle errors during initialization.  As a result we
would propagate the exception back to the thrift caller, but the `EdenMount`
object would remain in our mount point list, stuck forever in the
`INITIALIZING` state.

Reviewed By: strager

Differential Revision: D16590032

fbshipit-source-id: 9adbdf05441dad815096b195ece36f3d958c96a9
2019-07-31 20:09:46 -07:00
Jake Crouch
75452ca107 Add debug command to flush Journal
Summary:
Adds a debug command to eden such that users can flush the journal and cause the subscribers to get a truncated result, this will be useful for debugging [to debug we won't have to lower the memory limit and repeatedly touching files to fill up the journal].

Can be used with "eden debug flush_journal"

Reviewed By: chadaustin

Differential Revision: D16348811

fbshipit-source-id: fdbe6729d0393c424addcd42a091de440383035b
2019-07-30 22:05:32 -07:00
Adam Simpkins
113f62e557 unbreak edenfsctl status
Summary:
D15528156 inadvertently changed the return type of the `getPid()` call from
`i64` to `i32`.  This is a backwards-incompatible change, and causes new
clients to reject the 64-bit response from older EdenFS daemons.  This breaks
`edenfsctl status`, `edenfsctl clone` and other commands.

Reviewed By: pkaush

Differential Revision: D16522765

fbshipit-source-id: eecd344ee4b963d638576f146a87fc88a5003e55
2019-07-26 15:58:12 -07:00
Brian Strauch
d0acc0f175 Count FUSE reads/writes
Summary:
Display FUSE calls, reads, and writes

{F167451325}

Reviewed By: chadaustin

Differential Revision: D16214570

fbshipit-source-id: ce1b3533d7260fb304c7efdaef8567a83d3dcd4a
2019-07-26 10:08:10 -07:00
Chad Austin
fe64ec3874 use fb303 repo in open source build
Summary: Add a dependency from the eden open source build to the fb303 open source build and switch EdenServiceHandler to BaseService.

Reviewed By: simpkins

Differential Revision: D15528156

fbshipit-source-id: 2ca5c31dd9fcc9bac43fd399b27f33b6f2c5ebfc
2019-07-24 21:07:04 -07:00
Jake Crouch
2172bbbf8c Adding memory limit to Journal
Summary:
This diff updates Eden's journal to be bounded in terms of memory usage which should help lessen the likelihood of Eden OOMing and taking up a large amount of our users' resources.

The memory limit is set to be 1 GB per journal [so a user with 3 mounts could expect the journals to possibly use up to 3 GB of memory].

The landing of this diff will need to wait until a version of Watchman that can handle truncation is deployed on all machines using Eden. This means we need to wait for a version of Watchman with D16219267 to fully land to machines.

Reviewed By: strager

Differential Revision: D15954994

fbshipit-source-id: 9a6425527f10a1ce051feb8fc7d092a84712f338
2019-07-23 12:15:38 -07:00
Chad Austin
ae35e76c9c add a getDaemonInfo() thrift method
Summary:
Open source fb303 will not have getPid() or getCommandLine(), so
introduce a new method for Eden's tests.

Reviewed By: fanzeyi

Differential Revision: D16292993

fbshipit-source-id: 5cdc006ec0ee15f50a3e1cebe9b46a3ea275ff78
2019-07-17 13:47:02 -07:00
Brian Strauch
e2d4362896 live debug journal command
Summary: Running `eden debug journal -f` will print and follow the eden journal in a similar style to the unix `tail -f` command.

Reviewed By: chadaustin

Differential Revision: D16112458

fbshipit-source-id: 5304cd0f857bdbeca41c2591e98920f4f1fc8f42
2019-07-09 09:13:28 -07:00
Jake Crouch
74b514ceac Thrift interface for setting memory limit of Journal
Summary: Sets up a thrift interface to set the size of the journal (until truncation is added in the size field in the journal currently does nothing other than being viewable from getMemoryLimit)

Reviewed By: chadaustin

Differential Revision: D16042286

fbshipit-source-id: bc0acdf4ac5516cfac66fa0fbd87254d08ad479b
2019-07-02 19:03:35 -07:00
Jake Crouch
30e6c20988 Displaying duration of journal
Summary: Shows the end-to-end duration of the journal in "eden stats"

Reviewed By: chadaustin

Differential Revision: D15993261

fbshipit-source-id: 46471faca17d4f12ccdd8cea55b2722e33519a74
2019-06-28 16:41:33 -07:00
Jake Crouch
e7036c45cd Update "eden debug journal" to use limit instead of generating a range
Summary: Prior to this diff "eden debug journal" got the latest journal entry and manually seeked for the one "limit" prior to it, this has been updated to just passing the limit to the journal. [This will make the queries always happen from the tip of the journal, since in theory currently something can be added to the journal after the python script gets the latest delta]

Reviewed By: strager

Differential Revision: D15972018

fbshipit-source-id: 0ee0dd88a1e9edef5ccce3b3da2dbc09aa64f8a9
2019-06-26 16:38:33 -07:00
Wez Furlong
f5d9a06dc9 eden: add thrift calls for adding/removing bind mounts
Summary:
These allow the cli to setup and tear down mounts and
have the eden server keep track of them.

Reviewed By: strager

Differential Revision: D15707318

fbshipit-source-id: abdb8eaa28c8c67c8211a8af1647efe3a083e998
2019-06-25 18:42:37 -07:00
Adam Simpkins
92fc1d83d9 update license headers in thrift files
Summary:
Update the copyright & license headers in thrift files to reflect the
relicensing to GPLv2+

Reviewed By: wez

Differential Revision: D15487082

fbshipit-source-id: 33f68617037f36c07075fb962a16a4d8f55bd6a6
2019-06-19 17:02:46 -07:00
Adam Simpkins
aa45fa2cb7 periodically reload the config files
Summary:
Add a periodic task to reload the configuration file from disk.  By default
this runs once every 5 minutes, but this interval can be controlled from the
config file.

At the moment reloading the config file does not do much other than update the
interval for how frequently the config file is reloaded.  However, I plan to
add additional periodic tasks shortly that are controlled by this config
setting.

This will also make it possible for other parts of the code to
access the config settings in the `ServerState` and use them as-is without
checking to see if they reloaded.  Currently all of the code that accesses
config values performs a check to see if the config needs to be reloaded.  If
we want to switch to Mercurial-style configs in the future that check will be
substantially more expensive.

This diff also includes a new thrift call to force the config file to be
reloaded immediately.  This can be used to restart automatic config reloading
if it is ever disabled in the config file.

Reviewed By: wez

Differential Revision: D15756357

fbshipit-source-id: 1999f4730903633ce838842932a6ae6a65eda4e6
2019-06-14 18:14:43 -07:00
Adam Simpkins
7309869981 add a thrift call for getting config values
Summary:
Add a thrift call to get the current config settings.

My primary use case for this method at the moment is to make it possible to
build integration tests that check the config behavior.  However in the future
this will probably also be useful for building CLI commands to report the
current config values to allow debugging if there are ever issues.  This API
can also be used to force EdenFS to immediately reload the config from disk.

Reviewed By: strager

Differential Revision: D15572124

fbshipit-source-id: da3bc982f9c419b3314a8b0560c9bd327760d429
2019-06-11 13:08:28 -07:00
Jake Crouch
0dc6812f33 Print out Journal Info with edenfsctl
Summary: Print out memory usage and entry counts with edenfsctl stats

Reviewed By: chadaustin

Differential Revision: D15607015

fbshipit-source-id: 866960ea1d3b5e9fdbe24df3b57fba419795ec76
2019-06-07 13:37:02 -07:00
Adam Simpkins
799d2d6834 update initiateShutdown() to be able to throw exceptions
Summary:
Update the thrift definition for `initiateShutdown()` to indicate that it may
throw `EdenError` on failure.  Without this all application-level errors that
occur will be translated to generic thrift `TApplicationException` objects on
the wire.

Also move this method declaration out of the "debugging APIs" section of the
file.

Reviewed By: strager

Differential Revision: D15572121

fbshipit-source-id: 7e621a24abd4347cedbb1bcce1ae9c2b70f991fd
2019-06-05 11:50:36 -07:00
Matt Glazar
49f4c37b67 Add missing copyright notices
Summary: Internal Facebook infrastructure is nagging me about some files not having a Facebook copyright notice. Add a notice to these files to make the nagging stop.

Reviewed By: simpkins

Differential Revision: D14173944

fbshipit-source-id: 7234431224fcf4f86ea56ca2f9108f47ef959d87
2019-03-07 19:32:39 -08:00
Wez Furlong
8847a7a061 add dtype as an optional return value from glob
Summary:
This diff adds the dtype field to the glob results;
this will help to reduce the cost of some watchman queries by avoiding a
getFileInformation call that instantiates inodes.

As part of this, I added a bunch of unit test coverage.

Reviewed By: strager

Differential Revision: D8779149

fbshipit-source-id: 3064a3e42be55ec576fed9e0f7112edef426f32d
2019-02-19 11:26:26 -08:00
Adam Simpkins
5d4279567d add thrift APIs to the fault injection framework
Summary: Add some thrift APIs that allow injecting faults into Eden's FaultInjector.

Reviewed By: wez

Differential Revision: D14079490

fbshipit-source-id: a9ec16b560ddcc79d5d819dbbc120ad5da556b4e
2019-02-15 19:27:17 -08:00
Adam Simpkins
798ccc750a track EdenMounts while they are initializing
Summary:
This updates the logic in EdenServer to add the EdenMount to the mountPoints_
map as soon as it is created, so that we track mount points as they are
initializing.

I don't expect this change to have any major impact in functionality yet.  In
a subsequent diff I also plan have EdenServer keep mount points in the
mountPoints_ map longer while they are shutting down.  I expect that change to
matter a bit more, as that will allow us to do a better job reporting and
debugging when mount points are taking a non-trivial amount of time to become
unreferenced and fully shut down.

Reviewed By: strager

Differential Revision: D13503050

fbshipit-source-id: 2e0e8dfde64c6a005efd6dcf503ad7577f314356
2019-01-16 19:35:09 -08:00
Adam Simpkins
05cb1dcd4f report the mount state in listMounts()
Summary:
Update the `listMounts()` thrift API to also report the current mount point
state.  This will allow us to do a better job of reporting mount points that
are in the process of initializing or shutting down.

This change splits the `MountInfo` thrift type into two distinct types for
the `listMounts()` vs `mount()` APIs.  However this change should be
completely backwards compatible at the wire protocol level for older client
and server code.

Reviewed By: strager

Differential Revision: D13503049

fbshipit-source-id: 68e7ca708b956991c8fd93bbf8973d90650aced9
2019-01-02 12:58:08 -08:00
Chad Austin
3dbad3ac01 show blob cache sizes in eden stats
Summary:
Have `eden stats` print the size of the blob cache if the running
edenfs has information about it.

Reviewed By: strager

Differential Revision: D13349220

fbshipit-source-id: 9f59f4399f2d4283aa80bcb54ba73c51d555d502
2018-12-06 19:43:52 -08:00
Dan Schatzberg
8fe62ce81b Add command to chown a mount
Summary:
Sandcastle has several cases where we chown the entire
repository which performs terribly on Eden. As a workaround we have a
command to do this in eden without loading all the files.

Reviewed By: chadaustin

Differential Revision: D12857956

fbshipit-source-id: 36cebcc710fbcf4e1eb265df901513cf50a227b9
2018-11-07 08:58:31 -08:00
Dan Schatzberg
b2a4204b4d Add thrift interface to dump tracepoints
Summary:
With this the eden cli can dump tracepoints and translate
them to various formats or perform any processing

Reviewed By: chadaustin

Differential Revision: D10384072

fbshipit-source-id: 8b38e7f6b551a2bd98b3e748ba1cceafeceeec8c
2018-11-01 08:09:19 -07:00
Chad Austin
e750ab68fe expose FUSE accesses over Thrift
Summary:
Add a Thrift API for reading the pid access logs from each
EdenMount/FuseChannel. Used in a future diff.

Reviewed By: strager

Differential Revision: D9477867

fbshipit-source-id: 0897a915ca654bca952aecc123ea40105830a75b
2018-09-10 13:52:51 -07:00
Chad Austin
6394450579 restructure JournalDelta and fix Watchman subscription race
Summary:
Watchman's Eden integration has a bug where the combination of
Watchman querying Eden for overlapping delta ranges ("give me changes
between X and Y, now changes between X+1 and Y+1") and Eden eliding
redundant change events ("add-modify-remove" -> []) results in
Watchman sometimes reporting that a file exists in its final
subscription update when it no longer does.

The fix is to never elide events, even for files that were added and
removed in the same sequence. To continue to support Watchman's `new`
flag, track whether a file existed at the beginning and end of a
journal delta.

Reviewed By: wez

Differential Revision: D9304964

fbshipit-source-id: f34c12b25f2b24e3a0d46fc94aa428528f4c5098
2018-08-15 14:52:06 -07:00
Wez Furlong
cfde0c0717 define paths as binary rather than strings in the thrift interface
Summary:
This prevents `hg status` from blowing up with a UTF-8 decode
error inside the generated thrift code.

Push safety concerns:
* This doesn't change the wire representation of the data
* Existing clients that believe it to be a string will continue to have
  the same behavior
* Buck has its own copy of an older version of the thrift spec, so it will
  continue to work "OK".
* When buck resyncs with our thrift file, some changes will likely be needed
  to convert the byte arrays to strings or paths or whatever is appropriate
  for bucks internal API

Work "OK" above means that clients that currently believe that `string` is
utf-8 encoded will have a runtime error if we ever send them a path that
is not utf-8.  This is the behavior prior to this diff and will continue
to be the behavior for clients (like buck) that have an older version
of the thrift file.

Reviewed By: simpkins

Differential Revision: D9270843

fbshipit-source-id: b01135aec9152aaf5199e1c654ddd7f61c03717e
2018-08-11 01:35:49 -07:00
Chad Austin
fae4229ff2 add eden gc command
Summary:
Add the beginnings of an eden gc command. Today it's equivalent to
`eden debug clear_local_caches` followed by `eden
debug_compact_local_storage`, except that it compacts each column as
they're cleared to minimize peak disk consumption.

Eventually, it will also unload in-memory inodes, flush data from the
overlay, and clear the kernel's VFS cache too.

Reviewed By: wez

Differential Revision: D9138305

fbshipit-source-id: b303a63f601014cf38ca94c9e6f7c04394159ea8
2018-08-10 11:38:20 -07:00
Eamonn Kent
b9f7aa0d95 Provide real-time values for memory statistics
Summary:
We provide average values for vmRSS and memory (dirty bytes) on linux systems
by parsing the proc file system. This change allows the current values
(non-average) to be queried and calculated from the thrift API. The cli has
been updated to present these values (and no longer consumes the parsed smaps
files from EdenServer).

Reviewed By: chadaustin

Differential Revision: D8549305

fbshipit-source-id: 77f6838f39784e7ebeda11d8c66dba1fa9f10591
2018-06-21 15:51:51 -07:00
Wez Furlong
bfad766a21 add initiateShutdown() thrift method with a shutdown reason
Summary:
We've seen what appears to be phantom calls to shutdown() so we'd like
to add some degree of auditing.  This diff adds a new method with some
context; this will allow us to distinguish between `eden stop`, `eden restart`
and eden server internal calls to the `shutdown` method.   It may still
be possible that something else is calling our shutdown method, but it
seems unlikely as we're only accessible to our own code via a unix domain
socket.

Reviewed By: chadaustin

Differential Revision: D8341595

fbshipit-source-id: 50d58ea0b56e5f42cd37c404048d710bde0d13a3
2018-06-19 11:13:59 -07:00
Chad Austin
0e9cc052c8 add compact_local_storage debug command to cli
Summary: Add a debug command to compact the LocalStore's RocksDB.

Reviewed By: bolinfest

Differential Revision: D8108686

fbshipit-source-id: 116a74d4bd70442a4c60e45d551afa60674f121d
2018-05-31 11:23:21 -07:00
Chad Austin
b9f6bf1c14 add clear_local_caches debug command to cli
Summary:
This adds a debug command to blow away all RocksDB information that
can be reproduced from Mercurial. We will use it to help an Eden user
recover from a corrupted blob.

Reviewed By: bolinfest

Differential Revision: D8108649

fbshipit-source-id: 056dec19d51b9e430b3c2a249747b26830cfc875
2018-05-31 11:23:21 -07:00
Wez Furlong
8be54b4a1b prefetch file batch for hg import helper
Summary:
This removes the main point of contention for eden prefetch
in two ways:

1. We batch up the complete list of blobs so that they can be processed
   in bulk rather than stalling the tree walk
2. We can ask remotefilelog to check and fetch that list to the local
   hgcache, again as a batch, rather than by forcing the data to be
   loaded through into the local store

The goal of this prefetch is to bulk load data from the mercurial server
so that a subsequent file access doesn't have to make a one-off ssh session
for each one, rather than making sure that all the data is loaded into
the local store.

Reviewed By: chadaustin

Differential Revision: D7965818

fbshipit-source-id: 753400460d633b5467c5110e3f5608ce06106e00
2018-05-25 13:51:27 -07:00
Wez Furlong
d245de4a41 add eden prefetch command
Summary:
This is a first pass at a prefetcher.  The idea is simple,
but the execution is impeded by some unfortunate slowness in different
parts of mercurial.

The idea is that you pass a list of glob patterns and we'll do something
to make accessing files that match those patterns ideally faster than
if you didn't give us the prefetch hint.

In theory we could run `hg prefetch -I PATTERN` for this, but prefetch
takes several minutes materializing and walking the whole manifest to
find matches, checking outgoing revs and various other overheads.
There is a revision flag that can be specified to try to reduce this
effort, but it still takes more than a minute.

This diff:

* Removes a `Future::get()` call in the GlobNode code
* Makes `globFiles` use Futures directly rather than `Future::get()`
* Adds a `prefetchFiles` parameter to `globFiles`
* Adds `eden prefetch` to the CLI and makes it call `globFiles` with
  `prefetchFiles=true`
* Adds the abillity to glob over `Tree` as well as the existing `TreeInode`.
  This means that we can avoid allocating inodes for portions of the
  tree that have not yet been loaded.

When `prefetchFiles` is set we'll ask ObjectStore to load the blob for
matching files.  I'm not currently doing this in the `TreeInode` case
on the assumption that we already did this earlier when its `TreeInode::prefetch`
method was called.

The glob executor joins the blob prefetches at each GlobNode level.  It may
be possible to observe higher throughput if we join the complete set at the
end.

Reviewed By: chadaustin

Differential Revision: D7825423

fbshipit-source-id: d2ae03d0f62f00090537198095661475056e968d
2018-05-25 13:51:27 -07:00
Michael Bolin
1a0260b3d1 Introduce a new debug command: eden debug journal.
Summary:
This is used to dump the raw `JournalDelta` entries in the journal.

Hopefully this will help us figure out what's happening in T28686395, or more
generally, why we see get those merge errors that appear in the logs of the
form:

```
Journal for .hg/rebasestate holds invalid Created, Created sequence
```

from `eden/fs/journal/JournalDelta.cpp`.

(Note: this ignores all push blocking failures!)

Reviewed By: wez

Differential Revision: D7855071

fbshipit-source-id: f195813695bec7426329a9aacd84a9b1613feec9
2018-05-23 06:21:41 -07:00
Michael Bolin
e6737d409d Thrift API change: deprecate glob() in favor of globFiles().
Summary:
We need to introduce a new `includeDotfiles` option to `glob()`. [As we have
done for all of our Thrift API, to date], rather than define `glob()` so that it
takes a single struct, we specified the parameters individually, so we can no
longer add new params to `glob()`.

In particular, we need to support `includeDotfiles` because we often configure
Buck to use Watchman to implement `glob()` in `BUCK` files, and when Watchman is
used in Eden, it leverages Eden's Thrift API to implement `glob()`. Because
Buck's `glob()` has an `include_dotfiles` option, we must be able to honor it
and pass it all the way through to Eden's `glob()` implementation.

Rather than name the new API `glob2()`, I'm electing to go with `globFiles()`.
(Perhaps once we eliminate all known users of `glob()` in the wild, which
requires turning over the current version of Watchman we have deployed, we can
redefine `glob()` in `eden.thrift` to be the same as `globFiles()` and then
update everyone to use `glob()` again so it has the more intuitive name.)

Reviewed By: wez

Differential Revision: D7748870

fbshipit-source-id: 92438f9c41e4fbdbd6cdccca5fce0e41cc3e9b07
2018-05-02 15:19:44 -07:00
Adam Simpkins
2fedc3bcea update getScmStatus() to require the commit hash as an argument
Summary:
Change getScmStatus() so that callers must explicitly specify the commit to
diff against.  This should help avoid race conditions around commit or checkout
operations where the parent commit has just changed and eden returns status
information against a commit that wasn't what the client was expecting.

This should still maintain backwards compatibility with older clients that do
not send this parameter yet: we will simply receive the hash as an empty string
in this case, and we still provide the old behavior in this case.

Reviewed By: wez

Differential Revision: D7512338

fbshipit-source-id: 1fb4645dda13b9108c66c2daaa802ea3445ac5f2
2018-04-06 12:51:31 -07:00
Adam Simpkins
65a682a1ff add a function for diffing source control commits
Summary:
Add a function for diffing two source control commits without needing to
instantiate TreeInode objects.

Reviewed By: wez

Differential Revision: D7341604

fbshipit-source-id: 557eef87faa2785ab96d51b09569a46f892a71f6
2018-03-20 16:47:12 -07:00
Adam Simpkins
f6685834de update eden to be more liberal when parsing BinaryHash arguments
Summary:
Update Eden's thrift service handler code to accept BinaryHash arguments either
as 20-byte binary values or as 40-byte hexadecimal values.

This will make it easier to transition APIs like getScmStatusBetweenRevisions()
to use 20-byte binary hash arguments without breaking existing clients.

Reviewed By: wez

Differential Revision: D7341607

fbshipit-source-id: 3e952211900d3ec4b9c2073cf3afd55ae7e253ea
2018-03-20 16:47:12 -07:00
Adam Simpkins
dfe7cac4c2 add an integration test for getScmStatusBetweenRevisions()
Summary:
Add an integration test for the getScmStatusBetweenRevisions() thrift call.

This call apparently gets the ADDED and REMOVED states backwards.  For now the
test checks for the current (incorrect) behavior.

This also fixes the thrift definition for this function to stop using the
BinaryHash typedef.  Unlike most of our other thrift functions this method
appears to require the arguments as 40-byte hexadecimal strings.

Reviewed By: wez

Differential Revision: D7341606

fbshipit-source-id: 73cbd0ecf4445da6b1f0ef9cf6d9dce47e6fb593
2018-03-20 15:07:39 -07:00
Puneet Kaushik
a7f99f7f2c Added thrift request to report outstanding FUSE calls
Summary:
Added a thrift call to return the outstanding FUSE requests.
Cli will call the thrift and print the output.
Added a unit test to test getOutstandingRequests().

Reviewed By: simpkins

Differential Revision: D7314584

fbshipit-source-id: 420790405babdb734f598e19719b487096ec53ca
2018-03-20 10:25:49 -07:00
Chad Austin
8219f5c60a have eden stats show file and tree counts
Summary:
It's interesting to see the total number of loaded files
vs. trees when the loaded inode count is high.

Reviewed By: wez

Differential Revision: D6765874

fbshipit-source-id: 178b30184428bd5cf5e005eb475e4f5a1476c385
2018-01-24 15:29:16 -08:00
Chad Austin
9a3fa8bd60 replace the system memory info in eden stats with process memory
Summary:
`eden stats` used to show system memory usage which was not very
interesting (and can be gleaned from top).  Instead read the contents
of /proc/self/smaps and sum the Private_Dirty fields to get a number
that more accurately reflects impact on the rest of the system.

Reviewed By: wez

Differential Revision: D6575595

fbshipit-source-id: 9badc5cd5a1b56d3ccb27edd1a2d20ee74ec34ae
2017-12-18 12:00:58 -08:00
Michael Bolin
5e2afa735f Change how the UNTRACKED_ADDED conflict and merges are handled.
Summary:
Previously, we used the Mercurial code `g` when faced with an `UNTRACKED_ADDED`
file conflict, but that was allowing merges to silently succeed that should not
have. This revision changes our logic to use the code `m` for merge, which
unearthed that we were not honoring the user's `update.check` setting properly.

Because we use `update.check=noconflict` internally at Facebook, we changed the
Eden integration tests to default to verifying Hg running with this setting. To
support it properly, we had to port this code from `update.py` in Mercurial to
our own `_determine_actions_for_conflicts()` function:

```
if updatecheck == 'noconflict':
    for f, (m, args, msg) in actionbyfile.iteritems():
        if m not in ('g', 'k', 'e', 'r', 'pr'):
            msg = _("conflicting changes")
            hint = _("commit or update --clean to discard changes")
            raise error.Abort(msg, hint=hint)
```

However, this introduced an interesting issue where the `checkOutRevision()`
Thrift call from Hg would update the `SNAPSHOT` file on the server, but
`.hg/dirstate` would not get updated with the new parents until the update
completed on the client. With the new call to `raise error.Abort` on the client,
we could get in a state where the `SNAPSHOT` file had the hash of the commit
assuming the update succeeded, but `.hg/dirstate` reflected the reality where it
failed.

To that end, we changed `checkOutRevision()` to take a new parameter,
`checkoutMode`, which can take on one of three values: `NORMAL`, `DRY_RUN`, and
`FORCE`. Now if the user tries to do an ordinary `hg update` with
`update.check=noconflict`, we first do a `DRY_RUN` and examine the potential
conflicts. Only if the conflicts should not block the update do we proceed with
a call to `checkOutRevision()` in `NORMAL` mode.

To make this work, we had to make a number of changes to `CheckoutAction`,
`CheckoutContext`, `EdenMount`, and `TreeInode` to keep track of the
`checkoutMode` and ensure that no changes are made to the working copy when a
`DRY_RUN` is in effect.

One minor issue (for which there is a `TODO`) is that a `DRY_RUN` will not
report any `DIRECTORY_NOT_EMPTY` conflicts that may exist. As `TreeInode` is
implemented today, it is a bit messy to report this type of conflict without
modifying the working copy along the way.

Finally, any `UNTRACKED_ADDED` conflict should cause an update to
abort to match the behavior in stock Mercurial if the user has the following
config setting:

```
[commands]
update.check = noconflict
```

Though the original name for this setting was:

```
[experimental]
updatecheck = noconflict
```

Although I am on Mercurial 4.4.1, the `update.check` setting does not seem to
take effect when I run the integration tests, but the `updatecheck` setting
does, so for now, I set both in `hg_extension_test_base.py` with a `TODO` to
remove `updatecheck` once I can get `update.check` to do its job.

Reviewed By: simpkins

Differential Revision: D6366007

fbshipit-source-id: bb3ecb1270e77d59d7d9e7baa36ada61971bbc49
2017-11-29 21:50:34 -08:00
Wez Furlong
28e74f1ba6 add scmGetStatusBetweenRevisions thrift call
Summary:
The goal is to provide a fast path for watchman to flesh
out the total set of changed files when it needs relay that information
on to consumers.

We choose not to include the full list in the Journal when checking out
between revisions because it will not always be needed and may be an
expensive `O(repo)` operation to compute.  This means that watchman
needs to expand that information for itself, and that is currently
a fairly slow query to invoke through mercurial.

Since watchman is responding to journal events from eden we know that
we have tree data for the old and new hashes and thus we should be
able to efficiently compute that diff.

This implementation is slightly awful because it will instantiate an
unlinked TreeInode object for one side of the query, and will in
turn populate any children that differ as it walks down the tree.
A follow on diff will look at making a flavor of the diff code that
can diff raw Tree objects instead.

Reviewed By: bolinfest

Differential Revision: D6305844

fbshipit-source-id: 7506c9ba1f4febebcdc283c414261810a3951588
2017-11-28 19:36:32 -08:00
Michael Bolin
99e29ed185 Remove getParentCommits() Thrift API.
Summary: It is currently unused. Let's bring it back if/when we need it.

Reviewed By: chadaustin

Differential Revision: D6368867

fbshipit-source-id: 096015ba597a6e04f544273ba9773576429e39ce
2017-11-20 15:56:46 -08:00
Michael Bolin
5d738193e5 Store Hg dirstate data in Hg instead of Eden.
Summary:
This is a major change to how we manage the dirstate in Eden's Hg extension.

Previously, the dirstate information was stored under `$EDEN_CONFIG_DIR`,
which is Eden's private storage. Any time the Mercurial extension wanted to
read or write the dirstate, it had to make a Thrift request to Eden to do so on
its behalf. The upside is that Eden could answer dirstate-related questions
independently of the Python code.

This was sufficiently different than how Mercurial's default dirstate worked
that our subclass, `eden_dirstate`, had to override quite a bit of behavior.
Failing to manage the `.hg/dirstate` file in a way similar to the way Mercurial
does has exposed some "unofficial contracts" that Mercurial has. For example,
tools like Nuclide rely on changes to the `.hg/dirstate` file as a heuristic to
determine when to invalidate its internal caches for Mercurial data.

Today, Mercurial has a well-factored `dirstatemap` abstraction that is primarily
responsible for the transactions with the dirstate's data. With this split, we can
focus on putting most of our customizations in our `eden_dirstate_map` subclass
while our `eden_dirstate` class has to override fewer methods. Because the
data is managed through the `.hg/dirstate` file, transaction logic in Mercurial that
relies on renaming/copying that file will work out-of-the-box. This change
also reduces the number of Thrift calls the Mercurial extension has to make
for operations like `hg status` or `hg add`.

In this revision, we introduce our own binary format for the `.hg/dirstate` file.
The logic to read and write this file is in `eden/py/dirstate.py`. After the first
40 bytes, which are used for the parent hashes, the next four bytes are
reserved for a version number for the file format so we can manage file format
changes going forward.

Admittedly one downside of this change is that it is a breaking change.
Ideally, users should commit all of their local changes in their existing mounts,
shutdown Eden, delete the old mounts, restart Eden, and re-clone.

In the end, this change deletes a number of Mercurial-specific code and Thrift
APIs from Eden. This is a better separation of concerns that makes Eden more
SCM-agnostic. For example, this change removes `Dirstate.cpp` and
`DirstatePersistance.cpp`, replacing them with the much simpler and more
general `Differ.cpp`. The Mercurial-specific logic from `Dirstate.cpp` that turned
a diff into an `hg status` now lives in the Mercurial extension in
`EdenThriftClient.getStatus()`, which is much more appropriate.

Note that this reverts the changes that were recently introduced in D6116105:
we now need to intercept `localrepo.localrepository.dirstate` once again.

Reviewed By: simpkins

Differential Revision: D6179950

fbshipit-source-id: 5b78904909b669c9cc606e2fe1fd118ef6eaab95
2017-11-06 19:56:49 -08:00