2016-05-12 23:43:17 +03:00
|
|
|
include "common/fb303/if/fb303.thrift"
|
Reimplement dirstate used by Eden's Hg extension as a subclass of Hg's dirstate.
Summary:
This is a major change to Eden's Hg extension.
Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.
In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.
However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.
Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).
In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)
That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store as Thrift-serialized data in an ordinary file along with the `_copymap`
data.
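The split described above can be sketched as a lookup that consults the small "nonnormal" store first and falls back to the set of files tracked by the parent commit. This is only an illustration under stated assumptions, not the code in this revision: `lookup`, `nonnormal`, and `tracked_in_parent` are hypothetical names, and the `(status, mode, size, mtime)` layout with placeholder values is modeled loosely on Mercurial's dirstate tuples.

```python
from typing import Dict, Set, Tuple

# (status, mode, size, mtime); this layout is assumed for illustration.
DirstateTuple = Tuple[str, int, int, int]


def lookup(
    path: str,
    nonnormal: Dict[str, DirstateTuple],
    tracked_in_parent: Set[str],
) -> DirstateTuple:
    """Return the dirstate entry for path, raising KeyError if it is untracked."""
    # Added, removed, or merged files live in the small explicit store,
    # and an entry there overrides whatever the parent commit says.
    if path in nonnormal:
        return nonnormal[path]
    # Any other file tracked by the parent commit is treated as "normal".
    # The mode, size, and mtime here are placeholders, not real metadata.
    if path in tracked_in_parent:
        return ("n", 0o644, -1, -1)
    raise KeyError(path)
```

Because only the exceptional files are stored explicitly, the persisted data stays proportional to the number of pending changes rather than to the size of the repository.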
In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.
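A rough sketch of what such a pass-through collection might look like follows. The `hgGetDirstateTuple` and `hgSetDirstateTuple` method names come from the Thrift service in this file; everything else is an assumption made for illustration: `FakeEdenClient` is an in-memory stand-in for the real Thrift client, `PassThroughDirstateMap` is a hypothetical name, and the `(mountPoint, relativePath)` argument list for the getter is a guess.

```python
class FakeEdenClient:
    """In-memory stand-in for the Thrift client (hypothetical)."""

    def __init__(self):
        self._tuples = {}
        self.calls = 0  # counts simulated round trips to the server

    def hgSetDirstateTuple(self, mountPoint, relativePath, dirstate_tuple):
        self.calls += 1
        self._tuples[(mountPoint, relativePath)] = dirstate_tuple

    def hgGetDirstateTuple(self, mountPoint, relativePath):
        self.calls += 1
        # A missing entry surfaces as KeyError in this fake client.
        return self._tuples[(mountPoint, relativePath)]


class PassThroughDirstateMap:
    """Dict-like view that forwards every access to the server."""

    def __init__(self, client, mount_point):
        self._client = client
        self._mount_point = mount_point

    def __getitem__(self, path):
        return self._client.hgGetDirstateTuple(self._mount_point, path)

    def __setitem__(self, path, dirstate_tuple):
        self._client.hgSetDirstateTuple(self._mount_point, path, dirstate_tuple)

    def __contains__(self, path):
        try:
            self[path]
        except KeyError:
            return False
        return True
```

In this sketch every read, write, and membership test costs one call to the server, which is the overhead that client-side caching and batch APIs would be meant to reduce.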
One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, such as those in the implementation of `eden_dirstate.status()`,
in order to make `hg remove` work).
In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!
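The shape of that fix can be illustrated as follows. This is a hypothetical free function, not the actual `EdenMatchInfo.make_glob_list()` method, and the `is_dir` predicate is an assumption standing in for however the extension decides whether a path is a directory.

```python
from typing import Callable, List


def make_glob_list(names: List[str], is_dir: Callable[[str], bool]) -> List[str]:
    """Turn command-line path arguments into glob patterns."""
    globs = []
    for name in names:
        if is_dir(name):
            # A directory argument means "everything underneath it".
            globs.append(name + "/**/*")
        else:
            # A plain file must be matched by its own name; expanding it to
            # name/**/* would never match the file itself.
            globs.append(name)
    return globs
```

For example, with `is_dir` returning true only for `src`, the arguments `["src", "foo"]` become `["src/**/*", "foo"]`.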
I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.
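A minimal sketch of that client-side filtering, assuming the server returns a mapping of path to status code (as in `ThriftHgStatus.entries`) and that `match` can be called with a path and returns a boolean, as Mercurial matcher objects can. The `filter_status` name and the string status codes are illustrative only.

```python
from typing import Callable, Dict


def filter_status(
    entries: Dict[str, str],
    match: Callable[[str], bool],
) -> Dict[str, str]:
    """Keep only the status entries whose path is accepted by match."""
    return {path: code for path, code in entries.items() if match(path)}
```

The cost of filtering on the client is that the server still computes and sends the full status map, which is why moving this work to the server looks attractive in the long run.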
For the reasons explained above, this revision deletes a lot of code from `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.
Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)
The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.
Reviewed By: simpkins
Differential Revision: D5071778
fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
2017-05-26 21:51:30 +03:00
|
|
|
include "eden/fs/inodes/hgdirstate.thrift"
|
2016-05-12 23:43:17 +03:00
|
|
|
|
|
|
|
namespace cpp2 facebook.eden
|
2016-08-18 17:21:36 +03:00
|
|
|
namespace java com.facebook.eden.thrift
|
2016-05-12 23:43:17 +03:00
|
|
|
namespace py facebook.eden
|
|
|
|
|
2016-09-19 22:48:12 +03:00
|
|
|
/** Thrift doesn't really do unsigned numbers, but we can sort of fake it.
|
|
|
|
* This type is serialized as an integer value that is 64-bits wide and
|
|
|
|
* should round-trip with full fidelity for C++ client/server, but for
|
|
|
|
* other runtimes will have crazy results if the sign bit is ever set.
|
|
|
|
* In practice it is impossible for us to have files that large in eden,
|
|
|
|
* and sequence numbers will take an incredibly long time to ever roll
|
|
|
|
* over and cause problems.
|
|
|
|
* Once t13345978 is done, we can uncomment the cpp.type below.
|
|
|
|
*/
|
|
|
|
typedef i64 /* (cpp.type = "std::uint64_t") */ unsigned64
|
|
|
|
|
2017-02-22 23:19:04 +03:00
|
|
|
/**
|
|
|
|
* A source control hash, as a 20-byte binary value.
|
|
|
|
*/
|
|
|
|
typedef binary BinaryHash
|
|
|
|
|
2016-05-12 23:43:17 +03:00
|
|
|
exception EdenError {
|
2016-05-28 04:16:27 +03:00
|
|
|
1: required string message
|
|
|
|
2: optional i32 errorCode
|
2016-05-12 23:43:17 +03:00
|
|
|
} (message = 'message')
|
|
|
|
|
2017-09-11 20:37:13 +03:00
|
|
|
exception NoValueForKeyError {
|
|
|
|
1: string key
|
|
|
|
}
|
2016-05-12 23:43:17 +03:00
|
|
|
|
|
|
|
struct MountInfo {
|
|
|
|
1: string mountPoint
|
|
|
|
2: string edenClientPath
|
|
|
|
}
|
|
|
|
|
2016-08-18 17:21:36 +03:00
|
|
|
union SHA1Result {
|
2017-02-22 23:19:04 +03:00
|
|
|
1: BinaryHash sha1
|
2016-08-18 17:21:36 +03:00
|
|
|
2: EdenError error
|
|
|
|
}
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* Effectively a `struct timespec`
|
|
|
|
*/
|
2016-09-19 22:48:09 +03:00
|
|
|
struct TimeSpec {
|
|
|
|
1: i64 seconds
|
|
|
|
2: i64 nanoSeconds
|
|
|
|
}
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* Information that we return when querying entries
|
|
|
|
*/
|
2016-09-19 22:48:09 +03:00
|
|
|
struct FileInformation {
|
2016-09-19 22:48:12 +03:00
|
|
|
1: unsigned64 size // wish thrift had unsigned numbers
|
2016-09-19 22:48:09 +03:00
|
|
|
2: TimeSpec mtime
|
|
|
|
3: i32 mode // mode_t
|
|
|
|
}
|
|
|
|
|
2016-09-19 22:48:12 +03:00
|
|
|
/** Holds information about a file, or an error in retrieving that info.
|
|
|
|
* The most likely error will be ENOENT, implying that the file doesn't exist.
|
|
|
|
*/
|
|
|
|
union FileInformationOrError {
|
|
|
|
1: FileInformation info
|
|
|
|
2: EdenError error
|
|
|
|
}
|
|
|
|
|
|
|
|
/** References a point in time in the journal.
|
|
|
|
* This can be used to reason about a point in time in a given mount point.
|
2016-09-19 22:48:14 +03:00
|
|
|
* The mountGeneration value is opaque to the client.
|
2016-09-19 22:48:12 +03:00
|
|
|
*/
|
|
|
|
struct JournalPosition {
|
|
|
|
/** An opaque but unique number within the scope of a given mount point.
|
|
|
|
* This is used to determine when sequenceNumber has been invalidated. */
|
2016-09-19 22:48:14 +03:00
|
|
|
1: i64 mountGeneration
|
2016-09-19 22:48:12 +03:00
|
|
|
|
|
|
|
/** Monotonically incrementing number
|
|
|
|
* Each journalled change causes this number to increment. */
|
|
|
|
2: unsigned64 sequenceNumber
|
|
|
|
|
|
|
|
/** Records the snapshot hash at the appropriate point in the journal */
|
2017-04-28 03:30:14 +03:00
|
|
|
3: BinaryHash snapshotHash
|
2016-09-19 22:48:12 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
/** Holds information about a set of paths that changed between two points.
|
|
|
|
* fromPosition, toPosition define the time window.
|
|
|
|
* paths holds the list of paths that changed in that window.
|
|
|
|
*/
|
|
|
|
struct FileDelta {
|
|
|
|
/** The fromPosition passed to getFilesChangedSince */
|
|
|
|
1: JournalPosition fromPosition
|
|
|
|
/** The current position at the time that getFilesChangedSince was called */
|
|
|
|
2: JournalPosition toPosition
|
|
|
|
/** The complete list of paths from both the snapshot and the overlay that
|
|
|
|
* changed between fromPosition and toPosition */
|
2017-08-11 22:51:51 +03:00
|
|
|
3: list<string> changedPaths
|
|
|
|
4: list<string> createdPaths
|
|
|
|
5: list<string> removedPaths
|
2016-09-19 22:48:12 +03:00
|
|
|
}
|
|
|
|
|
2016-12-17 04:48:28 +03:00
|
|
|
enum StatusCode {
|
2016-11-26 23:00:16 +03:00
|
|
|
CLEAN = 0x0,
|
|
|
|
MODIFIED = 0x1,
|
|
|
|
ADDED = 0x2,
|
|
|
|
REMOVED = 0x3,
|
|
|
|
MISSING = 0x4,
|
|
|
|
NOT_TRACKED = 0x5,
|
|
|
|
IGNORED = 0x6,
|
|
|
|
}
|
|
|
|
|
|
|
|
struct ThriftHgStatus {
|
2016-12-17 04:48:28 +03:00
|
|
|
1: map<string, StatusCode> entries
|
2016-11-26 23:00:16 +03:00
|
|
|
}
|
|
|
|
|
2017-02-16 07:31:48 +03:00
|
|
|
enum ConflictType {
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* We failed to update this particular path due to an error
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
ERROR = 0,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* A locally modified file was deleted in the new Tree
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
MODIFIED_REMOVED = 1,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* An untracked local file exists in the new Tree
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
UNTRACKED_ADDED = 2,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* The file was removed locally, but modified in the new Tree
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
REMOVED_MODIFIED = 3,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* The file was removed locally, and also removed in the new Tree.
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
MISSING_REMOVED = 4,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* A locally modified file was modified in the new Tree
|
|
|
|
* This may be contents modifications, or a file type change (directory to
|
|
|
|
* file or vice-versa), or permissions changes.
|
|
|
|
*/
|
2017-09-22 02:52:13 +03:00
|
|
|
MODIFIED_MODIFIED = 5,
|
2017-04-04 01:47:53 +03:00
|
|
|
/**
|
|
|
|
* A directory was supposed to be removed or replaced with a file,
|
|
|
|
* but it contains untracked files preventing us from updating it.
|
|
|
|
*/
|
2017-05-08 23:49:17 +03:00
|
|
|
DIRECTORY_NOT_EMPTY = 6,
|
2017-02-16 07:31:48 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Details about conflicts or errors that occurred during a checkout operation
|
|
|
|
*/
|
|
|
|
struct CheckoutConflict {
|
|
|
|
1: string path
|
|
|
|
2: ConflictType type
|
|
|
|
3: string message
|
|
|
|
}
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
struct ScmBlobMetadata {
|
|
|
|
1: i64 size
|
|
|
|
2: BinaryHash contentsSha1
|
|
|
|
}
|
|
|
|
|
|
|
|
struct ScmTreeEntry {
|
|
|
|
1: binary name
|
|
|
|
2: i32 mode
|
|
|
|
3: BinaryHash id
|
|
|
|
}
|
|
|
|
|
2017-05-26 21:51:30 +03:00
|
|
|
struct HgNonnormalFile {
|
|
|
|
1: string relativePath
|
|
|
|
2: hgdirstate.DirstateTuple tuple
|
|
|
|
}
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
struct TreeInodeEntryDebugInfo {
|
|
|
|
/**
|
|
|
|
* The entry name. This is just a PathComponent, not the full path
|
|
|
|
*/
|
|
|
|
1: binary name
|
|
|
|
/**
|
|
|
|
* The inode number, or 0 if no inode number has been assigned to
|
|
|
|
* this entry
|
|
|
|
*/
|
|
|
|
2: i64 inodeNumber
|
|
|
|
/**
|
|
|
|
* The entry mode_t value
|
|
|
|
*/
|
|
|
|
3: i32 mode
|
|
|
|
/**
|
|
|
|
* True if an InodeBase object exists for this inode.
|
|
|
|
*/
|
|
|
|
4: bool loaded
|
|
|
|
/**
|
|
|
|
* True if the inode is materialized in the overlay
|
|
|
|
*/
|
|
|
|
5: bool materialized
|
|
|
|
/**
|
|
|
|
* If materialized is false, hash contains the ID of the underlying source
|
|
|
|
* control Blob or Tree.
|
|
|
|
*/
|
|
|
|
6: BinaryHash hash
|
|
|
|
}
|
|
|
|
|
2017-04-28 03:30:14 +03:00
|
|
|
struct WorkingDirectoryParents {
|
|
|
|
1: BinaryHash parent1
|
|
|
|
2: optional BinaryHash parent2
|
|
|
|
}
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
struct TreeInodeDebugInfo {
|
|
|
|
1: i64 inodeNumber
|
|
|
|
2: binary path
|
|
|
|
3: bool materialized
|
|
|
|
4: BinaryHash treeHash
|
|
|
|
5: list<TreeInodeEntryDebugInfo> entries
|
2017-06-23 08:33:01 +03:00
|
|
|
6: i64 refcount
|
2017-04-04 01:47:53 +03:00
|
|
|
}
|
|
|
|
|
2017-08-17 05:56:32 +03:00
|
|
|
struct InodePathDebugInfo {
|
|
|
|
1: string path
|
|
|
|
2: bool loaded
|
2017-08-25 18:24:43 +03:00
|
|
|
3: bool linked
|
2017-08-17 05:56:32 +03:00
|
|
|
}
|
|
|
|
|
2017-08-25 22:41:41 +03:00
|
|
|
/**
|
|
|
|
* Struct to store information about inodes in a mount point.
|
|
|
|
*/
|
|
|
|
struct MountInodeInfo {
|
|
|
|
1: i64 loadedInodeCount
|
|
|
|
2: i64 unloadedInodeCount
|
|
|
|
3: i64 materializedInodeCount
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Struct to store fb303 counters from ServiceData.getCounters() and inode
|
|
|
|
* information of all the mount points.
|
|
|
|
*/
|
|
|
|
struct InternalStats {
|
|
|
|
1: i64 periodicUnloadCount
|
|
|
|
/**
|
|
|
|
* counters is the list of fb303 counters, key is the counter name, value is the
|
|
|
|
* counter value.
|
|
|
|
*/
|
|
|
|
2: map<string, i64> counters
|
|
|
|
/**
|
|
|
|
* mountPointInfo is a map whose key is the path of the mount point and value
|
|
|
|
* is the details like number of loaded inodes, unloaded inodes in that mount
|
|
|
|
* and number of materialized inodes in that mountpoint.
|
|
|
|
*/
|
|
|
|
3: map<string, MountInodeInfo> mountPointInfo
|
|
|
|
}
|
|
|
|
|
2016-05-12 23:43:17 +03:00
|
|
|
service EdenService extends fb303.FacebookService {
|
|
|
|
list<MountInfo> listMounts() throws (1: EdenError ex)
|
|
|
|
void mount(1: MountInfo info) throws (1: EdenError ex)
|
|
|
|
void unmount(1: string mountPoint) throws (1: EdenError ex)
|
|
|
|
|
2017-02-22 23:19:04 +03:00
|
|
|
/**
|
2017-04-28 03:30:14 +03:00
|
|
|
* Get the parent commit(s) of the working directory
|
2017-02-22 23:19:04 +03:00
|
|
|
*/
|
2017-04-28 03:30:14 +03:00
|
|
|
WorkingDirectoryParents getParentCommits(1: string mountPoint)
|
2017-02-22 23:19:04 +03:00
|
|
|
throws (1: EdenError ex)
|
|
|
|
|
2017-02-16 07:31:48 +03:00
|
|
|
/**
|
|
|
|
* Check out the specified snapshot.
|
|
|
|
*
|
|
|
|
* This updates the contents of the mount point so that they match the
|
|
|
|
* contents of the given snapshot.
|
|
|
|
*
|
|
|
|
* Returns a list of conflicts and errors that occurred when performing the
|
|
|
|
* checkout operation.
|
|
|
|
*
|
|
|
|
* If the force parameter is true, the working directory will be forcibly
|
|
|
|
* updated to the contents of the new snapshot, even if there were conflicts.
|
|
|
|
* Conflicts will still be reported in the return value, but the files will
|
|
|
|
* be updated to their new state. If the force parameter is false, files with
|
|
|
|
* conflicts will be left unmodified. Files that are untracked in both the
|
|
|
|
* source and destination snapshots are always left unchanged, even if force
|
|
|
|
* is true.
|
|
|
|
*
|
|
|
|
* On successful return from this function the mount point will point to the
|
|
|
|
* new commit, even if some paths had conflicts or errors. The caller is
|
|
|
|
* responsible for taking appropriate action to update these paths as desired
|
|
|
|
* after checkOutRevision() returns.
|
|
|
|
*/
|
|
|
|
list<CheckoutConflict> checkOutRevision(
|
|
|
|
1: string mountPoint,
|
2017-04-07 03:44:05 +03:00
|
|
|
2: BinaryHash snapshotHash,
|
2017-02-16 07:31:48 +03:00
|
|
|
3: bool force)
|
|
|
|
throws (1: EdenError ex)
|
2016-05-28 04:16:27 +03:00
|
|
|
|
2017-04-07 03:44:06 +03:00
|
|
|
/**
|
2017-04-28 03:30:14 +03:00
|
|
|
* Reset the working directory's parent commits, without changing the working
|
2017-04-07 03:44:06 +03:00
|
|
|
* directory contents.
|
|
|
|
*
|
|
|
|
* This operation is equivalent to `git reset --soft` or `hg reset --keep`
|
|
|
|
*/
|
2017-04-28 03:30:14 +03:00
|
|
|
void resetParentCommits(
|
2017-04-07 03:44:06 +03:00
|
|
|
1: string mountPoint,
|
2017-04-28 03:30:14 +03:00
|
|
|
2: WorkingDirectoryParents parents)
|
2017-04-07 03:44:06 +03:00
|
|
|
throws (1: EdenError ex)
|
|
|
|
|
2016-05-28 04:16:27 +03:00
|
|
|
/**
|
2016-08-18 17:21:36 +03:00
|
|
|
* For each path, returns an EdenError instead of the SHA-1 if any of the
|
|
|
|
* following occur:
|
2016-05-28 04:16:27 +03:00
|
|
|
* - path is the empty string.
|
|
|
|
* - path identifies a non-existent file.
|
|
|
|
* - path identifies something that is not an ordinary file (e.g., symlink
|
|
|
|
* or directory).
|
|
|
|
*/
|
2016-08-18 17:21:36 +03:00
|
|
|
list<SHA1Result> getSHA1(1: string mountPoint, 2: list<string> paths)
|
2016-10-04 21:02:22 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-09-13 04:27:54 +03:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Returns a list of paths relative to the mountPoint.
|
|
|
|
*/
|
|
|
|
list<string> getBindMounts(1: string mountPoint)
|
2016-10-04 21:02:22 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-09-19 22:48:09 +03:00
|
|
|
|
2016-09-19 22:48:12 +03:00
|
|
|
/** Returns the sequence position at the time the method is called.
|
|
|
|
* Returns the instantaneous value of the journal sequence number.
|
|
|
|
*/
|
|
|
|
JournalPosition getCurrentJournalPosition(1: string mountPoint)
|
2016-10-04 21:02:22 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-09-19 22:48:12 +03:00
|
|
|
|
|
|
|
/** Returns the set of files (and dirs) that changed since a prior point.
|
2016-09-19 22:48:14 +03:00
|
|
|
* If fromPosition.mountGeneration is mismatched with the current
|
|
|
|
* mountGeneration, throws an EdenError with errorCode = ERANGE.
|
2016-09-19 22:48:12 +03:00
|
|
|
* This indicates that eden cannot compute the delta for the requested
|
|
|
|
* range. The client will need to recompute a new baseline using
|
|
|
|
* other available functions in EdenService.
|
|
|
|
*/
|
|
|
|
FileDelta getFilesChangedSince(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: JournalPosition fromPosition)
|
2016-09-19 22:48:15 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-09-19 22:48:12 +03:00
|
|
|
|
|
|
|
/** Returns a subset of the stat() information for a list of paths.
|
|
|
|
* The returned list of information corresponds to the input list of
|
|
|
|
* paths; e.g., result[0] holds the information for paths[0].
|
|
|
|
* We only support returning the instantaneous information about
|
|
|
|
* these paths, as we cannot answer with historical information about
|
|
|
|
* files in the overlay.
|
|
|
|
*/
|
|
|
|
list<FileInformationOrError> getFileInformation(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: list<string> paths)
|
2016-10-04 21:02:22 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-09-19 22:48:12 +03:00
|
|
|
|
|
|
|
/** Returns a list of files that match the input globs.
|
|
|
|
* There are no duplicate values in the result.
|
|
|
|
* wildMatchFlags can hold various WildMatchFlags values OR'd together.
|
|
|
|
*/
|
|
|
|
list<string> glob(
|
|
|
|
1: string mountPoint,
|
2017-01-26 23:45:50 +03:00
|
|
|
2: list<string> globs)
|
2016-10-04 21:02:22 +03:00
|
|
|
throws (1: EdenError ex)
|
2016-11-26 23:00:16 +03:00
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
//////// Source Control APIs ////////
|
2016-11-26 23:00:16 +03:00
|
|
|
|
|
|
|
// TODO(mbolin): `hg status` has a ton of command line flags to support.
|
2017-03-31 07:24:10 +03:00
|
|
|
ThriftHgStatus scmGetStatus(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: bool listIgnored,
|
|
|
|
) throws (1: EdenError ex)
|
2016-11-26 23:00:16 +03:00
|
|
|
|
2017-08-23 02:35:00 +03:00
|
|
|
void hgClearDirstate(
|
|
|
|
1: string mountPoint,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
Reimplement dirstate used by Eden's Hg extension as a subclass of Hg's dirstate.
Summary:
This is a major change to Eden's Hg extension.
Our initial attempt to implement `edendirstate` was to create a "clean room"
implementation that did not share code with `mercurial/dirstate.py`. This was
helpful in uncovering the subset of the dirstate API that matters for Eden. It
also provided a better safeguard against upstream changes to `dirstate.py` in
Mercurial itself.
In this implementation, the state transition management was mostly done
on the server in `Dirstate.cpp`. We also made a modest attempt to make
`Dirstate.cpp` "SCM-agnostic" such that the same APIs could be used for
Git at some point.
However, as we have tried to support more of the sophisticated functionality
in Mercurial, particularly `hg histedit`, achieving parity between the clean room
implementation and Mercurial's internals has become more challenging.
Ultimately, the clean room implementation is likely the right way to go for Eden,
but for now, we need to prioritize having feature parity with vanilla Hg when
using Eden. Once we have a more complete set of integration tests in place,
we can reimplement Eden's dirstate more aggressively to optimize things.
Fortunately, the [[ https://bitbucket.org/facebook/hg-experimental/src/default/sqldirstate/ | sqldirstate ]]
extension has already demonstrated that it is possible to provide a faithful
dirstate implementation that subclasses the original `dirstate` while using a different
storage mechanism. As such, I used `sqldirstate` as a model when implementing
the new `eden_dirstate` (distinguishing it from our v1 implementation, `edendirstate`).
In particular, `sqldirstate` uses SQL tables as storage for the following private fields
of `dirstate`: `_map`, `_dirs`, `_copymap`, `_filefoldmap`, `_dirfoldmap`. Because
`_filefoldmap` and `_dirfoldmap` exist to deal with case-insensitivity issues, we
do not support them in `eden_dirstate` and add code to ensure the codepaths that
would access them in `dirstate` never get exercised. Similarly, we also implemented
`eden_dirstate` so that it never accesses `_dirs`. (`_dirs` is a multiset of all directories in the
dirstate, which is an O(repo) data structure, so we do not want to maintain it in Eden.
It appears to be primarily used for checking whether a path to a file already exists in
the dirstate as a directory. We can protect against that in more efficient ways.)
That leaves only `_map` and `_copymap` to worry about. `_copymap` contains the set
of files that have been marked "copied" in the current dirstate, so it is fairly small and
can be stored on disk or in memory with little concern. `_map` is a bit trickier because
it is expected to have an entry for every file in the dirstate. In `sqldirstate`, it is stored
across two tables: `files` and `nonnormalfiles`. For Eden, we already represent the data
analogous to the `files` table in RocksDB/the overlay, so we do not need to create a new
equivalent to the `files` table. We do, however, need an equivalent to the `nonnormalfiles`
table, which we store in as Thrift-serialized data in an ordinary file along with the `_copymap`
data.
In our Hg extension, our implementation of `_map` is `eden_dirstate_map`, which is defined
in a Python file of the same name. Our implementation of `_copymap` is `dummy_copymap`,
which is defined in `eden_dirstate.py`. Both of these collections are simple pass-through data
structures that translate their method calls to Thrift server calls. I expect we will want to
optimize this in the future via some client-side caching, as well as creating batch APIs for talking
to the server via Thrift.
One advantage of this new implementation is that it enables us to delete
`eden/hg/eden/overrides.py`, which overrode the entry points for `hg add` and `hg remove`.
Between the recent implementation of `dirstate.walk()` for Eden and this switch
to the real dirstate, we can now use the default implementation of `hg add` and `hg remove`
(although we have to play some tricks, like in the implementation of `eden_dirstate.status()`
in order to make `hg remove` work).
In the course of doing this revision, I discovered that I had to make a minor fix to
`EdenMatchInfo.make_glob_list()` because `hg add foo` was being treated as
`hg add foo/**/*` even when `foo` was just a file (as opposed to a directory), in which
case the glob was not matching `foo`!
I also had to do some work in `eden_dirstate.status()` in which the `match` argument
was previously largely ignored. It turns out that `dirstate.py` uses `status()` for a number
of things with the `match` specified as a filter, so the output of `status()` must be filtered
by `match` accordingly. Ultimately, this seems like work that would be better done on the
server, but for simplicity, we're just going to do it in Python, for now.
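A rough sketch of that client-side filtering, assuming the unfiltered status is available as a mapping from state name to a list of paths and that `match` can be called on a path to get a boolean (Mercurial matchers are callable this way). The function name `filter_status` and the dictionary layout are hypothetical, not the layout used by `eden_dirstate.status()`.

```python
def filter_status(status, match):
    """Keep only the paths accepted by `match` in each status bucket."""
    return {
        state: [path for path in paths if match(path)]
        for state, paths in status.items()
    }


# Example: restrict a status result to files under "src/".
unfiltered = {
    "modified": ["src/a.py", "docs/readme.txt"],
    "added": ["src/b.py"],
    "removed": ["tools/old.sh"],
}
filtered = filter_status(unfiltered, lambda path: path.startswith("src/"))
```

Buckets that end up empty are kept, so callers can rely on every state key being present.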
For the reasons explained above, this revision deletes a lot of code from `Dirstate.cpp`.
As such, `DirstateTest.cpp` does not seem worth refactoring, though the scenarios it was
testing should probably be converted to integration tests. At a high level, the role of
`DirstatePersistence` has not changed, but the exact data it writes is much different.
Its corresponding unit test is also disabled, for now.
Note that this revision does not change the name of the file where "dirstate data" is written
(this is defined as `kDirstateFile` in `ClientConfig.cpp`), so we should blow away any existing
instances of this file once this change lands. (It is still early enough in the project that it does
not seem worth the overhead of a proper migration.)
The true test of the success of this new approach is the ease with which we can write more
integration tests for things like `hg histedit` and `hg graft`. Ideally, these should require very
few changes to `eden_dirstate.py`.
Reviewed By: simpkins
Differential Revision: D5071778
fbshipit-source-id: e8fec4d393035d80f36516ac050cad025dc3ba31
2017-05-26 21:51:30 +03:00
|
|
|
void hgSetDirstateTuple(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string relativePath,
|
|
|
|
3: hgdirstate.DirstateTuple tuple,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
|
|
|
// Throw KeyError if no entry for relativePath?
|
|
|
|
hgdirstate.DirstateTuple hgGetDirstateTuple(
|
2016-11-26 23:00:16 +03:00
|
|
|
1: string mountPoint,
|
2017-05-26 21:51:30 +03:00
|
|
|
2: string relativePath,
|
2017-09-11 20:37:13 +03:00
|
|
|
) throws (
|
|
|
|
1: EdenError ex
|
|
|
|
2: NoValueForKeyError noValueForKeyError
|
|
|
|
)
|
2016-11-26 23:00:16 +03:00
|
|
|
|
2017-08-19 07:36:41 +03:00
|
|
|
/** Return a boolean indicating whether something was actually deleted. */
|
|
|
|
bool hgDeleteDirstateTuple(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string relativePath,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
2017-05-26 21:51:30 +03:00
|
|
|
list<HgNonnormalFile> hgGetNonnormalFiles(
|
2016-11-26 23:00:16 +03:00
|
|
|
1: string mountPoint,
|
|
|
|
) throws (1: EdenError ex)
|
2016-12-10 12:06:37 +03:00
|
|
|
|
2017-05-26 21:51:30 +03:00
|
|
|
// If relativePathSource is the empty string, remove the entry in the map for
|
|
|
|
// relativePathDest.
|
|
|
|
void hgCopyMapPut(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string relativePathDest,
|
|
|
|
3: string relativePathSource,
|
|
|
|
)
|
|
|
|
|
|
|
|
string hgCopyMapGet(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string relativePathDest,
|
2017-09-11 20:37:13 +03:00
|
|
|
) throws (1: NoValueForKeyError noValueForKeyError)
|
2017-05-26 21:51:30 +03:00
|
|
|
|
|
|
|
/**
|
|
|
|
* In practice, this map should be fairly small.
|
|
|
|
*/
|
|
|
|
map<string, string> hgCopyMapGetAll(
|
|
|
|
1: string mountPoint,
|
|
|
|
)
|
|
|
|
|
2017-04-04 01:47:53 +03:00
|
|
|
//////// Debugging APIs ////////
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Get the contents of a source control Tree.
|
|
|
|
*
|
|
|
|
* This can be used to confirm if eden's LocalStore contains information
|
|
|
|
* for the tree, and that the information is correct.
|
|
|
|
*
|
|
|
|
* If localStoreOnly is true, the data is loaded directly from the
|
|
|
|
* LocalStore, and an error will be raised if it is not already present in
|
|
|
|
* the LocalStore. If localStoreOnly is false, the data may be retrieved
|
|
|
|
* from the BackingStore if it is not already present in the LocalStore.
|
|
|
|
*/
|
|
|
|
list<ScmTreeEntry> debugGetScmTree(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: BinaryHash id,
|
|
|
|
3: bool localStoreOnly,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Get the contents of a source control Blob.
|
|
|
|
*
|
|
|
|
* This can be used to confirm if eden's LocalStore contains information
|
|
|
|
* for the blob, and that the information is correct.
|
|
|
|
*/
|
|
|
|
binary debugGetScmBlob(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: BinaryHash id,
|
|
|
|
3: bool localStoreOnly,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Get the metadata about a source control Blob.
|
|
|
|
*
|
|
|
|
* This retrieves the metadata about a source control Blob. This returns
|
|
|
|
* the size and contents SHA1 of the blob, which eden stores separately from
|
|
|
|
* the blob itself. This can also be a useful alternative to
|
|
|
|
* debugGetScmBlob() when getting data about extremely large blobs.
|
|
|
|
*/
|
|
|
|
ScmBlobMetadata debugGetScmBlobMetadata(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: BinaryHash id,
|
|
|
|
3: bool localStoreOnly,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Get status about currently loaded inode objects.
|
|
|
|
*
|
|
|
|
* This returns details about all currently loaded inode objects under the
|
|
|
|
* given path.
|
|
|
|
*
|
|
|
|
* If the path argument is the empty string, data will be returned about all
|
|
|
|
* inodes in the entire mount point. Otherwise the path argument should
|
|
|
|
* refer to a subdirectory, and data will be returned for all inodes under
|
|
|
|
* the specified subdirectory.
|
|
|
|
*
|
|
|
|
* The rename lock is not held while gathering this information, so the path
|
|
|
|
* name information returned may not always be internally consistent. If
|
|
|
|
* renames were taking place while gathering the data, some inodes may show
|
|
|
|
* up under multiple parents. It's also possible that we may miss some
|
|
|
|
* inodes during the tree walk if they were renamed from a directory that was
|
|
|
|
* not yet walked into a directory that has already been walked.
|
|
|
|
*
|
|
|
|
* This API cannot return data about inodes that have been unlinked but still
|
|
|
|
* have outstanding references.
|
|
|
|
*/
|
|
|
|
list<TreeInodeDebugInfo> debugInodeStatus(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string path,
|
|
|
|
) throws (1: EdenError ex)
|
2017-06-17 02:08:19 +03:00
|
|
|
|
2017-08-17 05:56:32 +03:00
|
|
|
/**
|
|
|
|
* Get the InodePathDebugInfo for the inode that corresponds to the given
|
|
|
|
* inode number. This provides the path for the inode and also indicates
|
|
|
|
* whether the inode is currently loaded or not. Requires that the Eden
|
|
|
|
* mountPoint be specified.
|
|
|
|
*/
|
|
|
|
InodePathDebugInfo debugGetInodePath(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: i64 inodeNumber,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
2017-10-17 02:22:27 +03:00
|
|
|
/**
|
|
|
|
* Sets the log level for a given category at runtime.
|
|
|
|
*/
|
|
|
|
void debugSetLogLevel(
|
|
|
|
1: string category,
|
|
|
|
2: string level,
|
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
2017-06-17 02:08:19 +03:00
|
|
|
/**
|
2017-08-23 05:43:29 +03:00
|
|
|
* Unloads unused Inodes from a directory inside a mountPoint whose last
|
|
|
|
* access time is older than the specified age.
|
|
|
|
*
|
|
|
|
* The age parameter is a relative time to be subtracted from the current
|
|
|
|
* (wall clock) time.
|
2017-06-17 02:08:19 +03:00
|
|
|
*/
|
2017-08-23 05:43:31 +03:00
|
|
|
i64 unloadInodeForPath(
|
2017-06-17 02:08:19 +03:00
|
|
|
1: string mountPoint,
|
|
|
|
2: string path,
|
2017-08-23 05:43:29 +03:00
|
|
|
3: TimeSpec age,
|
2017-06-17 02:08:19 +03:00
|
|
|
) throws (1: EdenError ex)
|
|
|
|
|
2017-08-18 21:43:57 +03:00
|
|
|
/**
|
|
|
|
* Flush all thread-local stats to the main ServiceData object.
|
|
|
|
*
|
|
|
|
* Thread-local counters are normally flushed to the main ServiceData once
|
|
|
|
* a second. flushStatsNow() can be used to flush thread-local counters on
|
|
|
|
* demand, in addition to the normal once-a-second flush.
|
|
|
|
*
|
|
|
|
* This is mainly useful for unit and integration tests that want to ensure
|
|
|
|
* they see up-to-date counter information without waiting for the normal
|
|
|
|
* flush interval.
|
|
|
|
*/
|
|
|
|
void flushStatsNow() throws (1: EdenError ex)
|
2017-08-22 01:52:55 +03:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Invalidate kernel cache for inode.
|
|
|
|
*/
|
|
|
|
void invalidateKernelInodeCache(
|
|
|
|
1: string mountPoint,
|
|
|
|
2: string path
|
|
|
|
)
|
|
|
|
throws (1: EdenError ex)
|
|
|
|
|
2017-08-25 22:41:41 +03:00
|
|
|
/**
|
|
|
|
* Gets the number of inodes unloaded by the periodic job on an EdenMount.
|
|
|
|
*/
|
|
|
|
InternalStats getStatInfo() throws (1: EdenError ex)
|
2016-05-12 23:43:17 +03:00
|
|
|
}
|