mirror of
https://github.com/facebook/sapling.git
synced 2024-10-09 16:31:02 +03:00
7565ff7f59
Summary: Just running the linter :) Reviewed By: singhsrb Differential Revision: D26000274 fbshipit-source-id: 5d94abf11210fda5e956c408764aa0c348aa0d84
# Takeover

The takeover directory holds the logic for the Takeover Client (the new EdenFS
process) and Server (the old EdenFS process), which are used during a graceful
restart.

## Structure

There are 5 main components in the takeover directory: thrift serialization
library, client, server, data, and handler.

### Thrift serialization library

There are three main message classes that are exchanged over the takeover socket:

* `struct TakeoverVersionQuery` - A list of takeover data serialization versions
  that the client supports
* empty "ready" ping - An empty ping sent by the server to ensure the client is
  still alive and ready to receive takeover data
* `union SerializedTakeoverData` - A list of `SerializedMountInfo` or a string
  error.

The library also defines the following supporting structs:

* `struct SerializedMountInfo` - Contains the mount path, state directory, a
  list of bind mount paths (which is no longer used), connection information, and
  a `SerializedInodeMap`
* `struct SerializedInodeMap` - A list of `SerializedInodeMapEntry` values
  describing unloaded inodes
* `struct SerializedInodeMapEntry` - Contains inode information like
  inodeNumber, parentInode, name, isUnlinked, numFuseReferences, hash,
  and mode.
* `struct SerializedFileHandleMap` - currently empty

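To make the shape of these messages concrete, here is a minimal sketch that models them as plain C++ types rather than generated Thrift code. The struct names and the inode fields come from the list above; the C++ field types and the names `connectionInfo`, `unloadedInodes`, and `mountsOrThrow` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Plain C++ stand-ins for the Thrift structs. The field types are assumed.
struct SerializedInodeMapEntry {
  std::uint64_t inodeNumber = 0;
  std::uint64_t parentInode = 0;
  std::string name;
  bool isUnlinked = false;
  std::uint64_t numFuseReferences = 0;
  std::string hash;
  std::uint32_t mode = 0;
};

struct SerializedInodeMap {
  std::vector<SerializedInodeMapEntry> unloadedInodes; // field name assumed
};

struct SerializedMountInfo {
  std::string mountPath;
  std::string stateDirectory;
  std::vector<std::string> bindMountPaths; // no longer used
  std::string connectionInfo;              // opaque blob, field name assumed
  SerializedInodeMap inodeMap;
};

// The union holds either the list of mounts or an error string.
using SerializedTakeoverData =
    std::variant<std::vector<SerializedMountInfo>, std::string>;

// Hypothetical helper: returns the mounts, or throws with the error string.
const std::vector<SerializedMountInfo>& mountsOrThrow(
    const SerializedTakeoverData& data) {
  if (const auto* error = std::get_if<std::string>(&data)) {
    throw std::runtime_error("takeover failed: " + *error);
  }
  return std::get<std::vector<SerializedMountInfo>>(data);
}
```

Modelling the union as a `std::variant` forces a receiver to handle the error branch before it can touch any mount information.
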
### Client

The client has one function, `takeoverMounts`, which requests to take over
mount points from an existing edenfs process. On success it returns a
`TakeoverData` object; on error it throws an exception. It takes three
parameters: a `socketPath`, a bool `shouldPing`, and a set of integers listing
the supported takeover versions. The last two parameters are for testing
purposes and should not be used in production builds.

The client times out if it does not receive the takeover data from the old
process within 5 minutes.

We connect to the socket at the given path, then send our protocol version so
that the server knows whether we're capable of handshaking successfully. We
then wait for the server to send us a "ready" ping, which checks that we are
still listening on the socket. We respond to this ping and then wait for the
takeover data response. It is possible that we will not receive this ping and
instead receive the takeover data response directly.

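The message sequence above can be sketched with an in-memory channel standing in for the unix domain socket. Everything here is hypothetical scaffolding (`MessageKind`, `Message`, `FakeChannel`, `receiveTakeoverData`, and the comma-separated version payload); the real client exchanges Thrift-serialized messages. The point is only to show that the ping is optional from the client's perspective.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message model for the sketch.
enum class MessageKind { VersionQuery, ReadyPing, PingReply, TakeoverData };

struct Message {
  MessageKind kind;
  std::string payload;
};

// In-memory stand-in for the takeover socket.
struct FakeChannel {
  std::deque<Message> incoming; // messages the "server" will deliver
  std::vector<Message> sent;    // messages the client has written

  void send(Message msg) { sent.push_back(std::move(msg)); }

  Message receive() {
    if (incoming.empty()) {
      throw std::runtime_error("no message received from the server");
    }
    Message msg = std::move(incoming.front());
    incoming.pop_front();
    return msg;
  }
};

std::string receiveTakeoverData(
    FakeChannel& channel, const std::set<int>& supportedVersions) {
  // 1. Advertise the versions we can deserialize.
  std::string versions;
  for (int version : supportedVersions) {
    if (!versions.empty()) {
      versions += ',';
    }
    versions += std::to_string(version);
  }
  channel.send({MessageKind::VersionQuery, versions});

  // 2. The server may send a ready ping first; answer it and keep waiting.
  Message msg = channel.receive();
  if (msg.kind == MessageKind::ReadyPing) {
    channel.send({MessageKind::PingReply, ""});
    msg = channel.receive();
  }

  // 3. Whatever arrives now must be the takeover data response.
  if (msg.kind != MessageKind::TakeoverData) {
    throw std::runtime_error("unexpected message during takeover");
  }
  return msg.payload;
}
```

A server that skips the ping simply delivers the takeover data as the first message, and the same function handles that case without a separate code path.
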
If no takeover data response arrives, we throw an exception. Otherwise, we
deserialize the message and check its contents. We throw an exception if the
message is not the expected size (number of mount points + 2, for the lock file
and the thrift socket). If all is well, we save the lock file, thrift socket,
and all the mount points.

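The size check can be illustrated with a small sketch. It assumes the message carries its files as a flat list of descriptors ordered as lock file, thrift socket, then one entry per mount point; that ordering, and the names `ReceivedFiles` and `splitReceivedFiles`, are assumptions for illustration rather than something this README specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct ReceivedFiles {
  int lockFile;
  int thriftSocket;
  std::vector<int> mountFds;
};

// Validates the count (mount points + 2) and splits the list into its parts.
ReceivedFiles splitReceivedFiles(
    const std::vector<int>& files, std::size_t numMounts) {
  const std::size_t expected = numMounts + 2;
  if (files.size() != expected) {
    throw std::runtime_error(
        "received " + std::to_string(files.size()) + " files, expected " +
        std::to_string(expected));
  }
  // Assumed ordering: lock file, thrift socket, then the mount descriptors.
  ReceivedFiles result{files[0], files[1], {}};
  result.mountFds.assign(files.begin() + 2, files.end());
  return result;
}
```

Checking the count before indexing means a truncated or padded message is rejected up front instead of being misread as a shorter or longer list of mounts.
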
### Server

The server is a helper class that listens on a unix domain socket for clients
that wish to perform graceful takeover of this `EdenServer`'s mount points.
This class uses the `EdenServer`'s main `EventBase` for driving its I/O.

It has a few functions:

* public function:
  * `start` - This is called when the EdenFS daemon first starts. It begins
    listening on the takeover socket, waiting for a client to connect and
    request a graceful restart. When a client connects, the server verifies
    that the client process is from the same user ID, and that the client and
    server support a compatible takeover protocol version. If the versions are
    compatible, the server initiates shutdown by calling
    `server_->getTakeoverHandler()->startTakeoverShutdown()`. After the
    shutdown is completed, the takeover server pings the takeover client to
    ensure it is still waiting for the data. If the ping is unsuccessful
    (timeout, error, etc.), the takeover server stops the takeover process and
    returns the untransmitted `TakeoverData` in an exception in order to let
    the `EdenServer` recover itself and start serving again. Otherwise, it
    closes its storage (local and backing stores) and sends the takeover data
    over the takeover socket by serializing the information (version, lock
    file, thrift socket, mount file descriptor) or an error, and sending it.
* private functions:
  * `connectionAccepted` - callback function for allocating a connection
    handler when the server gets a client.
  * `acceptError` - callback function that simply logs an `accept()` error on
    the takeover socket.
  * `connectionDone` - callback function that is declared in the .h file but
    currently is not defined.

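Two steps of `start` lend themselves to a short sketch: the version compatibility check and the recovery path when the ping fails. The code below is a simplified model, not the server's implementation. It assumes that "compatible" means picking the highest version both sides support, and the names `pickVersion`, `TakeoverSendError`, and `finishTakeover` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Highest version supported by both sides, if any (selection rule assumed).
std::optional<int> pickVersion(
    const std::set<int>& server, const std::set<int>& client) {
  // std::set iterates in ascending order, so walk it backwards.
  for (auto it = server.rbegin(); it != server.rend(); ++it) {
    if (client.count(*it) > 0) {
      return *it;
    }
  }
  return std::nullopt;
}

struct TakeoverData {
  std::vector<std::string> mountPaths;
};

// Carries the untransmitted data back so the old process can resume serving.
struct TakeoverSendError : std::runtime_error {
  TakeoverSendError(const std::string& what, TakeoverData unsent)
      : std::runtime_error(what), data(std::move(unsent)) {}
  TakeoverData data;
};

// Runs the steps that follow a completed shutdown: ping, close storage, send.
void finishTakeover(
    TakeoverData data,
    const std::function<bool()>& pingClient,
    const std::function<void()>& closeStorage,
    const std::function<void(const TakeoverData&)>& send) {
  if (!pingClient()) {
    // Storage is still open here, so the caller can recover from `data`.
    throw TakeoverSendError(
        "client did not answer the ready ping", std::move(data));
  }
  closeStorage();
  send(data);
}
```

Pinging before closing storage is what keeps recovery possible: if the client has gone away, the old process still holds everything it needs to start serving again.
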
### Data

The data component (`TakeoverData`) holds the set of versions supported by this
build. It also holds the lock file, the server socket, the mount points, and a
takeover complete promise that will be fulfilled by the `TakeoverServer` code
once the `TakeoverData` has been sent to the remote process. It has functions
to serialize and deserialize the `TakeoverData`.

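As a rough illustration of version-tagged serialization, the sketch below round-trips a reduced `TakeoverData` through a toy text format. The format (a version line followed by one mount path per line), the version numbers in `kSupportedVersions`, and the function signatures are all invented for this example; the real code serializes with Thrift and also carries file descriptors.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical version numbers; this README does not list the real ones.
const std::set<int> kSupportedVersions{1, 2};

struct TakeoverData {
  std::vector<std::string> mountPaths;
};

// Toy wire format: the first line is the version, then one path per line.
// Paths containing newlines are not handled by this sketch.
std::string serialize(const TakeoverData& data, int version) {
  if (kSupportedVersions.count(version) == 0) {
    throw std::invalid_argument("unsupported takeover version");
  }
  std::ostringstream out;
  out << version << '\n';
  for (const auto& path : data.mountPaths) {
    out << path << '\n';
  }
  return out.str();
}

TakeoverData deserialize(const std::string& bytes) {
  std::istringstream in(bytes);
  std::string line;
  if (!std::getline(in, line)) {
    throw std::runtime_error("empty takeover message");
  }
  const int version = std::stoi(line);
  if (kSupportedVersions.count(version) == 0) {
    throw std::runtime_error("unsupported takeover version " + line);
  }
  TakeoverData data;
  while (std::getline(in, line)) {
    data.mountPaths.push_back(line);
  }
  return data;
}
```

Putting the version first lets the receiver reject a message it cannot understand before it tries to interpret the rest of the payload.
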
### Handler

`TakeoverHandler` is a pure virtual interface for classes that want to implement
graceful takeover functionality. This is primarily implemented by the
`EdenServer` class. However, there are also alternative implementations used
for unit testing.

It has two pure virtual functions: `startTakeoverShutdown()` and `closeStorage()`.

`startTakeoverShutdown()` will be called when a graceful shutdown has been
requested, with a remote process attempting to take over the currently running
mount points.

When implemented, this should return a Future that will produce the
`TakeoverData` to send to the remote edenfs process once the edenfs process is
ready to transfer its mounts.

`closeStorage()` will be called before sending the `TakeoverData` to the client,
provided the ready handshake (if applicable) succeeded. This function should
close the storage used by the server. In the case of an `EdenServer`, this
allows locks to be released so that the new process can take over this storage.

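The interface described in this section might look roughly like the sketch below. To stay within the standard library and avoid threads, it uses a deferred callable as a stand-in for a real future type, which is a deliberate simplification. `RecordingHandler` and `prepareTakeover` are hypothetical names, with the former playing the role of a unit-test implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct TakeoverData {
  std::vector<std::string> mountPaths;
};

// Stand-in for a real future type: a deferred computation yielding T when
// invoked. Chosen only so the sketch needs nothing beyond the standard library.
template <typename T>
using Future = std::function<T()>;

class TakeoverHandler {
 public:
  virtual ~TakeoverHandler() = default;
  virtual Future<TakeoverData> startTakeoverShutdown() = 0;
  virtual void closeStorage() = 0;
};

// Test double that records the order in which it is called.
class RecordingHandler : public TakeoverHandler {
 public:
  Future<TakeoverData> startTakeoverShutdown() override {
    calls.push_back("startTakeoverShutdown");
    return [] {
      TakeoverData data;
      data.mountPaths.push_back("/mnt/repo");
      return data;
    };
  }

  void closeStorage() override { calls.push_back("closeStorage"); }

  std::vector<std::string> calls;
};

// Drives a handler: shut down first, close storage only after a good handshake.
TakeoverData prepareTakeover(TakeoverHandler& handler, bool readyHandshakeOk) {
  TakeoverData data = handler.startTakeoverShutdown()();
  if (readyHandshakeOk) {
    handler.closeStorage();
  }
  return data;
}
```

Because the caller depends only on the abstract interface, a test can substitute a recording handler and assert on the call order without starting a real server.
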