always call fdatasync() when writing overlay data for the root inode

Summary:
The root inode is particularly important, so always call `fdatasync()` on its
changes in the overlay before committing them.

This will hopefully reduce the number of cases we see where users have empty
or corrupt data for the root inode after hard rebooting their server.  My
guess is that the root directory is being modified by hg or other tools
creating and removing temporary files there.  If such a change is in progress
when a hard reboot occurs, we risk data loss without the `fdatasync()` call.

While Eden can still mostly serve the checkout data if other files or
directories are corrupt or missing in the overlay, it is currently completely
unable to mount the checkout if the root overlay data is corrupt.  Therefore
it seems worth being more cautious about making sure that the root overlay
data is updated atomically.

Reviewed By: chadaustin, wez

Differential Revision: D9275852

fbshipit-source-id: b1e3eeb94ba670d0e2b52da4af7143d3ddbc919b
Adam Simpkins 2018-08-10 14:46:19 -07:00 committed by Facebook Github Bot
parent a89a3db094
commit ee03bf2f70

@@ -765,10 +765,30 @@ folly::File Overlay::createOverlayFileImpl(
       " in ",
       localDir_);
-  // Eden used to call fdatasync() here because technically that's required to
-  // reliably, atomically write a file. But, per docs/InodeStorage.md, Eden
-  // does not claim to handle disk, kernel, or power failure, and fdatasync has
-  // a nearly 300 microsecond cost.
+  // fdatasync() is required to ensure that we are really reliably and
+  // atomically writing out the new file. Without calling fdatasync() the file
+  // contents may not be flushed to disk even though the rename has been
+  // written.
+  //
+  // However, fdatasync() has a significant performance overhead. We've
+  // measured it at a nearly 300 microsecond cost, which can significantly
+  // impact performance of source control update operations when many inodes are
+  // affected.
+  //
+  // Per docs/InodeStorage.md, Eden does not claim to handle disk, kernel, or
+  // power failure, so we do not call fdatasync() in the common case. However,
+  // the root inode is particularly important; if its data is corrupt Eden will
+  // not be able to remount the checkout. Therefore we always call fdatasync()
+  // when writing out the root inode.
+  if (inodeNumber == kRootNodeId) {
+    auto syncReturnCode = folly::fdatasyncNoInt(tmpFD);
+    folly::checkUnixError(
+        syncReturnCode,
+        "error flushing data to overlay file for inode ",
+        inodeNumber,
+        " in ",
+        localDir_);
+  }
   auto returnCode =
       renameat(dirFile_.fd(), tmpPath.data(), dirFile_.fd(), path.c_str());
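
For readers unfamiliar with the pattern, the write-to-temporary-file, optionally `fdatasync()`, then `renameat()` sequence that this hunk touches can be sketched in isolation roughly as follows. This is a minimal illustration using plain POSIX calls on Linux, not Eden's actual code: `writeFileAtomically`, its `syncBeforeRename` flag, and the `.tmp` suffix are hypothetical names chosen for the example. In the real change, the sync decision is made by comparing `inodeNumber` against `kRootNodeId`.

```cpp
#include <fcntl.h>
#include <stdio.h>   // renameat()
#include <unistd.h>  // write(), fdatasync(), close(), unlinkat()

#include <cerrno>
#include <string>
#include <system_error>

// Hypothetical helper for illustration only; not part of Eden's Overlay API.
//
// Writes `data` to the file `name` inside the directory referred to by
// `dirFd`. The data first goes to a temporary file, which is then renamed
// over the destination so that readers see either the old or the new file.
// If `syncBeforeRename` is true, fdatasync() is called before the rename so
// the contents reach stable storage before the new name becomes visible.
void writeFileAtomically(
    int dirFd,
    const std::string& name,
    const std::string& data,
    bool syncBeforeRename) {
  const std::string tmpName = name + ".tmp";  // assumed naming scheme
  int fd = openat(
      dirFd, tmpName.c_str(), O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0600);
  if (fd < 0) {
    throw std::system_error(errno, std::generic_category(), "openat");
  }

  // On any failure: remember errno, clean up the temporary file, and throw.
  auto fail = [&](const char* what) {
    int err = errno;
    close(fd);
    unlinkat(dirFd, tmpName.c_str(), 0);
    throw std::system_error(err, std::generic_category(), what);
  };

  // Write the full buffer, retrying on short writes and EINTR.
  size_t written = 0;
  while (written < data.size()) {
    ssize_t n = write(fd, data.data() + written, data.size() - written);
    if (n < 0) {
      if (errno == EINTR) {
        continue;
      }
      fail("write");
    }
    written += static_cast<size_t>(n);
  }

  if (syncBeforeRename) {
    // Retry on EINTR, similar in spirit to folly::fdatasyncNoInt().
    int rc;
    do {
      rc = fdatasync(fd);
    } while (rc == -1 && errno == EINTR);
    if (rc == -1) {
      fail("fdatasync");
    }
  }

  if (renameat(dirFd, tmpName.c_str(), dirFd, name.c_str()) == -1) {
    fail("renameat");
  }
  close(fd);
}
```

A caller mirroring the policy in this commit would pass `true` only for the root inode's file and `false` for everything else, accepting the cost of the sync only where a corrupt file would prevent the checkout from mounting.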