getFileData: don't crash if length is too long

Summary: Currently, EdenFS crashes if ProjFS requests to read more bytes than we have available for a file. I am able to repro this, by blocking the read so that it starts before an `hg checkout` that shortens file contents and completes after. I am making eden return an error when we don't have enough data available. Unfortunately, it seems there is no error code that indicates to projfs it's view is stale and make it flush it's caches and retry. All error codes seem to be propagated to the user. These ones are the most promising to make PrjFS do something we want: https://learn.microsoft.com/en-us/windows/win32/projfs/provider-overview#callback-return-codes But all those still result in errors to the user. I tried a few more common HRESULT return codes: https://learn.microsoft.com/en-us/windows/win32/seccrypto/common-hresult-values But all errors still get propagated to the user. Invalid argument seems to be the most accurate reflection of the issue, so I'm going with that for now. It would be better if we could just return the amount of data that we have, and not have to error. However, PrjFS will still think the file is the longer length and will add null bytes. This will leave the file in an inconsistent state and leave `hg status` dirty. This seems worse than erroring because it causes EdenFS to corrupt the file. Ideally we would fix this by preventing the length from getting out of sync from the contents we should read the data from. As far as I see there are two options, but they are both more involved: - Failing the checkout when we try to invalidate the contents and it fails. This would be a bit disruptive to the user as background reads can interfere with a checkout. but it matches the what a native filesystem would do. - Keeping the version of the file to read in the placeholder so the version and length are never out of sync. The file contents would then be the old contents. This would leave the file modified and make status dirty, which is still not a great user experience, but at least the file would be in a consistent state. I've reached out to ProjFS about other solutions -- maybe there is or could be a particular error that eden can send that will make ProjFS retry and gracefully deal. For now error is better than crash Reviewed By: mshroyer, MichaelCuevas Differential Revision: D50193962 fbshipit-source-id: 6ca05a4195c29996a0cd255bfd461a98f7dd5fa6
2024-10-05 06:18:07 +03:00 · 2024-02-22 19:34:50 -08:00 · 2024-02-22 19:34:50 -08:00 · a7bb705027
commit a7bb705027
parent 5b68b731ef
9 changed files with 211 additions and 3 deletions
--- a/eden/fs/config/EdenConfig.h
+++ b/eden/fs/config/EdenConfig.h
@ -641,6 +641,16 @@ class EdenConfig : private ConfigSettingManager {
      true,
      this};

+  /**
+   * Controls how frequently we log to the EdenFS log file and scuba tables
+   * about torn reads - i.e. when Prjfs attempts to read a file that was
+   * modified in the middle of an operation.
+   */
+  ConfigSetting<std::chrono::nanoseconds> prjfsTornReadLogInterval{
+      "prjfs:torn-read-log-interval",
+      std::chrono::seconds{10},
+      this};
+
  // [fschannel]

  /**
--- a/eden/fs/inodes/EdenMount.cpp
+++ b/eden/fs/inodes/EdenMount.cpp
@ -2066,6 +2066,7 @@ folly::Future<folly::Unit> EdenMount::fsChannelMount(bool readOnly) {
                         EdenDispatcherFactory::makePrjfsDispatcher(this),
                         serverState_->getReloadableConfig(),
                         &getStraceLogger(),
+                         serverState_->getStructuredLogger(),
                         serverState_->getProcessInfoCache(),
                         getCheckoutConfig()->getRepoGuid(),
                         getCheckoutConfig()->getEnableWindowsSymlinks(),
--- a/eden/fs/inodes/test/CheckoutTest.cpp
+++ b/eden/fs/inodes/test/CheckoutTest.cpp
@ -1894,6 +1894,7 @@ class FakePrjfsChannel final : public PrjfsChannel {
            EdenDispatcherFactory::makePrjfsDispatcher(mount.get()),
            mount->getServerState()->getReloadableConfig(),
            &mount->getStraceLogger(),
+            mount->getServerState()->getStructuredLogger(),
            mount->getServerState()->getProcessInfoCache(),
            mount->getCheckoutConfig()->getRepoGuid(),
            mount->getCheckoutConfig()->getEnableWindowsSymlinks(),
--- a/eden/fs/prjfs/PrjfsChannel.cpp
+++ b/eden/fs/prjfs/PrjfsChannel.cpp
@ -18,6 +18,8 @@
 #include "eden/fs/prjfs/PrjfsDispatcher.h"
 #include "eden/fs/prjfs/PrjfsRequestContext.h"
 #include "eden/fs/telemetry/EdenStats.h"
+#include "eden/fs/telemetry/LogEvent.h"
+#include "eden/fs/telemetry/StructuredLogger.h"
 #include "eden/fs/utils/Bug.h"
 #include "eden/fs/utils/Guid.h"
 #include "eden/fs/utils/ImmediateFuture.h"
@ -428,14 +430,21 @@ void detachAndCompleteCallback(
 PrjfsChannelInner::PrjfsChannelInner(
    std::unique_ptr<PrjfsDispatcher> dispatcher,
    const folly::Logger* straceLogger,
+    const std::shared_ptr<StructuredLogger>& structuredLogger,
    ProcessAccessLog& processAccessLog,
+    std::shared_ptr<ReloadableConfig>& config,
    folly::Promise<folly::Unit> deletedPromise,
    std::shared_ptr<Notifier> notifier,
    size_t prjfsTraceBusCapacity)
    : dispatcher_(std::move(dispatcher)),
      straceLogger_(straceLogger),
+      structuredLogger_(structuredLogger),
+      lastTornReadLog_(
+          std::make_shared<
+              folly::Synchronized<std::chrono::steady_clock::time_point>>()),
      notifier_(std::move(notifier)),
      processAccessLog_(processAccessLog),
+      config_(config),
      deletedPromise_(std::move(deletedPromise)),
      traceDetailedArguments_(std::atomic<size_t>(0)),
      traceBus_(TraceBus<PrjfsTraceEvent>::create(
@ -857,6 +866,8 @@ HRESULT PrjfsChannelInner::getFileData(
       path = RelativePath(callbackData->FilePathName),
       virtualizationContext = callbackData->NamespaceVirtualizationContext,
       dataStreamId = Guid(callbackData->DataStreamId),
+       clientProcessName =
+           std::wstring{callbackData->TriggeringProcessImageFileName},
       byteOffset,
       length]() mutable {
        auto requestWatch =
@ -872,13 +883,71 @@ HRESULT PrjfsChannelInner::getFileData(
            path,
            byteOffset,
            length);
-        return dispatcher_
-            ->read(std::move(path), context->getObjectFetchContext())
+        return dispatcher_->read(path, context->getObjectFetchContext())
            .thenValue([context = std::move(context),
                        virtualizationContext = virtualizationContext,
                        dataStreamId = std::move(dataStreamId),
                        byteOffset = byteOffset,
-                        length = length](const std::string content) {
+                        length = length,
+                        path,
+                        structuredLogger = structuredLogger_,
+                        clientProcessName = std::move(clientProcessName),
+                        lastTornReadLog = lastTornReadLog_,
+                        config = config_,
+                        mountChannel =
+                            getMountChannel()](const std::string content) {
+              if ((content.length() - byteOffset) < length) {
+                auto now = std::chrono::steady_clock::now();
+
+                // These most likely come fround background tooling reads, so
+                // it's likely that there will be many at once. During one
+                // checkout operation we might see a bunch of torn reads all at
+                // once. We only log one per 10s to avoid spamming logs/scuba.
+                bool shouldLog = false;
+                {
+                  auto last = lastTornReadLog->wlock();
+                  if (now >= *last +
+                          config->getEdenConfig()
+                              ->prjfsTornReadLogInterval.getValue()) {
+                    shouldLog = true;
+                    *last = now;
+                  }
+                }
+                if (shouldLog) {
+                  auto client = wideToMultibyteString<std::string>(
+                      basenameFromAppName(clientProcessName.c_str()));
+                  XLOGF(
+                      DBG2,
+                      "PrjFS asked us to read {} bytes out of {}, but there are only"
+                      "{} bytes available in this file. Reading the file likely raced"
+                      "with checkout/reset. Client process: {}. ",
+                      length,
+                      path,
+                      content.length(),
+                      client);
+                  structuredLogger->logEvent(
+                      PrjFSCheckoutReadRace{std::move(client)});
+                }
+
+                // This error currently gets propagated to the user. Ideally, we
+                // don't want to fail this read. However, if the requested
+                // length is larger than the actual size of the file and we give
+                // PrjFS less data than it expects, PrjFs/Windows going to add 0
+                // bytes to the end of the file. The file will then be corrupted
+                // and out of sync. The only way we can prevent the file from
+                // being out of sync and corrupted is to error in this case.
+                context->sendError(HRESULT_FROM_WIN32(ERROR_INVALID_PARAMETER));
+                return;
+              }
+              // Note it's possible that PrjFs could be out of sync with
+              // EdenFS in the opposite way from above (PrjFS thinks the file
+              // length is shorter than EdenFS). That is still going to result
+              // in a corrupt file (truncated). That case is indistinguishable
+              // from ProjFs just requesting a portion of the file, so we
+              // can't raise an error here. We need to prevent that corruption
+              // elsewhere - failing the checkout that de-syncs Eden and
+              // ProjFs or storing the version of the file in the placeholder.
+
              //
              // We should return file data which is smaller than
              // our kMaxChunkSize and meets the memory alignment
@ -1319,6 +1388,7 @@ PrjfsChannel::PrjfsChannel(
    std::unique_ptr<PrjfsDispatcher> dispatcher,
    std::shared_ptr<ReloadableConfig> config,
    const folly::Logger* straceLogger,
+    const std::shared_ptr<StructuredLogger>& structuredLogger,
    std::shared_ptr<ProcessInfoCache> processInfoCache,
    Guid guid,
    bool enableWindowsSymlinks,
@ -1334,7 +1404,9 @@ PrjfsChannel::PrjfsChannel(
  inner_.store(std::make_shared<PrjfsChannelInner>(
      std::move(dispatcher),
      straceLogger,
+      structuredLogger,
      processAccessLog_,
+      config_,
      std::move(innerDeletedPromise),
      std::move(notifier),
      config_->getEdenConfig()->PrjfsTraceBusCapacity.getValue()));
--- a/eden/fs/prjfs/PrjfsChannel.h
+++ b/eden/fs/prjfs/PrjfsChannel.h
@ -36,6 +36,7 @@ class Notifier;
 class ReloadableConfig;
 class PrjfsChannelInner;
 class PrjfsRequestContext;
+class StructuredLogger;

 using EdenStatsPtr = RefPtr<EdenStats>;

@ -164,7 +165,9 @@ class PrjfsChannelInner {
  PrjfsChannelInner(
      std::unique_ptr<PrjfsDispatcher> dispatcher,
      const folly::Logger* straceLogger,
+      const std::shared_ptr<StructuredLogger>& structuredLogger,
      ProcessAccessLog& processAccessLog,
+      std::shared_ptr<ReloadableConfig>& config,
      folly::Promise<folly::Unit> deletedPromise,
      std::shared_ptr<Notifier> notifier,
      size_t prjfsTraceBusCapacity);
@ -451,12 +454,22 @@ class PrjfsChannelInner {
  std::unique_ptr<PrjfsDispatcher> dispatcher_;
  const folly::Logger* const straceLogger_{nullptr};

+  // scuba logger
+  const std::shared_ptr<StructuredLogger> structuredLogger_;
+
+  // The last time we logged a about a torn read the interval at which
+  // we log is set by EdenConfig::tornReadLogInterval.
+  std::shared_ptr<folly::Synchronized<std::chrono::steady_clock::time_point>>
+      lastTornReadLog_;
+
  std::shared_ptr<Notifier> notifier_;

  // The processAccessLog_ is owned by PrjfsChannel which is guaranteed to have
  // its lifetime be longer than that of PrjfsChannelInner.
  ProcessAccessLog& processAccessLog_;

+  std::shared_ptr<ReloadableConfig> config_;
+
  // Set of currently active directory enumerations.
  folly::Synchronized<folly::F14FastMap<Guid, std::shared_ptr<Enumerator>>>
      enumSessions_;
@ -490,6 +503,7 @@ class PrjfsChannel : public FsChannel {
      std::unique_ptr<PrjfsDispatcher> dispatcher,
      std::shared_ptr<ReloadableConfig> config,
      const folly::Logger* straceLogger,
+      const std::shared_ptr<StructuredLogger>& structuredLogger,
      std::shared_ptr<ProcessInfoCache> processInfoCache,
      Guid guid,
      bool enableWindowsSymlinks,
--- a/eden/fs/prjfs/TARGETS
+++ b/eden/fs/prjfs/TARGETS
@ -25,6 +25,8 @@ cpp_library(
        "//eden/common/utils:win_error",
        "//eden/fs/config:config",
        "//eden/fs/notifications:notifier",
+        "//eden/fs/telemetry:log_info",
+        "//eden/fs/telemetry:structured_logger",
        "//eden/fs/utils:eden_error",
        "//eden/fs/utils:static_assert",
        "//folly/executors:global_executor",
--- a/eden/fs/telemetry/LogEvent.h
+++ b/eden/fs/telemetry/LogEvent.h
@ -343,6 +343,16 @@ struct PrjFSFileNotificationFailure {
  }
 };

+struct PrjFSCheckoutReadRace {
+  static constexpr const char* type = "prjfs_checkout_read_race";
+
+  std::string client_cmdline;
+
+  void populate(DynamicEvent& event) const {
+    event.addString("client_cmdline", client_cmdline);
+  }
+};
+
 struct WorkingCopyGc {
  static constexpr const char* type = "working_copy_gc";

--- a/eden/integration/hg/update_test.py
+++ b/eden/integration/hg/update_test.py
@ -13,6 +13,7 @@ from contextlib import contextmanager
 from enum import Enum
 from multiprocessing import Process
 from textwrap import dedent
+from threading import Thread
 from typing import Dict, Generator, List, Optional, Set

 from eden.fs.cli import util
@ -1223,3 +1224,97 @@ class UpdateCacheInvalidationTest(EdenHgTestCase):
            self.assertEqual(self.read_file("dir/file3"), "three")
            with self.assertRaises(hgrepo.HgError):
                self.repo.status()
+
+
+@hg_test
+# pyre-ignore[13]: T62487924
+class PrjFSStressTornReads(EdenHgTestCase):
+    long_file_commit: str = ""
+    short_file_commit: str = ""
+
+    enable_fault_injection: bool = True
+
+    def populate_backing_repo(self, repo: hgrepo.HgRepository) -> None:
+        repo.write_file("file", "1234567890\n")
+        self.long_file_commit = repo.commit("Initial commit.")
+        repo.write_file("file", "54321\n")
+        self.short_file_commit = repo.commit("Shorter file commit.")
+
+    def edenfs_logging_settings(self) -> Dict[str, str]:
+        return {"eden.strace": "DBG7"}
+
+    def test_torn_read_long_to_short(self) -> None:
+        self.repo.update(self.long_file_commit)
+        rel_path = "file"
+        path = self.mount_path / rel_path
+
+        read_exception: Optional[OSError] = None
+
+        def read_file() -> None:
+            nonlocal read_exception
+            with self.run_with_blocking_fault(
+                keyClass="PrjfsDispatcherImpl::read",
+                keyValueRegex="file",
+            ):
+                try:
+                    with path.open("rb") as f:
+                        f.read()
+                except Exception as err:
+                    read_exception = err
+
+        read_thread = Thread(target=read_file)
+        read_thread.start()
+
+        try:
+            self.repo.update(self.short_file_commit)
+        except Exception:
+            pass
+
+        self.remove_fault(keyClass="PrjfsDispatcherImpl::read", keyValueRegex="file")
+        self.wait_on_fault_unblock(
+            keyClass="PrjfsDispatcherImpl::read", keyValueRegex="file"
+        )
+        read_thread.join()
+        self.assertIsNotNone(read_exception)
+        # ugh pyre I just asserted it's non none :| I want this to error
+        # anyways if the error does not have errno set.
+        # pyre-ignore[16]: `None` has no attribute `errno`.
+        self.assertEqual(read_exception.errno, 22)  # invalid argument error
+
+    def test_torn_read_short_to_long(self) -> None:
+        self.repo.update(self.short_file_commit)
+
+        rel_path = "file"
+        path = self.mount_path / rel_path
+
+        read_contents = None
+
+        def read_file() -> None:
+            nonlocal read_contents
+            with self.run_with_blocking_fault(
+                keyClass="PrjfsDispatcherImpl::read",
+                keyValueRegex="file",
+            ):
+                with path.open("rb") as f:
+                    read_contents = f.read()
+
+        read_thread = Thread(target=read_file)
+        read_thread.start()
+
+        try:
+            self.repo.update(self.long_file_commit)
+        except Exception:
+            pass
+
+        self.remove_fault(keyClass="PrjfsDispatcherImpl::read", keyValueRegex="file")
+        self.wait_on_fault_unblock(
+            keyClass="PrjfsDispatcherImpl::read", keyValueRegex="file"
+        )
+        read_thread.join()
+        self.assertIsNotNone(read_contents)
+        # This is not correct behavior, we want the contents to be either
+        # the contents from the first or second commit, not this inconsitent
+        # mashup. This test is for not documenting the behavior of torn reads.
+        # This case requires a larger fix.
+        # TODO(kmancini): fix torn reads.
+        self.assertEqual(read_contents, b"123456")
--- a/eden/integration/lib/skip.py
+++ b/eden/integration/lib/skip.py
@ -305,6 +305,8 @@ elif sys.platform.startswith("darwin"):


 # Windows specific tests
+# we keep them in the build graph on linux so pyre will run and disable them
+# here
 if sys.platform != "win32":
    TEST_DISABLED.update(
        {
@ -319,6 +321,7 @@ if sys.platform != "win32":
            "projfs_enumeration.ProjFSEnumerationInsufficientBuffer": True,
            "prjfs_match_fs.PrjfsMatchFsTest": True,
            "hg.symlink_test.SymlinkWindowsDisabledTest": True,
+            "hg.update_test.PrjFSStressTornReads": True,
        }
    )