Improve hist benchmarks driver and add to CI (#770)

* Remove hardcoded --stack-yaml and upstream/master assumption
* support Cabal in bench suite
* add benchmark run to CI
  Even if the time measurements are unreliable in a shared CI environment, the memory usage will be an accurate indicator of space leaks
* Update bench/README
* use origin/master
* default to stack in benchmarks (for CI)
* ignore ghcide-bench and ghcide-preprocessor binaries too
* Review feedbacks
* Add the v0.3.0 tag in bench/hist.yaml commented out to keep the CI time as tight as possible
* Add .artifactignore file to avoid publishing binaries in azure bench pipeline
* use default stack.yaml
2024-09-11 13:57:06 +03:00 · 2020-09-06 21:54:45 +01:00 · 2020-09-06 21:54:45 +01:00 · 0d7cae9846
commit 0d7cae9846
parent ed95e69965
8 changed files with 123 additions and 32 deletions
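
Reviewer note: this commit selects the build tool through a new `buildTool` setting in `bench/hist.yaml`, and the driver turns that value into a `stack` or `cabal` command line. The sketch below illustrates that idea with only `base`. It is not the code in this commit: `parseBuildSystem` is a hypothetical helper that reports an unknown tool as a `Left` value rather than calling `error`, while `buildGhcide` copies the command strings from the `bench/hist/Main.hs` diff.

```haskell
import Data.Char (toLower)

-- | Build tools the benchmark driver can delegate to (same constructors as the diff).
data BuildSystem = Cabal | Stack
  deriving (Eq, Show)

-- | Hypothetical helper, not part of the commit: a total, case-insensitive
-- parser for the @buildTool@ value that returns an error message
-- instead of throwing.
parseBuildSystem :: String -> Either String BuildSystem
parseBuildSystem raw = case map toLower raw of
  "stack" -> Right Stack
  "cabal" -> Right Cabal
  other   -> Left ("Unknown build system: " <> other)

-- | Command line that builds ghcide and copies the binary into @out@.
-- The strings are taken from the diff of bench/hist/Main.hs.
buildGhcide :: BuildSystem -> FilePath -> String
buildGhcide Cabal out = unwords
  [ "cabal install"
  , "exe:ghcide"
  , "--installdir=" ++ out
  , "--install-method=copy"
  , "--overwrite-policy=always"
  , "--ghc-options -rtsopts"
  ]
buildGhcide Stack out =
  "stack --local-bin-path=" <> out
    <> " build ghcide:ghcide --copy-bins --ghc-options -rtsopts"

main :: IO ()
main = do
  -- Show the parse result for a valid and an invalid setting.
  print (parseBuildSystem "Stack")
  print (parseBuildSystem "bazel")
  -- Print the command for a valid setting, or the error message otherwise.
  putStrLn (either id (`buildGhcide` "bench-hist/HEAD") (parseBuildSystem "cabal"))
```

Returning `Either` keeps a misspelled `buildTool` value from crashing the driver in the middle of YAML decoding. In an aeson `FromJSON` instance the same effect could be had by calling `fail` inside the parser.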
--- a/.azure/linux-bench.yml
+++ b/.azure/linux-bench.yml
@@ -0,0 +1,48 @@
+jobs:
+- job: ghcide_bench_linux
+  timeoutInMinutes: 60
+  pool:
+    vmImage: 'ubuntu-latest'
+  strategy:
+    matrix:
+      stack:
+        STACK_YAML: "stack.yaml"
+  steps:
+  - checkout: self
+  - task: Cache@2
+    inputs:
+      key: stack-cache-v2 | $(Agent.OS) | $(Build.SourcesDirectory)/$(STACK_YAML) | $(Build.SourcesDirectory)/ghcide.cabal
+      path: .azure-cache
+      cacheHitVar: CACHE_RESTORED
+    displayName: "Cache stack artifacts"
+  - bash: |
+      mkdir -p ~/.stack
+      tar xzf .azure-cache/stack-root.tar.gz -C $HOME
+    displayName: "Unpack cache"
+    condition: eq(variables.CACHE_RESTORED, 'true')
+  - bash: |
+      sudo add-apt-repository ppa:hvr/ghc
+      sudo apt-get update
+      sudo apt-get install -y g++ gcc libc6-dev libffi-dev libgmp-dev zlib1g-dev
+      if ! which stack >/dev/null 2>&1; then
+         curl -sSL https://get.haskellstack.org/ | sh
+      fi
+    displayName: 'Install Stack'
+  - bash: stack setup --stack-yaml=$STACK_YAML
+    displayName: 'stack setup'
+  - bash: stack build --bench --only-dependencies --stack-yaml=$STACK_YAML
+    displayName: 'stack build --only-dependencies'
+  - bash: |
+      export PATH=/opt/cabal/bin:$PATH
+      stack bench --ghc-options=-Werror  --stack-yaml=$STACK_YAML
+    displayName: 'stack bench --ghc-options=-Werror'
+  - bash: |
+      mkdir -p .azure-cache
+      tar czf .azure-cache/stack-root.tar.gz -C $HOME .stack
+    displayName: "Pack cache"
+  - bash: |
+      cat bench-hist/results.csv
+    displayName: "cat results"
+  - publish: bench-hist
+    artifact: benchmarks
+    displayName: "publish"
--- a/.gitignore
+++ b/.gitignore
@@ -12,5 +12,7 @@ bench-hist/
 bench-temp/
 .shake/
 ghcide
+ghcide-bench
+ghcide-preprocessor
 *.benchmark-gcStats
 tags
--- a/README.md
+++ b/README.md
@@ -329,15 +329,17 @@ This writes a log file called `.tasty-rerun-log` of the failures, and only runs
 See the [tasty-rerun](https://hackage.haskell.org/package/tasty-rerun-1.1.17/docs/Test-Tasty-Ingredients-Rerun.html) documentation for other options.

 If you are touching performance sensitive code, take the time to run a differential
-benchmark between HEAD and upstream using the benchHist script. The configuration in
-`bench/hist.yaml` is setup to do this by default assuming upstream is
-`origin/master`. Run the benchmarks with `stack`:
+benchmark between HEAD and master using the benchHist script. This assumes that
+"master" points to the upstream master.
+
+Run the benchmarks with `stack`:

    export STACK_YAML=...
    stack bench

-It should take around 15 minutes and the results will be stored in the `bench-hist` folder.
-To interpret the results, see the comments in the `bench/hist/Main.hs` module.
+It should take around 15 minutes and the results will be stored in the `bench-hist` folder. To interpret the results, see the comments in the `bench/hist/Main.hs` module.
+
+More details in [bench/README](bench/README.md)

 ### Building the extension

--- a/azure-pipelines.yml
+++ b/azure-pipelines.yml
@@ -16,3 +16,4 @@ pr:
 jobs:
  - template: ./.azure/linux-stack.yml
  - template: ./.azure/windows-stack.yml
+  - template: ./.azure/linux-bench.yml
--- a/bench-hist/.artifactignore
+++ b/bench-hist/.artifactignore
@@ -0,0 +1,4 @@
+ghcide
+ghcide-bench
+ghcide-preprocessor
+*.benchmark-gcStats
--- a/bench/README.md
+++ b/bench/README.md
@@ -4,11 +4,12 @@
 This folder contains two Haskell programs that work together to simplify the
 performance analysis of ghcide:

-- `exe/Main.hs` - a standalone benchmark suite. Run with `stack bench`
+- `exe/Main.hs` - a standalone benchmark runner. Run with `stack run ghcide-bench`
 - `hist/Main.hs` - a Shake script for running the benchmark suite over a set of commits.
-  - Run with `stack exec benchHist`,
-  - Requires a `ghcide-bench` binary in the PATH,
-  - Calls `stack` internally to build the project,
-  - Driven by the `hist.yaml` configuration file. By default it compares HEAD with upstream
+  - Run with `stack bench` or `cabal bench`,
+  - Requires a `ghcide-bench` binary in the PATH (usually provided by stack/cabal),
+  - Calls `cabal` (or `stack`, configurable) internally to build the project,
+  - Driven by the `hist.yaml` configuration file.
+    By default it compares HEAD with "master"

 Further details available in the module header comments.
--- a/bench/hist.yaml
+++ b/bench/hist.yaml
@@ -2,6 +2,8 @@
 # At least 100 is recommended in order to observe space leaks
 samples: 100

+buildTool: stack
+
 # Path to the ghcide-bench binary to use for experiments
 ghcideBench: ghcide-bench

@@ -37,6 +39,6 @@ versions:
 # - v0.0.6
 # - v0.1.0
 # - v0.2.0
-- upstream: upstream/master
+# - v0.3.0
+- upstream: origin/master
 - HEAD
-
--- a/bench/hist/Main.hs
+++ b/bench/hist/Main.hs
@@ -25,15 +25,14 @@
   For diff graphs, the "previous version" is the preceding entry in the list of versions
   in the config file. A possible improvement is to obtain this info via `git rev-list`.

-   The script relies on stack for building and running all the binaries.
-
   To execute the script:

-   > stack bench
+   > cabal/stack bench

   To build a specific analysis, enumerate the desired file artifacts

   > stack bench --ba "bench-hist/HEAD/results.csv bench-hist/HEAD/edit.diff.svg"
+   > cabal bench --benchmark-options "bench-hist/HEAD/results.csv bench-hist/HEAD/edit.diff.svg"

 -}
 {-# LANGUAGE DeriveAnyClass    #-}
@@ -42,6 +41,7 @@

 import Control.Applicative (Alternative (empty))
 import Control.Monad (when, forM, forM_, replicateM)
+import Data.Char (toLower)
 import Data.Foldable (find)
 import Data.Maybe (fromMaybe)
 import Data.Text (Text)
@@ -103,8 +103,10 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
      readSamples = askOracle $ GetSamples ()
      getParent = askOracle . GetParent

-  build <- liftIO $ outputFolder <$> readConfigIO config
+  configStatic <- liftIO $ readConfigIO config
  ghcideBenchPath <- ghcideBench <$> liftIO (readConfigIO config)
+  let build = outputFolder configStatic
+      buildSystem = buildTool configStatic

  phony "all" $ do
    Config {..} <- readConfig config
@@ -139,11 +141,8 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
    &%> \[out, ghcpath] -> do
      liftIO $ createDirectoryIfMissing True $ dropFileName out
      need =<< getDirectoryFiles "." ["src//*.hs", "exe//*.hs", "ghcide.cabal"]
-      cmd_
-          ( "stack --local-bin-path=" <> takeDirectory out
-              <> " --stack-yaml=stack88.yaml build ghcide:ghcide --copy-bins --ghc-options -rtsopts"
-          )
-      Stdout ghcLoc <- cmd (s "stack --stack-yaml=stack88.yaml exec which ghc")
+      cmd_ $ buildGhcide buildSystem (takeDirectory out)
+      ghcLoc <- findGhc buildSystem
      writeFile' ghcpath ghcLoc

  [ build -/- "*/ghcide",
@@ -155,13 +154,8 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
      commitid <- readFile' $ b </> ver </> "commitid"
      cmd_ $ "git worktree add bench-temp " ++ commitid
      flip actionFinally (cmd_ (s "git worktree remove bench-temp --force")) $ do
-        Stdout ghcLoc <- cmd [Cwd "bench-temp"] (s "stack --stack-yaml=stack88.yaml exec which ghc")
-        cmd_
-          [Cwd "bench-temp"]
-          ( "stack --local-bin-path=../"
-              <> takeDirectory out
-              <> " --stack-yaml=stack88.yaml build ghcide:ghcide --copy-bins --ghc-options -rtsopts"
-          )
+        ghcLoc <- findGhc buildSystem
+        cmd_ [Cwd "bench-temp"] $ buildGhcide buildSystem (".." </> takeDirectory out)
        writeFile' ghcpath ghcLoc

  priority 8000 $
@@ -198,7 +192,7 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
                RemEnv "GHC_PACKAGE_PATH",
                AddPath [takeDirectory ghcPath, "."] []
              ]
-              ghcideBenchPath
+              ghcideBenchPath $
              [ "--timeout=3000",
                "-v",
                "--samples=" <> show samples,
@@ -208,7 +202,8 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
                "--ghcide=" <> ghcide,
                "--select",
                unescaped (unescapeExperiment (Escaped $ dropExtension exp))
-              ]
+              ] ++
+              [ "--stack" | Stack == buildSystem]
          cmd_ Shell $ "mv *.benchmark-gcStats " <> dropFileName outcsv

  build -/- "results.csv" %> \out -> do
@@ -259,7 +254,30 @@ main = shakeArgs shakeOptions {shakeChange = ChangeModtimeAndDigest} $ do
        title = show (unescapeExperiment exp) <> " - live bytes over time"
    plotDiagram False diagram out

----------------------------------------------------------------------------------------------------
+--------------------------------------------------------------------------------
+
+buildGhcide :: BuildSystem -> String -> String
+buildGhcide Cabal out = unwords
+    ["cabal install"
+    ,"exe:ghcide"
+    ,"--installdir=" ++ out
+    ,"--install-method=copy"
+    ,"--overwrite-policy=always"
+    ,"--ghc-options -rtsopts"
+    ]
+buildGhcide Stack out =
+    "stack --local-bin-path=" <> out
+        <> " build ghcide:ghcide --copy-bins --ghc-options -rtsopts"
+
+
+findGhc :: BuildSystem -> Action FilePath
+findGhc Cabal =
+    liftIO $ fromMaybe (error "ghc is not in the PATH") <$> findExecutable "ghc"
+findGhc Stack = do
+    Stdout ghcLoc <- cmd (s "stack exec which ghc")
+    return ghcLoc
+
+--------------------------------------------------------------------------------

 data Config = Config
  { experiments :: [Unescaped String],
@@ -268,7 +286,8 @@ data Config = Config
    -- | Path to the ghcide-bench binary for the experiments
    ghcideBench :: FilePath,
    -- | Output folder ('foo' works, 'foo/bar' does not)
-    outputFolder :: String
+    outputFolder :: String,
+    buildTool :: BuildSystem
  }
  deriving (Generic, Show)
  deriving anyclass (FromJSON, ToJSON)
@@ -312,6 +331,18 @@ findPrev name (x : y : xx)
  | otherwise = findPrev name (y : xx)
 findPrev name _ = name

+data BuildSystem = Cabal | Stack
+  deriving (Eq, Read, Show)
+
+instance FromJSON BuildSystem where
+    parseJSON x = fromString . map toLower <$> parseJSON x
+      where
+        fromString "stack" = Stack
+        fromString "cabal" = Cabal
+        fromString other = error $ "Unknown build system: " <> other
+
+instance ToJSON BuildSystem where
+    toJSON = toJSON . show
 ----------------------------------------------------------------------------------------------------

 -- | A line in the output of -S