daml/bazel_tools/ghc-lib
Andreas Herrmann e8001fc4ff
Speed up ghc-lib(-parser) sdist generation (#14076)
* Update ghc-lib to speed up sdist generation

To incorporate https://github.com/digital-asset/ghc-lib/pull/379, which
is a more generic version of the changes introduced in
41ab1c2cba
to speed up ghc-lib(-parser) sdist generation in Daml.

The slowest step in the ghc-lib(-parser) sdist generation is the
generation of `.hs` files from `.hsc` files via `hsc2hs` and from `.x`
or `.y` files via `alex` or `happy`. The reason that it's slow is that
`ghc-lib-gen` performs these through GHC's build system, hadrian, and
these steps require almost a full stage1 GHC build.

The `.hs` files are only needed to enable dependency discovery through
`ghc -M`, as it doesn't understand `.hsc|.x|.y` files. Apart from that
we can use the original `.hsc|.x|.y` files in the final sdist.

With this update `ghc-lib-gen` finds all relevant `.hsc|.x|.y` files
and replaces them with dummy `.hs` files that have the same module name
and the same imports. These dummy files are only used for the purposes
of dependency discovery via `ghc -M` and are not included in the final
sdist.
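
The stub idea can be illustrated with a minimal Python sketch. This is not `ghc-lib-gen`'s actual code (which is written in Haskell and may work differently); the function name `make_stub` and the regular expressions are assumptions made for illustration. It keeps only the module name and the imported module names, which is what dependency discovery needs.

```python
import re

# Module header at the start of a line, e.g. `module Lexer (lex) where`.
MODULE_RE = re.compile(r"^module[ \t]+([A-Z][\w.']*)", re.MULTILINE)

# Top-level import at the start of a line. Captures an optional
# `{-# SOURCE #-}` pragma and the imported module name; `qualified`,
# aliases and import lists are dropped because they do not affect
# which modules are depended on.
IMPORT_RE = re.compile(
    r"^import[ \t]+(\{-#[ \t]*SOURCE[ \t]*#-\}[ \t]+)?(?:qualified[ \t]+)?([A-Z][\w.']*)",
    re.MULTILINE,
)


def make_stub(source: str) -> str:
    """Return a dummy .hs module with the same module name and imports.

    Hypothetical helper: `source` is the text of a .hsc, .x or .y file.
    """
    header = MODULE_RE.search(source)
    if header is None:
        raise ValueError("no module header found")
    lines = [f"module {header.group(1)} where"]
    for pragma, module in IMPORT_RE.findall(source):
        prefix = "{-# SOURCE #-} " if pragma else ""
        lines.append(f"import {prefix}{module}")
    return "\n".join(lines) + "\n"


example = """{
module Lexer (lex) where
import qualified Data.Map as M
import {-# SOURCE #-} GHC.Types (Foo)
import Data.Char
  ( isDigit )
}
"""
print(make_stub(example))
```

The sketch only recognises headers and imports that start at the beginning of a line, so indented or CPP-guarded imports would need extra handling.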

With this update the sdist generation is sped up by a factor of 4.3 to
4.5:
- ghc-lib-parser: 3m2s down to 42.04s (4.3x)
- ghc-lib: 3m5s down to 40.96s (4.5x)
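
As a quick check of the arithmetic, the factors follow from dividing the old time by the new time. The helper names `to_seconds` and `speedup` below are made up for this sketch; only the four timings come from the measurements above.

```python
import re


def to_seconds(duration: str) -> float:
    """Parse durations such as '3m2s' or '42.04s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", duration)
    if match is None:
        raise ValueError(f"unrecognised duration: {duration!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def speedup(before: str, after: str) -> float:
    """Speedup factor: old wall-clock time divided by new wall-clock time."""
    return to_seconds(before) / to_seconds(after)


timings = [
    ("ghc-lib-parser", "3m2s", "42.04s"),
    ("ghc-lib", "3m5s", "40.96s"),
]
for name, before, after in timings:
    print(f"{name}: {speedup(before, after):.1f}x")
# prints:
# ghc-lib-parser: 4.3x
# ghc-lib: 4.5x
```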

I've applied `diffoscope` to the generated sdist tarballs with and
without this update to ensure that no unexpected differences are
introduced with this change.

https://github.com/digital-asset/ghc-lib/pull/379 reports a less
dramatic speedup: roughly a one-third reduction in build time for
`stack runhaskell CI.hs`. The reason for the discrepancy is that `CI.hs`
performs more steps than just the sdist generation, e.g. checking out
GHC's source tree, or building hadrian. These steps are not included in
the above benchmarks, because they are executed in separate Bazel
actions and can be cached separately.

CHANGELOG_BEGIN
CHANGELOG_END

* Update Cabal files

Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
2022-06-09 09:15:19 +02:00
ghc-lib Speed up ghc-lib(-parser) sdist generation (#14076) 2022-06-09 09:15:19 +02:00
ghc-lib-parser Speed up ghc-lib(-parser) sdist generation (#14076) 2022-06-09 09:15:19 +02:00
BUILD.bazel Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
BUILD.ghc Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
BUILD.ghc-lib-gen Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
BUILD.nix-deps Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
defs.bzl Use -DDAML_PRIM cpp flag for building ghc-lib{,-parser} (#14074) 2022-06-03 15:04:23 +00:00
ghc-lib-no-stack.patch Speed up ghc-lib(-parser) sdist generation (#14076) 2022-06-09 09:15:19 +02:00
lib.sh Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
README.md Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
repositories.bzl Bazelify ghc-lib (#12508) 2022-05-19 10:49:16 +02:00
version.bzl Speed up ghc-lib(-parser) sdist generation (#14076) 2022-06-09 09:15:19 +02:00

GHC-LIB

This setup builds ghc-lib and ghc-lib-parser entirely within Bazel, including the generation of the Cabal sdists performed by ghc-lib-gen.

The generated sdists are built by Bazel using haskell_cabal_library rules and then included in the stack_snapshot rule for @stackage as vendored_packages.

Note, stack_snapshot's vendoring mechanism requires the Cabal files to be source files, i.e. it cannot reference generated Cabal files. Therefore, the Cabal files are checked in and need to be updated when ghc-lib(-parser) changes.

Build

  • To build the sdists of ghc-lib(-parser) use the following commands:
    $ bazel build @da-ghc//:ghc-lib-parser
    $ bazel build @da-ghc//:ghc-lib
    
  • To build the Cabal libraries ghc-lib(-parser) use the following commands:
    $ bazel build //bazel_tools/ghc-lib/ghc-lib-parser
    $ bazel build //bazel_tools/ghc-lib/ghc-lib
    
    Alternatively, use the aliases exposed by stack_snapshot's vendored_packages:
    $ bazel build @stackage//:ghc-lib-parser
    $ bazel build @stackage//:ghc-lib
    
  • To depend on ghc-lib(-parser) use the targets exposed by stack_snapshot:
    @stackage//:ghc-lib-parser
    @stackage//:ghc-lib
    

Update

Note, updating any of the following may change ghc-lib(-parser)'s Cabal files. If so, update the checked-in Cabal files as described in the last item below.

  • To update the GHC revision used to build ghc-lib, change the GHC_REV variable within bazel_tools/ghc-lib/version.bzl. If needed, update the patches listed within GHC_PATCHES, see below.
  • To update the GHC version used to build ghc-lib also update the GHC_FLAVOR and GHC_LIB_VERSION variables within bazel_tools/ghc-lib/version.bzl.
  • To update the ghc-lib revision, which provides the ghc-lib-gen tool, change the GHC_LIB_REV and GHC_LIB_SHA256 variables within bazel_tools/ghc-lib/version.bzl and update the patches in GHC_LIB_PATCHES if needed.
  • To update the checked in Cabal files execute the following Bazel commands.
    $ bazel run //bazel_tools/ghc-lib/ghc-lib-parser:cabal-update
    $ bazel run //bazel_tools/ghc-lib/ghc-lib:cabal-update
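
When changing GHC_LIB_REV, the matching GHC_LIB_SHA256 value has to be recomputed. The sketch below assumes that GHC_LIB_SHA256 is the hex-encoded SHA-256 digest of the ghc-lib source archive for that revision (this README does not spell that out) and that the archive has already been downloaded. The helper name and the archive path in the usage note are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large archives need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `print(sha256_of_file("ghc-lib-archive.tar.gz"))` (a placeholder file name) prints a 64-character hex string that could then be copied into version.bzl.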
    

Patches

  • ghc-lib-no-stack.patch patches ghc-lib-gen to use a prebuilt Hadrian binary. With this change ghc-lib-gen no longer requires stack.