server: dev build modes

Add some configurations for modern profiling modes, and integration into dev.sh

These require cabal 3.8 due to the use of `import`

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/7671
GitOrigin-RevId: f793f64105cfd99fb51b247fa8bc050f6d4bd23e
kodiakhq[bot] 2023-02-08 22:41:09 +00:00 committed by hasura-bot
parent 7a0b2b85bb
commit b01d3a45de
18 changed files with 328 additions and 15 deletions

View File

@ -146,6 +146,8 @@ constraints: any.Cabal ==3.6.3.0,
any.ghc-bignum ==1.2,
any.ghc-boot ==9.2.5,
any.ghc-boot-th ==9.2.5,
any.ghc-debug-convention ==0.4.0.0,
any.ghc-debug-stub ==0.4.0.0,
any.ghc-heap ==9.2.5,
any.ghc-heap-view ==0.6.3,
any.ghc-prim ==0.8.0,

cabal/README.md Normal file
View File

@ -0,0 +1,91 @@
At the top level we have a `cabal.project` file that defines project
configuration settings that stay the same, regardless of whether we're doing
local development, building on CI, etc.
Additionally in this directory we have various `cabal.project.local` files that
add or override settings depending on context:
- `ci-*.project.local` - these are used when building in CI (see `.buildkite/`)
- `dev-sh.project.local` - this is used to configure the environment we expect in
`scripts/dev.sh`, and is where we put good local development defaults
- `dev-sh-optimized.project.local` - above, but building with optimizations,
suitable for prod-like performance testing
- `dev-sh-prof-*.project.local` - Various profiling modes (see below)
Because cabal only gives us `--project-file` to select a configuration, we need
each of these local overrides to have a symlink back to the top-level
`cabal.project`. So e.g. if you wanted to build with CI settings locally you
would do:
$ cabal build --project-file=cabal/ci.project exe:graphql-engine
Likewise for the freeze file symlinks.
In cabal-install 3.8 we can have `import`s, which we also use.
Here's a helper for making new configurations:
```
function hasura_new_sub_config () {
cd "$(git rev-parse --show-toplevel)/cabal"
ln -s ../cabal.project.freeze "$1.project.freeze"
ln -s ../cabal.project "$1.project"
touch "$1.project.local"
echo "continue editing: $1.project.local"
cd - &>/dev/null
}
```
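As a rough illustration of the layout the helper produces, the sketch below rebuilds it in a scratch directory instead of a real checkout. The configuration name `my-experiment` and the scratch directory are assumptions made for this example only; the `ln -s` and `touch` steps mirror the helper above.

```shell
# Sketch only: build the symlink layout in a throwaway directory (no git needed).
scratch="$(mktemp -d)"
mkdir "$scratch/cabal"
touch "$scratch/cabal.project" "$scratch/cabal.project.freeze"

name="my-experiment"   # hypothetical configuration name

(
  cd "$scratch/cabal" || exit 1
  # Same three steps as the helper: two symlinks plus an empty local override.
  ln -s ../cabal.project.freeze "$name.project.freeze"
  ln -s ../cabal.project "$name.project"
  touch "$name.project.local"
)

# The new project file is only a pointer back to the top-level one:
target="$(readlink "$scratch/cabal/$name.project")"
echo "$target"   # prints ../cabal.project

# Remove "$scratch" by hand once you are done looking at it.
```

In a real checkout, `--project-file=cabal/my-experiment.project` would then select the shared settings through the symlink, together with whatever is put in `my-experiment.project.local`.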
-------------------------------------------------------------------
## Profiling modes
See the `graphql-engine --prof-*` flags in `dev.sh` for the happy path to use these modes.
### `+RTS -hi` (info map) Heap Profiles
Every distinct constructor allocation site is instrumented with source code
information.
See: `dev-sh-prof-heap-infomap.project.local`
- **Try it when**: you want to go deeper debugging high resident memory during development
- **Benefits**: doesn't inhibit optimizations, very granular
- **Downsides**: must recompile, binary sizes can get large, sometimes source info is confusing
### ghc-debug
A set of client and server libraries for snapshotting and arbitrarily analyzing
the Haskell heap
- **Try it when**: you need to answer any sort of complex question about what's in memory; e.g.
why is something retained? do we have many identical strings in memory?
- **Benefits**: extremely powerful, can run on production without restart
- **Downsides**: analysis passes can take time to write and debug, analyzing large heaps requires care
### “Ticky ticky” profiling
Generates a report for all allocations, even those that are very short-lived;
quasi-time profiling
See: `dev-sh-prof-ticky.project.local`
- **Try it when**: debugging a regression in bytes allocated, comparing two different versions of code
- **Benefits**: see and compare allocations directly, doesn't inhibit optimizations
- **Downsides**: STG can take time to decipher, the program gets very slow, not suitable for production
### `-fprof-late` time profiling
**NOT YET IMPLEMENTED**
Time profiling that instruments code after all significant optimizations have
been performed, so it doesn't distort the profile (on 9.4+ only, but plug-in
available)
See: `dev-sh-prof-time.project.local`
- **Try it when**: you want to try to make some code faster, or understand where the time is being spent in the system
- **Benefits**: get call stacks, granular view of execution time
- **Downsides**: requires recompilation, STG can be confusing, not suitable for production
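
For orientation, the sketch below summarizes which project file each `dev.sh graphql-engine` flag selects in this change. The function name `project_file_for_flag` is hypothetical and is not part of `dev.sh`; the real script assigns `CABAL_PROJECT_FILE` inline in its `case` arms. `--prof-time` is treated as unsupported here because it is not yet implemented.

```shell
# Hypothetical helper (not in dev.sh): map a flag to the project file
# that the matching case arm assigns to CABAL_PROJECT_FILE.
project_file_for_flag() {
  case "${1-}" in
    "")
      echo cabal/dev-sh.project ;;
    --optimized)
      echo cabal/dev-sh-optimized.project ;;
    --prof-ticky)
      echo cabal/dev-sh-prof-ticky.project ;;
    # ghc-debug only needs the IPE info, so it reuses the heap-infomap build:
    --prof-heap-infomap|--prof-ghc-debug)
      echo cabal/dev-sh-prof-heap-infomap.project ;;
    *)
      echo "unsupported flag: $1" >&2
      return 1 ;;
  esac
}

project_file_for_flag --prof-ghc-debug   # prints cabal/dev-sh-prof-heap-infomap.project
```

Because `--prof-ghc-debug` shares a project file with `--prof-heap-infomap`, switching between those two modes should not need a full rebuild.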

View File

@ -0,0 +1 @@
../cabal.project

View File

@ -0,0 +1 @@
../cabal.project.freeze

View File

@ -0,0 +1,12 @@
-- This configuration is used by `dev.sh graphql-engine --optimized` and is
-- also a good choice for running a local optimized build
-- This requires cabal-install >=3.8
import: cabal/dev-sh.project.local
---- (reminder: top-level means apply these to all local packages)
flags: +optimize-hasura
-- be faster:
documentation: false
-- coverage inhibits performance:
coverage: false

View File

@ -0,0 +1 @@
../cabal.project

View File

@ -0,0 +1 @@
../cabal.project.freeze

View File

@ -0,0 +1,20 @@
-- This configuration is used by `dev.sh graphql-engine --prof-heap-infomap`
-- and `--prof-ghc-debug`.
--
-- TODO COMING UP: We would like to turn on these flags always. But for now the
-- resulting binary size increase is too large. This will improve in GHC 9.4, but
-- also we may need to wait until further work on compressing IPE info lands.
-- (Likewise we might want to always link with -eventlog)
import: cabal/dev-sh-optimized.project.local
-- apply these to all LOCAL packages
-- TODO would be nice to refactor other dev-sh.project.local to use 'program-options' as well (and force cabal 3.8)
program-options
ghc-options: -fdistinct-constructor-tables -finfo-table-map
-- For each module, STG will be dumped to:
-- dist-newstyle/**/*.dump-stg-final
ghc-options: -ddump-stg-final -ddump-to-file
package graphql-engine
ghc-options: -eventlog

View File

@ -0,0 +1 @@
../cabal.project

View File

@ -0,0 +1 @@
../cabal.project.freeze

View File

@ -0,0 +1,11 @@
-- Enable ticky profiling, powers `dev.sh graphql-engine --prof-ticky`
-- https://ghc.gitlab.haskell.org/ghc/doc/users_guide/profiling.html#using-ticky-ticky-profiling-for-implementors
--
-- run with e.g. `+RTS -r outfilepath`
import: cabal/dev-sh-optimized.project.local
program-options
ghc-options: -ticky
-- TODO doesn't seem to work with -ticky??:
ghc-options: -ddump-stg-final -ddump-to-file

View File

@ -0,0 +1 @@
../cabal.project

View File

@ -0,0 +1 @@
../cabal.project.freeze

View File

@ -0,0 +1,13 @@
-- THIS IS JUST A PLACEHOLDER FOR NOW. WE CAN ENABLE THIS AND INTEGRATE IT
-- INTO DEV.SH AFTER MOVING TO GHC 9.4
import: cabal/dev-sh-optimized.project.local
profiling: True
flags: +profiling
package *
profiling-detail: none
ghc-options: -fprof-late
-- For each module, STG will be dumped to:
-- dist-newstyle/**/*.dump-stg-final
ghc-options: -ddump-stg-final -ddump-to-file

View File

@ -19,9 +19,6 @@ package *
-j2
+RTS -A64m -n2m -RTS
-- Modify this to '+optimize-hasura' to enable optimizations. Be sure also to comment
-- 'coverage: true' below for full prod performance.
--
-- NOTE: new-build may report a misleading 'Build profile: -O1'
-- See:https://github.com/haskell/cabal/issues/6221
flags: -optimize-hasura

View File

@ -12,8 +12,8 @@ shopt -s globstar
# document describing how to do various dev tasks (or worse yet, not writing
# one), make it runnable
#
# This makes use of 'cabal/dev-sh.project' files when building.
# See 'cabal/dev-sh.project.local' for details.
# This makes use of 'cabal/dev-sh*.project' files when building.
# See 'cabal/dev-sh.project.local' for details, and $CABAL_PROJECT_FILE below.
#
# The configuration for the containers of each backend is stored in
# separate files, see files in 'scripts/containers'
@ -36,9 +36,16 @@ Usage: $0 <COMMAND>
Available COMMANDs:
graphql-engine
graphql-engine [--optimized | --prof-ticky | --prof-heap-infomap | --prof-ghc-debug] [-- <extra_args>]
Launch graphql-engine, connecting to a database launched with
'$0 postgres'.
'$0 postgres'. <extra_args> will be passed to graphql-engine directly.
--optimized : will launch a prod-like optimized build
--prof-ticky : "Ticky ticky" profiling for accounting of allocations (see: cabal/README.md)
--prof-heap-infomap : Heap profiling (see: cabal/README.md)
--prof-ghc-debug : Enable ghc-debug (see: cabal/README.md)
--prof-time : NOT YET IMPLEMENTED (TODO After 9.4) (see: cabal/README.md)
postgres
Launch a postgres container suitable for use with graphql-engine, watch its
@ -72,6 +79,10 @@ EOL
exit 1
}
# The default configuration this script expects. May be overridden depending on
# flags passed to subcommands, or this can be edited for one-off tests:
CABAL_PROJECT_FILE=cabal/dev-sh.project
# Prettify JSON output, if possible
try_jq() {
if command -v jq >/dev/null; then
@ -83,11 +94,97 @@ try_jq() {
case "${1-}" in
graphql-engine)
# pass arguments after '--' directly to engine:
GRAPHQL_ENGINE_EXTRA_ARGS=()
case "${2-}" in
--no-rebuild)
echo_error 'The --no-rebuild option is no longer supported.'
die_usage
;;
--prof-ticky)
echo_warn "This will delete any 'graphql-engine.ticky' and perform significant recompilation. Ok?"
echo_warn "Press enter to continue [will proceed in 10s]"
read -r -t10 || true
# Avoid confusion:
rm -f graphql-engine.ticky
CABAL_PROJECT_FILE=cabal/dev-sh-prof-ticky.project
HASURA_PROF_MODE=ticky
GRAPHQL_ENGINE_EXTRA_ARGS+=( +RTS -r -RTS )
case "${3-}" in
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:4}" )
;;
esac
;;
--prof-heap-infomap)
echo_warn "This will delete any 'graphql-engine.eventlog' and 'graphql-engine.eventlog.html' and perform significant recompilation. Ok?"
echo_warn "Press enter to continue [will proceed in 10s]"
read -r -t10 || true
# Avoid confusion:
rm -f graphql-engine.eventlog
rm -f graphql-engine.eventlog.html
CABAL_PROJECT_FILE=cabal/dev-sh-prof-heap-infomap.project
HASURA_PROF_MODE=heap-infomap
GRAPHQL_ENGINE_EXTRA_ARGS+=( +RTS -hi -l-agu -RTS )
case "${3-}" in
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:4}" )
;;
esac
;;
--prof-ghc-debug)
# Used by ghc-debug-stub:
export GHC_DEBUG_SOCKET=/tmp/ghc-debug
echo_warn "This will require significant recompilation unless you just ran with --prof-heap-infomap "
echo_warn "A GHC debug socket will be opened at $GHC_DEBUG_SOCKET"
echo_warn "See examples of client code here: https://github.com/hasura/hasura-debug/"
echo_warn "Press enter to continue [will proceed in 10s]"
read -r -t10 || true
# NOTE: we just need IPE info so can re-use this:
CABAL_PROJECT_FILE=cabal/dev-sh-prof-heap-infomap.project
# This will open the debug socket:
export HASURA_GHC_DEBUG=true
HASURA_PROF_MODE=ghc-debug
case "${3-}" in
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:4}" )
;;
esac
;;
--prof-time)
die_usage # NOT YET IMPLEMENTED
echo_warn "This will delete any graphql-engine.prof and perform significant recompilation."
echo_warn "Press enter to continue [will proceed in 10s]"
read -r -t10 || true
rm -f graphql-engine.prof
rm -f graphql-engine.profiterole.html
CABAL_PROJECT_FILE=cabal/dev-sh-prof-time.project
HASURA_PROF_MODE="time"
GRAPHQL_ENGINE_EXTRA_ARGS+=( +RTS -P -RTS )
case "${3-}" in
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:4}" )
;;
esac
;;
--optimized)
CABAL_PROJECT_FILE=cabal/dev-sh-optimized.project
case "${3-}" in
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:4}" )
;;
esac
;;
--)
GRAPHQL_ENGINE_EXTRA_ARGS+=( "${@:3}" )
;;
"")
;;
*)
@ -269,7 +366,50 @@ if [ "$MODE" = "graphql-engine" ]; then
# Attempt to run this after a CTRL-C:
function cleanup {
echo
# Generate coverage, which can be useful for debugging or understanding
### Run analysis or visualization tools, if we ran in one of the profiling modes
case "${HASURA_PROF_MODE-}" in
ticky)
echo_warn "Done. View the ticky report at: graphql-engine.ticky"
echo_warn "See: https://downloads.haskell.org/ghc/latest/docs/users_guide/profiling.html#using-ticky-ticky-profiling-for-implementors"
echo_warn "Lookup referenced STG names dumped to their respective module files: dist-newstyle/**/*.dump-stg-final"
# TODO some analysis utilities:
# - sort by top
# - find dictionaries ("+" args)
;;
heap-infomap)
if command -v eventlog2html >/dev/null ; then
echo_warn "Running eventlog2html against the event log we just generated: graphql-engine.eventlog"
eventlog2html --bands 100 graphql-engine.eventlog
echo_warn "Done. View the report at: graphql-engine.eventlog.html"
echo_warn "Lookup referenced STG names dumped to their respective module files: dist-newstyle/**/*.dump-stg-final"
else
echo_warn "Please install eventlog2html"
fi
;;
ghc-debug)
# TODO maybe integrate snapshotting + common analysis here
;;
time)
if command -v profiterole >/dev/null ; then
echo_warn "Running profiterole..."
profiterole graphql-engine.prof
echo_warn "Done. Check out..."
echo_warn " - graphql-engine.prof ...for the top-down report"
echo_warn " - graphql-engine.profiterole.html ...for the browsable HTML report"
echo_warn "Lookup referenced STG names dumped to their respective module files: dist-newstyle/**/*.dump-stg-final"
else
echo_warn "Please install profiterole"
fi
;;
"")
;;
*)
echo_error "Bug!: HASURA_PROF_MODE = $HASURA_PROF_MODE"
exit 1
;;
esac
### Generate coverage, which can be useful for debugging or understanding
if command -v hpc >/dev/null && command -v jq >/dev/null ; then
# Get the appropriate mix dir (the newest one); this way this hopefully
# works when 'cabal/dev-sh.project.local' is edited to turn on
@ -315,17 +455,18 @@ if [ "$MODE" = "graphql-engine" ]; then
echo_pretty " $ $0 postgres"
echo_pretty ""
RUN_INVOCATION=(cabal new-run --project-file=cabal/dev-sh.project --RTS --
RUN_INVOCATION=(cabal new-run --project-file="$CABAL_PROJECT_FILE" --RTS --
exe:graphql-engine +RTS -N -T -s -RTS serve
--enable-console --console-assets-dir "$PROJECT_ROOT/frontend/dist/apps/server-assets-console-ce"
"${GRAPHQL_ENGINE_EXTRA_ARGS[@]}"
)
echo_pretty 'About to do:'
echo_pretty ' $ cabal new-build --project-file=cabal/dev-sh.project exe:graphql-engine'
echo_pretty " $ cabal new-build --project-file=$CABAL_PROJECT_FILE exe:graphql-engine"
echo_pretty " $ ${RUN_INVOCATION[*]}"
echo_pretty ''
cabal new-build --project-file=cabal/dev-sh.project exe:graphql-engine
cabal new-build --project-file="$CABAL_PROJECT_FILE" exe:graphql-engine
# We assume a PG is *already running*, and therefore bypass the
# cleanup mechanism previously set.
@ -471,7 +612,7 @@ elif [ "$MODE" = "test" ]; then
# seems to conflict now, causing re-linking, haddock runs, etc. Instead do a
# `graphql-engine version` to trigger build
cabal run \
--project-file=cabal/dev-sh.project \
--project-file="$CABAL_PROJECT_FILE" \
-- exe:graphql-engine \
--metadata-database-url="$PG_DB_URL" \
version
@ -489,7 +630,7 @@ elif [ "$MODE" = "test" ]; then
HASURA_GRAPHQL_DATABASE_URL="$PG_DB_URL" \
HASURA_MSSQL_CONN_STR="$MSSQL_CONN_STR" \
cabal run \
--project-file=cabal/dev-sh.project \
--project-file="$CABAL_PROJECT_FILE" \
test:graphql-engine-tests \
-- "${UNIT_TEST_ARGS[@]}"
fi
@ -517,7 +658,7 @@ elif [ "$MODE" = "test" ]; then
# Using --metadata-database-url flag to test multiple backends
# HASURA_GRAPHQL_PG_SOURCE_URL_* For a couple multi-source pytests:
cabal new-run \
--project-file=cabal/dev-sh.project \
--project-file="$CABAL_PROJECT_FILE" \
-- exe:graphql-engine \
--metadata-database-url="$PG_DB_URL" serve \
--stringify-numeric-types \

View File

@ -271,6 +271,7 @@ common lib-depends
, autodocodec-openapi3
, barbies
, base
, ghc-debug-stub
, bytestring
, containers
, data-default
@ -1007,6 +1008,7 @@ executable graphql-engine
main-is: Main.hs
build-depends: base
, graphql-engine
, ghc-debug-stub
, bytestring
, ekg-core
, ekg-prometheus

View File

@ -14,6 +14,7 @@ import Data.Text.Conversions (convertText)
import Data.Time.Clock (getCurrentTime)
import Data.Time.Clock.POSIX (getPOSIXTime)
import Database.PG.Query qualified as PG
import GHC.Debug.Stub
import GHC.TypeLits (Symbol)
import Hasura.App
import Hasura.Backends.Postgres.Connection.MonadTx
@ -31,12 +32,13 @@ import Hasura.Server.Prometheus (makeDummyPrometheusMetrics)
import Hasura.Server.Version
import Hasura.ShutdownLatch
import Hasura.Tracing (sampleAlways)
import System.Environment (lookupEnv)
import System.Exit qualified as Sys
import System.Metrics qualified as EKG
import System.Posix.Signals qualified as Signals
main :: IO ()
main =
main = maybeWithGhcDebug $ do
catch
do
args <- parseArgs
@ -139,3 +141,17 @@ data
AppMetricsSpec name metricType tags
ServerTimestampMs ::
AppMetricsSpec "ekg.server_timestamp_ms" 'EKG.CounterType ()
-- | 'withGhcDebug' but conditional on the environment variable
-- @HASURA_GHC_DEBUG=true@. When this is set a debug socket will be opened,
-- otherwise the server will start normally. This must only be called once and
-- its argument should be the program's @main@.
maybeWithGhcDebug :: IO a -> IO a
maybeWithGhcDebug theMain = do
lookupEnv "HASURA_GHC_DEBUG" >>= \case
Just "true" -> do
putStrLn "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
putStrLn "!!!!! Opening a ghc-debug socket !!!!!"
putStrLn "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
withGhcDebug theMain
_ -> theMain