graphql-engine/server/bench-wrk

Benchmarking Hasura GraphQL Engine

The script hge_wrk_bench.py benchmarks a given version of Hasura GraphQL Engine using a set of GraphQL queries. The results are stored (in the results GraphQL engine) along with details such as the version of the GraphQL engine that was benchmarked and the version of the Postgres database. The stored results make it possible to compare benchmarks across different versions of GraphQL engine.

Setup

The setup includes two Postgres databases with the sportsdb schema and data, and two GraphQL engines, one running on each database. One of the GraphQL engines is then added as a remote schema to the other.

The data is the same in both databases. The tables reside in different database schemas in order to avoid GraphQL schema conflicts.

The methods in the script sportsdb_setup.py help in setting up the databases, starting the Hasura GraphQL engines, and setting up relationships. The script can either take the URLs of already running Postgres databases as input, or start the databases as Docker containers. The GraphQL engines can be run either with cabal run or as Docker containers.
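As an illustration of the remote schema step, here is a minimal sketch of the kind of request body that registers one engine as a remote schema on another, using Hasura's add_remote_schema metadata operation. The function name, the schema name remote_hge and the URL are assumptions made for this example; sportsdb_setup.py may build its requests differently.

```python
import json


def add_remote_schema_payload(name, remote_graphql_url):
    """Build a request body for Hasura's add_remote_schema metadata operation."""
    return {
        "type": "add_remote_schema",
        "args": {
            "name": name,
            "definition": {
                "url": remote_graphql_url,
                "forward_client_headers": False,
            },
        },
    }


# Hypothetical name and URL for the second ("remote") engine.
payload = add_remote_schema_payload(
    "remote_hge", "http://localhost:8081/v1/graphql"
)
print(json.dumps(payload, indent=2))
```

The body would be sent as JSON to the metadata endpoint of the main engine; which endpoint that is depends on the engine version.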

Run benchmark

  • Install Python 3.7.6 using pyenv
$ pyenv install 3.7.6
  • Install dependencies for the Python script in a virtual environment.
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
  • To run benchmarks, do
$ python3 hge_wrk_bench.py

This script uses wrk to benchmark Hasura GraphQL Engine against a list of queries defined in queries.graphql. The results are then stored through a results Hasura GraphQL Engine.

You can configure the build and runtime parameters for the graphql-engines under test by modifying your local cabal.project.local file.

Interpreting the plots

For each query under test we first run wrk to try to determine the maximum throughput we can sustain for that query. This result is plotted under the max throughput graph. This can be considered the point after which graphql-engine will start to fall over.

Then for each query we measure latency under several different loads (making sure not to approach max throughput) using wrk2, which drives a constant request rate and so measures latency in a principled way. Latency can be viewed as a continuous histogram or as a violin plot that also plots each latency sample. The latter provides the most visual information and can be useful for observing clustering or other patterns, or for validating the benchmark run.
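To make the two-phase procedure concrete, here is a minimal sketch of how request rates for the latency runs could be derived from the measured maximum throughput. The function name and the fractions are assumptions chosen for illustration; the rates that hge_wrk_bench.py actually uses may differ.

```python
def latency_test_rates(max_rps, fractions=(0.2, 0.4, 0.6, 0.8)):
    """Return request rates (requests/sec) that stay below the measured maximum.

    max_rps is the maximum sustained throughput found in the first phase.
    Each fraction must lie strictly between 0 and 1 so that no latency run
    approaches the point where the server starts to fall over.
    """
    if max_rps <= 0:
        raise ValueError("max_rps must be positive")
    for fraction in fractions:
        if not 0 < fraction < 1:
            raise ValueError("fractions must be strictly between 0 and 1")
    return [max(1, round(max_rps * fraction)) for fraction in fractions]


# Example: a query that sustained 1000 requests/sec in the first phase.
print(latency_test_rates(1000))
```

Each returned rate would then be passed to wrk2 as its target request rate for one latency run.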

Cleaning up test runs

Data will be stored locally in the work directory (test_output by default). This entire directory can be deleted safely.

If you are using the default results graphql-engine and want to just remove old benchmark runs but avoid rebuilding the sportsdb data, you can do:

$ sudo rm -r test_output/{benchmark_runs,sportsdb_data}

Arguments

  • For the list of arguments supported, do
$ python3 hge_wrk_bench.py --help

Postgres

  • In order to use already running Postgres databases, use argument --pg-urls PG_URL,REMOTE_PG_URL, or set the environmental variable with export HASURA_BENCH_PG_URLS=PG_URL,REMOTE_PG_URL
  • Set the docker image using argument --pg-docker-image DOCKER_IMAGE, or environmental variable HASURA_BENCH_PG_DOCKER_IMAGE
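The argument-or-environment pattern above can be sketched as follows. This is only an illustration of how a comma-separated pair of URLs might be resolved, not the parser used by hge_wrk_bench.py; the function name and the error message are assumptions.

```python
import argparse
import os


def parse_pg_urls(argv, environ=None):
    """Resolve (PG_URL, REMOTE_PG_URL) from --pg-urls or HASURA_BENCH_PG_URLS.

    Returns None when neither is set, in which case a caller would start
    the databases as Docker containers instead.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # The command-line argument takes precedence over the environment variable.
    parser.add_argument("--pg-urls", default=environ.get("HASURA_BENCH_PG_URLS"))
    args = parser.parse_args(argv)
    if not args.pg_urls:
        return None
    urls = [url.strip() for url in args.pg_urls.split(",")]
    if len(urls) != 2 or not all(urls):
        parser.error("expected exactly two URLs: PG_URL,REMOTE_PG_URL")
    return tuple(urls)


# Hypothetical database URLs.
print(parse_pg_urls(
    ["--pg-urls", "postgres://u:p@localhost:5432/db,postgres://u:p@localhost:5433/db"]
))
```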

GraphQL Engine

  • In order to run as a Docker container, use argument --hge-docker-image DOCKER_IMAGE, or environmental variable HASURA_BENCH_HGE_DOCKER_IMAGE
  • To skip stack build, use argument --skip-stack-build

wrk

  • Number of open connections can be set using argument --connections CONNECTIONS, or environmental variable HASURA_BENCH_CONNECTIONS
  • Duration of tests can be controlled using argument --duration DURATION, or environmental variable HASURA_BENCH_DURATION
  • To skip showing plots at the end of the benchmarks, use argument --skip-plots
  • The Hasura GraphQL Engine to which results should be pushed can be specified using argument --results-hge-url HGE_URL, or environmental variable HASURA_BENCH_RESULTS_HGE_URL. By default the launched (non-"remote") graphql-engine will be used, and its data stored in test_output/sportsdb_data. The admin secret for this GraphQL engine can be specified using environmental variable HASURA_BENCH_RESULTS_HGE_ADMIN_SECRET.
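For the results engine settings, a minimal sketch of how the URL and admin secret might be turned into request settings is shown below. X-Hasura-Admin-Secret is the header Hasura uses for admin secrets; the function name and the default_url parameter are assumptions for this example and do not reflect the script's internals.

```python
import os


def results_hge_config(environ=None, default_url=None):
    """Return (url, headers) for pushing results to the results engine.

    default_url stands in for the URL of the launched (non-"remote") engine,
    which is used when HASURA_BENCH_RESULTS_HGE_URL is not set.
    """
    environ = os.environ if environ is None else environ
    url = environ.get("HASURA_BENCH_RESULTS_HGE_URL", default_url)
    headers = {"Content-Type": "application/json"}
    secret = environ.get("HASURA_BENCH_RESULTS_HGE_ADMIN_SECRET")
    if secret:
        # Only send the admin secret header when a secret is configured.
        headers["X-Hasura-Admin-Secret"] = secret
    return url, headers
```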

Work directory

  • The files used by Postgres docker containers, the logs of Hasura GraphQL engines run with cabal run, and other benchmark artifacts are stored in the work directory.
  • Storing data volumes of Postgres docker containers in the work directory (test_output by default) helps in avoiding database setup time for benchmarks after the first time setup.
  • The logs of Hasura GraphQL engines (when they are run using cabal run) are stored in files hge.log and remote_hge.log

Default settings

  • Postgres databases will be run as docker containers
  • Hasura GraphQL Engines by default will be run using cabal run
  • With wrk
    • Number of threads used by wrk will be number of CPUs
    • Number of connections = 50
    • Test duration = 5 minutes (300 sec)
  • By default the results are stored in the Hasura GraphQL Engine used for benchmarking.
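The wrk defaults listed above map onto a wrk command line roughly as sketched here. The flags -t, -c, -d, -s and --latency are standard wrk options; the function name, the Lua script name and the URL are assumptions for illustration, and the exact invocation made by hge_wrk_bench.py may differ.

```python
import os


def wrk_command(url, script, connections=50, duration_secs=300, threads=None):
    """Build a wrk argument list using the defaults described above."""
    # Default to one wrk thread per CPU. wrk expects connections >= threads.
    threads = threads or os.cpu_count() or 1
    return [
        "wrk",
        "-t", str(threads),
        "-c", str(connections),
        "-d", f"{duration_secs}s",
        "--latency",
        "-s", script,  # Lua script that shapes the request, e.g. the POST body
        url,
    ]


# Hypothetical script name and engine URL.
print(" ".join(wrk_command("http://localhost:8080/v1/graphql", "bench.lua")))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting issues.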

Storing results

  • The results are stored in schema hge_bench.
  • For schema, see file results_schema.yaml
  • The main table is hge_bench.results. This table stores the following details
    • cpu_key: This is a foreign key reference to cpu_info(key). The table cpu_info captures the various parameters of the CPU on which the benchmark was run, including the model and number of vCPUs
    • query_name: This is a foreign key reference to gql_query(name). The table gql_query stores the name of the query and the query itself used for tests.
    • docker_image: Stores the Docker image of Hasura GraphQL Engine when HGE is run as a Docker container
    • server_shasum, version: These are stored when HGE is run with cabal run. The version column stores the version generated by the script gen-version.sh. The server_shasum column stores the shasum of the files in the server folder (excluding the tests folder). This shasum shows whether the server code has actually changed between commits.
    • postgres_version: Stores the version of Postgres
    • latency, requests_per_sec: Stores the benchmark latency and requests_per_sec results
    • wrk_parameters: Stores the parameters used by wrk during benchmarking, including number of threads, total number of open connections, and duration of tests
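To illustrate the idea behind server_shasum, here is a minimal sketch of a directory checksum that ignores test folders. It is an assumption-laden illustration, not a port of get-server-sha.sh: the function name, the choice of SHA-256, and excluding every directory named tests at any depth are all choices made for this example.

```python
import hashlib
import os


def tree_sha256(root, exclude_dirs=("tests",)):
    """Hash the relative paths and contents of all files under root.

    Directories whose name is in exclude_dirs are skipped at any depth,
    so changes that only touch tests leave the checksum unchanged.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place and fix the traversal order
        # so the result does not depend on filesystem listing order.
        dirnames[:] = sorted(d for d in dirnames if d not in exclude_dirs)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            digest.update(os.path.relpath(path, root).encode("utf-8"))
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()
```

Two commits that produce the same checksum for the server folder can then be treated as having identical server code for benchmarking purposes.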

The simplest way to setup the benchmark

  • Note: This method currently works only on Linux instances.
  • Run the benchmarks on a Docker image using
$ python3 hge_wrk_bench.py --hge-docker-image DOCKER_IMAGE
  • The command will prompt for a WORK_DIR, which will store all the results, volumes and databases.
  • To compare the results with another Docker build, run the same command again with the modified DOCKER_IMAGE and the same WORK_DIR.
  • If the catalog versions of the two Docker builds differ, run the benchmarks first on the image with the lower catalog version, and then on the image with the higher catalog version.

Steps to run benchmarks on a new linux hosted instance

  • Install Docker and Python 3.
  • Optional: install ghcup (cabal and ghc will be installed with it). You need cabal set up only when you want to run the benchmarks on a branch directly (i.e. when there is no Docker image for it).
  • Run the benchmarks following the steps in "The simplest way to setup the benchmark" above.