docs: add prometheus grafana dashboard

[DOCS-1025]: https://hasurahq.atlassian.net/browse/DOCS-1025?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9377
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
GitOrigin-RevId: d60a53c5a0937f1ea2eb087a84df59e4121e61c8
This commit is contained in:
Rob Dominguez 2023-06-16 07:29:07 -05:00 committed by hasura-bot
parent ee255fec61
commit d711392fb1
19 changed files with 510 additions and 113 deletions

View File

@ -83,8 +83,8 @@ to be exported to your observability platform using the appropriate log drivers.
#### Metrics via Prometheus integration
You can export metrics of your Hasura Cloud project to Prometheus. You can configure this on the `Integrations` tab on
the project's settings page. You can find more information on this [here](/enterprise/metrics.mdx). This page also
provides information about different metrics exported from Hasura to Prometheus.
the project's settings page. You can find more information on this
[here](/observability/prometheus/cloud-integration.mdx).
## Database observability
@ -107,8 +107,8 @@ be implemented:
[Query Tags](/observability/query-tags.mdx) are SQL comments that consist of `key=value` pairs that are appended to
generated SQL statements. When you issue a query or mutation with query tags, the generated SQL has some extra
information. Database analytics tools can use that information (metadata) in these comments to analyze DB load
and track or monitor performance.
information. Database analytics tools can use that information (metadata) in these comments to analyze DB load and track
or monitor performance.
### Using Query Tags and pganalyze

View File

@ -194,8 +194,8 @@ Whether or not to send the request body (graphql request/variables) to the auth
Stringify certain
[BigQuery numeric types](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_types),
specifically `bignumeric`, `float64`, `int64`, `numeric` and aliases thereof, as they don't fit into the `IEEE 754` spec
for JSON encoding-decoding.
specifically `bignumeric`, `float64`, `int64`, `numeric` and aliases thereof, as they don't fit into the
`IEEE 754` spec for JSON encoding-decoding.
| | |
| ------------------- | ---------------------------------------------- |
@ -211,14 +211,14 @@ for JSON encoding-decoding.
When metadata changes, close all WebSocket connections (with error code `1012`). This is useful when you want to ensure
that all clients reconnect to the latest metadata.
| | |
| ------------------- | ----------------------------------------------------- |
| | |
| ------------------- | ---------------------------------------------------- |
| **Flag** | `--disable-close-websockets-on-metadata-change` |
| **Env var** | `HASURA_GRAPHQL_CLOSE_WEBSOCKETS_ON_METADATA_CHANGE` |
| **Accepted values** | Boolean |
| **Options** | `true` or `false` |
| **Default** | `true` |
| **Supported in** | CE, Enterprise Edition, Cloud |
| **Accepted values** | Boolean |
| **Options** | `true` or `false` |
| **Default** | `true` |
| **Supported in** | CE, Enterprise Edition, Cloud |
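Client applications should treat close code `1012` as a signal to reconnect rather than as a failure. A minimal sketch of that client-side decision (illustrative only; the function name is ours, not part of any Hasura SDK):

```python
# Close code Hasura sends when it closes WebSockets on a metadata change.
METADATA_CHANGED_CLOSE_CODE = 1012

def should_reconnect(close_code: int) -> bool:
    """Return True when a client should transparently reconnect.

    1012 ("Service Restart") means the server closed the connection
    deliberately so that clients pick up the new metadata.
    """
    return close_code == METADATA_CHANGED_CLOSE_CODE

print(should_reconnect(1012))  # True
```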
### Connections per Read-Replica
@ -355,21 +355,21 @@ Enable the Hasura Console (served by the server on `/` and `/console`).
### Enable High-cardinality Labels for Metrics
Enable high-cardinality labels for [Prometheus Metrics](/enterprise/metrics.mdx). Enabling this setting will add more labels to
some of the metrics (e.g. `operation_name` label for Graphql subscription metrics).
Enable high-cardinality labels for [Prometheus Metrics](/observability/prometheus/metrics.mdx). Enabling this setting
will add more labels to some of the metrics (e.g. `operation_name` label for GraphQL subscription metrics).
| | |
| ------------------- | ------------------------------------------------- |
| **Flag** | N/A |
| **Env var** | `HASURA_GRAPHQL_METRICS_ENABLE_HIGH_CARDINALITY_LABELS` |
| **Accepted values** | Boolean |
| **Options** | `true` or `false` |
| **Default** | `true` |
| **Supported in** | Enterprise Edition |
| | |
| ------------------- | ------------------------------------------------------- |
| **Flag** | N/A |
| **Env var** | `HASURA_GRAPHQL_METRICS_ENABLE_HIGH_CARDINALITY_LABELS` |
| **Accepted values** | Boolean |
| **Options** | `true` or `false` |
| **Default** | `true` |
| **Supported in** | Enterprise Edition |
### Enable Log Compression
Enable sending compressed logs to [metrics server](/enterprise/metrics.mdx).
Enable sending compressed logs to [metrics server](/observability/prometheus/metrics.mdx).
| | |
| ------------------- | ------------------------------------------ |
@ -614,12 +614,12 @@ This variable sets the level for [Hasura's logs](/deployment/logging.mdx#logging
The [maximum cache size](/enterprise/caching.mdx), measured in MB, for queries.
| | |
| ------------------- | --------------------------------------------------------------------------------- |
| **Flag** | `--max-cache-size <SIZE_IN_MB>` |
| **Env var** | `HASURA_GRAPHQL_MAX_CACHE_SIZE` |
| **Accepted values** | Integer (Representing cache size measured in MB) |
| **Default** | `1` |
| | |
| ------------------- | ------------------------------------------------------------------------------------------------- |
| **Flag** | `--max-cache-size <SIZE_IN_MB>` |
| **Env var** | `HASURA_GRAPHQL_MAX_CACHE_SIZE` |
| **Accepted values** | Integer (Representing cache size measured in MB) |
| **Default** | `1` |
| **Supported in** | **Enterprise Edition**, **Cloud**: Standard / Professional tier is set to `100` MB as the default |
### Metadata Database Extension Schema
@ -873,8 +873,8 @@ The path to a shared CA store to use to connect to both (caching and rate-limiti
### Redis URL
The Redis URL to use for [query caching](/enterprise/caching.mdx) and [Webhook Auth
Caching](/auth/authentication/webhook.mdx#webhook-auth-caching).
The Redis URL to use for [query caching](/enterprise/caching.mdx) and
[Webhook Auth Caching](/auth/authentication/webhook.mdx#webhook-auth-caching).
| | |
| ------------------- | ---------------------------------------- |
@ -939,13 +939,13 @@ List of third-party identity providers to enable Single Sign-on authentication f
Multiplexed [streaming queries](/subscriptions/postgres/streaming/index.mdx) are split into batches of the specified
size.
| | |
| ------------------- | ------------------------------------------------------------------------- |
| **Flag** | `--streaming-queries-multiplexed-batch-size <SIZE>` |
| **Env var** | `HASURA_GRAPHQL_STREAMING_QUERIES_MULTIPLEXED_BATCH_SIZE` |
| **Accepted values** | Integer |
| **Default** | `100` |
| **Supported in** | CE, Enterprise Edition |
| | |
| ------------------- | --------------------------------------------------------- |
| **Flag** | `--streaming-queries-multiplexed-batch-size <SIZE>` |
| **Env var** | `HASURA_GRAPHQL_STREAMING_QUERIES_MULTIPLEXED_BATCH_SIZE` |
| **Accepted values** | Integer |
| **Default** | `100` |
| **Supported in** | CE, Enterprise Edition |
### Streaming Queries Multiplexed Refetch Interval
@ -1023,7 +1023,7 @@ This identifies an [unauthorized role](/auth/authentication/unauthenticated-acce
| **Accepted values** | String |
| **Example** | Setting this value to `anonymous`, whenever the `Authorization` header is absent, the request's role will default to `anonymous`. |
| **Default** | `null` |
| **Supported in** | CE, Enterprise Edition, Cloud |
| **Supported in** | CE, Enterprise Edition, Cloud |
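The fallback described in the example row can be sketched as follows (a toy illustration, not engine code; the function and return values are hypothetical):

```python
def effective_role(headers: dict, unauthorized_role: str = "anonymous") -> str:
    # Simplified sketch: when the Authorization header is absent, the
    # request assumes the configured unauthorized role; otherwise the
    # role is resolved from the session/token (elided here).
    if "Authorization" not in headers:
        return unauthorized_role
    return "(role resolved from the session)"

print(effective_role({}))  # anonymous
```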
### Use Prepared Statements

View File

@ -33,4 +33,4 @@ When you are ready to move Hasura to production, check out our
continuous availability.
- We recommend running Hasura with at least 4 CPU cores and a minimum of 8 GB RAM in production. Please set autoscaling
on CPU.
- [Enable and consume metrics](/enterprise/metrics.mdx/#enable-metrics-endpoint).
- [Enable and consume metrics](/observability/prometheus/index.mdx).

View File

@ -165,7 +165,7 @@ import Enterprise from '@site/static/icons/features/enterprise.svg';
</p>
</div>
</VersionedLink>
<VersionedLink to="/enterprise/metrics/">
<VersionedLink to="/observability/prometheus/index/">
<div className="card">
<h3>Metrics via Prometheus</h3>
<p>Learn how to configure Prometheus in Hasura Enterprise Edition to monitor your GraphQL API.</p>

View File

@ -50,8 +50,8 @@ events queue to the webhook.
<ProductBadge self />
Hasura exposes a set of [Prometheus metrics](/enterprise/metrics.mdx) that can be used to monitor the Event Trigger
system and help diagnose performance issues.
Hasura exposes a set of [Prometheus metrics](/observability/prometheus/metrics.mdx) that can be used to monitor the
Event Trigger system and help diagnose performance issues.
### Event fetch time per batch

View File

@ -1,5 +1,5 @@
{
"label": "Integrations",
"position": 2,
"position": 3,
"className": "cloud-icon"
}

View File

@ -222,5 +222,5 @@ choice:
- [Datadog](/observability/integrations/datadog.mdx)
- [New Relic](/observability/integrations/newrelic.mdx)
- [Azure monitor](/observability/integrations/azure-monitor.mdx)
- [Prometheus](/observability/integrations/prometheus.mdx)
- [Prometheus](/observability/prometheus/cloud-integration.mdx)
- [OpenTelemetry](/observability/integrations/opentelemetry.mdx)

View File

@ -35,13 +35,10 @@ import Observability from '@site/static/icons/features/observability.svg';
.
</p>
<p>
For our Enterprise customers, we have a set of pre-built dashboards and alerting rules configured with the Prometheus Grafana Jaeger
stack, with which you can monitor and debug Hasura. These dashboards will be available soon and integrated with Hasura Cloud
too. You can read more and explore these dashboards &nbsp;
<VersionedLink to="/observability/pre-built-dashboards/">
here
</VersionedLink>
.
For our Enterprise customers, we have a set of pre-built dashboards and alerting rules configured with the
Prometheus Grafana Jaeger stack, with which you can monitor and debug Hasura. These dashboards will be available
soon and integrated with Hasura Cloud too. You can read more and explore these dashboards &nbsp;
<VersionedLink to="/observability/prometheus/pre-built-dashboards/">here</VersionedLink>.
</p>
<h4>Quick Links</h4>
<ul>
@ -52,7 +49,7 @@ import Observability from '@site/static/icons/features/observability.svg';
<VersionedLink to="/observability/how-it-works">Learn how Observability works.</VersionedLink>
</li>
<li>
<VersionedLink to="/observability/pre-built-dashboards">Pre-built dashboards.</VersionedLink>
<VersionedLink to="/observability/prometheus/pre-built-dashboards">Pre-built dashboards.</VersionedLink>
</li>
</ul>
</div>
@ -73,7 +70,7 @@ import Observability from '@site/static/icons/features/observability.svg';
<p>Connect your Hasura GraphQL API to OpenTelemetry-compliant services.</p>
</div>
</VersionedLink>
<VersionedLink to="/observability/integrations/prometheus/">
<VersionedLink to="/observability/prometheus/cloud-integration/">
<div className="card">
<h3>Prometheus</h3>
<p>Connect your Hasura GraphQL API to Prometheus.</p>

View File

@ -0,0 +1,5 @@
{
"label": "Prometheus",
"position": 2,
"className": "cloud-and-enterprise-icon"
}

View File

@ -1,6 +1,6 @@
---
sidebar_label: Prometheus
sidebar_position: 4
sidebar_label: Integrate with Hasura Cloud
sidebar_position: 3
description: Prometheus Integration on Hasura Cloud
title: 'Cloud: Prometheus Integration'
keywords:
@ -18,7 +18,7 @@ import Thumbnail from '@site/src/components/Thumbnail';
import HeadingIcon from '@site/src/components/HeadingIcon';
import ProductBadge from '@site/src/components/ProductBadge';
# Prometheus Integration
# Prometheus Integration for Hasura Cloud
<ProductBadge standard pro ee />

View File

@ -0,0 +1,353 @@
---
sidebar_position: 2
sidebar_label: Integrate with Hasura EE
title: 'Integrate Prometheus with Hasura EE'
description: Install Prometheus server and Grafana to create a basic observability dashboard for Hasura.
keywords:
- hasura
- docs
- cloud
- integrations
- exporter
- integration
- observability
- prometheus
- grafana
- monitoring
- monitoring framework
---
import HeadingIcon from '@site/src/components/HeadingIcon';
import ProductBadge from '@site/src/components/ProductBadge';
import Thumbnail from '@site/src/components/Thumbnail';
# Integrate Prometheus with Hasura EE and build a Grafana Dashboard
<ProductBadge self />
## Overview
This guide will help you set up a basic observability dashboard for Hasura using Prometheus and Grafana. We have two
approaches depending on your use case:
- **Self-hosted**: If you are running Prometheus and Grafana on your own infrastructure, follow the
[self-hosted installation](#self-hosted-installation) instructions.
- **Containerized**: If you are running Prometheus and Grafana in a containerized environment, follow the
[containerized installation](#containerized-installation) instructions.
## Self-hosted installation
### Install and configure Prometheus
#### Step 1. Set up the environment
You will need to create a Prometheus user and group, and a directory for Prometheus to store its data. You will also
need to create a directory for Prometheus to store its configuration files.
This section is written based on an Ubuntu/Debian installation environment. The following commands will help you prepare
your environment:
```bash
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
sudo mkdir /var/lib/prometheus
for i in rules rules.d files_sd; do sudo mkdir -p /etc/prometheus/${i}; done
```
#### Step 2. Install Prometheus
The following set of commands will help you download and install Prometheus:
```bash
sudo apt update
sudo apt -y install wget curl
mkdir -p /tmp/prometheus && cd /tmp/prometheus
curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest |
grep browser_download_url | grep linux-amd64 | cut -d '"' -f 4 | wget -qi -
tar xvf prometheus*.tar.gz
cd prometheus*/
sudo mv prometheus promtool /usr/local/bin/
```
You can check to see if Prometheus is installed correctly by running the following command:
```bash
prometheus --version
```
#### Step 3. Connect Prometheus to Hasura
To connect Prometheus to Hasura, you will need to create a configuration file for Prometheus. The following commands
will help you do this:
```bash
sudo cp -rpf prometheus.yml /etc/prometheus/prometheus.yml
sudo mv consoles/ console_libraries/ /etc/prometheus/
```
Then, you'll need to edit the Prometheus configuration file (`/etc/prometheus/prometheus.yml`) to include the changes
listed below:
```yaml
# my global config
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global evaluation_interval.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'graphQL'
    metrics_path: '/v1/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['hasura_deployment_url:8080']
```
#### Step 4. Set firewall rules
If you are using a firewall, you will need to set the following rules:
```bash
sudo ufw allow 9090/tcp
```
#### Step 5. Set up a password for Prometheus web access
To set up a password for Prometheus web access, you will need to create a hashed password. First, we'll create the YAML
file which will store the password. Inside `/etc/prometheus/`, run the following:
```bash
sudo touch web.yml
```
Then, we'll install bcrypt:
```bash
sudo apt install python3-bcrypt -y
```
Then, we'll create a hashed password via a Python script called `genpass.py`, which we can store anywhere:
```python
import getpass
import bcrypt
password = getpass.getpass("password: ")
hashed_password = bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt())
print(hashed_password.decode())
```
You can then run the script, using the command below, and enter your password when prompted:
```bash
python3 genpass.py
```
The output will be a hashed password. Copy this password and paste it into the `web.yml` file, as shown below:
```yaml
basic_auth_users:
  admin: <YOUR_HASHED_PASSWORD>
```
To verify your work, use `promtool` to check the configuration file:
```bash
promtool check web-config /etc/prometheus/web.yml
```
#### Step 6. Restart Prometheus
To restart Prometheus, run the following command:
```bash
sudo systemctl restart prometheus
```
Then, test the password by running:
```bash
curl -u admin:<YOUR_PASSWORD> http://localhost:9090/metrics
```
You should see a response similar to the one below:
```bash
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
# etc...
```
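Each line of this output follows Prometheus's text exposition format (`name{labels} value`). As a rough illustration, a sample line can be picked apart with a few lines of Python (a simplified sketch, not a spec-complete parser):

```python
import re

# Matches `metric_name{label="value",...} 1.23` — a simplified subset of
# the Prometheus text exposition format (ignores timestamps, HELP/TYPE
# comments, and label values containing commas or escaped quotes).
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)

def parse_sample(line):
    m = SAMPLE_RE.match(line.strip())
    if not m:
        return None  # e.g. a "# HELP" or "# TYPE" comment line
    labels = {}
    if m.group('labels'):
        for pair in m.group('labels').split(','):
            key, value = pair.split('=', 1)
            labels[key] = value.strip('"')
    return m.group('name'), labels, float(m.group('value'))

print(parse_sample('go_gc_duration_seconds{quantile="0.25"} 0'))
# ('go_gc_duration_seconds', {'quantile': '0.25'}, 0.0)
```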
### Install and configure Grafana
#### Step 7. Install Grafana
You can install Grafana by running the following commands:
```bash
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt update
sudo apt install grafana
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
```
At this point, your Grafana server should be available at `http://<YOUR_IP_ADDRESS>:3000` where you'll find the login
screen. The default username and password are both `admin`.
:::info Change the default password
After logging in, you will be prompted to change the default password. Set your new password and login.
:::
#### Step 8. Create a Prometheus data source
In Grafana, from the settings icon on the sidebar, open the `Configuration` menu and select `Data Sources`. Then, click
on `Add data source` and select `Prometheus` as the type.
Then, set the appropriate URL for your Prometheus server (e.g., `http://localhost:9090`) and click `Save & Test`. If
everything is working correctly, you should see a green `Data source is working` message.
<Thumbnail
src="/img/enterprise/prometheus/create-prometheus-data-source.png"
alt="Create Prometheus data source"
width="1000px"
/>
#### Step 9. Create a Prometheus graph
Click the graph title and select `Edit`. Then, select the `Metrics` tab and choose your Prometheus data source. Next,
enter any Prometheus expression into the `Query` field, using the `Metric` field to look up metrics via autocomplete.
<Thumbnail
src="/img/enterprise/prometheus/create-a-prometheus-graph.png"
alt="Create Prometheus data source"
width="1000px"
/>
:::info Formatting legend names
To format the legend names of time series, use the "Legend format" input. For example, to show only the method and
status labels of a returned query result, separated by a dash, you could use the legend format string
`{{method}} - {{status}}`.
<Thumbnail
src="/img/enterprise/prometheus/formatting-legend-names.png"
alt="Create Prometheus data source"
width="1000px"
/>
:::
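The `{{label}}` placeholders are substituted from each series' label set. As a toy illustration of that substitution (our own sketch, not Grafana's actual template engine):

```python
import re

def render_legend(template, labels):
    """Toy version of Grafana's legend-format substitution: replaces
    each {{label}} placeholder with that label's value (empty if absent)."""
    return re.sub(r'\{\{\s*(\w+)\s*\}\}',
                  lambda m: labels.get(m.group(1), ''), template)

print(render_legend('{{method}} - {{status}}',
                    {'method': 'GET', 'status': '200'}))
# GET - 200
```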
## Containerized installation
### Install and configure Prometheus and Grafana
#### Step 1. Prepare the Prometheus configuration file
Create a file named `prometheus.yml` on your host with the following information:
```yaml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global evaluation_interval.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'hasura'
    metrics_path: '/v1/metrics'
    static_configs:
      - targets: ['ip_address_of_hasura_installation:8080']
```
#### Step 2. Pull the Prometheus and Grafana Docker containers
For Prometheus, run the following command:
```bash
docker run -p 9090:9090 -v /path/to/your/local/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
```
Then, for Grafana, run the following:
```bash
docker run -d -p 3000:3000 grafana/grafana-enterprise
```
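With both containers running, you can confirm that the `hasura` scrape job is healthy via Prometheus's standard `/api/v1/targets` HTTP API. A small helper to summarize that response, shown here against an illustrative payload (the response shape is Prometheus's documented API; the values are made up):

```python
def target_health(payload):
    """Map each scrape job to its health ('up'/'down') from a
    Prometheus /api/v1/targets API response."""
    return {
        t['labels']['job']: t['health']
        for t in payload['data']['activeTargets']
    }

# Illustrative response, as returned by GET http://localhost:9090/api/v1/targets
sample = {
    'status': 'success',
    'data': {'activeTargets': [
        {'labels': {'job': 'prometheus'}, 'health': 'up'},
        {'labels': {'job': 'hasura'}, 'health': 'up'},
    ]},
}
print(target_health(sample))
# {'prometheus': 'up', 'hasura': 'up'}
```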
### Configure Grafana
#### Step 3. Add Prometheus as a data source in Grafana
In Grafana, from the settings icon on the sidebar, open the `Configuration` menu and select `Data Sources`. Then, click
on `Add data source` and select `Prometheus` as the type.
Then, set the appropriate URL for your Prometheus server (e.g., `http://localhost:9090`) and click `Save & Test`. If
everything is working correctly, you should see a green `Alerting supported` message.
<Thumbnail
src="/img/enterprise/prometheus/create-prometheus-data-source.png"
alt="Create Prometheus data source"
width="1000px"
/>
#### Step 4. Add Hasura metrics to the dashboard
Click on the `Add Panel` icon in the top-right corner of the Grafana dashboard. Then, select `Add New Panel` or
`Add New Row`.
<Thumbnail
src="/img/enterprise/prometheus/add-metrics-to-dashboard.png"
alt="Create Prometheus data source"
width="1000px"
/>
Click on the `Metric` section and start typing `hasura`; you should see a list of available Hasura metrics. Select the
metric you want to add to the dashboard.
<Thumbnail
src="/img/enterprise/prometheus/create-a-prometheus-graph.png"
alt="Create Prometheus data source"
width="1000px"
/>

View File

@ -0,0 +1,38 @@
---
slug: index
title: 'EE: Integrations with Prometheus'
description: Configure integrations with Hasura Enterprise Edition
keywords:
- hasura
- docs
- enterprise
- ee
- integrations
- exporter
- integration
- observability
- monitoring
- monitoring framework
- prometheus
---
import HeadingIcon from '@site/src/components/HeadingIcon';
import ProductBadge from '@site/src/components/ProductBadge';
# Integrate Prometheus with Hasura Enterprise Edition
<ProductBadge self pro ee />
## Overview
In this section, you'll find information on how to integrate [Prometheus](https://prometheus.io/) with Hasura Enterprise
Edition:
- [Available metrics](/observability/prometheus/metrics.mdx): Learn about metrics available to monitor the health,
performance and reliability of the Hasura GraphQL Engine.
- [Integrate with Hasura Cloud](/observability/prometheus/cloud-integration.mdx): Configure Prometheus integration with
  Hasura Cloud.
- [Integrate Prometheus with Hasura EE and build a Grafana Dashboard](/observability/prometheus/grafana-dashboard.mdx):
  Install Prometheus and Grafana and build a basic observability dashboard for Hasura Enterprise Edition.
- [Pre-built dashboards](/observability/prometheus/pre-built-dashboards.mdx): Learn about pre-built dashboards available
for Hasura Enterprise Edition.

View File

@ -1,12 +1,12 @@
---
sidebar_label: Metrics via Prometheus
sidebar_label: Available Metrics
description: Metrics via Prometheus for Hasura Enterprise Edition
title: 'Enterprise Edition: Metrics via Prometheus'
keywords:
- hasura
- docs
- enterprise
sidebar_position: 4
sidebar_position: 1
---
import ProductBadge from '@site/src/components/ProductBadge';
@ -59,14 +59,14 @@ The following metrics are exported by Hasura GraphQL Engine:
The following metrics can be used to monitor the performance of Hasura Event Triggers system:
- [`hasura_event_fetch_time_per_batch_seconds`](/event-triggers/observability-and-performace.mdx/#event-fetch-time-per-batch)
- [`hasura_event_invocations_total`](/event-triggers/observability-and-performace.mdx/#event-invocations-total)
- [`hasura_event_processed_total`](/event-triggers/observability-and-performace.mdx/#event-processed-total)
- [`hasura_event_processing_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-processing-time)
- [`hasura_event_queue_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-queue-time)
- [`hasura_event_trigger_http_workers`](/event-triggers/observability-and-performace.mdx/#event-triggers-http-workers)
- [`hasura_event_webhook_processing_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-webhook-processing-time)
- [`hasura_events_fetched_per_batch`](/event-triggers/observability-and-performace.mdx/#events-fetched-per-batch)
- [`hasura_event_fetch_time_per_batch_seconds`](/event-triggers/observability-and-performace.mdx/#event-fetch-time-per-batch)
- [`hasura_event_invocations_total`](/event-triggers/observability-and-performace.mdx/#event-invocations-total)
- [`hasura_event_processed_total`](/event-triggers/observability-and-performace.mdx/#event-processed-total)
- [`hasura_event_processing_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-processing-time)
- [`hasura_event_queue_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-queue-time)
- [`hasura_event_trigger_http_workers`](/event-triggers/observability-and-performace.mdx/#event-triggers-http-workers)
- [`hasura_event_webhook_processing_time_seconds`](/event-triggers/observability-and-performace.mdx/#event-webhook-processing-time)
- [`hasura_events_fetched_per_batch`](/event-triggers/observability-and-performace.mdx/#events-fetched-per-batch)
### Subscription Metrics
@ -80,7 +80,8 @@ The following metrics can be used to monitor the performance of subscriptions:
### Hasura cache request count
Tracks cache hit and miss requests, which helps in monitoring and optimizing cache utilization. You can read more about this [here](/caching/caching-metrics.mdx).
Tracks cache hit and miss requests, which helps in monitoring and optimizing cache utilization. You can read more about
this [here](/caching/caching-metrics.mdx).
| | |
| ------ | ---------------------------- |
@ -115,11 +116,11 @@ webhook.
Execution time of successful GraphQL requests (excluding subscriptions). If more requests are falling in the higher
buckets, you should consider [tuning the performance](/deployment/performance-tuning.mdx).
| | |
| ------ | -------------------------------------------------------------- |
| Name | `hasura_graphql_execution_time_seconds` |
| Type | Histogram<br /><br />Buckets: 0.01, 0.03, 0.1, 0.3, 1, 3, 10 |
| Labels | `operation_type`: query \| mutation |
| | |
| ------ | ------------------------------------------------------------ |
| Name | `hasura_graphql_execution_time_seconds` |
| Type | Histogram<br /><br />Buckets: 0.01, 0.03, 0.1, 0.3, 1, 3, 10 |
| Labels | `operation_type`: query \| mutation |
### Hasura GraphQL requests total
@ -179,16 +180,15 @@ Current number of active PostgreSQL connections. Compare this to
### Hasura source health
Health check status of a particular data source, corresponding to the output of
`/healthz/sources`, with possible values 0 through 3 indicating, respectively:
OK, TIMEOUT, FAILED, ERROR. See the [Source Health Check API Reference](/api-reference/source-health.mdx)
for details.
Health check status of a particular data source, corresponding to the output of `/healthz/sources`, with possible values
0 through 3 indicating, respectively: OK, TIMEOUT, FAILED, ERROR. See the
[Source Health Check API Reference](/api-reference/source-health.mdx) for details.
| | |
| ------ | -------------------------------------- |
| Name | `hasura_source_health` |
| Type | Gauge |
| Labels | `source_name`: name of the database |
| | |
| ------ | ----------------------------------- |
| Name | `hasura_source_health` |
| Type | Gauge |
| Labels | `source_name`: name of the database |
### Hasura WebSocket connections

View File

@ -19,14 +19,16 @@ import Thumbnail from '@site/src/components/Thumbnail';
import ProductBadge from '@site/src/components/ProductBadge';
## Subscription Execution
For serving subscription requests, Hasura optimizes the subscription execution to ensure it is as fast as possible while
not overloading the database with concurrent queries.
To achieve this, Hasura uses a combination of the following techniques:
- **Grouping same queries into "cohorts"**: Hasura groups subscriptions with the same set of query and session
variables into a single cohort. The subscribers in a cohort are updated simultaneously.
- **Diff-checking**: On receiving response from the database, Hasura checks the diff between the old and new values
and sends the response only to the subscribers whose values have changed.
- **Diff-checking**: On receiving a response from the database, Hasura checks the diff between the old and new values and
sends the response only to the subscribers whose values have changed.
- **Multiplexing**: Hasura groups similar "parameterized" subscriptions together and additionally splits them into
batches for efficient performance on the database. The batch size can be configured using the
[`HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_BATCH_SIZE`](deployment/graphql-engine-flags/reference.mdx#multiplexed-batch-size)
@ -35,15 +37,15 @@ To achieve this, Hasura uses a combination of the following techniques:
are grouping the three subscriptions into a single multiplexed query.
<Thumbnail
src='/img/databases/postgres/subscriptions/subscription-multiplexing.png'
alt='Hasura subscription multiplexing AST'
width='900px'
className='no-shadow'
src="/img/databases/postgres/subscriptions/subscription-multiplexing.png"
alt="Hasura subscription multiplexing AST"
width="900px"
className="no-shadow"
/>
For more details on how Hasura executes subscriptions, refer to the [live
query](/subscriptions/postgres/livequery/execution.mdx) or [streaming
subscription](/subscriptions/postgres/streaming/index.mdx) documentation.
For more details on how Hasura executes subscriptions, refer to the
[live query](/subscriptions/postgres/livequery/execution.mdx) or
[streaming subscription](/subscriptions/postgres/streaming/index.mdx) documentation.
## Observability
@ -137,7 +139,6 @@ some extra process time for other tasks the poller does during a single poll. In
| Type | Histogram<br /><br />Buckets: 0.000001, 0.0001, 0.01, 0.1, 0.3, 1, 3, 10, 30, 100 |
| Labels | `subscription_kind`: streaming \| live-query, `operation_name`, `parameterized_query_hash` |
### Subscription Database Execution Time
The time taken to run the subscription's multiplexed query in the database for a single batch.
@ -159,9 +160,9 @@ consider investigating the subscription query and see if indexes can help improv
## Golden Signals for subscriptions
You can perform [Golden
Signals-based](https://sre.google/sre-book/monitoring-distributed-systems/#xref_monitoring_golden-signals) system
monitoring with Hasura's exported metrics. The following are the golden signals for subscriptions:
You can perform
[Golden Signals-based](https://sre.google/sre-book/monitoring-distributed-systems/#xref_monitoring_golden-signals)
system monitoring with Hasura's exported metrics. The following are the golden signals for subscriptions:
### Latency
@ -170,11 +171,12 @@ latency, you can monitor the [`hasura_subscription_total_time_seconds`](#subscri
If the value of this metric is high, then it may be an indication that the multiplexed query is taking longer to execute
in the database, verify this with
[`hasura_subscription_db_execution_time_seconds`](#subscription-database-execution-time)
metric. If the value of this metric is high as well, you can do the following:
[`hasura_subscription_db_execution_time_seconds`](#subscription-database-execution-time) metric. If the value of this
metric is high as well, you can do the following:
- Check if any database index can help improve the performance of the query, [analyzing the GraphQL query](#analyze)
will show the generated multiplexed query.
- Avoid querying unnecessary fields that translate to joins or function calls in the GraphQL query.
- Consider adding more read replicas to the database and running subscriptions on them.
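To reason about what a high P99 on this metric means, it can help to see how a quantile is read out of a Prometheus histogram. The sketch below is an illustrative Python re-implementation of PromQL's `histogram_quantile()` linear interpolation, using the bucket bounds documented above for `hasura_subscription_total_time_seconds`; the cumulative counts are invented sample data, not real Hasura output.

```python
# Illustrative only: estimate a quantile from Prometheus-style cumulative
# histogram buckets, mirroring PromQL's histogram_quantile(). The bucket
# bounds match the ones documented for
# hasura_subscription_total_time_seconds; the counts are made-up samples.

BUCKET_BOUNDS = [0.000001, 0.0001, 0.01, 0.1, 0.3, 1, 3, 10, 30, 100, float("inf")]

def histogram_quantile(q, cumulative_counts, bounds=BUCKET_BOUNDS):
    """Linearly interpolate the q-th quantile inside its bucket."""
    total = cumulative_counts[-1]  # +Inf bucket holds all observations
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in zip(bounds, cumulative_counts):
        if count >= rank:
            if bound == float("inf"):
                # PromQL falls back to the highest finite bound here
                return prev_bound
            bucket_obs = count - prev_count
            fraction = (rank - prev_count) / bucket_obs if bucket_obs else 0.0
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return bounds[-2]

# Made-up cumulative counts: most subscription polls finish in under 1s.
counts = [0, 5, 120, 700, 900, 980, 995, 999, 1000, 1000, 1000]
p99 = histogram_quantile(0.99, counts)  # falls in the 1s-3s bucket
```

In Prometheus itself the equivalent query would be `histogram_quantile(0.99, sum(rate(hasura_subscription_total_time_seconds_bucket[5m])) by (le))`; the sketch only shows the interpolation that query performs.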
### Traffic
@ -188,7 +190,8 @@ number of Hasura instances to handle the load.
### Errors
Errors in subscriptions can be monitored using the following metrics:
- [`hasura_graphql_requests_total{type="subscription",response_status="error"}`](/enterprise/metrics.mdx#hasura-graphql-requests-total):
- [`hasura_graphql_requests_total{type="subscription",response_status="error"}`](/observability/prometheus/metrics.mdx#hasura-graphql-requests-total):
Total number of errors that happen before the subscription is started (i.e., validation, parsing, and authorization
errors).
- [`hasura_active_subscription_pollers_in_error_state`](#active-subscription-pollers-in-error-state): Number of
@ -204,10 +207,11 @@ Saturation is the threshold until which the subscriptions can run smoothly; once
performance issues, and abrupt disconnections of the connected subscribers.
To monitor the saturation for subscriptions, you can monitor the following:
- CPU and memory usage of Hasura instances.
- For postgres backends, you can monitor the
[`hasura_postgres_connections`](/enterprise/metrics.mdx#hasura-postgres-connections) metric to see the number of
connections opened by Hasura with the database.
[`hasura_postgres_connections`](/observability/prometheus/metrics.mdx#hasura-postgres-connections) metric to see the
number of connections opened by Hasura with the database.
- P99 of the [`hasura_subscription_total_time_seconds`](#subscription-total-time) metric.
If the number of database connections is high, you can consider increasing the `max_connections` of the database.
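As a rough illustration of the saturation checks above, they can be expressed as a small helper. This is not an official Hasura check: the 80% connection-headroom threshold and the choice to compare P99 against the refetch interval are assumptions made for this sketch.

```python
# Illustrative saturation checks built from the metrics discussed above.
# Thresholds are assumptions for this sketch, not Hasura recommendations.

def saturation_warnings(pg_connections, max_connections, p99_total_time,
                        refetch_interval=1.0, conn_headroom=0.8):
    """Return human-readable warnings for simple saturation conditions.

    pg_connections   -- current value of hasura_postgres_connections
    max_connections  -- the source's configured max_connections
    p99_total_time   -- P99 of hasura_subscription_total_time_seconds (s)
    """
    warnings = []
    if pg_connections >= conn_headroom * max_connections:
        warnings.append("postgres connection pool near max_connections")
    if p99_total_time >= refetch_interval:
        warnings.append("p99 subscription time exceeds the refetch interval")
    return warnings
```

For example, 45 open connections against `max_connections = 50` would trip the first warning, suggesting either raising `max_connections` or adding read replicas as described above.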
@ -222,9 +226,9 @@ to analyze and tune the performance of subscriptions.
### Analyze
Using the `Analyze` button on GraphiQL API Explorer of the Hasura Console, you can see the generated
multiplexed SQL and its query plan. The query plan can reveal the bottlenecks in query execution and appropriate indexes
can be added to improve performance.
Using the `Analyze` button on GraphiQL API Explorer of the Hasura Console, you can see the generated multiplexed SQL and
its query plan. The query plan can reveal the bottlenecks in query execution and appropriate indexes can be added to
improve performance.
In addition to these, simplifying the subscription to avoid unnecessary joins or avoiding fetching fields which are not
going to change can also help improve performance.
@ -236,20 +240,20 @@ The parameters governing the performance of subscriptions in terms of throughput
- `HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL`
- Time interval between Hasura multiplexed queries to the DB for fetching changes in subscriptions data, default = 1
sec
- Increasing this reduces the frequency of queries to the DB, thereby reducing its load and improving throughput
while increasing the latency of updates to the clients.
- Increasing this reduces the frequency of queries to the DB, thereby reducing its load and improving throughput while
increasing the latency of updates to the clients.
- `HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_BATCH_SIZE`
- Number of similar subscriptions batched into a single SQL query to the DB, default = 100
- Increasing this reduces the number of SQL statements fired to the DB, thereby reducing its load and improving
throughput while increasing individual SQL execution times and latency.
- You should reduce this value if the execution time of the SQL generated by Hasura after multiplexing is more
than the refetch interval.
- You should reduce this value if the execution time of the SQL generated by Hasura after multiplexing is more than
the refetch interval.
- `max_connections` of the source
- Max number of connections Hasura opens with the DB, default = 50 (configurable via [update source metadata
API](api-reference/metadata-api/source.mdx))
- Max number of connections Hasura opens with the DB, default = 50 (configurable via
[update source metadata API](api-reference/metadata-api/source.mdx))
- Increasing this increases the number of connections Hasura can open with the DB to execute queries, thereby
improving concurrency **but adding load to the database**. A very high number of concurrent connections can result in poor
DB performance.
improving concurrency **but adding load to the database**. A very high number of concurrent connections can result
in poor DB performance.
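A back-of-envelope model can make these trade-offs concrete. The sketch below assumes a single cohort of similar subscriptions (Hasura multiplexes only similar subscriptions together) and ignores poller bookkeeping overhead, so it is a simplification rather than a performance guarantee.

```python
import math

# Back-of-envelope model of the tuning parameters above, for ONE cohort of
# similar subscriptions. Real deployments have many cohorts and extra
# per-poll overhead; this only illustrates the direction of each trade-off.

def db_queries_per_second(active_subscriptions, batch_size=100,
                          refetch_interval=1.0):
    """Steady-state multiplexed SQL queries/sec fired at the database."""
    batches = math.ceil(active_subscriptions / batch_size)
    return batches / refetch_interval

def worst_case_update_latency(refetch_interval, sql_exec_time):
    """A client can wait up to one full interval plus the SQL run time."""
    return refetch_interval + sql_exec_time

# 1000 similar subscriptions at the defaults: 10 queries/sec to the DB.
load_default = db_queries_per_second(1000)
# Doubling batch size and interval cuts that to 2.5 queries/sec, at the
# cost of slower updates and heavier individual SQL statements.
load_tuned = db_queries_per_second(1000, batch_size=200, refetch_interval=2.0)
```

The second function also shows why the batch size should shrink if multiplexed SQL execution time approaches the refetch interval: once `sql_exec_time` nears `refetch_interval`, polls start overlapping and latency degrades.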
:::info Note
@ -257,4 +261,4 @@ The default values offer a reasonable trade-off between resource utilization and
scenarios with heavy queries and a high number of active subscriptions, you need to benchmark the setup and these
parameters need to be iterated upon to achieve optimal performance.
:::

Binary files not shown (four images added: 146 KiB, 98 KiB, 86 KiB, 70 KiB).