.. meta::
   :description: Performance of Hasura GraphQL queries
   :keywords: hasura, docs, schema, queries, performance

.. _query_performance:

Query performance
=================

.. contents:: Table of contents
  :backlinks: none
  :depth: 2
  :local:

Introduction
------------

Queries can become slow when they touch large volumes of data or are deeply nested.
This page explains how to analyse the performance of a query, how query plan caching works in Hasura, and how queries can be optimized.

.. _analysing_query_performance:

Analysing query performance
---------------------------

Let's say we want to analyse the following query:

.. code-block:: graphql

  query {
    authors(where: {name: {_eq: "Mario"}}) {
      rating
    }
  }

To analyse the performance of a query, click the ``Analyze`` button on the Hasura console:

.. thumbnail:: ../../../img/graphql/manual/queries/analyze-query.png
  :class: no-shadow
  :width: 75%
  :alt: Query analyze button on Hasura console

The following query execution plan is generated:

.. thumbnail:: ../../../img/graphql/manual/queries/query-analysis-before-index.png
  :class: no-shadow
  :width: 75%
  :alt: Execution plan for Hasura GraphQL query

We can see that a sequential scan is conducted on the ``authors`` table. This means that Postgres goes through every row of the ``authors`` table to check whether the author's name equals "Mario".
The ``cost`` of a query is an estimate computed by Postgres in arbitrary units, so it should be read as a basis for comparing plans rather than as an absolute measure.

Read more about query performance analysis in the `Postgres explain statement docs <https://www.postgresql.org/docs/current/sql-explain.html>`__.

.. _query_plan_caching:

Query plan caching
------------------

How it works
^^^^^^^^^^^^

Hasura executes GraphQL queries as follows:

1. The incoming GraphQL query is parsed into an `abstract syntax tree <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__ (AST), a structured representation of the query.
2. The GraphQL AST is validated against the schema to generate an internal representation.
3. The internal representation is converted into an SQL statement (a `prepared statement <https://www.postgresql.org/docs/current/sql-prepare.html>`__ whenever possible).
4. The (prepared) statement is executed on Postgres to retrieve the result of the query.

For most use cases, Hasura constructs a "plan" for a query, so that a new instance of the same query can be executed without the overhead of steps 1 to 3.

For example, let's consider the following query:

.. code-block:: graphql

  query getAuthor($id: Int!) {
    authors(where: {id: {_eq: $id}}) {
      name
      rating
    }
  }

With the following variable:

.. code-block:: json

  {
    "id": 1
  }

Hasura tries to map a GraphQL query to a prepared statement whose parameters have a one-to-one correspondence to the variables defined in the GraphQL query.
The first time a query comes in, Hasura generates a plan for the query which consists of two things:

1. The prepared statement
2. Information necessary to convert variables into the prepared statement's arguments

For the above query, Hasura generates the following prepared statement (simplified):

.. code-block:: plpgsql

  select name, rating from authors where id = $1

With the following prepared variables:

.. code-block:: plpgsql

  $1 = 1

This plan is then saved in a data structure called the ``Query Plan Cache``. The next time the same query is executed,
Hasura uses the plan to convert the provided variables into the prepared statement's arguments and then executes the statement.
This significantly cuts down the execution time of a GraphQL query, resulting in lower latencies and higher throughput.

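The idea of a plan cache can be illustrated with a minimal Python sketch. This is not Hasura's implementation: the ``Plan`` and ``QueryPlanCache`` classes, the choice of the query text as the cache key, and the ``build_get_author_plan`` helper are all assumptions made for illustration. The sketch only shows that the plan is built once and that later requests just convert variables into statement arguments.

.. code-block:: python

  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Plan:
      """A toy plan: SQL text plus the order in which variables fill $1, $2, ..."""
      sql: str
      variable_order: tuple


  class QueryPlanCache:
      """Illustrative cache keyed by the GraphQL query text (an assumption)."""

      def __init__(self):
          self._plans = {}
          self.misses = 0

      def get_plan(self, query, build_plan):
          # build_plan stands in for steps 1 to 3 (parse, validate, generate SQL).
          if query not in self._plans:
              self.misses += 1
              self._plans[query] = build_plan(query)
          return self._plans[query]


  def arguments_for(plan, variables):
      # Turn the GraphQL variables into positional arguments for the statement.
      return [variables[name] for name in plan.variable_order]


  def build_get_author_plan(query):
      # Hand-written stand-in for the plan that would be derived from the query.
      return Plan("select name, rating from authors where id = $1", ("id",))


  GET_AUTHOR = "query getAuthor($id: Int!) { authors(where: {id: {_eq: $id}}) { name rating } }"

  cache = QueryPlanCache()
  first = cache.get_plan(GET_AUTHOR, build_get_author_plan)
  second = cache.get_plan(GET_AUTHOR, build_get_author_plan)

  print(first.sql, arguments_for(first, {"id": 1}))
  print(second.sql, arguments_for(second, {"id": 2}))
  print("plans built:", cache.misses)

Both requests share one plan, and only the argument list changes between them.
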
Caveats
^^^^^^^

The above optimization is not possible for all types of queries. For example, consider this query:

.. code-block:: graphql

  query getAuthorWithCondition($condition: author_bool_exp!) {
    author(where: $condition) {
      name
      rating
    }
  }

The statement generated for ``getAuthorWithCondition`` depends on the values of the variables.

With the following variables:

.. code-block:: json

  {
    "condition": {"id": {"_eq": 1}}
  }

the generated statement will be:

.. code-block:: plpgsql

  select name, rating from author where id = $1

However, with the following variables:

.. code-block:: json

  {
    "condition": {"name": {"_eq": "John"}}
  }

the generated statement will be:

.. code-block:: plpgsql

  select name, rating from author where name = 'John'

A plan cannot be generated for such queries because the variables defined in the GraphQL query don't have a one-to-one correspondence to the parameters in the prepared statement.

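A small Python sketch can make the caveat concrete. The ``statement_for`` function is hypothetical and handles only flat ``{"column": {"_eq": value}}`` conditions; unlike the simplified statements shown in this section, it always passes the value as a parameter. What it shows is that the SQL text itself changes with the value of ``$condition``, so no single prepared statement can be reused.

.. code-block:: python

  def statement_for(condition):
      """Build a toy SQL statement for a flat {"column": {"_eq": value}} condition."""
      columns = sorted(condition)
      predicates = [
          f"{column} = ${position}"
          for position, column in enumerate(columns, start=1)
      ]
      arguments = [condition[column]["_eq"] for column in columns]
      sql = "select name, rating from author"
      if predicates:
          sql += " where " + " and ".join(predicates)
      return sql, arguments


  by_id = statement_for({"id": {"_eq": 1}})
  by_name = statement_for({"name": {"_eq": "John"}})

  print(by_id)
  print(by_name)
  # The SQL text differs between the two calls, so one cached statement cannot serve both.
  print(by_id[0] == by_name[0])
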
Query optimization
------------------

Using GraphQL variables
^^^^^^^^^^^^^^^^^^^^^^^

To take full advantage of Hasura's query plan caching (as explained in the :ref:`previous section <query_plan_caching>`), GraphQL queries should be defined with
variables whose types are **non-nullable scalars** whenever possible.

To make a variable non-nullable, add a ``!`` at the end of its type:

.. code-block:: graphql
  :emphasize-lines: 1

  query getAuthor($id: Int!) {
    authors(where: {id: {_eq: $id}}) {
      name
      rating
    }
  }

If the ``!`` is omitted, the variable is nullable, and the generated query differs depending on whether an ``id`` is passed or the variable is ``null``
(in the latter case, no ``where`` clause is present). Therefore, Hasura cannot create a reusable plan for the query in this case.

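The effect of a nullable variable can be sketched in a few lines of Python. The ``get_author_statement`` function is a hypothetical stand-in for SQL generation and simply follows the behaviour described above: a ``null`` value yields a statement with no ``where`` clause and no parameters.

.. code-block:: python

  def get_author_statement(author_id):
      """Toy SQL generation for getAuthor with a nullable $id (Int instead of Int!)."""
      sql = "select name, rating from authors"
      if author_id is None:
          # Null variable: no filter is generated, so the statement has no parameters.
          return sql, []
      return sql + " where id = $1", [author_id]


  print(get_author_statement(1))
  print(get_author_statement(None))

Because the two calls return statements with different text and a different number of parameters, the variable no longer maps one-to-one to a statement parameter.
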
.. note::

  Hasura is fast even for queries which cannot have a reusable plan.
  This should concern you only if you face a high volume of traffic (thousands of requests per second).

Using PG indexes
^^^^^^^^^^^^^^^^

`Postgres indexes <https://www.tutorialspoint.com/postgresql/postgresql_indexes.htm>`__ are special lookup tables that Postgres can use to speed up data lookup.
An index acts as a pointer to data in a table, and it works much like the index at the back of a book.
If you look in the index first, you'll find the data much quicker than by searching the whole book (or, in this case, the whole table).

Let's say we know that the ``authors`` table is frequently queried by ``name``:

.. code-block:: graphql

  query {
    authors(where: {name: {_eq: "Mario"}}) {
      rating
    }
  }

We've seen in the :ref:`above example <analysing_query_performance>` that, without an index on ``name``, Postgres conducts a sequential scan, i.e. it goes through all the rows.
A sequential scan on a frequently filtered column can often be optimized by adding an index.

.. rst-class:: api_tabs
.. tabs::

  .. tab:: Console

    An index can be added in the ``Data -> SQL`` tab in the Hasura console.

  .. tab:: API

    An index can be added via the :ref:`run_sql <run_sql>` metadata API.

The following statement creates an index on ``name`` in the ``authors`` table.

.. code-block:: plpgsql

  CREATE INDEX ON authors (name);

Let's compare the performance analysis to :ref:`the one before adding the index <analysing_query_performance>`.
What was a ``sequential scan`` in the example earlier is now an ``index scan``. ``Index scans`` are usually more performant than ``sequential scans``.
We can also see that the ``cost`` of the query is now lower than it was before we added the index.

.. thumbnail:: ../../../img/graphql/manual/queries/query-analysis-after-index.png
  :class: no-shadow
  :width: 75%
  :alt: Execution plan for Hasura GraphQL query

.. note::

  In some cases sequential scans can still be faster than index scans, e.g. if the query returns a high percentage of the rows in the table.
  Postgres considers multiple query plans and decides which kind of scan is likely to be faster.

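The change from a full scan to an index lookup can be reproduced locally with a short Python sketch. It uses SQLite from the Python standard library as a stand-in, not Postgres or Hasura, so the plan wording differs from the ``EXPLAIN`` output shown in the screenshots; the table contents and the index name ``authors_name_idx`` are made up for illustration.

.. code-block:: python

  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, rating INTEGER)")
  conn.executemany(
      "INSERT INTO authors (name, rating) VALUES (?, ?)",
      [(f"author-{i}", i % 5) for i in range(1000)] + [("Mario", 5)],
  )

  QUERY = "SELECT rating FROM authors WHERE name = ?"


  def plan(connection, sql, params):
      # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
      rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
      return " | ".join(row[-1] for row in rows)


  before = plan(conn, QUERY, ("Mario",))
  conn.execute("CREATE INDEX authors_name_idx ON authors (name)")
  after = plan(conn, QUERY, ("Mario",))

  # Exact wording varies between SQLite versions.
  print("before:", before)
  print("after: ", after)

Before the index exists, the plan reports a scan of the whole table; afterwards it reports a search that uses ``authors_name_idx``.
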