From 7366184cbbf538ffac282df2702e89c2b94af6ae Mon Sep 17 00:00:00 2001 From: Gil Mizrahi Date: Wed, 6 Oct 2021 11:46:44 +0300 Subject: [PATCH] RFC: limit over join optimization PR-URL: https://github.com/hasura/graphql-engine-mono/pull/2424 GitOrigin-RevId: 95da4151190dbd66a6cc8e14e0f78a0a3dbbb5e6 --- rfcs/limit-over-join-optimization.md | 66 ++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) create mode 100644 rfcs/limit-over-join-optimization.md diff --git a/rfcs/limit-over-join-optimization.md b/rfcs/limit-over-join-optimization.md new file mode 100644 index 00000000000..8ce709b82e7 --- /dev/null +++ b/rfcs/limit-over-join-optimization.md @@ -0,0 +1,66 @@ +# Limit over join optimization + +## Metadata + +``` +--- +authors: Gil Mizrahi +discussion: + https://github.com/hasura/graphql-engine-mono/pull/2239 +state: draft +--- +``` + +## Description + +Optimize GraphQL queries containing a relationships and a limit by limiting the amount of returned results +before joining the relationship. + +### Problem + +Currently when a user runs a complex query with relationships and is using the `limit` operator, we construct an SQL query for postgresql that looks somewhat like this: + +```sql +SELECT * + FROM LEFT JOIN + LIMIT ; +``` + +Since join is an expensive operation, it would be useful if we could limit the number of rows it needs to process before running the join. + +In SQL, trying to push limits down into each side of the join is *not* a semantic perserving operation. This is because the relationship between the two sides is unspecified, and could be one-to-one, many-to-one, or many-to-many. + +For example, in a database of users and streaming providers, a user could be subscribed to multiple providers, and streaming providers provide services to multiple users. trying to get all users and their providers, limit by 10, is different than: + +1. Limiting to 10 users and match their providers. Because there can be more than 1 provider for each user - we might get more than 10 results +2. Limiting to 10 providers and match their users. Because there can be more than 1 user for each provider - we might get more than 10 results +3. Limit to 10 for both users and providers. Because some users might not use the selected providers, so we might get less than 10 results + +For this reason, postgresql will not apply this optimization when *it is* valid, because it cannot distinguish the cases. + +Fortunately, in GraphQL we do specify either have a one-to-one relationship, which means that we can limit one side and get the same result, or we have a one-to-many relationship where we aggregate the results, so we can limit the side of the "one", this side is always the root table in the query, or the "base" table. + +### Why is it important? + +It can improve the performance of queries by orders of magnitude ([as described by a customer](https://github.com/hasura/graphql-engine/issues/5745#issuecomment-899081795)). +Was requested by customers ([graphql-engine/#5745](https://github.com/hasura/graphql-engine/issues/5745)) which consider this feature a must-have. + +## How + +Implement this optimization ourselves by pushing the LIMIT into the base table. This has a few caveats: + +1. Both LIMITs and OFFSETs should be pushed to the base table +2. When ORDER BY is also involved, the order by should also be *duplicated* in the base table, so we can limit the results *after* sorting, and *also* sort at the final results generation (in the `json_agg` function), otherwise the results order is unspecified. +3. When *DISTINCT* is involved, it should also be pushed into the base table - distinct acts as a filter and may reduce the amount of rows, so it should happens before limiting the results. + +Because of (2) and (3) this optimization is only valid when the columns referred from any +DISTINCT or ORDER BY are from the base table. If other columns exist, this optimization is not valid. + +### Success + +We can verify the feature works by writing tests inspecting the generated SQL. + +### Future Work / Out of Scope + +Work on this features has been implemented in [#2239](https://github.com/hasura/graphql-engine-mono/pull/2239) by changing the way we translate `RQL` ASTs to postresql ASTs. This optimization might be better to express as an SQL to SQL transformation. +In order to refactor this code, we'd need to first [document the Postgres.Translate.Select](https://github.com/hasura/graphql-engine-mono/issues/2391) module. After that we could refactor this optimization to a straightforward translation of RQL to SQL and then an SQL transformation.