From 7366184cbbf538ffac282df2702e89c2b94af6ae Mon Sep 17 00:00:00 2001
From: Gil Mizrahi <gilmi@posteo.net>
Date: Wed, 6 Oct 2021 11:46:44 +0300
Subject: [PATCH] RFC: limit over join optimization

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/2424
GitOrigin-RevId: 95da4151190dbd66a6cc8e14e0f78a0a3dbbb5e6
---
 rfcs/limit-over-join-optimization.md | 66 ++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 rfcs/limit-over-join-optimization.md
diff --git a/rfcs/limit-over-join-optimization.md b/rfcs/limit-over-join-optimization.md
new file mode 100644
index 00000000000..8ce709b82e7
--- /dev/null
+++ b/rfcs/limit-over-join-optimization.md
@@ -0,0 +1,66 @@
+# Limit over join optimization
+
+## Metadata
+
+```
+---
+authors: Gil Mizrahi <gil@hasura.io>
+discussion:
+   https://github.com/hasura/graphql-engine-mono/pull/2239
+state: draft
+---
+```
+
+## Description
+
+Optimize GraphQL queries that contain a relationship and a limit by limiting the number of rows taken from the base table
+before joining the relationship.
+
+### Problem
+
+Currently, when a user runs a complex query with relationships and uses the `limit` argument, we construct an SQL query for PostgreSQL that looks roughly like this:
+
+```sql
+SELECT *
+  FROM <base-table> LEFT JOIN <other-table> ON <join-condition>
+ LIMIT <limit>;
+```
+
+Since join is an expensive operation, it would be useful if we could limit the number of rows it needs to process before running the join.
+
+In SQL, pushing limits down into each side of the join is *not* a semantics-preserving operation.  This is because the relationship between the two sides is unspecified, and could be one-to-one, many-to-one, or many-to-many.
+
+For example, in a database of users and streaming providers, a user can be subscribed to multiple providers, and a streaming provider can serve multiple users. Getting all users and their providers, limited to 10 rows, is different from each of the following:
+
+1. Limiting to 10 users and matching their providers.  Because there can be more than one provider for each user, we might get more than 10 results.
+2. Limiting to 10 providers and matching their users.  Because there can be more than one user for each provider, we might get more than 10 results.
+3. Limiting both users and providers to 10.  Because some of the selected users might not use any of the selected providers, we might get fewer than 10 results.
+
+For this reason, PostgreSQL will not apply this optimization even when it *is* valid, because it cannot distinguish the cases.
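+
+As a rough illustration of the first case, assume hypothetical tables `users(id)` and `subscriptions(user_id, provider_id)` (these names are not part of this RFC). The two queries below are not equivalent in general:
+
+```sql
+-- Hypothetical schema: users(id), subscriptions(user_id, provider_id).
+-- Limit applied after the join: at most 10 (user, provider) rows.
+SELECT u.id AS user_id, s.provider_id
+  FROM users AS u
+  LEFT JOIN subscriptions AS s ON s.user_id = u.id
+ LIMIT 10;
+
+-- Limit pushed into the users side: 10 users, but possibly more than
+-- 10 rows, because each user can match several subscriptions.
+SELECT u.id AS user_id, s.provider_id
+  FROM (SELECT id FROM users LIMIT 10) AS u
+  LEFT JOIN subscriptions AS s ON s.user_id = u.id;
+```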
+
+Fortunately, in GraphQL the kind of relationship is specified. We either have a one-to-one relationship, which means that we can limit one side and get the same result, or we have a one-to-many relationship where we aggregate the results, so we can limit the "one" side. This side is always the root table of the query, or the "base" table.
+
+### Why is it important?
+
+It can improve the performance of queries by orders of magnitude ([as described by a customer](https://github.com/hasura/graphql-engine/issues/5745#issuecomment-899081795)).
+It was requested by customers ([graphql-engine/#5745](https://github.com/hasura/graphql-engine/issues/5745)) who consider this feature a must-have.
+
+## How
+
+Implement this optimization ourselves by pushing the LIMIT into the base table.  This has a few caveats:
+
+1. Both LIMITs and OFFSETs should be pushed to the base table.
+2. When ORDER BY is also involved, it should be *duplicated* in the base table: we sort there so that we limit the results *after* sorting, and we *also* sort during the final result generation (in the `json_agg` function); otherwise the order of the results is unspecified.
+3. When *DISTINCT* is involved, it should also be pushed into the base table. DISTINCT acts as a filter and may reduce the number of rows, so it should happen before limiting the results.
+
+Because of (2) and (3), this optimization is only valid when all the columns referenced by any
+DISTINCT or ORDER BY come from the base table. If they reference columns of other tables, this optimization is not valid.
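+
+A minimal sketch of the shape such a query could take, assuming hypothetical tables `users(id, name)` and `subscriptions(user_id, provider)` and a one-to-many relationship from users to subscriptions. The SQL we actually generate may look different:
+
+```sql
+-- Hypothetical schema: users(id, name), subscriptions(user_id, provider).
+SELECT coalesce(json_agg(row_to_json(r) ORDER BY r.name, r.id), '[]') AS root
+  FROM (
+         SELECT u.id, u.name, s.providers
+           FROM (
+                  -- ORDER BY, LIMIT and OFFSET are applied to the base table,
+                  -- so the join below only sees the 10 selected users.
+                  SELECT id, name
+                    FROM users
+                   ORDER BY name, id
+                   LIMIT 10 OFFSET 20
+                ) AS u
+           LEFT JOIN LATERAL (
+                  -- The "many" side is aggregated into one row per user,
+                  -- so the join cannot change the number of rows.
+                  SELECT json_agg(sub.provider) AS providers
+                    FROM subscriptions AS sub
+                   WHERE sub.user_id = u.id
+                ) AS s ON true
+       ) AS r;
+```
+
+The ORDER BY appears twice: once in the base table subquery, so the limit is applied to sorted rows, and once inside `json_agg`, so the final result is sorted as well. Both only reference columns of `users`, which is the condition stated above.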
+
+### Success
+
+We can verify the feature works by writing tests inspecting the generated SQL.
+
+### Future Work / Out of Scope
+
+This feature has been implemented in [#2239](https://github.com/hasura/graphql-engine-mono/pull/2239) by changing the way we translate `RQL` ASTs to PostgreSQL ASTs. This optimization might be better expressed as an SQL-to-SQL transformation.
+In order to refactor this code, we'd first need to [document the Postgres.Translate.Select](https://github.com/hasura/graphql-engine-mono/issues/2391) module. After that we could refactor this optimization into a straightforward translation of RQL to SQL followed by an SQL transformation.