graphql-engine/v3/crates/sql
Daniel Chambers 00fa5c42ba Refactor to prevent unresolved queries from being sent as ndc requests (#871)
~~Note: this PR is stacked on #845.~~ Rebased on main

This PR refactors the `execute::plan::types` further to make a clear
distinction between unresolved and resolved states. An "unresolved"
state refers to one in which remote predicates have not been computed
into local predicates. A "resolved" state is after this process is
performed and remote predicates are eliminated.

Previously, unresolved types could be passed to
`execute::plan::ndc_request` and they would fail at runtime due to the
presence of unresolved remote predicates. Now, this is impossible due to
a type-level distinction between unresolved and resolve states.

This distinction is made by type-parameterizing all
`execute::plan::types` that involve a predicate so that the predicate
type is parameterized out. Then, an `Unresolved` type alias is created
that sets the predicate type to
`execute::ir::filter::expression::Expression` (which contains remote
predicates) and a `Resolved` type alias is created that uses
`ResolvedFilterExpression` instead (which does not contain remote
predicates).

For example, for `QueryNode`, we now have:

```rust
pub struct QueryNode<'s, TFilterExpression> {
    ...
    pub predicate: Option<TFilterExpression>,
    ...
}
```

And then the two aliases are:

```rust
pub type UnresolvedQueryNode<'s> = QueryNode<'s, ir::filter::expression::Expression<'s>>;
pub type ResolvedQueryNode<'s> = QueryNode<'s, ResolvedFilterExpression>;
```

Subsequently, `plan::ndc_request` only deals with `Resolved` types.

This is mostly just type-fiddling, but one place some logic moved around
is in with the old `plan::types::FilterExpression`. This was mostly a
functional duplicate of `ir::filter::execute::Expression` except that it
had a "planned" remote predicate variant in it. In order to reduce the
number of types (so we didn't need `UnresolvedFilterExpression` and
`ResolvedFilterExpression`), this type has been repurposed into
`ResolvedFilterExpression` and no longer deals with remote predicates.
Instead, `ir::filter::execute::Expression` is resolved into a
`ResolvedFilterExpression` and the planning of the remote predicate is
done at that time, just before it is resolved. This works fine, since an
entirely new ndc query is performed in order to resolve the predicate,
so planning that can be deferred until then and it doesn't need to be
done at the same time as the main query.

Part of
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on

V3_GIT_ORIGIN_REV_ID: 3ec89efbaa7b543fad6a100e2739bcc74b1d567f
2024-07-24 09:55:39 +00:00
..
src Refactor to prevent unresolved queries from being sent as ndc requests (#871) 2024-07-24 09:55:39 +00:00
Cargo.toml Remove NDC types from ir and plan and perform separate plan->NDC mapping per NDC version (#845) 2024-07-24 07:28:50 +00:00
readme.md Experimental SQL interface (#742) 2024-06-25 18:46:39 +00:00

SQL Interface

An experimental SQL interface over OpenDD models. This is mostly targeted at AI use cases for now - GenAI models are better at generating SQL queries than GraphQL queries.

This is implemented using the Apache DataFusion Query Engine by deriving the SQL metadata for datafusion from Open DDS metadata. As the implementation currently stands, once we get a LogicalPlan from datafusion we replace TableScans with NDC queries to the underlying connector. There is a rudimentary optimizer that pushes down projections to the ndc query so that we don't fetch all the columns of a collection.