Commit Graph

63 Commits

Author SHA1 Message Date
Daniel Harvey
056b1c18fc Stop Nix rebuilding (#1329)

### What

Our Nix build was building all the workspace crates as part of its deps
step. This means when any library crate is changed, we throw away all
the caching, which isn't ideal. This filters the source files out of
those builds, so that we get more cache hits. We also move all Cargo
features into the workspace, which I've been meaning to do for ages, so
things are more consistent, and again, we get more cache hits generally.

V3_GIT_ORIGIN_REV_ID: c724b152692575edf6c00ab426e48ecca13aa998
2024-11-11 12:06:31 +00:00
Daniel Chambers
d3491fc9f6 Upgrade to NDC v0.2.0-rc.1 (#1291)
### What
This PR updates the engine to use the NDC Spec v0.2.0-rc.1 version. This
is very likely to be the final RC before release.

### How

The `ndc_models` crate got updated, which then resulted in the schema
migration code in `metadata_resolve` being updated. This affected a lot
of test results because connectors that used deprecated type
representations got migrated to other representations, and if a type
representation was missing then JSON was used instead.

The NDC request-sending code in `execute` was updated to send the
`X-Hasura-NDC-Version` header depending on the version of request
getting sent.

The custom connector was updated to be compatible with the new NDC
0.2.0-rc.1 types. This resulted in the schema changing, so a lot of
tests that contained the connector's schema were updated.

---------

Co-authored-by: Daniel Harvey <danieljamesharvey@gmail.com>
V3_GIT_ORIGIN_REV_ID: b1c7081eb1ee6cffdead08328a857903102332c6
2024-11-06 13:08:10 +00:00
Daniel Harvey
a9f6610691 Split resolved metadata from SQL catalog (#1304)
V3_GIT_ORIGIN_REV_ID: 8e64fd6200e207336d18ed6b7c61559fe4b1470d
2024-11-01 09:30:58 +00:00
Vamshi Surabhi
6c7ffc270c sql introspection: unsupported metadata objects (#1301)

### What

This allows for querying the metadata objects that couldn't be exposed
to the sql layer, such as models, commands, object types, fields of
object types and scalar types. This will help users debug their
graph.

### How

Instead of ignoring unsupported metadata objects, we propagate them
through call stacks as required and capture them as part of
introspection tables. E.g. on the duckduckemail API:

```bash
❯ echo 'select * from hasura.unsupported_commands;' | jq --null-input --rawfile sql /dev/stdin '{"sql": $sql}' | curl --silent -XPOST -H 'Content-Type: application/json' -d @- 'http://localhost:3000/v1/sql' | jq
[
  {
    "subgraph": "app",
    "name": "Bye",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaCalendarLoaderInit",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaCalendarLoaderStatus",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaGmailLoaderInit",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaGmailLoaderStatus",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaMyLoaderInit",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "DdaMyLoaderStatus",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "Hello",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "SendEmail",
    "reason": "Return type not supported: String! (reason: scalar return types are not supported)"
  },
  {
    "subgraph": "app",
    "name": "TestCalendar",
    "reason": "Return type not supported: Json! (in subgraph app) (reason: scalar return types are not supported)"
  }
]

❯ echo 'select * from hasura.unsupported_scalars;' | jq --null-input --rawfile sql /dev/stdin '{"sql": $sql}' | curl --silent -XPOST -H 'Content-Type: application/json' -d @- 'http://localhost:3000/v1/sql' | jq
[
  {
    "subgraph": "app",
    "name": "Json",
    "reason": "No NDC representation found for scalar type 'Json'"
  }
]

❯ echo 'select * from hasura.unsupported_object_type_fields;' | jq --null-input --rawfile sql /dev/stdin '{"sql": $sql}' | curl --silent -XPOST -H 'Content-Type: application/json' -d @- 'http://localhost:3000/v1/sql' | jq
[
  {
    "subgraph": "app",
    "object": "CalendarEvents",
    "field_name": "attachments",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "CalendarEvents",
    "field_name": "conferenceData",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "CalendarEvents",
    "field_name": "extendedProperties",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "CalendarEvents",
    "field_name": "recurrence",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "CalendarEvents",
    "field_name": "reminders",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "attachments",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "bccAddresses",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "ccAddresses",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "headers",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "labelIds",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  },
  {
    "subgraph": "app",
    "object": "GmailMessages",
    "field_name": "toAddresses",
    "reason": "Unsupported scalar type: No NDC representation found for scalar type 'Json'"
  }
]

❯ echo 'select * from hasura.unsupported_models;' | jq --null-input --rawfile sql /dev/stdin '{"sql": $sql}' | curl --silent -XPOST -H 'Content-Type: application/json' -d @- 'http://localhost:3000/v1/sql' | jq
[]

❯ echo 'select * from hasura.unsupported_object_types;' | jq --null-input --rawfile sql /dev/stdin '{"sql": $sql}' | curl --silent -XPOST -H 'Content-Type: application/json' -d @- 'http://localhost:3000/v1/sql' | jq
[]
```

V3_GIT_ORIGIN_REV_ID: 01aae03d80b0cd15773812fa05a2fd9a57223250
2024-10-31 15:01:11 +00:00
Vamshi Surabhi
5866fd176d sql: error detail for disallowed mutations (#1280)

### What

We want to change this error message
```json
{
  "error": "error in data fusion: External error: Mutations are requested to be disallowed as part of the request"
}
```

to this

```json
{
  "error": "error in data fusion: External error: Mutations are requested to be disallowed as part of the request",
  "detail": {
    "subgraph": "default",
    "commandName": "uppercase_actor_name_by_id",
    "arguments": {
      "id": {
        "literal": 1
      }
    }
  }
}

```

### How

V3_GIT_ORIGIN_REV_ID: 5d1f712c039ac9a4685c480634f1e7e17ff94c4b
2024-10-26 16:14:36 +00:00
Vamshi Surabhi
e5f1befba8 sql: allow inexact scalar conversion (#1279)

### What

Float literals in SQL are represented as `Float64` values. We were checking
for precision loss when converting these Float64 values into Float32
values. We now only check if there is an overflow.
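
A minimal sketch of the difference between the two checks, using only the standard library. The function names `to_f32_exact` and `to_f32_inexact` are hypothetical and are not taken from the engine code; they only illustrate "reject on precision loss" versus "reject on overflow only".

```rust
/// Strict conversion (old behaviour, sketched): succeeds only when the
/// `f64` value is exactly representable as an `f32`.
fn to_f32_exact(value: f64) -> Option<f32> {
    let narrowed = value as f32;
    // Widening back must give the original value, otherwise precision was lost.
    if f64::from(narrowed) == value {
        Some(narrowed)
    } else {
        None
    }
}

/// Relaxed conversion (new behaviour, sketched): rounding is accepted,
/// only overflow is rejected.
fn to_f32_inexact(value: f64) -> Option<f32> {
    let narrowed = value as f32;
    // A finite `f64` that becomes infinite after the cast is out of `f32` range.
    if value.is_finite() && narrowed.is_infinite() {
        None
    } else {
        Some(narrowed)
    }
}
```

With this sketch a literal such as `0.1` is rejected by the strict check but accepted by the relaxed one, while `1e40` is rejected by both.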

### How

V3_GIT_ORIGIN_REV_ID: 1630e6130591df19f16be7cc97bbc6515537d951
2024-10-26 08:07:15 +00:00
Daniel Harvey
1879d1bb75 Enable argument presets in OpenDD pipeline (#1228)

### What

This PR enables argument presets in the OpenDD pipeline by using
functions from `graphql_ir`. In the ideal future we'd flip the
dependency and move these functions out of `graphql_ir` and into the
`plan` crate, however we can't do that until `execute` crate is no
longer in active development as it will involve making a big mess there.

### How

- Calculate argument presets in the `plan/query/model_target` module
using functions from `graphql_ir`
- We also begin adding boolean expression resolve, then back away slowly
as it's a massive job and better tackled when we start making `where`
clauses work in this pipeline, to stop this PR ballooning insanely.

V3_GIT_ORIGIN_REV_ID: 47867452b7366e83f71b118e37302de93d9bde72
2024-10-17 10:48:04 +00:00
Daniel Harvey
bec9dee021 Move argument presets resolve into metadata-resolve (#1226)

### What

Previously we were doing the business of calculating which arguments
(and parts of arguments) were preset in the GraphQL `schema` crate. This
meant we would have to reimplement the logic for each frontend. Now we
move it into `metadata-resolve` so the results can be shared by all
frontends.

### How

Move argument preset resolve into `metadata-resolve`. What's left in
`graphql-schema` is all the stuff relating to `Annotation`s of various
kinds.

A lot of the diff is changing `ModelWithPermissions` and
`CommandWithPermissions` to `ModelWithArgumentPresets` and
`CommandWithArgumentPresets` in crates after `metadata-resolve`.

Functional no-op.

V3_GIT_ORIGIN_REV_ID: b1b0983abb9f6282652c8689b02e0796026752f5
2024-10-15 14:16:26 +00:00
Daniel Harvey
e313123ed7 Move commands from sql to plan (#1188)

### What

Move the function/procedure planning from `sql` to the shared OpenDD IR
pipeline in `plan`. This should be a no-op for `sql`

### How

Move code, fix type errors.

V3_GIT_ORIGIN_REV_ID: 7da797ffedbc40a44692670679aa176817f2c65e
2024-10-03 10:28:39 +00:00
Daniel Harvey
4015612091 Extract NDCFunction and NDCProcedure in sql command planning (#1172)

### What

Before we pull the command planning into `plan`, let's split the types
so the general and DataFusion stuff don't live in the same place.

### How

Move types, follow errors. Functional no-op.

V3_GIT_ORIGIN_REV_ID: bb4adbc6897a79f47be37d5ad1a13b7b8efb5e93
2024-09-30 10:35:53 +00:00
Daniel Harvey
a8deb88f4e Move model aggregate planning to 'plan' crate (#1171)

### What

Much in the vein of https://github.com/hasura/v3-engine/pull/1166, we
move the model aggregate planning from `sql` to the `plan` crate. No
tests actually exercise this code in the OpenDD IR pipeline yet, perhaps
if we extend the GraphQL -> OpenDD IR pipeline we can put it under test.

### How

Move the code, fix the errors. Functional no-op.

V3_GIT_ORIGIN_REV_ID: 7beee0aec828296fefa24c975d4662a20aa0d2e5
2024-09-30 09:09:38 +00:00
Daniel Harvey
649b3c29b0 Move model planning from sql to plan (#1166)

### What

We're building a new OpenDD IR pipeline. The `sql` crate already has a
lot of what we need, so let's take the model selection parts (ie, not
aggregates yet), pull them into the `plan` crate, and re-use them for
both `sql` and the `jsonapi` pipelines.

The broad idea here is that the shared `plan` will get incrementally
bigger, and `sql` will get smaller.

This is a functional no-op for `sql`, and slightly improves the WIP
JSONAPI pipeline as we enjoy better permission checks.

### How

- Copy model planning and helper functions from `sql` into `plan`
- Replace instances of `DataFusionError` with a smaller local `PlanError`
- Fix JSONAPI to use these new `plan` functions
- Remove the code in `sql`, instead using the shared `plan` functions in
planning, mapping back into `DataFusionError` as appropriate.

V3_GIT_ORIGIN_REV_ID: 50314442b9b56f31d2b38a0cf6f104e265bc3886
2024-09-27 14:21:19 +00:00
Daniel Harvey
a6719bee76 Replace references to DataConnectorLink with Arc (#1162)

### What

The references are making multiple frontends difficult to implement,
let's wrap them with `Arc` instead and have an easier time.
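
A rough sketch of the idea, not the actual engine types: the fields of `DataConnectorLink` and the `ModelSource` struct below are stand-ins invented for illustration. Holding an `Arc` instead of a `&'a DataConnectorLink` removes the lifetime parameter from every type that carries the link, while cloning stays cheap because only the reference count changes.

```rust
use std::sync::Arc;

/// Stand-in for the real type; the fields here are hypothetical.
#[derive(Debug)]
struct DataConnectorLink {
    name: String,
    url: String,
}

/// Owns a shared handle to the link, so no lifetime parameter is needed.
#[derive(Debug, Clone)]
struct ModelSource {
    model: String,
    data_connector: Arc<DataConnectorLink>,
}

/// Build one source per model, all sharing the same link allocation.
fn sources_for(link: &Arc<DataConnectorLink>, models: &[&str]) -> Vec<ModelSource> {
    models
        .iter()
        .map(|model| ModelSource {
            model: model.to_string(),
            // Cloning an `Arc` bumps a counter; the link itself is not copied.
            data_connector: Arc::clone(link),
        })
        .collect()
}
```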

### How

Change the types, follow the errors. Functional no-op.

V3_GIT_ORIGIN_REV_ID: 8baea2bd6c0e56e8bfb1f899b8d15731eebfa976
2024-09-27 09:00:16 +00:00
Daniel Harvey
b13cd460ae Move NdcFieldAlias to new plan-types crate (#1161)
### What

We'd like to use `NdcFieldAlias` in the `plan` crate, however because of
the ways deps between `graphql_ir` and `execute` work we cannot without
a cycle. Functional no-op.

### How

Create a new crate that depends on nothing for planning-related domain
types.

V3_GIT_ORIGIN_REV_ID: c441f2de2eba01bda59ce16e1e4b0e4d9f765d78
2024-09-27 09:00:10 +00:00
Daniel Harvey
e2205b221c Keep reference to http_context instead of owned copy (#1159)

### What

When we merged the PR that added `ResolveFilterExpressionContext`
(amongst other changes, sadly), the `Generate Query Plan` got slower.
Changing this to a reference to try and improve it. Locally run
benchmarks show this as mostly an improvement.

### How

Use reference to `http_context` inside `ResolveFilterExpressionContext`,
remove resulting `.clone()` calls.

V3_GIT_ORIGIN_REV_ID: b7e728cf4f376f7c69b83eab0d79a43c90ee265b
2024-09-26 13:02:52 +00:00
Daniel Harvey
ccd6cf793b Rename schema crate to graphql-schema (#1117)

### What

Much like https://github.com/hasura/v3-engine/pull/1116, make clearer
what is and is not graphql-centric in engine by renaming `schema` to
`graphql-schema`.

### How

Moving files around, no functional changes.

V3_GIT_ORIGIN_REV_ID: ec06c33a964c16a53c1a4ed306de3fdccd2e8efc
2024-09-17 20:07:06 +00:00
Daniel Harvey
6574ed7da0 Rename ir crate to graphql-ir crate (#1116)

### What

Make what is and is not graphql-centric a little clearer by renaming
this crate and moving it into a `crates/graphql` folder.

### How

No functional changes

V3_GIT_ORIGIN_REV_ID: 3644ce32059e16db9b467b010430ba23fc436ed9
2024-09-17 13:51:26 +00:00
Rakesh Emmadi
e5d7822086 Define subscription request plan (#1097)
Closes:
https://linear.app/hasura/issue/APIPG-876/live-queries-or-ir-and-requestplan-for-subscriptions
### What

- Define and generate request plan for subscriptions.
- Define filter expression resolving context to allow/reject remote
relationships.

### How

- Modifying existing enums, defining new types and functions.

V3_GIT_ORIGIN_REV_ID: 1e1a248c74d210cf45288002e1fe13ce6a868fe6
2024-09-17 07:11:46 +00:00
Vamshi Surabhi
a33dc2e1a7 PACHA-82 sql: support empty arguments when arguments are all nullable (#1095)

### What

We support `fn()` syntax when all arguments are nullable.

### How

Checks if all the arguments are nullable and handles this case. A
bunch of unit tests are added.
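
A small sketch of that check, under the assumption that omitted nullable arguments are treated as null. `ArgumentInfo` and `resolve_empty_call` are hypothetical names used for illustration only.

```rust
/// Hypothetical description of one command argument.
struct ArgumentInfo {
    name: String,
    nullable: bool,
}

/// Resolve the arguments for a call written as `fn()`.
/// Every argument is filled with `None` (null) when all are nullable;
/// otherwise the first non-nullable argument is reported.
fn resolve_empty_call(
    arguments: &[ArgumentInfo],
) -> Result<Vec<(String, Option<String>)>, String> {
    match arguments.iter().find(|argument| !argument.nullable) {
        Some(required) => Err(format!(
            "argument '{}' is not nullable and must be provided",
            required.name
        )),
        None => Ok(arguments
            .iter()
            .map(|argument| (argument.name.clone(), None))
            .collect()),
    }
}
```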

V3_GIT_ORIGIN_REV_ID: 65c3eff6200930474bc479b27666b77e4c648b49
2024-09-11 03:42:53 +00:00
Phil Freeman
6363ac8289 [PACHA-80] Fix issue with generated uniqueness constraints (#1064)

### What

- Check all columns exist in SQL schema before exposing uniqueness
constraints.

### How

V3_GIT_ORIGIN_REV_ID: 0f8c25e4f8b6b9aa5647a0acdba5e9c233ccb77d
2024-09-06 19:32:52 +00:00
Phil Freeman
e96acd1fe5 [PACHA-22] Disallow recursive types in SQL table column types (#1056)

### What

- Disallow recursive types in types of table columns

### How

- As we walk down the table type constructing the corresponding Arrow
types, we keep track of a set of struct type names that we've seen.
- If we see a type name we've seen before, we don't include the current
field:
  - If we're in a nested context, we cascade the field deletion up
  - If we're at a top-level table column, we remove that column.

With this approach, the fields which are removed never depend on the
path the traversal takes, so we never end up with a reference to a named
struct type which in reality fetches only a subset of the total fields.
A named type always refers to the same set of fields.
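
The traversal can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the engine's code: `Ty`, `Defs` and both function names are hypothetical, the "seen" set is taken to hold the struct names on the current path, and a recursive reference anywhere below a column removes the whole column (the cascade described above).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified column type: a scalar, or a struct referenced by name.
#[derive(Debug, Clone)]
enum Ty {
    Scalar,
    Struct(String),
}

/// Struct name -> list of (field name, field type).
type Defs = BTreeMap<String, Vec<(String, Ty)>>;

/// Returns false if walking `ty` meets a struct name already on the path.
fn is_non_recursive(ty: &Ty, defs: &Defs, seen: &mut BTreeSet<String>) -> bool {
    match ty {
        Ty::Scalar => true,
        Ty::Struct(name) => {
            // `insert` returns false when the name was already present.
            if !seen.insert(name.clone()) {
                return false;
            }
            // Unknown struct names are treated as unsupported in this sketch.
            let ok = defs.get(name).map_or(false, |fields| {
                fields
                    .iter()
                    .all(|(_, field_ty)| is_non_recursive(field_ty, defs, seen))
            });
            seen.remove(name);
            ok
        }
    }
}

/// Keep only the top-level columns whose types are free of recursion.
fn supported_columns(columns: &[(String, Ty)], defs: &Defs) -> Vec<String> {
    columns
        .iter()
        .filter(|(_, ty)| is_non_recursive(ty, defs, &mut BTreeSet::new()))
        .map(|(name, _)| name.clone())
        .collect()
}
```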

V3_GIT_ORIGIN_REV_ID: 270a71cd5b1d3700067abf7c771c473bbc33167e
2024-09-06 18:55:50 +00:00
Phil Freeman
7a31d2d610 [PACHA-21] support uniqueness constraints in SQL layer (#1037)

### What

Exposes SQL constraints to datafusion's planner via the
`TableProvider::constraints()` trait method.

### How

We pull these constraints from the GraphQL configuration for the model.
This is not ideal - the data should really live at the model layer now.
But we can hopefully solve that later.

V3_GIT_ORIGIN_REV_ID: c723d895fd707e24defae971f474b5938478a82b
2024-09-03 19:19:47 +00:00
Vamshi Surabhi
42ce01d755 [PACHA-18] sql: forward request headers to connectors (#1036)

### What

When a command is called through the sql interface, we now respect
`argumentPresets` configured at the data connector link. We also
partially support `responseHeaders` in that we extract the response from
the 'result' key but do not extract 'headers' and forward them.

### How

The part of the engine code that deals with data connector argument
presets is exposed from the ir crate and re-used in the sql layer.

V3_GIT_ORIGIN_REV_ID: 7c3124596a9bbc2b18cb79cb899c75fd4de3f7e5
2024-08-29 19:56:54 +00:00
Daniel Harvey
ffa43e3c57 Construct empty sql catalog if feature switched off (#1034)

### What

We don't need to do this work if the feature isn't turned on.

### How

Create a new `Catalog::empty_from_metadata` function that is essentially
`mempty`.

V3_GIT_ORIGIN_REV_ID: 594462e729ff4afc3bee4db9e4b8e44d41020428
2024-08-29 14:56:09 +00:00
Phil Freeman
317193df10 [PACHA-24] Metrics in the SQL layer (#1029)

### What

- Adds datafusion row metrics to our NDC query and aggregate nodes, for
explain output
- Aggregates all datafusion metrics in the trace attributes:
  - `rows_processed`, i.e. total number of rows considered over all
    execution plan nodes
  - `elapsed_compute`, i.e. CPU time spent in _processing_ data (not
    fetching it)
- Adds the explain output to the `create_logical_plan` span.

E.g. a query we don't push down to NDC:

```sql
SELECT
    COUNT(42 * invoiceId) AS odd_count
FROM
    InvoiceLine;
```

Attributes:

```text
rows_processed: 2242
total_rows: 1
elapsed_compute: 417
logical_plan: Projection: count(Int64(42) * InvoiceLine.invoiceId) AS odd_count
  Aggregate: groupBy=[[]], aggr=[[count(Int64(42) * InvoiceLine.invoiceId)]]
    TableScan: InvoiceLine
```

The metrics clearly indicate that the cost in terms of rows processed
per row returned (2242 / 1) is very high in this case. The logical plan
makes it clear why this was the case: we failed to push down the
aggregate node.

### How

V3_GIT_ORIGIN_REV_ID: c26cce9adab9d0feb0a7d2873a3eea38542564a0
2024-08-29 00:56:46 +00:00
Phil Freeman
06937b1107 [PACHA-25] Aggregate pushdown for COUNT (#1021)

### What

Push down SQL `COUNT` aggregates to the NDC layer:

- `COUNT(1)`
- `COUNT(col)` where col may be nested
- `COUNT(DISTINCT col)`

### How

Introduces a new logical node (`ModelAggregate`) and corresponding
physical node (`NDCAggregatePushdown`). Whenever we see a `ModelQuery`
wrapped in an `Aggregate` node, we rewrite it to a `ModelAggregate` node
instead. We don't handle `GROUP BY` yet, but this approach will
generalize to that once NDC 0.2.0 lands.
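
The rewrite can be sketched like this. The `LogicalPlan` and `Count` enums below are simplified stand-ins, not datafusion's or the engine's real node types, and the rule shown only covers the case described above (an ungrouped `Aggregate` directly over a `ModelQuery`).

```rust
/// The three COUNT forms listed above.
#[derive(Debug, Clone, PartialEq)]
enum Count {
    Star,             // COUNT(1)
    Column(String),   // COUNT(col)
    Distinct(String), // COUNT(DISTINCT col)
}

/// Simplified logical plan with only the nodes needed for the sketch.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    ModelQuery { model: String },
    Aggregate { input: Box<LogicalPlan>, group_by: Vec<String>, counts: Vec<Count> },
    ModelAggregate { model: String, counts: Vec<Count> },
}

/// Rewrite `Aggregate(ModelQuery)` into `ModelAggregate` when there is no GROUP BY.
fn push_down_count(plan: LogicalPlan) -> LogicalPlan {
    match plan {
        LogicalPlan::Aggregate { input, group_by, counts } => {
            match (*input, group_by.is_empty()) {
                (LogicalPlan::ModelQuery { model }, true) => {
                    LogicalPlan::ModelAggregate { model, counts }
                }
                // GROUP BY or a different input: keep the aggregate, recurse below it.
                (other, _) => LogicalPlan::Aggregate {
                    input: Box::new(push_down_count(other)),
                    group_by,
                    counts,
                },
            }
        }
        other => other,
    }
}
```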

V3_GIT_ORIGIN_REV_ID: 373d3941fc01c077270047612240d910045f6d93
2024-08-28 03:39:20 +00:00
Vamshi Surabhi
f47296c210 sql: commands backed by sql are now supported (#995)

### What

1. Commands backed by ndc procedures can now be executed in sql as
follows:

```sql
select * from command(args); -- args to be provided with struct syntax
```
2. There is an optional `disallow_mutations` field in the sql request
that defaults to `false`. When it is explicitly set to true, mutations
are disallowed. This is for pacha to get a confirmation for mutations.
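
A sketch of how the flag could gate execution. Only the `disallow_mutations` field name, its default and the error text come from the commit log; the `SqlRequest` struct layout, `CommandKind` and `check_command` are assumptions made for illustration.

```rust
/// Hypothetical shape of the SQL request body.
/// `disallow_mutations` defaults to `false` via `Default`.
#[derive(Debug, Default)]
struct SqlRequest {
    sql: String,
    disallow_mutations: bool,
}

/// Functions are read-only; procedures are mutations.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CommandKind {
    Function,
    Procedure,
}

/// Reject a procedure call when the request explicitly disallows mutations.
fn check_command(request: &SqlRequest, kind: CommandKind) -> Result<(), String> {
    if request.disallow_mutations && kind == CommandKind::Procedure {
        Err("Mutations are requested to be disallowed as part of the request".to_string())
    } else {
        Ok(())
    }
}
```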

### How

Most of the code is reused from ndc functions. There is a new
'NDCProcedurePushDown' physical node and the associated changes to a
mutation response from a connector.

V3_GIT_ORIGIN_REV_ID: 94913ab931290e0aa91ccd01173955da3aa1e423
2024-08-20 02:58:02 +00:00
Vamshi Surabhi
91b4fc6ee6 Improves the type mapping layer to SQL (#988)

### What

The motivation was to allow introspecting field descriptions of object
types. To accomplish this, a fair number of things had to change. Prior
to this, we determined a model's and command's schema at the sql layer
by looking at its type mappings. When introspecting these models and
commands, if there are any nested fields, they are output as
'STRUCT<field1 type1, field2 type2>' without any descriptions.

With this change, we first build a catalog of types at the sql layer
from scalars and object types defined at opendd layer. These types are
then used when adding models and commands as opposed to looking at their
type mappings. This also changes how the type catalog is presented in
introspection. One can now query 'struct type's and their fields along
with their descriptions. Models and commands merely refer to these
struct types.

### How

V3_GIT_ORIGIN_REV_ID: 550da03c84b33ca44851858ee2cb73a31674d3d0
2024-08-19 21:25:25 +00:00
Phil Freeman
89c85d2c47 [PACHA-16] Pushdown limits and offsets (#981)

### What

Push down limits and offsets, including inside existing limit and offset
queries.

### How

Adds a new rewrite which pushes a limit/offset inside an `NDCQuery`, in
case it is not picked up by the sort pushdown or the table scan
pushdown.
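
The arithmetic for pushing one limit/offset inside another can be sketched as below. `Window` and `push_into` are hypothetical names; the sketch only shows how an outer window composes with one already present on the inner query.

```rust
/// A limit/offset pair; `limit: None` means "no limit".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Window {
    offset: usize,
    limit: Option<usize>,
}

/// Combine an outer window with the window already applied by the inner query.
/// The inner query yields rows [o1, o1 + l1); the outer one skips o2 of those
/// and then takes at most l2.
fn push_into(inner: Window, outer: Window) -> Window {
    // Rows still available from the inner window after the outer offset.
    let remaining = inner.limit.map(|limit| limit.saturating_sub(outer.offset));
    let limit = match (remaining, outer.limit) {
        (Some(a), Some(b)) => Some(a.min(b)),
        (a, b) => a.or(b),
    };
    Window {
        offset: inner.offset + outer.offset,
        limit,
    }
}
```

For example, an inner `OFFSET 10 LIMIT 20` wrapped in an outer `OFFSET 5 LIMIT 100` collapses to `OFFSET 15 LIMIT 15`.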

V3_GIT_ORIGIN_REV_ID: 2dcf14ff9f605919abb047560681615b88e766e2
2024-08-19 16:30:20 +00:00
Vamshi Surabhi
a21f82bbe7 [PACHA-4] initial support for commands (#975)

### What

Supports calling functions from the SQL interface

### How

Similar to models.

V3_GIT_ORIGIN_REV_ID: 2958aeacdfd31bae0e4353cb7e20e627c84931b5
2024-08-18 03:34:15 +00:00
Phil Freeman
2795cdacad [PACHA-8] Test SQL endpoint (#980)

### What

Adds tests for select, filter and order by.

### How

Reuses the existing test framework. There is a giant `metadata.json`
file which is used for all tests, based on Postgres. Each test is a
folder with a `query.sql` file, an `expected.json` for expected output,
and a `plan.json` for the expected explain output.

V3_GIT_ORIGIN_REV_ID: 7f5c134c5d3cf47e5f2ffa305f24f4274ccd545e
2024-08-17 22:54:12 +00:00
Phil Freeman
ffeefdf834 [PACHA-5] Pushdown other filter operators (#974)

### What

Push down the `_lt`, `_lte`, `_gt`, `_gte` operators

### How

By matching operators with the same name on the NDC side, for now, until
we have additional NDC operator meanings that we can use.
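
A sketch of name-based matching, assuming the connector exposes a list of operator names. The `Comparison` enum and both functions are hypothetical; only the operator names `_lt`, `_lte`, `_gt` and `_gte` come from the description above.

```rust
/// SQL comparison operators handled by this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Comparison {
    Lt,
    LtEq,
    Gt,
    GtEq,
}

/// The operator name we look for on the NDC side.
fn operator_name(op: Comparison) -> &'static str {
    match op {
        Comparison::Lt => "_lt",
        Comparison::LtEq => "_lte",
        Comparison::Gt => "_gt",
        Comparison::GtEq => "_gte",
    }
}

/// Push down only when an operator with exactly that name is available.
fn pushdown_operator<'a>(op: Comparison, available: &'a [&'a str]) -> Option<&'a str> {
    let wanted = operator_name(op);
    available.iter().copied().find(|name| *name == wanted)
}
```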

V3_GIT_ORIGIN_REV_ID: 4341490a3cdbb62e9fe90c10279527716687545d
2024-08-17 16:03:29 +00:00
Phil Freeman
3d25939f0c [PACHA-14] Order by pushdown (#970)

### What

Implements a new optimizer pass which pushes sort stages inside
`ModelQuery` stages.

### How

V3_GIT_ORIGIN_REV_ID: 2820f88003aec376b71605c0f753d7b50825ddad
2024-08-16 23:57:09 +00:00
Phil Freeman
453bcbbbb7 [PACHA-5] Filter pushdown for equality operators (#969)
### What

Push down the following SQL predicates to NDC via OpenDD IR:

- Logical operators AND, OR and NOT
- Equality operator and inequality operator

TODO:

- [ ] Comparison operators
- [x] `IS NULL` and `IS NOT NULL`
- [ ] Validate operators actually exist in the OpenDD metadata and NDC
mappings
- [ ] Use the actual OpenDD operators instead of the stand-in `_eq` and
`_neq` operators

### How

- `plan/filter.rs` implements two functions `can_pushdown_filter` and
`pushdown_filter` (which translates to OpenDD IR)
- `planner/filter.rs` translates OpenDD IR to NDC IR for execution.
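
The two functions can be sketched roughly as follows. The function names are taken from the description above, but their signatures and the `SqlExpr` and `Filter` types are simplified assumptions: the real code works on datafusion expressions and OpenDD IR, and literal values here are restricted to integers for brevity.

```rust
/// Simplified SQL predicate.
#[derive(Debug, Clone, PartialEq)]
enum SqlExpr {
    And(Box<SqlExpr>, Box<SqlExpr>),
    Or(Box<SqlExpr>, Box<SqlExpr>),
    Not(Box<SqlExpr>),
    Eq(String, i64),
    NotEq(String, i64),
    IsNull(String),
    /// Anything this sketch cannot translate, e.g. a function call.
    Other(String),
}

/// Simplified stand-in for the OpenDD IR boolean expression.
#[derive(Debug, Clone, PartialEq)]
enum Filter {
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Not(Box<Filter>),
    Compare { field: String, operator: String, value: i64 },
    IsNull { field: String },
}

fn compare(field: &str, operator: &str, value: i64) -> Filter {
    Filter::Compare { field: field.to_string(), operator: operator.to_string(), value }
}

/// Translate a predicate, or return `None` if any part is unsupported.
fn pushdown_filter(expr: &SqlExpr) -> Option<Filter> {
    Some(match expr {
        SqlExpr::And(l, r) => Filter::And(vec![pushdown_filter(l)?, pushdown_filter(r)?]),
        SqlExpr::Or(l, r) => Filter::Or(vec![pushdown_filter(l)?, pushdown_filter(r)?]),
        SqlExpr::Not(inner) => Filter::Not(Box::new(pushdown_filter(inner)?)),
        // Stand-in operator names, as noted in the TODO list above.
        SqlExpr::Eq(field, value) => compare(field, "_eq", *value),
        SqlExpr::NotEq(field, value) => compare(field, "_neq", *value),
        SqlExpr::IsNull(field) => Filter::IsNull { field: field.clone() },
        SqlExpr::Other(_) => return None,
    })
}

/// A predicate can be pushed down exactly when the translation succeeds.
fn can_pushdown_filter(expr: &SqlExpr) -> bool {
    pushdown_filter(expr).is_some()
}
```

In this sketch `IS NOT NULL` is expressed as `Not(IsNull(..))`, and a single unsupported sub-expression makes the whole predicate stay in datafusion.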

V3_GIT_ORIGIN_REV_ID: ee2c38bf2c36292710becbabfaca51ccae983a2f
2024-08-16 01:27:53 +00:00
Samir Talwar
a78978ab4c Update dependencies in preparation for some cloud work. (#972)
### What

Update dependencies in preparation for some cloud work, and move
dependency versions to the workspace.

### How

```
$ cargo update
```

V3_GIT_ORIGIN_REV_ID: a42ba0c8a1cb376280f9771abd7d1f4f7bf19b84
2024-08-15 09:32:07 +00:00
Vamshi Surabhi
fa9d91a1a5 sql: moves crate::plan to crate::execute::planner::model (#971)

### What

This is a no-op change. Moves model related planning code from the
top-level to `planner` submodule. This is in preparation for commands
implementation.

V3_GIT_ORIGIN_REV_ID: 97a73ced40dadf168efc1147e52ef4f36103bc50
2024-08-15 08:01:29 +00:00
Phil Freeman
714fffad18 [PACHA-6] Remove projection pushdown (#967)

### What

Remove the projection pushdown optimization. `datafusion` already
optimizes this to the correct NDC IR.

### How

V3_GIT_ORIGIN_REV_ID: 6e0034e4d920c39b70667f5f521341069a5c53de
2024-08-13 19:18:40 +00:00
Vamshi Surabhi
6051c2f359 update datafusion to 41.0.0 (#959)

### What

Updates datafusion dependency to `41`.

### How

Fixes for breaking changes.

V3_GIT_ORIGIN_REV_ID: cc957f6c7ff9b3c004cc5e0cbb48387c1f963b7a
2024-08-12 23:31:27 +00:00
Vamshi Surabhi
173ec9a1e5 [PACHA-12] sql: fix failing introspection queries (#958)
### What

Introspection queries (on 'hasura' schema) would fail when there is no
data in the underlying tables.

### How

A more robust 'MemTable' with a comprehensive set of tests is introduced
which shouldn't run into these issues.

V3_GIT_ORIGIN_REV_ID: e09de03e8d093fb4348514cfed6b6dc1d9b0b0c8
2024-08-12 22:04:36 +00:00
Phil Freeman
6f9e92c160 [PACHA-1] Handle nested fields in /sql endpoint (#936)

### What

- Add columns with nested fields to the SQL schema
- Alias nested fields appropriately in order to support them for query
execution

### How

- Translate OpenDD types to Arrow types during schema generation
(`to_arrow_type`)
- Generate `NestedField` structures during planning to prepare data in
the right format during execution (`fields_for`)

V3_GIT_ORIGIN_REV_ID: d37d2eade2fd5c0f08861c1bbc6368a88299b0f3
2024-08-12 21:25:11 +00:00
Vamshi Surabhi
db80b37ece [PACHA-2] sql: handle ndc responses with empty rows (#947)

### What

Querying a table with no data through SQL would result in an error.

### How

Instead of returning a `RecordBatch`, arrow_json's implementation
returns an `Option<RecordBatch>`, we now account for `None`.
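
The shape of the fix, sketched with a stand-in type so the example needs no external crates. `Batch` and `batch_or_empty` are hypothetical; in real code the empty case would presumably be built from the expected schema, for example with arrow's `RecordBatch::new_empty`.

```rust
/// Stand-in for an Arrow record batch: column names plus rows of values.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    columns: Vec<String>,
    rows: Vec<Vec<i64>>,
}

/// The decoder yields `None` when the response contained no rows.
/// Instead of treating that as an error, return a batch with the
/// expected columns and zero rows.
fn batch_or_empty(decoded: Option<Batch>, columns: &[String]) -> Batch {
    decoded.unwrap_or_else(|| Batch {
        columns: columns.to_vec(),
        rows: Vec::new(),
    })
}
```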

V3_GIT_ORIGIN_REV_ID: 459440e82aeb1b2faa009405e025fc024497d5b4
2024-08-12 09:52:31 +00:00
Abhinav Gupta
180c1dbc59 Refactor SQL layer to use OpenDD query IR (#925)
As per the multiple frontends RFC:
https://github.com/hasura/v3-engine/blob/vamshi/multiple-frontends/rfcs/multiple-frontends.md

V3_GIT_ORIGIN_REV_ID: 07f7c5323179a62fd08717d6d49f9415da139873
2024-08-05 23:38:19 +00:00
Vamshi Surabhi
4aefdabb65 avoid using raw Strings in more places (#923)
- `DataConnectorAggregationFunctionName` and `AggregateFunctionName` now
use `str_newtype`.
- All usages of `String`s for subgraph names are removed.

(This is part of a larger effort to remove references in
`execute::plan::QueryPlan`).

V3_GIT_ORIGIN_REV_ID: d51f0a2335e8dabbc9efdad1d1efff285ddb74c3
2024-08-05 22:27:47 +00:00
Vamshi Surabhi
d41170b06a simplify the sql context that powers datafusion (#921)
Prior to this, on every request, a datafusion catalog provider was
created from the stored sql context. This PR reworks it so that this is
cheap and also more maintainable with fewer intermediate steps. There is
also some work done towards supporting table valued functions.

---------

Co-authored-by: Abhinav Gupta <127770473+abhinav-hasura@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 8c30485366969d81d2a35760962e0383ed5e488c
2024-08-01 21:28:32 +00:00
Rakesh Emmadi
7177a423da Support remote relationship in permission filter (#904)
Closes:
https://linear.app/hasura/issue/APIPG-397/support-remote-relationship-predicates-in-permission-filters

### What

Allow defining permission filters with remote relationships in their
predicates.

### How

- Lift metadata resolve restriction for remote relationships in
permission predicates
- Abstract out the remote relationship resolving logic, in query filter,
into a new function and re-use it while resolving permission filters.
- Tests:
  - A metadata build test to check the presence of the essential equal
    operator on source fields in relationship mapping.
  - Ported all `select_many/relationship_predicate/*` tests to a new
    `select_many/remote_relationship_predicate/*` with appropriate metadata
    changes.

---------

Co-authored-by: Anon Ray <ecthiender@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 9c496ecdc9829ed626354ef85e776e1afcb0dfc7
2024-07-31 11:41:12 +00:00
Daniel Harvey
07f0a90332 Split out IR crate (#909)

### What

`execute` is now the biggest `crate` in engine and does a lot, so let's
split it into its constituent steps.

Functional no-op.

### How

Split out `ir` crate from the `execute` crate. Replace export of entire
modules with that of specific types / functions. Therefore, consumers
outside the crate talk about `ir::CommandInfo` rather than
`ir::command::CommandInfo`. There is no need for other crates to know
about the internal structure of this crate.
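
The re-export pattern described above can be sketched as follows. The module layout mirrors the names in the description, but the `command_name` field and the `describe` function are hypothetical.

```rust
pub mod ir {
    // Private module: its existence and layout are invisible outside `ir`.
    mod command {
        #[derive(Debug, PartialEq)]
        pub struct CommandInfo {
            // Hypothetical field, for illustration only.
            pub command_name: String,
        }
    }

    // Re-export the specific type instead of the whole module.
    pub use command::CommandInfo;
}

/// A consumer refers to `ir::CommandInfo`; the path
/// `ir::command::CommandInfo` is not reachable from here.
pub fn describe(info: &ir::CommandInfo) -> String {
    format!("command {}", info.command_name)
}
```

Because `command` stays private, it can later be renamed or split without touching any consumer.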

V3_GIT_ORIGIN_REV_ID: 47553aec63e80af7f95e659a170a2685e9ac2ce3
2024-07-30 15:03:49 +00:00
Rakesh Emmadi
7c9c3f5859 no-op refactor: split plan/types.rs into separate modules (#908)

### What

The `plan/types.rs` file has become large and overwhelming. This PR moves
its code into separate modules and removes the file.

### How

- Move code from `plan/types.rs` into old `arguments.rs`, `filter.rs`
and new `field.rs`, `query.rs`, `mutation.rs`.
- Delete `plan/types.rs`
- Refactor code in other modules to accommodate new changes.

V3_GIT_ORIGIN_REV_ID: 0e294ca8fb4bf1d8622806f5c8b72a2bb01ccdaf
2024-07-30 13:41:05 +00:00
Anon Ray
72289171aa rename NdcFieldName to NdcFieldAlias (#882)
### What

We introduced a newtype around the NDC field alias, but we called it
`NdcFieldName`, even though in reality it is the alias of the field
requested in the query.

This PR changes the name to `NdcFieldAlias`.

This is a no-op change.

V3_GIT_ORIGIN_REV_ID: 8e892c29860e93243a200b6a6291fd0a32cc6fe3
2024-07-26 08:10:15 +00:00
Daniel Chambers
dacb229d10 sql crate now executes via plan and decouples from NDC types (#873)
Previously the `sql` crate generated a v02 ndc query request and then
downgraded it to v01 if necessary. This is fragile in that it's easy to
use v02 ndc features and then get v01 downgrade errors, plus the
downgrade logic is extensive and tedious.

This PR refactors the `sql` crate so that it generates `ir` and `plan`
types and eventually creates `ResolvedQueryExecutionPlan` (rather than
ndc_models types), and then the ResolvedQueryExecutionPlan is
transformed into the appropriate ndc version in the same fashion as the
main engine execute code does it. This eliminates all the downgrade
logic and simplifies things.

Unfortunately, ndc's `QueryRequest` could not just simply be replaced
with `QueryExecutionPlan` on `sql`'s `NDCQuery` and `NDCPushDown`,
because it involves lifetime parameters which are incompatible with the
datafusion framework types. Instead, the individual components of a
query are kept on `NDCQuery` and `NDCPushDown`, and these are eventually
assembled into a `ResolvedQueryExecutionPlan` at a place where the
lifetime parameters are workable. In some sense this is clearer, as one
can now see where each individual part of the query is actually created
and relevant, instead of copying around and mutating a `QueryRequest`.

Completes
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on

V3_GIT_ORIGIN_REV_ID: c4a9226c1b1addcfe5cd0bca783f1b65ab3ada38
2024-07-24 11:37:44 +00:00
Daniel Chambers
00fa5c42ba Refactor to prevent unresolved queries from being sent as ndc requests (#871)
~~Note: this PR is stacked on #845.~~ Rebased on main

This PR refactors the `execute::plan::types` further to make a clear
distinction between unresolved and resolved states. An "unresolved"
state refers to one in which remote predicates have not been computed
into local predicates. A "resolved" state is after this process is
performed and remote predicates are eliminated.

Previously, unresolved types could be passed to
`execute::plan::ndc_request` and they would fail at runtime due to the
presence of unresolved remote predicates. Now, this is impossible due to
a type-level distinction between unresolved and resolved states.

This distinction is made by type-parameterizing all
`execute::plan::types` that involve a predicate so that the predicate
type is parameterized out. Then, an `Unresolved` type alias is created
that sets the predicate type to
`execute::ir::filter::expression::Expression` (which contains remote
predicates) and a `Resolved` type alias is created that uses
`ResolvedFilterExpression` instead (which does not contain remote
predicates).

For example, for `QueryNode`, we now have:

```rust
pub struct QueryNode<'s, TFilterExpression> {
    ...
    pub predicate: Option<TFilterExpression>,
    ...
}
```

And then the two aliases are:

```rust
pub type UnresolvedQueryNode<'s> = QueryNode<'s, ir::filter::expression::Expression<'s>>;
pub type ResolvedQueryNode<'s> = QueryNode<'s, ResolvedFilterExpression>;
```

Subsequently, `plan::ndc_request` only deals with `Resolved` types.
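
A simplified, self-contained sketch of the same idea follows. It drops the `'s` lifetime and reduces the types to the minimum needed to compile; the enum variants, the `limit` field, the `resolve` method, and the `run_remote` callback are assumptions for illustration, not the engine's actual definitions.

```rust
/// Simplified stand-in for the unresolved expression type:
/// it may still contain remote predicates.
#[derive(Debug, Clone, PartialEq)]
pub enum Expression {
    Local(String),
    Remote(String),
}

/// Simplified stand-in for the resolved type: local predicates only.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedFilterExpression(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct QueryNode<TFilterExpression> {
    pub limit: Option<u32>,
    pub predicate: Option<TFilterExpression>,
}

pub type UnresolvedQueryNode = QueryNode<Expression>;
pub type ResolvedQueryNode = QueryNode<ResolvedFilterExpression>;

impl QueryNode<Expression> {
    /// Turn remote predicates into local ones. `run_remote` is a
    /// hypothetical callback standing in for the extra NDC query.
    pub fn resolve(self, run_remote: impl Fn(&str) -> String) -> ResolvedQueryNode {
        let predicate = self.predicate.map(|expression| match expression {
            Expression::Local(local) => ResolvedFilterExpression(local),
            Expression::Remote(remote) => ResolvedFilterExpression(run_remote(&remote)),
        });
        QueryNode { limit: self.limit, predicate }
    }
}

/// Only accepts resolved nodes; passing an `UnresolvedQueryNode`
/// is a compile error rather than a runtime failure.
pub fn ndc_request(node: &ResolvedQueryNode) -> String {
    match &node.predicate {
        Some(ResolvedFilterExpression(predicate)) => format!("where {}", predicate),
        None => "no predicate".to_string(),
    }
}
```

The only way to obtain a `ResolvedQueryNode` from an unresolved one is through `resolve`, so the type checker enforces the ordering that was previously checked at runtime.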

This is mostly just type-fiddling, but one place where some logic moved
around is the old `plan::types::FilterExpression`. This was mostly a
functional duplicate of `ir::filter::execute::Expression` except that it
had a "planned" remote predicate variant in it. In order to reduce the
number of types (so we didn't need `UnresolvedFilterExpression` and
`ResolvedFilterExpression`), this type has been repurposed into
`ResolvedFilterExpression` and no longer deals with remote predicates.
Instead, `ir::filter::execute::Expression` is resolved into a
`ResolvedFilterExpression` and the planning of the remote predicate is
done at that time, just before it is resolved. This works fine, since an
entirely new ndc query is performed in order to resolve the predicate,
so planning that can be deferred until then and it doesn't need to be
done at the same time as the main query.

Part of
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on

V3_GIT_ORIGIN_REV_ID: 3ec89efbaa7b543fad6a100e2739bcc74b1d567f
2024-07-24 09:55:39 +00:00