<!-- The PR description should answer 2 important questions: -->
### What
- Add columns with nested fields to the SQL schema
- Alias nested fields appropriately in order to support them for query
execution
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
### How
- Translate OpenDD types to Arrow types during schema generation
(`to_arrow_type`)
- Generate `NestedField` structures during planning to prepare data in
the right format during execution (`fields_for`)
V3_GIT_ORIGIN_REV_ID: d37d2eade2fd5c0f08861c1bbc6368a88299b0f3
<!-- The PR description should answer 2 important questions: -->
### What
Renables Github benchmarking after we removed it in
https://github.com/hasura/v3-engine/pull/819
### How
Tell Criterion to only sample for 5 seconds each time to stop each
benchmark going on forever. This makes the whole run take a reasonable
10 minutes.
V3_GIT_ORIGIN_REV_ID: 364be6490f4f4b21877849daf1f734fa51ecf542
<!-- The PR description should answer 2 important questions: -->
### What
Upgrade to [Rust
1.80.0](https://blog.rust-lang.org/2024/07/25/Rust-1.80.0.html)
### How
Update `rust-toolchain.yaml` and Dockerfiles, fix warnings.
V3_GIT_ORIGIN_REV_ID: ba797e1aba6b9623a921734473a6b70a2a38c8b7
<!-- The PR description should answer 2 important questions: -->
### What
When querying a table with no data through SQL would result in an error.
### How
Instead of returning a `RecordBatch`, arrow_json's implementation
returns an `Option<RecordBatch>`, we now account for `None`.
V3_GIT_ORIGIN_REV_ID: 459440e82aeb1b2faa009405e025fc024497d5b4
<!-- The PR description should answer 2 important questions: -->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
Just an improvement to the middleware plugin
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
We now take a nonempty list of plugins. This ensures that we only do
things if we have plugins.
V3_GIT_ORIGIN_REV_ID: c8fb548f763cdefe3526c67d7c801104ad5c527a
<!-- The PR description should answer 2 important questions: -->
### What
We want to store the pre-plugins as artifacts. For this, we need to get
the list of pre-plugins while building the artifacts.
### How
This can be achieved by extracting the pre-plugins during the build
step.
---------
Co-authored-by: Rakesh Emmadi <12475069+rakeshkky@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 64e8697d90092acad0cb8a338becb7868af78350
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
This PR introduces a new OpenDD object of kind `LifecyclePluginHook`. An
example
```json
{
"kind": "LifecyclePluginHook",
"version": "v1",
"definition": {
"pre": "parse",
"name": "test",
"url": "http://localhost:8787",
"config": {
"request": {
"headers": {
"additional": {
"hasura-m-auth": {
"value": "zZkhKqFjqXR4g5MZCsJUZCnhCcoPyZ"
}
}
},
"session": {},
"rawRequest": {
"query": {},
"variables": {}
}
}
}
}
}
```
The plugin configs (only pre-parse plugins for now) are stored in the
engine state and used wherever required.
### How
We have added the OpenDD object.
V3_GIT_ORIGIN_REV_ID: aa02315362e5fc9a36b63ead48909e1baa92779f
### What
Add a check for presence of descriptions in JSON schema. Add missing
descriptions for types.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Extend the `validate_root_json_schema` utility with a check for
descriptions.
- Add a doc comment for types that need a description in their JSON
schema.
---------
Co-authored-by: Abhinav Gupta <127770473+abhinav-hasura@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 5c411aa0cf33ac1fde076c29020edd4957fbc27c
<!-- The PR description should answer 2 important questions: -->
### What
Resolve a TODO from boolean expressions stage.
There is a small chance this cause a build error - will check the schema
diff tests before merging.
### How
Use `store_new_graphql_type` on the graphql types of boolean
expressions.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 0498f8f3480ef2d5bbdbb57a0e50cae3bbdef868
<!-- The PR description should answer 2 important questions: -->
### What
Avoid some unnecessary cloning during metadata resolve.
### How
Some data structures, such as `graphql_types` which are updated by
various stages of metadata resolve, were being unnecessarily cloned.
Instead of cloning from an immutable reference, we now move the value
into each stage and return it as part of the stage output.
V3_GIT_ORIGIN_REV_ID: 067698c3e004c70165fb0a8190542115a9f6cfb6
### What
- Added `AuthConfig` v2 config example in
`static/auth/auth_config_v2.json`
- Moved exisiting `auth_config.json` to `static/auth/`
- Removed unused `pre_plugins.json`
If one wants to start the engine with a v2 of AuthConfig,
`static/auth/auth_config_v2.json` can be used.
V3_GIT_ORIGIN_REV_ID: 471f8ae43ab02c2182457804a24b8445bb41f06c
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
- Add metadata resolver for the new `OrderByExpression` metadata object
kind.
- When resolving a v1 `Model`, generate an `OrderByExpression` based on
the model's `orderable_fields`.
### How
`OrderByExpression` was added to OpenDD in
https://github.com/hasura/v3-engine/pull/780. This PR adds metadata
resolvers for this new metadata object kind. The OpenDD changes involved
a new version v2 of `Model` where `orderableFields` was moved out of the
out of the model definition. The model now references an
`OrderByExpression` instead.
To retain backwards compatibility with `Model` v1, the model metadata
resolver now extracts the `orderableFields` from a v1 model and uses
them to create a new `OrderByExpression` for internal use by the
resolved metadata.
V3_GIT_ORIGIN_REV_ID: a7dbafe860e586efdb2e03c23020a067011c57a1
Specifically, DataConnectorColumnName and DataConnectorName which are
wrappers on `SmolStr` and hence cheap to clone.
We want to use `Plan` as a physical node in the sql layer but given
datafusion's architecture, a physical node cannot contain references.
This is a small PR towards this effort.
V3_GIT_ORIGIN_REV_ID: 284dcfb4e8e7ce83705b415611c22e8a6e25e4be
### What
Added all relevant W3C and Zipkin/B3 trace response headers, to exposed
headers for CORS.
The headers list (as pointed out by Samir) -
For Zipkin/B3:
- `X-B3-TraceId`
- `X-B3-SpanId`
- `X-B3-ParentSpanId`
- `X-B3-Sampled`
For W3C:
- `traceparent`
- `tracestate`
This is generally useful for any client accessing the API to retrieve
tracing information.
### How
Created a constant array of relevant header names. And initialize the
CORS middleware, with these as exposed headers.
V3_GIT_ORIGIN_REV_ID: c7aaf2507b03e1897971ca6cd2bbaa06b08dfa52
<!-- The PR description should answer 2 important questions: -->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
Fields involved in the relationship's inner predicates were incorrectly
reported as fields of the root model. This PR resolves the issue. Also,
fixes the predicates inside `And` or `Or` are not reported.
Note: Changelog not required, as query usage analytics are Hasura
internal and hidden from users.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Use `for` loop instead of `Iterator map` to avoid confusion around the
execution of lambda passed to the `map` function (more context in this
[slack](https://hasurahq.slack.com/archives/C04PUMV4X16/p1722871834852519)
thread)
- Introduce a new struct, to report predicate relationship fields, that
has a field to report its inner filter predicate usage.
V3_GIT_ORIGIN_REV_ID: 9ca23e6005ccb09f2321a2ae30ef575f99e84e06
- `DataConnectorAggregationFunctionName` and `AggregateFunctionName` now
use `str_newtype`.
- All usages of `String`s for subgraph names are removed.
(This is part of a larger effort to remove references in
`execute::plan::QueryPlan`).
V3_GIT_ORIGIN_REV_ID: d51f0a2335e8dabbc9efdad1d1efff285ddb74c3
<!-- The PR description should answer 2 important questions: -->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
When reporting query usage analytics, mention whether a field is
deprecated.
Note: The changelog is not required, as the usage analytics are for
Hasura internal use.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
OpenDd allows marking an ObjectType's field as deprecated with an
optional reason. Plumb the deprecation context to the input/output
schema annotation. Report the field usage with a deprecated boolean
field.
V3_GIT_ORIGIN_REV_ID: 430cdcf3e1ff0c43812caecb8d06a64b729665be
### What
Add a flag `--export-traces-stdout` to log traces to stdout. Default is
disabled.
Command-line flag - `--export-traces-stdout`
Env var - `EXPORT_TRACES_STDOUT`
### How
Introduce a new command line flag. Make `initialize_tracing` accept a
`bool`, and setup the stdout exporter based on that.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: f39d6f863fd2bca65ad89f1cef4b077aa9eabc5b
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
We would like to generate operation_name level metrics with execution
latency. Right now, the operation_name is part of validate span which
isn't really doing anything while `execute_query` is the parent span
which will represent the operation time.
<!-- Consider: do we need to add a changelog entry? -->
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
This PR adds `operation_name` to `execute_query` span
V3_GIT_ORIGIN_REV_ID: fc14d92c66b0245739d672b7570be1871243f241
<!-- The PR description should answer 2 important questions: -->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
Fixes a bug where queries with nested relationship selection and filter
predicates fail due to an issue with NDC relationship collection.
```graphql
query MyQuery {
Album {
AlbumId
Title
ArtistId
Tracks {
AlbumId
Name
TrackId
}
}
}
```
A selection permission defined on the `Tracks` model with a relationship
comparison in the predicate.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Previously, the collection of relationships occurred independently by
traversing through the IR AST. Consequently, during planning, the
collection of local relationships was explicitly ignored. This caused
confusion and resulted in the omission of relationship collectors when
planning nested selections for local relationships, leading to the
issue.
- In this PR, the independent collection of relationships is removed.
Instead, all NDC relationships for field selection, filter, and
permission predicates are now collected during planning. This unifies
the logic, and ensures consistency in achieving the same purpose.
V3_GIT_ORIGIN_REV_ID: cbd5bfef7a90a7d7602061a9c733ac54b764e0d3
<!-- The PR description should answer 2 important questions: -->
### What
Trying to understand what is going on here. Still no closer, but have
added a test and made some types more specific in order to clarify my
understanding.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
### How
Add some introspection tests for relationships with
`ObjectBooleanExpressionType`s to ensure they generate. Tried to make
relationship fields disappear to recreate build problems but could not.
Split `BooleanExpressionGraphqlConfig` and
`ObjectBooleanExpressionGraphqlConfig` to make sure we're not mixing
them up. We only want to use `BooleanExpressionGraphqlConfig` in
`metadata_resolve`, this ensures that.
Pushed some partiality in `schema/boolean_expressions.rs` out - a
function was `Option<inputs> -> Option<outputs>` and now it's `inputs ->
outputs`. We use `Option` a lot and it makes reasoning why something
hasn't been added to the schema difficult.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 893e6f32bfded14ea724be7eaedc519e264f4c01
This PR fixes the following bugs:
- Fixes a bug where models and commands were allowed even though they
did not define arguments to satisfy the underlying data connector
collection/function/procedure. **UPDATE:** This only raises a warning
rather than fails the build, because existing builds on staging and
production have this issue. This will need to be transitioned to an
error once the Compatibility Date plumbing is in place.
- Fixes a bug where argument presets set in the DataConnectorLink were
sent to every connector function/procedure regardless of whether the
function/procedure actually declared that argument
- Fixes a bug where argument presets set in the DataConnectorLink were
not sent to connector collections that backed Models
- Fixes a bug where the type of the argument name in the
DataConnectorLink's argument presets was incorrect in the Open DD
schema. It was `ArgumentName` but should have been
`DataConnectorArgumentName`
- Fixes a bug where the check to ensure that argument presets in the
DataConnectorLink does not overlap with arguments defined on
Models/Commands was comparing against the Model/Command argument name
not the data connector argument name
There are a number of changes that tighten things up in this PR.
Firstly, the custom connector is improved so that it rejects requests
with arguments of the wrong type or unexpected arguments. This causes
tests that should have been failing to actually fail. Then, new tests
have been added to metadata_resolve to cover the untested edge cases
around data connector link argument presets.
Then, metadata resolve is refactored so that the link argument presets
are validated and stored on each command/model source, rather than on
the DataConnectorLink. Extra validation has been added during this
process to fix the above bugs. Any irrelevant argument presets to the
particular command/model are dropped. Then, during execution, we read
the presets from the command/model source instead of from the
DataConnectorLink, which ensures we only send the appropriate arguments.
JIRA: [V3ENGINE-290](https://hasurahq.atlassian.net/browse/V3ENGINE-290)
Fixes
https://linear.app/hasura/issue/APIPG-676/dataconnectorlink-argument-presets-are-always-sent-regardless-of
[V3ENGINE-290]:
https://hasurahq.atlassian.net/browse/V3ENGINE-290?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
V3_GIT_ORIGIN_REV_ID: dd02e52e1ff224760c5f0ed6a73a1ae56779e1f1
The validation added in #880 validated that the version in the
DataConnectorLink's capabilities version matched the version specified
in the schema. Unfortunately, there are existing builds with invalid
capabilities versions that failed to parse. Subsequently the validation
was removed in #907 to fix staging the deploy that broke.
This is the unique set of errors found when deploying to staging:
```
error generating artifacts: schema build error: invalid metadata: The data connector myts (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector my_ts (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector mydbpg (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector chinook (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector clickhouse (in subgraph analytics) has an error: The version specified in the capabilities ("^0.1.1") is an invalid version: unexpected character '^' while parsing major version number
error generating artifacts: schema build error: invalid metadata: The data connector chinook_link (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector app_connector (in subgraph app) has an error: The version specified in the capabilities ("^0.1.1") is an invalid version: unexpected character '^' while parsing major version number
error generating artifacts: schema build error: invalid metadata: The data connector chinook (in subgraph app) has an error: The version specified in the capabilities ("^0.1.1") is an invalid version: unexpected character '^' while parsing major version number
error generating artifacts: schema build error: invalid metadata: The data connector nodejs (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector db (in subgraph app) has an error: The version specified in the capabilities ("*") is an invalid version: unexpected character '*' while parsing major version number
error generating artifacts: schema build error: invalid metadata: The data connector my_pg (in subgraph my_subgraph) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector mypg (in subgraph myapp) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector mypglink (in subgraph mysubgraph) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector mypg (in subgraph app2) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
error generating artifacts: schema build error: invalid metadata: The data connector test_connector (in subgraph app) has an error: The version specified in the capabilities ("") is an invalid version: empty string, expected a semver version
```
The invalid versions are: `""`, `"*"`, "^0.1.1"`.
This PR restores the version validation code, but for NDC v0.1.x
capabilities (the only supported version right now, v0.2.x is feature
flagged off), we now accept versions that fail to parse as a valid
semver, and instead we raise an issue that gets logged as a warning.
NDC v0.2.x capabilities retains the stricter behaviour and does not
accept dodgy a capabilities version. This is backwards compatible
because trying to use NDC v0.2.x right now produces a build error.
Fixes APIPG-736
V3_GIT_ORIGIN_REV_ID: 9e9bf99123bad31e8229e8ea29343eb8aaf9786d
Prior to this, on every request, a datafusion catalog provider was
created from the stored sql context. This PR reworks it so that this is
cheap and also more maintainable will fewer intermediate steps. There is
also some work done towards supporting table valued functions.
---------
Co-authored-by: Abhinav Gupta <127770473+abhinav-hasura@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 8c30485366969d81d2a35760962e0383ed5e488c
<!-- The PR description should answer 2 important questions: -->
### What
When no `booleanExpressionType` is specified in a
`BooleanExpressionType` `comparableRelationship`, we fallback to
whatever is defined for the model. However, we were ignoring old style
`ObjectBooleanExpressionType`, meaning relationship fields were
disappearing.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
### How
Also match on `ModelExpressionType::ObjectBooleanExpressionType` when
looking up leaf boolean expressions for relationships.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 9a67b734679b8a1fe3d176a259ba579e127948b8
<!-- The PR description should answer 2 important questions: -->
### What
Change an error down to an issue / warning to unbreak builds.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
### How
Introduce `BooleanExpressionIssue`, move error value to it, emit this
instead. Later we'll turn this into an error based on compatibility
date.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: f0903cc04ea1cf328c9bf67a38d76fd670743679
<!-- The PR description should answer 2 important questions: -->
### What
We emit a warning suggesting users deprecate
`DataConnectorScalarRepresentation`, however it still has uses outside
boolean expressions, so let's not advise this until it is sensible
advice.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
<!-- Does this PR introduce new validation that might break old builds?
-->
<!-- Consider: do we need to put new checks behind a flag? -->
### How
Remove a warning.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: a95a705d121396a09a9b626237999f032e650189
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
Closes:
https://linear.app/hasura/issue/APIPG-397/support-remote-relationship-predicates-in-permission-filters
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
Allow defining permission filters with remote relationships in their
predicates.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Lift metadata resolve restriction for remote relationships in
permission predicates
- Abstract out the remote relationship resolving logic, in query filter,
into a new function and re-use it while resolving permission filters.
- Tests:
- A metadata build test to check the presence of essential equal
operator on source fields in relationship mapping.
- Ported all `select_many/relationship_predicate/`* tests to a new
`select_many/remote_relationship_predicate/*` with appropriate metadata
changes.
---------
Co-authored-by: Anon Ray <ecthiender@users.noreply.github.com>
V3_GIT_ORIGIN_REV_ID: 9c496ecdc9829ed626354ef85e776e1afcb0dfc7
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
This pull request optimizes the `DistinctComparisons` struct in the
codebase to improve the performance of storing and checking for distinct
comparison predicates in remote relationship comparison expressions.
<!-- Consider: do we need to add a changelog entry? -->
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Replaced `Vec` with `IndexSet`: Changed the data structure used in the
DistinctComparisons struct from `Vec` to `IndexSet` to leverage the
average O(1) complexity for contains and insert operations provided by
`IndexSet`.
- Updated push Method: Modified the push method to use `IndexSet`'s
insert method directly, which simplifies the code and improves
performance.
**Performance Improvement:**
query:
```graphql
query RemoteRelationship {
Album(where: {TracksRemote: {Name: {_ilike: "%B%"}}}) {
Title
}
}
```
The `TracksRemote` predicate query yields 723 non-distinct results,
which reduce to 266 unique results after deduplication.
Benchmark used: [graphql-bench](https://github.com/hasura/graphql-bench)
configuration: autocannon - Requests Per Second strategy (50 rps) - 10
seconds duration.
Results:
- Before Optimization:
Average Latency: 38.99 ms
- After Optimization:
Average Latency: 23.32 ms
- Percentage Decrease in Latency: Approximately 40%
V3_GIT_ORIGIN_REV_ID: 17a7160b7229eb3a2fde93273d5cf05102f9b4bd
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
`execute` is now the biggest `crate` in engine and does a lot, let's
split it into it's constituent steps.
Functional no-op.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Split out `ir` crate from the `execute` crate. Replace export of entire
modules with that of specific types / functions. Therefore, consumers
outside the crate talk about `ir::CommandInfo` rather than
`ir::command::CommandInfo`. There is no need for other crates to know
about the internal structure of this crate.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 47553aec63e80af7f95e659a170a2685e9ac2ce3
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
The `plan/type.rs` has become large and overwhelmed. This PR refactors
its code and removes it.
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
- Move code from `plan/types.rs` into old `arguments.rs`, `filter.rs`
and new `field.rs`, `query.rs`, `mutation.rs`.
- Delete `plan/types.rs`
- Refactor code in other modules to accommodate new changes.
V3_GIT_ORIGIN_REV_ID: 0e294ca8fb4bf1d8622806f5c8b72a2bb01ccdaf
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
These checks are breaking artifact generation, so disabling them until
we can find a safer way to introduce them.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: ae97c87720b67384127122ed0220383036c87bbf
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
Making this match engine.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Replace `V4` with `V6`
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: e86d118b96d41407a292f9ad4132b8ab6d06454f
### What
Telemetry-baggage is propagated via headers from incoming requests to a
service and relayed when the service itself calls another service.
However, when a service is open to the public it may not want just
anyone to be able to pass it baggage.
This PR adds the ability to configure the policy towards baggage
relaying in the tracing-util crate.
### How
When the argument `initialize_tracing(..., propagate_caller_baggage =
false)` we add to the globally defined text map propagator a derived
version of the `BaggagePropagator` which cannot extract baggage from
incoming requests, only inject its own context baggage into outgoing
requests.
V3_GIT_ORIGIN_REV_ID: af9a51c20a8fe7ae2085e8218a4f1d5e01b26ae1
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
This allows object types to be used as arguments for comparison
operators. This is useful for Elasticsearch's `range` operator, which
allows passing an object like `{ gt: 1, lt: 100 }` to an `integer` field
in order to filter items that are greater than `1` and less than `100`.
This PR has the nice side effect of dropping the requirement to use
information from scalar `BooleanExpressionType`s in place of
`DataConnectorScalarTypes`, which we only required because we were not
looking up the comparable operator information in scalar boolean
expression types correctly.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Previously, when using `ObjectBooleanExpressionType` and
`DataConnectorScalarRepresentation`, we had no information about the
argument types of comparison operators (ie, what values should I pass to
`_eq`?), and so inferred this by looking up the comparison operator in
the data connector schema, then looking for a
`DataConnectorScalarRepresentation` that tells us what OpenDD type that
maps to.
Now, with `BooleanExpressionType`, we have this information provided in
OpenDD itself:
```yaml
kind: BooleanExpressionType
version: v1
definition:
name: Int_comparison_exp
operand:
scalar:
type: Int
comparisonOperators:
- name: _eq
argumentType: Int! # This is an OpenDD type
- name: _within
argumentType: WithinInput!
- name: _in
argumentType: "[Int!]!"
```
Now we look up this information properly, as well as tightening up some
validation around relationships that was making us fall back to the old
way of doing things where the user had failed to provide a
`comparableRelationship` entry.
This means
a) we can actually use object types as comparable operator types
b) scalar boolean expression types aren't used outside the world of
boolean expressions, which is a lot easier to reason about.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: ad5896c7f3dbf89a38e7a11ca9ae855a197211e3
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
We have decided to remove the role emulation feature from engine
altogether.
More details in the RFC -
https://docs.google.com/document/d/1tlS9pqRzLEotLXN_dhjFOeIgbH6zmejOdZTbkkPD-aM/edit
### How
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: e7cb765df5afac6c6d6a05a572a832ce9910cc0b
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
We've had issues with `metadata-resolve` rejecting Elasticsearch schema
output, so adding said output to this test. Appears to work fine, so
merging it for further discussion and to improve the test case.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Add elastic search schema to test.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: ea7c39ca7ab07fc18abd08eb822d2d56fc152ae6
### What
We introduced a newtype around the NDC field alias, but we called it
`NdcFieldName`. While in reality it is the alias of the field requested
in the query.
This PR changes the name to `NdcFieldAlias`.
This is a no-op change
V3_GIT_ORIGIN_REV_ID: 8e892c29860e93243a200b6a6291fd0a32cc6fe3
### What
Part of the point of the `tracing-util` crate is to centrally enforce
usage of a single version of opentelemetry libraries.
Previously we added some support for relaying baggage, but not actually
for defining it. This PR exposes the crates and types necessary to add
baggage to the context.
V3_GIT_ORIGIN_REV_ID: 107ec652d4e812f31bbfaa362cedf44b25dc3c39
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
We have a bunch of local development infra for building the engine
inside a Docker container. This is helpful for Buildkite which doesn't
come with stuff like `cargo` preinstalled. We've not using Buildkite
anymore, let's remove it.
V3_GIT_ORIGIN_REV_ID: b4b7679aab5b14081288df25d139944f160a61fe
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
We'd like to make it simpler to try out DDN, by starting with a mode
that uses no auth.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Add a `NoAuth` `AuthConfig` mode that is configured thus:
```json
"noAuth": {
"role": "admin",
"sessionVariables": {
"x-hasura-user-id": "1"
}
}
```
Given the above config:
- If no `x-hasura-role` is sent with a request, we run it as `admin`.
- If a `x-hasura-role` header is sent and it's `admin`, it continues to
work
- If any other `x-hasura-role` header is sent, an error will happen.
- All other headers are ignored, and we always set `x-hasura-user-id` to
1
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: dddcfbee9c3a31e84dfc8013de32e3a9bf31943d
This PR adds validation code to `metadata_resolve` that prevents someone
from putting a schema/capabilities from the wrong NDC version into the
DataConnectorLink while specifying a different schema version in the
DataConnectorLink. For example:
```
kind: DataConnectorLink
version: v1
definition:
name: data_connector
schema:
version: v0.2
schema: {}
capabilities:
version: 0.1.5 # Not allowed for version v0.2!
capabilities: {}
```
This PR has two commits. One is a refactor where we rearrange the
DataConnectorError types so that the name of the data connector is
captured centrally in `NamedDataConnectorError`, so that it doesn't have
to be passed around and included in every error manually. The other is
the validation changes to `metadata_resolve`.
Completes APIPG-705
V3_GIT_ORIGIN_REV_ID: baed571f36f4cbed824ca546128f5df360d5b298
This PR updates as many tests as possible that use the custom connector
so that the tests run over two versions of the custom connector:
1. The custom connector in the repo, which currently speaks `ndc_models`
v0.2.x
2. The custom connector from the past (commit ), which is the last
version to speak `ndc_models` v0.1.x
This helps us test both the NDC v0.1.x and v0.2.x code paths. When the
postgres connector upgrades to v0.2.x, we can use the same approach as
in this PR to get the tests to run over multiple versions of the
postgres connector too, for much better coverage. This approach with the
custom connector will become less useful over time as the v0.1.x
connector is not updated and will diverge in data from the v0.2.x
connector. The postgres connector is likely to be longer-lasting, as it
is more stable.
The basic test used for `execute` integration tests is
`test_execution_expectation` (in `crates/engine/tests/common.rs`) and it
has been extended into a version called
`test_execution_expectation_for_multiple_ndc_versions` that takes
metadata on a per NDC version basis and then runs the test multiple
times, once for each NDC version. This allows one to swap out the
DataConnectorLink involved in the test to a different one that points at
either the v0.1.x or v0.2.x versions of the connector. The assertion is
that both connectors should produce the same results, even if they talk
a different version of the NDC protocol. As each version runs, we
`println!` the version so that if the test fails you can look in stdout
for the test and see which one was executing when it failed.
Tests that use the custom connector now use
`test_execution_expectation_for_multiple_ndc_versions` and run across
both connector versions. Some tests were unable to be used across both
version as the data between the two versions has changed. Some tests
were modified to avoid the changed data so as to support running across
both versions. Any tests that use `test_execution_expectation_legacy`
don't run across both versions because those tests aren't backed by the
same test implementation as
`test_execution_expectation_for_multiple_ndc_versions`.
Unfortunately the custom connector doesn't use the standard connector
SDK, so it doesn't support `HASURA_CONNECTOR_PORT`. This means that the
old connector is stuck on 8101. To work around this, I've moved the
current connector port to 8102 instead. Technically we might be able to
use docker to remap the ports, but then this binds us into always
running the connectors in docker in order to move their ports around, so
I avoided that approach.
Completes APIPG-703
V3_GIT_ORIGIN_REV_ID: fb0e410ddbee0ea699815388bc63584d6ff5dd70
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
In a recent engine change, we changed some of our trace context mapping
to use the shared settings consistently. However, we needed to make sure
we included `TraceContextResponsePropagator`, which returns the
`traceresponse` header.
Request from console after this fix:
<img width="810" alt="Screenshot 2024-07-25 at 11 58 30"
src="https://github.com/user-attachments/assets/c8e73c56-87fd-49da-a887-f91cdb6d607a">
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Adds `TraceContextResponsePropagator` to the global set of text map
propagators.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 48df6a6fe55e78a48f1dc6bf82304199a0a7e248
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
NDC query request expects relationship names which are unique across the
query.
Previously, we would generate relationship name of the form -
```
[{\"subgraph\":\"connector_2\",\"name\":\"Album\"},\"Tracks\"]
```
This works, but is harder to read while debugging. This PR changes it to
have a human-readable name like -
```
connector_2___Album__Tracks
```
This is a no-op change, apart from the relationship names in NDC query
requests.
### How
Instead of json-ifying the data structure in a tuple, create a formatted
string.
V3_GIT_ORIGIN_REV_ID: 3fea3bf56f1688bc1cade1ea2b3ed6eb60509cac
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
We've had our CI mixed between Github and Buildkite for a while, it's
time to commit. First step is moving the "tests" step to Github Actions.
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
This PR:
- Moves the `test` step to Github Actions
- Creates a new `custom_connector.Dockerfile` which builds custom
connector only, more quickly.
- Changes the metadata tests to use `localhost` instead of their Docker
internal names (ie `custom_connector` or `postgres_connector`) - this is
because the tests are being run from outside Docker now
- Removes the `test` Buildkite step
It does not:
- Remove the code coverage or benchmarks steps from Buildkite
- Tidy up `justfile` or Dockerfiles
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
---------
Co-authored-by: Philip Lykke Carlsen <plcplc@gmail.com>
V3_GIT_ORIGIN_REV_ID: a67534ebc1634a24b48d2620c45003221852e199
Previously the `sql` crate generated a v02 ndc query request and then
downgraded it to v01 if necessary. This is fragile in that its easy to
use v02 ndc features and then get v01 downgrade errors, plus the
downgrade logic is extensive and tedious.
This PR refactors the `sql` crate so that it generates `ir` and `plan`
types and eventually creates `ResolvedQueryExecutionPlan` (rather than
ndc_models types), and then the ResolvedQueryExecutionPlan is
transformed into the appropriate ndc version in the same fashion as the
main engine execute code does it. This eliminates all the downgrade
logic and simplifies things.
Unfortunately, ndc's `QueryRequest` could not just simply be replaced
with `QueryExecutionPlan` on `sql`'s `NDCQuery` and `NDCPushDown`,
because it involves lifetime parameters which are incompatible with the
datafusion framework types. Instead, the individual components of a
query are kept on `NDCQuery` and `NDCPushDown`, and these are eventually
assembled into a `ResolvedQueryExecutionPlan` at a place where the
lifetime parameters are workable. In some sense this is clearer, as one
can now see where each individual part of the query is actually created
and relevant, instead of copying around and mutating a `QueryRequest`.
Completes
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on
V3_GIT_ORIGIN_REV_ID: c4a9226c1b1addcfe5cd0bca783f1b65ab3ada38
~~Note: this PR is stacked on #845.~~ Rebased on main
This PR refactors the `execute::plan::types` further to make a clear
distinction between unresolved and resolved states. An "unresolved"
state refers to one in which remote predicates have not been computed
into local predicates. A "resolved" state is after this process is
performed and remote predicates are eliminated.
Previously, unresolved types could be passed to
`execute::plan::ndc_request` and they would fail at runtime due to the
presence of unresolved remote predicates. Now, this is impossible due to
a type-level distinction between unresolved and resolve states.
This distinction is made by type-parameterizing all
`execute::plan::types` that involve a predicate so that the predicate
type is parameterized out. Then, an `Unresolved` type alias is created
that sets the predicate type to
`execute::ir::filter::expression::Expression` (which contains remote
predicates) and a `Resolved` type alias is created that uses
`ResolvedFilterExpression` instead (which does not contain remote
predicates).
For example, for `QueryNode`, we now have:
```rust
pub struct QueryNode<'s, TFilterExpression> {
...
pub predicate: Option<TFilterExpression>,
...
}
```
And then the two aliases are:
```rust
pub type UnresolvedQueryNode<'s> = QueryNode<'s, ir::filter::expression::Expression<'s>>;
pub type ResolvedQueryNode<'s> = QueryNode<'s, ResolvedFilterExpression>;
```
Subsequently, `plan::ndc_request` only deals with `Resolved` types.
This is mostly just type-fiddling, but one place some logic moved around
is in with the old `plan::types::FilterExpression`. This was mostly a
functional duplicate of `ir::filter::execute::Expression` except that it
had a "planned" remote predicate variant in it. In order to reduce the
number of types (so we didn't need `UnresolvedFilterExpression` and
`ResolvedFilterExpression`), this type has been repurposed into
`ResolvedFilterExpression` and no longer deals with remote predicates.
Instead, `ir::filter::execute::Expression` is resolved into a
`ResolvedFilterExpression` and the planning of the remote predicate is
done at that time, just before it is resolved. This works fine, since an
entirely new ndc query is performed in order to resolve the predicate,
so planning that can be deferred until then and it doesn't need to be
done at the same time as the main query.
Part of
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on
V3_GIT_ORIGIN_REV_ID: 3ec89efbaa7b543fad6a100e2739bcc74b1d567f