This PR updates as many tests as possible that use the custom connector
so that the tests run over two versions of the custom connector:
1. The custom connector in the repo, which currently speaks `ndc_models`
v0.2.x
2. The custom connector from the past (commit ), which is the last
version to speak `ndc_models` v0.1.x
This helps us test both the NDC v0.1.x and v0.2.x code paths. When the
postgres connector upgrades to v0.2.x, we can use the same approach as
in this PR to get the tests to run over multiple versions of the
postgres connector too, for much better coverage. This approach with the
custom connector will become less useful over time as the v0.1.x
connector is not updated and will diverge in data from the v0.2.x
connector. The postgres connector is likely to be longer-lasting, as it
is more stable.
The basic test used for `execute` integration tests is
`test_execution_expectation` (in `crates/engine/tests/common.rs`) and it
has been extended into a version called
`test_execution_expectation_for_multiple_ndc_versions` that takes
metadata on a per NDC version basis and then runs the test multiple
times, once for each NDC version. This allows one to swap out the
DataConnectorLink involved in the test to a different one that points at
either the v0.1.x or v0.2.x versions of the connector. The assertion is
that both connectors should produce the same results, even if they talk
a different version of the NDC protocol. As each version runs, we
`println!` the version so that if the test fails you can look in stdout
for the test and see which one was executing when it failed.
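The per-version loop described above can be sketched roughly as follows. This is a simplified illustration and not the actual helper in `crates/engine/tests/common.rs`: the `NdcVersion` enum, the `run_for_each_ndc_version` function, and the closure signature are all hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical tag for the NDC protocol version a connector speaks.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum NdcVersion {
    V01,
    V02,
}

/// Runs `run_test` once per NDC version, passing that version's metadata
/// files, then checks that all versions produced the same result.
pub fn run_for_each_ndc_version<R, F>(
    metadata_by_version: &BTreeMap<NdcVersion, Vec<String>>,
    mut run_test: F,
) -> Vec<(NdcVersion, R)>
where
    R: PartialEq + std::fmt::Debug,
    F: FnMut(NdcVersion, &[String]) -> R,
{
    let mut results = Vec::new();
    for (version, metadata_paths) in metadata_by_version {
        // Printed so that a failing run shows which version was executing.
        println!("Testing with NDC version {version:?}");
        results.push((*version, run_test(*version, metadata_paths.as_slice())));
    }
    // Both connectors should agree, whichever protocol version they speak.
    for pair in results.windows(2) {
        assert_eq!(
            pair[0].1, pair[1].1,
            "{:?} and {:?} produced different results",
            pair[0].0, pair[1].0
        );
    }
    results
}
```

In this sketch the closure stands in for "build the engine with this metadata and execute the request"; swapping the DataConnectorLink amounts to supplying a different metadata file per version.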
Tests that use the custom connector now use
`test_execution_expectation_for_multiple_ndc_versions` and run across
both connector versions. Some tests could not be run across both
versions because the data differs between the two versions. Some tests
were modified to avoid the changed data so as to support running across
both versions. Any tests that use `test_execution_expectation_legacy`
don't run across both versions because those tests aren't backed by the
same test implementation as
`test_execution_expectation_for_multiple_ndc_versions`.
Unfortunately the custom connector doesn't use the standard connector
SDK, so it doesn't support `HASURA_CONNECTOR_PORT`. This means that the
old connector is stuck on port 8101. To work around this, I've moved the
current connector to port 8102 instead. Technically we might be able to
use docker to remap the ports, but that would tie us to always
running the connectors in docker in order to move their ports around, so
I avoided that approach.
Completes APIPG-703
V3_GIT_ORIGIN_REV_ID: fb0e410ddbee0ea699815388bc63584d6ff5dd70
<!-- The PR description should answer 2 (maybe 3) important questions:
-->
### What
In a recent engine change, we changed some of our trace context mapping
to use the shared settings consistently. However, we needed to make sure
we included `TraceContextResponsePropagator`, which returns the
`traceresponse` header.
Request from console after this fix:
<img width="810" alt="Screenshot 2024-07-25 at 11 58 30"
src="https://github.com/user-attachments/assets/c8e73c56-87fd-49da-a887-f91cdb6d607a">
<!-- What is this PR trying to accomplish (and why, if it's not
obvious)? -->
<!-- Consider: do we need to add a changelog entry? -->
### How
Adds `TraceContextResponsePropagator` to the global set of text map
propagators.
<!-- How is it trying to accomplish it (what are the implementation
steps)? -->
V3_GIT_ORIGIN_REV_ID: 48df6a6fe55e78a48f1dc6bf82304199a0a7e248
### What
NDC query request expects relationship names which are unique across the
query.
Previously, we would generate relationship names of the form:
```
[{\"subgraph\":\"connector_2\",\"name\":\"Album\"},\"Tracks\"]
```
This works, but is harder to read while debugging. This PR changes it to
a human-readable name like:
```
connector_2___Album__Tracks
```
This is a no-op change, apart from the relationship names in NDC query
requests.
### How
Instead of json-ifying the data structure in a tuple, create a formatted
string.
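A minimal sketch of the formatted-string approach, assuming the separators shown in the example above (`___` after the subgraph and `__` before the relationship name). The function name and its parameters are hypothetical, not the engine's actual code.

```rust
/// Builds a human-readable relationship name for an NDC query request.
/// The separators mirror the `connector_2___Album__Tracks` example;
/// the function itself is a hypothetical illustration.
pub fn relationship_name(subgraph: &str, type_name: &str, relationship: &str) -> String {
    format!("{subgraph}___{type_name}__{relationship}")
}
```

Calling `relationship_name("connector_2", "Album", "Tracks")` yields `connector_2___Album__Tracks`.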
V3_GIT_ORIGIN_REV_ID: 3fea3bf56f1688bc1cade1ea2b3ed6eb60509cac
### What
We've had our CI mixed between GitHub and Buildkite for a while; it's
time to commit. The first step is moving the "tests" step to GitHub Actions.
### How
This PR:
- Moves the `test` step to GitHub Actions
- Creates a new `custom_connector.Dockerfile` which builds only the
custom connector, and so builds more quickly.
- Changes the metadata tests to use `localhost` instead of their Docker
internal names (i.e. `custom_connector` or `postgres_connector`), because
the tests are now run from outside Docker
- Removes the `test` Buildkite step
It does not:
- Remove the code coverage or benchmarks steps from Buildkite
- Tidy up `justfile` or Dockerfiles
---------
Co-authored-by: Philip Lykke Carlsen <plcplc@gmail.com>
V3_GIT_ORIGIN_REV_ID: a67534ebc1634a24b48d2620c45003221852e199
### What
Our CI times are slow, thus merging is slow, and I do not feel this job
pulls its weight.
### How
Remove the code coverage job from Buildkite, delete all associated
files.
V3_GIT_ORIGIN_REV_ID: 7bbde842f9f405920d1e559b8fef88078cc8d564
Previously the `sql` crate generated a v02 ndc query request and then
downgraded it to v01 if necessary. This is fragile in that it's easy to
use v02 ndc features and then get v01 downgrade errors, plus the
downgrade logic is extensive and tedious.
This PR refactors the `sql` crate so that it generates `ir` and `plan`
types and eventually creates `ResolvedQueryExecutionPlan` (rather than
ndc_models types), and then the ResolvedQueryExecutionPlan is
transformed into the appropriate ndc version in the same fashion as the
main engine execute code does it. This eliminates all the downgrade
logic and simplifies things.
Unfortunately, ndc's `QueryRequest` could not just simply be replaced
with `QueryExecutionPlan` on `sql`'s `NDCQuery` and `NDCPushDown`,
because it involves lifetime parameters which are incompatible with the
datafusion framework types. Instead, the individual components of a
query are kept on `NDCQuery` and `NDCPushDown`, and these are eventually
assembled into a `ResolvedQueryExecutionPlan` at a place where the
lifetime parameters are workable. In some sense this is clearer, as one
can now see where each individual part of the query is actually created
and relevant, instead of copying around and mutating a `QueryRequest`.
Completes
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on
V3_GIT_ORIGIN_REV_ID: c4a9226c1b1addcfe5cd0bca783f1b65ab3ada38
~~Note: this PR is stacked on #845.~~ Rebased on main
This PR refactors the `execute::plan::types` further to make a clear
distinction between unresolved and resolved states. An "unresolved"
state refers to one in which remote predicates have not been computed
into local predicates. A "resolved" state is after this process is
performed and remote predicates are eliminated.
Previously, unresolved types could be passed to
`execute::plan::ndc_request` and they would fail at runtime due to the
presence of unresolved remote predicates. Now, this is impossible due to
a type-level distinction between unresolved and resolved states.
This distinction is made by type-parameterizing all
`execute::plan::types` that involve a predicate so that the predicate
type is parameterized out. Then, an `Unresolved` type alias is created
that sets the predicate type to
`execute::ir::filter::expression::Expression` (which contains remote
predicates) and a `Resolved` type alias is created that uses
`ResolvedFilterExpression` instead (which does not contain remote
predicates).
For example, for `QueryNode`, we now have:
```rust
pub struct QueryNode<'s, TFilterExpression> {
...
pub predicate: Option<TFilterExpression>,
...
}
```
And then the two aliases are:
```rust
pub type UnresolvedQueryNode<'s> = QueryNode<'s, ir::filter::expression::Expression<'s>>;
pub type ResolvedQueryNode<'s> = QueryNode<'s, ResolvedFilterExpression>;
```
Subsequently, `plan::ndc_request` only deals with `Resolved` types.
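To illustrate why this makes the runtime failure impossible, here is a simplified, self-contained sketch of the same pattern. The expression types, the `resolve` method, and `make_ndc_request` are stand-ins invented for the example (the lifetime parameter is dropped for brevity), so treat it as an illustration of the technique and not the engine's code.

```rust
/// Stand-in for an expression that may still contain remote predicates.
#[derive(Debug, Clone, PartialEq)]
pub enum UnresolvedExpression {
    Local(String),
    Remote(String),
}

/// Stand-in for an expression with all remote predicates eliminated.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedExpression(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct QueryNode<TFilterExpression> {
    pub limit: Option<u32>,
    pub predicate: Option<TFilterExpression>,
}

pub type UnresolvedQueryNode = QueryNode<UnresolvedExpression>;
pub type ResolvedQueryNode = QueryNode<ResolvedExpression>;

impl UnresolvedQueryNode {
    /// `resolve_remote` stands in for running the extra NDC query that
    /// turns a remote predicate into a local one.
    pub fn resolve(self, resolve_remote: impl Fn(&str) -> String) -> ResolvedQueryNode {
        QueryNode {
            limit: self.limit,
            predicate: self.predicate.map(|predicate| match predicate {
                UnresolvedExpression::Local(local) => ResolvedExpression(local),
                UnresolvedExpression::Remote(remote) => {
                    ResolvedExpression(resolve_remote(&remote))
                }
            }),
        }
    }
}

/// Accepts only resolved nodes, so passing an `UnresolvedQueryNode` here
/// is rejected by the compiler instead of failing at runtime.
pub fn make_ndc_request(node: &ResolvedQueryNode) -> String {
    match &node.predicate {
        Some(ResolvedExpression(predicate)) => format!("where {predicate}"),
        None => "no predicate".to_string(),
    }
}
```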
This is mostly just type-fiddling, but one place some logic moved around
is with the old `plan::types::FilterExpression`. This was mostly a
functional duplicate of `ir::filter::expression::Expression` except that it
had a "planned" remote predicate variant in it. In order to reduce the
number of types (so we didn't need `UnresolvedFilterExpression` and
`ResolvedFilterExpression`), this type has been repurposed into
`ResolvedFilterExpression` and no longer deals with remote predicates.
Instead, `ir::filter::expression::Expression` is resolved into a
`ResolvedFilterExpression` and the planning of the remote predicate is
done at that time, just before it is resolved. This works fine, since an
entirely new ndc query is performed in order to resolve the predicate,
so planning that can be deferred until then and it doesn't need to be
done at the same time as the main query.
Part of
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on
V3_GIT_ORIGIN_REV_ID: 3ec89efbaa7b543fad6a100e2739bcc74b1d567f
In rare cases, sensitive headers were written to the GraphQL Engine logs. These cases are now handled correctly to avoid logging sensitive information.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/10946
GitOrigin-RevId: a9d66532245789a16cf91936d53bbfce19d272a4
This PR removes usages of ndc_models types from the `ir` types and the
`plan::types` in the `execute` crate. This is done to isolate the IR and
planning logic from the specific version of NDC that needs to be used to
talk to the specific data connector. Once planning is done, the
`plan::types` are mapped into `ndc_models` types via
`plan::ndc_request::v01` and `plan::ndc_request::v02`. Those two modules
contain code that maps `plan::types` to a specific ndc_models version.
This code is entirely separate per version.
Now, when the `plan::types` are `resolve`d, they no longer return NDC
types. Instead they return themselves, modified to remove any remote
predicates. After `resolve`, the types are in a state in which they can
be converted into a single NDC request via `plan::ndc_request`.
Next steps for a future PR: `QueryExecutionPlan` (etc) needs to type
parameterize out the `FilterExpression` type so that after `resolve`, a
new `QueryExecutionPlan` is returned that uses another FilterExpression
type that does not have remote predicates in it. This will make
resolving typesafe and prevent accidentally using `plan::ndc_request` on
unresolved `QueryExecutionPlan`s.
The `sql` crate also needs to stop talking ndc directly and probably
talk `plan::types` instead. Right now it writes ndc v0.2.0 and then
downgrades it to v0.1.0 if necessary, which is awful. Talking
`plan::types` is currently not possible due to lifetime parameters used
on the `plan::types`. These may be able to be removed from the resolved
variant of those types as mentioned above.
Part of
https://linear.app/hasura/issue/APIPG-702/implement-separate-logic-that-maps-engine-types-to-ndc-models-types-on
V3_GIT_ORIGIN_REV_ID: b536009ea784d1486a2ece2262e0ce9d0f937ef0
### What
When a boolean expression was passed as an argument, it was not being
translated into an `ndc_models::Expression`, and so queries failed.
### How
Look up the type of an argument in `metadata-resolve`, and mark the
`ArgumentInfo` with a new `ArgumentKind`. Then, in the IR step, we use
that to work out whether to turn the argument into JSON as before, or
translate it into an `Expression` type that will eventually be turned
into an `ndc_models::Expression`.
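The branching described above might look roughly like this. The variant names of `ArgumentKind`, the `IrArgument` type, and `argument_to_ir` are assumptions made for the sketch; the real code would build a structured expression where this sketch just keeps the raw input.

```rust
/// Hypothetical classification attached to each argument during metadata
/// resolution.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgumentKind {
    /// The argument's type is a boolean expression type.
    BooleanExpression,
    /// Any other argument; passed through as JSON as before.
    Other,
}

/// Hypothetical IR-level argument. The payloads are plain strings here to
/// keep the sketch self-contained.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IrArgument {
    Literal(String),
    BooleanExpression(String),
}

/// Picks the IR representation for an argument based on its kind.
pub fn argument_to_ir(kind: ArgumentKind, raw_json: &str) -> IrArgument {
    match kind {
        ArgumentKind::Other => IrArgument::Literal(raw_json.to_string()),
        ArgumentKind::BooleanExpression => IrArgument::BooleanExpression(raw_json.to_string()),
    }
}
```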
V3_GIT_ORIGIN_REV_ID: 4da3ce0ae04895c33de2b6bdb6fff1018c39b3ad
### What
This PR enables the use of OpenTelemetry Baggage. Every bit of baggage
is then replicated on every span.
The current implementation does not actually set any baggage itself - it
only relays and outputs what it's getting.
Crafting a request with a `baggage` header set:
![image](https://github.com/user-attachments/assets/f2974398-370a-4e8c-8761-692cfc5682f6)
Has it propagated (here, to `dev-auth-webhook`) and stamped onto every
span:
![image](https://github.com/user-attachments/assets/6661c41f-56be-4edd-9027-e88eb816f1e7)
### How
This PR makes the engine and auth-hook use the globally specified
propagators (before, they would only use an obscure, concrete,
re-exported one from opentelemetry_contrib), and adds the
`BaggagePropagator` to the list.
It also adds a `SpanProcessor` which outputs the baggage as span
attributes. Currently it outputs all Baggage entries, but can be made
more specific in the future if we want to treat Baggage differently.
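As a rough model of what "outputs the baggage as span attributes" means, the following std-only sketch parses a `baggage` header value and copies every entry onto a span. It deliberately does not use the OpenTelemetry crates: `parse_baggage`, `Span`, and `stamp_baggage` are hypothetical stand-ins, and the parser ignores member properties and percent-encoding.

```rust
use std::collections::BTreeMap;

/// Parses a `baggage` header value such as `userId=alice,tenant=acme`.
/// Properties after `;` and members without `=` are ignored.
pub fn parse_baggage(header: &str) -> Vec<(String, String)> {
    header
        .split(',')
        .filter_map(|member| {
            let member = member.split(';').next()?.trim();
            let (key, value) = member.split_once('=')?;
            let (key, value) = (key.trim(), value.trim());
            if key.is_empty() {
                None
            } else {
                Some((key.to_string(), value.to_string()))
            }
        })
        .collect()
}

/// Minimal stand-in for a span that carries string attributes.
#[derive(Debug, Default)]
pub struct Span {
    pub attributes: BTreeMap<String, String>,
}

/// Copies every baggage entry onto the span, as the PR describes doing
/// for all entries.
pub fn stamp_baggage(span: &mut Span, baggage: &[(String, String)]) {
    for (key, value) in baggage {
        span.attributes.insert(key.clone(), value.clone());
    }
}
```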
V3_GIT_ORIGIN_REV_ID: 3b5b8604b624c0b90c192e68b3b57fab7ca9b63e
### What
Sometimes our builds succeed, but we'd like to tell the user how they
could do better. This implements the simplest possible warnings system.
<img width="957" alt="Screenshot 2024-07-18 at 16 06 49"
src="https://github.com/user-attachments/assets/ff91d221-667a-43f9-bc8a-51bf4574a7b8">
### How
Warnings are printed to stdout on `v3-engine` startup, and will be
returned to the CLI via `v3-metadata-build-service`. The diff is mostly
updated snapshot tests as we're returning more from
`metadata-resolve::resolve` now.
V3_GIT_ORIGIN_REV_ID: d01520e53f49d9b594e94a4531b6a86e749875c3
### What
Previously, while generating relationship definitions for NDC, we would
ignore columns with nested selection.
This PR fixes that.
Closes https://hasurahq.atlassian.net/browse/V3ENGINE-247
### How
While matching on `FieldSelection::Column`, don't ignore it. Instead,
check whether it contains a nested selection and, if it does, call
`collect_relationships_from_nested_selection`.
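A simplified sketch of the traversal, assuming a cut-down `FieldSelection` type. The field names and the single recursive `collect_relationships` function are hypothetical; the engine uses a separate `collect_relationships_from_nested_selection` helper for the nested case.

```rust
use std::collections::BTreeSet;

/// Cut-down, hypothetical version of the IR field selection.
pub enum FieldSelection {
    Column {
        name: String,
        nested_selection: Option<Vec<FieldSelection>>,
    },
    Relationship {
        name: String,
        selection: Vec<FieldSelection>,
    },
}

/// Collects the names of all relationships used in a selection, including
/// those that only appear inside a column's nested selection.
pub fn collect_relationships(selection: &[FieldSelection], found: &mut BTreeSet<String>) {
    for field in selection {
        match field {
            // Previously this arm was effectively a no-op, so relationships
            // under nested columns were missed.
            FieldSelection::Column { nested_selection, .. } => {
                if let Some(nested) = nested_selection {
                    collect_relationships(nested, found);
                }
            }
            FieldSelection::Relationship { name, selection } => {
                found.insert(name.clone());
                collect_relationships(selection, found);
            }
        }
    }
}
```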
V3_GIT_ORIGIN_REV_ID: 9db94744d8e2d35f8430bded07209ef519175205
### What
This PR fixes issues in the SQL layer where the following queries would
fail:
1. `select count(*) from "Track"`
2. `select * from "Track" where id = 1`
### How
These were failing because the built-in analyzer rules that rewrite
`count(*)` and type-cast expressions weren't firing.
`with_analyzer_rules` replaces the analyzer rules of a session context
with the given list. We want our analyzer rule to be fired in addition
to the built-in analyzer rules.
Tests are being worked on in a separate PR.
V3_GIT_ORIGIN_REV_ID: 42231f97b5b28d9b7eeff0c3e592cb43ff7d952f
### What
We're having issues with our deduplication of names in JSONSchema. We
would like to fix this properly, but in the short term, this PR renames
a conflicting object to sidestep the problem quickly.
### How
Rename `ArgumentPreset` in `open_dds::data_connectors` to
`DataConnectorArgumentPreset`.
V3_GIT_ORIGIN_REV_ID: e3eafeffe8ba4d513f9d0a09a623f101650247ea
Alpine is often slower than Debian (because musl is sometimes slower than glibc) and is not how anyone actually deploys PostgreSQL in production.
Most notably, some floating-point computations result in slightly different values, and Debian ships with better support for different locales so sorting text (e.g. with `ORDER BY`) behaves differently.
Let's test against an environment that people are likely to actually use.
As a result, there are slight changes to the results of PostGIS computations in a couple of test cases.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/10942
GitOrigin-RevId: 4caed19def23a372fc3930c409514b1c9b385026
### What
Update the changelog for the new release.
### How
:)
V3_GIT_ORIGIN_REV_ID: 6620f9923f393d190f0f2fab1aced1cff4d6aec0
### What
This stage already has its own error type, but it returns the larger
`Error` type, so let's return the more specific type instead.
### How
Changing return types mostly. Functional no-op.
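The shape of this kind of change can be sketched as below. `StageError`, `TopLevelError`, and the duplicate-name check are hypothetical and only serve to show a stage returning its own narrow error type, with the caller converting it through `From` when it needs the larger one.

```rust
use std::collections::BTreeSet;

/// Hypothetical error type owned by a single resolve stage.
#[derive(Debug, PartialEq)]
pub enum StageError {
    DuplicateName(String),
}

/// Hypothetical stand-in for the larger, catch-all error type.
#[derive(Debug, PartialEq)]
pub enum TopLevelError {
    Stage(StageError),
    Other(String),
}

impl From<StageError> for TopLevelError {
    fn from(error: StageError) -> Self {
        TopLevelError::Stage(error)
    }
}

/// The stage now returns its own, more specific error type.
pub fn resolve_stage(names: &[&str]) -> Result<usize, StageError> {
    let mut seen = BTreeSet::new();
    for name in names {
        if !seen.insert(*name) {
            return Err(StageError::DuplicateName(name.to_string()));
        }
    }
    Ok(seen.len())
}

/// Callers that still need the large type get it via `?` and `From`.
pub fn resolve(names: &[&str]) -> Result<usize, TopLevelError> {
    let count = resolve_stage(names)?;
    Ok(count)
}
```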
V3_GIT_ORIGIN_REV_ID: 2aae1f06775db6d88c34b1d3c1779396e0ba410e
### What
More breaking down of the big error type; this time we sort out the
`data_connector_scalar_types` stage. Functional no-op.
### How
Move error cases into a smaller enum.
V3_GIT_ORIGIN_REV_ID: 50b699f3a77594deb27a6cc6ab8dd61752404daf
Users can now filter their queries using remote relationships in the
filter predicate. Users need to list the relationships available for
comparison in the `comparableRelationships` field of the newer
`BooleanExpressionType` OpenDD metadata.
Minimal Algorithm:
```
Relationship: ARemoteB => Model_A -> Model_B (remote NDC)
Column Mapping: (A_column_1, B_column_1), (A_column_2, B_column_2).
query:
Model_A:
where: ARemoteB: {B_column_3: {_eq: value}}
Step 1: Fetch RHS column values (in mapping) from remote target model
SELECT B_column_1, B_column_2 from model_b_collection WHERE B_column_3 = value;
yields the following rows
[
[(B_column_1, b_value_1), (B_column_2, b_value_2)],
[(B_column_1, b_value_11), (B_column_2, b_value_22)],
]
Step 2: Using the above rows, generate the LHS column filter for the Model_A query.
SELECT <fields> from model_a_collection WHERE
((A_column_1 = b_value_1) AND (A_column_2 = b_value_2))
OR ((A_column_1 = b_value_11) AND (A_column_2 = b_value_22))
The above comparison is equivalent to
WHERE
(A_column_1, A_column_2) IN ((b_value_1, b_value_2), (b_value_11, b_value_22))
```
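Step 2 of the algorithm above can be sketched in Rust as follows. The `Expression` enum and `build_lhs_filter` are hypothetical, simplified stand-ins (values are plain strings), intended only to show how the rows fetched in Step 1 become an OR of ANDed equality comparisons.

```rust
/// Simplified, hypothetical filter expression.
#[derive(Debug, Clone, PartialEq)]
pub enum Expression {
    Eq { column: String, value: String },
    And(Vec<Expression>),
    Or(Vec<Expression>),
}

/// Builds the LHS filter from the rows fetched from the remote model.
///
/// `column_mapping` holds `(lhs_column, rhs_column)` pairs and each row in
/// `remote_rows` holds `(rhs_column, value)` pairs.
pub fn build_lhs_filter(
    column_mapping: &[(&str, &str)],
    remote_rows: &[Vec<(&str, &str)>],
) -> Expression {
    let mut row_filters = Vec::new();
    for row in remote_rows {
        let mut comparisons = Vec::new();
        for (lhs_column, rhs_column) in column_mapping {
            // Find the fetched value for the mapped RHS column.
            if let Some((_, value)) = row.iter().find(|(column, _)| column == rhs_column) {
                comparisons.push(Expression::Eq {
                    column: lhs_column.to_string(),
                    value: value.to_string(),
                });
            }
        }
        row_filters.push(Expression::And(comparisons));
    }
    // An empty `Or` means the remote query matched nothing, so no LHS row
    // can satisfy the predicate.
    Expression::Or(row_filters)
}
```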
Sample query:
```graphql
query MyQuery {
Track(
where: {
_or: [
{ AlbumRemote: { Artist: { ArtistId: { _eq: 2 } } } }
{ TrackId: { _eq: 3 } }
]
}
) {
TrackId
AlbumRemote {
Artist {
ArtistId
Name
}
}
}
}
```
In the query above, `AlbumRemote` is a remote relationship which targets
a model backed by a different data connector.
V3_GIT_ORIGIN_REV_ID: 7aa76fcae83e1f22de460f1eef5648fb7a35c047
### What
Much like this change, but for the `object_boolean_expressions` stage:
https://github.com/hasura/v3-engine/pull/843
### How
V3_GIT_ORIGIN_REV_ID: 85da185b45bac429b754b0b92419f378a59fb536
### What
This PR fixes an issue where relationships that target commands do not
correctly use the data connector's argument name when making the ndc
request. Instead, they use the OpenDD argument name, which is incorrect.
For metadata where the OpenDD argument name is the same as the data
connector's argument name, the code works but only coincidentally.
### How
I've updated an existing test to change the name of the command argument
to be different from the data connector's argument name. This test
failed but is now fixed by this PR, which simply looks up the name of
the data connector argument name and uses that instead.
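The lookup can be illustrated with a small sketch. The function name, the use of string maps, and the error message are assumptions for the example; the point is only that arguments are keyed by the data connector's name in the outgoing request, not by the OpenDD name.

```rust
use std::collections::BTreeMap;

/// Re-keys arguments from OpenDD argument names to data connector argument
/// names using the command's argument mapping (hypothetical signature).
pub fn to_connector_arguments(
    opendd_arguments: &BTreeMap<String, String>,
    argument_mappings: &BTreeMap<String, String>,
) -> Result<BTreeMap<String, String>, String> {
    let mut connector_arguments = BTreeMap::new();
    for (opendd_name, value) in opendd_arguments {
        let connector_name = argument_mappings
            .get(opendd_name)
            .ok_or_else(|| format!("no argument mapping for {opendd_name}"))?;
        connector_arguments.insert(connector_name.clone(), value.clone());
    }
    Ok(connector_arguments)
}
```

When the two names happen to be identical the mapping is the identity, which is why the old behaviour appeared to work for such metadata.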
V3_GIT_ORIGIN_REV_ID: 71f1e812174c7bb9922792523129e4bcdce911ed
### What
Filtering on nested arrays doesn't work, so let's make sure it's not
allowed for now.
### How
Add a check in the `boolean_expression_types` stage.
---------
Co-authored-by: Gil Mizrahi <gil@gilmi.net>
V3_GIT_ORIGIN_REV_ID: cc08e8c24098c1fea9b6e1ee61b82ade989dd29a
This PR adds true support for ndc_models v0.2.0 to v3-engine. Note that
v0.2.0 is not finalized yet, so we're pointing at v0.2.0-rc0. The
support still comes via the migration methodology, where v0.2.x ndc
models are downgraded to v0.1.x to support backwards compatibility. In
the future we want to remove this and have the engine generate the
different versioned ndc models separately instead of performing a
migration.
The ndc_models_v01 crate reference has been bumped to the official
v0.1.5 version, which brings the newtypes to the v0.1.x version. The
ndc_models crate reference is now on v0.2.0-rc0.
The custom connector has been updated to support ndc-spec v0.2.0. All
tests that talk to the custom connector have been updated with its
latest v0.2.0 schema/capabilities.
In `metadata_resolve` the v01->v02 schema/capabilities migration code
has been updated to handle the new v0.2.0 types. This includes inferring
v0.2.0 capabilities from what was possible in v0.1.x.
In `execution`, the migration code has been updated to deal with the new
v0.1.5 newtypes and v0.2.0 types. This means there are now cases where a
downgrade is impossible and produces an error (see `NdcDowngradeError`
in `execute::ndc::migration`). A bug has also been fixed where NDC
expressions in arguments were not being serialized to the correct NDC
version.
V3_GIT_ORIGIN_REV_ID: 5b4afcde64c307b2bd7c985c588d6c74d9623a0f
### What
Much like https://github.com/hasura/v3-engine/pull/824, we combine
relay-related errors into `RelayError`.
### How
Remove them from the big `Error` type.
---------
Co-authored-by: Daniel Chambers <daniel@hasura.io>
V3_GIT_ORIGIN_REV_ID: b26460c6aa4d622c6f5548e5cd294c7480acdca4
### What
Part of the ongoing tidy-up of errors, this splits out error types for
the boolean expression stages.
### How
Remove things from `Error`, move files around. Functional no-op.
V3_GIT_ORIGIN_REV_ID: ccf1f29600a169a3787d744c7f60e79220aef8d2
### What
To stop us confusing the `Error` type with the `Error` trait.
### How
Import `thiserror::Error` explicitly in place.
V3_GIT_ORIGIN_REV_ID: b930480927b2c64537960cfb69f2b2b30921f4fd
This PR fixes the custom connector whose schema endpoint doesn't
actually return correct output (it was missing some
functions/procedures, etc). Then it updates all tests that actually talk
to the custom connector with the latest version of its
schema/capabilities in their DataConnectorLink.
This test update is done by a new script added to the justfile that
finds and patches all metadata json files and inserts the new schema and
capabilities after reading them from the custom connector running in
docker.
V3_GIT_ORIGIN_REV_ID: f1825a6f74ddcb6c01198fe4a41de6b4fc0bf533
### What
As a treat, combine all Apollo errors into one enum.
### How
Remove items from `ObjectTypesError` and `Error`.
V3_GIT_ORIGIN_REV_ID: 5a16a030b35372283490f3de7343fcfca2fadea5
### What
And don't set it except for the main application; tests and test
infrastructure do not care.
### How
We use the `VERSION` constant, populated from the `RELEASE_VERSION`
environment variable at build time.
It's now also optional so tests don't have to specify it.
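One way a build-time constant like this can be made optional is with `option_env!`, sketched below. Whether the engine uses `option_env!`, and the `"unknown"` fallback in `version_label`, are assumptions for the example and not taken from the PR.

```rust
/// Read from the `RELEASE_VERSION` environment variable at compile time.
/// `None` when the variable was not set during the build (assumed approach).
pub const VERSION: Option<&str> = option_env!("RELEASE_VERSION");

/// Returns a printable version, using a placeholder when none was
/// provided (for example in tests). The placeholder is hypothetical.
pub fn version_label(version: Option<&str>) -> &str {
    version.unwrap_or("unknown")
}
```

The main application would pass `VERSION`, while tests can simply pass `None`.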
V3_GIT_ORIGIN_REV_ID: 1bfc2efb060307cc9446bf07e944e107f0607ae0