Upgrade to GHC 9.4.5, and update any tests.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8954
Co-authored-by: Mohd Bilal <24944223+m-Bilal@users.noreply.github.com>
Co-authored-by: Samir Talwar <47582+SamirTalwar@users.noreply.github.com>
Co-authored-by: Philip Lykke Carlsen <358550+plcplc@users.noreply.github.com>
GitOrigin-RevId: 5261126777cb478567ea471c4bf5441bc345ea0d
When upgrading to GHC v9.4, we noticed a number of failures because the sort order of HashMaps has changed. With this changeset, I am endeavoring to mitigate this now and in the future.
This makes one of two changes in a few areas where we depend on the sort order of elements in a `HashMap`:
1. the ordering of the request is preserved with `InsOrdHashMap`, or
2. we sort the data after retrieving it.
Fortunately, we do not do this anywhere where we _must_ preserve order; it's "just" descriptions, error messages, and OpenAPI metadata. The main problem is that tests are likely to fail each time we upgrade GHC (or whatever is providing the hash seed).
[NDAT-705]: https://hasurahq.atlassian.net/browse/NDAT-705?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9390
GitOrigin-RevId: 84503e029b44094edbbc298651744bc2843c15f3
## Description
This is the first step in making use of Logical Models with document databases such as MongoDB. As part of schema introspection, a data connector agent can supply a set of custom types that can be used to describe the schema for columns within the tables of the database (or _fields_ within a _document collection_ in MongoDB terminology).
Previously, we were storing these custom types as `TableObjectType`s within the `TableCoreInfo` for each table.
In this PR we
- replace the `TableObjectTypes` with `LogicalModel` types
- store these directly within the `DBObjectsIntrospection` instead of within the `TableCoreInfo` for each table. (The custom types are shared at the source level so there was no reason to have a separate set of types for each table.)
- When building the `SourceInfo`, we combine the `LogicalModel`s from `DBObjectsIntrospection` with `LogicalModel`s from the user's metadata to create the set of `LogicalModels` in the `SourceInfo` within the `SchemaCache`. I.e. we combine the set of types obtained by database introspection with the set of types specified by the user in the metadata. If two types have the same name, we use the type defined in the metadata.
## Limitations and future work
- Provide a way for the user to associate a meta-data defined `LogicalModel` with a table instead of requiring one to be provided by DB introspection
- Provide a way for the user to edit the `LogicalModel` types provided by introspection and add them to the metadata.
- Allow a `LogicalModel` object type to describe and entire table rather than just individual columns.
- Better handling for "unknown" types, e.g. if the type of a collection (or part of a collection) is unknown we should treat it as a JSON scalar value. This may also involve adding an `_everything` field which returns the full document as a JSON scalar.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9345
GitOrigin-RevId: 5cec72fc1be1380d8600f7be547bbf71aad770bd
## Description
This change adds support for querying into nested arrays in Data Connector agents that support such a concept (currently MongoDB).
### DC API changes
- New API type `ColumnType` which allows representing the type of a "column" as either a scalar type, an object reference or an array of `ColumnType`s. This recursive definition allows arbitrary nesting of arrays of types.
- The `type` fields in the API types `ColumnInfo` and `ColumnInsertSchema` now take a `ColumnType` instead of a `ScalarType`.
- To ensure backwards compatibility, a `ColumnType` representing a scalar serialises and deserialises to the same representation as `ScalarType`.
- In queries, the `Field` type now has a new constructor `NestedArrayField`. This contains a nested `Field` along with optional `limit`, `offset`, `where` and `order_by` arguments. (These optional arguments are not yet used by either HGE or the MongoDB agent.)
### MongoDB Haskell agent changes
- The `/schema` endpoint will now recognise arrays within the JSON validation schema and generate corresponding arrays in the DC schema.
- The `/query` endpoint will now handle `NestedArrayField`s within queries (although it does not yet handle `limit`, `offset`, `where` and `order_by`).
### HGE server changes
- The `Backend` type class adds a new type family `XNestedArrays b` to enable nested arrays on a per-backend basis (currently enabled only for the `DataConnector` backend.
- Within `RawColumnInfo` the column type is now represented by a new type `RawColumnType b` which mirrors the shape of the DC API `ColumnType`, but uses `XNestedObjects b` and `XNestedArrays b` type families to allow turning nested object and array supports on or off for a particular backend. In the `DataConnector` backend `API.CustomType` is converted into `RawColumnInfo 'DataConnector` while building the schema.
- In the next stage of schema building, the `RawColumnInfo` is converted into a `StructuredColumnInfo` which allows us to represent the three different types of columns: scalar, object and array. TODO: the `StructuredColumnInfo` looks very similar to the Logical Model types. The main difference is that it uses the `XNestedObjects` and `XNestedArrays` type families. We should be able to combine these two representations.
- The `StructuredColumnInfo` is then placed into a `FIColumn` `FieldInfo`. This involved some refactoring of `FieldInfo` as I had previously split out `FINestedObject` into a separate constructor. However it works out better to represent all "column" fields (i.e. scalar, object and array) using `FIColumn` as this make it easier to implement permission checking correctly. This is the reason the `StructuredColumnInfo` was needed.
- Next, the `FieldInfo` are used to generate `FieldParser`s. We add a new constructor to `AnnFieldG` for `AFNestedArray`. An `AFNestedArray` field parser can contain either a simple array selection or an array aggregate. Simple array `FieldParsers` are currently limited to subfield selection. We will add support for limit, offset, where and order_by in a future PR. We also don't yet generate array aggregate `FieldParsers.
- The new `AFNestedArray` field is handled by the `QueryPlan` module in the `DataConnector` backend. There we generate an `API.NestedArrayField` from the AFNestedArray. We also handle nested arrays when reshaping the response from the DC agent.
## Limitations
- Support for limit, offset, filter (where) and order_by is not yet fully implemented, although it should not be hard to add this
- Support for aggregations on nested arrays is not yet fully implemented
- Permissions involving nested arrays (and objects) not yet implemented
- This should be integrated with Logical Model types, but that will happen in a separate PR
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9149
GitOrigin-RevId: 0e7b71a994fc1d2ca1ef73bfe7b96e95b5328531
pro is a superset of the functionality of oss, so it's a better target,
and we expect benchmark results to be indicative of oss.
also tweak regression report view, tune highlighting of throughput benchmarks to be more lax
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8897
GitOrigin-RevId: c2d89cb61a841c9b5c76203bdcb4756ae9d7381f
Upgrades Citus to v11.3.0 in tests.
This breaks an assumption made by the tests for the `get_source_tables` metadata API, in which data is expected to be ordered. We fix it by explicitly ordering rather than relying on the goodwill of the database.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9039
GitOrigin-RevId: ee86db7e1c264d5009bb0203ac2f3fb2cda7b39f
I also pinned Citus to v11.3. This should hopefully stop us from being surprised with random test failures in the future. We will need to bump this every now and again.
I have updated the Makefile to standardize Docker commands, and made sure we start all the containers even when running tests for a single database, as we need to test cross-DB remote joins. This ensures `make test-citus` actually works and runs all tests.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9035
Co-authored-by: Daniel Harvey <4729125+danieljharvey@users.noreply.github.com>
GitOrigin-RevId: 9c36ab65eb05206bfddd639c257d6c5c5cedd2bd
These tests ensure that upgrading HGE preserves the GraphQL schema.
They do this by running two different versions of HGE against the same metadata, and ensuring that the GraphQL schema doesn't change.
We might find that in the future, we make an additive change that makes these tests fail. Improving the tests to allow for this is left as an exercise to whoever triggers it. (Sorry.)
Currently, we do this with:
* an empty database (zero tracked relations)
* the Chinook dataset
* the "huge schema" dataset
The base version of HGE tested against can be overridden with an option. The version must be available on Docker Hub.
Further information is in the Haddock documentation.
[NDAT-627]: https://hasurahq.atlassian.net/browse/NDAT-627?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8982
GitOrigin-RevId: 97b4deda1e6fe1db33ce35db02e12c6acc6c29e3
This version includes an important fix for properly detecting changes in extra-source-files.
We also pin the same version in GitHub workflows.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8930
GitOrigin-RevId: 884d9b807041a9b6c622ab221cbdfb446d3ffdba
This fixes the simple HLint warnings, and adds a few suppressions to avoid noise.
The suppressions don't really solve the problems, but I think the warnings here are quite benign and I'm uncomfortable with how likely I would be to introduce a bug during refactoring.
In the case of _pg-client_ and _resource-pool_, we can't use the recommended functions anyway, and there doesn't seem to be a way to tell HLint to ignore entire packages.
I have updated the `make` targets to only fail if errors or warnings are found, not suggestions. This brings it in line with the CI job.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8910
GitOrigin-RevId: 596277b4ae5833876fc3f43875208c1279518a59
We seem to be getting flakes where we try and use the same port for two different servers. This is because in certain cases we cannot simply allocate the port dynamically, but have to decide it in advance, leading to a race condition.
We resolve this by keeping track of the ports we allocate when using this method, making sure we never allocate them twice. We also make sure we allocate from a different pool of ports to the usual dynamic port pool (typically above port 32768, and often above port 49152).
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8903
GitOrigin-RevId: 375a23867591a4566493dddbc550c58cf88ea392
So you can do:
```
$ HASURA_GRAPHQL_EE_LICENSE_KEY=... scripts/dev.sh graphql-engine-pro
```
along with the `--prof-*` modes etc.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8894
GitOrigin-RevId: 2257749b2936cbd3230beb23e774ac92989e2fbc
I'm hoping this will reduce the flakiness of the tests, by ensuring the servers stop appropriately.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8880
GitOrigin-RevId: 6379b18cbe43527a1c2e73a14d93179b54bf1b75
Trying to reduce the nondeterminism in the allowlist benchmarks.
Thanks to @jberryman for spotting this.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8842
GitOrigin-RevId: 1369050e4c5844538ef0852f4347d7a72dd18287
These tests are intended to catch issues in upgrading HGE. However:
* the tests are very convoluted and hard to understand,
* we can only run a small subset of Python tests that don't mutate any data or metadata, and
* I have never seen them fail for a legitimate reason, but I've seen a lot of flakes.
While we do believe it's important to test that upgrades don't break the stored introspection, these tests don't seem to be doing that any more. I humbly request that we delete them now and either (a) figure out how to test this properly, or (b) just wait for v3, which does away with reintrospecting on server startup entirely.
[NDAT-259]: https://hasurahq.atlassian.net/browse/NDAT-259?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8844
GitOrigin-RevId: 528bc632fce377b7eff2026b832bd58586ac5a0b
This rewrites the JWT tests to generate and specify the secrets per test class, and to provide the server configuration to the HGE fixture.
It covers the tests in:
- *test_jwt.py*
- *test_jwt_claims_map.py*
- *test_config_api.py*
- *test_graphql_queries.py* (just a couple here)
This does reduce the number of code paths exercised with JWT, as we were previously running *all* tests with JWT tokens. However, this seems excessive; we don't need to tread every code path, just enough to ensure we handle the tokens appropriately. I believe that the test coverage in *test_jwt.py* does this well enough (though I'd prefer if we moved the coverage lower down in the stack as unit tests).
These tests were configured in multiple different ways by *test-server.sh*; this configuration is now moved to test subclasses within the various files. This results in a bit of duplication.
Unfortunately, the tests would ideally use parameterization rather than subclassing, but that doesn't work because of `hge_fixture_env`, which creates a "soft" dependency between the environment variables and `hge_server`. Parameterizing the former *should* force the latter to be recreated for each new set of environment variables, but `hge_server` isn't actually aware there's a dependency.
It currently looks like this adds lines of code; we'll more than make up for it when we delete the relevant lines from *test-server.sh*. I am not doing that here because I plan on deleting the whole file in a subsequent changeset.
[NDAT-538]: https://hasurahq.atlassian.net/browse/NDAT-538?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8803
GitOrigin-RevId: f7f2caa62de0b0a45e42964b69a8ae73d1575fe8
See this earlier iteration of this work for an example of the kind of report we're producing: #7664
And related work in this repo: github.com:hasura/graphql-bench-helper
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/7923
GitOrigin-RevId: 99d2a55e2fb5b55f3f33e2570cfd0bc23e448e0c
This adds the ability to capture logs to the HGE fixture, and uses this in test_logging.py to analyze the logs, instead of relying on a shell script redirecting the logs to a file.
We then inject the logs into the tests and parse the JSON. Because we're no longer reading a file, we need to do this in a separate thread, as we'll block on reading rather than the stream ending. (Once HGE stops, the stream will be closed.)
Some of the tests require a JWK server, so this has been extracted from test_jwk.py.
[NDAT-540]: https://hasurahq.atlassian.net/browse/NDAT-540?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8772
GitOrigin-RevId: 9413e714f1c42b8a0991d0d30c4358209fd30c0c
### Description
**This PR is on top of #8655. Its diff is [this commit](8dd6e8899e).**
This PR removes the app state reference in pro's context, allowing us to build the context _before_ building the firsts schema cache. It does so by introducing `MetricsConfigRef`, based on the very similar `TLSAllowListRef`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8665
Co-authored-by: Auke Booij <164426+abooij@users.noreply.github.com>
GitOrigin-RevId: d0a614f699f41cd50801b125c6b1e40bd2f629e2
This requires rewriting the test class to split it into 3, each specifying the correct environment variables for HGE.
It would be lovely to use parameterization rather than subclassing, but that doesn't work because of `hge_fixture_env`, which creates a "soft" dependency between the environment variables and `hge_server`. Parameterizing the former *should* force the latter to be recreated for each new set of environment variables, but `hge_server` isn't actually aware there's a dependency. See `TestParameterizedFixtures` in test_tests.py for more information.
[NDAT-539]: https://hasurahq.atlassian.net/browse/NDAT-539?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8747
GitOrigin-RevId: 878b2fc20f39f962a67cd950046a99c283cfc6fc
The simplification will allow us to avoid a few `MonadError QErr m` constsraints in the future - this effect can be created locally instead of reusing a global one.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8729
GitOrigin-RevId: 851e28b1f5bfe4c47da43fa324714a941ef25c57
## Description
This change adds support for nested object fields in HGE IR and Schema Cache, the Data Connectors backend and API, and the MongoDB agent.
### Data Connector API changes
- The `/schema` endpoint response now includes an optional set of GraphQL type definitions. Table column types can refer to these definitions by name.
- Queries can now include a new field type `object` which contains a column name and a nested query. This allows querying into a nested object within a field.
### MongoDB agent changes
- Add support for querying into nested documents using the new `object` field type.
### HGE changes
- The `Backend` type class has a new type family `XNestedObjects b` which controls whether or not a backend supports querying into nested objects. This is currently enabled only for the `DataConnector` backend.
- For backends that support nested objects, the `FieldInfo` type gets a new constructor `FINestedObject`, and the `AnnFieldG` type gets a new constructor `AFNestedObject`.
- If the DC `/schema` endpoint returns any custom GraphQL type definitions they are stored in the `TableInfo` for each table in the source.
- During schema cache building, the function `addNonColumnFields` will check whether any column types match custom GraphQL object types stored in the `TableInfo`. If so, they are converted into `FINestedObject` instead of `FIColumn` in the `FieldInfoMap`.
- When building the `FieldParser`s from `FieldInfo` (function `fieldSelection`) any `FINestedObject` fields are converted into nested object parsers returning `AFNestedObject`.
- The `DataConnector` query planner converts `AFNestedObject` fields into `object` field types in the query sent to the agent.
## Limitations
### HGE not yet implemented:
- Support for nested arrays
- Support for nested objects/arrays in mutations
- Support for nested objects/arrays in order-by
- Support for filters (`where`) in nested objects/arrays
- Support for adding custom GraphQL types via track table metadata API
- Support for interface and union types
- Tests for nested objects
### Mongo agent not yet implemented:
- Generate nested object types from validation schema
- Support for aggregates
- Support for order-by
- Configure agent port
- Build agent in CI
- Agent tests for nested objects and MongoDB agent
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/7844
GitOrigin-RevId: aec9ec1e4216293286a68f9b1af6f3f5317db423
### Description
(This PR is better reviewed commit by commit.)
This PR is an aggregation of small incremental changes to Pro's init:
- it deletes some dead code,
- it starts reorganizing the code of that file by sections, similar to OSS' init,
- it extracts and cleans up license key cache init (groups several blocks of code in one separate function)
- makes some changes to a service class to reduce the dependency on `_acAppStateRef`
This PR is a first step: our goal is to move the schema cache build _in_ the app monad, in order to achieve #8344. To do so, we will need to remove `_acAppStateRef` from Pro's `AppContext`. There are two different paths we can take from here, which is why i cut this PR here:
- the first would be to change the different instances we implement on `AppM` to take as an argument the parts of the schema cache they depend on, rather than reading them from `_acAppStateRef`; as of this PR, `MetricsConfig` is the only such field;
- the second would be to apply the same strategy we already used for the TLSAllowList, and use a `IORef` that can be updated after the schema cache is built; this change would have a smaller footprint, but introduces one new `IORef` per such field, which feels something we don't want to generalize
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8655
GitOrigin-RevId: 809697d460bdb5c83ef7d30a2e835f589bcd80a6
### Description
This PR makes use of the fact that the `LoggerSet` is created in a `ManagedT` context to guarantee that we flush it when the `ManagedT` context terminates, which is guaranteed even when an exception is thrown since `allocate` is implemented using `bracket`.
This allows us to remove several manual calls to `flip onException flushLoggers`, and avoids having to manually perform that check everywhere.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8629
GitOrigin-RevId: 47657b910dc1bb7cfc0594f02e6cce9893ae3439
### Description
If i understand correctly, the _value_ of a feature flag **can** change at runtime. Since this has an impact on how the schema cache is built, this PR makes the value of the flag a part of the cache's "dynamic" config, therefore making it an arrow-ish argument for the incremental framework. It also removes `CheckFeatureFlag` from the static config, now obsolete.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8631
GitOrigin-RevId: 2ca898e4193e8552c40d2b21a819d3dd414601fe
### Description
This PR removes `ServerConfigCtx` and `HasServerConfigCtx`. Instead, it favours different approaches:
- when the code was only using one field, it passes that field explicitly (usually `SQLGenCtx` or `CheckFeatureFlag`)
- when the code was using several fields, but in only one function, it inlines
- for the cache build, it introduces `CacheStaticConfig` and `CacheDynamicConfig`, which are subsets of `AppEnv` and `AppContext` respectively
The main goal of this is to help with the modularization of the engine: as `ServerConfigCtx` had fields whose types were imported from several unrelated parts of the engine, using it tied together parts of the engine that should not be aware of one another (such as tying together `Hasura.LogicalModel` and `Hasura.GraphQL.Schema`).
The bulk of this PR is a change to the cache build, as a follow up to #8509: instead of giving the entire `ServerConfigCtx` as a incremental rule argument, we only give the new `CacheDynamicConfig` struct, which has fewer fields. The other required fields, that were coming from the `AppEnv`, are now given via the `HasCacheStaticConfig` constraint, which is a "subset" of `HasAppEnv`.
(Some further work could include moving `StringifyNumbers` out of `GraphQL.Schema.Options`, given how it is used all across the codebase, including in `RQL.DML`.)
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8513
GitOrigin-RevId: 818cbcd71494e3cd946b06adbb02ca328a8a298e
### Description
This small documentation PR explains why and how we use `ManagedT`, and how it solves problems we would have with `Managed`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8615
GitOrigin-RevId: 5b7710a8cb6373fefb69bb4e9e8eda389c14c2da