When upgrading to GHC v9.4, we noticed a number of test failures because the iteration order of `HashMap`s had changed. With this changeset, I am endeavoring to mitigate this now and in the future.
This makes one of two changes in each of the few areas where we depend on the order of elements in a `HashMap`:
1. the ordering of the request is preserved with `InsOrdHashMap`, or
2. we sort the data after retrieving it.
Fortunately, we do not do this anywhere where we _must_ preserve order; it's "just" descriptions, error messages, and OpenAPI metadata. The main problem is that tests are likely to fail each time we upgrade GHC (or whatever is providing the hash seed).
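For illustration, a minimal sketch of the two strategies (types are placeholders; `InsOrdHashMap` is assumed to come from `insert-ordered-containers`):

```haskell
import qualified Data.HashMap.Strict as HashMap
import qualified Data.HashMap.Strict.InsOrd as InsOrd
import Data.List (sortOn)

-- Strategy 1: preserve insertion order by construction.
orderedPairs :: InsOrd.InsOrdHashMap String Int -> [(String, Int)]
orderedPairs = InsOrd.toList

-- Strategy 2: impose a deterministic order after retrieval.
sortedPairs :: HashMap.HashMap String Int -> [(String, Int)]
sortedPairs = sortOn fst . HashMap.toList
```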
[NDAT-705]: https://hasurahq.atlassian.net/browse/NDAT-705?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9390
GitOrigin-RevId: 84503e029b44094edbbc298651744bc2843c15f3
### Description
This PR rewrites OpenAPI to be more idiomatic. Some noteworthy changes:
- we accumulate all required information during the Analyze phase, so that the OpenAPI generation phase no longer needs to look anything up in the schema cache (the schema cache is now only needed as input to the analysis)
- we no longer build intermediary endpoint information and aggregate it; we directly build the `PathItem` for each endpoint. Additionally, this means we no longer have to assume that different methods share the same metadata
- we no longer have to first declare types, then craft references: we do everything in one step
- we now properly deal with nullability by treating "typeName" and "typeName!" as different types (see the sketch after this list)
- we add a bunch of additional fields in the generated "schema", such as title
- we now support enum values in both input and output positions
- checking whether the request body is required is now performed on the fly rather than by introspecting the generated schema
- the methods in the file are sorted by topic
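As an illustration of the nullability point above, a minimal sketch (assumed names, not the engine's actual code) of treating `"typeName"` and `"typeName!"` as distinct schema entries:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

data Nullability = Nullable | NonNullable

-- Two distinct schema keys for the two variants of the same GraphQL type.
schemaKey :: Text -> Nullability -> Text
schemaKey name Nullable    = name
schemaKey name NonNullable = name <> "!"
```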
### Controversial point
However, this PR creates some additional complexity that we might not want to keep. The main complexity is _knot-tying_: to avoid lookups when generating the OpenAPI, it builds an actual graph of input types, which means that we need something similar to (but simpler than) `MonadSchema` to avoid infinite recursion when analyzing the input types of a query. To do this, this PR introduces `CircularT`, a lesser `SchemaT` that aims to avoid ever having to reinvent this particular wheel again.
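To make the knot-tying concrete, here is a minimal sketch of the underlying idea (illustrative names, not the real `CircularT` API): track which keys are already being analyzed, and break cycles by returning a fallback (such as a reference to the type) instead of recursing forever.

```haskell
import Control.Monad.State (State, get, modify)
import Data.HashSet (HashSet)
import qualified Data.HashSet as HashSet
import Data.Hashable (Hashable)

-- The set of keys currently under construction.
type CircularM k = State (HashSet k)

-- | Run @action@ for @key@, unless @key@ is already being processed,
-- in which case return @onCycle@ instead of recursing forever.
withKey :: (Eq k, Hashable k) => k -> CircularM k a -> CircularM k a -> CircularM k a
withKey key onCycle action = do
  inFlight <- get
  if key `HashSet.member` inFlight
    then onCycle
    else modify (HashSet.insert key) >> action
```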
### Remaining work
- [x] fix existing tests (they are all failing due to some of the schema changes)
- [ ] add tests to cover the new features:
- [x] tests for `CircularT`
- [ ] tests for enums in output schemas
- [x] extract / document `CircularT` if we wish to keep it
- [x] add more comments to `OpenAPI`
- [x] have a second look at `buildVariableSchema`
- [x] fix all missing diagnostics in `Analyze`
- [x] add a Changelog entry?
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4654
Co-authored-by: David Overton <7734777+dmoverton@users.noreply.github.com>
GitOrigin-RevId: f4a9191f22dfcc1dccefd6a52f5c586b6ad17172
### Motivation
While we strive to write clear code, we have historically struggled at Hasura with having very different styles and standards across the codebase. There have been efforts to standardize our coding style, and we have an official style guide that isn't maintained as closely as it should be... We still have some work in front of us.
However, in the last ~year or so, there has been a huge push towards incrementally improving the situation. As part of this, we've been blocking PRs that don't add enough comments, or that don't improve the files they touch.
While looking at `Hasura.GraphQL.Analyse`, it became apparent that this file did not meet the engineering standards that I would expect to see addressed during a code review. Some ways in which I think it falls short:
- lack of documentation
- no clear distinction between public / internal components
- "unidiomatic" Haskell code (such as using `Either Result Error`)
While there's no problem with a file looking like this during development, those issues should have been caught at review time. The fact that they weren't indicates a problem in our process that we will need to address: code quality and maintainability are paramount, and we all need to do our part.
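For the `Either` point above, an illustrative sketch of why `Either Error Result` is the idiomatic argument order: the `Functor`/`Monad` instances for `Either e` operate on the second type parameter, so successes thread through while errors short-circuit (the types here are made up for the example).

```haskell
newtype Error  = Error String deriving (Show)
newtype Result = Result Int   deriving (Show)

step :: Result -> Either Error Result
step (Result n)
  | n > 0     = Right (Result (n - 1))
  | otherwise = Left (Error "underflow")

-- Errors short-circuit automatically; with `Either Result Error`,
-- this composition would not typecheck.
pipeline :: Result -> Either Error Result
pipeline r = step r >>= step
```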
### Description
This PR rewrites all of `Hasura.GraphQL.Analyze`, and adapts `Hasura.Server.OpenAPI` accordingly where needed. I've attempted to clarify names and add documentation based on my understanding of the code, and to clean up what was unused (such as field variables). I don't think this PR is good enough as is: I welcome criticism where I got my comments wrong, and I am happy to help y'all add more.
This PR makes one small change in the way error messages are reported (and adjusts the corresponding test accordingly); each error message is now prefixed with the path within the selection set:
```
⚠️ $.test.foo.bar.baz.mizpelled: field 'mizpelled' not found in object 'Baz'
```
### Note
This PR is currently **on top of #3962**. You can preview the changes in isolation by [diffing the branches](https://github.com/hasura/graphql-engine-mono/compare/nicuveo/clean-rest-endpoint-inconsistency-check..nicuveo/rewrite-analysis).
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3963
Co-authored-by: paritosh-08 <85472423+paritosh-08@users.noreply.github.com>
GitOrigin-RevId: 5ec38e0e753f0c12096a350db0737658495e2f15
### Motivation
#2338 introduced a way to validate REST queries against the metadata after a change, to properly report any inconsistency that would emerge from a change in the underlying structure of our schema. However, the way this was done was quite complex and error-prone: we would use the generated schema parsers to statically execute an introspection query, similar to the one we use for remote schemas, and then parse the resulting bytestring as if it were coming from a remote schema.
This led to several issues: the code used remote schema primitives and was entangled with remote schema code despite being unrelated, which led to absurd situations like creating fake `Variable`s whose type was also their name. A lot of the code had to deal with the fact that we might fail to re-parse our own schema. Additionally, some of it was dead code that, for some reason, GHC did not warn about. But more fundamentally, this architectural decision created a dependency between unrelated pieces of the engine: modifying the internal processing of root fields or the introspection of remote schemas now risks impacting the unrelated `OpenAPI` feature.
### Description
This PR decouples that process from the remote schema introspection logic and from the execution engine by making `Analyse` and `OpenAPI` work on the generic `G.SchemaIntrospection` instead. To accomplish this, it:
- adds `GraphQL.Parser.Schema.Convert`, to convert from our "live" schema back to a flat `SchemaIntrospection`
- persists in the schema cache the `admin` introspection generated when building the schema, and uses it both for validation and for generating the `OpenAPI`.
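As a toy model (an assumption for exposition, not `graphql-parser`'s actual definitions), "flat" introspection data can be pictured as a single map from type names to type definitions, which is what lets the analysis proceed without schema-cache lookups:

```haskell
import Data.HashMap.Strict (HashMap)
import Data.Text (Text)

data TypeDefinition
  = ScalarType Text                 -- type name
  | ObjectType Text [(Text, Text)]  -- type name, (field name, field type)

newtype SchemaIntrospection
  = SchemaIntrospection (HashMap Text TypeDefinition)
```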
### Known issues and limitations
This adds a bit of memory pressure to the engine, as we persist the entire schema in the schema cache. This might be acceptable in the short-term, but we have several potential ideas going forward should this be a problem:
- cache the result of `Analyze`: when it becomes possible to build the `OpenAPI` purely with the result of `Analyze` without any additional schema information, then we could cache that instead, reducing the footprint
- cache the `OpenAPI`: if it doesn't need to change every time the endpoint is queried, then it should be possible to cache the entire `OpenAPI` object instead of the schema
- cache a copy of the `FieldParsers` used to generate the schema: as those are persisted through the GraphQL `Context` and are the only input required to generate the `Schema`, making them accessible in the schema cache would give us the exact same feature with no additional memory cost, at the price of a slightly slower and more complicated process (we would need to rebuild the `Schema` every time the OpenAPI endpoint is queried)
- cache nothing at all, and rebuild the admin schema from scratch every time.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3962
Co-authored-by: paritosh-08 <85472423+paritosh-08@users.noreply.github.com>
GitOrigin-RevId: a8b9808170b231fdf6787983b4a9ed286cde27e0
### Description
This small PR changes the representation of an endpoint method from `Text` to an enum of the supported methods. Additionally, it cleans up some of the instances defined on surrounding types (such as Postgres-specific instances on Endpoint types).
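A sketch of the shape of that change (illustrative, not the actual definition):

```haskell
-- A closed enum of supported methods, instead of free-form Text.
data EndpointMethod
  = GET
  | POST
  | PUT
  | DELETE
  | PATCH
  deriving (Show, Eq, Ord, Enum, Bounded)
```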
Due to a name conflict, this makes `RQL.Types.Endpoint` impossible to re-export from `RQL.Types`, which in turn forces several other modules to import it explicitly. I think this is fine, since we ultimately want to get rid of `RQL.Types`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3965
GitOrigin-RevId: 33869007d0d818ddf486fb61d1f6099f9dad7570
### Description
There were several places in the codebase where we either implemented a generic container or expressed the need for one. This PR extracts / creates all the relevant containers, and adapts the relevant parts of the code to make use of the new generic containers. More specifically, it introduces the following modules:
- `Data.Set.Extended`, for new functions on `Data.Set`
- `Data.HashMap.Strict.Multi`, for hash maps that associate multiple values with each key (see the sketches after this list)
- `Data.HashMap.Strict.NonEmpty`, for hash maps that can never be constructed as empty
- `Data.Trie`, for a generic implementation of a prefix tree
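Illustrative shapes for two of these containers (assumptions for exposition; the real definitions live in the modules above):

```haskell
import Data.HashMap.Strict (HashMap)
import Data.Set (Set)

-- Data.HashMap.Strict.Multi: each key maps to a set of values.
newtype MultiMap k v = MultiMap (HashMap k (Set v))

-- Data.Trie: an optional value at each node, plus children keyed by segment.
data Trie k v = Trie
  { trieChildren :: HashMap k (Trie k v)
  , trieValue    :: Maybe v
  }
```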
This PR makes use of those new containers in the following parts of the code:
- `Hasura.GraphQL.Execute.RemoteJoin.Types`
- `Hasura.RQL.Types.Endpoint*`
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3828
GitOrigin-RevId: e6c1b971bcb3f5ab66bc91d0fa4d0e9df7a0c6c6