We currently have a fairly intricate way of running our PostgreSQL and MSSQL integration tests (not the API tests). By splitting them out, we can simplify this a lot. Most prominently, we can rely on Cabal to be our argument parser instead of writing our own.
We can also simplify how they're run in CI. They are currently (weirdly) run alongside the Python integration tests. This breaks them out into their own jobs for better visibility, and to avoid conflating the two.
The changes are as follows:
- The "unit" tests that rely on a running PostgreSQL database are extracted out to a new test directory so they can be run separately.
- Most of the `Main` module comes with them.
- We now refer to these as "integration" tests instead.
- Likewise for the "unit" tests that rely on a running MS SQL Server database. These are a little simpler and we can use `hspec-discover`, with a `SpecHook` to extract the connection string from an environment variable.
- Henceforth, these are the MS SQL Server integration tests.
- New CI jobs have been added for each of these.
- There wasn't actually a job for the MS SQL Server integration tests. It's pretty amazing they still run well.
- The "haskell-tests" CI job, which used to run the PostgreSQL integration tests, has been removed.
- The makefiles and contributing guide have been updated to run these.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6912
GitOrigin-RevId: 67bbe2941bba31793f63d04a9a693779d4463ee1
The main aim of the PR is:
1. To set up a module structure for 'remote-schemas' package.
2. Move parts by the remote schema codebase into the new module structure to validate it.
## Notes to the reviewer
Why a PR with large-ish diff?
1. We've been making progress on the MM project but we don't yet know long it is going to take us to get to the first milestone. To understand this better, we need to figure out the unknowns as soon as possible. Hence I've taken a stab at the first two items in the [end-state](https://gist.github.com/0x777/ca2bdc4284d21c3eec153b51dea255c9) document to figure out the unknowns. Unsurprisingly, there are a bunch of issues that we haven't discussed earlier. These are documented in the 'open questions' section.
1. The diff is large but that is only code moved around and I've added a section that documents how things are moved. In addition, there are fair number of PR comments to help with the review process.
## Changes in the PR
### Module structure
Sets up the module structure as follows:
```
Hasura/
RemoteSchema/
Metadata/
Types.hs
SchemaCache/
Types.hs
Permission.hs
RemoteRelationship.hs
Build.hs
MetadataAPI/
Types.hs
Execute.hs
```
### 1. Types representing metadata are moved
Types that capture metadata information (currently scattered across several RQL modules) are moved into `Hasura.RemoteSchema.Metadata.Types`.
- This new module only depends on very 'core' modules such as
`Hasura.Session` for the notion of roles and `Hasura.Incremental` for `Cacheable` typeclass.
- The requirement on database modules is avoided by generalizing the remote schemas metadata to accept an arbitrary 'r' for a remote relationship
definition.
### 2. SchemaCache related types and build logic have been moved
Types that represent remote schemas information in SchemaCache are moved into `Hasura.RemoteSchema.SchemaCache.Types`.
Similar to `H.RS.Metadata.Types`, this module depends on 'core' modules except for `Hasura.GraphQL.Parser.Variable`. It has something to do with remote relationships but I haven't spent time looking into it. The validation of 'remote relationships to remote schema' is also something that needs to be looked at.
Rips out the logic that builds remote schema's SchemaCache information from the monolithic `buildSchemaCacheRule` and moves it into `Hasura.RemoteSchema.SchemaCache.Build`. Further, the `.SchemaCache.Permission` and `.SchemaCache.RemoteRelationship` have been created from existing modules that capture schema cache building logic for those two components.
This was a fair amount of work. On main, currently remote schema's SchemaCache information is built in two phases - in the first phase, 'permissions' and 'remote relationships' are ignored and in the second phase they are filled in.
While remote relationships can only be resolved after partially resolving sources and other remote schemas, the same isn't true for permissions. Further, most of the work that is done to resolve remote relationships can be moved to the first phase so that the second phase can be a very simple traversal.
This is the approach that was taken - resolve permissions and as much as remote relationships information in the first phase.
### 3. Metadata APIs related types and build logic have been moved
The types that represent remote schema related metadata APIs and the execution logic have been moved to `Hasura.RemoteSchema.MetadataAPI.Types` and `.Execute` modules respectively.
## Open questions:
1. `Hasura.RemoteSchema.Metadata.Types` is so called because I was hoping that all of the metadata related APIs of remote schema can be brought in at `Hasura.RemoteSchema.Metadata.API`. However, as metadata APIs depended on functions from `SchemaCache` module (see [1](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L55)) and [2](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L91)), it made more sense to create a separate top-level module for `MetadataAPI`s.
Maybe we can just have `Hasura.RemoteSchema.Metadata` and get rid of the extra nesting or have `Hasura.RemoteSchema.Metadata.{Core,Permission,RemoteRelationship}` if we want to break them down further.
1. `buildRemoteSchemas` in `H.RS.SchemaCache.Build` has the following type:
```haskell
buildRemoteSchemas ::
( ArrowChoice arr,
Inc.ArrowDistribute arr,
ArrowWriter (Seq CollectedInfo) arr,
Inc.ArrowCache m arr,
MonadIO m,
HasHttpManagerM m,
Inc.Cacheable remoteRelationshipDefinition,
ToJSON remoteRelationshipDefinition,
MonadError QErr m
) =>
Env.Environment ->
( (Inc.Dependency (HashMap RemoteSchemaName Inc.InvalidationKey), OrderedRoles),
[RemoteSchemaMetadataG remoteRelationshipDefinition]
)
`arr` HashMap RemoteSchemaName (PartiallyResolvedRemoteSchemaCtxG remoteRelationshipDefinition, MetadataObject)
```
Note the dependence on `CollectedInfo` which is defined as
```haskell
data CollectedInfo
= CIInconsistency InconsistentMetadata
| CIDependency
MetadataObject
-- ^ for error reporting on missing dependencies
SchemaObjId
SchemaDependency
deriving (Eq)
```
this pretty much means that remote schemas is dependent on types from databases, actions, ....
How do we fix this? Maybe introduce a typeclass such as `ArrowCollectRemoteSchemaDependencies` which is defined in `Hasura.RemoteSchema` and then implemented in graphql-engine?
1. The dependency on `buildSchemaCacheFor` in `.MetadataAPI.Execute` which has the following signature:
```haskell
buildSchemaCacheFor ::
(QErrM m, CacheRWM m, MetadataM m) =>
MetadataObjId ->
MetadataModifier ->
```
This can be easily resolved if we restrict what the metadata APIs are allowed to do. Currently, they operate in an unfettered access to modify SchemaCache (the `CacheRWM` constraint):
```haskell
runAddRemoteSchema ::
( QErrM m,
CacheRWM m,
MonadIO m,
HasHttpManagerM m,
MetadataM m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m EncJSON
```
This should instead be changed to restrict remote schema APIs to only modify remote schema metadata (but has access to the remote schemas part of the schema cache), this dependency is completely removed.
```haskell
runAddRemoteSchema ::
( QErrM m,
MonadIO m,
HasHttpManagerM m,
MonadReader RemoteSchemasSchemaCache m,
MonadState RemoteSchemaMetadata m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m RemoteSchemeMetadataObjId
```
The idea is that the core graphql-engine would call these functions and then call
`buildSchemaCacheFor`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6291
GitOrigin-RevId: 51357148c6404afe70219afa71bd1d59bdf4ffc6
I am working on https://github.com/hasura/graphql-engine/issues/8807, and wanted to write a Haskell integration test case to reproduce it.
We have Python integration tests somewhat covering this behavior in *test_inconsistent_meta.py*, but no Haskell tests, so I thought I'd shore up the coverage here by adding a few test cases for working behavior.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5897
GitOrigin-RevId: 21500e530e413feaede5cbd8b4a94b07d25a6260
This abstracts `CircularT`'s test cases to work against "any" memoizer, and then runs them against `MemoizeT` as well.
Surprisingly (or not), this works without issue; `MemoizeT` passes all tests with a couple of extra instances.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5780
GitOrigin-RevId: 461880caf9220dc3f52d622a22e8b8bcd594e404
This PR expands the OpenAPI specification generated for metadata to include separate definitions for `SourceMetadata` for each native database type, and for DataConnector.
For the most part the changes add `HasCodec` implementations, and don't modify existing code otherwise.
The generated OpenAPI spec can be used to generate TypeScript definitions that distinguish different source metadata types based on the value of the `kind` properly. There is a problem: because the specified `kind` value for a data connector source is any string, when TypeScript gets a source with a `kind` value of, say, `"postgres"`, it cannot unambiguously determine whether the source is postgres, or a data connector. For example,
```ts
function consumeSourceMetadata(source: SourceMetadata) {
if (source.kind === "postgres" || source.kind === "pg") {
// At this point TypeScript infers that `source` is either an instance
// of `PostgresSourceMetadata`, or `DataconnectorSourceMetadata`. It
// can't narrow further.
source
}
if (source.kind === "something else") {
// TypeScript infers that this `source` must be an instance of
// `DataconnectorSourceMetadata` because `source.kind` does not match
// any of the other options.
source
}
}
```
The simplest way I can think of to fix this would be to add a boolean property to the `SourceMetadata` type along the lines of `isNative` or `isDataConnector`. This could be a field that only exists in serialized data, like the metadata version field. The combination of one of the native database names for `kind`, and a true value for `isNative` would be enough for TypeScript to unambiguously distinguish the source kinds.
But note that in the current state TypeScript is able to reference the short `"pg"` name correctly!
~~Tests are not passing yet due to some discrepancies in DTO serialization vs existing Metadata serialization. I'm working on that.~~
The placeholders that I used for table and function metadata are not compatible with the ordered JSON serialization in use. I think the best solution is to write compatible codecs for those types in another PR. For now I have disabled some DTO tests for this PR.
Here are the generated [OpenAPI spec](https://github.com/hasura/graphql-engine-mono/files/9397333/openapi.tar.gz) based on these changes, and the generated [TypeScript client code](https://github.com/hasura/graphql-engine-mono/files/9397339/client-typescript.tar.gz) based on that spec.
Ticket: [MM-66](https://hasurahq.atlassian.net/browse/MM-66)
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5582
GitOrigin-RevId: e1446191c6c832879db04f129daa397a3be03f62
The module `Hasura.SQL.AnyBackend` was introduced (in #751) to centralize the logic for case-switching behavior that depends on the particular flavor of relational DB backend (Postgres vs MSSQL vs BigQuery vs MySQL vs DataConnectors). This allows us to write a bunch of code in a backend-agnostic way, even if runtime behavior does depend on the chosen backend. At the same time, it allows us to write backend-specific code without having to care (too much) about the existence of other backends.
In #851 this module was rewritten to use Template Haskell.
I've heard that one of the reasons for the use of TH was that this would make it easier to keep backends out of the compilation product entirely. This would allow customers, especially on OSS, to benefit from simpler software licensing.
However:
1. This conditional compilation never materialized.
2. It's not clear whether writing this particular module based on TH would be sufficient for conditional compilation. And in any case, it can be done using CPP pragmas as well.
3. The TH code is extraordinarily complex. Since its introduction, it has been documented extraordinarily well, but it's still very difficult to maintain and/or refactor, due to its non-idiomatic nature.
4. Hasura's company objectives are now Cloud-oriented, so that software licensing issues work differently, and in particular, do not depend on what's part of the compilation product.
So this PR reverts on #851 by spelling out the code generated by TH. This is a net-negative diff size. IOW we used to generate less code than the size of the code doing the generating. This makes the code readable and maintainable.
The generated code has been modified in one way, which I'll now describe.
In the scenario that support for a new backend is introduced, a constructor is added to the `BackendType` type. This would then cause `liftTag` to be partial, thus raising a compiler warning. Resolving this requires adding corresponding constructors to the `BackendTag` and `AnyBackend` types. This would then require amending **almost** all other methods.
The exceptions are `composeAnyBackend` and `unpackAnyBackend`. These methods test whether two values are compatible, i.e. belong to the same backend. Both have a default case that in one way or another ignores the input values. Using TH here ensures that all values that belong together are caught. But after spelling out the TH, the presence of the default case means that no compiler warning is thrown for a missing match of matching values. So in the default case, we now do an explicit check for equality. If there _is_ an equality, that means that there is a missing `case`. So this is reported as an `error` (which is very crude, but it should be).
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5333
GitOrigin-RevId: 5aaf0a93394bd740aa7371526d3175c8142b3541
It's about time.
To do this I had to check a few more boxes.
* I copied the flags from `graphql-engine.cabal` to the libraries in `server/lib`.
* I moved `Cacheable` instances of schema parser types beside the typeclass declaration.
* I removed imports of `Hasura.Prelude` from the tests, and rewrote them accordingly.
* I copied the `TestMonad` parse monad into `server/src-test/Hasura/GraphQL/Schema/RemoteTest.hs`, which was using it. I think this could be done with the real thing, but I tried replacing it with constraints and it messed with my head somewhat.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5311
GitOrigin-RevId: ebebcc50a16f2d517b7f730fe72410827ca3e86c
Followup to hasura/graphql-engine-mono#4713.
The `memoizeOn` method, part of `MonadSchema`, originally had the following type:
```haskell
memoizeOn
:: (HasCallStack, Ord a, Typeable a, Typeable b, Typeable k)
=> TH.Name
-> a
-> m (Parser k n b)
-> m (Parser k n b)
```
The reason for operating on `Parser`s specifically was that the `MonadSchema` effect would additionally initialize certain `Unique` values, which appear (nested in) the type of `Parser`.
hasura/graphql-engine-mono#518 changed the type of `memoizeOn`, to additionally allow memoizing `FieldParser`s. These also contained a `Unique` value, which was similarly initialized by the `MonadSchema` effect. The new type of `memoizeOn` was as follows:
```haskell
memoizeOn
:: forall p d a b
. (HasCallStack, HasDefinition (p n b) d, Ord a, Typeable p, Typeable a, Typeable b)
=> TH.Name
-> a
-> m (p n b)
-> m (p n b)
```
Note the type `p n b` of the value being memoized: by choosing `p` to be either `Parser k` or `FieldParser`, both can be memoized. Also note the new `HasDefinition (p n b) d` constraint, which provided a `Lens` for accessing the `Unique` value to be initialized.
A quick simplification is that the `HasCallStack` constraint has never been used by any code. This was realized in hasura/graphql-engine-mono#4713, by removing that constraint.
hasura/graphql-engine-mono#2980 removed the `Unique` value from our GraphQL-related types entirely, as their original purpose was never truly realized. One part of removing `Unique` consisted of dropping the `HasDefinition (p n b) d` constraint from `memoizeOn`.
What I didn't realize at the time was that this meant that the type of `memoizeOn` could be generalized and simplified much further. This PR finally implements that generalization. The new type is as follows:
```haskell
memoizeOn ::
forall a p.
(Ord a, Typeable a, Typeable p) =>
TH.Name ->
a ->
m p ->
m p
```
This change has a couple of consequences.
1. While constructing the schema, we often output `Maybe (Parser ...)`, to model that the existence of certain pieces of GraphQL schema sometimes depends on the permissions that a certain role has. The previous versions of `memoizeOn` were not able to handle this, as the only thing they could memoize was fully-defined (if not yet fully-evaluated) `(Field)Parser`s. This much more general API _would_ allow memoizing `Maybe (Parser ...)`s. However, we probably have to be continue being cautious with this: if we blindly memoize all `Maybe (Parser ...)`s, the resulting code may never be able to decide whether the value is `Just` or `Nothing` - i.e. it never commits to the existence-or-not of a GraphQL schema fragment. This would manifest as a non-well-founded knot tying, and this would get reported as an error by the implementation of `memoizeOn`.
tl;dr: This generalization _technically_ allows for memoizing `Maybe` values, but we probably still want to avoid doing so.
For this reason, the PR adds a specialized version of `memoizeOn` to `Hasura.GraphQL.Schema.Parser`.
2. There is no longer any need to connect the `MonadSchema` knot-tying effect with the `MonadParse` effect. In fact, after this PR, the `memoizeOn` method is completely GraphQL-agnostic, and so we implement hasura/graphql-engine-mono#4726, separating `memoizeOn` from `MonadParse` entirely - `memoizeOn` can be defined and implemented as a general Haskell typeclass method.
Since `MonadSchema` has been made into a single-type-parameter type class, it has been renamed to something more general, namely `MonadMemoize`. Its only task is to memoize arbitrary `Typeable p` objects under a combined key consisting of a `TH.Name` and a `Typeable a`.
Also for this reason, the new `MonadMemoize` has been moved to the more general `Control.Monad.Memoize`.
3. After this change, it's somewhat clearer what `memoizeOn` does: it memoizes an arbitrary value of a `Typeable` type. The only thing that needs to be understood in its implementation is how the manual blackholing works. There is no more semantic interaction with _any_ GraphQL code.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4725
Co-authored-by: Daniel Harvey <4729125+danieljharvey@users.noreply.github.com>
GitOrigin-RevId: 089fa2e82c2ce29da76850e994eabb1e261f9c92
Moves code from `Hasura.RQL.Types.Metadata` that is specific to serialization into a new module, `Hasura.RQL.Types.Metadata.Serialization`.
I'm breaking up #5184 into smaller PRs. This is the third and final PR in that effort. This PR is stacked on #5210 and #5211.
The tracking issue is https://hasurahq.atlassian.net/browse/MM-35
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5212
GitOrigin-RevId: 6cde6d52173590fafe0969a06f2a3411db4fbc78
A following PR moves serialization-related code out `Hasura.RQL.Types.Metadata` into a specialized submodule. To avoid circular dependencies a number of other definitions also need to be moved into their own submodule. This PR does that extra moving first so that we can keep each PR as small, and as easy to review as possible.
There are a lot of changed lines; but it's all moving code from one module to another.
I'm breaking up #5184 into smaller PRs, and this is the first PR in that effort.
The tracking issue is https://hasurahq.atlassian.net/browse/MM-35
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5210
GitOrigin-RevId: 6fb6e29a967ab5ad4724006c8e0addd2d63a3946