`CollectedInfo` was just an awkward sum type. By using an explicit `Either` instead, we can guarantee at the type level that certain methods only write inconsistencies, or only write dependencies. This is useful, because if we can guarantee that no dependencies are written, then we don't need to run `resolveDependencies` on that part of the Metadata. In other words, we can keep it out of `BuildOutputs`, which greatly benefits performance - see e.g. hasura/graphql-engine-mono#6613.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6765
GitOrigin-RevId: 9ce099d2eee2278dbb6e5bea72063e4b6e064b35
A bunch of configurations are retrieved from the Metadata, then stored in the `BuildOutputs` structure, only to then be forwarded to the `SchemaCache`, with extremely little processing in between.
So this simplifies the build pipeline for some parts of the metadata: just construct those things from `Metadata` directly, and store them in the `SchemaCache` without any intermediate container.
Why did we have the detour via `BuildOutputs` in the first place? Parts of the Metadata (codified by `MetadataObjId`) can generate _metadata inconsistencies_ and/or _schema dependencies_, which are related.
- Metadata inconsistencies are warnings that we show to the user, indicating that there's something wrong with their configuration, and they have to fix it.
- Schema dependencies are an internal mechanism that allow us to build a consistent view of the world. For instance, if we have a relationship from DB tables `books` to `authors`, but the `authors` table is inconsistent (e.g. it doesn't exist in the DB), then we have schema dependencies indicating that. The job of `resolveDependencies` is to then drop the relationship, so that we can at least generate a legal GraphQL schema for `books`.
If we never generate a schema dependency for a certain fragment of Metadata, then there is no reason to call `resolveDependencies` on it, and so there is no reason to store it in `BuildOutputs`.
---
The starting point that allows this refactor is to apply Metadata defaults before it reaches `buildAndCollectInfo`, so that metadata-with-defaults can be used elsewhere.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6609
GitOrigin-RevId: df0c4a7ff9451e10e02a40bf26304b26584ba483
This upgrades the version of Ormolu required by the HGE repository to v0.5.0.1, and reformats all code accordingly.
Ormolu v0.5 reformats code that uses infix operators. This is mostly useful, adding newlines and indentation to make it clear which operators are applied first, but in some cases, it's unpleasant. To make this easier on the eyes, I had to do the following:
* Add a few fixity declarations (search for `infix`)
* Add parentheses to make precedence clear, allowing Ormolu to keep everything on one line
* Rename `relevantEq` to `(==~)` in #6651 and set it to `infix 4`
* Add a few _.ormolu_ files (thanks to @hallettj for helping me get started), mostly for Autodocodec operators that don't have explicit fixity declarations
In general, I think these changes are quite reasonable. They mostly affect indentation.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6675
GitOrigin-RevId: cd47d87f1d089fb0bc9dcbbe7798dbceedcd7d83
The main aim of the PR is:
1. To set up a module structure for 'remote-schemas' package.
2. Move parts by the remote schema codebase into the new module structure to validate it.
## Notes to the reviewer
Why a PR with large-ish diff?
1. We've been making progress on the MM project but we don't yet know long it is going to take us to get to the first milestone. To understand this better, we need to figure out the unknowns as soon as possible. Hence I've taken a stab at the first two items in the [end-state](https://gist.github.com/0x777/ca2bdc4284d21c3eec153b51dea255c9) document to figure out the unknowns. Unsurprisingly, there are a bunch of issues that we haven't discussed earlier. These are documented in the 'open questions' section.
1. The diff is large but that is only code moved around and I've added a section that documents how things are moved. In addition, there are fair number of PR comments to help with the review process.
## Changes in the PR
### Module structure
Sets up the module structure as follows:
```
Hasura/
RemoteSchema/
Metadata/
Types.hs
SchemaCache/
Types.hs
Permission.hs
RemoteRelationship.hs
Build.hs
MetadataAPI/
Types.hs
Execute.hs
```
### 1. Types representing metadata are moved
Types that capture metadata information (currently scattered across several RQL modules) are moved into `Hasura.RemoteSchema.Metadata.Types`.
- This new module only depends on very 'core' modules such as
`Hasura.Session` for the notion of roles and `Hasura.Incremental` for `Cacheable` typeclass.
- The requirement on database modules is avoided by generalizing the remote schemas metadata to accept an arbitrary 'r' for a remote relationship
definition.
### 2. SchemaCache related types and build logic have been moved
Types that represent remote schemas information in SchemaCache are moved into `Hasura.RemoteSchema.SchemaCache.Types`.
Similar to `H.RS.Metadata.Types`, this module depends on 'core' modules except for `Hasura.GraphQL.Parser.Variable`. It has something to do with remote relationships but I haven't spent time looking into it. The validation of 'remote relationships to remote schema' is also something that needs to be looked at.
Rips out the logic that builds remote schema's SchemaCache information from the monolithic `buildSchemaCacheRule` and moves it into `Hasura.RemoteSchema.SchemaCache.Build`. Further, the `.SchemaCache.Permission` and `.SchemaCache.RemoteRelationship` have been created from existing modules that capture schema cache building logic for those two components.
This was a fair amount of work. On main, currently remote schema's SchemaCache information is built in two phases - in the first phase, 'permissions' and 'remote relationships' are ignored and in the second phase they are filled in.
While remote relationships can only be resolved after partially resolving sources and other remote schemas, the same isn't true for permissions. Further, most of the work that is done to resolve remote relationships can be moved to the first phase so that the second phase can be a very simple traversal.
This is the approach that was taken - resolve permissions and as much as remote relationships information in the first phase.
### 3. Metadata APIs related types and build logic have been moved
The types that represent remote schema related metadata APIs and the execution logic have been moved to `Hasura.RemoteSchema.MetadataAPI.Types` and `.Execute` modules respectively.
## Open questions:
1. `Hasura.RemoteSchema.Metadata.Types` is so called because I was hoping that all of the metadata related APIs of remote schema can be brought in at `Hasura.RemoteSchema.Metadata.API`. However, as metadata APIs depended on functions from `SchemaCache` module (see [1](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L55)) and [2](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L91)), it made more sense to create a separate top-level module for `MetadataAPI`s.
Maybe we can just have `Hasura.RemoteSchema.Metadata` and get rid of the extra nesting or have `Hasura.RemoteSchema.Metadata.{Core,Permission,RemoteRelationship}` if we want to break them down further.
1. `buildRemoteSchemas` in `H.RS.SchemaCache.Build` has the following type:
```haskell
buildRemoteSchemas ::
( ArrowChoice arr,
Inc.ArrowDistribute arr,
ArrowWriter (Seq CollectedInfo) arr,
Inc.ArrowCache m arr,
MonadIO m,
HasHttpManagerM m,
Inc.Cacheable remoteRelationshipDefinition,
ToJSON remoteRelationshipDefinition,
MonadError QErr m
) =>
Env.Environment ->
( (Inc.Dependency (HashMap RemoteSchemaName Inc.InvalidationKey), OrderedRoles),
[RemoteSchemaMetadataG remoteRelationshipDefinition]
)
`arr` HashMap RemoteSchemaName (PartiallyResolvedRemoteSchemaCtxG remoteRelationshipDefinition, MetadataObject)
```
Note the dependence on `CollectedInfo` which is defined as
```haskell
data CollectedInfo
= CIInconsistency InconsistentMetadata
| CIDependency
MetadataObject
-- ^ for error reporting on missing dependencies
SchemaObjId
SchemaDependency
deriving (Eq)
```
this pretty much means that remote schemas is dependent on types from databases, actions, ....
How do we fix this? Maybe introduce a typeclass such as `ArrowCollectRemoteSchemaDependencies` which is defined in `Hasura.RemoteSchema` and then implemented in graphql-engine?
1. The dependency on `buildSchemaCacheFor` in `.MetadataAPI.Execute` which has the following signature:
```haskell
buildSchemaCacheFor ::
(QErrM m, CacheRWM m, MetadataM m) =>
MetadataObjId ->
MetadataModifier ->
```
This can be easily resolved if we restrict what the metadata APIs are allowed to do. Currently, they operate in an unfettered access to modify SchemaCache (the `CacheRWM` constraint):
```haskell
runAddRemoteSchema ::
( QErrM m,
CacheRWM m,
MonadIO m,
HasHttpManagerM m,
MetadataM m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m EncJSON
```
This should instead be changed to restrict remote schema APIs to only modify remote schema metadata (but has access to the remote schemas part of the schema cache), this dependency is completely removed.
```haskell
runAddRemoteSchema ::
( QErrM m,
MonadIO m,
HasHttpManagerM m,
MonadReader RemoteSchemasSchemaCache m,
MonadState RemoteSchemaMetadata m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m RemoteSchemeMetadataObjId
```
The idea is that the core graphql-engine would call these functions and then call
`buildSchemaCacheFor`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6291
GitOrigin-RevId: 51357148c6404afe70219afa71bd1d59bdf4ffc6
### Description
This PR attempts to fix several issues with source customization as it relates to remote relationships. There were several issues regarding casing: at the relationship border, we didn't properly set the target source's case, we didn't have access to the list of supported features to decide whether the feature was allowed or not, and we didn't have access to the global default.
However, all of that information is available when we build the schema cache, as we do resolve the case of some elements such as function names: we can therefore resolve source information at the same time, and simplify both the root of the schema and the remote relationship border.
To do this, this PR introduces a new type, `ResolvedSourceCustomization`, to be used in the Schema Cache, as opposed to the metadata's `SourceCustomization`, following a pattern established by a lot of other types.
### Remaining work and open questions
One major point of confusion: it seems to me that we didn't set the case at all across remote relationships, which would suggest we would use the case of the LHS source across the subset of the RHS one that is accessible through the remote relationship, which would in turn "corrupt" the parser cache and might result in the wrong case being used for that source later on. Is that assesment correct, and was I right to fix it?
Another one is that we seem not to be using the local case of the RHS to name the field in an object relationship; unless I'm mistaken we only use it for array relationships? Is that intentional?
This PR is also missing tests that would show-case the difference, and a changelog entry. To my knowledge, all the tests of this feature are in the python test suite; this could be the opportunity to move them to the hspec suite, but this might be a considerable amount of work?
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5619
GitOrigin-RevId: 51a81b713a74575e82d9f96b51633f158ce3a47b
### Description
This PR moves some strictness annotations to a concrete use site, rather than putting `seq` in an helper function.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5436
GitOrigin-RevId: 1f279e05333ab80167ad2e18d09b8792eddc52c3
This moves `MkTypename` and `NamingCase` into their own modules, with the intent of reducing the scope of the schema parsers code, and trying to reduce imports of large modules when small ones will do.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4978
GitOrigin-RevId: 19541257fe010035390f6183a4eaa37bae0d3ca1
### Description
This small clean-up PR makes one further step towards backend-agnostic actions: it makes all the code parsing custom types backend agnostic. Surprisingly, this could be done *without* the need to finish generalizing the column parser. The remaining sore point is async queries, that still target Postgres explicitly.
In theory, this is enough to start allowing non-Postgres scalars in custom types. In practice, however:
- no other backend exposes scalars in a way that would allow users to do that as of this PR;
- we currently have no strategy to avoid / detect scalar collisions across backends.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4691
GitOrigin-RevId: bfe63fb131e306663d4406697ce23c02736566c5
### Description
This PR is a first step in a series of cleanups of action relationships. This first step does not contain any behavioral change, and it simply reorganizes / prunes / rearranges / documents the code. Mainly:
- it divides some files in RQL.Types between metadata types, schema cache types, execution types;
- it renames some types for consistency;
- it minimizes exports and prunes unnecessary types;
- it moves some types in places where they make more sense;
- it replaces uses of `DMap BackendTag` with `BackendMap`.
Most of the "movement" within files re-organizes declarations in a "top-down" fashion, by moving all TH splices to the end of the file, which avoids order or declarations mattering.
### Optional list types
One main type change this PR makes is a replacement of variant list types in `CustomTypes.hs`; we had `Maybe [a]`, or sometimes `Maybe (NonEmpty a)`. This PR harmonizes all of them to `[a]`, as most of the code would use them as such, by doing `fromMaybe []` or `maybe [] toList`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4613
GitOrigin-RevId: bc624e10df587eba862ff27a5e8021b32d0d78a2
By generalizing the instances, they can be written as attached instance derivations, rather than standalone ones.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4518
GitOrigin-RevId: 7a387911cf6ad46fe6acd36648275d6c2c68ffe3
### Description
As part of the cache building process, we create / update / migrate the catalog that each DB uses as a place to store event trigger information. The function that decides how this should be done was doing an explicit `case ... of` on the backend tag, instead of delegating to one of the backend classes. The downsides of this is that:
- it adds a "friction point" where the backend matters in the core of the engine, which is otherwise written to be almost entirely backend-agnostic
- it creates imports from deep in the engine to the `Backends`, which we try to restrict to a very small set of clearly identified files (the `Instances` files)
- it is currently implemented using a "catch all" default case, which might not always be correct for new backends
This PR makes the catalog updating process a part of `BackendMetadata`, and cleans the corresponding schema cache code.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4457
GitOrigin-RevId: 592f0eaa97a7c38f4e6d4400e1d2353aab12c97e
## Description
This PR removes `RQL.Types`, which was now only re-exporting a bunch of unrelated modules.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4363
GitOrigin-RevId: 894f29a19bff70b3dad8abc5d9858434d5065417
### Description
`HasSystemDefined` is defined in `RQL.Types`, but only used in one place, `LegacyCatalog`, to avoid passing a boolean around. It is easily replaced by an ad-hoc `ReaderT`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4337
GitOrigin-RevId: 649d758bb2b18b39533429dda5ab71afde62fb53
### Description
Small PR that moves code out of `RQL.Types.hs`. Specifically, it moves `HasServerConfigCtx` to where `ServerConfigCtx` is defined. This removes code from `RQL.Types`, makes the dependency on `Server.Types` more explicit, and will make some further cleanups easier.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4336
GitOrigin-RevId: 95bb3467d741763892c4e68a38760497157ba1aa
### Motivation
#2338 introduced a way to validate REST queries against the metadata after a change, to properly report any inconsistency that would emerge from a change in the underlying structure of our schema. However, the way this was done was quite complex and error-prone. Namely: we would use the generated schema parsers to statically execute an introspection query, similar to the one we use for remote schemas, then parse the resulting bytestring as it were coming from a remote schema.
This led to several issues: the code was using remote schema primitives, and was associated with remote schema code, despite being unrelated, which led to absurd situations like creating fake `Variable`s whose type was also their name. A lot of the code had to deal with the fact that we might fail to re-parse our own schema. Additionally, some of it was dead code, that for some reason GHC did not warn about? But more fundamentally, this architecture decision creates a dependency between unrelated pieces of the engine: modifying the internal processing of root fields or the introspection of remote schemas now risks impacting the unrelated `OpenAPI` feature.
### Description
This PR decouples that process from the remote schema introspection logic and from the execution engine by making `Analyse` and `OpenAPI` work on the generic `G.SchemaIntrospection` instead. To accomplish this, it:
- adds `GraphQL.Parser.Schema.Convert`, to convert from our "live" schema back to a flat `SchemaIntrospection`
- persists in the schema cache the `admin` introspection generated when building the schema, and uses it both for validation and for generating the `OpenAPI`.
### Known issues and limitations
This adds a bit of memory pressure to the engine, as we persist the entire schema in the schema cache. This might be acceptable in the short-term, but we have several potential ideas going forward should this be a problem:
- cache the result of `Analyze`: when it becomes possible to build the `OpenAPI` purely with the result of `Analyze` without any additional schema information, then we could cache that instead, reducing the footprint
- caching the `OpenAPI`: if it doesn't need to change every time the endpoint is queried, then it should be possible to cache the entire `OpenAPI` object instead of the schema
- cache a copy of the `FieldParsers` used to generate the schema: as those are persisted through the GraphQL `Context`, and are the only input required to generate the `Schema`, making them accessible in the schema cache would allow us to have the exact same feature with no additional memory cost, at the price of a slightly slower and more complicated process (need to rebuild the `Schema` every time we query the OpenAPI endpoint)
- cache nothing at all, and rebuild the admin schema from scratch every time.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3962
Co-authored-by: paritosh-08 <85472423+paritosh-08@users.noreply.github.com>
GitOrigin-RevId: a8b9808170b231fdf6787983b4a9ed286cde27e0
### Description
This small PR improves the representation of an endpoint method from `Text` to an enum of the supported methods. Additionally, it cleans some of the instances defined on surrounding types (such as Postgres-specific instances on Endpoint types).
Due to a name conflict, this makes `RQL.Types.Endpoint` impossible to re-export from `RQL.Types`, which in turn forces several other modules to import it explicitly, which I think is fine since we want to ultimately get rid of `RQL.Types`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3965
GitOrigin-RevId: 33869007d0d818ddf486fb61d1f6099f9dad7570
…rmance
It makes sense to try to utilize multiple threads for metadata
operations since we expect them to come one at a time (and likely at
lower load periods anyway).
As noted, although we build roles in parallel now, the admin role is
still a bottleneck. For replace_metadata on huge_schema, on my machine
I get:
BEFORE: 22.7 sec
AFTER: 13.5 sec
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3911
GitOrigin-RevId: 4d4ee6ac8b5506603e70e4fc666a3aacc054d493
### Description
Several libraries define `catMaybes` as `mapMaybe id`. We had it defined in `Data.HashMap.Strict.Extended` already. This small PR also defines it in `Extended` modules for other containers and replaces every occurrence of `mapMaybe id` accordingly.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3884
GitOrigin-RevId: d222a2ca2f4eb9b725b20450a62a626d3886dbf4
We build the GraphQL schema by combining building blocks such as `tableSelectionSet` and `columnParser`. These building blocks individually build `{InputFields,Field,}Parser` objects. Those object specify the valid GraphQL schema.
Since the GraphQL schema is role-dependent, at some point we need to know what fragment of the GraphQL schema a specific role is allowed to access, and this is stored in `{Sel,Upd,Ins,Del}PermInfo` objects.
We have passed around these permission objects as function arguments to the schema building blocks since we first started dealing with permissions during the PDV refactor - see hasura/graphql-engine@5168b99e46 in hasura/graphql-engine#4111. This means that, for instance, `tableSelectionSet` has as its type:
```haskell
tableSelectionSet ::
forall b r m n.
MonadBuildSchema b r m n =>
SourceName ->
TableInfo b ->
SelPermInfo b ->
m (Parser 'Output n (AnnotatedFields b))
```
There are three reasons to change this.
1. We often pass a `Maybe (xPermInfo b)` instead of a proper `xPermInfo b`, and it's not clear what the intended semantics of this is. Some potential improvements on the data types involved are discussed in issue hasura/graphql-engine-mono#3125.
2. In most cases we also already pass a `TableInfo b`, and together with the `MonadRole` that is usually also in scope, this means that we could look up the required permissions regardless: so passing the permissions explicitly undermines the "single source of truth" principle. Breaking this principle also makes the code more difficult to read.
3. We are working towards role-based parsers (see hasura/graphql-engine-mono#2711), where the `{InputFields,Field,}Parser` objects are constructed in a role-invariant way, so that we have a single object that can be used for all roles. In particular, this means that the schema building blocks _need_ to be constructed in a role-invariant way. While this PR doesn't accomplish that, it does reduce the amount of role-specific arguments being passed, thus fixing hasura/graphql-engine-mono#3068.
Concretely, this PR simply drops the `xPermInfo b` argument from almost all schema building blocks. Instead these objects are looked up from the `TableInfo b` as-needed. The resulting code is considerably simpler and shorter.
One way to interpret this change is as follows. Before this PR, we figured out permissions at the top-level in `Hasura.GraphQL.Schema`, passing down the obtained `xPermInfo` objects as required. After this PR, we have a bottom-up approach where the schema building blocks themselves decide whether they want to be included for a particular role.
So this moves some permission logic out of `Hasura.GraphQL.Schema`, which is very complex.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/3608
GitOrigin-RevId: 51a744f34ec7d57bc8077667ae7f9cb9c4f6c962