graphql-engine/server/src-lib/Hasura/RQL/DDL/RemoteRelationship.hs

{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE UndecidableInstances #-}
module Hasura.RQL.DDL.RemoteRelationship
( CreateFromSourceRelationship (..),
runCreateRemoteRelationship,
runDeleteRemoteRelationship,
runUpdateRemoteRelationship,
DeleteFromSourceRelationship (..),
dropRemoteRelationshipInMetadata,
PartiallyResolvedSource (..),
buildRemoteFieldInfo,
CreateRemoteSchemaRemoteRelationship (..),
runCreateRemoteSchemaRemoteRelationship,
runUpdateRemoteSchemaRemoteRelationship,
DeleteRemoteSchemaRemoteRelationship (..),
runDeleteRemoteSchemaRemoteRelationship,
getRemoteSchemaEntityJoinColumns,
)
where
import Control.Lens (at, non, to, (^?))
import Data.Aeson (FromJSON (..), ToJSON (..), (.!=), (.:), (.:?), (.=))
import Data.Aeson qualified as J
import Data.Aeson.KeyMap qualified as KM
import Data.Aeson.TH qualified as J
import Data.HashMap.Strict qualified as Map
import Data.HashMap.Strict.InsOrd qualified as OMap
import Data.Sequence qualified as Seq
import Data.Text.Extended ((<<>), (<>>))
import Hasura.Base.Error
( Code (NotExists, NotFound, NotSupported, RemoteSchemaError),
QErr,
QErrM,
runAesonParser,
throw400,
)
import Hasura.EncJSON (EncJSON)
import Hasura.Prelude
import Hasura.RQL.Types.Backend
import Hasura.RQL.Types.Column
import Hasura.RQL.Types.Common
import Hasura.RQL.Types.Metadata
import Hasura.RQL.Types.Metadata.Backend
import Hasura.RQL.Types.Metadata.Object
import Hasura.RQL.Types.Relationships.Remote
import Hasura.RQL.Types.Relationships.ToSource
import Hasura.RQL.Types.SchemaCache
import Hasura.RQL.Types.SchemaCache.Build
import Hasura.RQL.Types.SchemaCacheTypes
import Hasura.RQL.Types.Source
import Hasura.RQL.Types.Table
import Hasura.RemoteSchema.Metadata
import Hasura.RemoteSchema.SchemaCache
import Hasura.SQL.AnyBackend qualified as AB
import Hasura.SQL.Backend
import Language.GraphQL.Draft.Syntax qualified as G
--------------------------------------------------------------------------------
-- Create or update relationship from source
-- | Argument to the @_create_remote_relationship@ and
-- @_update_remote_relationship@ families of metadata commands.
--
-- For historical reasons, this type is also used to represent a db-to-remote-schema relationship
-- in the metadata.
data CreateFromSourceRelationship (b :: BackendType) = CreateFromSourceRelationship
{ _crrSource :: SourceName,
_crrTable :: TableName b,
_crrName :: RelName,
_crrDefinition :: RemoteRelationshipDefinition
}
deriving stock instance (Eq (TableName b)) => Eq (CreateFromSourceRelationship b)
deriving stock instance (Show (TableName b)) => Show (CreateFromSourceRelationship b)
instance Backend b => FromJSON (CreateFromSourceRelationship b) where
parseJSON = J.withObject "CreateFromSourceRelationship" $ \o -> do
_crrSource <- o .:? "source" .!= defaultSource
_crrTable <- o .: "table"
_crrName <- o .: "name"
-- In the old format, the definition is inlined; in the new format, the
-- definition is in the "definition" object, and we don't allow legacy
-- fields to appear under it.
remoteSchema :: Maybe J.Value <- o .:? "remote_schema"
definition <- o .:? "definition"
_crrDefinition <- case (remoteSchema, definition) of
-- old format
(Just {}, Nothing) -> parseRemoteRelationshipDefinition RRPLegacy $ J.Object o
-- new format
(Nothing, Just def) -> parseRemoteRelationshipDefinition RRPStrict def
-- both or neither
_ -> fail "create_remote_relationship expects exactly one of: remote_schema, definition"
pure $ CreateFromSourceRelationship {..}
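-- For illustration only (a hedged sketch; the exact keys accepted inside a
-- definition are owned by 'parseRemoteRelationshipDefinition', so the bodies
-- below are assumed examples rather than the authoritative schema):
--
-- Old, inlined format (the definition's fields sit next to source/table/name):
--   { "source": "default", "table": "users", "name": "orders",
--     "remote_schema": "orders_schema", "hasura_fields": ["id"],
--     "remote_field": { "orders": { "arguments": { "user_id": "$id" } } } }
--
-- New format (the definition lives under a dedicated "definition" key):
--   { "source": "default", "table": "users", "name": "orders",
--     "definition":
--       { "to_remote_schema":
--           { "remote_schema": "orders_schema",
--             "lhs_fields": ["id"],
--             "remote_field": { "orders": { "arguments": { "user_id": "$id" } } } } } }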
instance (Backend b) => ToJSON (CreateFromSourceRelationship b) where
toJSON CreateFromSourceRelationship {..} =
-- We need to introspect the definition, to know whether we need to inline
-- it, or if it needs to be in a distinct "definition" object.
J.object $ case _crrDefinition of
-- old format
RelationshipToSchema RRFOldDBToRemoteSchema _ ->
case J.toJSON _crrDefinition of
-- The result of this serialization will be an empty list if this
-- conversion fails (which it should _never_ do), in which case those
-- fields will be omitted from the serialized JSON. This could only
-- happen if the ToJSON instance of RemoteRelationshipDefinition were
-- changed to return something that isn't an object.
J.Object obj -> commonFields <> KM.toList obj
_ -> []
-- new format
_ -> ("definition" .= _crrDefinition) : commonFields
where
commonFields =
[ "source" .= _crrSource,
"table" .= _crrTable,
"name" .= _crrName
]
runCreateRemoteRelationship ::
forall b m.
(MonadError QErr m, CacheRWM m, MetadataM m, BackendMetadata b) =>
CreateFromSourceRelationship b ->
m EncJSON
runCreateRemoteRelationship CreateFromSourceRelationship {..} = do
void $ askTableInfo @b _crrSource _crrTable
let metadataObj =
MOSourceObjId _crrSource $
AB.mkAnyBackend $
SMOTableObj @b _crrTable $
MTORemoteRelationship _crrName
buildSchemaCacheFor metadataObj $
MetadataModifier $
tableMetadataSetter @b _crrSource _crrTable . tmRemoteRelationships
%~ OMap.insert _crrName (RemoteRelationship _crrName _crrDefinition)
pure successMsg
runUpdateRemoteRelationship ::
forall b m.
(MonadError QErr m, CacheRWM m, MetadataM m, BackendMetadata b) =>
CreateFromSourceRelationship b ->
m EncJSON
runUpdateRemoteRelationship CreateFromSourceRelationship {..} = do
fieldInfoMap <- askTableFieldInfoMap @b _crrSource _crrTable
let metadataObj =
MOSourceObjId _crrSource $
AB.mkAnyBackend $
SMOTableObj @b _crrTable $
MTORemoteRelationship _crrName
void $ askRemoteRel fieldInfoMap _crrName
buildSchemaCacheFor metadataObj $
MetadataModifier $
tableMetadataSetter @b _crrSource _crrTable . tmRemoteRelationships
%~ OMap.insert _crrName (RemoteRelationship _crrName _crrDefinition)
pure successMsg
--------------------------------------------------------------------------------
-- Drop relationship from source
-- | Argument to the @_delete_remote_relationship@ family of metadata commands.
data DeleteFromSourceRelationship (b :: BackendType) = DeleteFromSourceRelationship
{ _drrSource :: SourceName,
_drrTable :: TableName b,
_drrName :: RelName
}
instance Backend b => FromJSON (DeleteFromSourceRelationship b) where
parseJSON = J.withObject "DeleteFromSourceRelationship" $ \o ->
DeleteFromSourceRelationship
<$> o .:? "source" .!= defaultSource
<*> o .: "table"
<*> o .: "name"
runDeleteRemoteRelationship ::
forall b m.
(BackendMetadata b, MonadError QErr m, CacheRWM m, MetadataM m) =>
DeleteFromSourceRelationship b ->
m EncJSON
runDeleteRemoteRelationship (DeleteFromSourceRelationship source table relName) = do
fieldInfoMap <- askTableFieldInfoMap @b source table
void $ askRemoteRel fieldInfoMap relName
let metadataObj =
MOSourceObjId source $
AB.mkAnyBackend $
SMOTableObj @b table $
MTORemoteRelationship relName
buildSchemaCacheFor metadataObj $
MetadataModifier $
tableMetadataSetter @b source table %~ dropRemoteRelationshipInMetadata relName
pure successMsg
--------------------------------------------------------------------------------
-- Create relationship from remote schema
data CreateRemoteSchemaRemoteRelationship = CreateRemoteSchemaRemoteRelationship
{ _crsrrRemoteSchema :: RemoteSchemaName,
_crsrrType :: G.Name,
_crsrrName :: RelName,
_crsrrDefinition :: RemoteRelationshipDefinition
}
deriving (Generic)
instance FromJSON CreateRemoteSchemaRemoteRelationship where
parseJSON = J.withObject "CreateRemoteSchemaRemoteRelationship" $ \o ->
CreateRemoteSchemaRemoteRelationship
<$> o .: "remote_schema"
<*> o .: "type_name"
<*> o .: "name"
<*> (o .: "definition" >>= parseRemoteRelationshipDefinition RRPStrict)
$(J.deriveToJSON hasuraJSON ''CreateRemoteSchemaRemoteRelationship)
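-- Illustrative payload (an assumed example; the "definition" body must already
-- be in the strict, non-legacy format because it is parsed with 'RRPStrict'):
--   { "remote_schema": "orders_schema", "type_name": "Order", "name": "user",
--     "definition":
--       { "to_source":
--           { "source": "default", "table": "users",
--             "relationship_type": "object",
--             "field_mapping": { "user_id": "id" } } } }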
runCreateRemoteSchemaRemoteRelationship ::
forall m.
(MonadError QErr m, CacheRWM m, MetadataM m) =>
CreateRemoteSchemaRemoteRelationship ->
m EncJSON
runCreateRemoteSchemaRemoteRelationship CreateRemoteSchemaRemoteRelationship {..} = do
let metadataObj =
MORemoteSchemaRemoteRelationship _crsrrRemoteSchema _crsrrType _crsrrName
buildSchemaCacheFor metadataObj $
MetadataModifier $
metaRemoteSchemas
. ix _crsrrRemoteSchema
. rsmRemoteRelationships
. at _crsrrType
. non (RemoteSchemaTypeRelationships _crsrrType mempty)
. rstrsRelationships
%~ OMap.insert _crsrrName (RemoteRelationship _crsrrName _crsrrDefinition)
pure successMsg
runUpdateRemoteSchemaRemoteRelationship ::
forall m.
(MonadError QErr m, CacheRWM m, MetadataM m) =>
CreateRemoteSchemaRemoteRelationship ->
m EncJSON
runUpdateRemoteSchemaRemoteRelationship crss@CreateRemoteSchemaRemoteRelationship {..} = do
schemaCache <- askSchemaCache
let remoteRelationship =
schemaCache
^? to scRemoteSchemas
. ix _crsrrRemoteSchema
. rscRemoteRelationships
. ix _crsrrType
. ix _crsrrName
void $
onNothing remoteRelationship $
throw400 NotExists $
"no relationship defined on remote schema " <>> _crsrrRemoteSchema <<> " with name " <>> _crsrrName
runCreateRemoteSchemaRemoteRelationship crss
--------------------------------------------------------------------------------
-- Drop relationship from remote schema
-- | Argument to the @delete_remote_schema_remote_relationship@ metadata command.
data DeleteRemoteSchemaRemoteRelationship = DeleteRemoteSchemaRemoteRelationship
{ _drsrrRemoteSchema :: RemoteSchemaName,
_drsrrTypeName :: G.Name,
_drsrrName :: RelName
}
instance FromJSON DeleteRemoteSchemaRemoteRelationship where
parseJSON = J.withObject "DeleteRemoteSchemaRemoteRelationship" $ \o ->
DeleteRemoteSchemaRemoteRelationship
<$> o .: "remote_schema"
<*> o .: "type_name"
<*> o .: "name"
runDeleteRemoteSchemaRemoteRelationship ::
forall m.
(MonadError QErr m, CacheRWM m, MetadataM m) =>
DeleteRemoteSchemaRemoteRelationship ->
m EncJSON
runDeleteRemoteSchemaRemoteRelationship DeleteRemoteSchemaRemoteRelationship {..} = do
let relName = _drsrrName
metadataObj =
MORemoteSchemaRemoteRelationship _drsrrRemoteSchema _drsrrTypeName relName
buildSchemaCacheFor metadataObj $
MetadataModifier $
metaRemoteSchemas . ix _drsrrRemoteSchema . rsmRemoteRelationships . ix _drsrrTypeName . rstrsRelationships
%~ OMap.delete relName
pure successMsg
--------------------------------------------------------------------------------
-- Schema cache building (TODO: move this elsewhere!)
-- | Internal intermediary step.
--
-- We build the output of sources in two steps:
-- 1. we first resolve sources, and collect the core info of their tables
-- 2. we then build the entire output from the collection of partially resolved sources
--
-- We need this split to be able to resolve cross-source relationships: to process one source's
-- remote relationship, we need to know about the target source's tables' core info.
--
-- This data structure is used as an argument to @AnyBackend@ in the backend-agnostic intermediary
-- collection, and used here to build remote field info.
data PartiallyResolvedSource b = PartiallyResolvedSource
{ _prsSourceMetadata :: SourceMetadata b,
_prsConfig :: SourceConfig b,
_prsIntrospection :: DBObjectsIntrospection b,
_tableCoreInfoMap :: HashMap (TableName b) (TableCoreInfoG b (ColumnInfo b) (ColumnInfo b)),
_eventTriggerInfoMap :: HashMap (TableName b) (EventTriggerInfoMap b)
}
deriving (Eq)
-- | Builds the schema cache representation of a remote relationship
-- TODO: this is not actually called by the remote relationship DDL API and is only used as part of
-- the schema cache process. Should this be moved elsewhere?
buildRemoteFieldInfo ::
QErrM m =>
-- | The entity on which the remote relationship is defined
LHSIdentifier ->
-- | join fields provided by the LHS entity
Map.HashMap FieldName lhsJoinField ->
-- | definition of remote relationship
RemoteRelationship ->
-- | Required context to process cross boundary relationships
HashMap SourceName (AB.AnyBackend PartiallyResolvedSource) ->
-- | Required context to process cross boundary relationships
PartiallyResolvedRemoteSchemaMap ->
-- | returns
-- 1. schema cache representation of the remote relationships
-- 2. the dependencies on the RHS of the join. The dependencies
-- on the LHS entities have to be handled by the calling function
m (RemoteFieldInfo lhsJoinField, Seq SchemaDependency)
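-- A rough usage sketch from a hypothetical caller (illustrative only; the
-- surrounding bindings are assumed, not defined in this module):
--
-- > (remoteFieldInfo, rhsDependencies) <-
-- >   buildRemoteFieldInfo lhsIdentifier lhsJoinFields remoteRelationship allSources remoteSchemaMap
-- > -- only the RHS dependencies are returned; the caller is expected to
-- > -- record the dependencies on the LHS columns itself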
buildRemoteFieldInfo lhsIdentifier lhsJoinFields RemoteRelationship {..} allSources remoteSchemaMap =
case _rrDefinition of
RelationshipToSource ToSourceRelationshipDef {..} -> do
targetTables <-
Map.lookup _tsrdSource allSources
`onNothing` throw400 NotFound ("source not found: " <>> _tsrdSource)
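-- dispatching on the existentially packed source brings the target backend
-- type @b'@ (and its Backend / BackendMetadata instances) into scope for the
-- rest of this block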
AB.dispatchAnyBackendWithTwoConstraints @Backend @BackendMetadata targetTables \(partiallyResolvedSource :: PartiallyResolvedSource b') -> do
let PartiallyResolvedSource _ sourceConfig _ targetTablesInfo _ = partiallyResolvedSource
unless (supportsBeingRemoteRelationshipTarget @b' sourceConfig) $
throw400 NotSupported ("source " <> sourceNameToText _tsrdSource <> " does not support being used as the target of a remote relationship")
(targetTable :: TableName b') <- runAesonParser J.parseJSON _tsrdTable
targetColumns <-
fmap _tciFieldInfoMap $
onNothing (Map.lookup targetTable targetTablesInfo) $
throw400 NotExists $
"table " <> targetTable <<> " does not exist in source: " <> sourceNameToText _tsrdSource
-- TODO: rhs fields should also ideally be DBJoinFields
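-- The two bindings below first pair each LHS join field with its target
-- column (columnPairs), then reduce each target column to its scalar type
-- and physical column name (columnMapping); enum-typed target columns are
-- rejected for now.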
columnPairs <- for (Map.toList _tsrdFieldMapping) \(srcFieldName, tgtFieldName) -> do
lhsJoinField <- askFieldInfo lhsJoinFields srcFieldName
tgtField <- askFieldInfo targetColumns tgtFieldName
pure (srcFieldName, lhsJoinField, tgtField)
columnMapping <- for columnPairs \(srcFieldName, srcColumn, tgtColumn) -> do
tgtScalar <- case ciType tgtColumn of
ColumnScalar scalarType -> pure scalarType
ColumnEnumReference _ -> throw400 NotSupported "relationships to enum fields are not supported yet"
pure (srcFieldName, (srcColumn, tgtScalar, ciColumn tgtColumn))
let rsri =
RemoteSourceFieldInfo _rrName _tsrdRelationshipType _tsrdSource sourceConfig targetTable $
fmap (\(_, tgtType, tgtColumn) -> (tgtType, tgtColumn)) $
Map.fromList columnMapping
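-- the RHS dependencies cover the target table itself plus every target
-- column referenced in the field mapping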
rhsDependencies =
SchemaDependency (SOSourceObj _tsrdSource $ AB.mkAnyBackend $ SOITable @b' targetTable) DRTable
: flip map columnPairs \(_, _srcColumn, tgtColumn) ->
SchemaDependency
( SOSourceObj _tsrdSource $
AB.mkAnyBackend $
SOITableObj @b' targetTable $
TOCol @b' $
ciColumn tgtColumn
)
DRRemoteRelationship
requiredLHSJoinFields = fmap (\(srcField, _, _) -> srcField) $ Map.fromList columnMapping
pure (RemoteFieldInfo requiredLHSJoinFields $ RFISource $ AB.mkAnyBackend @b' rsri, Seq.fromList rhsDependencies)
RelationshipToSchema _ remoteRelationship@ToSchemaRelationshipDef {..} -> do
RemoteSchemaCtx {..} <-
onNothing (Map.lookup _trrdRemoteSchema remoteSchemaMap) $
throw400 RemoteSchemaError $
"remote schema with name " <> _trrdRemoteSchema <<> " not found"
(requiredLHSJoinFields, remoteField) <-
validateToSchemaRelationship remoteRelationship lhsIdentifier _rrName (_rscInfo, _rscIntroOriginal) lhsJoinFields
`onLeft` (throw400 RemoteSchemaError . errorToText)
let rhsDependency = SchemaDependency (SORemoteSchema _trrdRemoteSchema) DRRemoteSchema
pure (RemoteFieldInfo requiredLHSJoinFields $ RFISchema remoteField, Seq.singleton $ rhsDependency)
getRemoteSchemaEntityJoinColumns ::
(MonadError QErr m) =>
RemoteSchemaName ->
RemoteSchemaIntrospection ->
G.Name ->
m (HashMap FieldName G.Name)
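-- A behavioural sketch against a hypothetical remote type (illustrative only):
--
-- >   type Author { id: ID!, name: String, articles(limit: Int): [Article] }
--
-- would yield fromList [("id", "id"), ("name", "name")]: only argument-less
-- fields qualify as join columns, "articles" is skipped because it takes an
-- argument, and non-object types are rejected.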
getRemoteSchemaEntityJoinColumns remoteSchemaName introspection typeName = do
typeDefinition <-
onNothing (lookupType introspection typeName) $
throw400 NotFound ("no type named " <> typeName <<> " defined in remote schema " <>> remoteSchemaName)
case typeDefinition of
G.TypeDefinitionObject objectDefinition ->
pure $
Map.fromList $ do
fieldDefinition <- G._otdFieldsDefinition objectDefinition
guard $ null $ G._fldArgumentsDefinition fieldDefinition
pure (FieldName $ G.unName $ G._fldName fieldDefinition, G._fldName fieldDefinition)
_ -> throw400 NotSupported "remote relationships on a remote schema can only be defined on object types"