graphql-engine/server/src-lib/Hasura/RQL/DDL/Schema/Cache/Common.hs
Vamshi Surabhi a01d1188f2 scaffolding for remote-schemas module
The main aim of the PR is:

1. To set up a module structure for 'remote-schemas' package.
2. Move parts of the remote schema codebase into the new module structure to validate it.

## Notes to the reviewer

Why a PR with large-ish diff?

1. We've been making progress on the MM project but we don't yet know how long it is going to take us to reach the first milestone. To understand this better, we need to figure out the unknowns as soon as possible. Hence I've taken a stab at the first two items in the [end-state](https://gist.github.com/0x777/ca2bdc4284d21c3eec153b51dea255c9) document to figure out the unknowns. Unsurprisingly, there are a bunch of issues that we haven't discussed earlier. These are documented in the 'open questions' section.

1. The diff is large, but it consists only of code being moved around, and I've added a section that documents how things are moved. In addition, there are a fair number of PR comments to help with the review process.

## Changes in the PR

### Module structure

Sets up the module structure as follows:

```
Hasura/
  RemoteSchema/
    Metadata/
      Types.hs
    SchemaCache/
      Types.hs
      Permission.hs
      RemoteRelationship.hs
      Build.hs
    MetadataAPI/
      Types.hs
      Execute.hs
```

### 1. Types representing metadata are moved

Types that capture metadata information (currently scattered across several RQL modules) are moved into `Hasura.RemoteSchema.Metadata.Types`.

- This new module only depends on very 'core' modules such as
  `Hasura.Session` for the notion of roles and `Hasura.Incremental` for `Cacheable` typeclass.

- The requirement on database modules is avoided by generalizing the remote schemas metadata to accept an arbitrary 'r' for a remote relationship
  definition.
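
As a minimal sketch of that generalization (not the actual definitions in `Hasura.RemoteSchema.Metadata.Types`), the metadata record can be parameterized over the relationship definition so that the remote-schemas code never needs to know what a relationship points at. All names below (`RemoteSchemaMetadataSketch`, `EngineRelDef`, and so on) are hypothetical stand-ins; only the idea of an arbitrary `r` comes from the description above.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import qualified Data.Map.Strict as Map

-- Hypothetical, simplified stand-in for a relationship name.
newtype RelName = RelName String
  deriving (Eq, Ord, Show)

-- Metadata that is generic in the relationship definition 'r'.
-- Nothing in this type refers to sources or database backends.
data RemoteSchemaMetadataSketch r = RemoteSchemaMetadataSketch
  { rsmName :: String,
    rsmRelationships :: Map.Map RelName r
  }
  deriving (Eq, Show, Functor)

-- Package-side logic only touches the parts that do not depend on 'r'.
relationshipNames :: RemoteSchemaMetadataSketch r -> [RelName]
relationshipNames = Map.keys . rsmRelationships

-- Engine-side: a concrete (hypothetical) relationship definition that is
-- free to mention sources, because it lives outside the package.
data EngineRelDef
  = ToSource String
  | ToRemoteSchema String
  deriving (Eq, Show)

exampleMetadata :: RemoteSchemaMetadataSketch EngineRelDef
exampleMetadata =
  RemoteSchemaMetadataSketch
    { rsmName = "countries",
      rsmRelationships =
        Map.fromList
          [ (RelName "profile", ToRemoteSchema "users"),
            (RelName "orders", ToSource "default")
          ]
    }
```

Here `relationshipNames exampleMetadata` works for any choice of `r`, which is the property that keeps the metadata module free of database imports.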

### 2. SchemaCache related types and build logic have been moved

Types that represent remote schemas information in SchemaCache are moved into `Hasura.RemoteSchema.SchemaCache.Types`.

Similar to `H.RS.Metadata.Types`, this module depends on 'core' modules except for `Hasura.GraphQL.Parser.Variable`. It has something to do with remote relationships but I haven't spent time looking into it. The validation of 'remote relationships to remote schema' is also something that needs to be looked at.

Rips out the logic that builds remote schema's SchemaCache information from the monolithic `buildSchemaCacheRule` and moves it into `Hasura.RemoteSchema.SchemaCache.Build`. Further, the `.SchemaCache.Permission` and `.SchemaCache.RemoteRelationship` have been created from existing modules that capture schema cache building logic for those two components.

This was a fair amount of work. On main, currently remote schema's SchemaCache information is built in two phases - in the first phase, 'permissions' and 'remote relationships' are ignored and in the second phase they are filled in.

While remote relationships can only be resolved after partially resolving sources and other remote schemas, the same isn't true for permissions. Further, most of the work that is done to resolve remote relationships can be moved to the first phase so that the second phase can be a very simple traversal.

This is the approach that was taken: resolve permissions and as much of the remote relationship information as possible in the first phase.
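
The two-phase idea can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in `Hasura.RemoteSchema.SchemaCache.Build`: the types `SchemaCtx`, `PendingRel` and `ResolvedRel` and the functions `phaseOne` and `phaseTwo` are hypothetical. The point it shows is that if the first phase does everything that needs no outside information, the second phase reduces to a single `traverse` over the relationship slot.

```haskell
{-# LANGUAGE DeriveTraversable #-}

import qualified Data.Map.Strict as Map

-- A schema context that is generic in the state of its relationships.
data SchemaCtx rel = SchemaCtx
  { scPermissions :: [String], -- fully resolved in phase one
    scRelationships :: Map.Map String rel
  }
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- After phase one a relationship still refers to its target by name only.
newtype PendingRel = PendingRel String
  deriving (Eq, Show)

-- After phase two the target is known to exist.
newtype ResolvedRel = ResolvedRel String
  deriving (Eq, Show)

-- Phase one: resolve permissions and prepare relationships without
-- consulting any other source or remote schema.
phaseOne :: [String] -> [(String, String)] -> SchemaCtx PendingRel
phaseOne roles rels =
  SchemaCtx
    { scPermissions = roles,
      scRelationships = Map.fromList [(name, PendingRel target) | (name, target) <- rels]
    }

-- Phase two: a plain traversal that fills in what needed the
-- partially resolved sources and remote schemas.
phaseTwo :: [String] -> SchemaCtx PendingRel -> Either String (SchemaCtx ResolvedRel)
phaseTwo knownTargets = traverse resolve
  where
    resolve (PendingRel target)
      | target `elem` knownTargets = Right (ResolvedRel target)
      | otherwise = Left ("unknown target: " <> target)
```

For example, `phaseTwo ["default"] (phaseOne ["admin"] [("orders", "default")])` succeeds, while an unknown target yields a `Left`.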

### 3. Metadata APIs related types and build logic have been moved

The types that represent remote schema related metadata APIs and the execution logic have been moved to `Hasura.RemoteSchema.MetadataAPI.Types` and `.Execute` modules respectively.

## Open questions:

1. `Hasura.RemoteSchema.Metadata.Types` is so named because I was hoping that all of the metadata-related APIs of remote schemas could be brought in at `Hasura.RemoteSchema.Metadata.API`. However, as the metadata APIs depend on functions from the `SchemaCache` module (see [1](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L55)) and [2](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L91))), it made more sense to create a separate top-level module for the `MetadataAPI`s.

   Maybe we can just have `Hasura.RemoteSchema.Metadata` and get rid of the extra nesting or have `Hasura.RemoteSchema.Metadata.{Core,Permission,RemoteRelationship}` if we want to break them down further.

1. `buildRemoteSchemas` in `H.RS.SchemaCache.Build` has the following type:

   ```haskell
   buildRemoteSchemas ::
     ( ArrowChoice arr,
       Inc.ArrowDistribute arr,
       ArrowWriter (Seq CollectedInfo) arr,
       Inc.ArrowCache m arr,
       MonadIO m,
       HasHttpManagerM m,
       Inc.Cacheable remoteRelationshipDefinition,
       ToJSON remoteRelationshipDefinition,
       MonadError QErr m
     ) =>
     Env.Environment ->
     ( (Inc.Dependency (HashMap RemoteSchemaName Inc.InvalidationKey), OrderedRoles),
       [RemoteSchemaMetadataG remoteRelationshipDefinition]
     )
       `arr` HashMap RemoteSchemaName (PartiallyResolvedRemoteSchemaCtxG remoteRelationshipDefinition, MetadataObject)
   ```

   Note the dependence on `CollectedInfo` which is defined as

   ```haskell
   data CollectedInfo
     = CIInconsistency InconsistentMetadata
     | CIDependency
         MetadataObject
         -- ^ for error reporting on missing dependencies
         SchemaObjId
         SchemaDependency
     deriving (Eq)
   ```

   This pretty much means that the remote schemas module depends on types from databases, actions, and so on.

   How do we fix this? Maybe introduce a typeclass such as `ArrowCollectRemoteSchemaDependencies` which is defined in `Hasura.RemoteSchema` and then implemented in graphql-engine?

1. The dependency on `buildSchemaCacheFor` in `.MetadataAPI.Execute` which has the following signature:

   ```haskell
   buildSchemaCacheFor ::
     (QErrM m, CacheRWM m, MetadataM m) =>
     MetadataObjId ->
     MetadataModifier ->
   ```

   This can be easily resolved if we restrict what the metadata APIs are allowed to do. Currently, they have unfettered access to modify the SchemaCache (the `CacheRWM` constraint):

   ```haskell
   runAddRemoteSchema ::
     ( QErrM m,
       CacheRWM m,
       MonadIO m,
       HasHttpManagerM m,
       MetadataM m,
       Tracing.MonadTrace m
     ) =>
     Env.Environment ->
     AddRemoteSchemaQuery ->
     m EncJSON
   ```

   Instead, the remote schema APIs should be restricted to modifying only remote schema metadata (while keeping read access to the remote schemas part of the schema cache). With that restriction, this dependency is removed completely:

   ```haskell
   runAddRemoteSchema ::
     ( QErrM m,
       MonadIO m,
       HasHttpManagerM m,
       MonadReader RemoteSchemasSchemaCache m,
       MonadState RemoteSchemaMetadata m,
       Tracing.MonadTrace m
     ) =>
     Env.Environment ->
     AddRemoteSchemaQuery ->
     m RemoteSchemaMetadataObjId
   ```

   The idea is that the core graphql-engine would call these functions and then call
   `buildSchemaCacheFor`.
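
The restricted API shape from the last open question can be sketched as below. This is a toy model under stated assumptions, not the proposed implementation: the cache and metadata are plain maps, errors are `String`s, and `addRemoteSchemaSketch`, `AddedObjId` and `runAdd` are hypothetical names. It only illustrates that a handler constrained by `MonadReader` (read-only cache) and `MonadState` (its own metadata) cannot rebuild the schema cache itself, so the caller has to do that step.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad (when)
import Control.Monad.Except (MonadError, runExcept, throwError)
import Control.Monad.Reader (MonadReader, ask, runReaderT)
import Control.Monad.State.Strict (MonadState, modify', runStateT)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins: schema name mapped to its URL.
type RemoteSchemasCacheSketch = Map.Map String String

type RemoteSchemaMetadataSketch = Map.Map String String

-- Identifier handed back to the engine so it knows what to rebuild.
newtype AddedObjId = AddedObjId String
  deriving (Eq, Show)

-- The handler can read the remote-schemas cache and modify remote schema
-- metadata, and nothing else.
addRemoteSchemaSketch ::
  ( MonadError String m,
    MonadReader RemoteSchemasCacheSketch m,
    MonadState RemoteSchemaMetadataSketch m
  ) =>
  String ->
  String ->
  m AddedObjId
addRemoteSchemaSketch name url = do
  cache <- ask
  when (Map.member name cache) $
    throwError ("remote schema already exists: " <> name)
  modify' (Map.insert name url)
  pure (AddedObjId name)

-- Engine side: run the handler, then (in the real system) rebuild the
-- schema cache for the returned object id.
runAdd ::
  RemoteSchemasCacheSketch ->
  RemoteSchemaMetadataSketch ->
  String ->
  String ->
  Either String (AddedObjId, RemoteSchemaMetadataSketch)
runAdd cache metadata name url =
  runExcept (runStateT (runReaderT (addRemoteSchemaSketch name url) cache) metadata)
```

The design choice this illustrates is that the returned object id replaces the direct call to `buildSchemaCacheFor`: the package reports what changed and the engine decides how to rebuild.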

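For the second open question, the typeclass idea could look roughly like the sketch below. The source works with arrows (`ArrowWriter (Seq CollectedInfo) arr`); for brevity this sketch uses a monadic class instead, which is a swapped-in simplification. The class `MonadCollectRemoteSchemaInfo`, its methods, and `CollectedInfoSketch` are hypothetical names; only the goal (the package defines the interface, the engine implements it) comes from the question above.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Control.Monad.Writer.Strict (Writer, execWriter, tell)

-- Package side: the interface mentions only remote-schema types.
newtype RemoteSchemaName = RemoteSchemaName String
  deriving (Eq, Show)

class Monad m => MonadCollectRemoteSchemaInfo m where
  recordInconsistency :: RemoteSchemaName -> String -> m ()
  recordDependency :: RemoteSchemaName -> RemoteSchemaName -> m ()

-- Package side: build logic is written against the class only.
checkRelationship ::
  MonadCollectRemoteSchemaInfo m =>
  [RemoteSchemaName] ->
  RemoteSchemaName ->
  RemoteSchemaName ->
  m ()
checkRelationship known from to
  | to `elem` known = recordDependency from to
  | otherwise = recordInconsistency from ("unknown remote schema: " <> show to)

-- Engine side: a simplified stand-in for the engine-wide collected info,
-- which is free to cover databases, actions, and everything else.
data CollectedInfoSketch
  = CIInconsistencySketch String
  | CIDependencySketch String String
  deriving (Eq, Show)

-- Engine side: the instance translates package-level events into the
-- engine's own representation.
instance MonadCollectRemoteSchemaInfo (Writer [CollectedInfoSketch]) where
  recordInconsistency (RemoteSchemaName name) message =
    tell [CIInconsistencySketch (name <> ": " <> message)]
  recordDependency (RemoteSchemaName from) (RemoteSchemaName to) =
    tell [CIDependencySketch from to]
```

With this split, the package never imports `CollectedInfo`; whether an arrow-based class can be made equally small is part of the open question.
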
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6291
GitOrigin-RevId: 51357148c6404afe70219afa71bd1d59bdf4ffc6
2022-10-21 03:15:04 +00:00

370 lines
13 KiB
Haskell

{-# LANGUAGE Arrows #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE UndecidableInstances #-}

-- | Types/functions shared between modules that implement "Hasura.RQL.DDL.Schema.Cache". Other
-- modules should not import this module directly.
module Hasura.RQL.DDL.Schema.Cache.Common
  ( ApolloFederationConfig (..),
    ApolloFederationVersion (..),
    BackendInvalidationKeysWrapper (..),
    BuildOutputs (..),
    CacheBuild,
    CacheBuildParams (CacheBuildParams),
    InvalidationKeys (..),
    ikMetadata,
    ikRemoteSchemas,
    ikSources,
    ikBackends,
    NonColumnTableInputs (..),
    RebuildableSchemaCache (RebuildableSchemaCache, lastBuiltSchemaCache),
    TableBuildInput (TableBuildInput, _tbiName),
    TablePermissionInputs (..),
    addTableContext,
    bindErrorA,
    boAllowlist,
    boApiLimits,
    boMetricsConfig,
    boTlsAllowlist,
    boActions,
    boCronTriggers,
    boCustomTypes,
    boBackendCache,
    boEndpoints,
    boQueryCollections,
    boRemoteSchemas,
    boRoles,
    boSources,
    buildInfoMap,
    buildInfoMapPreservingMetadata,
    initialInvalidationKeys,
    invalidateKeys,
    mkTableInputs,
    runCacheBuild,
    runCacheBuildM,
    withRecordDependencies,
  )
where

import Control.Arrow.Extended
import Control.Arrow.Interpret
import Control.Lens
import Control.Monad.Trans.Control (MonadBaseControl)
import Control.Monad.Unique
import Data.HashMap.Strict.Extended qualified as M
import Data.HashMap.Strict.InsOrd qualified as OMap
import Data.Sequence qualified as Seq
import Data.Text.Extended
import Hasura.Base.Error
import Hasura.Incremental qualified as Inc
import Hasura.Prelude
import Hasura.RQL.Types.Allowlist
import Hasura.RQL.Types.ApiLimit
import Hasura.RQL.Types.Backend
import Hasura.RQL.Types.Common
import Hasura.RQL.Types.CustomTypes
import Hasura.RQL.Types.Endpoint
import Hasura.RQL.Types.EventTrigger
import Hasura.RQL.Types.Metadata
import Hasura.RQL.Types.Metadata.Backend (BackendMetadata (..))
import Hasura.RQL.Types.Metadata.Instances ()
import Hasura.RQL.Types.Metadata.Object
import Hasura.RQL.Types.Network
import Hasura.RQL.Types.Permission
import Hasura.RQL.Types.QueryCollection
import Hasura.RQL.Types.Relationships.Local
import Hasura.RQL.Types.Relationships.Remote
import Hasura.RQL.Types.Roles
import Hasura.RQL.Types.SchemaCache
import Hasura.RQL.Types.SchemaCache.Build
import Hasura.RQL.Types.Source
import Hasura.RemoteSchema.Metadata
import Hasura.SQL.Backend
import Hasura.SQL.BackendMap (BackendMap)
import Hasura.SQL.BackendMap qualified as BackendMap
import Hasura.Server.Types
import Hasura.Session
import Network.HTTP.Client.Manager (HasHttpManagerM (..))
import Network.HTTP.Client.Transformable qualified as HTTP

newtype BackendInvalidationKeysWrapper (b :: BackendType) = BackendInvalidationKeysWrapper
  { unBackendInvalidationKeysWrapper :: BackendInvalidationKeys b
  }

deriving newtype instance Eq (BackendInvalidationKeys b) => Eq (BackendInvalidationKeysWrapper b)

deriving newtype instance Ord (BackendInvalidationKeys b) => Ord (BackendInvalidationKeysWrapper b)

deriving newtype instance Inc.Cacheable (BackendInvalidationKeys b) => Inc.Cacheable (BackendInvalidationKeysWrapper b)

deriving newtype instance Show (BackendInvalidationKeys b) => Show (BackendInvalidationKeysWrapper b)

deriving newtype instance Semigroup (BackendInvalidationKeys b) => Semigroup (BackendInvalidationKeysWrapper b)

deriving newtype instance Monoid (BackendInvalidationKeys b) => Monoid (BackendInvalidationKeysWrapper b)

instance Inc.Select (BackendInvalidationKeysWrapper b)

-- | 'InvalidationKeys' used to apply requested 'CacheInvalidations'.
data InvalidationKeys = InvalidationKeys
  { _ikMetadata :: Inc.InvalidationKey,
    _ikRemoteSchemas :: HashMap RemoteSchemaName Inc.InvalidationKey,
    _ikSources :: HashMap SourceName Inc.InvalidationKey,
    _ikBackends :: BackendMap BackendInvalidationKeysWrapper
  }
  deriving (Show, Eq, Generic)

instance Inc.Cacheable InvalidationKeys

instance Inc.Select InvalidationKeys

$(makeLenses ''InvalidationKeys)

initialInvalidationKeys :: InvalidationKeys
initialInvalidationKeys = InvalidationKeys Inc.initialInvalidationKey mempty mempty mempty

invalidateKeys :: CacheInvalidations -> InvalidationKeys -> InvalidationKeys
invalidateKeys CacheInvalidations {..} InvalidationKeys {..} =
  InvalidationKeys
    { _ikMetadata = if ciMetadata then Inc.invalidate _ikMetadata else _ikMetadata,
      _ikRemoteSchemas = foldl' (flip invalidate) _ikRemoteSchemas ciRemoteSchemas,
      _ikSources = foldl' (flip invalidate) _ikSources ciSources,
      _ikBackends = BackendMap.modify @'DataConnector invalidateDataConnectors _ikBackends
    }
  where
    invalidate ::
      (Eq a, Hashable a) =>
      a ->
      HashMap a Inc.InvalidationKey ->
      HashMap a Inc.InvalidationKey
    invalidate = M.alter $ Just . maybe Inc.initialInvalidationKey Inc.invalidate

    invalidateDataConnectors :: BackendInvalidationKeysWrapper 'DataConnector -> BackendInvalidationKeysWrapper 'DataConnector
    invalidateDataConnectors (BackendInvalidationKeysWrapper invalidationKeys) =
      BackendInvalidationKeysWrapper $ foldl' (flip invalidate) invalidationKeys ciDataConnectors

data TableBuildInput b = TableBuildInput
  { _tbiName :: TableName b,
    _tbiIsEnum :: Bool,
    _tbiConfiguration :: TableConfig b,
    _tbiApolloFederationConfig :: Maybe ApolloFederationConfig
  }
  deriving (Show, Eq, Generic)

instance (Backend b) => NFData (TableBuildInput b)

instance (Backend b) => Inc.Cacheable (TableBuildInput b)

data NonColumnTableInputs b = NonColumnTableInputs
  { _nctiTable :: TableName b,
    _nctiObjectRelationships :: [ObjRelDef b],
    _nctiArrayRelationships :: [ArrRelDef b],
    _nctiComputedFields :: [ComputedFieldMetadata b],
    _nctiRemoteRelationships :: [RemoteRelationship]
  }
  deriving (Show, Eq, Generic)

data TablePermissionInputs b = TablePermissionInputs
  { _tpiTable :: TableName b,
    _tpiInsert :: [InsPermDef b],
    _tpiSelect :: [SelPermDef b],
    _tpiUpdate :: [UpdPermDef b],
    _tpiDelete :: [DelPermDef b]
  }
  deriving (Generic)

deriving instance (Backend b) => Show (TablePermissionInputs b)

deriving instance (Backend b) => Eq (TablePermissionInputs b)

instance (Backend b) => Inc.Cacheable (TablePermissionInputs b)

mkTableInputs ::
  TableMetadata b -> (TableBuildInput b, NonColumnTableInputs b, TablePermissionInputs b)
mkTableInputs TableMetadata {..} =
  (buildInput, nonColumns, permissions)
  where
    buildInput = TableBuildInput _tmTable _tmIsEnum _tmConfiguration _tmApolloFederationConfig
    nonColumns =
      NonColumnTableInputs
        _tmTable
        (OMap.elems _tmObjectRelationships)
        (OMap.elems _tmArrayRelationships)
        (OMap.elems _tmComputedFields)
        (OMap.elems _tmRemoteRelationships)
    permissions =
      TablePermissionInputs
        _tmTable
        (OMap.elems _tmInsertPermissions)
        (OMap.elems _tmSelectPermissions)
        (OMap.elems _tmUpdatePermissions)
        (OMap.elems _tmDeletePermissions)

-- | The direct output of 'buildSchemaCacheRule'. Contains most of the things necessary to build a
-- schema cache, but dependencies and inconsistent metadata objects are collected via a separate
-- 'MonadWriter' side channel.
data BuildOutputs = BuildOutputs
  { _boSources :: SourceCache,
    _boActions :: ActionCache,
    -- | We preserve the 'MetadataObject' from the original catalog metadata in the output so we can
    -- reuse it later if we need to mark the remote schema inconsistent during GraphQL schema
    -- generation (because of field conflicts).
    _boRemoteSchemas :: HashMap RemoteSchemaName (RemoteSchemaCtx, MetadataObject),
    _boAllowlist :: InlinedAllowlist,
    _boCustomTypes :: AnnotatedCustomTypes,
    _boCronTriggers :: M.HashMap TriggerName CronTriggerInfo,
    _boEndpoints :: M.HashMap EndpointName (EndpointMetadata GQLQueryWithText),
    _boApiLimits :: ApiLimit,
    _boMetricsConfig :: MetricsConfig,
    _boRoles :: HashMap RoleName Role,
    _boTlsAllowlist :: [TlsAllow],
    _boQueryCollections :: QueryCollections,
    _boBackendCache :: BackendCache
  }

$(makeLenses ''BuildOutputs)

-- | Parameters required for schema cache build
data CacheBuildParams = CacheBuildParams
  { _cbpManager :: HTTP.Manager,
    _cbpPGSourceResolver :: SourceResolver ('Postgres 'Vanilla),
    _cbpMSSQLSourceResolver :: SourceResolver 'MSSQL,
    _cbpServerConfigCtx :: ServerConfigCtx
  }

-- | The monad in which 'RebuildableSchemaCache' is being run
newtype CacheBuild a = CacheBuild (ReaderT CacheBuildParams (ExceptT QErr IO) a)
  deriving
    ( Functor,
      Applicative,
      Monad,
      MonadError QErr,
      MonadReader CacheBuildParams,
      MonadIO,
      MonadBase IO,
      MonadBaseControl IO,
      MonadUnique
    )

instance HasHttpManagerM CacheBuild where
  askHttpManager = asks _cbpManager

instance HasServerConfigCtx CacheBuild where
  askServerConfigCtx = asks _cbpServerConfigCtx

instance MonadResolveSource CacheBuild where
  getPGSourceResolver = asks _cbpPGSourceResolver
  getMSSQLSourceResolver = asks _cbpMSSQLSourceResolver

runCacheBuild ::
  ( MonadIO m,
    MonadError QErr m
  ) =>
  CacheBuildParams ->
  CacheBuild a ->
  m a
runCacheBuild params (CacheBuild m) = do
  liftEitherM $ liftIO $ runExceptT (runReaderT m params)

runCacheBuildM ::
  ( MonadIO m,
    MonadError QErr m,
    HasHttpManagerM m,
    HasServerConfigCtx m,
    MonadResolveSource m
  ) =>
  CacheBuild a ->
  m a
runCacheBuildM m = do
  params <-
    CacheBuildParams
      <$> askHttpManager
      <*> getPGSourceResolver
      <*> getMSSQLSourceResolver
      <*> askServerConfigCtx
  runCacheBuild params m

data RebuildableSchemaCache = RebuildableSchemaCache
  { lastBuiltSchemaCache :: SchemaCache,
    _rscInvalidationMap :: InvalidationKeys,
    _rscRebuild :: Inc.Rule (ReaderT BuildReason CacheBuild) (Metadata, InvalidationKeys) SchemaCache
  }

bindErrorA ::
  (ArrowChoice arr, ArrowKleisli m arr, ArrowError e arr, MonadError e m) =>
  arr (m a) a
bindErrorA = liftEitherA <<< arrM \m -> (Right <$> m) `catchError` (pure . Left)
{-# INLINE bindErrorA #-}

withRecordDependencies ::
  (ArrowWriter (Seq CollectedInfo) arr) =>
  WriterA (Seq SchemaDependency) arr (e, s) a ->
  arr (e, (MetadataObject, (SchemaObjId, s))) a
withRecordDependencies f = proc (e, (metadataObject, (schemaObjectId, s))) -> do
  (result, dependencies) <- runWriterA f -< (e, s)
  recordDependencies -< (metadataObject, schemaObjectId, toList dependencies)
  returnA -< result
{-# INLINEABLE withRecordDependencies #-}

noDuplicates ::
  (MonadWriter (Seq CollectedInfo) m) =>
  (a -> MetadataObject) ->
  [a] ->
  m (Maybe a)
noDuplicates mkMetadataObject = \case
  [] -> pure Nothing
  [value] -> pure $ Just value
  values@(value : _) -> do
    let objectId = _moId $ mkMetadataObject value
        definitions = map (_moDefinition . mkMetadataObject) values
    tell $ Seq.singleton $ CIInconsistency (DuplicateObjects objectId definitions)
    return Nothing

-- | Processes a list of catalog metadata into a map of processed information, marking any duplicate
-- entries inconsistent.
buildInfoMap ::
  ( ArrowChoice arr,
    Inc.ArrowDistribute arr,
    ArrowWriter (Seq CollectedInfo) arr,
    Eq k,
    Hashable k
  ) =>
  (a -> k) ->
  (a -> MetadataObject) ->
  (e, a) `arr` Maybe b ->
  (e, [a]) `arr` HashMap k b
buildInfoMap extractKey mkMetadataObject buildInfo = proc (e, infos) ->
  (M.groupOn extractKey infos >- returnA)
    >-> (|
          Inc.keyed
            ( \_ duplicateInfos ->
                (noDuplicates mkMetadataObject duplicateInfos >- interpretWriter)
                  >-> (| traverseA (\info -> (e, info) >- buildInfo) |)
                  >-> (\info -> join info >- returnA)
            )
        |)
    >-> (\infoMap -> catMaybes infoMap >- returnA)
{-# INLINEABLE buildInfoMap #-}

-- | Like 'buildInfoMap', but includes each processed info's associated 'MetadataObject' in the result.
-- This is useful if the results will be further processed, and the 'MetadataObject' is still needed
-- to mark the object inconsistent.
buildInfoMapPreservingMetadata ::
  ( ArrowChoice arr,
    Inc.ArrowDistribute arr,
    ArrowWriter (Seq CollectedInfo) arr,
    Eq k,
    Hashable k
  ) =>
  (a -> k) ->
  (a -> MetadataObject) ->
  (e, a) `arr` Maybe b ->
  (e, [a]) `arr` HashMap k (b, MetadataObject)
buildInfoMapPreservingMetadata extractKey mkMetadataObject buildInfo =
  buildInfoMap extractKey mkMetadataObject proc (e, info) ->
    ((e, info) >- buildInfo) >-> \result -> result <&> (,mkMetadataObject info) >- returnA
{-# INLINEABLE buildInfoMapPreservingMetadata #-}

addTableContext :: (Backend b) => TableName b -> Text -> Text
addTableContext tableName e = "in table " <> tableName <<> ": " <> e