mirror of
https://github.com/hasura/graphql-engine.git
synced 2024-12-18 04:51:35 +03:00
a01d1188f2
The main aim of the PR is: 1. To set up a module structure for 'remote-schemas' package. 2. Move parts by the remote schema codebase into the new module structure to validate it. ## Notes to the reviewer Why a PR with large-ish diff? 1. We've been making progress on the MM project but we don't yet know long it is going to take us to get to the first milestone. To understand this better, we need to figure out the unknowns as soon as possible. Hence I've taken a stab at the first two items in the [end-state](https://gist.github.com/0x777/ca2bdc4284d21c3eec153b51dea255c9) document to figure out the unknowns. Unsurprisingly, there are a bunch of issues that we haven't discussed earlier. These are documented in the 'open questions' section. 1. The diff is large but that is only code moved around and I've added a section that documents how things are moved. In addition, there are fair number of PR comments to help with the review process. ## Changes in the PR ### Module structure Sets up the module structure as follows: ``` Hasura/ RemoteSchema/ Metadata/ Types.hs SchemaCache/ Types.hs Permission.hs RemoteRelationship.hs Build.hs MetadataAPI/ Types.hs Execute.hs ``` ### 1. Types representing metadata are moved Types that capture metadata information (currently scattered across several RQL modules) are moved into `Hasura.RemoteSchema.Metadata.Types`. - This new module only depends on very 'core' modules such as `Hasura.Session` for the notion of roles and `Hasura.Incremental` for `Cacheable` typeclass. - The requirement on database modules is avoided by generalizing the remote schemas metadata to accept an arbitrary 'r' for a remote relationship definition. ### 2. SchemaCache related types and build logic have been moved Types that represent remote schemas information in SchemaCache are moved into `Hasura.RemoteSchema.SchemaCache.Types`. Similar to `H.RS.Metadata.Types`, this module depends on 'core' modules except for `Hasura.GraphQL.Parser.Variable`. It has something to do with remote relationships but I haven't spent time looking into it. The validation of 'remote relationships to remote schema' is also something that needs to be looked at. Rips out the logic that builds remote schema's SchemaCache information from the monolithic `buildSchemaCacheRule` and moves it into `Hasura.RemoteSchema.SchemaCache.Build`. Further, the `.SchemaCache.Permission` and `.SchemaCache.RemoteRelationship` have been created from existing modules that capture schema cache building logic for those two components. This was a fair amount of work. On main, currently remote schema's SchemaCache information is built in two phases - in the first phase, 'permissions' and 'remote relationships' are ignored and in the second phase they are filled in. While remote relationships can only be resolved after partially resolving sources and other remote schemas, the same isn't true for permissions. Further, most of the work that is done to resolve remote relationships can be moved to the first phase so that the second phase can be a very simple traversal. This is the approach that was taken - resolve permissions and as much as remote relationships information in the first phase. ### 3. Metadata APIs related types and build logic have been moved The types that represent remote schema related metadata APIs and the execution logic have been moved to `Hasura.RemoteSchema.MetadataAPI.Types` and `.Execute` modules respectively. ## Open questions: 1. `Hasura.RemoteSchema.Metadata.Types` is so called because I was hoping that all of the metadata related APIs of remote schema can be brought in at `Hasura.RemoteSchema.Metadata.API`. However, as metadata APIs depended on functions from `SchemaCache` module (see [1](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L55)
) and [2](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L91)
), it made more sense to create a separate top-level module for `MetadataAPI`s. Maybe we can just have `Hasura.RemoteSchema.Metadata` and get rid of the extra nesting or have `Hasura.RemoteSchema.Metadata.{Core,Permission,RemoteRelationship}` if we want to break them down further. 1. `buildRemoteSchemas` in `H.RS.SchemaCache.Build` has the following type: ```haskell buildRemoteSchemas :: ( ArrowChoice arr, Inc.ArrowDistribute arr, ArrowWriter (Seq CollectedInfo) arr, Inc.ArrowCache m arr, MonadIO m, HasHttpManagerM m, Inc.Cacheable remoteRelationshipDefinition, ToJSON remoteRelationshipDefinition, MonadError QErr m ) => Env.Environment -> ( (Inc.Dependency (HashMap RemoteSchemaName Inc.InvalidationKey), OrderedRoles), [RemoteSchemaMetadataG remoteRelationshipDefinition] ) `arr` HashMap RemoteSchemaName (PartiallyResolvedRemoteSchemaCtxG remoteRelationshipDefinition, MetadataObject) ``` Note the dependence on `CollectedInfo` which is defined as ```haskell data CollectedInfo = CIInconsistency InconsistentMetadata | CIDependency MetadataObject -- ^ for error reporting on missing dependencies SchemaObjId SchemaDependency deriving (Eq) ``` this pretty much means that remote schemas is dependent on types from databases, actions, .... How do we fix this? Maybe introduce a typeclass such as `ArrowCollectRemoteSchemaDependencies` which is defined in `Hasura.RemoteSchema` and then implemented in graphql-engine? 1. The dependency on `buildSchemaCacheFor` in `.MetadataAPI.Execute` which has the following signature: ```haskell buildSchemaCacheFor :: (QErrM m, CacheRWM m, MetadataM m) => MetadataObjId -> MetadataModifier -> ``` This can be easily resolved if we restrict what the metadata APIs are allowed to do. Currently, they operate in an unfettered access to modify SchemaCache (the `CacheRWM` constraint): ```haskell runAddRemoteSchema :: ( QErrM m, CacheRWM m, MonadIO m, HasHttpManagerM m, MetadataM m, Tracing.MonadTrace m ) => Env.Environment -> AddRemoteSchemaQuery -> m EncJSON ``` This should instead be changed to restrict remote schema APIs to only modify remote schema metadata (but has access to the remote schemas part of the schema cache), this dependency is completely removed. ```haskell runAddRemoteSchema :: ( QErrM m, MonadIO m, HasHttpManagerM m, MonadReader RemoteSchemasSchemaCache m, MonadState RemoteSchemaMetadata m, Tracing.MonadTrace m ) => Env.Environment -> AddRemoteSchemaQuery -> m RemoteSchemeMetadataObjId ``` The idea is that the core graphql-engine would call these functions and then call `buildSchemaCacheFor`. PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6291 GitOrigin-RevId: 51357148c6404afe70219afa71bd1d59bdf4ffc6
354 lines
14 KiB
Haskell
354 lines
14 KiB
Haskell
-- | How to construct and execute a call to a remote schema for a remote join.
|
|
--
|
|
-- There are three steps required to do this:
|
|
-- 1. construct the call: given the requested fields, the phantom fields, the
|
|
-- values extracted by the LHS, construct a GraphQL query
|
|
-- 2. execute that GraphQL query over the network
|
|
-- 3. build a index of the variables out of the response
|
|
--
|
|
-- This can be done as one function, but we also export the individual steps for
|
|
-- debugging / test purposes. We congregate all intermediary state in the opaque
|
|
-- 'RemoteSchemaCall' type.
|
|
module Hasura.GraphQL.Execute.RemoteJoin.RemoteSchema
|
|
( -- * Executing a remote join
|
|
makeRemoteSchemaJoinCall,
|
|
|
|
-- * Individual steps
|
|
RemoteSchemaCall,
|
|
buildRemoteSchemaCall,
|
|
executeRemoteSchemaCall,
|
|
buildJoinIndex,
|
|
)
|
|
where
|
|
|
|
import Control.Lens (view, _2, _3)
|
|
import Data.Aeson qualified as A
|
|
import Data.Aeson.Ordered qualified as AO
|
|
import Data.ByteString.Lazy qualified as BL
|
|
import Data.HashMap.Strict.Extended qualified as Map
|
|
import Data.IntMap.Strict qualified as IntMap
|
|
import Data.List.NonEmpty qualified as NE
|
|
import Data.Text qualified as T
|
|
import Data.Text.Extended (commaSeparated, toTxt, (<<>))
|
|
import Data.Validation (Validation (..), toEither)
|
|
import Hasura.Base.Error
|
|
import Hasura.Base.ErrorMessage (fromErrorMessage)
|
|
import Hasura.GraphQL.Execute.Remote
|
|
( getVariableDefinitionAndValue,
|
|
resolveRemoteVariable,
|
|
runVariableCache,
|
|
)
|
|
import Hasura.GraphQL.Execute.RemoteJoin.Types
|
|
import Hasura.GraphQL.Parser qualified as P
|
|
import Hasura.GraphQL.Transport.HTTP.Protocol (GQLReq (..), GQLReqOutgoing)
|
|
import Hasura.Prelude
|
|
import Hasura.RQL.IR.RemoteSchema (convertSelectionSet)
|
|
import Hasura.RQL.Types.Common
|
|
import Hasura.RQL.Types.ResultCustomization
|
|
import Hasura.RemoteSchema.SchemaCache
|
|
import Hasura.Session
|
|
import Language.GraphQL.Draft.Syntax qualified as G
|
|
|
|
-------------------------------------------------------------------------------
|
|
-- Executing a remote join
|
|
|
|
-- | Construct and execute a call to a remote schema for a remote join.
|
|
makeRemoteSchemaJoinCall ::
|
|
(MonadError QErr m) =>
|
|
-- | Function to send a request over the network.
|
|
(GQLReqOutgoing -> m BL.ByteString) ->
|
|
-- | User information.
|
|
UserInfo ->
|
|
-- | Information about that remote join.
|
|
RemoteSchemaJoin ->
|
|
-- | Mapping from 'JoinArgumentId' to its corresponding 'JoinArgument'.
|
|
IntMap.IntMap JoinArgument ->
|
|
-- | The resulting join index (see 'buildJoinIndex') if any.
|
|
m (Maybe (IntMap.IntMap AO.Value))
|
|
makeRemoteSchemaJoinCall networkFunction userInfo remoteSchemaJoin joinArguments = do
|
|
-- step 1: construct the internal intermediary representation
|
|
maybeRemoteCall <- buildRemoteSchemaCall remoteSchemaJoin joinArguments userInfo
|
|
-- if there actually is a remote call:
|
|
for maybeRemoteCall \remoteCall -> do
|
|
-- step 2: execute it over the network
|
|
responseValue <- executeRemoteSchemaCall networkFunction remoteCall
|
|
-- step 3: build the join index
|
|
buildJoinIndex remoteCall responseValue
|
|
|
|
-------------------------------------------------------------------------------
|
|
-- Internal representation
|
|
|
|
-- | Intermediate type containing all of the information required to perform
|
|
-- a remote schema call, constructed from the static join information.
|
|
data RemoteSchemaCall = RemoteSchemaCall
|
|
{ rscCustomizer :: ResultCustomizer,
|
|
rscGQLRequest :: GQLReqOutgoing,
|
|
rscResponsePaths :: IntMap.IntMap ResponsePath
|
|
}
|
|
|
|
-- | Used to extract the value from a remote schema response.
|
|
--
|
|
-- For example: if a remote relationship is defined to retrieve data from some
|
|
-- deeply nested field, this is the path towards that deeply nested field.
|
|
newtype ResponsePath = ResponsePath (NE.NonEmpty G.Name)
|
|
deriving stock (Eq, Show)
|
|
|
|
-------------------------------------------------------------------------------
|
|
-- Step 1: building the remote call
|
|
|
|
-- | Constructs a 'RemoteSchemaCall' from some static information, such as the
|
|
-- definition of the join, and dynamic information such as the user's
|
|
-- information and the map of join arguments.
|
|
buildRemoteSchemaCall ::
|
|
(MonadError QErr m) =>
|
|
RemoteSchemaJoin ->
|
|
IntMap.IntMap JoinArgument ->
|
|
UserInfo ->
|
|
m (Maybe RemoteSchemaCall)
|
|
buildRemoteSchemaCall RemoteSchemaJoin {..} arguments userInfo = do
|
|
-- for each join argument, we generate a unique field, with the alias
|
|
-- "f" <> argumentId
|
|
fields <- flip IntMap.traverseWithKey arguments $ \argumentId (JoinArgument argument) -> do
|
|
graphqlArgs <- fmap Map.fromList $
|
|
for (Map.toList argument) \(FieldName columnName, value) -> do
|
|
graphQLName <- parseGraphQLName columnName
|
|
graphQLValue <- ordJSONValueToGValue value
|
|
pure (graphQLName, graphQLValue)
|
|
-- Creating the alias should never fail.
|
|
let aliasText = T.pack $ "f" <> show argumentId
|
|
alias <-
|
|
G.mkName aliasText
|
|
`onNothing` throw500 ("'" <> aliasText <> "' is not a valid GraphQL name!")
|
|
let responsePath = alias NE.:| fmap fcName (NE.tail _rsjFieldCall)
|
|
rootField = fcName $ NE.head _rsjFieldCall
|
|
resultCustomizer = applyAliasMapping (singletonAliasMapping rootField alias) _rsjResultCustomizer
|
|
gqlField <- fieldCallsToField _rsjArgs graphqlArgs (convertSelectionSet _rsjSelSet) alias _rsjFieldCall
|
|
pure (gqlField, responsePath, resultCustomizer)
|
|
|
|
-- this constructs the actual GraphQL Request that can be sent to the remote
|
|
for (NE.nonEmpty $ IntMap.elems fields) $ \neFields -> do
|
|
gqlRequest <-
|
|
fmap fieldsToRequest . runVariableCache . for neFields $
|
|
\(field, _, _) -> traverse (resolveRemoteVariable userInfo) field
|
|
let customizer = foldMap (view _3) fields
|
|
responsePath = fmap (ResponsePath . view _2) fields
|
|
pure $ RemoteSchemaCall customizer gqlRequest responsePath
|
|
|
|
-- | Fold nested 'FieldCall's into a bare 'Field', inserting the passed
|
|
-- selection set at the leaf of the tree we construct.
|
|
fieldCallsToField ::
|
|
forall m.
|
|
MonadError QErr m =>
|
|
-- | user input arguments to the remote join field
|
|
Map.HashMap G.Name (P.InputValue RemoteSchemaVariable) ->
|
|
-- | Contains the values of the variables that have been defined in the remote join definition
|
|
Map.HashMap G.Name (G.Value Void) ->
|
|
-- | Inserted at leaf of nested FieldCalls
|
|
G.SelectionSet G.NoFragments RemoteSchemaVariable ->
|
|
-- | Top-level name to set for this Field
|
|
G.Name ->
|
|
NonEmpty FieldCall ->
|
|
m (G.Field G.NoFragments RemoteSchemaVariable)
|
|
fieldCallsToField rrArguments variables finalSelSet topAlias =
|
|
fmap (\f -> f {G._fAlias = Just topAlias}) . nest
|
|
where
|
|
-- almost: `foldr nest finalSelSet`
|
|
nest :: NonEmpty FieldCall -> m (G.Field G.NoFragments RemoteSchemaVariable)
|
|
nest ((FieldCall name remoteArgs) :| rest) = do
|
|
templatedArguments <- convert <$> createArguments variables remoteArgs
|
|
graphQLarguments <- traverse peel rrArguments
|
|
(args, selSet) <- case NE.nonEmpty rest of
|
|
Just f -> do
|
|
s <- nest f
|
|
pure (templatedArguments, [G.SelectionField s])
|
|
Nothing -> do
|
|
arguments <-
|
|
Map.unionWithM
|
|
combineValues
|
|
graphQLarguments
|
|
-- converting (G.Value Void) -> (G.Value Variable) to merge the
|
|
-- 'rrArguments' with the 'variables'
|
|
templatedArguments
|
|
pure (arguments, finalSelSet)
|
|
pure $ G.Field Nothing name args [] selSet
|
|
|
|
convert :: Map.HashMap G.Name (G.Value Void) -> Map.HashMap G.Name (G.Value RemoteSchemaVariable)
|
|
convert = fmap G.literal
|
|
|
|
peel :: P.InputValue RemoteSchemaVariable -> m (G.Value RemoteSchemaVariable)
|
|
peel = \case
|
|
P.GraphQLValue v -> pure v
|
|
P.JSONValue _ ->
|
|
-- At this point, it is theoretically impossible that we have
|
|
-- unpacked a variable into a JSONValue, as there's no "outer
|
|
-- scope" at which this value could have been peeled.
|
|
-- FIXME: check that this is correct!
|
|
throw500 "internal error: encountered an already expanded variable when folding remote field arguments"
|
|
|
|
-- | Create an argument map using the inputs taken from the left hand side.
|
|
createArguments ::
|
|
(MonadError QErr m) =>
|
|
Map.HashMap G.Name (G.Value Void) ->
|
|
RemoteArguments ->
|
|
m (HashMap G.Name (G.Value Void))
|
|
createArguments variables (RemoteArguments arguments) =
|
|
toEither (traverse substituteVariables arguments)
|
|
`onLeft` (\errors -> throw400 Unexpected $ "Found errors: " <> commaSeparated errors)
|
|
where
|
|
substituteVariables = \case
|
|
G.VVariable variableName ->
|
|
Map.lookup variableName variables
|
|
`onNothing` Failure ["Value for variable " <> variableName <<> " not provided"]
|
|
G.VList listValue ->
|
|
fmap G.VList (traverse substituteVariables listValue)
|
|
G.VObject objectValue ->
|
|
fmap G.VObject (traverse substituteVariables objectValue)
|
|
G.VInt i -> pure $ G.VInt i
|
|
G.VFloat d -> pure $ G.VFloat d
|
|
G.VString txt -> pure $ G.VString txt
|
|
G.VEnum e -> pure $ G.VEnum e
|
|
G.VBoolean b -> pure $ G.VBoolean b
|
|
G.VNull -> pure $ G.VNull
|
|
|
|
-- | Combine two GraphQL values together.
|
|
--
|
|
-- This is used to combine different input arguments into one. This function can
|
|
-- only combine objects or lists pairwise, and fails if it has to combine any
|
|
-- other combination of values.
|
|
--
|
|
-- >>> combineValues (Object (fromList [("id", Number 1)]) (Object (fromList [("name", String "foo")])
|
|
-- Object (fromList [("id", Number 1), ("name", String "foo")])
|
|
combineValues ::
|
|
MonadError QErr m =>
|
|
G.Name ->
|
|
G.Value RemoteSchemaVariable ->
|
|
G.Value RemoteSchemaVariable ->
|
|
m (G.Value RemoteSchemaVariable)
|
|
combineValues name v1 v2 = case (v1, v2) of
|
|
(G.VObject l, G.VObject r) -> G.VObject <$> Map.unionWithM combineValues l r
|
|
(G.VList l, G.VList r) -> pure $ G.VList $ l <> r
|
|
(l, r) ->
|
|
throw500 $
|
|
"combineValues: cannot combine values (" <> tshow l <> ") and (" <> tshow r
|
|
<> ") for field "
|
|
<> G.unName name
|
|
<> "; lists can only be merged with lists, objects can only be merged with objects"
|
|
|
|
-- | Craft a GraphQL query document from the list of fields.
|
|
fieldsToRequest :: NonEmpty (G.Field G.NoFragments P.Variable) -> GQLReqOutgoing
|
|
fieldsToRequest gFields =
|
|
GQLReq
|
|
{ _grOperationName = Nothing,
|
|
_grVariables =
|
|
if Map.null variableInfos
|
|
then Nothing
|
|
else Just $ mapKeys G._vdName variableInfos,
|
|
_grQuery =
|
|
G.TypedOperationDefinition
|
|
{ G._todSelectionSet =
|
|
-- convert from Field Variable to Field Name
|
|
NE.toList $ G.SelectionField . fmap P.getName <$> gFields,
|
|
G._todVariableDefinitions = Map.keys variableInfos,
|
|
G._todType = G.OperationTypeQuery,
|
|
G._todName = Nothing,
|
|
G._todDirectives = []
|
|
}
|
|
}
|
|
where
|
|
variableInfos :: HashMap G.VariableDefinition A.Value
|
|
variableInfos = Map.fromList $ concatMap (foldMap getVariableInfo) gFields
|
|
getVariableInfo :: P.Variable -> [(G.VariableDefinition, A.Value)]
|
|
getVariableInfo = pure . fmap snd . getVariableDefinitionAndValue
|
|
|
|
------------------------------------------------------------------------------
|
|
-- Step 2: sending the call over the network
|
|
|
|
-- | Sends the call over the network, and parse the resulting ByteString.
|
|
executeRemoteSchemaCall ::
|
|
(MonadError QErr m) =>
|
|
-- | Function to send a request over the network.
|
|
(GQLReqOutgoing -> m BL.ByteString) ->
|
|
-- | Information about that call.
|
|
RemoteSchemaCall ->
|
|
-- | Resulting JSON object
|
|
m AO.Object
|
|
executeRemoteSchemaCall networkFunction (RemoteSchemaCall customizer request _) = do
|
|
responseBody <- networkFunction request
|
|
responseJSON <-
|
|
AO.eitherDecode responseBody
|
|
`onLeft` (\e -> throw500 $ "Remote server response is not valid JSON: " <> T.pack e)
|
|
responseObject <- AO.asObject responseJSON `onLeft` throw500
|
|
let errors = AO.lookup "errors" responseObject
|
|
if
|
|
| isNothing errors || errors == Just AO.Null ->
|
|
case AO.lookup "data" responseObject of
|
|
Nothing -> throw500 "\"data\" field not found in remote response"
|
|
Just v ->
|
|
let v' = applyResultCustomizer customizer v
|
|
in AO.asObject v' `onLeft` throw500
|
|
| otherwise ->
|
|
throwError
|
|
(err400 Unexpected "Errors from remote server")
|
|
{ qeInternal = Just $ ExtraInternal $ A.object ["errors" A..= (AO.fromOrdered <$> errors)]
|
|
}
|
|
|
|
-------------------------------------------------------------------------------
|
|
-- Step 3: extracting the join index
|
|
|
|
-- | Construct a join index from the remote source's 'AO.Value' response.
|
|
--
|
|
-- This function extracts from the 'RemoteJoinCall' a mapping from
|
|
-- 'JoinArgumentId' to 'ResponsePath': from an integer that uniquely identifies
|
|
-- a join argument to the "path" at which we expect that value in the
|
|
-- response. With it, and with the actual reponse JSON value obtained from the
|
|
-- remote server, it constructs a corresponding mapping of, for each argument,
|
|
-- its extracted value.
|
|
--
|
|
-- If the response does not have value at any of the provided 'ResponsePath's,
|
|
-- throw a generic 'QErr'.
|
|
--
|
|
-- NOTE(jkachmar): If we switch to an 'Applicative' validator, we can collect
|
|
-- more than one missing 'ResponsePath's (rather than short-circuiting on the
|
|
-- first missing value).
|
|
buildJoinIndex ::
|
|
forall m.
|
|
(MonadError QErr m) =>
|
|
RemoteSchemaCall ->
|
|
AO.Object ->
|
|
m (IntMap.IntMap AO.Value)
|
|
buildJoinIndex RemoteSchemaCall {..} response =
|
|
for rscResponsePaths $ \(ResponsePath path) ->
|
|
go (AO.Object response) (map G.unName . NE.toList $ path)
|
|
where
|
|
go :: AO.Value -> [Text] -> m AO.Value
|
|
go value path = case path of
|
|
[] -> pure value
|
|
k : ks -> case value of
|
|
AO.Object obj -> do
|
|
objValue <-
|
|
AO.lookup k obj
|
|
`onNothing` throw500 ("failed to lookup key '" <> toTxt k <> "' in response")
|
|
go objValue ks
|
|
_ ->
|
|
throw500 $
|
|
"unexpected non-object json value found while path not empty: "
|
|
<> commaSeparated path
|
|
|
|
-------------------------------------------------------------------------------
|
|
-- Local helpers
|
|
|
|
-- NOTE: Ideally this should be done at the remote relationship validation
|
|
-- layer.
|
|
--
|
|
-- When validating remote relationships, we should store the validated names so
|
|
-- that we don't need to continually re-validate them downstream.
|
|
parseGraphQLName :: (MonadError QErr m) => Text -> m G.Name
|
|
parseGraphQLName txt =
|
|
G.mkName txt `onNothing` (throw400 RemoteSchemaError $ errMsg)
|
|
where
|
|
errMsg = txt <> " is not a valid GraphQL name"
|
|
|
|
ordJSONValueToGValue :: (MonadError QErr n) => AO.Value -> n (G.Value Void)
|
|
ordJSONValueToGValue =
|
|
either (throw400 ValidationFailed . fromErrorMessage) pure . P.jsonToGraphQL . AO.fromOrdered
|