graphql-engine/rfcs/apollo-federation.md
paritosh-08 4f5cb43954 RFC: apollo federation support
RFC for apollo federation support in HGE.

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/4607
GitOrigin-RevId: c154c58532394ef25166d39ff2e100a00448c111
2022-07-12 09:30:38 +00:00


Apollo federation v1 support

Original issue #3064.

Apollo requires a GraphQL server to expose a specific set of fields and types in its schema before it can be used as a federated subgraph. The complete requirements are listed in the Apollo federation specification.

Product perspective

Requirements

We want the following things:

  1. Hasura should be mountable as a subgraph on an Apollo federated gateway.
  2. Other subgraphs can use Hasura. At a minimum, this means that table types (generated by Hasura) should be available for other subgraphs.

Setup experience

Apollo federation (AF) support should be enabled via an env variable (or a global metadata field? TBD). When AF is enabled, HGE will add the required AF schema objects to the final generated schema. This satisfies requirement 1 above.

Secondly, you will have to resolve entities for each type that you want to federate so that it can be used in other subgraphs. For this, you will have to enable AF through table metadata for the individual table (i.e. via *_track_table API). The *_track_table API's args will look something like the following:

{
     "source": "default",
     "table": "Author",
     "configuration": {},
     "apollo_federation_config": {
        "enable": "v1"
     }
}

This API design will enable us to add more features (by adding new key-value pairs in apollo_federation_config) such as:

  1. Extending the support to v2 directives in future, such as adding @shareable to some columns.
  2. Changing @key directive fields (or primary key for apollo federation in simple terms).

Behaviour

When a table is tracked with AF enabled, the table's type in the schema will have the @key directive added. This satisfies requirement 2 above. In the first version, the @key fields will be populated from the table's primary key. For example:

type Review @key(fields: "id") {
  id: Int!
  body: String
  author: User
  product: Product
}

@key(fields: "id") will be automatically added, where id is the primary key of a table called Review.
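
The derivation of the directive from the primary key can be illustrated with a small sketch. It is not HGE code: `TableInfo`, `keyDirective` and `typeHeader` are hypothetical names, and the choice to omit the directive for a table without a primary key is an assumption of this sketch rather than something the RFC specifies. For a composite primary key, the `fields` argument is a space-separated field set.

```haskell
import Data.Maybe (maybeToList)

-- | Hypothetical, simplified view of a tracked table (not HGE's metadata types).
data TableInfo = TableInfo
  { tableTypeName     :: String   -- GraphQL type name, e.g. "Review"
  , primaryKeyColumns :: [String] -- GraphQL field names of the primary key columns
  , federationEnabled :: Bool     -- whether apollo_federation_config enables v1
  }

-- | Render the @key directive when AF is enabled and a primary key exists.
-- Assumption: tables without a primary key simply get no directive.
keyDirective :: TableInfo -> Maybe String
keyDirective t
  | federationEnabled t && not (null (primaryKeyColumns t)) =
      Just ("@key(fields: \"" ++ unwords (primaryKeyColumns t) ++ "\")")
  | otherwise = Nothing

-- | Render the opening line of the object type definition.
typeHeader :: TableInfo -> String
typeHeader t =
  unwords (["type", tableTypeName t] ++ maybeToList (keyDirective t) ++ ["{"])
```

For the Review table above, `typeHeader (TableInfo "Review" ["id"] True)` yields the same opening line as the example type definition.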

Evaluating feature

This feature will first be released as experimental, so that only users who want to explore it opt in. Based on user feedback, we can incrementally improve the feature by extending support, fixing bugs and improving performance.

Implementation perspective

Spec

According to the apollo spec:

To make a GraphQL service subgraph-capable, it needs the following:

  1. Implementation of the federation schema specification
  2. Support for fetching service capabilities
  3. Implementation of stub type generation for references
  4. Implementation of request resolving for entities.

SDL generation

To use HGE as a federated subgraph, we need to expose a _service field of type _Service, which contains a field called sdl of type String. So, first, we need to generate the SDL (schema definition language) for the HGE schema while building the schema field parsers.

For generating the SDL, we can use the Printer.schemaDocument from the graphql-parser-hs library on SchemaDocument. We can easily generate the SchemaDocument from SchemaIntrospection using the following:

getSchemaDocument :: G.SchemaIntrospection -> G.SchemaDocument
getSchemaDocument (G.SchemaIntrospection typeDefMap) =
  G.SchemaDocument completeSchema
  where
    allTypeDefns = map G.TypeSystemDefinitionType (Map.elems typeDefMap)
    rootOpTypeDefns = getRootOpTypeDefns -- define this
    completeSchema = rootOpTypeDefns : allTypeDefns

Then we can easily generate the SDL by:

generateSDL :: G.SchemaIntrospection -> Text
generateSDL = Builder.run . Printer.schemaDocument . getSchemaDocument
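
To make the shape of the output concrete, here is a toy printer that follows the same pipeline (root operation definition first, then every type definition). It is only a sketch under stated assumptions: `FieldDef`, `ObjectDef`, `RootOps` and the `render*` functions are hypothetical stand-ins, not the graphql-parser-hs types or its `Printer` module, and only object types without arguments are covered.

```haskell
import Data.List (intercalate)

-- Simplified, hypothetical stand-ins for the graphql-parser-hs types.
data FieldDef = FieldDef { fieldName :: String, fieldType :: String }

data ObjectDef = ObjectDef
  { objectName       :: String
  , objectDirectives :: [String] -- already rendered, e.g. "@key(fields: \"id\")"
  , objectFields     :: [FieldDef]
  }

-- | Root operation type names, mirroring the `schema { ... }` block.
data RootOps = RootOps { queryRoot :: String, mutationRoot :: Maybe String }

-- | Print the root operation type definition.
renderSchemaDef :: RootOps -> String
renderSchemaDef r = unlines $
  ["schema {", "  query: " ++ queryRoot r]
    ++ maybe [] (\m -> ["  mutation: " ++ m]) (mutationRoot r)
    ++ ["}"]

-- | Print a single object type with its directives and fields.
renderObject :: ObjectDef -> String
renderObject o = unlines $
  [unwords (["type", objectName o] ++ objectDirectives o ++ ["{"])]
    ++ ["  " ++ fieldName f ++ ": " ++ fieldType f | f <- objectFields o]
    ++ ["}"]

-- | Toy counterpart of generateSDL: schema definition first, then every type,
-- separated by blank lines.
renderSDL :: RootOps -> [ObjectDef] -> String
renderSDL roots objs =
  intercalate "\n" (renderSchemaDef roots : map renderObject objs)
```

Keeping the printer a pure function of the introspection data means the role-specific SDL falls out of whichever role-specific introspection is passed in.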

To add this new field and type to the schema, we would have to create a FieldParser for each of the fields _service and sdl after creating the SchemaIntrospection, and then expose these in the GQL context. This should be a trivial change. Please note that the schema introspection is role dependent, so the SDL will not expose fields/types that the current role doesn't have access to.

An example of the SDL generated for the following setup:

HGE setup

A tracked table called users, with fields id and name, using the graphql-default naming convention.

Generated SDL

(Shortened version)

schema {
  query: query_root
  mutation: mutation_root
  subscription: subscription_root
}

type query_root {
  usersAggregate(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): UsersAggregate!
  users(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): [Users!]!
  usersByPk(id: Int!): Users
}

type Users @key(fields: "id") {
  id: Int!
  name: String!
}

...

Full SDL:
schema {
  query: query_root
  mutation: mutation_root
  subscription: subscription_root
}

scalar Float

scalar Int

scalar String

type __Directive {
  args: __InputValue
  description: String!
  isRepeatable: String!
  locations: String!
  name: String!
}

type __EnumValue {
  deprecationReason: String!
  description: String!
  isDeprecated: String!
  name: String!
}

type __Field {
  args: __InputValue
  deprecationReason: String!
  description: String!
  isDeprecated: String!
  name: String!
  type: __Type
}

type __InputValue {
  defaultValue: String!
  description: String!
  name: String!
  type: __Type
}

type __Schema {
  description: String!
  directives: __Directive
  mutationType: __Type
  queryType: __Type
  subscriptionType: __Type
  types: __Type
}

type __Type {
  description: String!
  enumValues(includeDeprecated: Boolean = false): __EnumValue
  fields(includeDeprecated: Boolean = false): __Field
  inputFields: __InputValue
  interfaces: __Type
  kind: __TypeKind!
  name: String!
  ofType: __Type
  possibleTypes: __Type
}

type query_root {
  usersAggregate(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): UsersAggregate!
  users(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): [Users!]!
  usersByPk(id: Int!): Users
}

type subscription_root {
  usersAggregate(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): UsersAggregate!
  users(
    where: UsersBoolExp
    orderBy: [UsersOrderBy!]
    limit: Int
    offset: Int
    distinctOn: [UsersSelectColumn!]
  ): [Users!]!
  usersByPk(id: Int!): Users
}

type UsersAvgFields {
  id: Float
}

type UsersAggregateFields {
  avg: UsersAvgFields
  count(distinct: Boolean, columns: [UsersSelectColumn!]): Int!
  max: UsersMaxFields
  min: UsersMinFields
  stddev: UsersStddevFields
  stddevPop: UsersStddevPopFields
  stddevSamp: UsersStddevSampFields
  sum: UsersSumFields
  varPop: UsersVarPopFields
  varSamp: UsersVarSampFields
  variance: UsersVarianceFields
}

type UsersMaxFields {
  id: Int
  name: String
}

type UsersMinFields {
  id: Int
  name: String
}

type UsersStddevFields {
  id: Float
}

type UsersStddevPopFields {
  id: Float
}

type UsersStddevSampFields {
  id: Float
}

type UsersSumFields {
  id: Int
}

type UsersVarPopFields {
  id: Float
}

type UsersVarSampFields {
  id: Float
}

type UsersVarianceFields {
  id: Float
}

type UsersAggregate {
  aggregate: UsersAggregateFields
  nodes: [Users!]!
}

type Users @key(fields: "id") {
  id: Int!
  name: String!
}

type mutation_root {
  deleteUsers(where: UsersBoolExp!): UsersMutationResponse
  deleteUsersByPk(id: Int!): Users
  insertUsersOne(onConflict: UsersOnConflict, object: UsersInsertInput!): Users
  insertUsers(
    onConflict: UsersOnConflict
    objects: [UsersInsertInput!]!
  ): UsersMutationResponse
  updateUsers(
    _set: UsersSetInput
    _inc: UsersIncInput
    where: UsersBoolExp!
  ): UsersMutationResponse
  updateUsersByPk(
    _set: UsersSetInput
    _inc: UsersIncInput
    pk_columns: UsersPkColumnsInput!
  ): Users
}

type UsersMutationResponse {
  returning: [Users!]!
  affected_rows: Int!
}

enum __TypeKind {
  ENUM
  INPUT_OBJECT
  INTERFACE
  LIST
  NON_NULL
  OBJECT
  SCALAR
  UNION
}

enum orderBy {
  ascNullsFirst
  asc
  ascNullsLast
  desc
  descNullsFirst
  descNullsLast
}

enum UsersSelectColumn {
  id
  name
}

enum UsersConstraint {
  users_pkey
}

enum UsersUpdateColumn {
  id
  name
}

input Int_cast_exp {
  String: StringComparisonExp
}

input IntComparisonExp {
  _cast: Int_cast_exp
  _eq: Int
  _gt: Int
  _gte: Int
  _in: [Int!]
  _isNull: Boolean
  _lt: Int
  _lte: Int
  _neq: Int
  _nin: [Int!]
}

input StringComparisonExp {
  _eq: String
  _gt: String
  _gte: String
  _in: [String!]
  _isNull: Boolean
  _lt: String
  _lte: String
  _neq: String
  _nin: [String!]
  _niregex: String
  _nregex: String
  _nsimilar: String
  _nilike: String
  _nlike: String
  _iregex: String
  _regex: String
  _similar: String
  _ilike: String
  _like: String
}

input UsersBoolExp {
  _and: [UsersBoolExp!]
  _not: UsersBoolExp
  _or: [UsersBoolExp!]
  id: IntComparisonExp
  name: StringComparisonExp
}

input UsersOrderBy {
  id: orderBy
  name: orderBy
}

input UsersIncInput {
  id: Int
}

input UsersInsertInput {
  id: Int
  name: String
}

input UsersSetInput {
  id: Int
  name: String
}

input UsersOnConflict {
  constraint: UsersConstraint!
  update_columns: [UsersUpdateColumn!]! = []
  where: UsersBoolExp
}

input UsersPkColumnsInput {
  id: Int!
}

Entities Union

Apollo federation needs an _entities field as part of the specification, which is defined as follows:

scalar _Any

# a union of all types that use the @key directive
union _Entity

extend type Query {
  _entities(representations: [_Any!]!): [_Entity]!
}

To add this field, we first need to create a union of all the types that have the @key directive. Please note that the types might be defined in HGE (from DB sources or action types) or they might be exposed via a remote schema.

For DB sources, we will add the @key directive to the select type, with fields set to the primary key of the table (later, we can allow users to select the fields that they want to specify in the @key directive).

For actions, we need to modify the set_custom_types API to include the directives first.

For remote schemas, we already store the directives that come from the upstream, so we can use them.
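
Collecting the union members can be sketched as a filter over the known object types. This is an illustration only: `ObjectType`, `entityMembers` and `renderEntityUnion` are hypothetical names, and directives are reduced to their names. The empty case returns `Nothing` because a GraphQL union must have at least one member type.

```haskell
import Data.List (intercalate)

-- | Hypothetical simplified object type: its name plus the names of its directives.
data ObjectType = ObjectType
  { typeName       :: String
  , directiveNames :: [String] -- without the leading '@'
  }

-- | Types eligible for _Entity: exactly those carrying the @key directive,
-- whether they come from a DB source, an action or a remote schema.
entityMembers :: [ObjectType] -> [String]
entityMembers ts = [typeName t | t <- ts, "key" `elem` directiveNames t]

-- | Render the union definition, or Nothing when no type is federated.
renderEntityUnion :: [ObjectType] -> Maybe String
renderEntityUnion ts = case entityMembers ts of
  []      -> Nothing
  members -> Just ("union _Entity = " ++ intercalate " | " members)
```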

Now, to create the union of the types, we already have a function called selectionSetUnion which creates Parser for the union:

selectionSetUnion ::
  (MonadParse n, Traversable t) =>
  Name ->
  Maybe Description ->
  -- | The member object types.
  t (Parser 'Output n b) ->
  Parser 'Output n (t b)

We need to be a little careful here, as we want to remove Parsers based on role access. For example, if a role doesn't have access to the fields used in the @key directive, then it makes sense to omit those Parsers.
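
One possible reading of this rule, as a sketch: an entity stays in the union for a role only if the role can select every field named in its @key directive. `Entity`, `RolePermissions` and `visibleEntities` are hypothetical names, and the association-list permission model is an assumption made for brevity, not HGE's permission representation.

```haskell
-- | Hypothetical description of an entity: its type name and the fields named in @key.
data Entity = Entity
  { entityName :: String
  , keyFields  :: [String]
  }

-- | Hypothetical permission lookup for one role: type name -> selectable fields.
type RolePermissions = [(String, [String])]

-- | Keep an entity only if the role can select every field of its @key.
-- Types the role has no permissions on at all are dropped as well.
visibleEntities :: RolePermissions -> [Entity] -> [Entity]
visibleEntities perms = filter canSeeKey
  where
    canSeeKey e = case lookup (entityName e) perms of
      Nothing      -> False
      Just allowed -> all (`elem` allowed) (keyFields e)
```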

Next, we need to evaluate the _entities query. Consider the following:

query MyQuery {
  _entities(representations: [{__typename: "UsersData", id: "1"}, {__typename: "TwoPks", id1: "1", id2: "2"}]) {
    ... on TwoPks {
      internalData
    }
    ... on UsersData {
      id
      name
    }
  }
}

For evaluating this we need to do the following:

  1. Get selection set for each of the union types in the selection set of the _entities query. In the above query, we will have two selection sets (one for TwoPks and one for UsersData).
  2. Generate the arguments using the arguments of the query. For the above query, for the 1st selection set (TwoPks), we will have the arguments (id1: "1", id2: "2").
  3. Next we would have to use the parsers for the fields (TwoPksByPk and UsersDataByPk) to evaluate the Field (constructed using the selection set and arguments in above steps).
  4. Finally we would concatenate the results in a list.

Note: The above method of evaluating the query may do multiple fetches from the database, which might be something that can be optimised in further iterations.
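
The evaluation steps above can be sketched with plain association lists. This is a simplified model, not the parser-based implementation: `Representation`, `ByPkResolver`, `resolveOne` and `resolveEntities` are hypothetical names, key values are kept as strings, and a resolver just renders a string where the real code would evaluate the corresponding *ByPk field. Results are kept in the order of the input representations, with `Nothing` standing for a null entry.

```haskell
-- | A representation as sent by the gateway: "__typename" plus the key fields.
-- Assumption of this sketch: all values are plain strings.
type Representation = [(String, String)]

-- | Hypothetical stand-in for a *ByPk field parser: given the key arguments,
-- produce the resolved entity (here just a rendered string).
type ByPkResolver = [(String, String)] -> String

-- | Resolve one representation: read __typename, drop it to obtain the by-pk
-- arguments, and dispatch to the matching resolver. Unknown types yield Nothing.
resolveOne :: [(String, ByPkResolver)] -> Representation -> Maybe String
resolveOne resolvers rep = do
  tyName   <- lookup "__typename" rep
  resolver <- lookup tyName resolvers
  let args = filter ((/= "__typename") . fst) rep
  pure (resolver args)

-- | Resolve every representation independently, preserving the input order.
-- One resolver call per representation mirrors the multiple fetches noted above.
resolveEntities :: [(String, ByPkResolver)] -> [Representation] -> [Maybe String]
resolveEntities resolvers = map (resolveOne resolvers)
```

Batching representations of the same type into a single fetch would be the natural place to start optimising this.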

Implementation

There are two main functions from the perspective of implementation:

  1. generateSDL: This function generates the SDL of a given schema. It uses the schema introspection in order to build the SDL. The schema introspection is generated while building the parsers. The type definition of the function is:

    generateSDL :: G.SchemaIntrospection -> Text
    generateSDL = undefined
    
  2. mkEntityUnionFieldParser: This function will create a FieldParser for the entities query. This uses the union selection set Parser and the FieldParsers of fields having @key directive in order to evaluate the query. The union selection set Parser can be generated using selectionSetUnion and the FieldParsers can be collected based on the schema introspection (for getting the types with @key directive) and query FieldParsers. The type definition of the function is:

    mkEntityUnionFieldParser ::
      P.Parser 'P.Output (P.Parse) [[P.ParsedSelection a1]] ->
      [(G.Name, [G.Directive Void], FieldParser (P.Parse) (NamespacedField (QueryRootField UnpreparedValue)))] ->
      FieldParser (P.Parse) (NamespacedField (QueryRootField UnpreparedValue))
    mkEntityUnionFieldParser bodyParser fieldParsers = undefined
    

    Please note that fieldParsers is a list of triples, each containing the type name, the directives associated with that type, and the FieldParser associated with the type.

Future work

There is a lot that can be improved in the v1 implementation. To list a few:

  1. Include action types under the apollo federation as well. This will require a few internal representational changes (such as adding directives in action types).
  2. Let the user choose the fields that are used in @key directives. This will enable users to choose fields other than primary key.
  3. Look into supporting apollo federation v2.