graphql-engine/dc-agents/reference
David Overton e5f88d8039 Nested array support for Data Connectors Backend and MongoDB
## Description

This change adds support for querying into nested arrays in Data Connector agents that support such a concept (currently MongoDB).

### DC API changes

- New API type `ColumnType` which allows representing the type of a "column" as either a scalar type, an object reference or an array of `ColumnType`s. This recursive definition allows arbitrary nesting of arrays of types.
- The `type` fields in the API types `ColumnInfo` and `ColumnInsertSchema` now take a `ColumnType` instead of a `ScalarType`.
- To ensure backwards compatibility, a `ColumnType` representing a scalar serialises and deserialises to the same representation as `ScalarType`.
- In queries, the `Field` type now has a new constructor `NestedArrayField`. This contains a nested `Field` along with optional `limit`, `offset`, `where` and `order_by` arguments. (These optional arguments are not yet used by either HGE or the MongoDB agent.)
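
As a rough illustration of the recursive `ColumnType` described above, here is a minimal TypeScript sketch. The exact wire format is not given in this description, so the field names (`type`, `name`, `element_type`) and the helpers `showColumnType` and `arrayDepth` are assumptions for illustration only. What it does reflect from the text is that a scalar is represented exactly like a plain `ScalarType` value, and that arrays can nest to any depth.

```typescript
// Hypothetical shapes: the field names below are assumptions, not the
// confirmed DC API wire format.
type ScalarType = string;

type ColumnType =
  | ScalarType                                   // scalar: same representation as ScalarType
  | { type: "object"; name: string }             // reference to a named object type
  | { type: "array"; element_type: ColumnType }; // array of any ColumnType (recursive)

// Render a ColumnType for display, wrapping each array layer in brackets.
function showColumnType(t: ColumnType): string {
  if (typeof t === "string") return t;
  if (t.type === "object") return `object ${t.name}`;
  return `[${showColumnType(t.element_type)}]`;
}

// Count how many array layers wrap the innermost element type.
function arrayDepth(t: ColumnType): number {
  return typeof t !== "string" && t.type === "array"
    ? 1 + arrayDepth(t.element_type)
    : 0;
}
```

Because the scalar case is just a string, a client that only understands `ScalarType` can still read scalar columns unchanged, which is the backwards-compatibility property noted above.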

### MongoDB Haskell agent changes

- The `/schema` endpoint will now recognise arrays within the JSON validation schema and generate corresponding arrays in the DC schema.
- The `/query` endpoint will now handle `NestedArrayField`s within queries (although it does not yet handle `limit`, `offset`, `where` and `order_by`).

### HGE server changes

- The `Backend` type class adds a new type family `XNestedArrays b` to enable nested arrays on a per-backend basis (currently enabled only for the `DataConnector` backend).
- Within `RawColumnInfo` the column type is now represented by a new type `RawColumnType b` which mirrors the shape of the DC API `ColumnType`, but uses `XNestedObjects b` and `XNestedArrays b` type families to allow turning nested object and array support on or off for a particular backend. In the `DataConnector` backend, `API.CustomType` is converted into `RawColumnInfo 'DataConnector` while building the schema.
- In the next stage of schema building, the `RawColumnInfo` is converted into a `StructuredColumnInfo` which allows us to represent the three different types of columns: scalar, object and array. TODO: the `StructuredColumnInfo` looks very similar to the Logical Model types. The main difference is that it uses the `XNestedObjects` and `XNestedArrays` type families. We should be able to combine these two representations.
- The `StructuredColumnInfo` is then placed into a `FIColumn` `FieldInfo`. This involved some refactoring of `FieldInfo` as I had previously split out `FINestedObject` into a separate constructor. However, it works out better to represent all "column" fields (i.e. scalar, object and array) using `FIColumn` as this makes it easier to implement permission checking correctly. This is the reason the `StructuredColumnInfo` was needed.
- Next, the `FieldInfo`s are used to generate `FieldParser`s. We add a new constructor to `AnnFieldG` for `AFNestedArray`. An `AFNestedArray` field parser can contain either a simple array selection or an array aggregate. Simple array `FieldParser`s are currently limited to subfield selection. We will add support for limit, offset, where and order_by in a future PR. We also don't yet generate array aggregate `FieldParser`s.
- The new `AFNestedArray` field is handled by the `QueryPlan` module in the `DataConnector` backend. There we generate an `API.NestedArrayField` from the `AFNestedArray`. We also handle nested arrays when reshaping the response from the DC agent.

## Limitations

- Support for limit, offset, filter (where) and order_by is not yet fully implemented, although it should not be hard to add.
- Support for aggregations on nested arrays is not yet fully implemented.
- Permissions involving nested arrays (and objects) are not yet implemented.
- This should be integrated with Logical Model types, but that will happen in a separate PR.

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/9149
GitOrigin-RevId: 0e7b71a994fc1d2ca1ef73bfe7b96e95b5328531
2023-05-24 08:02:43 +00:00
src Nested array support for Data Connectors Backend and MongoDB 2023-05-24 08:02:43 +00:00
.gitignore Replace Haskell DC Reference Agent with TypeScript DC Reference Agent in HSpec tests 2022-06-23 08:09:46 +00:00
.nvmrc Move Typescript types for Data Connector agent into their own package 2022-09-05 06:09:23 +00:00
Dockerfile Replace Haskell DC Reference Agent with TypeScript DC Reference Agent in HSpec tests 2022-06-23 08:09:46 +00:00
package-lock.json Nested array support for Data Connectors Backend and MongoDB 2023-05-24 08:02:43 +00:00
package.json Nested array support for Data Connectors Backend and MongoDB 2023-05-24 08:02:43 +00:00
README.md Implemented datasets support in Data Connector agent test suite 2023-01-30 07:00:26 +00:00
tsconfig.json Replace Haskell DC Reference Agent with TypeScript DC Reference Agent in HSpec tests 2022-06-23 08:09:46 +00:00

Data Connector Agent Reference Implementation

This directory contains a barebones implementation of the Data Connector agent specification which fetches its data from static JSON files. It can be used as a reference implementation for testing, and as a reference for developers working on backend services.

Requirements

  • NodeJS 16

Build & Run

> npm install
> npm start

Docker Build & Run

> docker build . -t dc-reference-agent:latest
> docker run -it --rm -p 8100:8100 dc-reference-agent:latest

Dataset

The dataset exposed by the reference agent is sourced from https://github.com/lerocha/chinook-database/

More specifically, the Chinook.xml.gz file is a GZipped version of https://raw.githubusercontent.com/lerocha/chinook-database/ce27c48d9f375f81b7b68bacdfddf3c4458acc49/ChinookDatabase/DataSources/_Xml/ChinookData.xml

The schema-tables.json is manually derived from the schema of the data, as can be seen from the CREATE TABLE and other DDL statements in the various per-database-vendor SQL scripts that can be found in /ChinookDatabase/DataSources in that repo.

The datasets can be operated on via the /datasets resources as described in dc-agents/README.md.

Configuration

The reference agent supports some configuration properties that can be set via the value property of configuration on a source in Hasura metadata. The configuration is passed to the agent on each request via the X-Hasura-DataConnector-Config header.

The configuration that the reference agent can take supports two properties:

  • tables: This is a list of table names that should be exposed by the agent. If omitted all Chinook dataset tables are exposed. If specified, it filters all available table names by the specified list.
  • schema: If specified, this places all the tables within a schema of the specified name. For example, if schema is set to my_schema, all table names will be namespaced under my_schema, such as ["my_schema","Album"]. If not specified, then tables are not namespaced, such as ["Album"].

Here's an example configuration that only exposes the Artist and Album tables, and namespaces them under my_schema:

{
  "tables": ["Artist", "Album"],
  "schema": "my_schema"
}

Here's an example configuration that exposes all tables, un-namespaced:

{}
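
The effect of the two properties described above can be sketched as a small TypeScript function. This is a minimal illustration, not the reference agent's actual code: the AgentConfig type, the TableName alias and the exposedTables helper are hypothetical names, and the list of available tables is passed in as a parameter rather than read from the Chinook dataset.

```typescript
// Hypothetical names throughout; only the "tables" and "schema" properties
// and their described behaviour come from the README text.
type AgentConfig = { tables?: string[]; schema?: string };

// A table name is a list of path segments, e.g. ["my_schema", "Album"].
type TableName = string[];

function exposedTables(config: AgentConfig, allTables: string[]): TableName[] {
  const allowed = config.tables;
  const schema = config.schema;

  // If "tables" is omitted, expose everything; otherwise filter the
  // available table names by the specified list.
  const visible =
    allowed === undefined
      ? allTables
      : allTables.filter((name) => allowed.includes(name));

  // If "schema" is set, namespace every table under it.
  return visible.map((name) => (schema === undefined ? [name] : [schema, name]));
}
```

With the first example configuration and available tables Album, Artist and Track, this sketch yields ["my_schema","Album"] and ["my_schema","Artist"]. With the empty configuration it yields every table un-namespaced.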