# Data Connector Agent for SQLite

This directory contains an SQLite implementation of a data connector agent. It can use local SQLite database files as referenced by the `db` config field.
## Capabilities
The SQLite agent currently supports the following capabilities:
- GraphQL Schema
- GraphQL Queries
- Relationships
- Aggregations
- Prometheus Metrics
- Exposing Foreign-Key Information
- Mutations
- Subscriptions
- Streaming Subscriptions
Note: You are able to get detailed metadata about the agent's capabilities by `GET`ting the `/capabilities` endpoint of the running agent.
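For example, assuming the agent is running locally on the default port (8100), the capabilities document can be fetched with a plain HTTP GET:

```sh
curl http://localhost:8100/capabilities
```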
## Requirements

- NodeJS 16
- SQLite `>= 3.38.0` or compiled in JSON support
  - Required for the `json_group_array()` and `json_group_object()` aggregate SQL functions - https://www.sqlite.org/json1.html#jgrouparray
- Note: NPM is used for the TS Types for the DC-API protocol
## Build & Run

```sh
npm install
npm run build
npm run start
```

Or a simple dev-loop via `entr`:

```sh
echo src/**/*.ts | xargs -n1 echo | DB_READONLY=y entr -r npm run start
```
## Docker Build & Run

```sh
> docker build . -t dc-sqlite-agent:latest
> docker run -it --rm -p 8100:8100 dc-sqlite-agent:latest
```

You will want to mount a volume with your database(s) so that they can be referenced in configuration.
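For example, a hypothetical invocation that mounts a local `./db` directory into the container at `/db` might look like this; a database file in that directory can then be referenced from configuration as `/db/mydb.sqlite`:

```sh
docker run -it --rm -p 8100:8100 -v "$(pwd)/db:/db" dc-sqlite-agent:latest
```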
## Options / Environment Variables

Note: Boolean flags `{FLAG}` can be provided as `1`, `true`, `t`, `yes`, `y`, or omitted, and default to `false`.
| ENV Variable Name | Format | Default | Info |
| --- | --- | --- | --- |
| `PORT` | `INT` | `8100` | Port for agent to listen on. |
| `PERMISSIVE_CORS` | `{FLAG}` | `false` | Allows all requests - useful for testing with SwaggerUI. Turn off in production. |
| `DB_CREATE` | `{FLAG}` | `false` | Allows new databases to be created. |
| `DB_READONLY` | `{FLAG}` | `false` | Makes databases readonly. |
| `DB_ALLOW_LIST` | `DB1[,DB2]*` | Any Allowed | Restrict what databases can be connected to. |
| `DB_PRIVATECACHE` | `{FLAG}` | Shared | Keep caches between connections private. |
| `DEBUGGING_TAGS` | `{FLAG}` | `false` | Outputs XML-style tags in query comments for debugging purposes. |
| `PRETTY_PRINT_LOGS` | `{FLAG}` | `false` | Uses `pino-pretty` to pretty print request logs. |
| `LOG_LEVEL` | `fatal` \| `error` \| `info` \| `debug` \| `trace` \| `silent` | `info` | The minimum log level to output. |
| `METRICS` | `{FLAG}` | `false` | Enables a `/metrics` Prometheus metrics endpoint. |
| `QUERY_LENGTH_LIMIT` | `INT` | `Infinity` | Puts a limit on the length of generated SQL before execution. |
| `DATASETS` | `{FLAG}` | `false` | Enable dataset operations. |
| `DATASET_DELETE` | `{FLAG}` | `false` | Enable `DELETE /datasets/:name`. |
| `DATASET_TEMPLATES` | `DIRECTORY` | `./dataset_templates` | Directory to clone datasets from. |
| `DATASET_CLONES` | `DIRECTORY` | `./dataset_clones` | Directory to clone datasets to. |
| `MUTATIONS` | `{FLAG}` | `false` | Enable mutation support. |
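For example, to start the agent with a few of these options set (an illustrative combination, not a recommendation):

```sh
PERMISSIVE_CORS=1 LOG_LEVEL=debug QUERY_LENGTH_LIMIT=10000 npm run start
```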
## Agent usage

The agent is configured as per the configuration schema. The valid configuration properties are:

| Property | Type | Default |
| --- | --- | --- |
| `db` | `string` | |
| `tables` | `string[]` | `null` |
| `include_sqlite_meta_tables` | `boolean` | `false` |
| `explicit_main_schema` | `boolean` | `false` |
The only required property is `db`, which specifies a local SQLite database to use.

The schema is exposed via introspection, but you can limit which tables are referenced by:

- Explicitly enumerating them via the `tables` property, or
- Toggling `include_sqlite_meta_tables` to include or exclude SQLite meta tables.

The `explicit_main_schema` field can be set to opt into exposing tables by their fully qualified names (i.e. `["main", "MyTable"]` instead of just `["MyTable"]`).
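For illustration, a configuration using these properties might look like the following (the database path and table names are hypothetical):

```json
{
  "db": "/db/mydb.sqlite",
  "tables": ["MyTable", "MyOtherTable"],
  "include_sqlite_meta_tables": false,
  "explicit_main_schema": false
}
```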
## Dataset
The dataset used for testing the reference agent is sourced from:
## Datasets

Dataset support is enabled via the ENV variables:

- `DATASETS`
- `DATASET_DELETE`
- `DATASET_TEMPLATES`
- `DATASET_CLONES`
Templates will be looked up at `${DATASET_TEMPLATES}/${template_name}.sqlite` or `${DATASET_TEMPLATES}/${template_name}.sql`. The `.sqlite` templates are just SQLite database files that will be copied as a clone. The `.sql` templates are SQL script files that will be run against a blank SQLite database in order to create a clone.

Clones will be copied to `${DATASET_CLONES}/${clone_name}.sqlite`.
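As a concrete illustration (template and clone names are hypothetical), with the default directories a template named `MyTemplate` and a clone named `my_clone` would map to the following paths:

```
./dataset_templates/MyTemplate.sqlite   (or ./dataset_templates/MyTemplate.sql)
./dataset_clones/my_clone.sqlite
```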
## Testing Changes to the Agent

Ensure you run the agent with `DATASETS=1 DATASET_DELETE=1 MUTATIONS=1` in order to enable testing of mutations.

Then, from the HGE repo, run:

```sh
cabal run dc-api:test:tests-dc-api -- test --agent-base-url http://localhost:8100 sandwich --tui
```
## Known Issues
- Using "returning" in insert/update/delete mutations where you join across relationships that are affected by the insert/update/delete mutation itself may return inconsistent results. This is because of this issue with SQLite: https://sqlite.org/forum/forumpost/9470611066
## TODO

- Prometheus metrics hosted at `/metrics`
- Pull reference types from a package rather than checked-in files
- Health Check
- DB Specific Health Checks
- Schema
- Capabilities
- Query
- Array Relationships
- Object Relationships
- Ensure everything is escaped correctly - https://sequelize.org/api/v6/class/src/sequelize.js~sequelize#instance-method-escape
  - Or... Use parameterized queries if possible - https://sequelize.org/docs/v6/core-concepts/raw-queries/#bind-parameter
- Run test-suite from SDK
- Remove old queries module
- Relationships / Joins
- Rename `resultTT` and other badly named types in the `schema.ts` module
- Add ENV Variable for restriction on what databases can be used
- Update to the latest types
- Port back to hge codebase as an official reference agent
- Make `escapeSQL` global to the query module
- Make CORS permissions configurable
- Optional DB Allowlist
- Fix SDK Test suite to be more flexible about descriptions
- READONLY option
- CREATE option
- Don't create DB option
- Aggregate queries
- Verbosity settings
- Cache settings
- Missing WHERE clause from object relationships
- Reuse `find_table_relationship` in more scenarios
- ORDER clause in aggregates breaks SQLite parser for some reason
- Check that looped exist check doesn't cause name conflicts
- `NOT EXISTS IS NULL` != `EXISTS IS NOT NULL`
- Mutation support