# Data Connector Agent for SQLite

This directory contains an SQLite implementation of a data connector agent. It can use local SQLite database files as referenced by the `db` config field.
## Capabilities
The SQLite agent currently supports the following capabilities:
- GraphQL Schema
- GraphQL Queries
- Relationships
- Aggregations
- Prometheus Metrics
- Exposing Foreign-Key Information
- Mutations
- Subscriptions
- Streaming Subscriptions
Note: You are able to get detailed metadata about the agent's capabilities by `GET`ting the `/capabilities` endpoint of the running agent.
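For example, assuming the agent is running locally on the default port (8100), the capabilities document can be fetched with a plain HTTP GET:

```sh
curl http://localhost:8100/capabilities
```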
## Requirements

- NodeJS 16
- SQLite `>= 3.38.0` or compiled in JSON support
  - Required for the `json_group_array()` and `json_group_object()` aggregate SQL functions - https://www.sqlite.org/json1.html#jgrouparray
- Note: NPM is used for the TS Types for the DC-API protocol
## Build & Run

```sh
npm install
npm run build
npm run start
```

Or a simple dev-loop via `entr`:

```sh
echo src/**/*.ts | xargs -n1 echo | DB_READONLY=y entr -r npm run start
```
## Docker Build & Run

```sh
> docker build . -t dc-sqlite-agent:latest
> docker run -it --rm -p 8100:8100 dc-sqlite-agent:latest
```

You will want to mount a volume with your database(s) so that they can be referenced in configuration.
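For example, a hypothetical invocation that mounts a local `./db` directory into the container at `/db` might look like this; a database file in that directory can then be referenced from configuration as `/db/mydb.sqlite`:

```sh
docker run -it --rm -p 8100:8100 -v "$(pwd)/db:/db" dc-sqlite-agent:latest
```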
## Options / Environment Variables

Note: Boolean flags `{FLAG}` can be provided as `1`, `true`, `t`, `yes`, `y`, or omitted, and default to `false`.
| ENV Variable Name | Format | Default | Info |
| --- | --- | --- | --- |
| `PORT` | `INT` | `8100` | Port for agent to listen on. |
| `PERMISSIVE_CORS` | `{FLAG}` | `false` | Allows all requests - useful for testing with SwaggerUI. Turn off in production. |
| `DB_CREATE` | `{FLAG}` | `false` | Allows new databases to be created. |
| `DB_READONLY` | `{FLAG}` | `false` | Makes databases readonly. |
| `DB_ALLOW_LIST` | `DB1[,DB2]*` | Any Allowed | Restrict what databases can be connected to. |
| `DB_PRIVATECACHE` | `{FLAG}` | Shared | Keep caches between connections private. |
| `DEBUGGING_TAGS` | `{FLAG}` | `false` | Outputs XML-style tags in query comments for debugging purposes. |
| `PRETTY_PRINT_LOGS` | `{FLAG}` | `false` | Uses `pino-pretty` to pretty print request logs. |
| `LOG_LEVEL` | `fatal` \| `error` \| `info` \| `debug` \| `trace` \| `silent` | `info` | The minimum log level to output. |
| `METRICS` | `{FLAG}` | `false` | Enables a `/metrics` Prometheus metrics endpoint. |
| `QUERY_LENGTH_LIMIT` | `INT` | `Infinity` | Puts a limit on the length of generated SQL before execution. |
| `DATASETS` | `{FLAG}` | `false` | Enable dataset operations. |
| `DATASET_DELETE` | `{FLAG}` | `false` | Enable `DELETE /datasets/:name`. |
| `DATASET_TEMPLATES` | `DIRECTORY` | `./dataset_templates` | Directory to clone datasets from. |
| `DATASET_CLONES` | `DIRECTORY` | `./dataset_clones` | Directory to clone datasets to. |
| `MUTATIONS` | `{FLAG}` | `false` | Enable mutation support. |
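For example, to start the agent with a few of these options set (an illustrative combination, not a recommendation):

```sh
PERMISSIVE_CORS=1 LOG_LEVEL=debug QUERY_LENGTH_LIMIT=10000 npm run start
```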
## Agent usage

The agent is configured as per the configuration schema. The valid configuration properties are:

| Property | Type | Default |
| --- | --- | --- |
| `db` | `string` | |
| `tables` | `string[]` | `null` |
| `include_sqlite_meta_tables` | `boolean` | `false` |
| `explicit_main_schema` | `boolean` | `false` |
The only required property is `db`, which specifies a local SQLite database to use.

The schema is exposed via introspection, but you can limit which tables are referenced by:

- Explicitly enumerating them via the `tables` property, or
- Toggling `include_sqlite_meta_tables` to include or exclude SQLite meta tables.

The `explicit_main_schema` field can be set to opt into exposing tables by their fully qualified names (i.e. `["main", "MyTable"]` instead of just `["MyTable"]`).
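For illustration, a configuration using these properties might look like the following (the database path and table names are hypothetical):

```json
{
  "db": "/db/mydb.sqlite",
  "tables": ["MyTable", "MyOtherTable"],
  "include_sqlite_meta_tables": false,
  "explicit_main_schema": false
}
```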
## Dataset
The dataset used for testing the reference agent is sourced from:
## Datasets

Dataset support is enabled via the ENV variables:

- `DATASETS`
- `DATASET_DELETE`
- `DATASET_TEMPLATES`
- `DATASET_CLONES`
Templates will be looked up at `${DATASET_TEMPLATES}/${template_name}.sqlite` or `${DATASET_TEMPLATES}/${template_name}.sql`. The `.sqlite` templates are just SQLite database files that will be copied as a clone. The `.sql` templates are SQL script files that will be run against a blank SQLite database in order to create a clone.

Clones will be copied to `${DATASET_CLONES}/${clone_name}.sqlite`.
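As a concrete illustration (template and clone names are hypothetical), with the default directories a template named `MyTemplate` and a clone named `my_clone` would map to the following paths:

```
./dataset_templates/MyTemplate.sqlite   (or ./dataset_templates/MyTemplate.sql)
./dataset_clones/my_clone.sqlite
```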
## Testing Changes to the Agent

Ensure you run the agent with `DATASETS=1 DATASET_DELETE=1 MUTATIONS=1` in order to enable testing of mutations.

Then, from the HGE repo, run:

```sh
cabal run dc-api:test:tests-dc-api -- test --agent-base-url http://localhost:8100 sandwich --tui
```
## Known Issues
- Using "returning" in insert/update/delete mutations where you join across relationships that are affected by the insert/update/delete mutation itself may return inconsistent results. This is because of this issue with SQLite: https://sqlite.org/forum/forumpost/9470611066
## TODO

- Prometheus metrics hosted at `/metrics`
- Pull reference types from a package rather than checked-in files
- Health Check
- DB Specific Health Checks
- Schema
- Capabilities
- Query
- Array Relationships
- Object Relationships
- Ensure everything is escaped correctly - https://sequelize.org/api/v6/class/src/sequelize.js~sequelize#instance-method-escape
  - Or... Use parameterized queries if possible - https://sequelize.org/docs/v6/core-concepts/raw-queries/#bind-parameter
- Run test-suite from SDK
- Remove old queries module
- Relationships / Joins
- Rename `resultTT` and other badly named types in the `schema.ts` module
- Add ENV Variable for restriction on what databases can be used
- Update to the latest types
- Port back to hge codebase as an official reference agent
- Make `escapeSQL` global to the query module
- Make CORS permissions configurable
- Optional DB Allowlist
- Fix SDK Test suite to be more flexible about descriptions
- READONLY option
- CREATE option
- Don't create DB option
- Aggregate queries
- Verbosity settings
- Cache settings
- Missing WHERE clause from object relationships
- Reuse `find_table_relationship` in more scenarios
- ORDER clause in aggregates breaks SQLite parser for some reason
- Check that looped exist check doesn't cause name conflicts
- `NOT EXISTS IS NULL` != `EXISTS IS NOT NULL`
- Mutation support