mirror of
https://github.com/hasura/graphql-engine.git
synced 2024-12-14 17:02:49 +03:00
140 lines
5.7 KiB
Markdown
140 lines
5.7 KiB
Markdown
|
# Handling of Identity Columns
|
||
|
|
||
|
## Metadata
|
||
|
|
||
|
```
|
||
|
---
|
||
|
authors: Philip Lykke Carlsen <philip@hasura.io>
|
||
|
discussion:
|
||
|
https://github.com/hasura/graphql-engine-mono/issues/2407
|
||
|
https://github.com/hasura/graphql-engine-mono/pull/2507
|
||
|
state: pending answers to unresolved questions
|
||
|
---
|
||
|
```
|
||
|
|
||
|
## Description
|
||
|
|
||
|
This RFC collects discussion and decisions on how we want Identity Columns to
|
||
|
work in the GraphQL Engine.
|
||
|
|
||
|
## Problem
|
||
|
|
||
|
Identity Columns are an SQL standard database feature that attempts to solve
|
||
|
the problem of generating row identifiers in a more sound way than naive
|
||
|
auto-incrementing columns. This works by imposing restrictions on how such
|
||
|
columns may be updated.
|
||
|
|
||
|
This means that, in order for the GraphQL engine to correctly deal with tables
|
||
|
that have identity columns it has to observe these restrictions, specifically
|
||
|
when updating and inserting.
|
||
|
|
||
|
It is also possible to sometimes override the constraints imposed by Identity
|
||
|
Columns, and we need to decide what we want to support and how we want to
|
||
|
support it.
|
||
|
|
||
|
## Available options
|
||
|
|
||
|
Overall, there are two flavors of identity columns we may encounter:
|
||
|
|
||
|
* (Postgres only) Identity columns declared `GENERATED BY DEFAULT AS IDENTITY`
|
||
|
work just like regular `SERIAL` columns and impose no further constraints.
|
||
|
We can view these as being identity-columns-in-name-only.
|
||
|
|
||
|
* The more "true" Identity Columns, supported by both MSSQL and PostgreSQL, are not
|
||
|
updatable and only insertable using and override mechanism:
|
||
|
* In MSSQL, a column declared `IDENTITY(..)` may be inserted into only when `SET
|
||
|
IDENTITY_INSERT` is applied to that table.
|
||
|
* In Postgres, a column declared `GENERATED ALWAYS AS IDENTITY`
|
||
|
may be inserted into by giving the clause `OVERRIDING SYSTEM VALUE` in an
|
||
|
`INSERT` statement.
|
||
|
|
||
|
**We need to decide how/when/if we want to expose the overriding mechanism in
|
||
|
our GraphQL API** (see the Unresolved Questions section below).
|
||
|
|
||
|
## How
|
||
|
|
||
|
Implementing the handling of identity columns should apply the architecture
|
||
|
described in [Column Mutability](/rfcs/column-mutability.md).
|
||
|
|
||
|
If we go with the non-overriding policy described above there should not be
|
||
|
any changes necessary to SQL translation for either MSSQL or PostgreSQL.
|
||
|
|
||
|
The only necessary change then ought to be amending the table metadata
|
||
|
extraction (for both MSSQL and PostgreSQL) to identify identity columns
|
||
|
and set column mutability accordingly (i.e. not insertable, not updatable).
|
||
|
|
||
|
## Unresolved Questions
|
||
|
|
||
|
_When, if ever, should we make use of the constraints overriding mechanisms
|
||
|
described above? Do we want to never override? Always? Make it configurable?_
|
||
|
|
||
|
Note that:
|
||
|
* Column Mutability guides us for how to implement the schema generation aspects
|
||
|
of either choice (of "non-overriding" vs "overriding")
|
||
|
* Leaving this unanswered does not block implementation of basically correct
|
||
|
handling of identity columns.
|
||
|
* But the implementation will have to make an (arbitrary) choice between the two.
|
||
|
A reasonable choice would be to select "non-overriding".
|
||
|
* We don't expect any complications to result from amending the implementation
|
||
|
at a later point in time.
|
||
|
|
||
|
|
||
|
## Appendix
|
||
|
|
||
|
The purpose of this appendix is to collect relevant information on the concept
|
||
|
of _Identity Columns_ and inform the implementation of GraphQL Engine.
|
||
|
|
||
|
* Part of the SQL standard.
|
||
|
* Motivation is to standardise DB-supplied identifiers (i.e. autoincrement/serial/..)
|
||
|
* Note: This is a concept distinct from primary keys. Identity Columnss don't introduce
|
||
|
uniqueness constraints by themselves!
|
||
|
* Also provide better semantics than naive auto-increment/serial solutions, by
|
||
|
prohibiting updating and inserting of Identity Columns (to an extent), in order to
|
||
|
avoid issue where auto-increment logic produces duplicates because conflicts
|
||
|
with manual inserts/updates.
|
||
|
* Interestingly, no-one seems to actually link to the standard they implement from.
|
||
|
* Implemented in PG, MSSQL, DB2 and Oracle (also Oracle-NoSQL, ironically)
|
||
|
* Not implemented in MySQL or SQLite
|
||
|
* Introduces some complications/extra coordination for replication/backup.
|
||
|
|
||
|
In a sentence:
|
||
|
> Identity columns are immutable, sequentially distinct values
|
||
|
> provided only by the DBMS
|
||
|
|
||
|
### MSSQL semantics
|
||
|
|
||
|
[MSSQL TSQL Identity Columns](https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql-identity-property?view=sql-server-ver15)
|
||
|
|
||
|
* Possible to `INSERT` values for Identity Columns, but guarded by a `SET INSERT_IDENTITY <tablename> ON` statement.
|
||
|
* Impossible to `UPDATE` values for Identity Columns.
|
||
|
* Syntax differs from SQL standard: `column IDENTITY(type, seed, increment)`.
|
||
|
|
||
|
### PostgreSQL Semantics
|
||
|
|
||
|
[PG Create table syntax (including GENERATED)](https://www.postgresql.org/docs/devel/sql-createtable.html)
|
||
|
|
||
|
* Syntax closer to SQL standard: `column GENERATED BY DEFAULT AS IDENTITY`, `column GENERATED ALWAYS AS IDENTITY`.
|
||
|
* Implemented on top of `series`.
|
||
|
* Columns `GENERATED BY DEFAULT` may be both `INSERT`ed and and `UPDATE`d.
|
||
|
* Columns `GENERATED ALWAYS` may be `INSERT`ed (guarded by an `OVERRIDE SYSTEM VALUE` keyword), but never `UPDATE`d.
|
||
|
|
||
|
|
||
|
### Links
|
||
|
|
||
|
[Don't use serial](https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_serial):
|
||
|
> For new applications, identity columns should be used instead.
|
||
|
>
|
||
|
> Why not?
|
||
|
>
|
||
|
> The serial types have some weird behaviors that make schema, dependency, and permission management unnecessarily cumbersome.
|
||
|
|
||
|
[SE: pg serial vs identity](https://stackoverflow.com/questions/55300370/postgresql-serial-vs-identity)
|
||
|
|
||
|
[Implementers blog post](https://www.2ndquadrant.com/en/blog/postgresql-10-identity-columns/)
|
||
|
|
||
|
[Technical details blog post](https://www.depesz.com/2017/04/10/waiting-for-postgresql-10-identity-columns/)
|
||
|
|
||
|
[Wikipedia: Identity Columns](https://en.wikipedia.org/wiki/Identity_column)
|
||
|
|
||
|
|