# move pro rfcs to pro namespace

Many pro-only feature RFCs were added in the `rfcs` folder, which were shadowed in the OSS repo.

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/2546
GitOrigin-RevId: 0ae965393c04398014f910b5c8ecd2dcb3d5f861


---

### Motivation

If Hasura provides the ability to accept multiple admin secrets, then users can implement security mechanisms like admin key rotation.

### Configuration spec

1. Introduce an env var and command-line arg `HASURA_GRAPHQL_ADMIN_SECRETS` that can take a comma-separated list of strings (instead of the single string accepted currently).

2. If an admin secret contains a comma, then it can be _literalized_ by putting the secret in quotes (this is as per the standard CSV spec).

   For example, `HASURA_GRAPHQL_ADMIN_SECRETS: "abcd,efg", "xyz"`

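
As a rough illustration of the quoting rule, here is a minimal sketch of such a parser; this is an assumption about how it could look, not the actual implementation, and a real version would more likely reuse a CSV library:

```haskell
-- Minimal sketch of CSV-style splitting for the secrets list; ignores
-- escaped quotes ("") and other corner cases of the full CSV spec.
parseAdminSecrets :: String -> [String]
parseAdminSecrets = go
  where
    go s = case dropWhile (== ' ') s of
      '"' : rest ->                       -- quoted secret, may contain commas
        let (secret, rest') = break (== '"') rest
        in secret : continue (drop 1 rest')
      unquoted ->                         -- unquoted secret, ends at the next comma
        let (secret, rest') = break (== ',') unquoted
        in secret : continue rest'
    continue s = case dropWhile (== ' ') s of
      ',' : rest -> go rest
      _          -> []

-- >>> parseAdminSecrets "\"abcd,efg\", \"xyz\""
-- ["abcd,efg","xyz"]
```
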
### Implementing rotating secrets

To implement a secret rotation mechanism, the following can be done:

1. add a new secret to the list of admin secrets (and perform a rolling deploy)
2. update apps/backends to use the new secret
3. remove the old secret from the list (and perform a rolling deploy)

### Backwards compatibility

Note that we already have the `HASURA_GRAPHQL_ADMIN_SECRET` env var; if both `HASURA_GRAPHQL_ADMIN_SECRET` and `HASURA_GRAPHQL_ADMIN_SECRETS` are given, then we ignore `HASURA_GRAPHQL_ADMIN_SECRETS`.


---

Several mostly separate concerns:

1. How do we interrupt graphql-engine PostgreSQL client code while executing commands?
   (async + postgres cancel, switch to non-blocking libpq, FFI "interruptible", ...)
2. What should communication between graphql-engine and the PostgreSQL server be
   when interrupting? (close connection, cancel, just let it finish, ...)
3. What should the externally observable behaviour be, to someone talking to both
   graphql-engine and datasources (PostgreSQL server) as a client?
   (Do we guarantee that we communicate precisely whether a command was executed, absent
   networking issues?)
4. How do we time out operations inside graphql-engine from a high-level point of view?
   (top-level `System.Timeout.timeout`, lower-level `timeout`, check an MVar, exception
   safety, ...)

[rob]: This is a bit of a brain-dump, currently. I'd particularly like input on:

- (1.) The async/wait/catch/cancel solution seems workable to me. I'm a bit vague on
  some details, but I feel it should be possible to make this reliable and correct.
  But again, I'm a bit inexperienced with the async runtime and FFI, so I might be missing
  things.
- (1.) I can dig further into non-blocking libpq, but feel it's not worth the effort
  at this stage.
- (2.) I'm least unsure about this; it seems clear that cancelling commands is the way to
  go. But do object :)
- (3./4.) "timeout -> send cancel -> wait for response and use regular error path"
  seems cleanest, but I'm a bit vague here.
- (4.) Is the top-level use of `timeout` safe enough? I have vague doubts here, lacking
  familiarity with the code base and experience working with asynchronous exceptions.

## 1. Interrupting PostgreSQL command execution

The FFI calls to [PQexec*](https://www.postgresql.org/docs/current/libpq-exec.html#LIBPQ-EXEC-MAIN)
block the RTS, so a priori asynchronous exceptions won't be handled until after the
server responds.

(This point interacts closely with 2. below; i.e., simply making the calls interruptible
might not be enough for a correct solution towards the PostgreSQL server.)

### Interruptible FFI (doesn't work)

Marking the FFI call [interruptible](https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/ffi.html#extension-InterruptibleFFI)
should cause the OS-level thread to receive SIGPIPE on async exceptions, which would cause
system calls to fail with EINTR. However, as verified experimentally, it appears this doesn't
cause libpq to return, likely because it retries on EINTR (compare https://doxygen.postgresql.org/fe-misc_8c.html#a03fea567926b20bbeada40c134a8fdc1).

### Call libpq using `async` (separate thread)

We can make FFI actions interruptible towards the rest of the program by running them
in a separate thread:

```haskell
do
  -- run the blocking FFI call on its own thread; `wait` is
  -- interruptible even though the FFI call itself is not
  a <- async $ uninterruptibleFFI conn
  wait a
```

While this allows interrupting the action, it isn't thread-safe as such, due to
handing over the libpq connection to another thread. What we can do is cancel
the command (see 2. below) and then wait for the call to return:

```haskell
do
  -- note: `getCancel`/`cancel` here are libpq's cancel machinery
  -- (PQgetCancel/PQcancel), not Control.Concurrent.Async.cancel
  c <- getCancel conn
  a <- async $ uninterruptibleFFI conn
  res <- try $ wait a
  case res of
    Left e -> do
      -- ask the server to abort the command, then hand the exception
      -- to the FFI thread and wait for it to finish cleanly
      cancel c
      throwTo (asyncThreadId a) e
      wait a
    Right x -> return x
```

A couple of things to figure out:

- We might want to swallow certain exceptions (e.g. `System.Timeout.Timeout`
  or a custom `PGCancelCommand`), in which case we wouldn't rethrow the exception
  and instead just return regularly (with a postgres "canceled" error message if
  we canceled in time) -- this would allow user code to decide whether the command
  was actually interrupted. (See 3. below, and the sketch after this list.)
- Error handling for the call to `cancel`.

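
A minimal sketch of the swallowing variant mentioned in the first point, assuming a custom `PGCancelCommand` exception and a `sendCancel` action standing in for the libpq cancel call (both names are illustrative):

```haskell
import Control.Concurrent.Async (async, wait)
import Control.Exception (Exception, catch)

data PGCancelCommand = PGCancelCommand deriving Show
instance Exception PGCancelCommand

-- On PGCancelCommand, send the libpq cancel and keep waiting for the
-- regular result (now a postgres "canceled" error) instead of rethrowing,
-- so user code can inspect the result to see whether we canceled in time.
withCancelSwallowed :: IO () -> IO a -> IO a
withCancelSwallowed sendCancel ffiCall = do
  a <- async ffiCall
  wait a `catch` \PGCancelCommand -> sendCancel >> wait a
```
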
### Use libpq in non-blocking mode

We could rewrite parts of pg-client-hs to use libpq
[asynchronously](https://hackage.haskell.org/package/postgresql-libpq-0.9.4.3/docs/Database-PostgreSQL-LibPQ.html#g:9).
The general approach would be:

- get the connection socket file descriptor via `socket :: Connection -> IO (Maybe Fd)`
- loop with `poll()/select()` and `consumeInput`
  ([rob]: is this in the standard library somewhere? this should be interruptible; a sketch using `threadWaitRead` follows below)

[rob]: I haven't explored this option in detail since it seems like the change
has a far bigger impact, and the forking approach seems like an easier way to
explore this. May well be a good idea longer term though.

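
To make the loop concrete: a sketch using `threadWaitRead` (from `Control.Concurrent`, interruptible by async exceptions) together with the `postgresql-libpq` bindings. This assumes the query was already submitted with `sendQuery`, and elides error handling:

```haskell
import Control.Concurrent (threadWaitRead)
import qualified Database.PostgreSQL.LibPQ as PQ

-- Wait for and fetch the first result of a previously sent query,
-- blocking in threadWaitRead (interruptible) instead of in FFI.
getResultNonBlocking :: PQ.Connection -> IO (Maybe PQ.Result)
getResultNonBlocking conn = do
  mfd <- PQ.socket conn
  case mfd of
    Nothing -> return Nothing        -- connection has no socket (dead?)
    Just fd -> loop fd
  where
    loop fd = do
      busy <- PQ.isBusy conn
      if busy
        then do
          threadWaitRead fd          -- async exceptions land here
          _ <- PQ.consumeInput conn  -- read whatever the server sent
          loop fd
        else PQ.getResult conn
```
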
## 2. Interrupting towards PostgreSQL

There are a number of things that our PostgreSQL client could do when interrupted:

- closing or resetting the connection
- sending a cancel message
- processing the command normally (i.e., waiting for and reading the result from the connection)

Breaking the connection won't interrupt command execution server-side, so it seems
cancelling is generally the right thing to do:

- https://www.postgresql.org/message-id/flat/CA%2BTgmobakAdqELHtB_t8mPkU8qn4QPSJuCrRzWJFGfCJtpoV%3Dw%40mail.gmail.com
- https://dba.stackexchange.com/questions/81408/is-a-postgres-long-running-query-aborted-if-the-connection-is-lost-broken

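
For reference, sending the cancel message with the `postgresql-libpq` bindings looks roughly like the sketch below; note that a successfully delivered cancel request does not guarantee the command was actually interrupted server-side:

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Database.PostgreSQL.LibPQ as PQ

-- Ask the server to abort the command currently running on `conn`.
-- PQ.cancel returns Left errMsg if the request could not be delivered.
cancelRunningCommand :: PQ.Connection -> IO (Either String ())
cancelRunningCommand conn = do
  mc <- PQ.getCancel conn
  case mc of
    Nothing -> return (Left "no cancel handle for connection")
    Just c  -> either (Left . B.unpack) Right <$> PQ.cancel c
```
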
## 3. From the outside

Let's say we have a slow graphql mutation that translates to a PostgreSQL transaction.
It seems desirable to guarantee that we respond with:

- no error: the transaction was committed
- controlled timeout: the transaction was not committed
- other errors: case by case, generally the same behaviour as without timeouts, plus
  new failure cases with e.g. an error sending the cancel message

The idea would be that if there are no connection issues, the client can reliably
tell whether a mutation happened by looking at the result. This is possible to achieve
with cancellation, by ensuring that after we send the cancel message, we continue the
regular control flow, which would then return a PostgreSQL "command cancelled" error
if the command was cancelled, and the usual response otherwise. (E.g. in the case that
we send the cancel around the same time as the server sends the successful response.)

[rob]: I lack knowledge here as to what extent we try to guarantee anything reliable at
all with respect to transactionality towards Postgres. It's clear that we can't
provide similar guarantees e.g. towards remote schemas.

## 4. High-level graphql-engine implementation

- Execute the operation under `System.Timeout.timeout` at the top level (`runGQ`).
  [rob]: I'm unsure to what extent we're prepared to be interrupted randomly, e.g.
  are we safe against things like `r <- executePostgresQuery; logResult; return r`
  being interrupted before the logging step (and less harmless similar scenarios)?
- Execute slow operations under `System.Timeout.timeout` at a lower level. E.g.
  we might set a deadline, pass it in, and then use `System.Timeout.timeout`
  within `pg-client-hs` or different client code to ensure we don't pass that deadline.
- For either of the above cases, we might opt to throw a different asynchronous
  exception at the operation (`HasuraCancel`).
- Instead of returning `Nothing` from `timeout`, the interrupting exception could
  result in an "interrupted" error being returned along the usual error return path.
- Instead of throwing an asynchronous exception, we could pass the "interrupted"
  information via a different mechanism, e.g. an MVar. E.g. non-blocking postgres
  code could wait on both the timed-out MVar and the connection socket. (A sketch of
  such a deadline flag follows below.)

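
A sketch of that last alternative, under the assumption that client code polls a flag between steps rather than being hit by an async exception (the names are illustrative):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, tryReadMVar)
import Data.Maybe (isJust)

-- A flag that becomes full once the deadline (in microseconds) passes.
newDeadlineFlag :: Int -> IO (MVar ())
newDeadlineFlag micros = do
  flag <- newEmptyMVar
  _ <- forkIO (threadDelay micros >> putMVar flag ())
  return flag

-- Non-blocking check; client code would call this between steps, or
-- non-blocking postgres code could wait on the flag and the socket.
deadlinePassed :: MVar () -> IO Bool
deadlinePassed flag = isJust <$> tryReadMVar flag
```
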

---

# Overview

We want to be able to time out slow queries and mutations.

Concretely specced (issue [#1232](https://github.com/hasura/graphql-engine-mono/issues/1232)):

- `api_limits` in metadata gets an extra `operation_timeout` setting
  (next to `rate_limit`, `depth_limit`, `node_limit`), configurable
  project-wide and per user
- any GraphQL operation should be aborted once the timeout is reached;
  this includes REST endpoints via their mapping to GraphQL operations,
  and subscriptions.

This RFC outlines a minimal solution, and tries to provide an
overview of the design space, mostly through several appendices
that can be ignored when evaluating the concrete plan. There's
a bit more detail in
[operation-timeout-api-limits-braindump](./operation-timeout-api-limits-braindump.md).

There's a rather wide design space here; some questions,
the answers to which determine how we go about this:

1. Do we let the client know whether a query was actually interrupted
   database-server-side? Compare Appendix C.
2. Do we need queries to time out reliably when there
   are connectivity issues between graphql-engine and the
   database servers?
3. Some considerations regarding the effect of the approach
   on the stability, maintainability, and understandability of the code
   base, e.g.:
   - Is the core query code robust with respect to exception
     handling? Do we want to afford the effort of ironing out
     bugs there? Is the behavioural complexity of using asynchronous
     exceptions for high-level control flow a good trade-off
     compared to the implementation complexity of more explicit
     timeout handling?
   - Is the effort to rewrite the postgres client library in
     non-blocking mode worthwhile?

For the first implementation, we've generally opted for the choices
which require the smallest changes. None of these choices prevent
us from reevaluating in the future. Concretely:

- Answer "no" to points 1 & 2; this allows using `timeout` (compare
  Appendix B on library API design), and a minimal change
  to `pg-client-hs` (compare Appendix A on pg-client-hs).
- Answer "yes / we'll see" to the robustness question, and leave
  the potential non-blocking rewrite to the future.

# Implementation

There are a couple of mostly-independent things to do:

- modify backend library code to allow interrupting queries
  (concretely, `pg-client-hs`)
- provide hooks in the server query execution to allow us to
  apply limits
- extend the metadata to allow configuring timeouts
- implement hooks for Pro referencing the metadata

## Concrete technical plan

- `ApiLimits` in metadata gets a new `operation_timeout` field, analogous
  to the other limits
- the Postgres backend library `pg-client-hs` is modified so that queries
  are interruptible by async exceptions (absent networking issues).
  This is PR [#39](https://github.com/hasura/pg-client-hs/pull/39).
- class `HasResourceLimits` (which currently has `askResourceLimits`
  yielding an HTTP handler allocation limiter) is augmented to
  offer `askHandlerLimits` and `askOperationLimits`
  - `askOperationLimits` returns an action modifier (`m a -> m a`)
  - for Pro, we implement this action modifier to call
    `Timeout.timeout :: Int -> IO a -> IO (Maybe a)` on the action, and
    throw an error if the timeout triggered (see the sketch after this list)
- for both HTTP (in `runGQ`) and Websocket transport (in `onStart`),
  we modify query and mutation execution with this operation limiter

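
A minimal sketch of the Pro-side limiter described above; `OperationLimiter` and `OperationTimedOut` are illustrative names, not the engine's actual types:

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Exception (Exception, throwIO)
import System.Timeout (timeout)

newtype OperationLimiter = OperationLimiter (forall a. IO a -> IO a)

data OperationTimedOut = OperationTimedOut deriving Show
instance Exception OperationTimedOut

-- Build the action modifier from a timeout given in seconds: run the
-- operation under System.Timeout.timeout and turn Nothing into an error.
mkTimeoutLimiter :: Int -> OperationLimiter
mkTimeoutLimiter seconds = OperationLimiter $ \action -> do
  res <- timeout (seconds * 1000000) action
  case res of
    Nothing -> throwIO OperationTimedOut
    Just x  -> pure x
```
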
# Open questions, random thoughts

- How do we handle subscriptions concretely?
- By the placement of the limiter, we're making a choice about what
  concretely is part of the time-limited operation (e.g., do we
  limit "cache lookup + execution", or just "execution"; should
  request parsing count?)
- The graphql-engine change will affect all backends, but this RFC
  doesn't cover potentially required changes analogous to the
  `pg-client-hs` changes. Without those, long-running queries to
  other backends would still fail, but might not return promptly
  (and won't save resources).
- The unit for the timeout is seconds (but can we make that clearer? e.g.
  `operation_timeout_seconds` or `operation_timeout: 5s`)
- Offering operation limits for cloud ops (SRE), as opposed to
  project administrators.

# Appendix A: (Un-)Blocking postgres queries

The current implementation of `pg-client-hs` executes queries
synchronously in FFI, and doesn't provide a timeout mechanism.
When receiving an asynchronous exception, the action will block
until a response is received from the server. There are several
ways to go about this. (Compare also Appendix B on the library
API below.)

- `pg-client-hs` builds on the C library `libpq`, which is built
  in a way that doesn't allow us to use "interruptible"
  FFI, because it catches signals itself
- for the query to be interrupted on the database server,
  we need to send a cancel message:
  - [psql-hackers message](https://www.postgresql.org/message-id/flat/CA%2BTgmobakAdqELHtB_t8mPkU8qn4QPSJuCrRzWJFGfCJtpoV%3Dw%40mail.gmail.com)
  - [stackoverflow question](https://dba.stackexchange.com/questions/81408/is-a-postgres-long-running-query-aborted-if-the-connection-is-lost-broken)
- options:
  1. (use a native haskell postgres client)
  2. (patch libpq to allow using it with interruptible FFI)
  3. use libpq in non-blocking mode (and `select(2)` on the file
     descriptor)
  4. use libpq in non-blocking mode and busy-wait
  5. cancel queries in blocking mode

# Appendix B: Timeout API design for a database client library

Say you're designing a Haskell database client library, e.g.
`pg-client-hs`, and would like to allow library users to
time out queries. There's a range of approaches to
providing this functionality:

1. query is essentially of the type
   `query :: Query -> IO Result`;
   we just ask that `query q` respects async exceptions
   (i.e., bails out cleanly and promptly when receiving an async exception,
   throwing that exception);
   client code would be expected to use `timeout :: DiffTime -> IO a -> IO (Maybe a)`
2. query is essentially of the type
   `query :: Query -> IO (Either TimeoutError Result)`;
   if `query q` receives (a specific / any) async exception, it bails out
   cleanly, swallowing the exception and returning `Left TimeoutError`;
   client code would use a variant of `timeout` with a custom exception that doesn't
   need to decide whether the timeout was handled (`timeout' :: DiffTime -> IO a -> IO a`)
3. query is essentially of the type
   `query :: TimeLimit -> Query -> IO (Either TimeoutError Result)`;
   timeouts are handled internally in some way that the library user doesn't
   care about
   - option: Deadline vs. TimeLimit
   - async exception behaviour should ideally be as in option 1 above, but
     e.g. blocking in FFI code would be fine

   client code just needs to pass a time limit / deadline
4. query is essentially of the type
   `query :: Cancel -> Query -> IO (Either CancelError Result)`
   where `Cancel` is some channel that allows client code to signal its desire to
   interrupt a query (e.g. `MVar ()`); library code would check for cancellation
   at appropriate spots

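
To make options 1 and 2 concrete from the caller's side (with `Query`, `Result`, and `TimeoutError` as stand-ins for the library's actual types):

```haskell
import System.Timeout (timeout)

data Query
data Result
data TimeoutError

-- option 1: the library only promises async-exception safety,
-- and the caller owns the timeout
queryV1 :: Query -> IO Result
queryV1 = error "provided by the library"

callerV1 :: Query -> IO (Maybe Result)
callerV1 q = timeout (5 * 1000000) (queryV1 q)  -- Nothing on timeout

-- option 2: the library swallows the cancel exception itself and
-- reports the timeout in the return value instead
queryV2 :: Query -> IO (Either TimeoutError Result)
queryV2 = error "provided by the library"
```
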
Some trade-offs:

- options 2 and 4 offer the most flexibility to the library user
- option 2 violates the core contract of async exceptions, namely "stop what you're
  doing and bail, rethrowing the exception"
- option 1 doesn't allow the library user to determine whether the query was interrupted
  server-side (i.e., there's no way to decide between "timeout hit while the query was running
  and caused the remote transaction to be rolled back" and "timeout hit while client query
  code was still running, but the remote transaction committed")

# Appendix C: Behaviour towards graphql clients

For backends that allow reliably deciding whether a
transaction committed, do we provide this to the client?
I.e., we could guarantee the following:
assuming a graphql query response is received client-side,
then either:

- there's no error and the query was executed successfully
- there's an error of the class "all bets are off" (e.g.,
  network error talking to the database backend)
- there's an error of the class "database query not performed" (e.g.,
  "malformed graphql", "table doesn't exist", "server shut down
  while running query"); and if our new timeout triggers, it will
  cause the transaction not to be committed, and be reported via this
  class


---

This document outlines our support for Postgres client certificates in
multitenant.

# API

<https://github.com/hasura/graphql-engine-mono/issues/1143>

We want to use env vars to correlate certificate data with postgres
sources. We want to allow users to add sources with client certs via
the metadata API:

```json
{
  "type": "pg_add_source",
  "args": {
    "name": "pg1",
    "configuration": {
      "connection_info": {
        "database_url": {
          "from_env": "<DB_URL_ENV_VAR>"
        },
        "ssl_configuration": {
          "sslmode": "verify-ca",
          "sslrootcert": {
            "from_env": "<SSL_ROOT_CERT_ENV_VAR>"
          },
          "sslcert": {
            "from_env": "<SSL_CERT_ENV_VAR>"
          },
          "sslkey": {
            "from_env": "<SSL_KEY_ENV_VAR>"
          },
          "sslpassword": {
            "from_env": "<SSL_PASSWORD_ENV_VAR>"
          }
        }
      }
    }
  }
}
```

Cert data is stored on disk at a path provided by the flag
`--tenantCertPath`. If this flag is not provided, it is stored in a uniquely generated folder.

# Implementation

The `PostgresSourceConnInfo` type has been extended to include an
optional set of `PGClientCerts`, which itself contains the env vars
provided to the metadata API:

```haskell
data PostgresSourceConnInfo
  = PostgresSourceConnInfo
  { _psciDatabaseUrl           :: !UrlConf
  , _psciPoolSettings          :: !(Maybe PostgresPoolSettings)
  , _psciUsePreparedStatements :: !Bool
  , _psciIsolationLevel        :: !Q.TxIsolation
  , _psciSslConfiguration      :: !(Maybe (PGClientCerts CertVar CertVar))
  }
```

When resolving a `PostgresConnConfiguration` in `getSourceResolver`, we
check for the existence of `PGClientCerts`. If present, the env
vars are queried in the environment and the cert data they
contain is written to disk at the `tenantCertPath` in a per-tenant
sub-folder.

The file paths are then merged into the postgres connection string as
query parameters. The module `HasuraPro.PGClientCerts` encapsulates
all the cert merging and writing operations.

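
As a rough illustration only (the real logic lives in `HasuraPro.PGClientCerts` and handles escaping and the `UrlConf` type properly), merging resolved cert file paths into a connection URI as query parameters might look like:

```haskell
import Data.List (intercalate)

-- Append ssl* query parameters to a postgres connection URI; assumes
-- keys/values need no URI escaping, which the real code must handle.
appendCertParams :: String -> [(String, FilePath)] -> String
appendCertParams url [] = url
appendCertParams url params = url ++ sep ++ rendered
  where
    sep      = if '?' `elem` url then "&" else "?"
    rendered = intercalate "&" [k ++ "=" ++ v | (k, v) <- params]

-- >>> appendCertParams "postgres://host/db"
--       [("sslmode","verify-ca"),("sslcert","/certs/tenant1/client.crt")]
-- "postgres://host/db?sslmode=verify-ca&sslcert=/certs/tenant1/client.crt"
```
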
Cleanup of cert data is handled in the multitenant lifecycle
loop. When a tenant is taken down or updated, the entire tenant
subfolder of `tenantCertPath` is deleted. This ensures that cert data
does not linger longer than necessary and that during updates we don't
end up with stale cert data.


---
authors: Naveenaidu <naveen@hasura.io>
---

# Query Tags

## Motivation

Users of native monitoring tools (pganalyze, RDS performance insights) need
the ability to see some "application context" in their query logs. This is a
Pro-only feature.

This can be provided by appending some extra information as comments to the
SQL statements Hasura generates.

## Brief Overview

Query tags are SQL comments made up of `(key=value)` pairs that
are appended to the SQL statements generated by Hasura for GraphQL operations.
The `(key=value)` pairs will henceforth be called `attributes`. The main
idea is to render the attributes as a SQL comment and append it to the SQL
statement that is generated when a GraphQL operation is run via Hasura.

For example, when the following query is sent to Hasura:

```graphql
query GetChild {
  child {
    name
  }
}
```

Hasura attaches query tags (*unless disabled*) to the generated SQL statement it
sends to the database:

```sql
SELECT * FROM child /* request_id=487c2ed5-08a4-429a-b0e0-4666a82e3cc6, field_name=child, operation_name=GetChild */
```

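
A minimal sketch of rendering attributes in the `standard` format shown above (a sketch only; the real implementation also supports the `sqlcommenter` format and handles escaping):

```haskell
import Data.List (intercalate)

type Attribute = (String, String)

-- Render (key=value) attributes as a standard-format SQL comment,
-- ready to be appended to the generated statement.
toStandardComment :: [Attribute] -> String
toStandardComment []    = ""
toStandardComment attrs =
  " /* " ++ intercalate ", " [k ++ "=" ++ v | (k, v) <- attrs] ++ " */"

-- >>> "SELECT * FROM child" ++ toStandardComment [("operation_name", "GetChild")]
-- "SELECT * FROM child /* operation_name=GetChild */"
```
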
## Metadata Specification

Since there is various vendor-specific nitty-gritty (for the vendors that support
automatic visualization) around the query tags format, in the interest of (future)
customization we will need to introduce a configuration element in the metadata.

The `query_tags` config will be a part of the `source`, i.e.

`source` -> `query_tags`

```yaml
sources:
  name: hge_cloud_db
  configuration:
    query_tags: # Optional Field
      disabled: # Optional Field | Type: Bool | Values: true or false
      format:   # Optional Field | Values: standard or sqlcommenter
```

Query tags are enabled for all sources by default, and the default format for
query tags is `standard`. That means:

1. The `query_tags` field is optional.
2. If the `query_tags` field is not present for a source in metadata, then
   query tags are enabled for the source and the format of the query tags
   is `standard`.
3. To disable query tags, set the value of the `disabled` field to `true`.
4. A user only has to write the `query_tags` field for a source
   when they want to disable it for that source OR change the format of the
   query tags.

## Approach

1. Add a query-tags configuration field to the `source` field of metadata. This is
   done by adding the query tags to the `SourceMetadata` type, since the
   `configuration` field of metadata is parsed as the `SourceConnConfiguration`
   datatype. Different backends have different formats of
   `SourceConnConfiguration`, so if you want to configure query tags for a
   specific backend, make sure to add `query-tags` to the backend-specific
   `SourceConnConfiguration`.

2. The query tags configuration is made part of `SourceInfo b` so that it is
   accessible via `SourceConfigWith`. Ideally, query tags would have been
   part of `SourceConfig`, but due to the way our incremental schema building
   works, even if only the query tags configuration changes due to an incremental
   API call, the entire `SourceConfig` gets rebuilt (_see
   `resolveSourceConfigIfNeeded`_), which means the connection pools also get
   rebuilt. That is a heavy cost, so we decided it's a saner approach to have
   the query tags config as part of `SourceConfigWith`.

3. Create a new typeclass `MonadQueryTags` which has a `createQueryTags`
   function (see the sketch after this list). This new typeclass is necessary
   because query tags are a Pro-specific feature.

4. Add a `MonadReader QueryTagsComment` constraint to the `mkDB..Plan` functions.
   This constraint lets the backend-specific implementation of a `mkDB..Plan`
   function ask for the query tags comment if it wants to, reducing the
   friction for implementors of new backends, because they do not need
   to be concerned about query tags. `QueryTagsComment` carries the comment to
   append to the generated SQL, if a backend decides to do so.

5. The query tags are generated where the `mkDB..Plan` functions are called,
   since all the information necessary to make the query tags is available
   there. The query tags are converted to a SQL comment at the same place and
   passed down the stack to where the SQL is generated for the GraphQL
   operation. The query tags are then appended to the SQL.

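
A sketch of the typeclass from step 3; the concrete types here (`QueryTagsAttributes`, `QueryTagsConfig`, `QueryTagsComment`) are stand-ins, and the real signature in the engine may differ:

```haskell
newtype QueryTagsAttributes = QueryTagsAttributes [(String, String)]

data QueryTagsConfig = QueryTagsConfig
  { qtcDisabled :: Bool
  , qtcFormat   :: String  -- "standard" or "sqlcommenter"
  }

newtype QueryTagsComment = QueryTagsComment String

-- Pro provides a real instance; OSS can provide a no-op instance that
-- always returns an empty comment.
class Monad m => MonadQueryTags m where
  createQueryTags :: QueryTagsAttributes -> QueryTagsConfig -> m QueryTagsComment
```
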
### Note

The above approach is only the first iteration of query tags.

We initially planned to have `appendQueryTags` inside the `BackendExecute`
typeclass, but that did not work out well, as `appendQueryTags` would then
depend on the `PreparedQuery b` type family, which is not used by mutations
and, for some backends, only gets created after pretty-printing.


---

## Metadata examples

1. Global allowlist
   ```yaml
   allowlist:
     - collection: bar
       scope: #optional
         global: true #required
   ```

2. Role-based allowlist
   ```yaml
   allowlist:
     - collection: bar
       scope: #required for role-based allowlist
         global: false #required
         roles: [foo, bar] #required for role-based allowlist
   ```

## Metadata format

1. The allowlist is an array of allowlist entries.
2. Each allowlist entry is an object, of one of 2 forms: global and role-based.

   i. Each allowlist entry is required to contain a `collection`.

   ii. A role-based allowlist entry is required to have a `scope`, containing `global: false` and `roles` (which is a non-empty array of roles).

   iii. A global allowlist entry is not required to have a `scope` field, but if it does, we expect `global: true` and no roles. (This means that if `scope` is not provided, then it is assumed to be a global collection, which ensures backwards compatibility.)

3. Collection names must be unique in the entire allowlist (i.e. there must not be multiple allowlist entries with the same collection name).
4. The incremental API `add_collection_to_allowlist` takes an additional (optional) field `scope`.
5. We introduce a new incremental API `update_scope_of_collection_in_allowlist`. This has the fields `collection` and `scope` (both required). (A sketch of the scope validation rules follows below.)

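
A sketch of those validation rules, with `AllowlistScope` as an illustrative type (the real metadata types differ):

```haskell
data AllowlistScope
  = ScopeGlobal
  | ScopeRoles [String]  -- non-empty in valid metadata
  deriving Show

-- Validate a parsed scope object: the `global` flag plus optional roles.
validateScope :: Bool -> [String] -> Either String AllowlistScope
validateScope True  []    = Right ScopeGlobal
validateScope True  _     = Left "roles should not be provided for collection with global scope"
validateScope False []    = Left "roles cannot be empty for collection with non-global scope"
validateScope False roles = Right (ScopeRoles roles)
```
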
## API Behaviour

1. Global collections are available for ALL roles.

2. Role-based collections are _additionally_ available to the configured roles.

3. When `add_collection_to_allowlist` does not have a `scope`:

   a) If the collection already exists (global or role-based), no-op success with a response saying "collection already exists and given scope is ignored. To change scope, use the `update_scope_of_collection_in_allowlist` API". The main reason for not erroring here is that `add_collection_to_allowlist` is currently idempotent and we want to be backwards compatible.

   b) If the collection doesn't exist, then it is created as a global collection.

4. When `add_collection_to_allowlist` has a `scope` field:

   a) If the collection already exists (global or role-based) in the allowlist, no-op success with a response saying "collection already exists and given scope is ignored. To change scope, use the `update_scope_of_collection_in_allowlist` API".

   b) If the collection doesn't exist in the allowlist, then it is created with the given `scope`. If `scope` doesn't conform to the format specified in the "Metadata format" section, then an appropriate error is thrown. E.g. "roles is missing for collection with non-global scope", "roles cannot be empty for collection with non-global scope", "roles should not be provided for collection with global scope".

5. For `drop_collection_from_allowlist`:

   a) If the collection exists (global or role-based) in the allowlist, then we drop the collection.

   b) If the collection doesn't exist in the allowlist, then an error is thrown saying the collection does not exist in the allowlist.

6. For `update_scope_of_collection_in_allowlist`:

   a) If the collection exists (global or role-based) in the allowlist, then we set its scope to the provided `scope`. If `scope` doesn't conform to the format specified in the "Metadata format" section, then an appropriate error is thrown, with the same error messages as in 4b.

   b) If the collection doesn't exist in the allowlist, then an error is thrown saying the collection does not exist in the allowlist.

**Backwards compatibility NOTE**

When a metadata spec configured with role-based allowlists is used with an older Hasura version, the `scope` key is silently ignored and the collection becomes global. We should document this. Ideally, even with the allowlist, role-based authz will kick in for the collections.
