The main aim of the PR is:
1. To set up a module structure for 'remote-schemas' package.
2. Move parts by the remote schema codebase into the new module structure to validate it.
## Notes to the reviewer
Why a PR with large-ish diff?
1. We've been making progress on the MM project but we don't yet know long it is going to take us to get to the first milestone. To understand this better, we need to figure out the unknowns as soon as possible. Hence I've taken a stab at the first two items in the [end-state](https://gist.github.com/0x777/ca2bdc4284d21c3eec153b51dea255c9) document to figure out the unknowns. Unsurprisingly, there are a bunch of issues that we haven't discussed earlier. These are documented in the 'open questions' section.
1. The diff is large but that is only code moved around and I've added a section that documents how things are moved. In addition, there are fair number of PR comments to help with the review process.
## Changes in the PR
### Module structure
Sets up the module structure as follows:
```
Hasura/
RemoteSchema/
Metadata/
Types.hs
SchemaCache/
Types.hs
Permission.hs
RemoteRelationship.hs
Build.hs
MetadataAPI/
Types.hs
Execute.hs
```
### 1. Types representing metadata are moved
Types that capture metadata information (currently scattered across several RQL modules) are moved into `Hasura.RemoteSchema.Metadata.Types`.
- This new module only depends on very 'core' modules such as
`Hasura.Session` for the notion of roles and `Hasura.Incremental` for `Cacheable` typeclass.
- The requirement on database modules is avoided by generalizing the remote schemas metadata to accept an arbitrary 'r' for a remote relationship
definition.
### 2. SchemaCache related types and build logic have been moved
Types that represent remote schemas information in SchemaCache are moved into `Hasura.RemoteSchema.SchemaCache.Types`.
Similar to `H.RS.Metadata.Types`, this module depends on 'core' modules except for `Hasura.GraphQL.Parser.Variable`. It has something to do with remote relationships but I haven't spent time looking into it. The validation of 'remote relationships to remote schema' is also something that needs to be looked at.
Rips out the logic that builds remote schema's SchemaCache information from the monolithic `buildSchemaCacheRule` and moves it into `Hasura.RemoteSchema.SchemaCache.Build`. Further, the `.SchemaCache.Permission` and `.SchemaCache.RemoteRelationship` have been created from existing modules that capture schema cache building logic for those two components.
This was a fair amount of work. On main, currently remote schema's SchemaCache information is built in two phases - in the first phase, 'permissions' and 'remote relationships' are ignored and in the second phase they are filled in.
While remote relationships can only be resolved after partially resolving sources and other remote schemas, the same isn't true for permissions. Further, most of the work that is done to resolve remote relationships can be moved to the first phase so that the second phase can be a very simple traversal.
This is the approach that was taken - resolve permissions and as much as remote relationships information in the first phase.
### 3. Metadata APIs related types and build logic have been moved
The types that represent remote schema related metadata APIs and the execution logic have been moved to `Hasura.RemoteSchema.MetadataAPI.Types` and `.Execute` modules respectively.
## Open questions:
1. `Hasura.RemoteSchema.Metadata.Types` is so called because I was hoping that all of the metadata related APIs of remote schema can be brought in at `Hasura.RemoteSchema.Metadata.API`. However, as metadata APIs depended on functions from `SchemaCache` module (see [1](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L55)) and [2](ceba6d6226/server/src-lib/Hasura/RQL/DDL/RemoteSchema.hs (L91)), it made more sense to create a separate top-level module for `MetadataAPI`s.
Maybe we can just have `Hasura.RemoteSchema.Metadata` and get rid of the extra nesting or have `Hasura.RemoteSchema.Metadata.{Core,Permission,RemoteRelationship}` if we want to break them down further.
1. `buildRemoteSchemas` in `H.RS.SchemaCache.Build` has the following type:
```haskell
buildRemoteSchemas ::
( ArrowChoice arr,
Inc.ArrowDistribute arr,
ArrowWriter (Seq CollectedInfo) arr,
Inc.ArrowCache m arr,
MonadIO m,
HasHttpManagerM m,
Inc.Cacheable remoteRelationshipDefinition,
ToJSON remoteRelationshipDefinition,
MonadError QErr m
) =>
Env.Environment ->
( (Inc.Dependency (HashMap RemoteSchemaName Inc.InvalidationKey), OrderedRoles),
[RemoteSchemaMetadataG remoteRelationshipDefinition]
)
`arr` HashMap RemoteSchemaName (PartiallyResolvedRemoteSchemaCtxG remoteRelationshipDefinition, MetadataObject)
```
Note the dependence on `CollectedInfo` which is defined as
```haskell
data CollectedInfo
= CIInconsistency InconsistentMetadata
| CIDependency
MetadataObject
-- ^ for error reporting on missing dependencies
SchemaObjId
SchemaDependency
deriving (Eq)
```
this pretty much means that remote schemas is dependent on types from databases, actions, ....
How do we fix this? Maybe introduce a typeclass such as `ArrowCollectRemoteSchemaDependencies` which is defined in `Hasura.RemoteSchema` and then implemented in graphql-engine?
1. The dependency on `buildSchemaCacheFor` in `.MetadataAPI.Execute` which has the following signature:
```haskell
buildSchemaCacheFor ::
(QErrM m, CacheRWM m, MetadataM m) =>
MetadataObjId ->
MetadataModifier ->
```
This can be easily resolved if we restrict what the metadata APIs are allowed to do. Currently, they operate in an unfettered access to modify SchemaCache (the `CacheRWM` constraint):
```haskell
runAddRemoteSchema ::
( QErrM m,
CacheRWM m,
MonadIO m,
HasHttpManagerM m,
MetadataM m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m EncJSON
```
This should instead be changed to restrict remote schema APIs to only modify remote schema metadata (but has access to the remote schemas part of the schema cache), this dependency is completely removed.
```haskell
runAddRemoteSchema ::
( QErrM m,
MonadIO m,
HasHttpManagerM m,
MonadReader RemoteSchemasSchemaCache m,
MonadState RemoteSchemaMetadata m,
Tracing.MonadTrace m
) =>
Env.Environment ->
AddRemoteSchemaQuery ->
m RemoteSchemeMetadataObjId
```
The idea is that the core graphql-engine would call these functions and then call
`buildSchemaCacheFor`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6291
GitOrigin-RevId: 51357148c6404afe70219afa71bd1d59bdf4ffc6
We use a helper service to start a webhook-based authentication service for some tests. This moves the initialization of the service out of _test-server.sh_ and into the Python test harness, as a fixture.
In order to do this, I had to make a few changes. The main deviation is that we no longer run _all_ tests against an HGE with this authentication service, just a few (those in _test_webhook.py_). Because this reduced coverage, I have added some more tests there, which actually cover some areas not exacerbated elsewhere (mainly trying to use webhook credentials to talk to an admin-only endpoint).
The webhook service can run both with and without TLS, and decide whether it's necessary to skip one of these based on the arguments passed and how HGE is started, according to the following logic:
* If a TLS CA certificate is passed in, it will run with TLS, otherwise it will skip it.
* If HGE was started externally and a TLS certificate is provided, it will skip running without TLS, as it will assume that HGE was configured to talk to a webhook over HTTPS.
* Some tests should only be run with TLS; this is marked with a `tls_webhook_server` marker.
* Some tests should only be run _without_ TLS; this is marked with a `no_tls_webhook_server` marker.
The actual parameterization of the webhook service configuration is done through test subclasses, because normal pytest parameterization doesn't work with the `hge_fixture_env` hack that we use. Because `hge_fixture_env` is not a sanctioned way of conveying data between fixtures (and, unfortunately, there isn't a sanctioned way of doing this when the fixtures in question may not know about each other directly), parameterizing the `webhook_server` fixture doesn't actually parameterize `hge_server` properly. Subclassing forces this to work correctly.
The certificate generation is moved to a Python fixture, so that we don't have to revoke the CA certificate for _test_webhook_insecure.py_; we can just generate a bogus certificate instead. The CA certificate is still generated in the _test-server.sh_ script, as it needs to be installed into the OS certificate store.
Interestingly, the CA certificate installation wasn't actually working, because the certificates were written to the wrong location. This didn't cause any failures, as we weren't actually testing this behavior. This is now fixed with the other changes.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6363
GitOrigin-RevId: 0f277d374daa64f657257ed2a4c2057c74b911db
## Description
This PR fixes hasura/graphql-engine#8345: when creating the final representation of a remote relationship to a remote schema (a `RemoteJoin`), we would mistakenly label ALL join fields in the selection set as being relevant to that one relationship: if there are more than one remote relationship to process in that selection set, that would be the union of all their join fields. The problem with this error is that, when processing remote relationships, we correctly ignore all the ones for which at least one join key is null. Consequently, this error would result in us ignoring remote relationships for which an _unrelated_ join key was null, resulting in that data missing in the final JSON result.
This PR simply ensures that the aggregation of fields that are passed to `createRemoteJoin` is pruned to only contain the fields relevant to the join being created. This is a very small change, and the bulk of this PR is the regression tests.
## Changelog
__Component__ : server
__Type__: bugfix
__Product__: community-edition
### Short Changelog
fix remote relationship to remote schema sometimes being erroneously null when multiple relationships are defined on the same table / graphql object ([#8345](https://github.com/hasura/graphql-engine/issues/8345))
### Long Changelog
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6420
GitOrigin-RevId: eb54462724b007f80b674dcf234adf6d9cfaaf79
Context: https://hasurahq.atlassian.net/browse/SRE-10
Also remove an overlapping instance.
-----
The new flags if this needs to be tweaked on production by SRE are:
- --idleGCIdleInterval : "When the system has been idle for idleGCIdleInterval we may opportunistically try a major GC to run finalizers"
- --idleGCMinGCInterval : "We never run an opportunistic GC unless it has been at least idleGCMinGCInterval seconds since the last major GC"
- --idleGCMaxNoGCInterval : "If it has been longer than idleGCMaxNoGCInterval since the last major GC, force a GC to run finalizers"
Be aware: we may see memory usage grow to higher peaks than before, especially when under load
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6449
GitOrigin-RevId: 662d2f968f0d73b3b6eebb857c49aaede3312705
* The versions for some tools were out of date; updated accordingly.
* The link to the Dockerfile was broken.
* Included instructions for Nix.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6403
GitOrigin-RevId: 3acbafe90e4bb9267dcdb2dce5e205773a14dfc9
This helps with running the tests locally on a recent Mac.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6304
Co-authored-by: Daniel Harvey <4729125+danieljharvey@users.noreply.github.com>
GitOrigin-RevId: eb5c6a185f68b216c94df2581acf71906cce7872
This makes it possible for the test harness to start the test JWK server and the test remote schema server.
In order to do this, we still generate the TLS certificates in the test script (because we need to install the generated CA certificate in the OS certificate store), and then pass the certificate and key paths into the test runner.
Because we are still using _test-server.sh_ for now, we don't use the JWK server fixture in that case, as HGE needs the JWK server to be up and running when it starts. Instead, we keep running it outside (for now).
This is also the case for the GraphQL server fixture when we are running the server upgrade/downgrade tests.
I have also refactored _graphql_server.py_ so there isn't a global `HGE_URLS` value, but instead the value is passed through.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6303
GitOrigin-RevId: 06f05ff674372dc5d632e55d68e661f5c7a17c10
This upgrades CI and anyone using Nix to HLint v3.4.1.
If you're not using Nix, this doesn't actually _do_ anything on your
local machine; it's just a suggestion.
It also applies a bunch of simple HLint refactors, using
`make lint-hs-fix`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6324
GitOrigin-RevId: de8267e4909d6dcd3f83543188517f3aaeebc5f3
I didn't track why these were left behind. Presumably GHC 9.2 has an improved redundant constraint checker, so that explains a few. Otherwise, perhaps code got refactored along the way.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6256
GitOrigin-RevId: b6275edf3e867f8e33bdec533ce9932381d36bbb
- Remove a few unnecessary helper functions
- Delete kind annotations
- Bring GHC warnings and language extensions more in line with those of the `graphql-engine` library
- Constrain unconstrained dependency on `hasql-pool`
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6251
GitOrigin-RevId: 10c2530f007f70cf1464cec36566ee2264589881
This updates _docker-compose.yml_ to use the new image tags, and updates _run.sh_ accordingly.
While I was at it, I also added a `docker compose pull` instruction to make sure that we don't have surprises half-way through the script, and a few `echo` lines for clarity.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6235
GitOrigin-RevId: 3855f6898bd3e906c5f423d9d0d6a7031de3777a
We seem to be rebuilding hpack on every PR. I'm hoping this will allow PRs to share a cache.
I have also changed the cache key to include the entirety of _server/VERSIONS.json_, and added the GHC version there, to make sure it's properly invalidated.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6142
GitOrigin-RevId: fc61a26ad721f59f52687913f6978902f4c2ca0a
- Remove `onJust` in favor of the more general `for_`
- Remove `withJust` which was used only once
- Remove `hashNub` in favor of `Ord`-based `uniques`
- Simplify some of the implementations in `Hasura.Prelude`
- Add `hlint` hint from `maybe True` to `all`, and `maybe False` to `any`
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6173
GitOrigin-RevId: 2c6ebbe2d04f60071d2a53a2d43c6d62dbc4b84e
This PR is the result of running the following commands:
```bash
$ git grep -l '".* : "' -- '*.hs' | xargs sed -i -E 's/(".*) : "/\1: "/'
$ scripts/dev.sh test --integration --accept
```
Also manually fixed a few tests and docs
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6148
GitOrigin-RevId: cf8b87605d41d9ce86613a41ac5fd18691f5a641
When we run the HGE server inside the test harness, it needs to run with
an admin secret for some tests to make sense. This tags each test that
requires an admin secret with `pytest.mark.admin_secret`, which then
generates a UUID and injects that into both the server and the test case
(if required).
It also simplifies the way the test harness picks up an existing admin
secret, allowing it to use the environment variable instead of requiring
it via a parameter.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6120
GitOrigin-RevId: 55c5b9e8c99bdad9c8304098444ddb9516749a2c
This teaches `hge_server` how to run more tests, thanks to `hge_env`.
It also simplifies the logic a bit more.
I have also modified _run.sh_ and _docker-compose.yml_ so we can run multiple test suites, one after another.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6105
GitOrigin-RevId: eff009362eb6bb90c07cedaf96dfe6ec9336ff32
If we don't do this, we might end up applying metadata with a stale schema cache.
Following the principle of least surprise, replacing the metadata should probably compute inconsistencies with regards to the actual state of the database.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6026
GitOrigin-RevId: ff7469d7d9857c8a9f517d5d0b6f1ecf463621b3
This has two purposes:
* When running the Python integration tests against a running HGE instance, with `--hge-url`, it will check the environment variables available and actively skip the test if they aren't set. This replaces the previous ad-hoc skip behavior.
* More interestingly, when running against a binary with `--hge-bin`, the environment variables are passed through, which means different tests can run with different environment variables.
On top of this, the various services we use for testing now also provide their own environment variables, rather than expecting a test script to do it.
In order to make this work, I also had to invert the dependency between various services and `hge_ctx`. I extracted a `pg_version` fixture to provide the PostgreSQL version, and now pass the `hge_url` and `hge_key` explicitly to `ActionsWebhookServer`.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/6028
GitOrigin-RevId: 16d866741dba5887da1adf4e1ade8182ccc9d344
NPM v7 uses a new (backwards-compatible) lockfile format. This upgrades all our various _package-lock.json_ files to use the new format.
It's much more verbose so that NPM can be a lot faster.
I figured it was cleaner to do this once in a separate PR rather than upgrading them in combination with adding or upgrading a new dependency.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5869
GitOrigin-RevId: 322fb63b96e2d873a4a3cc05fa6c7afa414716ce
This adds support for running the Python integration tests for MSSQL and Citus just as in CI, as follows:
```
./server/tests-py/run.sh backend-mssql
./server/tests-py/run.sh backend-citus
```
These run the named CI jobs, providing the appropriate backend.
(In reality, all backends are always provided, which is much simpler.)
It also provides the various databases to _server/tests-py/run-new.sh_, though the tests fail as they don't properly initialize the sources. (This will be fixed in the future by provisioning sources in the test framework itself.)
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5997
GitOrigin-RevId: c276a4779a35bb538ef0dc02ac8b7cb2d5a8dec5
This makes a few changes to the test scripts and makefiles in order to make things simpler for the average Apple user.
First of all, we change the `wait_for_mysql` function to use "localhost", not "127.0.0.1", as this fixed an issue on my system when attempting to connect to the MySQL server.
Secondly, we split the SQL Server test image into two:
* The first is the server itself, which now automatically uses `azure-sql-edge` as the image if you are on an aarch64 chip and using the `make` commands.
* The second is the initialization script. Because `sqlcmd` is not available in the `azure-sql-edge` image on aarch64, we use a separate container based on `mssql-tools` to initialize the server.
The README has been updated.
Tested on both macOS/aarch64 (with other changes) and Linux/x86_64.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5986
GitOrigin-RevId: b16e079861dcbcc66773295c47d715e443b67eea
See: https://github.com/grafana/k6/issues/2685
It might be interesting to think about taking into consideration decompression time when thinking about performance, but In general I think doing so is surprising and I wasted a lot of time trying to figure out why my optimizations to the compression codepath weren't improving things to the degree I expected
The downside here is we lose error reporting, so you'll need to only set
discardResponseBodies: true after the query has been tested.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5940
GitOrigin-RevId: 82a589a59b93f10ffb5391e4a3190459fb6e613b
Result of executing the following commands:
```shell
# replace "as Q" imports with "as PG" (in retrospect this didn't need a regex)
git grep -lE 'as Q($|[^a-zA-Z])' -- '*.hs' | xargs sed -i -E 's/as Q($|[^a-zA-Z])/as PG\1/'
# replace " Q." with " PG."
git grep -lE ' Q\.' -- '*.hs' | xargs sed -i 's/ Q\./ PG./g'
# replace "(Q." with "(PG."
git grep -lE '\(Q\.' -- '*.hs' | xargs sed -i 's/(Q\./(PG./g'
# ditto, but for [, |, { and !
git grep -lE '\[Q\.' -- '*.hs' | xargs sed -i 's/\[Q\./\[PG./g'
git grep -l '|Q\.' -- '*.hs' | xargs sed -i 's/|Q\./|PG./g'
git grep -l '{Q\.' -- '*.hs' | xargs sed -i 's/{Q\./{PG./g'
git grep -l '!Q\.' -- '*.hs' | xargs sed -i 's/!Q\./!PG./g'
```
(Doing the `grep -l` before the `sed`, instead of `sed` on the entire codebase, reduces the number of `mtime` updates, and so reduces how many times a file gets recompiled while checking intermediate results.)
Finally, I manually removed a broken and unused `Arbitrary` instance in `Hasura.RQL.Network`. (It used an `import Test.QuickCheck.Arbitrary as Q` statement, which was erroneously caught by the first find-replace command.)
After this PR, `Q` is no longer used as an import qualifier. That was not the goal of this PR, but perhaps it's a useful fact for future efforts.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5933
GitOrigin-RevId: 8c84c59d57789111d40f5d3322c5a885dcfbf40e
This fixes a few issues so that we can run `./server/tests-py/run.sh backend-bigquery` to run the Python integration tests for BigQuery locally.
* We forward the relevant environment variables to the Docker container.
* We increase the HTTP timeout, as I'm seeing requests taking up to 90s locally.
* We rewrite the setup so that it avoids `INSERT INTO`, which is not available using the BigQuery free tier. Instead, we use `CREATE TABLE ... AS SELECT ...`. This is the same method used by the Haskell integration tests.
We also capture local server output in a volume so it's easier to figure out what went wrong later.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5921
GitOrigin-RevId: c628f8c08a84f2582958659ab6d6494832471f6f
I am working on https://github.com/hasura/graphql-engine/issues/8807, and wanted to write a Haskell integration test case to reproduce it.
We have Python integration tests somewhat covering this behavior in *test_inconsistent_meta.py*, but no Haskell tests, so I thought I'd shore up the coverage here by adding a few test cases for working behavior.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5897
GitOrigin-RevId: 21500e530e413feaede5cbd8b4a94b07d25a6260
This makes two changes to the Docker Compose files that we use for local testing:
1. We disable `fsync`. On my machine, this decreases the time taken to create a new database from ~5s to less than 0.1s. The trade-off is that you might lose data, which we don't care about, as this is for testing.
2. We increase the maximum number of connections from the default, 100, to 1000. This allows us to run more tests in parallel without hitting connection limits.
These changes won't have any meaningful effect for now; they simply allow us to parallelize tests against PostgreSQL in the future.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5892
GitOrigin-RevId: 5d0d0ab37fdfbf4c9e20084d3cbedf647f54a04e
This argument allows the user to specify how to run HGE, rather than starting it beforehand. The runner will start a new instance of HGE for each test class.
This does not provide isolation, as the database is still re-used, but it helps us get closer.
You can try it yourself by executing:
```
$ cabal build graphql-engine:exe:graphql-engine
$ ./server/tests-py/run-new.sh
```
This doesn't affect CI at all.
I also fixed a few warnings flagged by Pylance.
PR-URL: https://github.com/hasura/graphql-engine-mono/pull/5881
GitOrigin-RevId: ea6f0fd631a2c278b2c6b50e9dbdd9d804ebc9d4