Commit Graph

5 Commits

Author SHA1 Message Date
Abby Sassel
76c322589c server/bigquery: Document BigQuery integration tests
This PR documents & streamlines how we run BigQuery integration tests locally as a first step to [running them in CI](https://github.com/hasura/graphql-engine-mono/issues/1525).

I've also created a hasura service account for internal use. [Internal docs here](https://github.com/hasura/graphql-engine-mono/wiki/Testing-BigQuery).

Thanks to FP Complete team for [the guidance here](https://docs.google.com/document/d/1dGDK0touUtsDxRQPonMxSoPbIfzBoSYo02tAjQEO7qA/edit?ts=60c0cf24#), which I've reused parts of.

https://github.com/hasura/graphql-engine-mono/pull/1732

GitOrigin-RevId: 303819d212aa073fbef685d077b1cfa583cd15fc
2021-07-06 11:13:06 +00:00
Chris Done
614c0dab80 Bigquery/fix limit offset for array aggregates
Blocked on https://github.com/hasura/graphql-engine-mono/pull/1640.

While fiddling with BigQuery I noticed a severe issue with offset/limit for array-aggregates. I've fixed it now.

The basic problem was that I was using a query like this:

```graphql
query MyQuery {
  hasura_Artist(order_by: {artist_self_id: asc}) {
    artist_self_id
    albums_aggregate(order_by: {album_self_id: asc}, limit: 2) {
      nodes {
        album_self_id
      }
      aggregate {
        count
      }
    }
  }
}
```

Producing this SQL:

```sql
SELECT `t_Artist1`.`artist_self_id` AS `artist_self_id`,
       STRUCT(IFNULL(`aa_albums1`.`nodes`, NULL) AS `nodes`, IFNULL(`aa_albums1`.`aggregate`, STRUCT(0 AS `count`)) AS `aggregate`) AS `albums_aggregate`
FROM `hasura`.`Artist` AS `t_Artist1`
LEFT OUTER JOIN (SELECT ARRAY_AGG(STRUCT(`t_Album1`.`album_self_id` AS `album_self_id`) ORDER BY (`t_Album1`.`album_self_id`) ASC) AS `nodes`,
                        STRUCT(COUNT(*) AS `count`) AS `aggregate`,
                        `t_Album1`.`artist_other_id` AS `artist_other_id`
                 FROM (SELECT *
                       FROM `hasura`.`Album` AS `t_Album1`
                       ORDER BY (`t_Album1`.`album_self_id`) ASC NULLS FIRST
                       -- PROBLEM HERE
                       LIMIT @param0) AS `t_Album1`
                 GROUP BY `t_Album1`.`artist_other_id`)
AS `aa_albums1`
ON (`aa_albums1`.`artist_other_id` = `t_Artist1`.`artist_self_id`)
ORDER BY (`t_Artist1`.`artist_self_id`) ASC NULLS FIRST
```

Note the `LIMIT @param0` -- that is incorrect because we want to limit
per artist. Instead, we want:

```sql
SELECT `t_Artist1`.`artist_self_id` AS `artist_self_id`,
       STRUCT(IFNULL(`aa_albums1`.`nodes`, NULL) AS `nodes`, IFNULL(`aa_albums1`.`aggregate`, STRUCT(0 AS `count`)) AS `aggregate`) AS `albums_aggregate`
FROM `hasura`.`Artist` AS `t_Artist1`
LEFT OUTER JOIN (SELECT ARRAY_AGG(STRUCT(`t_Album1`.`album_self_id` AS `album_self_id`) ORDER BY (`t_Album1`.`album_self_id`) ASC) AS `nodes`,
                        STRUCT(COUNT(*) AS `count`) AS `aggregate`,
                        `t_Album1`.`artist_other_id` AS `artist_other_id`
                 FROM (SELECT *,
                            -- ADDED
                            ROW_NUMBER() OVER(PARTITION BY artist_other_id) artist_album_index
                       FROM `hasura`.`Album` AS `t_Album1`
                       ORDER BY (`t_Album1`.`album_self_id`) ASC NULLS FIRST
                       ) AS `t_Album1`
                 -- CHANGED
                 WHERE artist_album_index <= @param
                 GROUP BY `t_Album1`.`artist_other_id`)
AS `aa_albums1`
ON (`aa_albums1`.`artist_other_id` = `t_Artist1`.`artist_self_id`)
ORDER BY (`t_Artist1`.`artist_self_id`) ASC NULLS FIRST
```

That serves both the LIMIT/OFFSET function in the where clause. Then,
both the ARRAY_AGG and the COUNT are correct per artist.

I've updated my Haskell test suite to add regression tests for this. I'll push a commit for Python tests shortly. The tests still pass there.

This just fixes a case that we hadn't noticed.

https://github.com/hasura/graphql-engine-mono/pull/1641

GitOrigin-RevId: 49933fa5e09a9306c89565743ecccf2cb54eaa80
2021-07-06 08:29:39 +00:00
Chris Done
6dc555f9eb Bigquery/global limit
This resolves https://github.com/hasura/graphql-engine/issues/6947.

A new [`global_select_limit`](b0ab5deefe/server/tests-py/queries/graphql_query/bigquery/replace_metadata.yaml (L17)) field is supported in the BigQuery configuration.

To test global limits, we have two sources defined,  the normal one (limited to 1million) and one with a limit of 1.

https://github.com/hasura/graphql-engine-mono/pull/1592

GitOrigin-RevId: 6ebcc7c1a16bc26ec36e53ae3694d36b7ce5c6e1
2021-06-25 13:36:35 +00:00
Chris Done
67a9045328 Bigquery/cleanups
A pull request for cleaning up small issues, bugs, redundancies and missing things in the BigQuery backend.

Summary:

1. Remove duplicate projection fields - BigQuery rejects these.
2. Add order_by to the test suite cases, as it was returning inconsistent results.
3. Add lots of in FromIr about how the dataloader approach is given support.
4. Produce the correct output structure for aggregates:
   a. Should be a singleton object for a top-level aggregate query.
   b. Should have appropriate aggregate{} and nodes{} labels.
   c. **Support for nodes** (via array_agg).
5. Smooth over support of array aggregates by removing the fields used for joining with an explicit projection of each wanted field.

https://github.com/hasura/graphql-engine-mono/pull/1317

Co-authored-by: Vamshi Surabhi <6562944+0x777@users.noreply.github.com>
GitOrigin-RevId: cd3899f4667770a27055f94988ef2a6d5808f1f5
2021-06-15 08:59:11 +00:00
Chris Done
53fc5617cf Feature/bigquery python tests
Co-authored-by: Aniket Deshpande <922486+aniketd@users.noreply.github.com>
GitOrigin-RevId: c425f84f1c9b8e38ebbfa509b6fa9298e023f386
2021-04-22 11:32:55 +00:00