urQL/docs/10-query.md

171 lines
6.1 KiB
Markdown
Raw Permalink Normal View History

2024-01-07 21:03:24 +03:00
# QUERY
2024-01-22 00:50:35 +03:00
*supported in urQL parser, partially supported and under development in Obelisk*
2024-01-07 21:03:24 +03:00
The `<query>` statement provides a means to create `<table-set>`s derived from persisted and/or cached `<table-set>`s and/or constants. Data rows can be joined based on predicates, specific columns can be selected, and the resulting rows can be filtered by predicate.
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<query> ::=
2024-08-22 23:00:32 +03:00
[ FROM <table-set> [ [AS] <alias> ] [ <as-of-time> ]
2024-08-27 23:59:09 +03:00
{ JOIN <table-set> [ [AS] <alias> ] [ <as-of-time> ] }
| {
{ JOIN | LEFT JOIN | RIGHT JOIN | OUTER JOIN }
<table-set> [ [AS] <alias> ] [ <as-of-time> ]
ON <predicate>
}
[ ...n ]
2024-08-22 23:00:32 +03:00
| CROSS JOIN <table-set> [ [AS] <alias> ] [ <as-of-time> ]
2024-01-07 21:03:24 +03:00
]
[ WHERE <predicate> ]
[ GROUP BY { <qualified-column>
| <column-alias>
| <column-ordinal> } [ ,...n ]
[ HAVING <predicate> ]
]
SELECT [ TOP <n> ] [ BOTTOM <n> ]
{ * | { [<ship-qualifier>]<table-view> | <alias> }.*
| <expression> [ [ AS ] <column-alias> ]
} [ ,...n ]
[ ORDER BY
{
{ <qualified-column> | <column-alias> | <column-ordinal> } { ASC | DESC }
} [ ,...n ]
]
2024-04-28 19:22:56 +03:00
```
2024-08-27 23:59:09 +03:00
`JOIN` is an inner join returning all matching pairs of rows. When specified without `ON <predicate>` it specifies a natural join, indicating the join is performed on all columns that match both the column name and the aura type.
*natural join is the only join currently supported in Obelisk*
2024-01-07 21:03:24 +03:00
`LEFT JOIN` is a left outer join returning all rows from the left table not meeting the join condition, along with all matching pairs of rows.
`RIGHT JOIN` is a right outer join returning all rows from the right table not meeting the join condition, along with all matching pairs of rows.
`OUTER JOIN` is a full outer join returning matching pairs of rows, as well as all rows from both tables not meeting the join condition.
`CROSS JOIN` is a cartesian join of two tables.
Cross database joins are permitted, but not cross ship joins.
2024-08-22 23:00:32 +03:00
**`<as-of-time>`**
Timestamp for selection of table data. Defaults to `NOW` (current time). When specified, the timestamp must be greater than both the latest database schema and content timestamps.
2024-01-07 21:03:24 +03:00
`HAVING <predicate>` filters aggregated rows returned from the `<query>`. The column references in the predicate must be either one of the grouping columns or be contained in an aggregate function.
2024-08-27 23:59:09 +03:00
*supported in urQL parser, not yet supported in Obelisk*
2024-01-22 00:50:35 +03:00
Avoid using `ORDER BY` in CTEs or in any query prior to the last step in a `<transform>`, unless required by `TOP` or `BOTTOM` specified in the `SELECT` statement.
2024-01-07 21:03:24 +03:00
2024-08-27 23:59:09 +03:00
*supported in urQL parser, not yet supported in Obelisk*
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<predicate> ::=
{ [ NOT ] <predicate> | [ ( ] <simple-predicate> [ ) ] }
[ { { AND | OR } [ NOT ] { <predicate> | [ ( ] <simple-predicate> [ ) ] }
[ ...n ]
]
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<simple-predicate> ::=
{ expression <binary-operator> expression
| expression [ NOT ] EQUIV expression
| expression [ NOT ] IN
{ <scalar-query> | ( <value> ,...n ) }
| expression <inequality-operator>
{ ALL | ANY} { ( <scalar-query> ) | ( <value> ,...n ) }
| expression [ NOT ] BETWEEN expression [ AND ] expression
| [ NOT ] EXISTS { <column value> | <scalar-query> } }
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
When applied to a column `EXISTS` tests whether the returned `<row-type>` includes the required column. In the case of `<scalar-query>`, it tests whether a CTE returns any rows.
`[ NOT ] EQUIV` is a binary operator, similar to (not) equals `<>`, `=`. However, comparing two `NOT EXISTS` yields true.
2024-01-22 00:50:35 +03:00
`<scalar-query>` is a CTE that selects for one column. Depending on whether the operator expects a set or a value, it operates on the entire result set or on the first row returned, respectively.
2024-01-07 21:03:24 +03:00
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<binary-operator> ::=
{ = | <> | != | > | >= | !> | < | <= | !< | EQUIV | NOT EQUIV}
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
Whitespace is not required between operands and binary-operators, except when the left operand is a numeric literal, in which case whitespace is required.
`<inequality-operator>` is any `<binary-operator>` other than equality and `EQUIV`.
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<expression> ::=
{ <qualified-column>
| <constant>
| <scalar>
| <scalar-query>
| <aggregate-function>( { <column> | <scalar> } )
}
2024-04-28 19:22:56 +03:00
```
2024-01-22 00:50:35 +03:00
*<aggregate-function> is not yet implemente in the urQL parser*
2024-01-07 21:03:24 +03:00
`<scalar-query>` is a CTE that returns only one column. The first returned value is accepted and subsequent values ignored. Ordering the CTE may be required for predictable results.
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<column> ::=
{ <qualified-column>
| <column-alias>
| <constant> }
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
<qualified-column> ::=
[ [ <ship-qualifier> ]<table-view> | <alias> ].<column-name>
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
2024-01-22 00:50:35 +03:00
### API
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
+$ query
$:
%query
from=(unit from)
scalars=(list scalar-function)
predicate=(unit predicate)
group-by=(list grouping-column)
having=(unit predicate)
selection=select
order-by=(list ordering-column)
==
2024-04-28 19:22:56 +03:00
```
2024-01-07 21:03:24 +03:00
2024-01-22 00:50:35 +03:00
### Arguments
2024-01-07 21:03:24 +03:00
**`<table-set> [ [AS] <alias> ]`**
Any valid `<table-set>`.
`<alias>` allows short-hand reference to the `<table-set>` in the `SELECT` clause and subsequent `<predicates>`.
**`{ <qualified-column> | <column-alias> | <column-ordinal> }`**
Used to select columns for ordering and grouping. `<column-ordinal>`s are 1-based.
**`[ TOP <n> ] [ BOTTOM <n> ]`**
2024-01-22 00:50:35 +03:00
Selects only the first and/or last `n` rows returned by the rest of the query. If the result set is less than `n`, the entire set of rows is returned.
2024-01-07 21:03:24 +03:00
`TOP` and `BOTTOM` require the presence of an `ORDER BY` clause.
2024-01-22 00:50:35 +03:00
### Remarks
2024-01-07 21:03:24 +03:00
2024-01-22 00:50:35 +03:00
The `SELECT` clause may choose columns from a single CTE, in which case the `FROM` clause is absent. It may also choose only literal constants and `SCALAR` functions on literals, columns, or scalar queries, in which case it will return a result set of one row.
2024-01-07 21:03:24 +03:00
The simplest possible query is `SELECT 0`.
`<query>` alone does not change the Obelisk agent state.
2024-01-22 00:50:35 +03:00
*as-of-time currently missing from specification*
### Produced Metadata
Row count
### Exceptions
`TOP` or `BOTTOM` specified without `ORDER BY` clause.
2024-08-09 01:25:05 +03:00
table `<database>`.`<namespace>`.`<table>` does not exist at schema time `<time>`
column `<column>` not found
2024-01-07 21:03:24 +03:00
2024-01-22 00:50:35 +03:00
### Example
2024-01-07 21:03:24 +03:00
2024-01-22 00:50:35 +03:00
*missing*