jackfoxy 2023-12-17 10:27:09 -08:00
parent 306f327f83
commit 99be5d91e9
5 changed files with 88 additions and 104 deletions

@ -170,19 +170,85 @@ Column types (auras) not supported for INSERT can only be inserted into tables t
<sup>1</sup> Example of embedding single quote in @t literal.
## Types
All data representations (nouns) of the Obelisk system are strongly typed.
## Issues
(very incomplete list)
1. Stored procedures - To Be Designed (TBD)
2. Triggers - TBD
3. Localization of date/time - TBD (See: https://github.com/sigilante/l10n)
4. `SELECT` single column named `top` or `bottom` may cause problems
5. Add `DISTINCT` and other advanced aggregate features like Grouping Sets, Rollup, Cube, GROUPING function. Feature T301 'Functional dependencies' from SQL 1999 specification needs to be added.
6. Change column:ast and value-literal:ast to vase in parser and AST.
7. Set operators, multiple commands per `<transform>` not complete in the parser.
8. Scalar and aggregate functions incompletely implemented in parser and not fully designed.
9. Add aura @uc Bitcoin address 0c1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
10. Custom types and support for arbitrary noun columns - TBD
11. Pivoting and windowing will be implemented in a future release.
12. `<view>` not implemented in parser and caching is TBD
13. Option in `<merge>` to replicate `<target-table>`'s `<foreign-key>`s and/or unique indices when new `<table>` created.
## Column Types
The fundamental data element in Obelisk is an atom that is typed by an aura. Data cells, which are intersections of a `<table-set>` row and column, are typed atoms.
Obelisk supports the following auras (see ch12-literals for representing the atomic types):
| Aura | Description |
| :--- |:---------------------------- |
| @c | UTF-32 |
| @da | date |
| @dr | timespan |
| @f | loobean |
| @if | IPv4 address |
| @is | IPv6 address |
| @p | ship name |
| @q | phonemic base |
| @rh | half float (16b) |
| @rs | single float (32b) |
| @rd | double float (64b) |
| @rq | quad float (128b) |
| @sb | signed (low bit) binary |
| @sd | signed (low bit) decimal |
| @sv | signed (low bit) base32 |
| @sw | signed (low bit) base64 |
| @sx | signed (low bit) hexadecimal |
| @t | UTF-8 text (cord) |
| @ta | ASCII text (knot, url safe) |
| @tas | ASCII text (term) |
| @ub | unsigned binary |
| @ud | unsigned decimal |
| @uv | unsigned base32 |
| @uw | unsigned base64 |
| @ux | unsigned hexadecimal |
Columns are typed by an aura and indexed by name.
```
<column-type> ::=
<aura/name>
```
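As an illustration only, a column type can be modeled as an aura paired with a name. The Python sketch below is not Obelisk code: `ColumnType`, `SUPPORTED_AURAS`, and `make_column` are hypothetical names, and the aura list is simply copied from the table above.

```python
from dataclasses import dataclass

# Auras copied from the table above; nothing here is read from Obelisk itself.
SUPPORTED_AURAS = frozenset({
    "@c", "@da", "@dr", "@f", "@if", "@is", "@p", "@q",
    "@rh", "@rs", "@rd", "@rq",
    "@sb", "@sd", "@sv", "@sw", "@sx",
    "@t", "@ta", "@tas",
    "@ub", "@ud", "@uv", "@uw", "@ux",
})


@dataclass(frozen=True)
class ColumnType:
    """Hypothetical model of <column-type>: an aura plus the column name."""
    aura: str
    name: str


def make_column(aura: str, name: str) -> ColumnType:
    """Build a column type, rejecting auras that are not in the table above."""
    if aura not in SUPPORTED_AURAS:
        raise ValueError(f"unsupported aura: {aura}")
    return ColumnType(aura, name)
```

For example, `make_column("@ud", "id")` succeeds, while `make_column("@uc", "address")` raises `ValueError` because `@uc` is not in the supported list.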
## Table Row and Table Types
All datasets in Obelisk are sets, meaning each typed element, `<row-type>`, only exists once.
Datasets are also commonly regarded as tables, which is accurate when the index of each cell (row/column intersection) can be calculated. This calculation is possible when the `SELECT` statement includes an `ORDER BY` clause.
All tables originate from, or are derived from, base tables created by the `CREATE TABLE` command.
A base-table (`<table>`) row has a default type, which is the table's atomic aura-typed columns in a fixed order.
```
<row-type> ::= list <aura>
```
Each base table is typed by its `<row-type>`.
```
<table-type> ::= (list <row-type>)
```
A base table's definition includes a unique primary row order, giving it `list` type rather than `set` type. This is not true for all `<table-set>` instances.
Rows from `<view>`s, `<common-table-expression>`s, and the command output from `<transform>`, or any other table that is not a base-table, can only have an immutable row order if it is explicitly specified (i.e., the `SELECT` statement includes an `ORDER BY` clause). In general, these other tables have types that are unions of `<row-type>`s.
When the `<table-set-type>` is a union of `<row-type>`s, there is a `<row-type>` representing the full width of the `SELECT` statement and as many `<row-type>` sub-types as necessary to represent any unjoined outer `JOIN`s that result in a selected row.
Sub-types align their columns with the all-column `<row-type>`, regardless of the `SELECT` statement's construction.
In general, `<table-set>`s have the type:
```
<table-set-type> ::=
(list <row-type>)
| (set (<all-column-row-type> | <row-sub-type-1> | ... | <row-sub-type-n> ))
```
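The union of row types can be pictured with a small Python sketch. It is one possible reading of the text above, not Obelisk's implementation: the function `left_outer_join`, the sample data, and the choice to represent a row as a tuple of `(column, value)` pairs are all assumptions made for illustration. Joined rows carry every selected column; unjoined rows form a narrower sub-type whose columns keep the same relative order, and the result is a set, so identical rows collapse into one.

```python
def left_outer_join(left, right, key, columns):
    """Return a set of rows; each row is a tuple of (column, value) pairs."""
    result = set()
    for left_row in left:
        matches = [r for r in right if r[key] == left_row[key]]
        if matches:
            for right_row in matches:
                merged = {**right_row, **left_row}
                # Full-width row: every selected column is present.
                result.add(tuple((c, merged[c]) for c in columns))
        else:
            # Unjoined row: a narrower sub-type, columns in the same relative order.
            result.add(tuple((c, left_row[c]) for c in columns if c in left_row))
    return result


# Hypothetical sample data; the duplicate author row collapses in the set.
authors = [
    {"author_id": 1, "name": "ada"},
    {"author_id": 2, "name": "bob"},
    {"author_id": 2, "name": "bob"},
]
books = [{"author_id": 1, "title": "notes"}]
columns = ("author_id", "name", "title")

rows = left_outer_join(authors, books, "author_id", columns)
```

Here `rows` holds one full-width row for author 1 and one two-column row for author 2, which has no matching book.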
## Additional Types
All the static types in Obelisk API are defined in `sur/ast/hoon`.
## Remarks
Even `<table>`s can be typed as sets, because a `SELECT` statement without an `ORDER BY` clause has an undefined row order.
Regardless of the presence of `ORDER BY`, any `<table-set>` emitted by any step in a `<transform>`, a CTE, or a `<view>` is a list of `<row-type>` in some (possibly arbitrary) order.
Ultimately, "set" is the most important concept because every `<table-set>` will have one unique row value for any given sub-type of `<row-type>`.

@ -30,7 +30,7 @@ _To Do NOTE_: Additional features like owner-desk property and GRANT desk permis
The user-defined name for the new database. It must comply with the Hoon term naming standard.
**`AS OF`**
Timestamp of database creation. Defaults to current time. Subsequent DDL and data actions must have timestamps equal to or greater than this timestamp.
Timestamp of database creation. Defaults to NOW (current time). Subsequent DDL and data actions must have timestamps greater than this timestamp.
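The ordering rule in the revised wording can be sketched in Python. This is only an illustration of "strictly greater than", not how the Obelisk agent checks timestamps; `resolve_as_of` and its parameters are hypothetical names, and the current time is passed in explicitly so the example stays deterministic.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_as_of(as_of: Optional[datetime],
                  database_created: datetime,
                  now: datetime) -> datetime:
    """Pick the effective timestamp of a later action and check its ordering."""
    # AS OF defaults to NOW when it is not specified.
    effective = now if as_of is None else as_of
    # Strictly greater: a timestamp equal to the creation timestamp is rejected.
    if effective <= database_created:
        raise ValueError("timestamp must be greater than the database creation timestamp")
    return effective


created = datetime(2023, 12, 17, tzinfo=timezone.utc)
later = datetime(2023, 12, 18, tzinfo=timezone.utc)
effective = resolve_as_of(None, created, now=later)
```

Under the earlier wording, a timestamp equal to the creation timestamp would have been accepted; in this sketch it raises `ValueError`.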
## Remarks

@ -32,7 +32,7 @@ This is a user-defined name for the new namespace. It must adhere to the hoon te
Note: The "sys" namespace is reserved for system use.
**`AS OF`**
Timestamp of namespace creation. Defaults to current time. When specified, the timestamp must be equal to or greater than the latest system timestamp for the database.
Timestamp of namespace creation. Defaults to NOW (current time). When specified, the timestamp must be greater than the latest system timestamp for the database.
## Remarks
This command mutates the state of the Obelisk agent.
@ -86,7 +86,7 @@ Indicates the type of the target object.
Name of the object to be transferred to the target namespace.
**`AS OF`**
Timestamp of namespace update. Defaults to current time. When specified, the timestamp must be equal to or greater than the latest system timestamp for the database.
Timestamp of namespace update. Defaults to NOW (current time). When specified, the timestamp must be greater than the latest system timestamp for the database.
## Remarks
This command mutates the state of the Obelisk agent.
@ -137,7 +137,7 @@ Optionally, force deletion of `<namespace>`.
The name of `<namespace>` to delete.
**`AS OF`**
Timestamp of namespace deletion. Defaults to current time. When specified, the timestamp must be equal to or greater than the latest system timestamp for the database.
Timestamp of namespace deletion. Defaults to NOW (current time). When specified, the timestamp must be greater than the latest system timestamp for the database.
## Remarks
This command mutates the state of the Obelisk agent.

@ -94,7 +94,7 @@ All the values that make up the foreign key in the referencing row(s) are set to
The Obelisk agent raises an error if the parent foreign table has no entry with bunt values.
**`AS OF`**
Timestamp of table creation. Defaults to current time. When specified, the timestamp must be equal to or greater than the system timestamp for the database creation.
Timestamp of table creation. Defaults to NOW (current time). When specified, the timestamp must be greater than the system timestamp for the database.
## Remarks
This command mutates the state of the Obelisk agent.
@ -229,7 +229,7 @@ All the values that make up the foreign key in the referencing row(s) are set to
The Obelisk agent raises an error if the parent foreign table has no entry with bunt values.
**`AS OF`**
Timestamp of table alteration. Defaults to current time. When specified, the timestamp must be greater than the latest database system timestamp structurally affecting the table.
Timestamp of table alteration. Defaults to NOW (current time). When specified, the timestamp must be greater than the latest database system timestamp and greater than the latest data timestamp for the table.
## Remarks
This command mutates the state of the Obelisk agent.
@ -285,7 +285,7 @@ Optionally, force deletion of a table.
Name of `<table>` to delete.
**`AS OF`**
Timestamp of table deletion. Defaults to current time. When specified, the timestamp must be greater than the latest database system timestamp.
Timestamp of table deletion. Defaults to NOW (current time). When specified, the timestamp must be greater than the latest database system timestamp and greater than the latest data timestamp for the table.
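For altering or dropping a table, the revised rule adds a second bound. The Python sketch below shows the two-bound comparison under stated assumptions: `table_as_of_is_valid` is a hypothetical helper, and how Obelisk actually tracks the system and data timestamps is not described here.

```python
from datetime import datetime, timezone


def table_as_of_is_valid(as_of: datetime,
                         latest_system: datetime,
                         latest_table_data: datetime) -> bool:
    """True when as_of is strictly later than both reference timestamps."""
    # Comparing against the later of the two bounds covers both conditions.
    return as_of > max(latest_system, latest_table_data)


latest_system = datetime(2023, 12, 17, 10, 0, tzinfo=timezone.utc)
latest_table_data = datetime(2023, 12, 17, 12, 0, tzinfo=timezone.utc)

# Later than the system timestamp but not later than the table's data timestamp.
too_early = datetime(2023, 12, 17, 11, 0, tzinfo=timezone.utc)
acceptable = datetime(2023, 12, 17, 13, 0, tzinfo=timezone.utc)
```

With these values, `too_early` would have satisfied the earlier table deletion rule, which compared against the system timestamp only, but it fails the revised one because the table holds newer data.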
## Remarks
This command mutates the state of the Obelisk agent.

@ -1,82 +0,0 @@
# Types
All data representations (nouns) of the Obelisk system are strongly typed.
## Column Types
The fundamental data element in Obelisk is an atom that is typed by an aura. Data cells, which are intersections of a `<table-set>` row and column, are typed atoms.
Obelisk supports the following auras (see ch12-literals for representing the atomic types):
| Aura | Description |
| :--- |:---------------------------- |
| @c | UTF-32 |
| @da | date |
| @dr | timespan |
| @f | loobean |
| @if | IPv4 address |
| @is | IPv6 address |
| @p | ship name |
| @q | phonemic base |
| @rh | half float (16b) |
| @rs | single float (32b) |
| @rd | double float (64b) |
| @rq | quad float (128b) |
| @sb | signed (low bit) binary |
| @sd | signed (low bit) decimal |
| @sv | signed (low bit) base32 |
| @sw | signed (low bit) base64 |
| @sx | signed (low bit) hexadecimal |
| @t | UTF-8 text (cord) |
| @ta | ASCII text (knot, url safe) |
| @tas | ASCII text (term) |
| @ub | unsigned binary |
| @ud | unsigned decimal |
| @uv | unsigned base32 |
| @uw | unsigned base64 |
| @ux | unsigned hexadecimal |
Columns are typed by an aura and indexed by name.
```
<column-type> ::=
<aura/name>
```
## Table Row and Table Types
All datasets in Obelisk are sets, meaning each typed element, `<row-type>`, only exists once.
Datasets are also commonly regarded as tables, which is accurate when the index of each cell (row/column intersection) can be calculated. This calculation is possible when the `SELECT` statement includes an `ORDER BY` clause.
All tables originate from, or are derived from, base tables created by the `CREATE TABLE` command.
A base-table (`<table>`) row has a default type, which is the table's atomic aura-typed columns in a fixed order.
```
<row-type> ::= list <aura>
```
Each base table is typed by its `<row-type>`.
```
<table-type> ::= (list <row-type>)
```
A base table's definition includes a unique primary row order, giving it `list` type rather than `set` type. This is not true for all `<table-set>` instances.
Rows from `<view>`s, `<common-table-expression>`s, and the command output from `<transform>`, or any other table that is not a base-table, can only have an immutable row order if it is explicitly specified (i.e., the `SELECT` statement includes an `ORDER BY` clause). In general, these other tables have types that are unions of `<row-type>`s.
When the `<table-set-type>` is a union of `<row-type>`s, there is a `<row-type>` representing the full width of the `SELECT` statement and as many `<row-type>` sub-types as necessary to represent any unjoined outer `JOIN`s that result in a selected row.
Sub-types align their columns with the all-column `<row-type>`, regardless of the `SELECT` statement's construction.
In general, `<table-set>`s have the type:
```
<table-set-type> ::=
(list <row-type>)
| (set (<all-column-row-type> | <row-sub-type-1> | ... | <row-sub-type-n> ))
```
## Additional Types
All the static types in Obelisk API are defined in `sur/ast/hoon`.
## Remarks
Even `<table>`s can be typed as sets, because a `SELECT` statement without an `ORDER BY` clause has an undefined row order.
Regardless of the presence of `ORDER BY`, any `<table-set>` emitted by any step in a `<transform>`, a CTE, or a `<view>` is a list of `<row-type>` in some (possibly arbitrary) order.
Ultimately, "set" is the most important concept because every `<table-set>` will have one unique row value for any given sub-type of `<row-type>`.