Move the `ConstraintName` and `Check` `string` fields on an
`alter_column` operation into a new `CheckConstraint` struct and make
validation a method on that new struct.
This is to facilitate being able to create tables and columns with
`CHECK` constraints in later PRs (#108, #109).
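The refactor can be sketched as follows (field names, tags and error messages are assumptions for illustration, not the actual pg-roll API):

```go
package main

import (
	"errors"
	"fmt"
)

// CheckConstraint holds the fields previously defined directly on the
// alter_column operation.
type CheckConstraint struct {
	Name  string `json:"name"`
	Check string `json:"check"`
}

// Validate is now a method on the struct, so any operation embedding a
// CheckConstraint can reuse the same checks.
func (c *CheckConstraint) Validate() error {
	if c.Name == "" {
		return errors.New("check constraint name is required")
	}
	if c.Check == "" {
		return errors.New("check expression is required")
	}
	return nil
}

func main() {
	c := CheckConstraint{Name: "title_length", Check: "length(title) > 3"}
	fmt.Println(c.Validate()) // a valid constraint validates cleanly
}
```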
This change ensures we also catch DROP statements for their inclusion in
the migrations log.
DROP statements don't reach the `ddl_command_end` event trigger, so we
need to listen for them explicitly under `sql_drop`.
Implement the `drop_constraint` operation for dropping constraints
defined on single columns.
An example of the operation looks like:
```json
{
"name": "23_drop_check_constraint",
"operations": [
{
"drop_constraint": {
"table": "posts",
"column": "title",
"name": "title_length",
"up": "title",
"down": "(SELECT CASE WHEN length(title) <= 3 THEN LPAD(title, 4, '-') ELSE title END)"
}
}
]
}
```
The example above drops a `CHECK` constraint. Dropping a `FOREIGN KEY`
constraint looks like this:
```json
{
"name": "24_drop_foreign_key_constraint",
"operations": [
{
"drop_constraint": {
"table": "posts",
"column": "user_id",
"name": "fk_users_id",
"up": "user_id",
"down": "(SELECT CASE WHEN EXISTS (SELECT 1 FROM users WHERE users.id = user_id) THEN user_id ELSE NULL END)"
}
}
]
}
```
The operation is essentially the inverse of the operations that add
`CHECK` and `FOREIGN KEY` constraints to a single column.
* On `Start`:
* a new column without the constraint is added to the underlying table.
* triggers are created using the `up` and `down` SQL. The `down` SQL
needs to ensure that rows inserted into the new view that don't meet the
constraint are converted into rows that do meet the constraint.
* On `Complete`
* Triggers are removed, the old column is deleted and the new column is
renamed.
* On `Rollback`
* The new column and the triggers are removed.
## Improvements
* The `drop_constraint` operation requires that the column on which the
constraint is defined is named in the migration `json` file. If
`pg-roll`'s internal schema representation knew about the constraints
defined on a table it would be possible to delete constraints by
constraint name only; the schema representation would know on which
column the constraint was defined.
* `pg-roll` currently only allows for creating `CHECK` and `FOREIGN KEY`
constraints on single columns; this limitation also applies to the
`drop_constraint` operation.
Foreign key constraints can be created by 3 operations:
* `create_table`
* `add_column`
* `set_foreign_key`
This PR moves the logic that was repeated in these 3 operations to test
the validity of a FK constraint (has a name, target table and column
exist) into a method on the `ForeignKeyReference` type.
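The shared validation can be sketched like this (type and field names are simplified assumptions; the real method receives pg-roll's schema representation so it can check that the referenced table and column exist):

```go
package main

import (
	"errors"
	"fmt"
)

// Schema is a minimal stand-in for pg-roll's schema representation:
// table name -> set of column names.
type Schema struct {
	Tables map[string]map[string]bool
}

// ForeignKeyReference mirrors the shape shared by create_table,
// add_column and set_foreign_key.
type ForeignKeyReference struct {
	Name   string
	Table  string
	Column string
}

// Validate centralises the checks previously duplicated across the
// three operations: the constraint must be named, and the referenced
// table and column must exist.
func (f *ForeignKeyReference) Validate(s *Schema) error {
	if f.Name == "" {
		return errors.New("foreign key constraint name is required")
	}
	cols, ok := s.Tables[f.Table]
	if !ok {
		return fmt.Errorf("referenced table %q does not exist", f.Table)
	}
	if !cols[f.Column] {
		return fmt.Errorf("referenced column %q does not exist", f.Column)
	}
	return nil
}

func main() {
	s := &Schema{Tables: map[string]map[string]bool{"users": {"id": true}}}
	ref := ForeignKeyReference{Name: "fk_users_id", Table: "users", Column: "id"}
	fmt.Println(ref.Validate(s))
}
```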
Make it required to supply a name for a foreign key constraint created
in either the `create_table`, `add_column` or `set_foreign_key`
operations.
It should be possible to drop constraints with a later migration (not
yet implemented), so requiring a name and not relying on automatic
generation of constraint names will make this easier.
The same thing was done for indexes in #59 and `CHECK` constraints in
#99.
Make it required to supply a name for the `CHECK` constraint when adding
one with the `set_check_constraint` operation.
It should be possible to drop constraints with a later migration (not
yet implemented), so requiring a name and not relying on automatic
generation of constraint names will make this easier.
The same thing was done for indexes in
https://github.com/xataio/pg-roll/pull/59
Move the validation logic to test for the existence of the table and
column in an `alter_column` operation into the `alter_column` operation
itself and out of its inner operations.
This avoids duplication: each sub-operation no longer has to repeat the
test for table/column existence.
Build on #91 and #92 and move the `set_not_null` operation into the
`alter_column` operation.
The pattern is the same as for the operations already moved:
* Update the example migration to use the new operation type
* Remove serialization and deserialization logic for the individual
operation.
* Make the `alter_column` operation construct the right type of 'inner
operation'.
* Update tests for the foreign key and check constraint ops to use the
`alter_column` operation.
A migration to set a column `NOT NULL` now looks like:
```json
{
"name": "16_set_not_null",
"operations": [
{
"alter_column": {
"table": "reviews",
"column": "review",
"not_null": true,
"up": "(SELECT CASE WHEN review IS NULL THEN product || ' is good' ELSE review END)",
"down": "review"
}
}
]
}
```
The `not_null` field is currently only allowed to be set `true`, as
removing this kind of constraint is currently unsupported.
Move the `set_foreign_key` and `set_check_constraint` operations into
the new `alter_column` operation introduced in #91.
The pattern for moving the operations is the same as in #91:
* Update the example migrations to use the new operation type
* Remove serialization and deserialization logic for the individual
operations.
* Make the `alter_column` operation construct the right type of 'inner
operation'.
* Update tests for the foreign key and check constraint ops to use the
`alter_column` operation.
Create a new release for each tag pushed to the repository.
Build versions of `pg-roll` for most OS/arch combinations and use them
as artifacts for the release.
The build job runs on every push and the release job is gated on pushes
to `refs/tags`.
Introduce a new `alter_column` operation to combine existing operations
that work on columns.
The following operation types should be combined into one:
- [x] Change column type
- [x] Rename column
- [ ] Add `CHECK` constraint
- [ ] Add foreign key constraint
- [ ] Make column `NOT NULL`
- [ ] Add unique constraint
This PR implements the first two operations in the list, leaving the
rest for a later PR.
The new `alter_column` migrations look like:
```json
{
"name": "18_change_column_type",
"operations": [
{
"alter_column": {
"table": "reviews",
"column": "rating",
"type": "integer",
"up": "CAST(rating AS integer)",
"down": "CAST(rating AS text)"
}
}
]
}
```
and
```json
{
"name": "13_rename_column",
"operations": [
{
"alter_column": {
"table": "employees",
"column": "role",
"name": "job_title"
}
}
]
}
```
Ensure that `up` SQL is required when adding a `NOT NULL` column with no
default.
Add tests for this case and the other validation cases for which no
tests had been written.
For consistency with other 'alter column' style operations and to make
it easier to combine such operations into one, make the set `NOT NULL`
operation take `up` and `down` as `string` rather than `*string`.
For consistency with other operations, flexibility and the ability to
more easily combine operations into one 'alter column' operation, allow
the set `NOT NULL` operation to take user-supplied `down` SQL to be run
when moving values from the new to the old schema.
Previously, the `down` SQL was assumed to be just a straight copy of the
value from the new to the old column.
The example migrations in `examples/` are an important form of
executable documentation for `pg-roll`.
Ensure that all the migrations in the `examples/` dir can be run
sequentially without error on all supported versions of Postgres by
adding a matrix step to the `build` workflow.
For consistency with other operations that allow `up` SQL and to allow
more flexibility, the `up` SQL on the set `NOT NULL` operation should
run on all values, not just `NULL`s.
Making the `up` SQL here behave the same as other operations will also
make it easier to combine operations as one 'alter column' operation in
future.
This was previously discussed in the comments on #63, especially
[here](https://github.com/xataio/pg-roll/pull/63#discussion_r1305433916).
Add support for an operation to add a `CHECK` constraint to an existing
column. The new operation looks like this:
```json
{
"name": "22_add_check_constraint",
"operations": [
{
"set_check_constraint": {
"table": "posts",
"column": "title",
"check": "length(title) > 3",
"up": "(SELECT CASE WHEN length(title) <= 3 THEN LPAD(title, 4, '-') ELSE title END)",
"down": "title"
}
}
]
}
```
This migration adds a `CHECK (length(title) > 3)` constraint to the
`title` column on the `posts` table. Pre-existing values in the old
schema are rewritten to meet the constraint using the `up` SQL.
The implementation is similar to the **set not null**, **change column
type** and **set foreign key** operations.
* On `Start`:
* The column is duplicated and a `NOT VALID` `CHECK` constraint is added
to the new column.
* Values from the old column are backfilled into the new column using
`up` SQL.
* Triggers are created to copy values from old -> new with `up` SQL and
from new->old using `down` SQL.
* On `Complete`
* The `CHECK` constraint is validated
* The old column is dropped and the new column renamed to the name of
the old column.
* Postgres ensures that the `CHECK` constraint is also updated to apply
to the new column.
* Triggers and trigger functions are removed.
* On `Rollback`
* The new column is removed
* Triggers and trigger functions are removed.
As with other operations involving `up` and `down` SQL, it is the user's
responsibility to ensure that values from the old schema that don't meet
the new `CHECK` constraint are correctly rewritten to meet the
constraint with `up` SQL. If the `up` SQL fails to produce a value that
meets the constraint, the migration will fail either at start (for
existing values in the old schema) or at runtime (for values written to
the old schema during the migration period).
Add support for adding a foreign key constraint to an existing column.
Such a migration looks like:
```json
{
"name": "21_add_foreign_key_constraint",
"operations": [
{
"set_foreign_key": {
"table": "posts",
"column": "user_id",
"references": {
"table": "users",
"column": "id"
},
"up": "(SELECT CASE WHEN EXISTS (SELECT 1 FROM users WHERE users.id = user_id) THEN user_id ELSE NULL END)",
"down": "user_id"
}
}
]
}
```
This migration adds a foreign key constraint to the `user_id` column in
the `posts` table, referencing the `id` column in the `users` table.
The implementation is similar to the **set not null** and **change
column type** operations:
* On `Start`:
* Create a new column, duplicating the one to which the FK constraint
should be added.
* The new column has the foreign key constraint added as `NOT VALID` to
avoid taking a long lived `SHARE ROW EXCLUSIVE` lock (see
[here](https://medium.com/paypal-tech/postgresql-at-scale-database-schema-changes-without-downtime-20d3749ed680#00dc)).
* Backfill the new column with values from the existing column,
rewriting values using the `up` SQL.
* Create a trigger to populate the new column when values are written to
the old column, converting values with `up`.
* Create a trigger to populate the old column when values are written to
the new column, converting values with `down`.
* On `Complete`
* Validate the foreign key constraint.
* Remove triggers
* Drop the old column
* Rename the new column to the old column name.
* Rename the foreign key constraint to be consistent with the new name
of the column.
* On `Rollback`
* Remove the new column and both triggers. Removing the new column also
removes the foreign key constraint on it.
The `up` SQL in this operation is critical. The old column does not have
a foreign key constraint imposed on it after `Start` as that would
violate the guarantee that `pg-roll` does not make changes to the
existing schema. The `up` SQL therefore needs to take into account that
not all rows inserted into the old schema will have a valid foreign key.
In the example `json` above, the `up` SQL ensures that values for which
there is no corresponding user in the `users` table result in `NULL`
values in the new column. Failure to do this would result in the old
schema failing to insert rows without a valid `user_id`. An alternative
would be to implement data quarantining for these values, as discussed
last week with @exekias.
Allow the **add column** operation to create foreign key columns.
An example of such an operation is:
```json
{
"name": "17_add_rating_column",
"operations": [
{
"add_column": {
"table": "orders",
"column": {
"name": "user_id",
"type": "integer",
"references": {
"table": "users",
"column": "id"
}
}
}
}
]
}
```
Most of the work to support the operation is in
https://github.com/xataio/pg-roll/pull/79.
* The constraint is added on `Start` (named according to the temporary
name of the new column).
* The entire new column, including the foreign key constraint, is
removed on `Rollback`.
* The constraint is renamed to use the final name of the new column on
`Complete`.
Test cases are included for both nullable and non-nullable FKs.
Some migration operations create new objects in the database; new
columns, triggers and functions. When such a migration fails to start
(eg a change column type operation where some existing rows in the
database can't be converted to the new type), all database objects
created by `Start` should be cleaned up.
This PR calls `Rollback` on `Start` errors to ensure that this cleanup
happens.
Implement the **change column type** operation. A change type migration
looks like this:
```json
{
"name": "18_change_column_type",
"operations": [
{
"change_type": {
"table": "reviews",
"column": "rating",
"type": "integer",
"up": "CAST(rating AS integer)",
"down": "CAST(rating AS text)"
}
}
]
}
```
This migration changes the type of the `rating` column from `TEXT` to
`INTEGER`.
The implementation is very similar to the **set NOT NULL** operation
(#63):
* On `Start`:
* Create a new column having the new type
* Backfill the new column with values from the existing column,
converting the types using the `up` SQL.
* Create a trigger to populate the new column when values are written to
the old column, converting types with `up`.
* Create a trigger to populate the old column when values are written to
the new column, converting types with `down`.
* On `Complete`
* Remove triggers
* Drop the old column
* Rename the new column to the old column name.
* On `Rollback`
* Remove the new column and both triggers.
The migration can fail in at least 2 ways:
* The initial backfill of existing rows on `Start` fails due to the type
conversion not being possible on one or more rows. In the above example,
any existing rows with `rating` values not representable as an `INTEGER`
will cause a failure on `Start`.
* In this case, the failure is reported and the migration rolled back
(#73)
* During the rollout period, unconvertible values are written to the old
version schema. The `up` trigger will fail to convert the values and the
`INSERT`/`UPDATE` will fail.
* Some form of data quarantine needs to be implemented here, copying the
invalid rows elsewhere and blocking completion of the migration until
those rows are handled in some way.
The PR also adds example migrations to the `/examples` directory.
The `.Columns` field provided to the template for execution has type
`map[string]schema.Column`. The existing template code ignored the key
values, using just the `.Name` field in the `schema.Column`. By using the
key string, the template is able to support aliasing columns in
declarations, for example:
```go
Columns: map[string]schema.Column{
"id": {Name: "id", Type: "int"},
"username": {Name: "username", Type: "text"},
"product": {Name: "product", Type: "text"},
"review": {Name: "review", Type: "text"},
"rating": {Name: "_pgroll_new_rating", Type: "integer"},
},
```
aliases the `rating` key to the `_pgroll_new_rating` column, producing a
SQL declaration in the trigger like:
```sql
"rating" "public"."reviews"."_pgroll_new_rating"%TYPE := NEW."_pgroll_new_rating";
```
This is useful when generating declarations for the **change column
type** operation, where the `rating` variable in the down SQL should
refer to the new temporary column.
The example trigger is supposed to be an `up` trigger, so make the
`PhysicalColumn` and `SQL` fields reflect that.
This doesn't affect the correctness of the tests; just makes the example
triggers easier to understand.
This change adds a new `sql` operation that allows defining an `up`
SQL statement to perform a migration on the schema.
An optional `down` field can be provided; it will be used when rolling
the migration back afterwards (for instance, in case of migration failure).
A new trigger is installed to capture DDL events coming from direct
user manipulations (not performed through pg-roll), so that they are
stored as migrations and the resulting schema is known in all cases.
Change the representation of a schema in `pg-roll`'s state store from:
```go
type Schema struct {
// Tables is a map of virtual table name -> table mapping
Tables map[string]Table `json:"tables"`
}
```
to:
```go
type Schema struct {
// Name is the name of the schema
Name string `json:"name"`
// Tables is a map of virtual table name -> table mapping
Tables map[string]Table `json:"tables"`
}
```
ie, store the schema's name.
This allows the signature of `Start` to be simplified, removing the
`schemaName` parameter; the name can be retrieved from the
`schema.Schema` struct that is already provided.
Refactor the unwieldy `Sprintf`-based trigger generation code:
https://github.com/xataio/pg-roll/blob/main/pkg/migrations/triggers.go#L54-L78
with an equivalent approach that uses text templates:
```go
package templates
const Function = `CREATE OR REPLACE FUNCTION {{ .Name | qi }}()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
DECLARE
{{- $schemaName := .SchemaName }}
{{- $tableName := .TableName }}
{{ range .Columns }}
{{- .Name | qi }} {{ $schemaName | qi }}.{{ $tableName | qi}}.{{ .Name | qi }}%TYPE := NEW.{{ .Name | qi }};
{{ end -}}
latest_schema text;
search_path text;
BEGIN
SELECT {{ .SchemaName | ql }} || '_' || latest_version
INTO latest_schema
FROM {{ .StateSchema | qi }}.latest_version({{ .SchemaName | ql }});
SELECT current_setting
INTO search_path
FROM current_setting('search_path');
IF search_path {{- if eq .Direction "up" }} != {{- else }} = {{ end }} latest_schema {{ if .TestExpr -}} AND {{ .TestExpr }} {{ end -}} THEN
NEW.{{ .PhysicalColumn | qi }} = {{ .SQL }};
{{- if .ElseExpr }}
ELSE
{{ .ElseExpr }};
{{- end }}
END IF;
RETURN NEW;
END; $$
`
```
The templates are easier to read, easier to extend and result in cleaner
triggers without oddities in indentation levels or empty `ELSE` blocks
when no `ElseExpr` is provided.
This is in response to [this
comment](https://github.com/xataio/pg-roll/pull/63#discussion_r1305430362).
Implement the operation to make an existing column `NOT NULL`.
The migration looks like this:
```json
{
"name": "16_set_not_null",
"operations": [
{
"set_not_null": {
"table": "reviews",
"column": "review",
"up": "product || ' is good'"
}
}
]
}
```
This migration adds a `NOT NULL` constraint to the `review` column in
the `reviews` table.
* On `Start`:
* Create a new column with a `NOT VALID` check constraint enforcing
`NOT NULL` (Postgres does not allow adding the `NOT NULL` constraint
itself as `NOT VALID`)
* Backfill the new column with values from the existing column using the
`up` SQL to replace `NULL` values
* Create a trigger to populate the new column when values are written to
the old column, rewriting `NULLs` with `up` SQL.
* Create a trigger to populate the old column when values are written to
the new column.
* On `Complete`
* Validate the `NOT VALID` check constraint on the new column.
* Add `NOT NULL` to the new column.
* Remove triggers and the now-redundant check constraint
* Drop the old column
* Rename the new column to the old column name.
* On `Rollback`
* Remove the new column and both triggers.
Add the skeleton code for the **set not null** operation and add a test
that `Validate` fails if `up` SQL is not provided.
Fix the `wantError` logic in the test infrastructure in order to make
this test pass.
The old behaviour would mark tests as passed if `Start` did not return
an error, even if an error was expected.
Change the module name to match its import path.
In order for `pg-roll` to be usable as a module we need to be able to
import `"github.com/xataio/pg-roll/pkg/roll"` etc from other modules.
Changing the name of the module to match its import path ensures that
this is possible.
Add support for adding uniqueness constraints to columns. Such a
migration looks like this:
```json
{
"name": "15_set_columns_unique",
"operations": [
{
"set_unique": {
"name": "reviews_username_product_unique",
"table": "reviews",
"columns": [
"username",
"product"
]
}
}
]
}
```
This migration adds a unique constraint spanning the `username` and
`product` columns in the `reviews` table.
* On `Start` a unique index is created concurrently.
* On `Rollback` the unique index is removed.
* On `Complete` a unique constraint is added to the columns using the
index.
Creating a unique constraint directly requires a full table exclusive
lock. By first creating a unique index concurrently and then adding a
constraint using the index, the need for the lock is avoided.
Make specifying a name mandatory on the **Create index** operation.
In order to work with indexes in subsequent migrations (eg deleting the
index), the user will have to know the name of the index. If the
index name is auto-generated and we ever change how names are
generated, then we risk breaking a user's migrations if they have
migrations that refer to these generated names.
Add support for **rename column** migrations. A rename column migration
looks like:
```json
{
"name": "13_rename_column",
"operations": [
{
"rename_column": {
"table": "employees",
"from": "role",
"to": "jobTitle"
}
}
]
}
```
* On `Start`, the view in the new version schema aliases the renamed
column to its new name. The column in the underlying table is not
renamed.
* `Rollback` is a no-op.
* `Complete` renames the column in the underlying table.
Add support for **drop index** migrations. A drop index migration looks
like this:
```json
{
"name": "11_drop_index",
"operations": [
{
"drop_index": {
"name": "_pgroll_idx_users_name"
}
}
]
}
```
* `Start` is a no-op.
* On `Complete` the index is removed from the underlying table.
* `Rollback` is a no-op.
Add information about indexes on a table to `pg-roll`'s internal state
storage.
For each table, store an additional JSON object mapping each index name
on the table to details of the index (initially just its name).
An example of the resulting JSON is:
```json
{
"tables": {
"fruits": {
"oid": "16497",
"name": "fruits",
"columns": {
"id": {
"name": "id",
"type": "integer",
"comment": null,
"default": "nextval('_pgroll_new_fruits_id_seq'::regclass)",
"nullable": false
},
"name": {
"name": "name",
"type": "varchar(255)",
"comment": null,
"default": null,
"nullable": false
}
},
"comment": null,
"indexes": {
"_pgroll_idx_fruits_name": {
"name": "_pgroll_idx_fruits_name"
},
"_pgroll_new_fruits_pkey": {
"name": "_pgroll_new_fruits_pkey"
},
"_pgroll_new_fruits_name_key": {
"name": "_pgroll_new_fruits_name_key"
}
}
}
}
}
```
Also add fields to the `Schema` model structs to allow the new `indexes`
field to be unmarshalled.