Implement the **drop column** migration operation.
A migration to drop a column looks like this:
```json
{
  "name": "09_drop_column",
  "operations": [
    {
      "drop_column": {
        "table": "fruits",
        "column": "price",
        "down": "0"
      }
    }
  ]
}
```
The migration takes the name of the table and column to be dropped,
along with optional `down` SQL that is run to populate the field in the
underlying table when rows are inserted via the new schema version while
the migration is in progress.
* On `Start`, the relevant view in the new schema version is created
without the dropped column. The column is not deleted from the
underlying table.
* If `down` SQL is specified, a trigger is created on the underlying
table to populate the column to be removed when inserts are made from
the new schema version.
* On `Rollback`, any triggers on the underlying table are removed.
* On `Complete`, the old version of the schema is removed and the column
is dropped from the underlying table. Any triggers are also removed.
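As a sketch of how the `down` SQL could be attached to the underlying table (the function and trigger names here are hypothetical, not pg-roll's actual naming scheme, and the check that restricts the trigger to inserts coming from the new schema version is omitted), the generated DDL might look like:

```go
package main

import "fmt"

// downFunctionDDL sketches a trigger function that overwrites the
// to-be-dropped column with the `down` expression on insert.
func downFunctionDDL(table, column, down string) string {
	return fmt.Sprintf(
		"CREATE FUNCTION %s_%s_down() RETURNS trigger AS $$ "+
			"BEGIN NEW.%s := %s; RETURN NEW; END; $$ LANGUAGE plpgsql",
		table, column, column, down)
}

// downTriggerDDL fires the function before each insert on the
// underlying table.
func downTriggerDDL(table, column string) string {
	return fmt.Sprintf(
		"CREATE TRIGGER %s_%s_down BEFORE INSERT ON %s "+
			"FOR EACH ROW EXECUTE FUNCTION %s_%s_down()",
		table, column, table, table, column)
}

func main() {
	fmt.Println(downFunctionDDL("fruits", "price", "0") + ";")
	fmt.Println(downTriggerDDL("fruits", "price") + ";")
}
```

On `Rollback` and `Complete`, dropping the trigger and its function undoes this wiring.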
Add support for **drop table** migrations.
* On `Start`, the table is not dropped, but it is not present in the new
version of the schema.
* The table is dropped on completion of the migration.
* Rollback is a no-op.
https://github.com/xataio/pg-roll/pull/35 ensured that the triggers
added to run `up` SQL are removed on `Rollback`, but the same needs to
happen on `Complete`.
This PR does that and extends the test to check for it.
Implement `Complete` for **add column** operations that add a `NOT NULL`
column without a `DEFAULT`.
To add such a column without holding an exclusive lock while a full
table scan is performed, these steps need to be followed:
On `Start`:
1. Add the new column.
2. Add a `CHECK (column IS NOT NULL)` constraint to the new column, but
with `NOT VALID`, to avoid the scan.
3. Backfill the new column with the provided `up` SQL.
On `Complete`:
1. Validate the constraint (with `ALTER TABLE ... VALIDATE CONSTRAINT`).
2. Add the `NOT NULL` attribute to the column. The presence of a
validated `CHECK (column IS NOT NULL)` constraint means that adding
`NOT NULL` to the column does not require a full table scan.
See [this
post](https://medium.com/paypal-tech/postgresql-at-scale-database-schema-changes-without-downtime-20d3749ed680#00dc)
for a summary of these steps.
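The steps above can be sketched as generated DDL (constraint naming and quoting are simplified here; this is not pg-roll's actual SQL generation, and the `up` backfill is omitted):

```go
package main

import "fmt"

// notNullColumnDDL sketches the DDL for each phase of adding a NOT NULL
// column without a DEFAULT.
func notNullColumnDDL(table, column, colType string) (start, complete []string) {
	constraint := fmt.Sprintf("%s_%s_not_null", table, column)
	start = []string{
		// 1. Add the new column (still nullable at this point).
		fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s", table, column, colType),
		// 2. Add the CHECK constraint as NOT VALID to avoid a full table scan.
		fmt.Sprintf("ALTER TABLE %s ADD CONSTRAINT %s CHECK (%s IS NOT NULL) NOT VALID",
			table, constraint, column),
		// 3. Backfilling with the `up` SQL happens here (omitted).
	}
	complete = []string{
		// 1. Validate the constraint; this scans the table but does not
		// hold an exclusive lock for the duration of the scan.
		fmt.Sprintf("ALTER TABLE %s VALIDATE CONSTRAINT %s", table, constraint),
		// 2. Set NOT NULL; the validated CHECK constraint lets Postgres
		// skip the full table scan.
		fmt.Sprintf("ALTER TABLE %s ALTER COLUMN %s SET NOT NULL", table, column),
	}
	return start, complete
}

func main() {
	start, complete := notNullColumnDDL("products", "description", "varchar(255)")
	for _, s := range append(start, complete...) {
		fmt.Println(s + ";")
	}
}
```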
Implement `Rollback` for **add column** operations that add a `NOT NULL`
column without a `DEFAULT`.
Support for starting such an operation was added in
https://github.com/xataio/pg-roll/pull/37. See that PR for a description
of the steps involved.
In order to roll back, there is nothing to be done as the rollback for
the **add column** operation already removes the new column, which also
removes any constraints defined on the column.
This PR adds an `afterRollback` hook to the tests to verify that the
constraint has been dropped.
Implement `Start` for **add column** operations that add a `NOT NULL`
column without a `DEFAULT`.
To add such a column without holding an exclusive lock while a full
table scan is performed, these steps need to be followed:
On `Start`:
1. Add the new column.
2. Add a `CHECK (column IS NOT NULL)` constraint to the new column, but
with `NOT VALID`, to avoid the scan.
3. Backfill the new column with the provided `up` SQL.
On `Complete`:
1. Validate the constraint (with `ALTER TABLE ... VALIDATE CONSTRAINT`).
2. Add the `NOT NULL` attribute to the column. The presence of a
validated `CHECK (column IS NOT NULL)` constraint means that adding
`NOT NULL` to the column does not require a full table scan.
See [this
post](https://medium.com/paypal-tech/postgresql-at-scale-database-schema-changes-without-downtime-20d3749ed680#00dc)
for a summary of these steps.
When an **add column** migration is started on a table with existing
rows, backfill the values of the new column using the `up` SQL from the
migration.
The backfill implementation here is naive; for tables with many rows a
batched update is more appropriate.
Support for `up` SQL in **add column** operations was added in #34.
When such an **add column** operation runs, a trigger (and therefore a
trigger function) is created to run the `up` SQL on insertions via the
old schema.
This PR ensures that the trigger and function are dropped when the
migration is rolled back:
* Extract two functions so that the rollback operations can name the
function to be dropped.
* Add necessary assertion functions to the common test infra.
* Add an `afterRollback` hook to the existing test for `up` SQL.
Add a new field `Up` to **add column** migrations:
```json
{
  "name": "03_add_column_to_products",
  "operations": [
    {
      "add_column": {
        "table": "products",
        "up": "UPPER(name)",
        "column": {
          "name": "description",
          "type": "varchar(255)",
          "nullable": true
        }
      }
    }
  ]
}
```
The SQL specified by the `up` field will be run whenever a row is
inserted into the underlying table while the session's `search_path` is
not set to the latest version of the schema.
The `up` SQL snippet can refer to existing columns in the table by name
(as in the above example, where the `description` field is set to
`UPPER(name)`).
Add a `--complete` flag: if set, the migration is completed immediately
after starting it.
For example:
```bash
go run . start examples/02_create_another_table.json --complete
```
completes the migration immediately, without the user having to run
`complete`.
Change the `MustSelect` function to allow it to return arbitrary types,
rather than assuming all values are of type `string`.
In particular, this change will allow tests to check for the presence of
`NULL`s in the returned values.
`MustSelect` converts all 64-bit integers back to plain `int`s, in order
to reduce the noise in test code.
For example, with 64 bit ints we'd have to write:
```golang
{"id": int64(2), "name": "Bob", "age": int64(21)},
```
rather than:
```golang
{"id": 2, "name": "Bob", "age": 21},
```
The potential loss of precision in test code seems unlikely to affect
us, so we favour readability of test code here.
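A minimal sketch of such a conversion (assuming rows are scanned into `map[string]any`; `normalizeRow` is a hypothetical name, not pg-roll's actual helper):

```go
package main

import "fmt"

// normalizeRow converts int64 values scanned from Postgres back to int,
// reducing noise in test expectations. Other value types (including
// NULLs represented as nil) pass through unchanged.
func normalizeRow(row map[string]any) map[string]any {
	out := make(map[string]any, len(row))
	for k, v := range row {
		if n, ok := v.(int64); ok {
			out[k] = int(n)
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	row := normalizeRow(map[string]any{"id": int64(2), "name": "Bob", "age": int64(21)})
	fmt.Println(row["id"], row["name"], row["age"])
}
```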
Replace the tests for the **add column** operation with the more concise
`ExecuteTests` style introduced in #23.
To support this, some new assertion functions are added:
* `ColumnMustExist/ColumnMustNotExist`
* `TableMustHaveColumnCount`
Allow the **add column** operation to add columns with `NOT NULL`,
`UNIQUE` and `DEFAULT` constraints by re-using the SQL generation code
that adds columns to tables.
This PR adds validation of migrations: it checks that operations are
valid (can actually be executed against the given schema).
I'm pretty sure we can do way more validation, but I wanted to have the
infra in place before we have many ops and this becomes a bigger change.
Sort the keys in the record map before constructing the `INSERT`
statement, to ensure a consistent ordering of the column names and
values in the `INSERT` statement.
For example, inserting records into a `users` table could generate
either of these two `INSERT` statements, depending on how the map was
iterated:
```sql
INSERT INTO public_02_add_column.users (name, age) VALUES ('21', 'Bob')
INSERT INTO public_02_add_column.users (name, age) VALUES ('Bob', '21')
```
the first of which will fail (assuming the `age` field has type
`integer`).
This was causing flaky tests when inserting values into multiple
columns.
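A minimal sketch of the fix (hypothetical helper name, simplified quoting):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildInsert constructs an INSERT statement with the record's column
// names (and their values) in sorted key order, so the generated SQL is
// deterministic regardless of Go's randomized map iteration order.
func buildInsert(table string, record map[string]string) string {
	cols := make([]string, 0, len(record))
	for col := range record {
		cols = append(cols, col)
	}
	sort.Strings(cols)

	vals := make([]string, 0, len(cols))
	for _, col := range cols {
		vals = append(vals, "'"+record[col]+"'")
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(cols, ", "), strings.Join(vals, ", "))
}

func main() {
	fmt.Println(buildInsert("public_02_add_column.users",
		map[string]string{"name": "Bob", "age": "21"}))
}
```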
* Convert the tests for the **create table** operation to use the more
concise table driven style introduced in #23.
* Add a new `afterRollback` hook to the common test infra to allow the
**create table** op to test that the underlying table is removed on
rollback.
* Rename functions in the common test infra to make it clear where we
are checking for the existence of a table vs a view.
Add a `status` command to show the status of each schema that `pg-roll`
knows about (i.e. those schemas that have had at least one migration run
in them).
`go run . status`
**Example output**:
```json
[
  {
    "Schema": "public",
    "Version": "01_create_tables",
    "Status": "In Progress"
  }
]
```
or:
```json
[
  {
    "Schema": "public",
    "Version": "01_create_tables",
    "Status": "Complete"
  }
]
```
In future, the `json` output of the command should be put behind a
`-o json` switch and the default output should be human-readable.
Add the **rename table** operation.
I also worked a bit toward table-based testing with reusable code in
`op_common_test.go`.
Co-authored-by: Andrew Farries <andyrb@gmail.com>
Add support for **add column** migrations in the simple case where the
new column is nullable.
Add tests for the new operation, covering start, rollback and complete,
both where the column is added to a table created in an earlier
migration and where it is added to a table created by an earlier
operation in the same migration.
Reshape [offers](https://github.com/fabianlindfors/reshape#add-column)
an `up` option when adding a column, to allow users to backfill the new
column. This PR does not implement this feature.
Add a step to the `lint` job in the build workflow to check that all the
example `.json` migration files are consistently formatted.
The step fails as of this PR. It will pass when rebased onto
https://github.com/xataio/pg-roll/pull/21.
* Add tests for the `roll` package to ensure that the new versioned
schema is created on `start` and removed on `rollback`. We already had a
test there to ensure the previous versioned schema is dropped on
`complete`.
* Remove parts of tests for the create table operation that concerned
themselves with checking for the existence/non-existence of the
versioned schema. That is now tested in the `roll` package and we want
the tests for each operation to be focussed on the operation itself, not
schema creation.
* Add one more test for the create table operation to ensure that the
view for the new table is usable after `complete` (we already had a test
to ensure that it's usable on `start`).
Closes https://github.com/xataio/pg-roll/issues/15.
Qualify versioned schemas with the name of the schema they represent.
For a migration called `01_create_table` running in the `public` schema,
the versioned schema is called `public_01_create_table`.
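The naming rule above can be sketched as (hypothetical function name):

```go
package main

import "fmt"

// VersionedSchemaName returns the name of the versioned schema for a
// migration, qualified with the schema it represents.
func VersionedSchemaName(schema, version string) string {
	return fmt.Sprintf("%s_%s", schema, version)
}

func main() {
	fmt.Println(VersionedSchemaName("public", "01_create_table"))
}
```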
This change will retrieve and store the resulting schema after a
migration is completed. This schema will be used as the base to execute
the next migration, making it possible to create views that are aware of
the full schema, and not only the one created by the last migration.
We use a function to retrieve the schema directly from Postgres instead
of building it from the migration files. This allows for more features
in the future, like doing an initial sync on top of the existing schema
or automatically detecting and storing out of band migrations from
triggers.
Example JSON stored schema:
```json
{
  "tables": {
    "bills": {
      "oid": "18272",
      "name": "bills",
      "columns": {
        "id": {
          "name": "id",
          "type": "integer",
          "comment": null,
          "default": null,
          "nullable": false
        },
        "date": {
          "name": "date",
          "type": "time with time zone",
          "comment": null,
          "default": null,
          "nullable": false
        },
        "quantity": {
          "name": "quantity",
          "type": "integer",
          "comment": null,
          "default": null,
          "nullable": false
        }
      },
      "comment": null
    },
    "products": {
      "oid": "18286",
      "name": "products",
      "columns": {
        "id": {
          "name": "id",
          "type": "integer",
          "comment": null,
          "default": "nextval(_pgroll_new_products_id_seq::regclass)",
          "nullable": false
        },
        "name": {
          "name": "name",
          "type": "varchar(255)",
          "comment": null,
          "default": null,
          "nullable": false
        },
        "price": {
          "name": "price",
          "type": "numeric(10,2)",
          "comment": null,
          "default": null,
          "nullable": false
        }
      },
      "comment": null
    },
    "customers": {
      "oid": "18263",
      "name": "customers",
      "columns": {
        "id": {
          "name": "id",
          "type": "integer",
          "comment": null,
          "default": null,
          "nullable": false
        },
        "name": {
          "name": "name",
          "type": "varchar(255)",
          "comment": null,
          "default": null,
          "nullable": false
        },
        "credit_card": {
          "name": "credit_card",
          "type": "text",
          "comment": null,
          "default": null,
          "nullable": true
        }
      },
      "comment": null
    }
  }
}
```
After this change, I believe that the `create_table` operation is
feature complete and can be used for many sequential migrations.
Add a sentinel error `ErrNoActiveMigration` for the case where there is
no active migration. This improves the error strings presented to users
by not mentioning SQL errors.
**`pg-roll start` when there is a migration in progress:**
```
Error: a migration for schema "public" is already in progress
```
**`pg-roll rollback` when there is no migration in progress:**
```
Error: unable to get active migration: no active migration
```
**`pg-roll complete` when there is no active migration:**
```
Error: unable to get active migration: no active migration
```
This PR introduces state handling by creating a dedicated `pgroll`
schema (name configurable). We store migrations there, as well as their
state, keeping useful information such as the migration definition (so
that it isn't needed again at `complete` time).
The schema includes constraints to guarantee that:
* Only one migration is active at a time
* Migration history is linear (all migrations have a unique parent,
except the first one, whose parent is NULL)
* We know the current migration at all times
Some helper functions are included:
* `is_active_migration_period()` will return true if there is an active
migration.
* `latest_version()` will return the name of the latest version of the
schema.
Add a `rollback` command to the CLI.
Use the rollback functionality added to the `migrations` package in #5
to perform the rollback.
Example:
```bash
go run . start examples/01_create_tables.json
go run . rollback examples/01_create_tables.json
```
The schema for the `01_create_tables` version is removed from the
database along with the underlying tables.
⚠️ We currently don't have a way to ensure that only uncompleted
migrations can be rolled back. Once we have some state in the db
recording which migrations have been applied, we can revisit this
command to ensure completed migrations can't be rolled back ⚠️
Implement rollbacks for the create table operation.
* Delete the new version of the schema and any views it contains.
* Drop the tables created by the operations.
Add an integration test to check that these resources are successfully
dropped.
This PR adds the rollback operation to the `migrations` package;
supporting the operation through the CLI will be in a later PR.