For Reshape to work with an existing database, the first migration shouldn't be completed automatically, as that would rule out a zero-downtime
deployment. The user should instead use `--complete` if they don't care
about zero-downtime, for example when setting up a dev or CI
environment. We should update the `--complete` flag to apply and
complete each migration one by one for efficiency.
We want it to be possible for Reshape to be adopted for existing
databases with an existing schema. To make this possible, we can't check
if the schema path matches the current migration as there won't be a
current migration at first. Instead we'll compare against the target
migration.
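
As a rough sketch of that check (the schema naming scheme and the function below are assumptions for illustration, not Reshape's actual implementation), the expected schema name is derived from the target migration rather than from whatever is currently applied:

```rust
// Illustrative only: compare the schema path in use against the *target*
// migration, so the check also works before any migration has been applied.
fn schema_path_matches(schema_path: &str, target_migration: &str) -> bool {
    schema_path == format!("migration_{}", target_migration)
}
```
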
The next step in making Reshape work with existing databases is to not
automatically complete the first migration, as completing it right away would not be safe for an existing application adopting Reshape.
Database queries will now be retried if possible using exponential
backoff with jitter. This should help protect against connection
problems and also against timeouts caused by us now setting `lock_timeout` to avoid blocking other queries. If such a timeout occurs, we should fail so that waiting queries can execute, and then retry until the blocking queries have completed.
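
As a rough, self-contained sketch of that behaviour (the attempt count, delays, and names are made up for illustration, and the `rand` crate is used here for jitter), the retry loop could look like this:

```rust
use rand::Rng;
use std::{thread, time::Duration};

// Illustrative retry helper: run `attempt` until it succeeds or the retry
// budget is exhausted, backing off exponentially with random jitter.
fn with_retries<T, E>(mut attempt: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    const MAX_ATTEMPTS: u32 = 5;
    const BASE_DELAY_MS: u64 = 100;

    let mut rng = rand::thread_rng();
    for tries in 0..MAX_ATTEMPTS {
        match attempt() {
            Ok(value) => return Ok(value),
            // Out of attempts: give up and surface the last error.
            Err(err) if tries + 1 == MAX_ATTEMPTS => return Err(err),
            Err(_) => {
                // Double the delay for every failed attempt and add jitter so
                // concurrent clients don't retry in lock step.
                let backoff = BASE_DELAY_MS * 2u64.pow(tries);
                let jitter = rng.gen_range(0..BASE_DELAY_MS);
                thread::sleep(Duration::from_millis(backoff + jitter));
            }
        }
    }
    unreachable!("the final attempt always returns above")
}
```

A query that hits `lock_timeout` would surface as an error here, back off while the waiting queries get a chance to run, and then be retried.
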
When attempting to acquire a lock for DDL queries, Postgres will wait
for any long-running queries to complete first. It will also block any
other queries that need access, causing queries to queue up behind us until we release our lock. The effect is that a long-running query can block all other queries while we perform a migration, which is equivalent to downtime if it runs for long enough.
To avoid this, we set `lock_timeout`. This puts an upper bound on how long we will wait for a lock, and therefore on how long other
queries can be blocked by a long-running query. The next step will be to
add automatic retries to handle timeouts gracefully.
Reference: https://medium.com/paypal-tech/postgresql-at-scale-database-schema-changes-without-downtime-20d3749ed680
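
As a standalone illustration of the setting (not Reshape's actual code; the timeout value, connection string, and DDL statement are placeholders), using the `postgres` crate:

```rust
use postgres::{Client, NoTls};

// Placeholder connection and migration, for illustration only.
fn add_column_with_lock_timeout() -> Result<(), postgres::Error> {
    let mut client = Client::connect("host=localhost user=postgres", NoTls)?;

    // If another long-running query holds the lock for more than a second,
    // this fails with a lock_timeout error instead of queueing indefinitely
    // behind it (and blocking every query queued behind us); the caller can
    // retry once the blocking query has finished.
    client.batch_execute(
        "SET lock_timeout = '1s';
         ALTER TABLE users ADD COLUMN middle_name TEXT;",
    )?;
    Ok(())
}
```
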
Previously, tests set up migrations directly as structs, which doesn't match how users will create them. For the new test framework, we'll instead
write TOML for migrations. This brings the tests closer to how an end
user will interact with Reshape and also lets us write tests to match
the examples in the documentation.
This reduces the boilerplate required to run a test and also extends test coverage, as we can now easily exercise both the completion and abort paths. So far two tests have been ported over, `create_table` and `add_column`, with the rest to follow.
The framework also improves test output, adding sections that clearly indicate in which part of the migration process a test failed.
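
As a sketch of what a ported test might look like (the `Test` helper and its methods are hypothetical stand-ins for the framework's actual API, and the closure argument is assumed to be a `postgres::Client`):

```rust
#[test]
fn create_table() {
    // `Test` is a hypothetical stand-in for the framework's test helper.
    let mut test = Test::new("Create table");

    // The migration is written as TOML, just as an end user would write it.
    test.first_migration(
        r#"
        [[actions]]
        type = "create_table"
        name = "users"
        primary_key = ["id"]

            [[actions.columns]]
            name = "id"
            type = "INTEGER"
        "#,
    );

    // Assertions can be registered for the different phases of a migration,
    // which is what makes it easy to cover both the completion and abort paths.
    test.after_first(|_db: &mut postgres::Client| {
        // e.g. check that the `users` table exists
    });

    test.run();
}
```
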
Tones down the disclaimer a bit as I feel reasonably certain Reshape
won't lose any data. Also adds some details about locking and downtime
to the introduction.
This makes the format of `add_index` a bit more consistent with
`add_column`, which has a nested definition for the column. It also sets us up to allow adding indices in `create_table`.
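
In Rust terms the nested shape looks roughly like this (the struct and field names are illustrative, not Reshape's actual definitions):

```rust
use serde::Deserialize;

// Illustrative only: `add_index` now wraps the index definition in its own
// nested structure, mirroring how `add_column` nests the column definition.
#[derive(Deserialize)]
struct AddIndex {
    table: String,
    index: IndexDefinition,
}

#[derive(Deserialize)]
struct IndexDefinition {
    name: String,
    columns: Vec<String>,
}
```
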
This protects all database access using a Postgres advisory lock. This
way we can avoid problems when two instances of Reshape try to change
the same database at the same time.
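
A minimal sketch of the idea with the `postgres` crate (the lock key is an arbitrary placeholder, not the one Reshape uses):

```rust
use postgres::Client;

// Arbitrary placeholder key; Reshape's real lock key will differ.
const LOCK_KEY: i64 = 0x5245_5348;

// Run `f` while holding a session-level advisory lock, so two concurrent
// Reshape instances can't modify the same database at the same time.
fn with_advisory_lock<T>(
    client: &mut Client,
    f: impl FnOnce(&mut Client) -> Result<T, postgres::Error>,
) -> Result<T, postgres::Error> {
    // Blocks until any other instance holding the lock releases it.
    client.execute("SELECT pg_advisory_lock($1)", &[&LOCK_KEY])?;
    let result = f(&mut *client);
    client.execute("SELECT pg_advisory_unlock($1)", &[&LOCK_KEY])?;
    result
}
```
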
With the old fields of `State` broken out, we can now make `State` a
proper enum, replacing the nested `Status`. The naming with both `State`
and `Status` was a bit confusing.
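
Roughly, the shape of the change (the variant names and fields below are illustrative, not the actual definitions):

```rust
// Stand-in type so the sketch is self-contained; Reshape has its own.
struct Migration;

// Illustrative only: the status is folded into `State` itself instead of
// living in a separately named, nested `Status` value.
enum State {
    Idle,
    InProgress { migrations: Vec<Migration> },
    Completing { migrations: Vec<Migration> },
    Aborting { migrations: Vec<Migration> },
}
```
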
The version is not currently in use but will probably be useful later when we want Reshape to maintain backwards compatibility. Having the version stored in the database will then allow us to perform schema migrations on Reshape's own metadata.
Before, the entire state was stored as a JSON-encoded blob including all completed migrations. This commit moves the migrations into a dedicated table, which prevents the encoded state from growing indefinitely. The size of the state probably wouldn't have become a problem, but it feels weird for a schema migration tool to not use a proper schema.
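
Hypothetically, the dedicated table could look something like the following (the column names and types are assumptions for illustration, not Reshape's actual schema):

```rust
use postgres::Client;

// Illustrative schema for a dedicated migrations table, instead of keeping
// completed migrations inside one JSON-encoded state blob.
fn ensure_migrations_table(client: &mut Client) -> Result<(), postgres::Error> {
    client.batch_execute(
        "CREATE SCHEMA IF NOT EXISTS reshape;
         CREATE TABLE IF NOT EXISTS reshape.migrations (
             index INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
             name TEXT NOT NULL,
             actions JSONB NOT NULL,
             completed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
         );",
    )
}
```
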
Next, the version stored in the state struct will be moved to a separate
key in the `reshape.data` table and `State` will become an enum with the
same structure as `Status`. The naming right now is a bit confusing.
Until now, using `alter_column` would cause indices to be lost, as they
weren't copied to the temporary column. This fixes that for indices that
consist of one or more actual columns. It won't work for indices on
expressions.
This allows people new to the project to understand how Reshape works before getting into the technical details of setting it up and using it (which they may or may not want to do just yet).