Previously, we explicitly tracked the current schema and relied on that
in our migrations. This made things more complicated, as we needed to
keep track of not just tables and columns but also primary keys,
constraints, etc.
This commit removes the schema tracking and instead queries the
database for the current schema. During migrations, we temporarily store
the changes that are made, for example having temporary columns override
real ones, and combine these with the current schema in the database.
This is handled in schema.rs.
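As a rough illustration of the overlay idea, the sketch below combines
columns reported by the database with in-flight overrides. The
SchemaChanges type, its methods and the temporary column name are
hypothetical and not taken from schema.rs; the database columns are
passed in as a slice instead of being queried.

```rust
use std::collections::HashMap;

/// Hypothetical record of the changes made so far during a migration.
#[derive(Default)]
struct SchemaChanges {
    // (table, logical column name) -> temporary column standing in for it
    column_overrides: HashMap<(String, String), String>,
}

impl SchemaChanges {
    /// Register a temporary column that should override a real one.
    fn override_column(&mut self, table: &str, column: &str, temp_column: &str) {
        self.column_overrides.insert(
            (table.to_string(), column.to_string()),
            temp_column.to_string(),
        );
    }

    /// Combine the columns reported by the database with the in-flight
    /// overrides. Returns (logical name, real name) pairs.
    fn resolve_columns(&self, table: &str, db_columns: &[&str]) -> Vec<(String, String)> {
        db_columns
            .iter()
            .map(|column| {
                let key = (table.to_string(), column.to_string());
                let real_name = self
                    .column_overrides
                    .get(&key)
                    .cloned()
                    .unwrap_or_else(|| column.to_string());
                (column.to_string(), real_name)
            })
            .collect()
    }
}
```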
These changes also broke our previous handling of triggers and
functions and how we detected whether an insert/update was made against
the old or new schema during a migration. The previous method, using a
temporary __reshape_is_new column, has been replaced with helper
functions which inspect the search_path setting and use it to determine
which schema is being used. During migrations, we can also set the
custom "reshape.is_old_schema" setting to force the old schema, for
example during batch updates.
This greatly simplifies the triggers as we can now simply call a helper
function in Postgres, `reshape.is_old_schema()`, to determine which
schema the modification was made for.
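The real helper lives in Postgres, but the decision it makes can be
modelled in Rust for illustration. This is only a sketch under
assumptions: the parameter names, the boolean form of the override and
the rule "the first search_path entry names the new schema" are guesses,
not the actual function body.

```rust
/// Hypothetical model of the check described above.
///
/// * `search_path`: value of the search_path setting
/// * `forced_old`: whether the custom "reshape.is_old_schema" setting is set
/// * `new_schema`: name of the schema exposing the new version (assumed)
fn is_old_schema(search_path: &str, forced_old: bool, new_schema: &str) -> bool {
    // The custom setting wins, for example during batch updates.
    if forced_old {
        return true;
    }

    // Take the first non-empty entry of the search path, ignoring
    // surrounding whitespace and double quotes.
    let first_entry = search_path
        .split(',')
        .map(|entry| entry.trim().trim_matches('"'))
        .find(|entry| !entry.is_empty());

    // Anything other than the new schema is treated as the old schema.
    first_entry != Some(new_schema)
}
```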
This is a first, small step in removing schema tracking entirely.
Instead we should probe the database directly for the current schema.
This makes it easier to start using Reshape on an existing database
without having to specify the entire schema first.
Previously, having multiple alter_column actions for a single column
could cause issues as the triggers and batch updates referenced the
wrong columns.
This commit also simplifies the batch update procedure. Rather than
running the actual update on existing rows, a no-op update is run, which
in turn fires the triggers that update the new temporary columns.
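A no-op update of this kind could look like the statement built below.
The function name and the choice of column are assumptions made for
illustration; the point is only that assigning a column to itself still
fires row-level update triggers in Postgres.

```rust
/// Hypothetical helper that builds a no-op UPDATE statement. Setting a
/// column to its own value changes nothing, but row-level triggers on
/// the table still run for every matched row.
fn nop_update_sql(table: &str, column: &str) -> String {
    format!("UPDATE \"{0}\" SET \"{1}\" = \"{1}\"", table, column)
}
```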
The alter_column action was always using UPPER and LOWER rather than
the passed up and down settings. This commit also adjusts the names of
temporary columns, as multiple alter_column actions would conflict if
they edited the same column. They still conflict in other places, which
will be fixed later.
The context stores the index of the current migration and action. This
index is used to provide ordering of triggers as well as providing
unique names for things like temporary columns and procedures. The
Context struct exposes a prefix() function which returns a unique string
that can be used as a prefix for triggers, columns, procedures, etc.
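A minimal sketch of such a context follows. The field names and the
exact prefix format are assumptions; zero-padding is used here so that
prefixes sort in migration order, which matters because Postgres fires
triggers of the same kind in alphabetical order by name.

```rust
/// Hypothetical shape of the context; field names are assumptions.
#[derive(Debug, Clone, Copy)]
struct Context {
    migration_index: usize,
    action_index: usize,
}

impl Context {
    fn new(migration_index: usize, action_index: usize) -> Self {
        Context {
            migration_index,
            action_index,
        }
    }

    /// Returns a string unique to this migration and action. The
    /// indices are zero-padded so that string order matches numeric order.
    fn prefix(&self) -> String {
        format!(
            "__reshape_{:04}_{:04}",
            self.migration_index, self.action_index
        )
    }
}
```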
Previously, a simple UPDATE query was used to backfill the table. This
caused excessive locking and was inefficient on large tables. Now we
instead run backfills in batches of a thousand rows. The batches are
determined based on primary key and iterated in ascending order.
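The batching scheme can be illustrated with an in-memory stand-in for
the table. This is a sketch, not the actual implementation: a BTreeMap
keyed by an integer primary key replaces the database, and the function
name and closure parameter are hypothetical.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

const BATCH_SIZE: usize = 1000;

/// Visits every row in ascending primary-key order, one batch at a
/// time, and returns the number of non-empty batches processed.
fn backfill_in_batches<F>(table: &mut BTreeMap<i64, String>, mut update_row: F) -> usize
where
    F: FnMut(&mut String),
{
    let mut last_key: Option<i64> = None;
    let mut batches = 0;

    loop {
        // Start strictly after the last key handled by the previous batch.
        let lower = match last_key {
            Some(key) => Bound::Excluded(key),
            None => Bound::Unbounded,
        };

        // Roughly: WHERE pk > last_key ORDER BY pk LIMIT BATCH_SIZE
        let mut batch_max = None;
        for (key, row) in table.range_mut((lower, Bound::Unbounded)).take(BATCH_SIZE) {
            update_row(row);
            batch_max = Some(*key);
        }

        match batch_max {
            Some(key) => {
                last_key = Some(key);
                batches += 1;
            }
            // An empty batch means every row has been visited.
            None => break,
        }
    }

    batches
}
```

Each batch only touches the rows it selects, so in a real database the
locks taken per statement stay small.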
Previously, the new migrations were added directly to the migrations
list. Now they are instead added to a vector in the InProgress status
and moved to the migrations list once the migration is complete.
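The state handling described above might be sketched as follows. The
type and method names, and the use of plain strings for migrations, are
assumptions made to keep the example short.

```rust
/// Hypothetical status type; only InProgress is named in the text above.
#[derive(Debug, PartialEq)]
enum Status {
    Idle,
    InProgress { migrations: Vec<String> },
}

#[derive(Debug)]
struct State {
    status: Status,
    // Migrations that have been fully completed.
    migrations: Vec<String>,
}

impl State {
    /// Park the new migrations in the status instead of the main list.
    fn start(&mut self, new_migrations: Vec<String>) {
        self.status = Status::InProgress {
            migrations: new_migrations,
        };
    }

    /// Move the parked migrations to the main list once they are done.
    fn complete(&mut self) {
        let previous = std::mem::replace(&mut self.status, Status::Idle);
        if let Status::InProgress { migrations } = previous {
            self.migrations.extend(migrations);
        }
    }
}
```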
Another status should be added later to indicate that a migration is
currently being applied. Right now, if a migration fails and then the
migrations are changed before being retried, there could be some
dangling changes.