The 'transaction' object can now be fed a 'validator' function. This function
will be run right before the transaction is closed to validate its content. The
target usage is hooks. The validation function is expected to raise an exception
when it wants to abort the transaction.
This only introduce the idea with a default no-op validator. Actual usage is in
the next changeset.
Once the transaction is closed, we now write transaction related data for
possible future undo. For now, we only do it for full file "backup" because
their were not handle at all in that case. In the future, we could move all the
current logic to set undo up (that currently exists in localrepository) inside
transaction itself, but it is not strictly requires to solve the current
situation.
It is time for the transaction to be responsible for setting up the
undo data. It is necessary to move this logic into the transaction
because many more files are handled now, and the transaction is the
object tracking them all.
The value can be set to None if no undo should be set.
The argument is a string containing the journal name (used as prefix for all
other transaction file). This is not the transaction file itself. So we clarify
this.
Using 'copyfile' (single file) instead of 'copyfiles' (tree) will ensures
destination file will be overwritten. This will prevent some abort if backup
file are left in place for random reason.
It also seems more correct.
Previous transaction work added callbacks to be called during regular
transaction commit/close. As part of refactoring Mozilla's pushlog
extension (an extension that opens a SQLite database and tries to tie
its transaction semantics to Mercurial's transaction), I discovered that
the new transaction APIs were insufficient to avoid monkeypatching
transaction instance internals. Adding a callback that is called during
transaction abort removes the necessity for monkeypatching and completes
the API.
The location variable fetch from the loop and the one used to actually fetch it
mismatched. We fix the name to ensure file outside of store are cleaned up.
This method has the same behavior as the 'os.path.split' function, but having
it in vfs will allow handling of tricky encoding situations in the future.
In the same patch, we replace the use of 'os.path.split' in the transaction code.
The vfs.join method only works for absolute paths. We need something
that works for relative paths too when transforming filenames. Since
os.path.join may misbehave in tricky encoding situations, encapsulate
the new join method in our vfs abstraction. The default implementation
remains os.path.join, but this opens the door to other VFSes doing
something more intelligent based on their needs.
In the same go, we replace the usage of 'os.path.join' in transaction code.
Such file are generated with a .pending prefix. It is up to the reader to
implement the necessary logic for reading pending files.
We add a test to ensure pending files are properly cleaned-up in both success and
error cases.
This will allow us to generate temporary pending files. Files
generated with a suffix are assumed temporary and will be cleaned up
at the end of the transaction.
We are still doing double backups, but now that we have proper
location handling this is less of an issue. Dropping this simplifies
the code before we add some pending-related logic.
This also ensures we actually test the new 'location' mechanism.
The current naming scheme ('journal.backups.<file>') resulted is bad directory
name when 'file' was in a subdirectory. We now extract the directory name and
create the backupfile within it.
We plan to use file in a subdirectory for cachefile.
We do not want to abort if anything wrong happen while handling a cache file.
Cache file have way to be invalidated and if old/bad version stay no
misbehavior will happen. Proper value will eventually be computed and the wrong
will be righten.
This changeset use the transaction reporter (usually writing on stderr) to write
details about failed cache handling. This will only apply to write operation
using a transaction. The usual update during read only operation will stay a
debug message.
I was on the way to bring these message back to debug level when I realised it
could be a feature. People with write access to the repository are likely to
have the power to fix error related to cache (and it is valuable to fix them).
So let the things as is for now.
The goal is to allow access to file outside ofthe store directory from the
transaction. The obvious target are the `bookmarks` file. But we can envision
usage for cache too.
We keep passing a main opener explicitly because a lot of code rely on this
default opener. The main opener (operating on store) is using an empty key ''.
We need to store new data to improve the current transaction logic:
- location: We want to generate and backup file outside of the 'store' (eg:
bookmarks, or various cache files). This requires knowing and preserving where
each file is located. The value of this new field is a string. It will be used
as a key for a vfs mapping.
- cache: We would like to handle cache file in the transaction code. This
Will help to have cache consistent with the repository state and avoid
performance issue on big repository like Mozilla. However, failure to handle
cache file should not result in a transaction failure. We add a new field that
carry this information. The value is boolean, A True value mean any error
while handling this file can be ignored.
Those two mechanisms are not implemented yet, but they are now persisted in the
on disk file. Support for new mechanisms is coming in later changeset.
We update the file format now and will introduce the new features in later
changeset. The format version is set to 0 until we actually support the new feature.
This will prevent misunderstanding between incomplete and final client.
Support for reading both version 1 and (future) version 2 could be achieved
(using default value when reading version 1) but has not been seen as necessary
for now.
During the transaction, files may be created to store or expose data
involved in the transaction (eg: changelog index data are written in
a 'changelog.i.a' for hooks). But we do not have an official way to
record such file creation and make sure they are cleaned up. The lack
of clean-up is currently okay because there is a single file involved
and a single producer/consumer.
However, as we want to expose more data (bookmarks, phases, obsmarker)
we need something more solid. The 'backupentries' mechanism could
handle that. Temporary files can be encoded as a backup of nothing
'('', <temporarypath>)'. We "need" to attach it to the same mechanism
as we use to be able to use temporary transaction files outside of
.'store/' and 'backupentries' is expected to gain such feature.
This changeset makes it clear that we should rename 'backupentries' to
something more generic.
We are about to use the 'backupentry' mechanism to allow cleaning up
transaction-related temporary files (such as 'changelog.i.a'). We start
by extracting the entry registration into its own method for easy reuse.
At that point, I would like to rename the backup-file related variable to
something generic but I'm a bit short of ideas.
This mirrors the API for 'pending' and 'finalize' callbacks. I do not have
immediate usage planned for it, but I'm sure some callback will be happy to
access transaction related data.
The callback will likely need to perform some operation related to the
transaction (eg: registering file update). So we better pass the current
transaction as the callback argument. Otherwise callback that needs it has to
rely on horrible weak reference trick.
This allow already allow us to slay a wild weak reference usage.
The callback will likely need to perform some operation related to the
transaction (eg: backing files up). So we better pass the current transaction as
the callback argument. Otherwise callback that needs it has to rely on horrible
weak reference trick.
The first foreseen user of this is changelog._writepending. We would like it to
register the temporary file it create for cleanup purpose.
The case where a backup of a missing file was requested was previously
handled by the 'entries' list. As the 'backupentries' is about to gain
ability to backup files outside of '.hg/store', we want it to be able
to handle the missing file too.
Reminder: using 'addbackup' on a missing file means that such file needs to be
deleted if we rollback the transaction.
The `startgroup` and `endgroup` methods are used in a very specific
context to wrap a very specific operation (revlog truncation). It does
not make sense to perform any other operations during such a "group"
(eg:file backup). There is currently no user of backupfile during a
"group" so we drop the group-specific code and restrict authorized
operations during "group".
The addchangegroup code considers the transaction done after a 'tr.close()' call
and schedules the hook's execution for after lock release. In the nested transaction
case, the transaction is not yet committed and we must delay this scheduling.
We add an 'addpostclose' method (like the 'addpending' and 'addfinalize' ones) that
registers code to be run if the transaction is successfully committed.
The new 'addfinalize' method allows people to register a callback to
be triggered when the transaction is closed. This aims to get rid of
explicit calls to 'changelog.finalize'. This also obsoletes the
'onclose' function but removing it is not in the scope of this series.
The contents of the transaction must be flushed to disk before running
a hook. But it must be flushed to a special file so that the normal
reader does not use it. This logic is currently in the changelog only.
We add some facility to register such operations in the transaction
itself.
We extract the code generating files into its own function. We are
about to move this code around to fix a bug. We'll need it in a
function soon to reuse it for "pending" logic. So we move the code
into a function instead of moving it twice.
Previously the journal.backupfiles file was delimited by \0. Now we delimit it
using \n (same as the journal file). This allows us to change the number of
values in each line more easily, rather than relying on the count of \0's.
The transaction format will be changing a bit over the next releases, so let's
go ahead and add a version number to make backwards compatibility easier. This
whole file format was broken prior to 3.2 (see previous patch), so changing it
now is pretty low risk.