The transaction is already tracking some data intended for hooks (in
'hookargs'). However, that information is minimal as we optimise for
passing data to other processes through environment variables. There are
multiple places were we could use more complete and lower level
information locally (eg: cache update, better report of changes to
hooks, etc...).
For this purpose we introduces a 'changes' dictionary on the
transaction. It is intended to track every changes happening to the
repository (eg: new revs, bookmarks move, phases move, obs-markers,
etc).
For now we just adds the 'changes' dictionary. We'll adds more tracking
and usages over time.
Before this patch, if steps below occurs at "the same time in sec",
all of mtime, ctime and size are same between (1) and (3).
1. append data to revlog-style file (and close transaction)
2. discard appended data by truncation of rollback
3. append same size but different data to revlog-style file again
Therefore, cache validation doesn't work after (3) as expected.
To avoid file stat ambiguity around truncation, this patch opens a
file with checkambig=True.
This is a part of ExactCacheValidationPlan.
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
In some cases below, copying from backup is used to restore original
contents of a file, which is backuped via addfilegenerator(). If
copying keeps ctime, mtime and size of a file, restoring is
overlooked, and old contents cached before restoring isn't invalidated
as expected.
- failure of transaction (from '.hg/journal.backup.*')
- rollback of previous transaction (from '.hg/undo.backup.*')
To avoid ambiguity of file stat at restoring, this patch invokes
util.copyfile() with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
Files below, which might be changed at closing transaction, are used
to examine validity of cached properties. If changing keeps ctime,
mtime and size of a file, change is overlooked, and old contents
cached before change isn't invalidated as expected.
- .hg/bookmarks
- .hg/dirstate
- .hg/phaseroots
To avoid ambiguity of file stat, this patch writes files out with
checkambig=True at closing transaction.
checkambig becomes True only at closing (= 'not suffix'), because stat
information of '.pending' file isn't used to examine validity of
cached properties.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
Prevents double usage and helps reduce reference cycles, which
were observed to occur in `hg convert` and other scenarios where
there are multiple transactions per process.
Previously, transaction.close would run the file generators before running the
finalizers (see the list below for what is in each). Since file generators
contain the bookmarks and the dirstate, this meant we made the dirstate and
bookmarks visible to external readers before we actually wrote the commits into
the changelog, which could result in missing bookmarks and missing working copy
parents (especially on servers with high commit throughput, since pulls might
fail to see certain bookmarks in this situation).
By moving the changelog writing to be before the bookmark/dirstate writing, we
ensure the commits are present before they are referenced.
This implementation allows certain file generators to be after the finalizers.
We didn't want to move all of the generators, since it's important that things
like phases actually run before the finalizers (otherwise you could expose
commits as public when they really shouldn't be).
For reference, file generators currently consist of: bookmarks, dirstate, and
phases. Finalizers currently consist of: changelog, revbranchcache, and fncache.
The new transaction context did not handle the case where an exception during
close should still call release. This cause pretxnclose hooks that failed to
cause the transaction to fail without aborting, thus requiring a hg recover.
I've added a test.
This lets us greatly simply acquire/release cycles.
If the block completes without raising an exception, the transaction
is closed.
Code pattern before:
try:
tr = repo.transaction('x')
# zillions of lines of code
tr.close()
finally:
tr.release()
And after:
with tr.transaction('x'):
# ...
After this reordering, absence of '.hg/journal' just before starting
new transaction means also absence of '.hg/journal.backupfiles'.
In this case, all temporary files for preceding transaction should be
completely unlinked, and HG_PENDING doesn't cause unintentional
reading stalled temporary files in.
Otherwise, 'repo.transaction()' raises exception with "run 'hg
recover' to clean up transaction" hint.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
'releasefn' is used by subsequent patch, to do appropriate action
according to the result of it at the end of a transaction scope.
To ensure that 'releasefn' is invoked only once, this patch invokes it
after assignment 'self.journal = None', because such assignment
prevents from invoked 'transaction._abort()' again via '__del__()'.
def __del__(self):
if self.journal:
self._abort()
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.
Mass rewrite all instances of the old octal syntax to the new syntax.
This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
The fix in a795188178a1 is actually wrong. We now use the filename to match what
'_addentry'. This whole untested code is quite suspicious. This seems to point
that no one is ever running 'tr.find' for a backup file.
The 'transaction' object can now be fed a 'validator' function. This function
will be run right before the transaction is closed to validate its content. The
target usage is hooks. The validation function is expected to raise an exception
when it wants to abort the transaction.
This only introduce the idea with a default no-op validator. Actual usage is in
the next changeset.
Once the transaction is closed, we now write transaction related data for
possible future undo. For now, we only do it for full file "backup" because
their were not handle at all in that case. In the future, we could move all the
current logic to set undo up (that currently exists in localrepository) inside
transaction itself, but it is not strictly requires to solve the current
situation.
It is time for the transaction to be responsible for setting up the
undo data. It is necessary to move this logic into the transaction
because many more files are handled now, and the transaction is the
object tracking them all.
The value can be set to None if no undo should be set.
The argument is a string containing the journal name (used as prefix for all
other transaction file). This is not the transaction file itself. So we clarify
this.
Using 'copyfile' (single file) instead of 'copyfiles' (tree) will ensures
destination file will be overwritten. This will prevent some abort if backup
file are left in place for random reason.
It also seems more correct.
Previous transaction work added callbacks to be called during regular
transaction commit/close. As part of refactoring Mozilla's pushlog
extension (an extension that opens a SQLite database and tries to tie
its transaction semantics to Mercurial's transaction), I discovered that
the new transaction APIs were insufficient to avoid monkeypatching
transaction instance internals. Adding a callback that is called during
transaction abort removes the necessity for monkeypatching and completes
the API.
The location variable fetch from the loop and the one used to actually fetch it
mismatched. We fix the name to ensure file outside of store are cleaned up.
This method has the same behavior as the 'os.path.split' function, but having
it in vfs will allow handling of tricky encoding situations in the future.
In the same patch, we replace the use of 'os.path.split' in the transaction code.
The vfs.join method only works for absolute paths. We need something
that works for relative paths too when transforming filenames. Since
os.path.join may misbehave in tricky encoding situations, encapsulate
the new join method in our vfs abstraction. The default implementation
remains os.path.join, but this opens the door to other VFSes doing
something more intelligent based on their needs.
In the same go, we replace the usage of 'os.path.join' in transaction code.
Such file are generated with a .pending prefix. It is up to the reader to
implement the necessary logic for reading pending files.
We add a test to ensure pending files are properly cleaned-up in both success and
error cases.
This will allow us to generate temporary pending files. Files
generated with a suffix are assumed temporary and will be cleaned up
at the end of the transaction.
We are still doing double backups, but now that we have proper
location handling this is less of an issue. Dropping this simplifies
the code before we add some pending-related logic.
This also ensures we actually test the new 'location' mechanism.
The current naming scheme ('journal.backups.<file>') resulted is bad directory
name when 'file' was in a subdirectory. We now extract the directory name and
create the backupfile within it.
We plan to use file in a subdirectory for cachefile.
We do not want to abort if anything wrong happen while handling a cache file.
Cache file have way to be invalidated and if old/bad version stay no
misbehavior will happen. Proper value will eventually be computed and the wrong
will be righten.
This changeset use the transaction reporter (usually writing on stderr) to write
details about failed cache handling. This will only apply to write operation
using a transaction. The usual update during read only operation will stay a
debug message.
I was on the way to bring these message back to debug level when I realised it
could be a feature. People with write access to the repository are likely to
have the power to fix error related to cache (and it is valuable to fix them).
So let the things as is for now.