Commit Graph

147 Commits

Author SHA1 Message Date
Martin von Zweigbergk
e5ad1ba424 util: add base class for transactional context managers
We have at least three types with a close() and a release() method
where the close() method is supposed to be called on success and the
release() method is supposed to be called last, whether successful or
not. Two of them (transaction and dirstateguard) already have
identical implementations of __enter__ and __exit__. Let's extract a
base class for this, so we reuse the code and so the third type
(transactionmanager) can also be used as a context manager.

Differential Revision: https://phab.mercurial-scm.org/D392
2017-07-28 22:42:10 -07:00
FUJIWARA Katsunori
35f554c06a transaction: apply checkambig=True only on limited files for similarity
Now, transaction can determine whether avoidance of file stat
ambiguity is needed for each files, by blacklist "checkambigfiles".

For similarity to truncation in _playback(), this patch apply
checkambig=True only on limited files in code paths below.

  - restoring files by util.copyfile(), in _playback()
    (checkambigfiles itself is examined at first, because it as a
    keyword argument might be None)

  - writing files at finalization of transaction, in _generatefiles()

This patch reduces cost of checking stat at writing out and restoring
files, which aren't filecache-ed.
2017-07-04 23:13:47 +09:00
FUJIWARA Katsunori
ba85a5641c transaction: avoid file stat ambiguity only for files in blacklist
Advancing mtime by os.utime() fails for EPERM, if the target file is
owned by another. 0d920bcb0fd1 and related changes made some code
paths give advancing mtime up in such case, to fix issue5418.

This causes file stat ambiguity (again), if it is owned by another.

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan

To avoid file stat ambiguity in such case, especially for
.hg/dirstate, c75c7b3e3284 made vfs.rename() copy the target file, and
advance mtime of renamed one again, if EPERM (see issue5584 for detail).

But straightforward "copy if EPERM" isn't reasonable for truncation of
append-only files at rollbacking, because rollbacking might cost much
for truncation of many filelogs, even though filelogs aren't
filecache-ed.

Therefore, this patch introduces blacklist "checkambigfiles", and
avoids file stat ambiguity only for files specified in this blacklist.

This patch consists of two parts below, which should be applied at
once in order to avoid regression.

  - specify 'checkambig=True' at vfs.open(mode='a') in _playback()
    according to checkambigfiles

  - invoke _playback() with checkambigfiles
    - add transaction.__init__() checkambigfiles argument, for _abort()
      - make localrepo instantiate transaction with _cachedfiles
    - add rollback() checkambigfiles argument, for "hg rollback/recover"
      - make localrepo invoke rollback() with _cachedfiles

After this patch, straightforward "copy if EPERM" will be reasonable
at closing the file opened with checkambig=True, because this policy
is applied only on files, which are listed in blacklist
"checkambigfiles".
2017-07-04 23:13:46 +09:00
Jun Wu
e38073e90f strip: add a delayedstrip method that works in a transaction
For long, the fact that strip does not work inside a transaction and some
code has to work with both obsstore and fallback to strip lead to duplicated
code like:

      with repo.transaction():
          ....
          if obsstore:
              obsstore.createmarkers(...)
      if not obsstore:
          repair.strip(...)

Things get more complex when you want to call something which may call strip
under the hood. Like you cannot simply write:

      with repo.transaction():
          ....
          rebasemod.rebase(...) # may call "strip", so this doesn't work

But you do want rebase to run inside a same transaction if possible, so the
code may look like:

      with repo.transaction():
          ....
          if obsstore:
              rebasemod.rebase(...)
              obsstore.createmarkers(...)
      if not obsstore:
          rebasemod.rebase(...)
          repair.strip(...)

That's ugly and error-prone. Ideally it's possible to just write:

      with repo.transaction():
          rebasemod.rebase(...)
          saferemovenodes(...)

This patch is the first step towards that. It adds a "delayedstrip" method
to repair.py which maintains a postclose callback in the transaction object.
2017-06-25 10:38:45 -07:00
Jun Wu
e906874924 rebase: clean up rebasestate from active transaction
Previously, rebase assumes the following pattern:

    rebase:
        with transaction as tr: # top-level
            ...
        tr.__close__ writes rebasestate
        unlink('rebasestate')

However it's possible that "rebase" was called inside a transaction:

    with transaction as tr1:
        rebase:
            with transaction as tr2: # not top-level
                ...
            tr2.__close__ does not write rebasestate
            unlink('rebasestate')
    tr1.__close__ writes rebasestate

That leaves a rebasestate on disk incorrectly.

This patch adds "removefilegenerator" to notify transaction code that the
state file is no longer needed therefore fixes the issue.
2017-06-24 21:13:48 -07:00
Gregory Szorc
90a75c9c65 transaction: delete callbacks after use
Before this change, localrepository instances that performed multiple
transactions would leak transaction objects. This could occur when
running `hg convert`. When running `hg convert`, the leak would be
~90 MB per 10,000 changesets as measured with the Mercurial repo itself.

The leak I tracked down involved the "validate" closure from
localrepository.transaction(). It appeared to be keeping a
reference to the original transaction via __closure__. __del__
semantics and a circular reference involving the repo object
may have also come into play.

Attempting to refactor the "validate" closure proved to be
difficult because the "tr" reference in that closure may
reference an object that isn't created until transaction.__init__
is called. And the "validate" closure is passed as an argument to
transaction.__init__. Plus there is a giant warning comment in
"validate" about how hacky it is. I did not want to venture into
the dragon den.

Anyway, we've had problems with transactions causing leaks before.
The solution then (8b23c334b97f) is the same as the solution in this
patch: drop references to callbacks after they are called. This
not only breaks cycles in core Mercurial but can help break cycles
in extensions that accidentally introduce them.

While I only tracked down a leak due to self.validator, since this is
the 2nd time I've tracked down leaks due to transaction callbacks I
figure enough is enough and we should prevent the class of leak from
occurring regardless of the variable. That's why all callback variables
are now nuked.
2017-05-26 13:27:21 -07:00
Martin von Zweigbergk
c3406ac3db cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
2017-02-10 16:56:29 -08:00
Pierre-Yves David
f6cb69a5f6 transaction: introduce "changes" dictionary to precisely track updates
The transaction is already tracking some data intended for hooks (in
'hookargs'). However, that information is minimal as we optimise for
passing data to other processes through environment variables. There are
multiple places were we could use more complete and lower level
information locally (eg: cache update, better report of changes to
hooks, etc...).

For this purpose we introduces a 'changes' dictionary on the
transaction.  It is intended to track every changes happening to the
repository (eg: new revs, bookmarks move, phases move, obs-markers,
etc).

For now we just adds the 'changes' dictionary. We'll adds more tracking
and usages over time.
2017-05-02 18:31:18 +02:00
Jun Wu
8b5d63a796 transaction: use ProgrammingError 2017-03-26 16:59:30 -07:00
FUJIWARA Katsunori
ed5aabb178 transaction: open a file with checkambig=True to avoid file stat ambiguity
Before this patch, if steps below occurs at "the same time in sec",
all of mtime, ctime and size are same between (1) and (3).

  1. append data to revlog-style file (and close transaction)
  2. discard appended data by truncation of rollback
  3. append same size but different data to revlog-style file again

Therefore, cache validation doesn't work after (3) as expected.

To avoid file stat ambiguity around truncation, this patch opens a
file with checkambig=True.

This is a part of ExactCacheValidationPlan.

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
2016-09-22 21:52:00 +09:00
FUJIWARA Katsunori
f06844e1ef transaction: avoid ambiguity of file stat at restoring from backup
In some cases below, copying from backup is used to restore original
contents of a file, which is backuped via addfilegenerator(). If
copying keeps ctime, mtime and size of a file, restoring is
overlooked, and old contents cached before restoring isn't invalidated
as expected.

  - failure of transaction (from '.hg/journal.backup.*')
  - rollback of previous transaction (from '.hg/undo.backup.*')

To avoid ambiguity of file stat at restoring, this patch invokes
util.copyfile() with checkambig=True.

This patch is a part of "Exact Cache Validation Plan":

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
2016-06-13 05:11:56 +09:00
FUJIWARA Katsunori
1d5601ef04 transaction: avoid ambiguity of file stat at closing transaction
Files below, which might be changed at closing transaction, are used
to examine validity of cached properties. If changing keeps ctime,
mtime and size of a file, change is overlooked, and old contents
cached before change isn't invalidated as expected.

  - .hg/bookmarks
  - .hg/dirstate
  - .hg/phaseroots

To avoid ambiguity of file stat, this patch writes files out with
checkambig=True at closing transaction.

checkambig becomes True only at closing (= 'not suffix'), because stat
information of '.pending' file isn't used to examine validity of
cached properties.

This patch is a part of "Exact Cache Validation Plan":

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
2016-06-03 00:44:20 +09:00
Pierre-Yves David
b6f7b74a14 style: remove namespace class
For better or worse, our coding do not use use class for pure namespacing. We
remove the class introduced in e0db55ecbc14.
2016-04-16 16:01:24 -07:00
Pierre-Yves David
1f0c33027d style: don't use capital letter for constant
For better or worse, our coding do not use all caps for constants. We rename
constant name introduced in e0db55ecbc14.
2016-04-16 15:59:30 -07:00
Gregory Szorc
2a792d7c94 transaction: clear callback instances after usage
Prevents double usage and helps reduce reference cycles, which
were observed to occur in `hg convert` and other scenarios where
there are multiple transactions per process.
2016-04-16 09:02:37 -07:00
Durham Goode
67594c134c transaction: allow running file generators after finalizers
Previously, transaction.close would run the file generators before running the
finalizers (see the list below for what is in each). Since file generators
contain the bookmarks and the dirstate, this meant we made the dirstate and
bookmarks visible to external readers before we actually wrote the commits into
the changelog, which could result in missing bookmarks and missing working copy
parents (especially on servers with high commit throughput, since pulls might
fail to see certain bookmarks in this situation).

By moving the changelog writing to be before the bookmark/dirstate writing, we
ensure the commits are present before they are referenced.

This implementation allows certain file generators to be after the finalizers.
We didn't want to move all of the generators, since it's important that things
like phases actually run before the finalizers (otherwise you could expose
commits as public when they really shouldn't be).

For reference, file generators currently consist of: bookmarks, dirstate, and
phases. Finalizers currently consist of: changelog, revbranchcache, and fncache.
2016-04-07 14:10:49 -07:00
Durham Goode
64267e6e4d transaction: abort transaction during hook exception
The new transaction context did not handle the case where an exception during
close should still call release. This cause pretxnclose hooks that failed to
cause the transaction to fail without aborting, thus requiring a hg recover.

I've added a test.
2016-01-19 15:18:21 -08:00
Bryan O'Sullivan
31c9d97ad8 transaction: turn a transaction into a Python context manager
This lets us greatly simply acquire/release cycles.

If the block completes without raising an exception, the transaction
is closed.

Code pattern before:

  try:
    tr = repo.transaction('x')
    # zillions of lines of code
    tr.close()
  finally:
    tr.release()

And after:

  with tr.transaction('x'):
    # ...
2016-01-15 13:14:47 -08:00
Martin von Zweigbergk
6a744a4d74 transaction: remove 'if True:'
This seems to be left over from 5ae82d2dc4c0 (transaction: reorder
unlinking .hg/journal and .hg/journal.backupfiles, 2015-10-16).
2016-01-06 11:12:09 -08:00
Matt Mackall
cf710e0ab1 spelling: fix typo in transaction error messages 2015-10-17 15:28:02 -05:00
FUJIWARA Katsunori
56c4c4cb1b transaction: reorder unlinking .hg/journal and .hg/journal.backupfiles
After this reordering, absence of '.hg/journal' just before starting
new transaction means also absence of '.hg/journal.backupfiles'.

In this case, all temporary files for preceding transaction should be
completely unlinked, and HG_PENDING doesn't cause unintentional
reading stalled temporary files in.

Otherwise, 'repo.transaction()' raises exception with "run 'hg
recover' to clean up transaction" hint.
2015-10-16 03:29:51 +09:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
FUJIWARA Katsunori
47524f74ef transaction: add releasefn to notify the end of a transaction scope
'releasefn' is used by subsequent patch, to do appropriate action
according to the result of it at the end of a transaction scope.

To ensure that 'releasefn' is invoked only once, this patch invokes it
after assignment 'self.journal = None', because such assignment
prevents from invoked 'transaction._abort()' again via '__del__()'.

    def __del__(self):
        if self.journal:
            self._abort()
2015-10-09 03:53:46 +09:00
Gregory Szorc
04a726f0c8 transaction: use absolute_import 2015-08-08 20:10:23 -07:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
2015-06-23 22:20:08 -07:00
Gregory Szorc
3aa1c73868 global: mass rewrite to use modern octal syntax
Python 2.6 introduced a new octal syntax: "0oXXX", replacing "0XXX". The
old syntax is not recognized in Python 3 and will result in a parse
error.

Mass rewrite all instances of the old octal syntax to the new syntax.

This patch was generated by `2to3 -f numliterals -w -n .` and the diff
was selectively recorded to exclude changes to "<N>l" syntax conversion,
which will be handled separately.
2015-06-23 22:30:33 -07:00
Matt Mackall
1a860a7827 merge with stable 2015-05-27 17:41:42 -05:00
Pierre-Yves David
e78e0b6b0c transaction: really fix _addbackupentry key usage (issue4684)
The fix in a795188178a1 is actually wrong. We now use the filename to match what
'_addentry'. This whole untested code is quite suspicious. This seems to point
that no one is ever running 'tr.find' for a backup file.
2015-05-26 13:02:28 -07:00
Pierre-Yves David
1dde6a5eaf transaction: use the proper variable in '_addbackupentry' (issue4684)
The 'file' variable was undefined but resolved to the 'file' built-in.
This is why pylint complains about overwriting built-ins...
2015-05-22 12:13:18 -05:00
Matt Mackall
51fd20a56a merge with stable 2015-05-23 15:55:04 -05:00
Michael O'Connor
d4eb35d7ec transaction: add missing newline to message
Add a missing newline to the "journal was created by a
different version of Mercurial" message.
2015-04-14 10:59:26 -04:00
Pierre-Yves David
7e5ddaf348 recover: catch any exception, not just Exception
We want recover to be rock solid.
2015-05-18 15:38:24 -05:00
Pierre-Yves David
4c1ee6d2d1 transaction: add a validation stage
The 'transaction' object can now be fed a 'validator' function. This function
will be run right before the transaction is closed to validate its content. The
target usage is hooks. The validation function is expected to raise an exception
when it wants to abort the transaction.

This only introduce the idea with a default no-op validator. Actual usage is in
the next changeset.
2015-03-09 22:43:36 -07:00
Pierre-Yves David
be1b3a5e03 transaction: include backup file in the "undo" transaction
Once the transaction is closed, we now write transaction related data for
possible future undo. For now, we only do it for full file "backup" because
their were not handle at all in that case. In the future, we could move all the
current logic to set undo up (that currently exists in localrepository) inside
transaction itself, but it is not strictly requires to solve the current
situation.
2015-01-16 18:34:14 -08:00
Pierre-Yves David
8984d47c1b transaction: pass the name of the "undo" journal to the transaction
It is time for the transaction to be responsible for setting up the
undo data. It is necessary to move this logic into the transaction
because many more files are handled now, and the transaction is the
object tracking them all.

The value can be set to None if no undo should be set.
2015-01-16 19:35:04 -08:00
Pierre-Yves David
03d5cdc6b3 transaction: clarify the name of 'journal' argument for transaction
The argument is a string containing the journal name (used as prefix for all
other transaction file). This is not the transaction file itself. So we clarify
this.
2015-01-16 14:54:24 -08:00
Pierre-Yves David
09bcff37bf transaction: use 'util.copyfile' for creating backup
Using 'copyfile' (single file) instead of 'copyfiles' (tree) will ensures
destination file will be overwritten. This will prevent some abort if backup
file are left in place for random reason.

It also seems more correct.
2015-01-05 12:44:15 -08:00
Gregory Szorc
433ea5a1b2 transaction: support for callbacks during abort
Previous transaction work added callbacks to be called during regular
transaction commit/close. As part of refactoring Mozilla's pushlog
extension (an extension that opens a SQLite database and tries to tie
its transaction semantics to Mercurial's transaction), I discovered that
the new transaction APIs were insufficient to avoid monkeypatching
transaction instance internals. Adding a callback that is called during
transaction abort removes the necessity for monkeypatching and completes
the API.
2015-01-06 21:56:33 -08:00
Pierre-Yves David
c56675c8d2 transaction: use the right location when cleaning up backup file (issue4479)
The location variable fetch from the loop and the one used to actually fetch it
mismatched. We fix the name to ensure file outside of store are cleaned up.
2015-01-05 15:00:02 -08:00
Pierre-Yves David
8985258968 vfs: add a 'split' method
This method has the same behavior as the 'os.path.split' function, but having
it in vfs will allow handling of tricky encoding situations in the future.

In the same patch, we replace the use of 'os.path.split' in the transaction code.
2014-12-15 13:32:34 -08:00
Pierre-Yves David
329429e77c vfs: add a 'reljoin' function for joining relative paths
The vfs.join method only works for absolute paths. We need something
that works for relative paths too when transforming filenames. Since
os.path.join may misbehave in tricky encoding situations, encapsulate
the new join method in our vfs abstraction. The default implementation
remains os.path.join, but this opens the door to other VFSes doing
something more intelligent based on their needs.

In the same go, we replace the usage of 'os.path.join' in transaction code.
2014-12-15 13:27:46 -08:00
Mads Kiilerich
b420dd92b1 spelling: fixes from proofreading of spell checker issues 2014-04-17 22:47:38 +02:00
Pierre-Yves David
511030e86e transaction: remove the 'onabort' mechanism
It has no known users. If someones needs similar functionality, a new 'addabort'
method similar to 'addfinalize' should be added.
2014-12-04 13:52:46 -08:00
Pierre-Yves David
61ff2e647d transaction: remove the redundant 'onclose' mechanism
It is superseded by the 'addfinalize' function and all its user have been
migrated.
2014-12-04 13:51:41 -08:00
Pierre-Yves David
3ace7493d7 transaction: write pending generated files
Such file are generated with a .pending prefix. It is up to the reader to
implement the necessary logic for reading pending files.

We add a test to ensure pending files are properly cleaned-up in both success and
error cases.
2014-10-17 22:19:05 -07:00
Pierre-Yves David
58e32f1eeb transaction: have _generatefile return a boolean
The function returns True if any files were generated. This will be
used to know if any pending files have been written.
2014-10-17 21:57:32 -07:00
Pierre-Yves David
81a1fe4d5b transaction: allow generating files with a suffix
This will allow us to generate temporary pending files. Files
generated with a suffix are assumed temporary and will be cleaned up
at the end of the transaction.
2014-09-29 01:29:08 -07:00
Matt Mackall
3f845e51cb transaction: fix some docstring grammar 2014-11-19 09:52:05 -06:00
Pierre-Yves David
ecac877d99 transaction: accept a 'location' argument for registertmp
This will allow generation of temporary files outside of store. This will be
useful for bookmarks.
2014-11-12 14:57:41 +00:00
Pierre-Yves David
46df0be40b transaction: drop special handling for phases and bookmarks generation
We are still doing double backups, but now that we have proper
location handling this is less of an issue. Dropping this simplifies
the code before we add some pending-related logic.

This also ensures we actually test the new 'location' mechanism.
2014-11-12 14:47:48 +00:00