I can never remember the differences between the various revset
APIs. I can never remember that scmutil.revrange() is the one I
want to use from user-facing commands.
Add some documentation to clarify this.
While we're here, the argument name for revrange() is changed to
"specs" because that's what it actually is.
To make it easier to patch the wrapped function, make it possible to access the
filecache descriptor directly on the class (rather than have to use
ClassObject.__dict__['attributename']). Returning `self` when the first
argument to `__get__` is `None` makes the descriptor behave the same way
`property` objects do.
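The behaviour described above can be sketched with a toy caching descriptor. This is only an illustration: `cachedattr` and `Repo` are hypothetical names, and the class is far simpler than the real filecache.

```python
class cachedattr:
    """Toy caching descriptor (hypothetical; not Mercurial's filecache)."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: hand back the descriptor itself,
            # the same way `property` objects do.
            return self
        # Accessed on an instance: compute once and cache in __dict__.
        value = obj.__dict__[self.name] = self.func(obj)
        return value


class Repo:
    @cachedattr
    def changelog(self):
        return ["rev0", "rev1"]


# The descriptor is reachable without going through Repo.__dict__.
descriptor = Repo.changelog

# Wrapping the underlying function is now straightforward.
original = descriptor.func
descriptor.func = lambda repo: original(repo) + ["rev2"]
```

Without the `obj is None` branch, `Repo.changelog` would try to call the wrapped function with `None`, which is why callers previously had to reach into the class `__dict__`.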
Rollback of the previous transaction restores the contents of the files below
by renaming from the 'undo.*' files. If renaming keeps the ctime, mtime and
size of a file, the restore is overlooked, and old contents cached before
restoring aren't invalidated as expected.
- .hg/bookmarks
- .hg/phaseroots
To avoid ambiguity of file stat at restoring, this patch invokes
vfs.rename() with checkambig=True.
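A rough sketch of the idea behind an ambiguity-checking rename follows. It is not the actual vfs implementation: the function name `rename_checkambig`, the one-second ctime comparison, and the "advance mtime by one second" remedy are all assumptions made for illustration.

```python
import os
import tempfile


def rename_checkambig(src, dst):
    """Rename src over dst; return True if mtime had to be advanced.

    Hypothetical helper: treats the new file as "ambiguous" when its
    ctime (at one-second resolution) equals that of the file it replaced.
    """
    try:
        old = os.stat(dst)
    except FileNotFoundError:
        old = None
    os.replace(src, dst)
    if old is None:
        return False  # nothing was replaced, so nothing can be confused
    new = os.stat(dst)
    if int(new.st_ctime) != int(old.st_ctime):
        return False  # ctime differs, so stat-based validation can tell
    # Same ctime: force a visible difference by advancing mtime.
    advanced = (int(old.st_mtime) + 1) & 0x7FFFFFFF
    os.utime(dst, (advanced, advanced))
    return True
```

With this approach a cache that compares stat data before and after the restore sees a change even when the restored file has the same size and was written within the same second.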
BTW, .hg/dirstate is also restored at rollback. But it is restored by
dirstate.restorebackup(), and previous patch already made it invoke
vfs.rename() with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
All versions of Python we support or hope to support make the hash
functions available in the same way under the same name, so we may as
well drop the util forwards.
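For illustration, calling the standard library directly looks like this. The helper name `sha1hex` is hypothetical and only exists to keep the example testable.

```python
import hashlib


def sha1hex(data: bytes) -> str:
    # hashlib exposes sha1 under the same name on every supported Python,
    # so no forwarding wrapper is needed.
    return hashlib.sha1(data).hexdigest()
```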
If a commit contains only an executable-bit change and you amend it twice, the
change vanishes:
* After the first amend the commit has the proper executable bit set in the
manifest, but the file is missing from the list of files in the changelog.
* The second amend reads the wrong list of files from the changelog and
copies the manifest entry for this file from the parent.
* Voila! The change is lost.
This change repairs the bug in localrepo causing this and adds a test for it.
This is one step towards having dirstate manage its own storage. It will
be useful for the implementation of sql dirstate [1].
This introduced a small test change: now we always write the dirstate before
saving backup so in some cases where dirstate file didn't exist yet
savebackup can create it.
[1] https://www.mercurial-scm.org/wiki/SQLDirstatePlan
This is one step towards having dirstate manage its own storage. It will
be useful for the implementation of sqldirstate [1].
I'm deleting two of the dirstate.invalidate() calls in localrepo because
restorebackup method does that for us.
[1] https://www.mercurial-scm.org/wiki/SQLDirstatePlan
We have been warning about transactions without locks for about a year (and
three releases), so third party extensions have had a fair grace period to fix
their code. We are now moving lack of locking to a hard failure in order to
protect users against repository corruption.
prepushoutgoinghook was introduced in 8dfcd476a7f7 and largefiles is the only
in-tree use of it. Refactor it to be more useful for other use cases in
largefiles.
This patch extracts the code for determining requirements for a new
repo into a standalone function. By doing so, future code that will
perform an in-place repository upgrade (e.g. to generaldelta) can
examine the set of proposed new requirements and possibly take
additional actions (such as adding dotencode or fncache) when
performing the upgrade.
This patch is marked as API because _baserequirements (which was added
in 87e5a9d0fbb1 so extensions could override it) has been removed and
will presumably impact whatever extension it was added for. Consumers
should be able to monkeypatch the new function to achieve the same
functionality.
The "create" argument has been dropped because the function is only
called in one location and "create" is always true in that case.
While it makes logical sense for this code to be a method so extensions
can implement a custom repo class / method to override it, this won't
actually work. This is because requirements determination occurs during
localrepository.__init__ and this is before the "reposetup"
"callback" is fired. So, the only way for extensions to customize
requirements would be to overwrite localrepo.localrepository or to
monkeypatch a function on a module during extsetup(). Since we try to
keep localrepository small, we use a standalone function. There is
probably room to offer extensions a "hook" point to alter repository
creation. But that is scope bloat.
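The monkeypatching approach mentioned above might look roughly like the sketch below. Everything here is a stand-in: `fakemodule`, `newrequirements`, the config keys and the "myext-feature" requirement are hypothetical, and the real function's name and signature may differ.

```python
import types

# Stand-in for the module that exposes the standalone function.
fakemodule = types.SimpleNamespace()


def newrequirements(config):
    """Return the set of requirements for a new repo (toy version)."""
    requirements = {"revlogv1"}
    if config.get("usestore", True):
        requirements.add("store")
        if config.get("usefncache", True):
            requirements.add("fncache")
    return requirements


fakemodule.newrequirements = newrequirements


def extsetup():
    """What an extension could do at setup time (hypothetical)."""
    orig = fakemodule.newrequirements

    def wrapped(config):
        requirements = orig(config)
        requirements.add("myext-feature")
        return requirements

    fakemodule.newrequirements = wrapped
```

Because the function is looked up on the module at call time, replacing the module attribute during `extsetup()` takes effect before any repository object is constructed.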
A future patch will refactor requirements determination into a
standalone function. To prepare for this, refactor the requirements
code to assign to a local variable instead of to self.requirements.
Before, the hook() closure (which is called as part of locking hooks)
would maintain a reference to a transaction instance (which should be
finalized by the time lock hooks are called). Because we accumulate
hook() instances when there are multiple transactions per lock, this
would result in holding references to the transaction instances which
would lead to higher memory utilization.
Creating a reference to the hook arguments dict minimizes the number
of objects that are kept alive until the lock release hook runs,
minimizing memory "leaks."
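The difference between the two closures can be demonstrated with a weak reference. This is a simplified model: `Transaction`, `make_hook_leaky` and `make_hook_lean` are hypothetical names, not the code touched by the patch.

```python
import gc
import weakref


class Transaction:
    def __init__(self):
        self.hookargs = {"txnid": "TXN:1"}


def make_hook_leaky(tr):
    def hook():
        return dict(tr.hookargs)  # closes over the whole transaction
    return hook


def make_hook_lean(tr):
    hookargs = tr.hookargs  # keep only the dict alive

    def hook():
        return dict(hookargs)
    return hook


def transaction_survives(factory):
    """Return (is the transaction still alive?, hook result)."""
    tr = Transaction()
    ref = weakref.ref(tr)
    hook = factory(tr)
    del tr
    gc.collect()
    return ref() is not None, hook()
```

Both hooks return the same arguments, but only the second lets the transaction object be freed while the hook is still waiting for the lock release.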
This further centralizes the handling of bookmark storage, and will
help get some lingering bookmarks business out of localrepo. Right
now, this change means that reading the active bookmark also implies
reading all bookmarks from disk. For users with a great many bookmarks
this may be a measurable performance hit. In that case, we should
migrate bmstore to be able to lazy-read its properties from disk
rather than having to eagerly read them, but I decided to avoid doing
that to try and avoid some potentially complicated filecache decorator
issues.
This doesn't move the logic for writing the active bookmark into a
transaction, though that is probably the correct next step. Since the
API probably needs to morph a little more, I didn't bother marking
bookmarks.{activate,deactivate} as deprecated yet.
The 'peer.known' call (handled at the repository level) was applying its own
manual filtering (looking at phases) instead of relying on the repoview
mechanism. This led to discovery finding more "common" nodes than
'getbundle' was willing to recognise. From there, bad things happen; issue4982
is a symptom of it. While situations like the one described in issue4982 can still
happen because of race conditions, fixing 'peer.known' is important for
consistency in all cases.
We update the code to use 'repoview' filtering. This leads to small changes in
the tests for exchanging obsolescence markers because the discovery yields
different results.
The test affected in 'test-obsolete-changeset-exchange.t' is a test for
issue4982 getting back to its expected state.
If acquisition of wlock has to wait for another "hg commit" process to
release it, dirstate will refer to the newly committed revision after
acquisition of wlock.
At that time, '00changelog.i' on the filesystem contains this new
revision, but in-memory 'repo.changelog' doesn't, if it is cached
without store lock (slock) before updating by another "hg commit".
This makes validating parents at re-loading 'repo.dirstate' from
'.hg/dirstate' replace such new revision with 'nullid'. Then,
'localrepository.commit()' creates "orphan" revision (see issue4368
for detail).
ec817376a322 makes 'commands.commit()' acquire both wlock and slock
before processing to avoid this issue at "hg commit".
But a similar issue can occur even after ec817376a322, if a 3rd party
extension:
- refers to 'repo.changelog' outside wlock scope, and
- invokes 'repo.commit()' directly (instead of 'commands.commit()')
This patch makes 'commit()' acquire slock before processing, so that it
refers to a recent changelog when validating parents of 'repo.dirstate'.
The function was dropped in d5d613de0f44. This API drop broke three of my
extensions, including some critical to my workflow like tortoisehg. Let's mark
this API for death and give people time to fix their code.
Previously we were only checking modified files for their resolve state. But a
file might be unresolved yet not in the modified state. Handle all such cases
properly.
Auditors keep a cache of audited paths. Therefore we cannot use the same
auditor for working copy and history operations. We create a new one without
file system checks for this purpose.
localrepo.parents() has relatively few users, and most of those were
actually implicitly looking at the wctx, which is now made explicit
via repo[None].
'repo.invalidate()' deletes 'filecache'-ed properties by
'filecache.__delete__()' below via 'delattr(unfiltered, k)'. But
cached objects are still kept in 'repo._filecache'.
    def __delete__(self, obj):
        try:
            del obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name)
If 'repo' object is reused even after failure of command execution,
referring 'filecache'-ed property may reuse one kept in
'repo._filecache', even if reloading from a file is expected.
Executing command sequence on command server is a typical case of this
situation (e0a0f9ad3e4c also tried to fix this issue). For example:
1. start a command execution
2. 'changelog.delayupdate()' is invoked in a transaction scope
   This replaces own 'opener' by '_divertopener()' for additional
   accessing to '00changelog.i.a' (aka "pending file").
3. transaction is aborted, and command (1) execution is ended
   After 'repo.invalidate()' at releasing store lock, changelog
   object above (= 'opener' of it is still replaced) is deleted from
   'repo.__dict__', but still kept in 'repo._filecache'.
4. start next command execution with same 'repo'
5. referring 'repo.changelog' may reuse changelog object kept in
   'repo._filecache' according to timestamp of '00changelog.i'
   '00changelog.i' is truncated at transaction failure (even though
   this truncation is unintentional one, as described later), and
   'st_mtime' of it is changed. But 'st_mtime' doesn't have enough
   resolution to always detect this truncation, and invalid
   changelog object kept in 'repo._filecache' is reused
   occasionally.
   Then, "No such file or directory" error occurs for
   '00changelog.i.a', which is already removed at (3).
This patch discards objects in '_filecache' other than dirstate at
transaction failure.
Changes in 'invalidate()' can't be simplified to 'self._filecache =
{}', because 'invalidate()' should keep dirstate in 'self._filecache'.
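The selective discard described above can be sketched as follows. The helper name `discardcached` and its `keep` parameter are hypothetical; the real change lives inside 'invalidate()' and operates on the repo object.

```python
def discardcached(filecache, keep=("dirstate",)):
    """Drop every cached entry except the ones named in `keep`.

    Mutates the mapping in place, so other references to it see the
    same result (unlike rebinding it to a fresh empty dict).
    """
    for name in list(filecache):  # copy the keys: we delete while iterating
        if name not in keep:
            del filecache[name]
    return filecache
```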
'repo.invalidate()' at "hg qpush" failure is removed in this patch,
because now it is redundant.
This patch doesn't make 'repo.invalidate()' always discard objects in
'_filecache', because 'repo.invalidate()' is invoked also at unlocking
store lock.
- "always discard objects in filecache at unlocking" may cause
serious performance problem for subsequent procedures at normal
execution
- but it is impossible to "discard objects in filecache at unlocking
only at failure", because 'releasefn' of lock can't know whether a
lock scope is terminated normally or not
BTW, might using the "with" statement described in PEP 343 for locks
resolve this?
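As a sketch of why the "with" statement could help: a context manager's `__exit__` receives the exception information, so it can tell a normal exit from a failure, which a plain release callback cannot. The class `scopedlock` and its two callbacks are hypothetical and do not reflect the existing lock API.

```python
class scopedlock:
    """Toy lock scope that distinguishes success from failure."""

    def __init__(self, onsuccess, onfailure):
        self.onsuccess = onsuccess
        self.onfailure = onfailure

    def __enter__(self):
        return self

    def __exit__(self, exctype, excvalue, traceback):
        if exctype is None:
            self.onsuccess()  # scope ended normally: keep caches
        else:
            self.onfailure()  # scope aborted: caches could be discarded
        return False  # never swallow the exception
```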
After this patch, truncation of '00changelog.i' still occurs at
transaction failure, even though newly added revisions exist only in
'00changelog.i.a' and size of '00changelog.i' isn't changed by this
truncation.
Updating 'st_mtime' of '00changelog.i' implied by this redundant
truncation also affects cache behavior as described above.
This will be fixed by dropping '00changelog.i' at aborting from the
list of files to be truncated in transaction.
This patch centralizes passing HG_PENDING to external hook process
into '_exthook()'. To make in-memory changes visible to external hook
process, this patch does:
- write (or schedule to write) in-memory dirstate changes, and
- set HG_PENDING environment variable, if:
- a transaction is running, and
- there are in-memory changes to be visible
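The two conditions above can be expressed as a small environment-building helper. This is a sketch only: the function name `hookenv`, its parameters, and the choice of the repository root as the value of HG_PENDING are assumptions, not the code of '_exthook()'.

```python
import os


def hookenv(reporoot, txnrunning, haspending, baseenv=None):
    """Build the environment for an external hook process (toy version)."""
    env = dict(os.environ if baseenv is None else baseenv)
    # Expose pending changes only when both conditions hold.
    if txnrunning and haspending:
        env["HG_PENDING"] = reporoot
    return env
```

The resulting mapping could then be passed as the `env` argument of `subprocess.run()` when spawning the hook.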
This patch tests some commands with some hooks, because transaction
activity of a same hook differs from each other ("---": "not tested").
======== ========= ========= ============
command  preupdate precommit pretxncommit
======== ========= ========= ============
unshelve o         ---       ---
backout  x         ---       ---
import   ---       o         o
qrefresh ---       x         o
======== ========= ========= ============
Each hook is examined separately to prevent in-memory changes from
being visible to the external process accidentally, as a side effect of
hooks invoked previously.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This code will not currently be activated because there's no code to mark
files as driver-resolved in core. This point is also somewhat hard to plug into
from extensions.
Before this patch, making a commit on a local repo could move a bookmark and
both operations would not be grouped as one transaction. This patch makes both
operations part of one transaction. This is necessary to switch to the new api
to save bookmarks repo._bookmarks.recordchange if we don't want to change the
current behavior of rollback.
Dirstate change happening after the commit is done is now part of the
transaction mentioned above. This leads to a change in the expected output of
several tests.
The change to test-fncache happens because both locks are now released in the
same finally clause. The lock release is made explicitly buggy in this test.
Previously releasing lock would crash triggering release of wlock that crashes
too. Now lock release crash does not directly result in the release of wlock.
Instead wlock is released at garbage collection time and the error raised at
that time "confuses" python.
This patch delays writing in-memory changes out, if transaction is
running.
'_getfsnow()' is defined as a function, to hook it easily for
ambiguous timestamp tests (see also fakedirstatewritetime.py)
'if tr:' code path in this patch is still disabled at this revision,
because there is no client invoking 'dirstate.write()' with repo
object.
BTW, this patch changes 'dirstate.invalidate()' semantics around
'dirstate.write()' in a transaction scope:
before:
    with repo.transaction():
        dirstate.CHANGE('A')
        dirstate.write() # change for A is written out here
        dirstate.CHANGE('B')
        dirstate.invalidate() # discards only change for B
after:
    with repo.transaction():
        dirstate.CHANGE('A')
        dirstate.write() # change for A is still kept in memory
        dirstate.CHANGE('B')
        dirstate.invalidate() # discards changes for A and B
Fortunately, there is no code path expecting the former, at least, in
Mercurial itself, because 'dirstateguard' was introduced to remove
such 'dirstate.invalidate()'.
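The semantic change can be reproduced with a toy model. `ToyDirstate` and `run` are hypothetical, and a plain list of callbacks stands in for the transaction; the real classes are considerably more involved.

```python
class ToyDirstate:
    """Toy model (hypothetical); not Mercurial's dirstate class."""

    def __init__(self):
        self.ondisk = set()   # changes already written out
        self.pending = set()  # in-memory changes not yet written

    def change(self, name):
        self.pending.add(name)

    def write(self, tr=None):
        if tr is not None:
            # A transaction is running: delay the write until it closes.
            tr.append(self._flush)
            return
        self._flush()

    def _flush(self):
        self.ondisk |= self.pending
        self.pending = set()

    def invalidate(self):
        # Drop every in-memory change that has not reached the disk.
        self.pending = set()


def run(delayed):
    ds = ToyDirstate()
    tr = [] if delayed else None  # list of callbacks stands in for a transaction
    ds.change("A")
    ds.write(tr)
    ds.change("B")
    ds.invalidate()
    for callback in tr or []:
        callback()  # "close" the transaction
    return ds.ondisk
```

With an immediate write, the change for A reaches the disk before the invalidation; with a delayed write, both A and B are discarded.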
'localrepository.rollback()' explicitly restores dirstate only if at
least one of the current parents of the working directory is removed by
the rollback (a.k.a. "parent-gone").
After DirstateTransactionPlan, 'dirstate.write()' will cause marking
'.hg/dirstate' as a file to be restored at rollbacking.
https://mercurial.selenic.com/wiki/DirstateTransactionPlan
Then, 'transaction.rollback()' restores '.hg/dirstate' regardless of
parents of the working directory at that time, and this causes
unexpected dirstate changes if not "parent-gone" (e.g. "hg update" to
another branch after "hg commit" or so, then "hg rollback").
To avoid such situation, this patch restores dirstate to one before
rollbacking if not "parent-gone".
before:
  b1. restore dirstate explicitly, if "parent-gone"
after:
  a1. save dirstate before actual rollbacking via dirstateguard
  a2. restore dirstate via 'transaction.rollback()'
  a3. if "parent-gone"
      - discard backup (a1)
      - restore dirstate from 'undo.dirstate'
  a4. otherwise, restore dirstate from backup (a1)
Even though restoring dirstate at (a3) after (a2) seems redundant,
this patch keeps this existing code path, because:
- it isn't ensured that 'dirstate.write()' was invoked at least once
while transaction running
If not, '.hg/dirstate' isn't restored at (a2).
In addition to it, rude 3rd party extension invoking
'dirstate.write()' without 'repo' while transaction running (see
subsequent patches for detail) may break consistency of a file
backup-ed by transaction.
- this patch mainly focuses on changes for DirstateTransactionPlan
Restoring dirstate at (a3) should be cheap enough compared with the
rollback itself. The redundancy will be removed in the next step.
Newly added test is almost meaningless at this point. It will be used
to detect regression while implementing delayed dirstate write out.
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
Before this patch, in-memory dirstate changes are still kept over a
transaction scope boundary regardless of the result of it.
For "all or nothing" policy of the transaction, in-memory dirstate
changes should be:
- written out at successful closing a transaction, because
subsequent 'dirstate.invalidate()' can lose them
- discarded at failure of a transaction, because outer
'wlock.release()' or so may write them out
To discard all changes in a transaction completely, this patch also
restores '.hg/dirstate' by '.hg/journal.dirstate' at failure, because
'transaction' itself does nothing for files related to '.hg/journal.*'
in such case (therefore, renaming in this patch is safe enough).
This is a part of preparations for "transactional dirstate". See also
the wiki page below for detail about it.
https://mercurial.selenic.com/wiki/DirstateTransactionPlan
This patch also removes redundant 'dirstate.invalidate()' just before
aborting a transaction for shelve/unshelve.
Review feedback from Pierre-Yves David. A separate line of work is working to
ensure that dirstate writes are written to a separate 'pending' file while a
transaction is active. Lock inheritance currently conflicts with that, so dodge
the issue by simply preventing inheritance while a transaction is running.
Custom merge drivers aren't going to run inside a transaction, so this doesn't
affect that.
This will be useful to pass around a reference to the lock to some functions
we're going to add to scmutil. We don't want those functions to live in
localrepo to avoid bloat.