Coming new bundle2 parts related to bookmark will use a binary encoding. It
encodes a series of '(bookmark, node)' pairs. Bookmark name has a high enough
size limit to not be affected by issue5165. (64K length, we are well covered)
The new 'txnclose-bookmark' hook expose the bookmark movement information
stored in 'tr.changes['bookmarks]'. To provide a simple and straightforward
hook API to the users, we introduce a new hook called for each bookmark
touched. Since a transaction can affect multiple bookmarks, updating the
existing 'txnclose' hook to expose that information would be more complex. The
data for all moves might not fit in environment variables and iterations over
each move would be cumbersome. So the introduction of a new dedicated hook is
preferred in this changeset.
This does not exclude the addition to the full bookmark information to the
existing 'txnclose' in the future to help write more complex hooks.
The transaction has now a 'bookmarks' dictionary in tr.changes. The structure
of the dictionary is {BOOKMARK_NAME: (OLD_NODE, NEW_NODE)}. If a bookmark is
deleted NEW_NODE will be None. If a bookmark is created OLD_NODE will be None.
If the bookmark is updated multiple time, the initial value is preserved.
Now that we have migrated all in-core caller of 'recordchange' to
'applychanges', deprecate 'recordchange' so external callers will move to the
new unified method.
checkconflict used to also do some bookmark deletion in case of divergence. It
is a bit suspicious given the function name, but it's not the goal of this
series.
In order to unify bookmarks changing, checkconflict now return the list of
divergent bookmarks to clean up and the callers must clean them by calling
applyphases.
We want to use applychanges in order to unify bookmark movement. We need a way
to compute divergence deletion without actually removing them.
We split the function in two in this patch while we migrate the existing users
of this code on next patches.
This is done by a script [2] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
Since RedBaron is not confident about how to indent things [2].
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import os
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
_configmethods = {'config', 'configbool', 'configint', 'configbytes',
'configlist', 'configdate'}
def extractstring(rnode):
"""get the string from a RedBaron string or call_argument node"""
while rnode.type != 'string':
rnode = rnode.value
return rnode.value[1:-1] # unquote, "'str'" -> "str"
def uiconfigitems(red):
"""match *.ui.config* pattern, yield (node, method, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
obj = node[-3].value
method = node[-2].value
args = node[-1]
section = args[0].value
name = args[1].value
if (obj in ('ui', 'self') and method in _configmethods
and section.type == 'string' and name.type == 'string'):
entry = (node, method, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def coreconfigitems(red):
"""match coreconfigitem(...) pattern, yield (node, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
args = node[1]
section = args[0].value
name = args[1].value
if (node[0].value == 'coreconfigitem' and section.type == 'string'
and name.type == 'string'):
entry = (node, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def registercoreconfig(cfgred, section, name, defaultrepr):
"""insert coreconfigitem to cfgred AST
section and name are plain string, defaultrepr is a string
"""
# find a place to insert the "coreconfigitem" item
entries = list(coreconfigitems(cfgred))
for node, args, nodesection, nodename in reversed(entries):
if (nodesection, nodename) < (section, name):
# insert after this entry
node.insert_after(
'coreconfigitem(%r, %r,\n'
' default=%s,\n'
')' % (section, name, defaultrepr))
return
def main(argv):
if not argv:
print('Usage: codemod_configitems.py FILES\n'
'For example, FILES could be "{hgext,mercurial}/*/**.py"')
dirname = os.path.dirname
reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))
# register configitems to this destination
cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
cfgred = redbaron.RedBaron(readpath(cfgpath))
# state about what to do
registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
toregister = {} # {(section, name): defaultrepr}
coreconfigs = set() # {(section, name)}, whether it's used in core
# first loop: scan all files before taking any action
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
iscore = ('mercurial' in path) and ('hgext' not in path)
red = redbaron.RedBaron(readpath(path))
# find all repo.ui.config* and ui.config* calls, and collect their
# section, name and default value information.
for node, method, args, section, name in uiconfigitems(red):
if section == 'web':
# [web] section has some weirdness, ignore them for now
continue
defaultrepr = None
key = (section, name)
if len(args) == 2:
if key in registered:
continue
if method == 'configlist':
defaultrepr = 'list'
elif method == 'configbool':
defaultrepr = 'False'
else:
defaultrepr = 'None'
elif len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
# try to understand the "default" value
dnode = args[2].value
if dnode.type == 'name':
if dnode.value in {'None', 'True', 'False'}:
defaultrepr = dnode.value
elif dnode.type == 'string':
defaultrepr = repr(dnode.value[1:-1])
elif dnode.type in ('int', 'float'):
defaultrepr = dnode.value
# inconsistent default
if key in toregister and toregister[key] != defaultrepr:
defaultrepr = None
# interesting to rewrite
if key not in registered:
if defaultrepr is None:
print('[note] %s: %s.%s: unsupported default'
% (path, section, name))
registered.add(key) # skip checking it again
else:
toregister[key] = defaultrepr
if iscore:
coreconfigs.add(key)
# second loop: rewrite files given "toregister" result
for path in argv:
# reconstruct redbaron - trade CPU for memory
red = redbaron.RedBaron(readpath(path))
changed = False
for node, method, args, section, name in uiconfigitems(red):
key = (section, name)
defaultrepr = toregister.get(key)
if defaultrepr is None or key not in coreconfigs:
continue
if len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
try:
del args[2]
changed = True
except Exception:
# redbaron fails to do the rewrite due to indentation
# see https://github.com/PyCQA/redbaron/issues/100
print('[warn] %s: %s.%s: default needs manual removal'
% (path, section, name))
if key not in registered:
print('registering %s.%s' % (section, name))
registercoreconfig(cfgred, section, name, defaultrepr)
registered.add(key)
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if toregister:
print('updating configitems.py')
writepath(cfgpath, cfgred.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
There is still some use of 'deletedivergent' bookmark here. They will be taken
care of later. The 'deletedivergent' code needs some rework before fitting in
the new world.
We want to track bookmark movement within a transaction. For this we need a
more centralized way to update bookmarks.
For this purpose we introduce a new 'applychanges' method that apply a list of
changes encoded as '(name, node)'. We'll cover all bookmark updating code to
this new method in later changesets and add bookmark move in the transaction
when all will be migrated.
This allows even further customization via extensions for printing
bookmarks. For example, in hg-git this would allow printing remote refs
by just modifying the 'bmarks' parameter instead of reimplementing the
old commands.bookmarks method.
Furthermore, there is another benefit: now multiple extensions can
chain their custom data to bookmark printing. Previously, an extension
could have conflicting bookmark output due to which loaded first and
overrode commands.bookmarks. Now they can all play nicely together.
Again, commands.bookmark is getting too large. checkconflict already has
a lot of state and putting it in the bmstore makes more sense than
having it as a closure. This also allows extensions a place to override
this behavior.
While we're here, add a documentation string because, well, we should be
documenting more of our methods.
commands.bookmark has grown quite large with two closures already. Let's
split this up (and in the process allow extensions to override the
default behavior).
Since we no longer set '_clean = False' during the initialization loop, we can
move the attribute assignment earlier in the function for clarity.
(no speed improvement expected or measured ;-) )
The bmstore '__setitem__' method is setting an extra flag that is not needed
during initialization. Skipping the method will allow further cleanup and yield
some speedup as a side effect.
Before:
! wall 0.009120 comb 0.010000 user 0.010000 sys 0.000000 (best of 312)
After:
! wall 0.007874 comb 0.010000 user 0.010000 sys 0.000000 (best of 360)
Since we already have an exception context open, for other thing, we can
simplify the code a bit and rely on exception handling for invalid lines.
Speed is not the main motivation for this changes. However as I'm in the middle
of benchmarking things we can see a small positive impact.
Before:
! wall 0.009358 comb 0.000000 user 0.000000 sys 0.000000 (best of 303)
After:
! wall 0.009173 comb 0.010000 user 0.010000 sys 0.000000 (best of 310)
We know the content of the file is supposed to be full hex. So we can do the
translation ourselves and directly check if the node is known.
As nice side effect we now have proper error handling for invalid node value.
Before:
! wall 0.021580 comb 0.020000 user 0.020000 sys 0.000000 (best of 134)
After:
! wall 0.009342 comb 0.010000 user 0.010000 sys 0.000000 (best of 302)
Skipping the attribute lookup up raise a significant speedup.
Example on a repository with about 4000 bookmarks.
Before:
! wall 0.026027 comb 0.020000 user 0.020000 sys 0.000000 (best of 112)
After:
! wall 0.021580 comb 0.020000 user 0.020000 sys 0.000000 (best of 134)
(This is also in its own changeset to clarify the perf win from another coming
changesets)
This method is only used internally by destutil, and it's obscure
enough I'm willing to just move it without a deprecation warning,
especially since the new method has more constrained functionality.
Design-wise I'd also like to get active bookmark handling folded into
the bookmark store, so that we don't squirrel away an extra attribute
for the active bookmark on the repository object.
Before this patch, checking HG_PENDING in bookmarks.py might cause
unintentional reading unrelated '.hg/bookmarks.pending' in, because it
just examines existence of HG_PENDING environment variable.
This patch uses txnutil.trypending() to check HG_PENDING strictly.
This patch also changes share extension.
Enabling share extension (+ bookmark sharing) makes
bookmarks._getbkfile() receive repo to be shared (= "srcrepo"). On the
other hand, HG_PENDING always refers current working repo (=
"currepo"), and bookmarks.pending is written only into currepo.
Therefore, we should try to read .hg/bookmarks.pending of currepo in
at first. If it doesn't exist, we try to read .hg/bookmarks of srcrepo
in.
Even after this patch, an external hook spawned in currepo can't see
pending changes in currepo via srcrepo, even though such changes
become visible after closing transaction, because there is no easy and
cheap way to know existence of pending changes in currepo via srcrepo.
Please see https://www.mercurial-scm.org/wiki/SharedRepository, too.
BTW, this patch may cause failure of bisect in the repository of
Mercurial itself, if examination at bisecting assumes that an external
hook can see all pending changes while nested transactions across
repositories.
This invisibility issue will be fixed by subsequent patch, which
allows HG_PENDING to refer multiple repositories.
os.environ is a dictionary which has string elements on Python 3. We have
encoding.environ which take care of all these things. This is the first patch
of 5 patch series which tend to replace the occurences of os.environ with
encoding.environ as using os.environ will result in unusual behaviour.
Binary bookmark format should be used internally. It doesn't make sense to have
optional parameters `srchex` and `dsthex`. This patch removes them. It will
also be useful for `bookmarks` bundle2 part because unnecessary conversions
between hex and bin nodes will be avoided.
Next commit will remove optional parameters from `compare()` function.
Let's rename `compare()` to `comparebookmarks()` to avoid ambiguity from
callers from external extensions.
`bookmarks` bundle2 part will work with binary nodes. To avoid unnecessary
conversions between binary and hex nodes let's add `listbinbookmarks()` that
returns binary nodes. For now this function is a copy-paste of
listbookmarks(). In the next patch this copy-paste will be removed.
Cached attribute repo._bookmarks uses stat of '.hg/bookmarks' and
'.hg/bookmarks.current' files to examine validity of cached
contents. If writing these files out keeps ctime, mtime and size of
them, change is overlooked, and old contents cached before change
isn't invalidated as expected.
To avoid ambiguity of file stat, this patch writes '.hg/bookmarks' and
'.hg/bookmarks.current' files out with checkambig=True.
This patch is a part of "Exact Cache Validation Plan":
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan