The idea behind the effect flag is to store additional information in
obs-markers about what changed between a changeset and its successor(s). It is
low-level information that comes without guarantees.
This information can be computed a posteriori, but only if we have all
changesets locally. This is not the case with distributed workflows where you
work with several people or on several computers (eg: laptop + build server).
Storing the effect-flag as a bitfield has several advantages:
- It's compact: we use at most one byte per obs-marker for the effect-flag.
- It's compoundable: the obsfate log approach needs to display evolve history
that may span several obs-markers. Computing the effect-flag between a
changeset and its grand-grand-grand-successor is simple thanks to the
bitfield.
The effect-flag design also has some limitations:
- Evolving a changeset and then immediately reverting those changes leads to
two obs-markers with the same effect-flag, with no indication that the first
and third changesets are identical.
The effect-flag current design is a trade-off between compactness and
usefulness.
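The compounding property and the limitation described above can be sketched as
follows. This is only an illustration: the flag names and bit positions are
assumptions made for the example and are not claimed to match the values
Mercurial stores.

```python
from functools import reduce
from operator import or_

# Hypothetical effect flags; names and bit positions are illustrative only.
DESCCHANGED = 1 << 0    # description changed
PARENTCHANGED = 1 << 1  # parent changed (for example, a rebase)
DIFFCHANGED = 1 << 2    # content changed
USERCHANGED = 1 << 3    # author changed

_NAMES = [('description', DESCCHANGED), ('parent', PARENTCHANGED),
          ('content', DIFFCHANGED), ('user', USERCHANGED)]

def combinedeffect(flags):
    """OR together the effect flags of a chain of obs-markers."""
    return reduce(or_, flags, 0)

def describe(flag):
    """Return the names of the bits set in an effect flag."""
    return [name for name, bit in _NAMES if flag & bit]

# A changeset is rebased, then its content is amended: two markers.
total = combinedeffect([PARENTCHANGED, DIFFCHANGED])

# Limitation: amending the content and then reverting it gives two markers
# flagged DIFFCHANGED. The combined flag cannot tell that the final
# changeset has the same content as the first one.
roundtrip = combinedeffect([DIFFCHANGED, DIFFCHANGED])
```

Because combining is a plain bitwise OR, the effect between a changeset and
any distant successor costs one pass over the intermediate markers.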
Storing this information helps commands display a more complete and
understandable evolve history. For example, obslog (an Evolve command) uses it
to improve its output:
x 62206adfd571 (34302) obscache: skip updating outdated obscache...
| rewritten(parent) by Matthieu Laneuville <matthieu.laneuville@octobus...
| rewritten(content) by Boris Feld <boris.feld@octobus.net>
The effect flag is stored in obs-marker metadata while we iterate on the
information we want to store. We plan to extend the existing obsmarkers
bit-field once the effect flag design has stabilized.
It's different from the CommitCustody concept: effect flags are not signed and
can be forged. It's also different from the operation metadata, as a single
command name (for example: amend) can alter a changeset in different ways
(changing the content with hg amend, changing the description with hg amend -e,
changing the user with hg amend -U). It is also compatible with every custom
command that writes obs-markers, without those commands needing to be updated.
The effect-flag is placed behind an experimental flag set to off by default.
Hook the saving of the effect flag into createmarkers, but store only an empty
one for the moment. I will refine the effect flag values in following patches.
For more information, see:
https://www.mercurial-scm.org/wiki/ChangesetEvolutionDevel#Record_types_of_operation
Differential Revision: https://phab.mercurial-scm.org/D533
Various mutators fail when attempting to write obsmarkers with
metadata fields longer than 255 bytes, since the length of
metadata fields is stored in u8s. This change raises a more
helpful error in such circumstances.
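A minimal sketch of the idea, assuming a simplified encoding where each
metadata key and value is prefixed by a one-byte length. The function name,
the error class and the message are hypothetical and only illustrate checking
the length before packing, instead of letting the packing step fail with a
less helpful error.

```python
import struct

class MetadataTooLong(ValueError):
    """Hypothetical error type used only by this sketch."""

def encodemeta(key, value):
    """Encode one metadata pair with one-byte (u8) length prefixes."""
    for label, field in (('key', key), ('value', value)):
        if len(field) > 255:
            # A u8 cannot hold this length; say so explicitly.
            raise MetadataTooLong(
                'obsstore metadata %s cannot be longer than 255 bytes '
                '(got %d bytes)' % (label, len(field)))
    return struct.pack('>BB', len(key), len(value)) + key + value
```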
Differential Revision: https://phab.mercurial-scm.org/D865
As I will add another experiment in createmarkers, add a comment and some
blank lines for aesthetics' sake.
Differential Revision: https://phab.mercurial-scm.org/D532
I'm not sure this is right, since this should either be bytes or str
to match what's going on in the revlog layer.
Differential Revision: https://phab.mercurial-scm.org/D271
This allows us to read a customized range of markers, instead of loading all
of them.
The stop condition is made consistent across the C and Python implementations,
so we still read a marker when offset=a, stop=a+1.
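The stop condition can be sketched with a simplified reader. The record layout
here (a 4-byte big-endian size followed by the payload) is an assumption made
for the example and is not the obsstore format. What it shows is that a record
is read whenever its start offset is below stop, even if its body extends past
stop.

```python
import struct

_header = struct.Struct('>I')  # hypothetical 4-byte size prefix

def encode(payloads):
    """Build a byte string of size-prefixed records (demo helper)."""
    return b''.join(_header.pack(len(p)) + p for p in payloads)

def readmarkers(data, offset=0, stop=None):
    """Yield (start, payload) for records starting in [offset, stop)."""
    if stop is None:
        stop = len(data)
    while offset < stop:
        (size,) = _header.unpack_from(data, offset)
        start = offset + _header.size
        yield offset, data[start:start + size]
        offset = start + size

# Records start at offsets 0, 9 and 19 in this buffer.
data = encode([b'first', b'second', b'third'])
```

With this condition, `readmarkers(data, 9, 10)` yields the record starting at
offset 9, while `readmarkers(data, 9, 9)` yields nothing.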
This is done by a script [3] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
Two changes were made manually to fix tests, since RedBaron is not confident
about how to indent things [2]:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function

import os
import sys

import redbaron

def readpath(path):
    with open(path) as f:
        return f.read()

def writepath(path, content):
    with open(path, 'w') as f:
        f.write(content)

_configmethods = {'config', 'configbool', 'configint', 'configbytes',
                  'configlist', 'configdate'}

def extractstring(rnode):
    """get the string from a RedBaron string or call_argument node"""
    while rnode.type != 'string':
        rnode = rnode.value
    return rnode.value[1:-1]  # unquote, "'str'" -> "str"

def uiconfigitems(red):
    """match *.ui.config* pattern, yield (node, method, args, section, name)"""
    for node in red.find_all('atomtrailers'):
        entry = None
        try:
            obj = node[-3].value
            method = node[-2].value
            args = node[-1]
            section = args[0].value
            name = args[1].value
            if (obj in ('ui', 'self') and method in _configmethods
                and section.type == 'string' and name.type == 'string'):
                entry = (node, method, args, extractstring(section),
                         extractstring(name))
        except Exception:
            pass
        else:
            if entry:
                yield entry

def coreconfigitems(red):
    """match coreconfigitem(...) pattern, yield (node, args, section, name)"""
    for node in red.find_all('atomtrailers'):
        entry = None
        try:
            args = node[1]
            section = args[0].value
            name = args[1].value
            if (node[0].value == 'coreconfigitem' and section.type == 'string'
                and name.type == 'string'):
                entry = (node, args, extractstring(section),
                         extractstring(name))
        except Exception:
            pass
        else:
            if entry:
                yield entry

def registercoreconfig(cfgred, section, name, defaultrepr):
    """insert coreconfigitem to cfgred AST

    section and name are plain string, defaultrepr is a string
    """
    # find a place to insert the "coreconfigitem" item
    entries = list(coreconfigitems(cfgred))
    for node, args, nodesection, nodename in reversed(entries):
        if (nodesection, nodename) < (section, name):
            # insert after this entry
            node.insert_after(
                'coreconfigitem(%r, %r,\n'
                '    default=%s,\n'
                ')' % (section, name, defaultrepr))
            return

def main(argv):
    if not argv:
        print('Usage: codemod_configitems.py FILES\n'
              'For example, FILES could be "{hgext,mercurial}/*/**.py"')
    dirname = os.path.dirname
    reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))

    # register configitems to this destination
    cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
    cfgred = redbaron.RedBaron(readpath(cfgpath))

    # state about what to do
    registered = set((s, n) for _node, _args, s, n in coreconfigitems(cfgred))
    toregister = {}  # {(section, name): defaultrepr}
    coreconfigs = set()  # {(section, name)}, whether it's used in core

    # first loop: scan all files before taking any action
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        iscore = ('mercurial' in path) and ('hgext' not in path)
        red = redbaron.RedBaron(readpath(path))
        # find all repo.ui.config* and ui.config* calls, and collect their
        # section, name and default value information.
        for node, method, args, section, name in uiconfigitems(red):
            if section == 'web':
                # [web] section has some weirdness, ignore them for now
                continue
            defaultrepr = None
            key = (section, name)
            if len(args) == 2:
                if key in registered:
                    continue
                if method == 'configlist':
                    defaultrepr = 'list'
                elif method == 'configbool':
                    defaultrepr = 'False'
                else:
                    defaultrepr = 'None'
            elif len(args) >= 3 and (args[2].target is None or
                                     args[2].target.value == 'default'):
                # try to understand the "default" value
                dnode = args[2].value
                if dnode.type == 'name':
                    if dnode.value in {'None', 'True', 'False'}:
                        defaultrepr = dnode.value
                elif dnode.type == 'string':
                    defaultrepr = repr(dnode.value[1:-1])
                elif dnode.type in ('int', 'float'):
                    defaultrepr = dnode.value
            # inconsistent default
            if key in toregister and toregister[key] != defaultrepr:
                defaultrepr = None
            # interesting to rewrite
            if key not in registered:
                if defaultrepr is None:
                    print('[note] %s: %s.%s: unsupported default'
                          % (path, section, name))
                    registered.add(key)  # skip checking it again
                else:
                    toregister[key] = defaultrepr
                    if iscore:
                        coreconfigs.add(key)

    # second loop: rewrite files given "toregister" result
    for path in argv:
        # reconstruct redbaron - trade CPU for memory
        red = redbaron.RedBaron(readpath(path))
        changed = False
        for node, method, args, section, name in uiconfigitems(red):
            key = (section, name)
            defaultrepr = toregister.get(key)
            if defaultrepr is None or key not in coreconfigs:
                continue
            if len(args) >= 3 and (args[2].target is None or
                                   args[2].target.value == 'default'):
                try:
                    del args[2]
                    changed = True
                except Exception:
                    # redbaron fails to do the rewrite due to indentation
                    # see https://github.com/PyCQA/redbaron/issues/100
                    print('[warn] %s: %s.%s: default needs manual removal'
                          % (path, section, name))
            if key not in registered:
                print('registering %s.%s' % (section, name))
                registercoreconfig(cfgred, section, name, defaultrepr)
                registered.add(key)
        if changed:
            print('updating %s' % path)
            writepath(path, red.dumps())

    if toregister:
        print('updating configitems.py')
        writepath(cfgpath, cfgred.dumps())

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
We plan to add a new argument to successorsets. But first we need to update
all callers to pass the cache argument explicitly, to avoid confusing the
arguments.
The obsstore collaborates with the transaction to make sure we track all the
obsmarkers added during a transaction. This will be useful for various usages:
hooks, caches, better output, etc.
This is the second kind of data added to tr.changes (the first one was added
revisions).
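A rough sketch of this collaboration, using stand-in classes. FakeTransaction,
FakeObsstore and the 'obsmarkers' key are assumptions made for the example;
only the idea of recording newly added markers in tr.changes comes from the
description above.

```python
class FakeTransaction(object):
    """Hypothetical stand-in for a transaction exposing a 'changes' dict."""
    def __init__(self):
        self.changes = {'revs': set(), 'obsmarkers': set()}

class FakeObsstore(object):
    """Hypothetical obsstore that reports new markers to the transaction."""
    def __init__(self):
        self._all = []
        self._known = set()

    def add(self, tr, markers):
        """Add markers, skipping known ones; return how many were new."""
        added = 0
        for marker in markers:
            if marker in self._known:
                continue
            self._known.add(marker)
            self._all.append(marker)
            # Track the addition so hooks and caches can inspect it later.
            tr.changes['obsmarkers'].add(marker)
            added += 1
        return added
```

A hook running at transaction close could then look at
`tr.changes['obsmarkers']` to see exactly which markers this transaction
introduced.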
None of these functions has been used in the past 5 years, so I think it is
safe to just remove them. All code accessing rich markers is using
'getmarkers(...)' instead (or raw markers).
We simplify the unstable computation code, skipping the expensive creation of
changectx objects. We focus on efficient set operations and revnumber-centric
functions.
In my mercurial development repository, this provides a roughly 3x speedup to
the function:
before: 5.319 ms
after: 1.844 ms
repo details:
total changesets: 40886
obsolete changesets: 7756
mutable (not obsolete): 293
unstable: 30
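The set-based, revnumber-centric approach can be sketched as below. This is an
illustration under assumptions rather than the code from this change: the
function name and the `parents` mapping are hypothetical, obsolete revisions
are assumed to be a subset of the mutable ones, and parents are assumed to
always have smaller revision numbers than their children.

```python
def unstablerevs(parents, obsolete, mutable):
    """Return mutable, non-obsolete revs that have an obsolete ancestor.

    parents maps a revision number to a tuple of parent revision numbers.
    """
    unstable = set()
    # Ascending order guarantees each parent is classified before its
    # children, so one pass with set lookups is enough.
    for rev in sorted(mutable - obsolete):
        for p in parents[rev]:
            if p in obsolete or p in unstable:
                unstable.add(rev)
                break
    return unstable

# 0 is public; 1..4 are mutable; 2 is obsolete; 3 sits on top of 2.
parents = {0: (-1,), 1: (0,), 2: (1,), 3: (2,), 4: (1,)}
result = unstablerevs(parents, obsolete={2}, mutable={1, 2, 3, 4})
```

Only integers and sets are touched in the loop; no per-revision context object
needs to be built.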
We now display data about the "exclusive markers" in the test dedicated to
relevant and exclusive markers computation and usage. Each output has been
carefully validated.
This set will be used to select the obsmarkers to be stripped alongside the
stripped changesets. See the function docstring for details.
More advanced testing is introduced in the next changesets to keep this one
simpler. That extra testing provides more examples.
We raise a more precise subclass of Abort with details about the faulty
version. This will be used to detect this case and display some information
in debugbundle.
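A small sketch of the pattern, using stand-in classes. The class names, the
supported versions and the messages are assumptions for illustration. The
point is that the subclass carries the faulty version, so a caller can catch
that precise case and report it instead of aborting.

```python
class Abort(Exception):
    """Stand-in for a generic abort error (hypothetical for this sketch)."""

class UnknownVersion(Abort):
    """Unsupported format version; keeps the offending version around."""
    def __init__(self, msg, version=None):
        super(UnknownVersion, self).__init__(msg)
        self.version = version

_SUPPORTED = {0, 1}  # assumed set of known versions

def checkversion(version):
    if version not in _SUPPORTED:
        raise UnknownVersion('parsing obsolete marker: unknown version %r'
                             % version, version=version)
    return version

def describeversion(version):
    """Caller that displays the problem instead of aborting."""
    try:
        return 'version: %d' % checkversion(version)
    except UnknownVersion as exc:
        return 'unsupported version: %r' % exc.version
```

Code that only knows about Abort keeps working, since the new error is still
an Abort.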
The markers pruning a node were not directly considered relevant for the
pruned node, only for its parents.
This went unnoticed during obsmarkers exchange because all
ancestors of the pruned node would be included in the computation.
This still affects obsmarkers exchange a bit, since "inline" prune markers
would be ignored (see second test case). This went unnoticed because, in such
cases, we always push another obsolescence marker for that node.
We add explicit tests covering this case.
(The set of relevant changesets is used in the obsmarkers discovery protocol
used in the evolve experimental extension; the impact will be handled on the
extension side.)
Also use the default-date when creating obsmarkers. Currently they are created
with the current date, with no option to force another value.
To test the feature, we remove some of the many globs used to match obsmarker
dates in the tests.
Adding markers to the repository might affect the set of obsolete changesets,
so we must remove the "volatile" sets that rely on that data. We add two
missing invalidations after merging markers. This was caught by code changes
in the evolve extension's tests.
This issue highlights that the current way of doing things is a bit fragile;
however, we keep things simple for stable.
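The kind of invalidation involved can be sketched with a toy cache. The class
and its methods are hypothetical and much simpler than the real code; the
sketch only shows why any path that adds markers has to drop sets derived from
them.

```python
class ObsoleteCache(object):
    """Hypothetical holder of markers plus sets computed from them."""
    def __init__(self):
        self.markers = []
        self._cache = {}
        self.computations = 0  # demo only: counts recomputations

    def obsolete(self):
        """Return the (cached) set of obsoleted nodes."""
        if 'obsolete' not in self._cache:
            self.computations += 1
            self._cache['obsolete'] = frozenset(m[0] for m in self.markers)
        return self._cache['obsolete']

    def mergemarkers(self, markers):
        self.markers.extend(markers)
        # Without this, obsolete() would keep returning the stale set.
        self._cache.clear()
```

Forgetting the `clear()` call in a single code path is enough to serve stale
data, which is the fragility mentioned above.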
It seems better to introduce the experiment behind a flag for now, as there
are multiple concerns around the feature:
* Storing the operation increases the size of obsolescence markers
significantly (+10-20%).
* It performs poorly when exchanging markers (command names cannot be
combined, a command name might be unknown remotely, etc.)