Every `extensions.bind` call inserts a frame in traceback:
... in closure
return func(*(args + a), **kw)
which makes traceback noisy.
The Python stdlib has a `functools.partial` which is backed by C code and
does not pollute traceback. However it does not support instancemethod and
sets `args` attribute which could be problematic for alias handling.
This patch makes `wrapfunction` use `functools.partial` if we are wrapping a
function directly exported by a module (so it's impossible to be a class or
instance method), and special handles `wrapfunction` results so alias
handling code could handle `args` just fine.
As an example, `hg rebase -s . -d . --traceback` got 6 lines removed in my
setup:
File "hg/mercurial/dispatch.py", line 898, in _dispatch
cmdpats, cmdoptions)
-File "hg/mercurial/extensions.py", line 333, in closure
- return func(*(args + a), **kw)
File "hg/hgext/journal.py", line 84, in runcommand
return orig(lui, repo, cmd, fullargs, *args)
-File "hg/mercurial/extensions.py", line 333, in closure
- return func(*(args + a), **kw)
File "fb-hgext/hgext3rd/fbamend/hiddenoverride.py", line 119, in runcommand
result = orig(lui, repo, cmd, fullargs, *args)
File "hg/mercurial/dispatch.py", line 660, in runcommand
ret = _runcommand(ui, options, cmd, d)
-File "hg/mercurial/extensions.py", line 333, in closure
- return func(*(args + a), **kw)
File "hg/hgext/pager.py", line 69, in pagecmd
return orig(ui, options, cmd, cmdfunc)
....
Differential Revision: https://phab.mercurial-scm.org/D632
Before this patch, explicit --pager=on is unintentionally ignored by
any disabling factor, even if priority of it is less than --pager=on
(e.g. "[ui] paginate = off").
This is done by a script [2] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
Since RedBaron is not confident about how to indent things [2].
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import os
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
_configmethods = {'config', 'configbool', 'configint', 'configbytes',
'configlist', 'configdate'}
def extractstring(rnode):
"""get the string from a RedBaron string or call_argument node"""
while rnode.type != 'string':
rnode = rnode.value
return rnode.value[1:-1] # unquote, "'str'" -> "str"
def uiconfigitems(red):
"""match *.ui.config* pattern, yield (node, method, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
obj = node[-3].value
method = node[-2].value
args = node[-1]
section = args[0].value
name = args[1].value
if (obj in ('ui', 'self') and method in _configmethods
and section.type == 'string' and name.type == 'string'):
entry = (node, method, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def coreconfigitems(red):
"""match coreconfigitem(...) pattern, yield (node, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
args = node[1]
section = args[0].value
name = args[1].value
if (node[0].value == 'coreconfigitem' and section.type == 'string'
and name.type == 'string'):
entry = (node, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def registercoreconfig(cfgred, section, name, defaultrepr):
"""insert coreconfigitem to cfgred AST
section and name are plain string, defaultrepr is a string
"""
# find a place to insert the "coreconfigitem" item
entries = list(coreconfigitems(cfgred))
for node, args, nodesection, nodename in reversed(entries):
if (nodesection, nodename) < (section, name):
# insert after this entry
node.insert_after(
'coreconfigitem(%r, %r,\n'
' default=%s,\n'
')' % (section, name, defaultrepr))
return
def main(argv):
if not argv:
print('Usage: codemod_configitems.py FILES\n'
'For example, FILES could be "{hgext,mercurial}/*/**.py"')
dirname = os.path.dirname
reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))
# register configitems to this destination
cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
cfgred = redbaron.RedBaron(readpath(cfgpath))
# state about what to do
registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
toregister = {} # {(section, name): defaultrepr}
coreconfigs = set() # {(section, name)}, whether it's used in core
# first loop: scan all files before taking any action
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
iscore = ('mercurial' in path) and ('hgext' not in path)
red = redbaron.RedBaron(readpath(path))
# find all repo.ui.config* and ui.config* calls, and collect their
# section, name and default value information.
for node, method, args, section, name in uiconfigitems(red):
if section == 'web':
# [web] section has some weirdness, ignore them for now
continue
defaultrepr = None
key = (section, name)
if len(args) == 2:
if key in registered:
continue
if method == 'configlist':
defaultrepr = 'list'
elif method == 'configbool':
defaultrepr = 'False'
else:
defaultrepr = 'None'
elif len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
# try to understand the "default" value
dnode = args[2].value
if dnode.type == 'name':
if dnode.value in {'None', 'True', 'False'}:
defaultrepr = dnode.value
elif dnode.type == 'string':
defaultrepr = repr(dnode.value[1:-1])
elif dnode.type in ('int', 'float'):
defaultrepr = dnode.value
# inconsistent default
if key in toregister and toregister[key] != defaultrepr:
defaultrepr = None
# interesting to rewrite
if key not in registered:
if defaultrepr is None:
print('[note] %s: %s.%s: unsupported default'
% (path, section, name))
registered.add(key) # skip checking it again
else:
toregister[key] = defaultrepr
if iscore:
coreconfigs.add(key)
# second loop: rewrite files given "toregister" result
for path in argv:
# reconstruct redbaron - trade CPU for memory
red = redbaron.RedBaron(readpath(path))
changed = False
for node, method, args, section, name in uiconfigitems(red):
key = (section, name)
defaultrepr = toregister.get(key)
if defaultrepr is None or key not in coreconfigs:
continue
if len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
try:
del args[2]
changed = True
except Exception:
# redbaron fails to do the rewrite due to indentation
# see https://github.com/PyCQA/redbaron/issues/100
print('[warn] %s: %s.%s: default needs manual removal'
% (path, section, name))
if key not in registered:
print('registering %s.%s' % (section, name))
registercoreconfig(cfgred, section, name, defaultrepr)
registered.add(key)
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if toregister:
print('updating configitems.py')
writepath(cfgpath, cfgred.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
Before this patch, functions defined in extensions are registered via
extra loaders only in _dispatch(). Therefore, loading extensions in
other code paths like below omits registration of functions.
- WSGI service
- operation across repositories (e.g. subrepo)
- test-duplicateoptions.py, using extensions.loadall() directly
To register functions always at loading new extension, this patch
moves implementation for extra loading from dispatch._dispatch() to
extensions.loadall().
AFAIK, only commands module causes cyclic dependency between
extensions module, but this patch imports all related modules just
before extra loading in loadall(), in order to centralize them.
This patch makes extensions.py depend on many other modules, even
though extensions.py itself doesn't. It should be avoided if possible,
but I don't have any better idea. Some other places like below aren't
reasonable for extra loading, IMHO.
- specific function in newly added module:
existing callers of extensions.loadall() should invoke it, too
- hg.repository() or so:
no-repo commands aren't covered by this.
BTW, this patch removes _loaded.add(name) on relocation, because
dispatch._loaded is used only for extraloaders (for similar reason,
"exts" variable is removed, too).
It seems sufficiently simple to use "profile(enabled=X)" to not justify having
a dedicated context manager just to read the config.
(I do not have a too strong opinion about this).
We now process the "--profile" a second time after alias has been processed and
the command argument fully parsed. If appropriate we enable profiling at that
time.
In these situation, the --profile will cover less than if the full --profile
flag was passed on the command line. This is better than the previous behavior
(flag ignored) and still fullfil multiple valid usecases.
Since 348863ccec7e "util: always force line buffered stdout when stdout is
a tty", we have two file objects attached to the same STDOUT_FILENO. If one
is closed, the underlying file descriptor is also closed, and writing to
the other file object would crash the Python interpreter in a hard way, at
least on Windows.
So, it seems safer to not close the standard streams. This also matches
the behavior of the default sys.stdout/stderr.close(), which never close
the FILE* streams in C layer.
https://hg.python.org/cpython/file/v2.7.13/Python/sysmodule.c#l1401
Before this patch, "hg CMD --pager on" on Windows shows output
unintentionally decorated with ANSI color escape sequences, if color
mode is "auto". This issue occurs in steps below.
1. dispatch() invokes ui.pager() at detection of "--pager on"
2. stdout of hg process is redirected into stdin of pager process
3. "ui.formatted" = True, because isatty(stdout) is so before (2)
4. color module is loaded for colorization
5. color.w32effects = None, because GetConsoleScreenBufferInfo()
fails on stdout redirected at (2)
6. "ansi" color mode is chosen, because of "not w32effects"
7. output is colorized in "ansi" mode because of "ui.formatted" = True
Even if "ansi" color mode is chosen, ordinarily redirected stdout
makes ui.formatted() return False, and colorization is avoided. But in
this issue case, "ui.formatted" = True at (3) forces output to be
colorized.
For correct console information on win32, it is needed to ensure that
color module is loaded before redirection of stdout for pagination.
BTW, if any of enabled extensions has "colortable" attribute, this
issue is avoided even before this patch, because color module is
imported as a part of loading such extension, and extension loading
occurs before setting up pager. For example, mq and keyword have
"colortable".
Some shared-ssh installations assume that 'hg serve --stdio' is a safe
command to run for minimally trusted users. Unfortunately, the messy
implementation of argument parsing here meant that trying to access a
repo named '--debugger' would give the user a pdb prompt, thereby
sidestepping any hoped-for sandboxing. Serving repositories over HTTP(S)
is unaffected.
We're not currently hardening any subcommands other than 'serve'. If
your service exposes other commands to users with arbitrary repository
names, it is imperative that you defend against repository names of
'--debugger' and anything starting with '--config'.
The read-only mode of hg-ssh stopped working because it provided its hook
configuration to "hg serve --stdio" via --config parameter. This is banned for
security reasons now. This patch switches it to directly call ui.setconfig().
If your custom hosting infrastructure relies on passing --config to
"hg serve --stdio", you'll need to find a different way to get that configuration
into Mercurial, either by using ui.setconfig() as hg-ssh does in this patch,
or by placing an hgrc file someplace where Mercurial will read it.
mitrandir@fb.com provided some extra fixes for the dispatch code and
for hg-ssh in places that I overlooked.
I got the following error by running "hg log" and quitting the pager
immediately. Any output here may trigger another SIGPIPE, so only thing
we can do is to swallow the exception and exit with an error status.
Traceback (most recent call last):
File "./hg", line 45, in <module>
mercurial.dispatch.run()
File "mercurial/dispatch.py", line 83, in run
status = (dispatch(req) or 0) & 255
File "mercurial/dispatch.py", line 167, in dispatch
req.ui.warn(_("interrupted!\n"))
File "mercurial/ui.py", line 1224, in warn
self.write_err(*msg, **opts)
File "mercurial/ui.py", line 790, in write_err
self._write_err(*msgs, **opts)
File "mercurial/ui.py", line 798, in _write_err
self.ferr.write(a)
File "mercurial/ui.py", line 129, in _catchterm
raise error.SignalInterrupt
mercurial.error.SignalInterrupt
Perhaps this wasn't visible before ee4f321cd621 because the original stderr
handle was restored very late.
We attempt to report what went wrong, and more importantly exit the
program with an error code.
(The exception we catch is not yet raised anywhere in the code.)
In spite of its longstanding use, Python's built-in atexit code is
not suitable for Mercurial's purposes, for several reasons:
* Handlers run after application code has finished.
* Because of this, the code that runs handlers swallows exceptions
(since there's no possible stacktrace to associate errors with).
If we're lucky, we'll get something spat out to stderr (if stderr
still works), which of course isn't any use in a big deployment
where it's important that exceptions get logged and aggregated.
* Mercurial's current atexit handlers make unfortunate assumptions
about process state (specifically stdio) that, coupled with the
above problems, make it impossible to deal with certain categories
of error (try "hg status > /dev/full" on a Linux box).
* In Python 3, the atexit implementation is completely hidden, so
we can't hijack the platform's atexit code to run handlers at a
time of our choosing.
As a result, here's a perfectly cromulent atexit-like implementation
over which we have control. This lets us decide exactly when the
handlers run (after each request has completed), and control what
the process state is when that occurs (and afterwards).
We were previously depending on str() doing something reasonable here,
and we can't depend on the objects in question supporting __bytes__,
so we work around the lack of direct bytes formatting.
shlex.split() only accepts unicodes on Python 3. After this patch we will be
using pycompat.shlexsplit(). This patch also replaces existing occurences of
shlex.split with pycompat.shlexsplit.
Now that we have the '_style' dictionary in core, we can use the clean and
standard 'extraloader' mechanism to load extension's 'colortable'.
color.loadcolortable
keys of keyword arguments on Python 3 has to be string. We are dealing with
bytes in our codebase so the keys are also bytes. Done that using
pycompat.strkwargs().
Also after this patch, `hg version` now runs on Python 3.5. Hurray!
In the next patch, we will be creating a bytes version of getopt.getopt() and
doing that will leave getopt as unused import in fancyopts. So before removing
that there are instances in codebase where instead of importing getopt, we
have used fancyopts.getopt. This patch will switch all those cases so that
the next patch can remove the import of getopt from fancyopts without breaking
things.
This allows us to write doctests depending on a ui object, but not on global
configs.
ui.load() is a class method so we can do wsgiui.load(). All ui() calls but
for doctests are replaced with ui.load(). Some of them could be changed to
not load configs later.
Per discussion at 7d927e65eaf2 [1], we need "callcatch" in worker.py. Move
it to scmutil.py to avoid cycles.
Note that dispatch's callcatch handles some additional high-level exceptions
related to config parsing, and commands. Moving them to scmutil will make
scmutil depend on "commands" or require "_formatparse" and "_getsimilar"
(and "difflib") to be moved as well. In the worker use-case, it is forked
when config and commands are fully loaded. So it should not care about those
exceptions.
[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/087116.html
Following the behaviour of Python 3, os.getcwd() return unicodes. We need
bytes version as path variables are bytes in UNIX. Python 3 has os.getcwdb()
which returns current working directory in bytes.
Like rest of the things there in pycompat, like osname, ossep, we need to
rewrite every instance of os.getcwd to pycompat.getcwd to make them work
correctly on Python 3.
Almost all sys.stdin/out/err in hgext/ and mercurial/ are replaced by util's.
There are a few exceptions:
- lsprof.py and statprof.py are untouched since they are a kind of vendor
code and they never import mercurial modules right now.
- ui._readline() needs to replace sys.stdin and stdout to pass them to
raw_input(). We'll need another workaround here.
The quoting logic here was actually insufficient, and would have had
bogus b-prefixes on Python 3. shellquote seems more appropriate
anyway. Surprisingly, only two tests have output changes, and both of
them look reasonable to me (both are in blackbox logs).
Spotted by Yuya during review.
I think this makes the code much clearer. I had to think for a bit to
unpack the old-school `condition and if-true or if-false` dance, and
formatting argument lists here shouldn't be performance critical.
chg needs special logic around repo object creation (like, collecting and
reporting repo path to the master server). Adding "reposetup" to
dispatch.request seems to be an easy and reasonably clean way to allow that.
Instead, load the table by commands.py so the debug commands should always
be populated. The table in debugcommands.py is unnamed so extension authors
wouldn't be confused to wrap debugcommands.table in place of commands.table.
getattr passes a str value of the attribute to be looked and keys in adefaults
dict are bytes which resulted in AttributeError. This patch abuses r'' to
make the keys str.
dispatch handles KeyboardInterrupt already. This makes the code more
consistent, and makes worker not print "killed!" if it receives SIGTERM in
most cases (in rare cases there is still "killed!" printed, which will be
fixed by the next patch).
commands.py is our largest .py file by nearly 2x. Debug commands live
in a world of their own. So let's extract them to their own module.
We start with "debugancestor."
We currently reuse the commands table with commands.py and have a hack
in dispatch.py for loading debugcommands.py. In the future, we could
potentially use a separate commands table and avoid the import of
debugcommands.py.