The selector does not have a __del__ method and needs a manual close. We can
also use "with selector" but that makes the code too indented. Therefore
append a "selector.close()" after the end of the main loop for now.
Previously, commandserver was using select.select. That could have issue if
_sock.fileno() >= FD_SETSIZE (usually 1024), which raises:
ValueError: filedescriptor out of range in select()
We got that in production today, although it's the code opening that many
files to blame, it seems better for commandserver to work in this case.
There are multiple way to "solve" it, like preserving a fd with a small
number and swap it with sock using dup2(). But upgrading to a modern
selector supported by the system seems to be the most correct way.
This is done by a script [2] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
Since RedBaron is not confident about how to indent things [2].
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import os
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
_configmethods = {'config', 'configbool', 'configint', 'configbytes',
'configlist', 'configdate'}
def extractstring(rnode):
"""get the string from a RedBaron string or call_argument node"""
while rnode.type != 'string':
rnode = rnode.value
return rnode.value[1:-1] # unquote, "'str'" -> "str"
def uiconfigitems(red):
"""match *.ui.config* pattern, yield (node, method, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
obj = node[-3].value
method = node[-2].value
args = node[-1]
section = args[0].value
name = args[1].value
if (obj in ('ui', 'self') and method in _configmethods
and section.type == 'string' and name.type == 'string'):
entry = (node, method, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def coreconfigitems(red):
"""match coreconfigitem(...) pattern, yield (node, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
args = node[1]
section = args[0].value
name = args[1].value
if (node[0].value == 'coreconfigitem' and section.type == 'string'
and name.type == 'string'):
entry = (node, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def registercoreconfig(cfgred, section, name, defaultrepr):
"""insert coreconfigitem to cfgred AST
section and name are plain string, defaultrepr is a string
"""
# find a place to insert the "coreconfigitem" item
entries = list(coreconfigitems(cfgred))
for node, args, nodesection, nodename in reversed(entries):
if (nodesection, nodename) < (section, name):
# insert after this entry
node.insert_after(
'coreconfigitem(%r, %r,\n'
' default=%s,\n'
')' % (section, name, defaultrepr))
return
def main(argv):
if not argv:
print('Usage: codemod_configitems.py FILES\n'
'For example, FILES could be "{hgext,mercurial}/*/**.py"')
dirname = os.path.dirname
reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))
# register configitems to this destination
cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
cfgred = redbaron.RedBaron(readpath(cfgpath))
# state about what to do
registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
toregister = {} # {(section, name): defaultrepr}
coreconfigs = set() # {(section, name)}, whether it's used in core
# first loop: scan all files before taking any action
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
iscore = ('mercurial' in path) and ('hgext' not in path)
red = redbaron.RedBaron(readpath(path))
# find all repo.ui.config* and ui.config* calls, and collect their
# section, name and default value information.
for node, method, args, section, name in uiconfigitems(red):
if section == 'web':
# [web] section has some weirdness, ignore them for now
continue
defaultrepr = None
key = (section, name)
if len(args) == 2:
if key in registered:
continue
if method == 'configlist':
defaultrepr = 'list'
elif method == 'configbool':
defaultrepr = 'False'
else:
defaultrepr = 'None'
elif len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
# try to understand the "default" value
dnode = args[2].value
if dnode.type == 'name':
if dnode.value in {'None', 'True', 'False'}:
defaultrepr = dnode.value
elif dnode.type == 'string':
defaultrepr = repr(dnode.value[1:-1])
elif dnode.type in ('int', 'float'):
defaultrepr = dnode.value
# inconsistent default
if key in toregister and toregister[key] != defaultrepr:
defaultrepr = None
# interesting to rewrite
if key not in registered:
if defaultrepr is None:
print('[note] %s: %s.%s: unsupported default'
% (path, section, name))
registered.add(key) # skip checking it again
else:
toregister[key] = defaultrepr
if iscore:
coreconfigs.add(key)
# second loop: rewrite files given "toregister" result
for path in argv:
# reconstruct redbaron - trade CPU for memory
red = redbaron.RedBaron(readpath(path))
changed = False
for node, method, args, section, name in uiconfigitems(red):
key = (section, name)
defaultrepr = toregister.get(key)
if defaultrepr is None or key not in coreconfigs:
continue
if len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
try:
del args[2]
changed = True
except Exception:
# redbaron fails to do the rewrite due to indentation
# see https://github.com/PyCQA/redbaron/issues/100
print('[warn] %s: %s.%s: default needs manual removal'
% (path, section, name))
if key not in registered:
print('registering %s.%s' % (section, name))
registercoreconfig(cfgred, section, name, defaultrepr)
registered.add(key)
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if toregister:
print('updating configitems.py')
writepath(cfgpath, cfgred.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
This enables chg to replace a server socket in an atomic way:
1. bind to a temp address
2. listen
3. rename
Currently 3 happens before 2 so a client may see the socket file but fails
to connect to it.
os.fdopen() does not accepts bytes as its second argument which represent the
mode in which the file is to be opened. This patch makes sure unicodes are
passed in py3 by using pycompat.sysstr().
Previously, when a chg server is exiting, it does not handle connected
clients so clients may get ECONNRESET and crash:
1. client connect() # success
2. server shouldexit = True and exit
3. client recv() # ECONNRESET
297d89f2789e makes this race condition easier to reproduce if a lot of short
chg commands are started in parallel.
This patch fixes the above issue by unlinking the socket path to stop
queuing new connections and processing all pending connections before exit.
This patch changes unixforkingservice so it only calls
`self._servicehandler.unlinksocket(self.address)` at most once.
This is needed by the next patch.
This is necessary to solve future dependency cycle between commandserver.py
and chgserver.py.
'cmd' prefix is added to table and function names to avoid conflicts with
hgweb.
Almost all sys.stdin/out/err in hgext/ and mercurial/ are replaced by util's.
There are a few exceptions:
- lsprof.py and statprof.py are untouched since they are a kind of vendor
code and they never import mercurial modules right now.
- ui._readline() needs to replace sys.stdin and stdout to pass them to
raw_input(). We'll need another workaround here.
This makes a channeledoutput thread-safe as long as the underlying fwrite() is
thread-safe. Both POSIX and Windows implementations are documented as MT-safe.
MT-safety is necessary to use ui.fout and ui.ferr in hgweb.
Now setpgid has 2 main purposes: better handling for terminal-generated
SIGTSTP, SIGINT, and process-exit-generated SIGHUP. Update the comment to
explain things more clearly.
The old value 5 was arbitrary chosen. Since there's no practical reason to
limit the backlog, this patch simply uses SOMAXCONN as a value large enough.
SocketServer.ForkingMixIn of Python 2.x has a couple of issues, such as:
- race condition that leads to 100% CPU usage (Python 2.6)
https://bugs.python.org/issue21491
- can't wait for children belonging to different process groups (Python 2.6)
- leaves at least one zombie process (Python 2.6, 2.7)
https://bugs.python.org/issue11109
The first two are critical because we do setpgid(0, 0) in child process to
isolate terminal signals. The last one isn't, but ForkingMixIn seems to be
doing silly. So there are two choices:
a) backport and maintain SocketServer until we can drop support for Python 2.x
b) replace SocketServer by simpler one and eliminate glue codes
I chose (b) because it's great time for getting rid of utterly complicated
SocketServer stuff, and preparing for future move towards prefork service.
New unixforkingservice is implemented loosely based on chg 531f8ef64be6. It
is monolithic but much simpler than SocketServer. unixservicehandler provides
customizing points for chg, and it will be shared with future prefork service.
Old unixservice class is still used by chgserver. It will be removed later.
Thanks to Jun Wu for investigating these issues.
This is common between chg and vanilla forking server, so move it to
commandserver and unify handle().
It would be debatable whether we really need gc.collect() or not, but that
is beyond the scope of this series. Maybe we can remove gc.collect() once
all resource deallocations are switched to context manager.
This will allow us to clear in-memory password storage per runcommand().
I've updated commandserver to call resetstate() of both ui and repo.ui because
they may have different states in theory.
In 'unix' mode, the server is typically detached from the console. Therefore
a client couldn't see the exception that occurred while instantiating the
server object.
This patch tries to catch the early error and send it to 'e' channel even if
the server isn't instantiated yet. This means the error may be sent before the
initial hello message. So it's up to the client implementation whether to
handle the early error message or error out as protocol violation.
The error handling code is also copied to chgserver.py. I'll factor out them
later if we manage to get chg passes the test suite.
These operations are obviously invalid for file-like channels because they
will read or write protocol headers.
This patch works around the issue that "hg archive" generates a corrupted
zip file on Windows commandserver because of unusable tell() implementation.
But the problem still occurs without using a commandserver.
$ hg archive -R not-small-repo -t zip - | cat > invalid.zip
So, this patch cannot fix the issue5049 completely.
A progress bar is normally disabled in command-server session, but chg can
enable it. This patch makes sure the last-print time is measured per command.
Otherwise, progress.delay could be ineffective if a progbar was loaded before
forking worker process.
This patch is corresponding to the following change.
https://bitbucket.org/yuja/chg/commits/2dfe3e90b365
This prepares for porting the chg server. In chg, a server receives client's
stdio over a UNIX domain socket to override server channels. This is because
chg should behave as if it is a normal hg command attached to tty. "nontty"
is not wanted.
This patch is corresponding to the following change. This doesn't test the
identity of "cin" object because the current version of chg reopens stdio
to apply buffering mode.
https://bitbucket.org/yuja/chg/commits/c48c7aed5fc0
Because unknown attributes are delegated to the underlying file object,
commandserver channels said they were '<stdout>' or '<stdin>' even though
they weren't. This patch makes them say '<X-channel>'.
We generally make modules importable from the front-end layer, dispatch ->
commands -> x. So the import cycle to dispatch should be resolved by the
commandserver module.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
Because pipe-mode server uses stdio as IPC channel, other modules should not
touch stdio directly and use ui instead. However, this strategy is brittle
because several Python functions read and write stdio implicitly.
print 'hello' # should use ui.write()
# => ch = 'h', size = 1701604463 'ello', data = '\n'
This patch adds protection for such mistakes. Both stdio files and low-level
file descriptors are redirected to /dev/null while command server uses them.
Because unix-mode server forks child process per connection, client does not
know the pid of the server that will handle requests. The pid is necessary
to interrupt hung process:
1. client connects to socket server
2. server accepts the connection, forks, and tells pid
3. client requests "runcommand pull"
.. hung ..
4. client sends SIGINT to the (forked) server
5. server returns from I/O wait
Note that getsockopt(SO_PEERCRED) of Linux cannot be used because the server
fork()s after accept().
Typical use case of 'unix' mode is a background hg daemon.
$ hg serve --cmdserver unix --cwd / -a /tmp/hg-`id -u`.sock
Unlike 'pipe' mode in which parent process keeps stdio channel, 'unix' server
can be detached. So clients can freely connect and disconnect from server,
saving Python start-up time.
It might be better to write "--cmdserver socket -a unix:/sockpath" instead
of "--cmdserver unix -a /sockpath" in case hgweb gets the ability to listen
on unix domain socket.