Previously, commandserver was using select.select. That could have issue if
_sock.fileno() >= FD_SETSIZE (usually 1024), which raises:
ValueError: filedescriptor out of range in select()
We got that in production today, although it's the code opening that many
files to blame, it seems better for commandserver to work in this case.
There are multiple way to "solve" it, like preserving a fd with a small
number and swap it with sock using dup2(). But upgrading to a modern
selector supported by the system seems to be the most correct way.
This library was a backport of the Python 3 "selectors" library. It is
useful to provide a better selector interface for Python2, to address some
issues of the plain old select.select, mentioned in the next patch.
The code [1] was ported using the MIT license, with some minor modifications
to make our test happy:
1. "# no-check-code" was added since it's foreign code.
2. "from __future__ import absolute_import" was added.
3. "from collections import namedtuple, Mapping" changed to avoid direct
symbol import.
[1]: d27dbd2fdc/selectors2.py
# no-check-commit
I was several directories deep in the kernel tree, ran `hg add`, and got the
warning about the size of one of the files. I noticed that it suggested undoing
the add with a specific revert command. The problem is, it would have failed
since the path printed was relative to the repo root instead of cwd. While
here, I just fixed the other messages too. As an added benefit, these messages
now look the same as the verbose/inexact messages for the corresponding command.
I don't think most of these messages are reachable (typically the corresponding
cmdutil function does the check). I wasn't able to figure out why the keyword
tests were failing when using pathto()- I couldn't cause an absolute path to be
used by manipulating the --cwd flag on a regular add. (I did notice that
keyword is adding the file without holding wlock.)
This is done by a script [2] using RedBaron [1], a tool designed for doing
code refactoring. All "default" values are decided by the script and are
strongly consistent with the existing code.
There are 2 changes done manually to fix tests:
[warn] mercurial/exchange.py: experimental.bundle2-output-capture: default needs manual removal
[warn] mercurial/localrepo.py: experimental.hook-track-tags: default needs manual removal
Since RedBaron is not confident about how to indent things [2].
[1]: https://github.com/PyCQA/redbaron
[2]: https://github.com/PyCQA/redbaron/issues/100
[3]:
#!/usr/bin/env python
# codemod_configitems.py - codemod tool to fill configitems
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import os
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
_configmethods = {'config', 'configbool', 'configint', 'configbytes',
'configlist', 'configdate'}
def extractstring(rnode):
"""get the string from a RedBaron string or call_argument node"""
while rnode.type != 'string':
rnode = rnode.value
return rnode.value[1:-1] # unquote, "'str'" -> "str"
def uiconfigitems(red):
"""match *.ui.config* pattern, yield (node, method, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
obj = node[-3].value
method = node[-2].value
args = node[-1]
section = args[0].value
name = args[1].value
if (obj in ('ui', 'self') and method in _configmethods
and section.type == 'string' and name.type == 'string'):
entry = (node, method, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def coreconfigitems(red):
"""match coreconfigitem(...) pattern, yield (node, args, section, name)"""
for node in red.find_all('atomtrailers'):
entry = None
try:
args = node[1]
section = args[0].value
name = args[1].value
if (node[0].value == 'coreconfigitem' and section.type == 'string'
and name.type == 'string'):
entry = (node, args, extractstring(section),
extractstring(name))
except Exception:
pass
else:
if entry:
yield entry
def registercoreconfig(cfgred, section, name, defaultrepr):
"""insert coreconfigitem to cfgred AST
section and name are plain string, defaultrepr is a string
"""
# find a place to insert the "coreconfigitem" item
entries = list(coreconfigitems(cfgred))
for node, args, nodesection, nodename in reversed(entries):
if (nodesection, nodename) < (section, name):
# insert after this entry
node.insert_after(
'coreconfigitem(%r, %r,\n'
' default=%s,\n'
')' % (section, name, defaultrepr))
return
def main(argv):
if not argv:
print('Usage: codemod_configitems.py FILES\n'
'For example, FILES could be "{hgext,mercurial}/*/**.py"')
dirname = os.path.dirname
reporoot = dirname(dirname(dirname(os.path.abspath(__file__))))
# register configitems to this destination
cfgpath = os.path.join(reporoot, 'mercurial', 'configitems.py')
cfgred = redbaron.RedBaron(readpath(cfgpath))
# state about what to do
registered = set((s, n) for n, a, s, n in coreconfigitems(cfgred))
toregister = {} # {(section, name): defaultrepr}
coreconfigs = set() # {(section, name)}, whether it's used in core
# first loop: scan all files before taking any action
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
iscore = ('mercurial' in path) and ('hgext' not in path)
red = redbaron.RedBaron(readpath(path))
# find all repo.ui.config* and ui.config* calls, and collect their
# section, name and default value information.
for node, method, args, section, name in uiconfigitems(red):
if section == 'web':
# [web] section has some weirdness, ignore them for now
continue
defaultrepr = None
key = (section, name)
if len(args) == 2:
if key in registered:
continue
if method == 'configlist':
defaultrepr = 'list'
elif method == 'configbool':
defaultrepr = 'False'
else:
defaultrepr = 'None'
elif len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
# try to understand the "default" value
dnode = args[2].value
if dnode.type == 'name':
if dnode.value in {'None', 'True', 'False'}:
defaultrepr = dnode.value
elif dnode.type == 'string':
defaultrepr = repr(dnode.value[1:-1])
elif dnode.type in ('int', 'float'):
defaultrepr = dnode.value
# inconsistent default
if key in toregister and toregister[key] != defaultrepr:
defaultrepr = None
# interesting to rewrite
if key not in registered:
if defaultrepr is None:
print('[note] %s: %s.%s: unsupported default'
% (path, section, name))
registered.add(key) # skip checking it again
else:
toregister[key] = defaultrepr
if iscore:
coreconfigs.add(key)
# second loop: rewrite files given "toregister" result
for path in argv:
# reconstruct redbaron - trade CPU for memory
red = redbaron.RedBaron(readpath(path))
changed = False
for node, method, args, section, name in uiconfigitems(red):
key = (section, name)
defaultrepr = toregister.get(key)
if defaultrepr is None or key not in coreconfigs:
continue
if len(args) >= 3 and (args[2].target is None or
args[2].target.value == 'default'):
try:
del args[2]
changed = True
except Exception:
# redbaron fails to do the rewrite due to indentation
# see https://github.com/PyCQA/redbaron/issues/100
print('[warn] %s: %s.%s: default needs manual removal'
% (path, section, name))
if key not in registered:
print('registering %s.%s' % (section, name))
registercoreconfig(cfgred, section, name, defaultrepr)
registered.add(key)
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if toregister:
print('updating configitems.py')
writepath(cfgpath, cfgred.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
This was only used by the sparse extension's dirstate._ignore
override, which no longer exists.
Differential Revision: https://phab.mercurial-scm.org/D60
This is a Windows only thing. Unfortunately, the socket is closed at this point
(so the certificate is unavailable to check the chain). That means it's printed
out when verification fails as a guess, on the assumption that 1) most of the
time verification won't fail, and 2) sites using expired or certs that are too
new will be rare. Maybe this is an argument for adding more functionality to
debugssl, to test for problems and print certificate info. Or maybe it's an
argument for bundling certificates with the Windows builds. That idea was set
aside when the enhanced SSL code went in last summer, and it looks like there
were issues with using certifi on Windows anyway[1].
This was tested by deleting the certificate out of certmgr.msc > "Third-Party
Root Certification Authorities" > "Certificates", seeing `hg pull` fail (with
the new message), trying this command, and then successfully performing the pull
command.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089573.html
This is only useful on Windows, and avoids the need to use Internet Explorer to
build the certificate chain. I can see this being extended in the future to
print information about the certificate(s) to help debug issues on any platform.
Maybe even perform some of the python checks listed on the secure connections
wiki page. But for now, all I need is 1) a command that can be invoked in a
setup script to ensure the certificate is installed, and 2) a command that the
user can run if/when a certificate changes in the future.
It would have been nice to leverage the sslutil library to pick up host specific
settings, but attempting to use sslutil.wrapsocket() failed the
'not sslsocket.cipher()' check in it and aborted.
The output is a little more chatty than some commands, but I've seen the update
take 10+ seconds, and this is only a debug command.
I started a thread[1] on the mailing list awhile ago, but the short version is
that Windows doesn't ship with a full list of certificates[2]. Even if the
server sends the whole chain, if Windows doesn't have the appropriate
certificate pre-installed in its "Third-Party Root Certification Authorities"
store, connections mysteriously fail with:
abort: error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
Windows expects the application to call the methods invoked here as part of the
certificate verification, triggering a call out to Windows update if necessary,
to complete the trust chain. The python bug to add this support[3] hasn't had
any recent activity, and isn't targeting py27 anyway.
The only work around that I could find (besides figuring out the certificate and
walking through the import wizard) is to browse to the site in Internet
Explorer. Opening the page with FireFox or Chrome didn't work. That's a pretty
obscure way to fix a pretty obscure problem. We go to great lengths to
demystify various SSL errors, but this case is clearly lacking. Let's try to
make things easier to diagnose and fix.
When I had trouble figuring out how to get ctypes to work with all of the API
pointers, I found that there are other python projects[4] using this API to
achieve the same thing.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-April/096501.html
[2] https://support.microsoft.com/en-us/help/931125/how-to-get-a-root-certificate-update-for-windows
[3] https://bugs.python.org/issue20916
[4] 3b86bce206/source/updateCheck.py (L511)
There is still some use of 'deletedivergent' bookmark here. They will be taken
care of later. The 'deletedivergent' code needs some rework before fitting in
the new world.
We want to track bookmark movement within a transaction. For this we need a
more centralized way to update bookmarks.
For this purpose we introduce a new 'applychanges' method that apply a list of
changes encoded as '(name, node)'. We'll cover all bookmark updating code to
this new method in later changesets and add bookmark move in the transaction
when all will be migrated.
If a matcher doesn't implement visitdir, we should be returning True so that
tree traversals are not prematurely pruned. The old value of False would prevent
tree traversals when using any matcher that didn't implement visitdir.
Differential Revision: https://phab.mercurial-scm.org/D83
Thinking a bit further about list/dict subscript operation (proposed by
issue 5534), I noticed the current data structure, a dict of dicts, might
not be ideal.
For example, if there were "'[' index ']'" and "'.' key" operators,
"{parents[0]}" would return "{p1rev}:{p1node}", and we would probably want to
write "{parents[0].desc}" to get the first element of "{parents % "{desc}"}".
This will basically execute parents[0].makemap()['desc'] in Python.
Given the rule above, "{peerpaths.default.pushurl}" will be translated to
peerpaths['default'].makemap()['pushurl'], which means {peerpaths} should
be a single-level dict and sub-options should be makemap()-ed.
"{peerpaths % "{name} = {url}, {pushurl}, ..."}"
(Well, it could be peerpaths['default']['pushurl'], but in which case,
peerpaths['default'] should be a plain dict, not a hybrid object.)
So, let's mark the current implementation experimental and revisit it later.
find_deepest is used to find the "best" ancestors given a list. In the main
loop it keeps an invariant called 'ninteresting' which is supposed to contain
the number of non-zero entries in the 'interesting' array. This invariant is
incorrectly maintained, however, which leads the the algorithm returning an
empty result for certain graphs. This has been fixed.
Also, the 'interesting' array is supposed to fit 2^ancestors values, but is
incorrectly allocated to twice that size. This has been fixed as well.
The tests in test-ancestor.py compare the Python and C versions of the code,
and report the error correctly, since the Python version works correct. Even
so, I have added an additional test against the expected result, in the event
that both algorithms have an identical error in the future.
This fixes issue5623.
In some case, the default of one value is derived from other value. We add a
way to register them anyway and an associated devel-warning.
The registration is very naive for the moment. We might be able to have a
better way for registering each of these cases but it could be done later.
cg.apply used to returns the added nodes. Callers doesn't have a use for it
anymore, remove the added node and stops recording it in the current
operation.
This information was added in the current release cycle so no extensions
breakage should happens.
updatephases have no use of the 'addednodes' parameter since 44be3dc1fec8.
However caller are still passing it for nothing, remove the parameter and
remove computing of the added nodes in caller.
We adds new computation to find and record the revision affected by the
boundary retraction. This add more complication to the function but this seems
fine since it is only used in a couple of rare and explicit cases (`hg phase
--force` and `hg qimport`).
Having strong tracking of phase changes is worth the effort.
It is useful to detect noop and avoid expensive operations in this case.
We return the information to inform the caller of a possible update. Top level
function might need to react to the phase update (eg: invalidating some
caches, tracking phase change).
We rework the code to call 'registernew' before any other phase advancement.
This make 'changegroup.apply' register correct phase movement for the added
and bundled nodes.
This new function will be used by code that adds new changesets. It ajusts the
phase boundary to make sure added changesets are at least in their target
phase (they end up in an higher phase if their parents are in a higher phase).
Having a dedicated function also simplify the phases tracking. All the new
nodes are passed as argument, so we know that all of them needs to have their
new phase registered. We also know that no other nodes will be affected, so no
extra computation are needed.
This function differ from 'retractboundary' where some nodes might change
phase while some other might not. It can also affect nodes not passed as
parameters.
These simplification also apply to the computation itself. For now we use
'_retractboundary' there by convenience, but we may introduces simpler code
later.
While registering new revisions, we still need to check the actual phases of
the added node because it might be higher than the target phase (eg: target is
draft but parent is secret).
We will migrate users over the next changesets.
At the moment the 'retractboundary' function is called for multiple reasons:
First, actually retracting boundaries. There are only two cases for theses:
'hg phase --force' and 'hg qimport'. This will need extra graph computation to
retrieve the phase changes.
Second, setting the phases of newly added changesets. In this case we already
know all the affected nodes and we just needs to register different
information (old phase is None).
Third, when reducing the set of roots when advancing phase. The phase are
already properly tracked so we do not needs anything else in this case.
To deal with this difference in phase tracking, we extract the core logic into
a private method that all three cases can use.
Makes advanceboundary record the phase movement of affected revisions in
tr.changes['phases'].
The tracking is not usable yet because the 'retractboundary' function can also
affect phases.
We'll improve that in the coming changesets.
When advancing phases, we compute the new roots for the phases above. During
this process, we need to compute all the revisions that change phases (to the
new target phases). Extract these revisions into a separate variable. This
will be useful to record the phase changes in the transaction.
It seems that we were calling retractboundary for each phases to process.
Putting the retractboundary out of the loop reduce the number of calls,
helping tracking the phases changes.
unionmatcher is currently used where only a limited subset of its
functions will be called. Specifically, visitdir() is never
called. The next patch will pass it to dirstate.walk() where it will
matter that visitdir() is correctly implemented, so let's fix
that. Also add the explicitdir etc that will also be assumed by
dirstate.walk() to exist on a matcher.
Differential Revision: https://phab.mercurial-scm.org/D58
The forceincludematcher is simply a unionmatcher of a includematcher
(matching paths recursively) with the given matcher. Since the
forceincludematcher is only used by sparse, move it there.
I don't have a good sparse repo setup to test performance impact on.
Differential Revision: https://phab.mercurial-scm.org/D57
rebase will have similar logic, so let's extract it. Besides, it makes
the histedit code more readable.
We may want to parametrize acceptintervention() by the exception(s)
that should result in transaction close.
Differential Revision: https://phab.mercurial-scm.org/D66
Update the dirstate functions so that the caller supplies the full backup
filename rather than just a prefix and suffix.
The localrepo code was already hard-coding the fact that the backup name must
be (exactly prefix + "dirstate" + suffix): it relied on this in _journalfiles()
and undofiles(). Making the caller responsible for specifying the full backup
name removes the need for the localrepo code to assume that dirstate._filename
is always "dirstate".
Differential Revision: https://phab.mercurial-scm.org/D68
This was meant as a substitute for Python's "with" with multiple
context managers before we moved to Python 2.7. We're now on 2.7, so
we should have no reason to keep ctxmanager. "hg grep --all
ctxmanager" says that it was never used anyway.
Differential Revision: https://phab.mercurial-scm.org/D73
This is the result of running:
python codemod_nestedwith.py **/*.py
where codemod_nestedwith.py looks like this:
#!/usr/bin/env python
# codemod_nestedwith.py - codemod tool to rewrite nested with
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import sys
import redbaron
def readpath(path):
with open(path) as f:
return f.read()
def writepath(path, content):
with open(path, 'w') as f:
f.write(content)
def main(argv):
if not argv:
print('Usage: codemod_nestedwith.py FILES')
for i, path in enumerate(argv):
print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
changed = False
red = redbaron.RedBaron(readpath(path))
processed = set()
for node in red.find_all('with'):
if node in processed or node.type != 'with':
continue
top = node
child = top[0]
while True:
if len(top) > 1 or child.type != 'with':
break
# estimate line length after merging two "with"s
new = '%swith %s:' % (top.indentation, top.contexts.dumps())
new += ', %s' % child.contexts.dumps()
# only do the rewrite if the end result is within 80 chars
if len(new) > 80:
break
processed.add(child)
top.contexts.extend(child.contexts)
top.value = child.value
top.value.decrease_indentation(4)
child = child[0]
changed = True
if changed:
print('updating %s' % path)
writepath(path, red.dumps())
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
Differential Revision: https://phab.mercurial-scm.org/D77
we wrap 'repo.svfs.audit' to check for the store lock when accessing file in
'.hg/store' for writing. This caught a couple of instance where the transaction
was released after the lock, we should probably have a dedicated checker for
that case.
When the appropriate developer warnings are enabled, We wrap 'repo.vfs.audit' to
check for locks when accessing file in '.hg' for writing. Another changeset will
add a 'ward' for the store vfs (svfs).
This check system has caught a handful of locking issues that have been fixed
in previous series (mostly in 4.0). I expect another batch to be caught in third
party extensions.
We introduce two real exceptions from extensions 'blackbox.log' (because a lot of
read-only operations add entry to it), and 'last-email.txt' (because 'hg email'
is currently a read only operation and there is value to keep it this way).
In addition we are currently allowing bisect to operate outside of the lock
because the current code is a bit hard to get properly locked for now. Multiple
clean up have been made but there is still a couple of them to do and the freeze
is coming.
We want to be able to do more precise check when auditing a path depending of
the intend of the file access (eg read versus write). So we now pass the 'mode'
value to 'audit' and update the audit function to accept them.
This will be put to use in the next changeset.
This function already does an excellent job of reading from context objects;
we simply need to change the single write call to eliminate all uses of the
wvfs.
As with past changes, the effect should be a no-op but opens the door to
in-memory merge later by using different context objects.
I ran into this ctypes bug while working with the Crypto API. While this could
be an issue with any Win32 API in theory, the handful of things that we call are
older functions that are unlikely to return COM errors, so I didn't retrofit
this everywhere.
The proposed syntax [1] was originally 'set{n rel}', but it seemed slightly
confusing if template is involved. On the other hand, we want to keep 'set[n]'
for future extension. So this patch introduces 'set#rel[n]' ternary operator.
I chose '#' just because it looks like applying an attribute.
This also adds stubs for 'set[n]' and 'set#rel' operators since these syntax
elements are fundamental for constructing 'set#rel[n]'.
[1]: https://www.mercurial-scm.org/wiki/RevsetOperatorPlan#ideas_from_mpm