Summary:
D13853115 adds `edenscm/` to `sys.path` and code still uses `import mercurial`.
That has nasty problems if both `import mercurial` and
`import edenscm.mercurial` are used, because Python would think `mercurial.foo`
and `edenscm.mercurial.foo` are different modules so code like
`try: ... except mercurial.error.Foo: ...`, or `isinstance(x, mercurial.foo.Bar)`
would fail to handle the `edenscm.mercurial` version. There are also some
module-level states (ex. `extensions._extensions`) that would cause trouble if
they have multiple versions in a single process.
Change imports to use the `edenscm` so ideally the `mercurial` is no longer
imported at all. Add checks in extensions.py to catch unexpected extensions
importing modules from the old (wrong) locations when running tests.
Reviewed By: phillco
Differential Revision: D13868981
fbshipit-source-id: f4e2513766957fd81d85407994f7521a08e4de48
Summary:
Turned on the auto formatter. Ran `arc lint --apply-patches --take BLACK **/*.py`.
Then run `arc lint` again so some other autofixers like spellchecker etc. looked
at the code base. Manually accept the changes whenever they make sense, or use
a workaround (ex. changing "dict()" to "dict constructor") where autofix is false
positive. Disabled linters on files that are hard (i18n/polib.py) to fix, or less
interesting to fix (hgsubversion tests), or cannot be fixed without breaking
OSS build (FBPYTHON4).
Conflicted linters (test-check-module-imports.t, part of test-check-code.t,
test-check-pyflakes.t) are removed or disabled.
Duplicated linters (test-check-pyflakes.t, test-check-pylint.t) are removed.
An issue of the auto-formatter is lines are no longer guarnateed to be <= 80
chars. But that seems less important comparing with the benefit auto-formatter
provides.
As we're here, also remove test-check-py3-compat.t, as it is currently broken
if `PYTHON3=/bin/python3` is set.
Reviewed By: wez, phillco, simpkins, pkaush, singhsrb
Differential Revision: D8173629
fbshipit-source-id: 90e248ae0c5e6eaadbe25520a6ee42d32005621b
sslutil contains its own hostname matching logic. CPython has code
for the same intent. However, it is only available to Python 2.7.9+
(or distributions that have backported 2.7.9's ssl module
improvements).
This patch effectively imports CPython's hostname matching code
from its ssl.py into sslutil.py. The hostname matching code itself
is pretty similar. However, the DNS name matching code is much more
robust and spec conformant.
As the test changes show, this changes some behavior around
wildcard handling and IDNA matching. The new behavior allows
wildcards in the middle of words (e.g. 'f*.com' matches 'foo.com')
This is spec compliant according to RFC 6125 Section 6.5.3 item 3.
There is one test where the matcher is more strict. Before,
'*.a.com' matched '.a.com'. Now it doesn't match. Strictly speaking
this is a security vulnerability.
CPython has a more comprehensive test suite for it's built-in hostname
matching functionality. This patch adds its tests so we can improve
our hostname matching functionality.
Many of the tests have different results from CPython. These will be
addressed in a subsequent commit.
Before:
>>> str(url('file:///c:/tmp/foo/bar'))
'file:c%3C/tmp/foo/bar'
After:
>>> str(url('file:///c:/tmp/foo/bar'))
'file:///c%3C/tmp/foo/bar'
The previous behaviour had no effect on mercurial itself (clone command for
instance) because we fortunately called .localpath() on the parsed URL.
hgsubversion was not so lucky and cloning a local subversion repository on
Windows no longer worked on the default branch (it works on stable because
2b62605189dc defeats the hasdriveletter() test in url class).
I do not know if the %3C is correct or not but svn accepts file:// URLs
containing it. Mads fixed it in 2b62605189dc, so we can always backport should
the need arise.
This fix mirrors the changes made to test-doctest.py in 04cfbbc5ae97
and 39599b7929c4.
Without this change, tests running heredoctest.py can fail on certain
versions of OS X when TERM is set to xterm-256color:
$ /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -m heredoctest <<EOF
> >>> open('b', 'w').write('this' * 1000)
> EOF
+ \x1b[?1034h (no-eol) (esc)
A similar problem occurs with test-url.py:
$ ./run-tests.py test-url.py
--- .../tests/test-url.py.out
+++ .../tests/test-url.py.err
@@ -0,0 +1 @@
+
ERROR: .../test-url.py output changed
!
Failed test-url.py: output changed
# Ran 1 tests, 0 skipped, 1 failed.
8264e5172141 made sure that paths that seemed to start with a windows drive
letter would not get an extra leading slash.
localpath should thus not try to handle this case by removing a leading slash,
and this special handling is thus removed.
(The localpath handling of this case was wrong anyway, because paths that look
like they start with a windows drive letter can't have a leading slash.)
A quick verification of this is to run 'hg id file:///c:/foo/bar/'.
Any entries in subjectAltName would prevent fallback to using commonName, but
RFC 2818 says:
If a subjectAltName extension of type dNSName is present, that MUST
be used as the identity. Otherwise, the (most specific) Common Name
field in the Subject field of the certificate MUST be used.
We now only consider dNSNames in subjectAltName.
(dNSName is known as 'DNS' in OpenSSL/Python.)
str(url) was recently changed to return only file:/. However, the
canonical way to represent absolute local paths is file:/// [1], which
is also expected by at least hgsubversion.
Relative paths are returned as file:the/relative/path.
[1] http://en.wikipedia.org/wiki/File_URI_scheme
The introduction of the new URL parsing code has created a startup
time regression. This is mainly due to the use of url.hasscheme() in
the ui class. It ends up importing many libraries that the url module
requires.
This fix helps marginally, but if we can get rid of the urllib import
in the URL parser all together, startup time will go back to normal.
perfstartup time before the URL refactoring (707e4b1e8064):
! wall 0.050692 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
current startup time (9ad1dce9e7f4):
! wall 0.070685 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
after this change:
! wall 0.064667 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
While the URL parser is very forgiving about what characters are
allowed in each component, it's useful to be strict about the scheme
so we don't accidentally interpret local paths with colons as URLs.
This restricts schemes to containing alphanumeric characters, dashes,
pluses, and dots (as specified in RFC 2396).
This adds a url object that re-implements urlsplit() and
unsplit(). The implementation splits out usernames, passwords, and
ports.
The implementation is based on the behavior specified by RFC
2396[1]. However, it is much more forgiving than the RFC's
specification; it places no specific restrictions on what characters
are allowed in each segment of the URL other than what is necessary to
split the URL into its constituent parts.
[1]: http://www.ietf.org/rfc/rfc2396.txt
SSLSocket.getpeercert() returns tuple containing unicode for 'subject'.
Since Mercurial does't support IDN at all, it just returns error for non-ascii
certname.
Pythons SSL module verifies that certificates received for HTTPS are valid
according to the specified cacerts, but it doesn't verify that the certificate
is for the host we connect to.
We now explicitly verify that the commonName in the received certificate
matches the requested hostname and is valid for the time being.
This is a minimal patch where we try to fail to the safe side, but we do still
rely on Python's SSL functionality and do not try to implement the standards
fully and correctly. CRLs and subjectAltName are not handled and proxies
haven't been considered.
This change might break connections to some sites if cacerts is specified and
the certificates (by our definition) isn't correct. The workaround is to
disable cacerts which in most cases isn't much worse than it was before with
cacerts.