Before:
>>> str(url('file:///c:/tmp/foo/bar'))
'file:c%3C/tmp/foo/bar'
After:
>>> str(url('file:///c:/tmp/foo/bar'))
'file:///c%3C/tmp/foo/bar'
The previous behaviour had no effect on mercurial itself (clone command for
instance) because we fortunately called .localpath() on the parsed URL.
hgsubversion was not so lucky and cloning a local subversion repository on
Windows no longer worked on the default branch (it works on stable because
2b62605189dc defeats the hasdriveletter() test in url class).
I do not know if the %3C is correct or not but svn accepts file:// URLs
containing it. Mads fixed it in 2b62605189dc, so we can always backport should
the need arise.
This fix mirrors the changes made to test-doctest.py in 04cfbbc5ae97
and 39599b7929c4.
Without this change, tests running heredoctest.py can fail on certain
versions of OS X when TERM is set to xterm-256color:
$ /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -m heredoctest <<EOF
> >>> open('b', 'w').write('this' * 1000)
> EOF
+ \x1b[?1034h (no-eol) (esc)
A similar problem occurs with test-url.py:
$ ./run-tests.py test-url.py
--- .../tests/test-url.py.out
+++ .../tests/test-url.py.err
@@ -0,0 +1 @@
+
ERROR: .../test-url.py output changed
!
Failed test-url.py: output changed
# Ran 1 tests, 0 skipped, 1 failed.
8264e5172141 made sure that paths that seemed to start with a windows drive
letter would not get an extra leading slash.
localpath should thus not try to handle this case by removing a leading slash,
and this special handling is thus removed.
(The localpath handling of this case was wrong anyway, because paths that look
like they start with a windows drive letter can't have a leading slash.)
A quick verification of this is to run 'hg id file:///c:/foo/bar/'.
Any entries in subjectAltName would prevent fallback to using commonName, but
RFC 2818 says:
If a subjectAltName extension of type dNSName is present, that MUST
be used as the identity. Otherwise, the (most specific) Common Name
field in the Subject field of the certificate MUST be used.
We now only consider dNSNames in subjectAltName.
(dNSName is known as 'DNS' in OpenSSL/Python.)
str(url) was recently changed to return only file:/. However, the
canonical way to represent absolute local paths is file:/// [1], which
is also expected by at least hgsubversion.
Relative paths are returned as file:the/relative/path.
[1] http://en.wikipedia.org/wiki/File_URI_scheme
The introduction of the new URL parsing code has created a startup
time regression. This is mainly due to the use of url.hasscheme() in
the ui class. It ends up importing many libraries that the url module
requires.
This fix helps marginally, but if we can get rid of the urllib import
in the URL parser all together, startup time will go back to normal.
perfstartup time before the URL refactoring (707e4b1e8064):
! wall 0.050692 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
current startup time (9ad1dce9e7f4):
! wall 0.070685 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
after this change:
! wall 0.064667 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
While the URL parser is very forgiving about what characters are
allowed in each component, it's useful to be strict about the scheme
so we don't accidentally interpret local paths with colons as URLs.
This restricts schemes to containing alphanumeric characters, dashes,
pluses, and dots (as specified in RFC 2396).
This adds a url object that re-implements urlsplit() and
unsplit(). The implementation splits out usernames, passwords, and
ports.
The implementation is based on the behavior specified by RFC
2396[1]. However, it is much more forgiving than the RFC's
specification; it places no specific restrictions on what characters
are allowed in each segment of the URL other than what is necessary to
split the URL into its constituent parts.
[1]: http://www.ietf.org/rfc/rfc2396.txt
SSLSocket.getpeercert() returns tuple containing unicode for 'subject'.
Since Mercurial does't support IDN at all, it just returns error for non-ascii
certname.
Pythons SSL module verifies that certificates received for HTTPS are valid
according to the specified cacerts, but it doesn't verify that the certificate
is for the host we connect to.
We now explicitly verify that the commonName in the received certificate
matches the requested hostname and is valid for the time being.
This is a minimal patch where we try to fail to the safe side, but we do still
rely on Python's SSL functionality and do not try to implement the standards
fully and correctly. CRLs and subjectAltName are not handled and proxies
haven't been considered.
This change might break connections to some sites if cacerts is specified and
the certificates (by our definition) isn't correct. The workaround is to
disable cacerts which in most cases isn't much worse than it was before with
cacerts.