This patch adds a fixer that accounts for changes in python packages, as the
framework provided by lib2to3 is only able to track changes in module names.
This fixer (hopefully) can fix any change in one-level hierarchies.
To exemplify, this fixer can successfully change an import from
"email.MIMEMultipart" to "email.mime.multipart".
This patch implements a script that inherits most of its functionality from
hg's setup.py and adds support to calling 2to3 during invocation with python3.
The motivation of having this script around is twofold:
1) It enables py3k crazies to test mercurial in py3k and, hopefully, patch it
more easily, so it can improve the py3k support to eventually run there.
2) Being separated from the main setup.py eliminates the need to make hg's
setup.py even more cluttered, and enables "independent" development until
the port is done.
Some considerations about the structure of this patch:
Mercurial already overrides the behavior of build_py, this patch tweaks it a bit
more to add support to call 2to3 with a custom fixer* location for Mercurial.
There is also a need of having the core C modules built *before* the
translation process starts, otherwise 2to3 will think those are global modules.
* A fixer is a python module that transforms python 2.x code in python 3.x
code.
This patch implement a fixer that replaces all calls to the '%' when bytes
arguments are used to a call to bytesformatter(), a function that knows how to
format byte strings. As one can't be sure if a formatting call is done when
only variables are used in a '%' call, these calls are also translated. The
bytesformatter, in runtime, makes sure to return the "raw" % operation if
that's what was intended.
This patch adds some ugly constructs. The first of them is bytesformatter, a
function that formats strings like when '%' is called. The main motivation for
this function is py3k's strange behavior:
>>> 'foo %s' % b'bar'
"foo b'bar'"
>>> b'foo %s' % b'bar'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for %: 'bytes' and 'bytes'
>>> b'foo %s' % 'bar'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for %: 'bytes' and 'str'
In other words, if we can't format bytes with bytes, and recall that all
mercurial strings will be converted by a fixer, then things will break badly if
we don't take a similar approach.
The other addition with this patch is that the os.environ dictionary is
monkeypatched to have bytes items. Hopefully this won't be needed in the
future, as python 3.2 might get a os.environb dictionary that holds bytes
items.
This patch implements a 2to3 fixer that converts all plain strings in a python
source file to byte strings syntax. Example:
foo = 'Normal string'
would become
foo = b'Normal string'
The motivation behind this fixer can be found in
http://selenic.com/pipermail/mercurial-devel/2010-June/022363.html or, in other
words: the current hg source assumes that _most_ strings are "meant" to be byte
sequences, so it makes sense to make the convertion implemented by this patch.
As mentioned above, not all mercurial modules want to use strings as bytes,
examples include i18n (which uses unicode), and demandimport (in py3k, module
names are normal strings, thus unicode, and there's no need for a convertion).
Therefore, these modules are blacklisted in the fixer. There are also a few
functions that can take only unicode arguments, thus the convertion shouldn't
be done for those.