On py3, pycompat.getenv is os.getenvb, which is not available on Windows.
This patch replaces its uses with encoding.environ.get and adds a check to
ensure that no new instances of os.getenv or os.setenv are introduced.
shlex.split() only accepts unicode strings on Python 3. After this patch we
will be using pycompat.shlexsplit(). This patch also replaces existing
occurrences of shlex.split with pycompat.shlexsplit.
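
A minimal sketch of how such a wrapper could look. The latin-1 round trip is
an assumption chosen here because it maps every byte to exactly one code
point; the function body is illustrative, not necessarily what the patch adds.

```python
import shlex


def shlexsplit(s):
    """Split a bytes command line (illustrative stand-in for pycompat.shlexsplit)."""
    # Assumption: latin-1 is used only as a lossless bytes <-> str bridge,
    # so arbitrary input bytes survive the round trip through shlex.split().
    parts = shlex.split(s.decode('latin-1'))
    return [p.encode('latin-1') for p in parts]


args = shlexsplit(b'commit -m "fix typo"')
```
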
sys.executable on Python 3 is a unicode string and we want bytes. So this
patch adds a new pycompat.sysexecutable, which holds the bytes obtained by
encoding it with os.fsencode(), since it is a path.
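
The idea can be sketched as below. The guard for an empty or missing
sys.executable is an assumption added for this example, not something the
patch description mentions.

```python
import os
import sys

# sys.executable can be '' or None when Python cannot determine its own path.
# Assumption: fall back to empty bytes in that case.
sysexecutable = os.fsencode(sys.executable) if sys.executable else b''
```
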
os.getenv() on Python 3 deals with unicode strings. If we want to pass bytes,
we have os.getenvb(), which deals with bytes. This patch adds pycompat.osgetenv,
which deals with bytes on both Python 2 and 3.
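
A rough sketch of a bytes-only environment lookup on Python 3. The fallback
branch for platforms without os.getenvb() is an assumption made for this
example only.

```python
import os


def osgetenv(key, default=None):
    """Look up a bytes key and return bytes (illustrative sketch)."""
    if hasattr(os, 'getenvb'):
        # Available where the OS environment is natively bytes (POSIX).
        return os.getenvb(key, default)
    # Assumption: elsewhere, go through the str API and re-encode.
    value = os.getenv(os.fsdecode(key))
    return default if value is None else os.fsencode(value)


os.environ['PYCOMPAT_DEMO'] = 'value'
found = osgetenv(b'PYCOMPAT_DEMO')
```
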
Keys of keyword arguments need to be str (unicode) on Python 3. We have a lot
of functions to which we pass keyword arguments. Utility functions that
convert the keys to unicode before passing them, and convert them back to
bytes once inside the function, will be helpful. We now have functions named
pycompat.strkwargs(dic) and pycompat.byteskwargs(dic) to help us.
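
A minimal sketch of the two helpers and how they pair up at a call site. The
latin-1 codec and the show() function are assumptions made for illustration.

```python
def strkwargs(dic):
    """Return a copy of dic with bytes keys converted to str."""
    return {k.decode('latin-1'): v for k, v in dic.items()}


def byteskwargs(dic):
    """Return a copy of dic with str keys converted back to bytes."""
    return {k.encode('latin-1'): v for k, v in dic.items()}


def show(**opts):
    # Hypothetical command: convert the keys back once inside the function.
    opts = byteskwargs(opts)
    return opts[b'rev']


# The caller holds bytes keys and converts them just before the ** call.
result = show(**strkwargs({b'rev': b'tip'}))
```
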
getopt.getopt() deals with unicode strings internally on Python 3, and if
bytes arguments are passed, it raises TypeError. So we now have
pycompat.getoptb(), which takes bytes arguments, converts them to unicode,
calls getopt.getopt(), converts the returned values back to bytes, and
returns them.
All the instances of getopt.getopt() are replaced with pycompat.getoptb().
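
The decode, call, encode sequence described above can be sketched as follows.
The choice of latin-1 is an assumption; the signature mirrors
getopt.getopt().

```python
import getopt


def getoptb(args, shortlist, namelist):
    """Bytes-in, bytes-out wrapper around getopt.getopt() (sketch)."""
    # Decode every input to str so getopt can process it on Python 3.
    uargs = [a.decode('latin-1') for a in args]
    ushort = shortlist.decode('latin-1')
    unames = [n.decode('latin-1') for n in namelist]
    opts, rest = getopt.getopt(uargs, ushort, unames)
    # Encode both the (option, value) pairs and the leftover arguments.
    opts = [(o.encode('latin-1'), v.encode('latin-1')) for o, v in opts]
    rest = [a.encode('latin-1') for a in rest]
    return opts, rest


opts, rest = getoptb([b'-v', b'--rev', b'1', b'file'], b'v', [b'rev='])
```
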
On Python 3, os.getcwd() returns a unicode string. We need the bytes version,
as paths are bytes on UNIX. Python 3 has os.getcwdb(), which returns the
current working directory as bytes.
As with the other things in pycompat, such as osname and ossep, we need to
rewrite every instance of os.getcwd to pycompat.getcwd to make them work
correctly on Python 3.
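
On Python 3 the alias can be as small as the sketch below; the join with
b'.hg' is only an illustrative use.

```python
import os

# Bytes-returning variant of os.getcwd(), aliased under the name callers use.
getcwd = os.getcwdb

# Bytes paths compose with other bytes path components.
hgdir = os.path.join(getcwd(), b'.hg')
```
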
Since standard streams are TextIO on Python 3, we can't use sys.stdin/out/err
directly. Fortunately we can get the underlying binary stream via .buffer, as
long as the streams aren't replaced by e.g. StringIO.
stdin/out/err are provided through util so we can wrap them with a platform
API.
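
A small sketch of the .buffer lookup. The helper name bytesstream and the
fallback of returning the stream unchanged are assumptions for this example.

```python
import io


def bytesstream(stream):
    """Return the binary stream under a text stream, if it exposes one."""
    # TextIOWrapper has .buffer; a replacement such as StringIO does not,
    # in which case the stream is returned unchanged (assumption).
    return getattr(stream, 'buffer', stream)


raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding='ascii')
out = bytesstream(text)
out.write(b'bytes go straight to the underlying stream\n')
```

For the real standard streams this would be applied to sys.stdin, sys.stdout
and sys.stderr.
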
sys.argv contains unicode strings on Python 3. We need a bytes version.
There was a Python bug/feature request asking them to implement one. It was
rejected, and one of the comments says that we can use fsencode() to get a
bytes version of sys.argv, though I am not sure about its correctness.
Link to the comment: http://bugs.python.org/issue8776#msg217416
After this patch we will have pycompat.sysargv, which gives us a bytes
version of sys.argv. If this patch goes in, I would like to make the
transformer rewrite sys.argv to pycompat.sysargv, because there are a lot of
occurrences.
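
The fsencode() approach from the linked comment can be sketched as below. The
helper name bytesargv is hypothetical, and the doubt about correctness
expressed above applies to this sketch as well.

```python
import os
import sys


def bytesargv(argv):
    """Encode each argument with the filesystem encoding (sketch)."""
    return [os.fsencode(a) for a in argv]


sysargv = bytesargv(sys.argv)
```
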
os.name returns unicodes on py3. Most of our checks are like
os.name == 'nt'
Because of the transformer, we have b'nt' on the right-hand side. The
condition will never be satisfied, even if os.name is 'nt', as that will be a
unicode string.
We either need to encode every occurrence of os.name or have a new variable,
which is much cleaner. Now we have pycompat.osname.
There are around 53 occurrences of os.name in the codebase which need to
be replaced by pycompat.osname to support Python 3.
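
A minimal sketch of such a variable on Python 3. The ascii codec and the
iswindows name are assumptions for illustration.

```python
import os

# os.name is str on Python 3; a str never compares equal to bytes, so the
# transformed check against b'nt' needs a bytes value on the left-hand side.
osname = os.name.encode('ascii')

iswindows = osname == b'nt'
```
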
The custom module importer was making these bytes, so when we poked
values into self.__dict__ we had bytes instead of unicode on py3 and
it didn't work.
This will be used to convert encoding.encoding to a str acceptable by
Python 3 functions.
The source encoding is changed to "latin-1" because encoding.encoding can
have arbitrary bytes. Since valid names should consist of ASCII characters,
we don't care about the mapping of non-ASCII characters so long as invalid
names are distinct from valid names.
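
A sketch of a conversion along these lines. The helper name tosysstr is
hypothetical; only the latin-1 source encoding comes from the description
above.

```python
import codecs


def tosysstr(s):
    """Convert bytes to the native str type without ever failing (sketch)."""
    if isinstance(s, str):
        return s
    # latin-1 accepts every byte value, so arbitrary input cannot raise here.
    return s.decode('latin-1')


# A valid ASCII encoding name converts to a str that Python 3 APIs accept.
codec = codecs.lookup(tosysstr(b'utf-8'))
```
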
This replacement for _pycompatstub is designed to be compatible with our
demandimporter. The try-except is replaced by a version comparison because
ImportError will no longer be raised immediately.
This should be less invasive than mucking builtins.
Since tokenize.untokenize() looks at the start/end positions of tokens, we
calculate them from the NEWLINE token of the future import.
These functions will be imported automagically by our code transformer.
getattr() and setattr() are widely used in our code. We probably wouldn't
want to rewrite every single call of getattr/setattr. delattr() and hasattr()
aren't that important, but they are functions of the same kind.
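
One way such wrappers could look, as a sketch. The helper name _wrapattrfunc
and the latin-1 decoding of the attribute name are assumptions for this
example.

```python
import builtins
import functools


def _wrapattrfunc(f):
    """Wrap an attribute builtin so it also accepts a bytes attribute name."""
    @functools.wraps(f)
    def w(obj, name, *args):
        if isinstance(name, bytes):
            name = name.decode('latin-1')
        return f(obj, name, *args)
    return w


# Module-level names that shadow the builtins once imported by the transformer.
delattr = _wrapattrfunc(builtins.delattr)
getattr = _wrapattrfunc(builtins.getattr)
hasattr = _wrapattrfunc(builtins.hasattr)
setattr = _wrapattrfunc(builtins.setattr)


class Box(object):
    pass


box = Box()
setattr(box, b'value', 1)
```
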
We have a single-line function which just lowercases the letters and replaces
"_" with "". It's better to avoid that function call. Moreover, we are calling
this function around 33 times.
pycompat.py includes a hack to import modules whose names changed in Python 3.
We use try-except to load a module according to the version of Python. But
this method forces us to import the modules in order to raise an ImportError,
which makes it demandimport unfriendly.
This patch changes the try-except blocks to a single if-else block. To keep
test-check-pyflakes.t from complaining about unused imports, pycompat.py is
excluded from the test.
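
The if-else form can be sketched as below. The ispy3 name and the two modules
chosen are illustrative; the Python 2 branch is never executed on Python 3.

```python
import sys

# Decide by interpreter version instead of waiting for an ImportError, which
# a lazy importer would not raise at the import statement itself.
ispy3 = sys.version_info[0] >= 3

if ispy3:
    import pickle
    import queue
else:
    import cPickle as pickle
    import Queue as queue

roundtrip = pickle.loads(pickle.dumps([1, 2, 3]))
```
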
BaseHTTPServer, SimpleHTTPServer and CGIHTTPServer have been merged into
http.server in Python 3. All of them are exposed as util.httpserver for use
on both Python 2 and 3. This patch adds a regex to check-code to warn against
the use of BaseHTTPServer. This patch also updates the lower part of
test-check-py3-compat.t, which used to remain unchanged.
The httplib library is renamed to http.client in Python 3. So the
import is conditionalized and a check is added to check-code to warn
that util.httplib should be used instead.
cPickle is renamed to _pickle in Python 3, and pickle now uses this C
extension automatically, which was not the case in earlier versions. So the
imports are conditionalized to import cPickle on py2 and pickle on py3.
Moreover, uses of pickle on py2 are switched to cPickle, as the C extension
is faster. The hack is added in util.py and the modules import util.pickle.
Python 3's urllib.request and urllib.error are mapped as
util.urlreq/util.urlerr. The Python 2 equivalents from urllib/urllib2 are
mapped according to the py3 hierarchy.
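
A sketch of how such a mapping could be built on Python 3. The _stub class,
its _registeraliases method and the lowercased alias names are assumptions for
this example; only the urlreq/urlerr names come from the description above.

```python
import urllib.error
import urllib.request


class _stub(object):
    """Hypothetical namespace whose attributes alias another module's names."""

    def _registeraliases(self, origin, items):
        for item in items:
            # Assumed convention: drop underscores and lowercase, so
            # "build_opener" becomes "buildopener".
            alias = item.replace('_', '').lower()
            setattr(self, alias, getattr(origin, item))


urlreq = _stub()
urlerr = _stub()
urlreq._registeraliases(
    urllib.request, ('HTTPHandler', 'build_opener', 'urlopen'))
urlerr._registeraliases(urllib.error, ('HTTPError', 'URLError'))
```

On Python 2 the same aliases would be registered from urllib and urllib2, so
that callers see one hierarchy on both versions.
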
While the pycompat module will actually handle divergence, please
access these properties from the util module:
util.queue = Queue.Queue / queue.Queue
util.empty = Queue.Empty / queue.Empty