Commit Graph

276 Commits

Author SHA1 Message Date
Martin von Zweigbergk
c56b4f93aa tests: allow ModuleNotFoundError in addition to ImportError
My environment (Python version? PYTHONPATH? something else?) raises
ModuleNotFoundError in test-check-py3-compat.t. This patch allows any
"*Error". The error string contains "error importing", so it seems
specific enough even after.
2017-03-17 09:58:49 -07:00
Gregory Szorc
3f02063f8e zstd: vendor python-zstandard 0.7.0
Commit 3054ae3a66112970a091d3939fee32c2d0c1a23e from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).

The vendored zstd library within has been upgraded from 1.1.2 to
1.1.3. This version introduced new APIs for threads, thread
pools, multi-threaded compression, and a new dictionary
builder (COVER). These features are not yet used by
python-zstandard (or Mercurial for that matter). However,
that will likely change in the next python-zstandard release
(and I think there are opportunities for Mercurial to take
advantage of the multi-threaded APIs).

Relevant to Mercurial, the CFFI bindings are now fully
implemented. This means zstd should "just work" with PyPy
(although I haven't tried). The python-zstandard test suite also
runs all tests against both the C extension and CFFI bindings to
ensure feature parity.

There is also a "decompress_content_dict_chain()" API. This was
derived from discussions with Yann Collet on list about alternate
ways of encoding delta chains.

The change most relevant to Mercurial is a performance enhancement in
the simple decompression API to reuse a data structure across
operations. This makes decompression of multiple inputs significantly
faster. (This scenario occurs when reading revlog delta chains, for
example.)

Using python-zstandard's bench.py to measure the performance
difference...

On changelog chunks in the mozilla-unified repo:

decompress discrete decompress() reuse zctx
1.262243 wall; 1.260000 CPU; 1.260000 user; 0.000000 sys 170.43 MB/s (best of 3)
0.949106 wall; 0.950000 CPU; 0.950000 user; 0.000000 sys 226.66 MB/s (best of 4)

decompress discrete dict decompress() reuse zctx
0.692170 wall; 0.690000 CPU; 0.690000 user; 0.000000 sys 310.80 MB/s (best of 5)
0.437088 wall; 0.440000 CPU; 0.440000 user; 0.000000 sys 492.17 MB/s (best of 7)

On manifest chunks in the mozilla-unified repo:

decompress discrete decompress() reuse zctx
1.367284 wall; 1.370000 CPU; 1.370000 user; 0.000000 sys 274.01 MB/s (best of 3)
1.086831 wall; 1.080000 CPU; 1.080000 user; 0.000000 sys 344.72 MB/s (best of 3)

decompress discrete dict decompress() reuse zctx
0.993272 wall; 0.990000 CPU; 0.990000 user; 0.000000 sys 377.19 MB/s (best of 3)
0.678651 wall; 0.680000 CPU; 0.680000 user; 0.000000 sys 552.06 MB/s (best of 5)

That should make reads on zstd revlogs a bit faster ;)

# no-check-commit
2017-02-07 23:24:47 -08:00
Pulkit Goyal
5fe31c0917 py3: exclude pywatchman from test-check-py3-compat.t
Exclude pywatchman from py3 test. They have already worked on Python 3
compatibility https://github.com/facebook/watchman/pull/247
2016-12-25 02:42:46 +05:30
Pulkit Goyal
16aa4fb774 py3: update test-check-py3-compat.t
This part of test runs only on py3. This change was introduced by df7aa9b1c69b.
2016-12-25 02:34:19 +05:30
Zack Hricz
ca776e404a fsmonitor: refresh pywatchman to upstream
Update to upstream to version c77452. The refresh includes fixes to improve
windows compatibility.

There is a minor update to 'test-check-py3-compat.t' as c77452 no longer have
the py3 compatibility issues the previous version had.

# no-check-commit
2016-12-22 11:22:32 -08:00
Pulkit Goyal
849c625bac py3: use pycompat.sysstr() in __import__()
__import__() on Python 3 accepts strings which are different from that of
Python 2. Used pycompat.sysstr() to get string accordingly.
2016-12-01 13:12:04 +05:30
Pulkit Goyal
bf0008abe9 py3: use unicodes in __slots__
__slots__ in Python 3 accepts only unicodes and there is no harm using
unicodes in __slots__. So just adding u'' is fine. Previous occurences of this
problem are treated the same way.
2016-11-30 23:38:50 +05:30
Pulkit Goyal
8a1cd56292 py3: update test-check-py3-compat.t output
This part remains unchanged because it runs in Python 3 only.
2016-11-21 15:38:56 +05:30
Gregory Szorc
2005375cdc zstd: vendor python-zstandard 0.5.0
As the commit message for the previous changeset says, we wish
for zstd to be a 1st class citizen in Mercurial. To make that
happen, we need to enable Python to talk to the zstd C API. And
that requires bindings.

This commit vendors a copy of existing Python bindings. Why do we
need to vendor? As the commit message of the previous commit says,
relying on systems in the wild to have the bindings or zstd present
is a losing proposition. By distributing the zstd and bindings with
Mercurial, we significantly increase our chances that zstd will
work. Since zstd will deliver a better end-user experience by
achieving better performance, this benefits our users. Another
reason is that the Python bindings still aren't stable and the
API is somewhat fluid. While Mercurial could be coded to target
multiple versions of the Python bindings, it is safer to bundle
an explicit, known working version.

The added Python bindings are mostly a fully-featured interface
to the zstd C API. They allow one-shot operations, streaming,
reading and writing from objects implements the file object
protocol, dictionary compression, control over low-level compression
parameters, and more. The Python bindings work on Python 2.6,
2.7, and 3.3+ and have been tested on Linux and Windows. There are
CFFI bindings, but they are lacking compared to the C extension.
Upstream work will be needed before we can support zstd with PyPy.
But it will be possible.

The files added in this commit come from Git commit
e637c1b214d5f869cf8116c550dcae23ec13b677 from
https://github.com/indygreg/python-zstandard and are added without
modifications. Some files from the upstream repository have been
omitted, namely files related to continuous integration.

In the spirit of full disclosure, I'm the maintainer of the
"python-zstandard" project and have authored 100% of the code
added in this commit. Unfortunately, the Python bindings have
not been formally code reviewed by anyone. While I've tested
much of the code thoroughly (I even have tests that fuzz APIs),
there's a good chance there are bugs, memory leaks, not well
thought out APIs, etc. If someone wants to review the code and
send feedback to the GitHub project, it would be greatly
appreciated.

Despite my involvement with both projects, my opinions of code
style differ from Mercurial's. The code in this commit introduces
numerous code style violations in Mercurial's linters. So, the code
is excluded from most lints. However, some violations I agree with.
These have been added to the known violations ignore list for now.
2016-11-10 22:15:58 -08:00
Yuya Nishihara
ac60c0e796 py3: update test-check-py3-compat.t output
5d20bf74f0eb (scmutil: move util.termwidth()) changed where the import fails.
2016-11-09 22:15:51 +09:00
Gregory Szorc
7ae9040b80 statprof: use print function 2016-08-14 19:20:12 -07:00
Gregory Szorc
62b5c735eb statprof: use absolute_imports
As part of this, we modify import order to satisfy our import
checker.
2016-11-01 18:55:30 -07:00
Gregory Szorc
e735a48987 statprof: vendor statprof.py
Vendored from https://bitbucket.org/facebook/hg-experimental
changeset 73f9db47ae5a1a9fa29a98dfe92d557ad51234c3 without
modification.

This introduces a number of code style violations. The file
already has the magic words to skip test-check-code.t. I'll
make additional changes to clean up the test-check-py3-compat.t
warnings and to change some behavior in the code that isn't
suitable for general use.

test-check-commit.t also complains about numerous things. But
there's nothing we can do if we're importing as-is.
2016-11-01 18:54:03 -07:00
Mateusz Kwapich
f97c61b781 py3: use raw strings in line continuation (convert ext)
Our py2 to py3 string translations marks those as bytestrings.
2016-10-10 05:31:31 -07:00
Mateusz Kwapich
41dd071f2f py3: namedtuple takes unicode (journal ext)
namedtuple usage consistent with changelog.py:141
2016-10-10 05:30:14 -07:00
Yuya Nishihara
6ed34c797c py3: make check-py3-compat.py load modules in standard manner
Otherwise no code transformation would be applied to the modules which are
imported only by imp.load_module().

This change means modules are imported from PYTHONPATH, not from the paths
given by command arguments. This isn't always correct, but seems acceptable.
2016-10-08 17:22:07 +02:00
Yuya Nishihara
635359be9c py3: include module filename in check-py3-compat.py output
This change is intended to reduce noises in the next patch.
2016-10-09 08:31:39 +02:00
Augie Fackler
10de2c7e4f util: ensure forwarded attrs are set in globals() as sysstr
Custom module importer strikes again.
2016-10-08 08:36:39 -04:00
Augie Fackler
968375bf81 i18n: make the locale directory name the same string type as the datapath 2016-10-08 05:26:18 -04:00
Mateusz Kwapich
c9434eddcf py3: make encodefun in store.py compatible with py3k
This ensures that the filename encoding functions always map bytestrings
to bytestrings regardless of python version.
2016-10-08 08:54:05 -07:00
Pulkit Goyal
5df46f7e20 mail: handle renamed email.Header
We are still using email.Header which was renamed to email.header back in
Python 2.5. References: https://hg.python.org/cpython/file/2.4/Lib/email
and https://hg.python.org/cpython/file/2.5/Lib/email
2016-10-07 17:30:11 +02:00
Augie Fackler
efaaf08415 revset: build _syminitletters from a saner source: the string module
For now, these sets will be unicode characters in Python 3, which is
probably wrong, but it un-blocks importing the module so we can get
further along. In the future we'll have to come up with a reasonable
encoding strategy for revsets in Python 3.

This patch was originally pair-programmed with Martijn.
2016-10-07 08:32:40 -04:00
Augie Fackler
8facc4e910 hgmanpage: stop using raw-unicode strings
These don't exist in Python 3, and this ends up looking a little more
explicit to Martijn and me anyway.
2016-10-07 07:43:04 -04:00
Augie Fackler
2b99f3fc25 util: use string.hexdigits instead of defining it ourselves
This resolves some Python 3 weirdness.
2016-10-07 08:58:23 -04:00
Augie Fackler
ef318ea690 util: correct check of sys.version_info
sys.version is a string, and shouldn't be compared against a tuple for
version comparisons. This was always true, so we were never disabling
gc on 2.6.

>>> (2, 7) >= '2.7'
True
>>> (2, 6) >= '2.7'
True
2016-10-07 08:01:16 -04:00
Pulkit Goyal
757d64b218 py3: handle multiple arguments in .encode() and .decode()
There is a case and more can be present where these functions have
multiple arguments. Our transformer used to handle the first argument, so
added a loop to handle more arguments if present.
2016-10-07 14:04:49 +02:00
Pulkit Goyal
3a0ce3bcba py3: convert to unicode to pass into encode()
encoding.encoding is bytes, we need to pass it to encode() which accepts
unicodes in py3, so used pycomapt.sysstr() Also this can't be done using
transformer as that only transforms the string values not variables.
2016-10-07 12:13:28 +02:00
Pulkit Goyal
8ae574b546 py3: use unicode in is_frozen()
imp.is_frozen() doesnot accepts bytes on Python 3.
It does accept both bytes and strings on Python 2.
2016-10-02 05:29:17 +05:30
Pulkit Goyal
93ca2d7cd6 py3: use unicodes in __slots__
__slots__ doesnot accepts bytes on Python 3.
2016-10-02 03:38:14 +05:30
Yuya Nishihara
19a513c19c py3: make i18n use encoding.environ 2016-09-28 20:07:32 +09:00
Yuya Nishihara
52ffc6a5bd py3: provide encoding.environ which is a dict of bytes
This can't be moved to pycompat.py since we need encoding.tolocal() to
build bytes dict from unicode os.environ.
2016-09-28 20:05:34 +09:00
Pulkit Goyal
8b4f697218 py3: remove use of *L syntax
The int in Python 3 behaves as long so no need of L's in py3.
Moreover we dont need long here.
2016-09-01 02:29:46 +05:30
Augie Fackler
79ea2f0a0b py3: split check of pygments-using files from the rest of the tree
If we don't do this, people without pygments installed in their Python
3 environment silently stop checking test-check-py3-compat, which
isn't really what we wanted. This preserves stability of the test
output while still letting anyone with a recent-enough Python 3 run
the majority of the Python 3 compat checking test.
2016-08-30 13:33:48 -04:00
Yuya Nishihara
a869725d2d py3: automatically glob out line numbers from check-py3-compat output
It was boring task to update the result manually.
2016-08-17 20:56:12 +09:00
Yuya Nishihara
41507deea4 py3: have check-py3-compat require pygments to get stable result 2016-08-17 20:52:50 +09:00
Yuya Nishihara
12eca1889e py3: import builtin wrappers automagically by code transformer
This should be less invasive than mucking builtins.

Since tokenize.untokenize() looks start/end positions of tokens, we calculates
them from the NEWLINE token of the future import.
2016-08-16 12:35:15 +09:00
Pulkit Goyal
d280d7dc4a py3: conditionalize _winreg import
_winreg module is renamed to winreg in python 3. Added the conditionalize
statements in the respective file because adding this in pycompat will result
in pycompat throwing error as this is a windows registry module and we have
buildbots and most of the contributors on linux.
2016-08-10 04:35:44 +05:30
Pulkit Goyal
9d5cc65520 py3: conditionalize the raise statement
raise E,V,T is not acceptable in Python 3, thats is conditionalized.
Moreover this will result in syntax error so we have to use exec() to
execute this. Related PEP- https://www.python.org/dev/peps/pep-3109/#id14

My implementation is motivated from the six implementation except they are
defining a new function exec_() to prevent adding an extra frame AFAIK :)

https://bitbucket.org/gutworth/six/src/ca4580a5a648/six.py#six.py-680
2016-08-08 23:51:11 +05:30
FUJIWARA Katsunori
d0610552d6 py3: make check-py3-compat.py use correct module name at loading pure modules
Before this patch, check-py3-compat.py implies unintentional ".pure"
sub-package name at loading pure modules, because module name is
composed by just replacing "/" in the path to actual ".py" file by
".".

This makes pure modules belong to "mercurial.pure" package, and
prevents them from importing a module belonging to "mercurial" package
relatively by "from . import foo" or so.

This is reason why pure modules fail to import another module
relatively only at examination by check-py3-compat.py.
2016-08-09 02:28:34 +09:00
FUJIWARA Katsunori
c66ea5e531 py3: update output of check-py3-compat.py with python3
Recent work on mercurial/pure/mpatch.py made it import
mercurial.policy, and changed output of check-py3-compat.py with
python3 in test-check-py3-compat.t. But test-check-py3-compat.t isn't
yet updated.
2016-08-09 02:28:34 +09:00
Pulkit Goyal
7a4ee4f18a py3: use unicode literals in pure/osutil.py
The first element of _fields_ tuples must be a str in Python 3.
Also fix some function calls that were also expecting str.
2016-08-04 00:32:19 +05:30
Pulkit Goyal
b12e59d025 py3: update test-check-py3-compat.t output
The lower part of test-check-py3-compat.t runs only on py3 and hence its
 remain unchanged. Hence this patch updates the output so that change in output
 in the next patches will be only related to the change in the patch.
2016-08-04 00:04:48 +05:30
Pulkit Goyal
6b3bc52b40 py3: conditionalize BaseHTTPServer, SimpleHTTPServer and CGIHTTPServer import
The BaseHTTPServer, SimpleHTTPServer and CGIHTTPServer has been merged into
http.server in python 3. All of them has been merged as util.httpserver to use
in both python 2 and 3. This patch adds a regex to check-code to warn against
the use of BaseHTTPServer. Moreover this patch also includes updates to lower
part of test-check-py3-compat.t which used to remain unchanged.
2016-07-13 23:38:29 +05:30
Gregory Szorc
2125e5d7d4 mercurial: implement a source transforming module loader on Python 3
The most painful part of ensuring Python code runs on both Python 2
and 3 is string encoding. Making this difficult is that string
literals in Python 2 are bytes and string literals in Python 3 are
unicode. So, to ensure consistent types are used, you have to
use "from __future__ import unicode_literals" and/or prefix literals
with their type (e.g. b'foo' or u'foo').

Nearly every string in Mercurial is bytes. So, to use the same source
code on both Python 2 and 3 would require prefixing nearly every
string literal with "b" to make it a byte literal. This is ugly and
not something mpm is willing to do at this point in time.

This patch implements a custom module loader on Python 3 that performs
source transformation to convert string literals (unicode in Python 3)
to byte literals. In effect, it changes Python 3's string literals to
behave like Python 2's.

In addition, the module loader recognizes well-known built-in
functions (getattr, setattr, hasattr) and methods (encode and decode)
that barf when bytes are used and prevents these from being rewritten.
This prevents excessive source changes to accommodate this change
(we would have to rewrite every occurrence of these functions passing
string literals otherwise).

The module loader is only used on Python packages belonging to
Mercurial.

The loader works by tokenizing the loaded source and replacing
"string" tokens if necessary. The modified token stream is
untokenized back to source and loaded like normal. This does add some
overhead. However, this all occurs before caching: .pyc files will
cache the transformed version. This means the transformation penalty
is only paid on first load.

As the extensive inline comments explain, the presence of a custom
source transformer invalidates assumptions made by Python's built-in
bytecode caching mechanism. So, we have to wrap bytecode loading and
writing and add an additional header to bytecode files to facilitate
additional cache validation when the source transformations
change in the future.

There are still a few things this code doesn't handle well, namely
support for zip files as module sources and for extensions. Since
Mercurial doesn't officially support Python 3 yet, I'm inclined to
leave these as to-do items: getting a basic module loading mechanism
in place to unblock further Python 3 porting effort is more important
than comprehensive module importing support.

check-py3-compat.py has been updated to ignore frames. This is
necessary because CPython has built-in code to strip frames from the
built-in importer. When our custom code is present, this doesn't work
and the frames get all messed up. The new code is not perfect. It
works for now. But once you start chasing import failures you find
some edge cases where the files aren't being printed properly. This
only burdens people doing future Python 3 porting work so I'm inclined
to punt on the issue: the most important thing is for the source
transforming module loader to land.

There was a bit of churn in test-check-py3-compat.t because we now
trip up on str/unicode/bytes failures as a result of source
transformation. This is unfortunate but what are you going to do.

It's worth noting that other approaches were investigated.

We considered using a custom file encoding whose decode() would
apply source transformations. This was rejected because it would
require each source file to declare its custom Mercurial encoding.
Furthermore, when changing the source transformation we'd need to
version bump the encoding name otherwise the module caching layer
wouldn't know the .pyc file was invalidated. This would mean mass
updating every file when the source transformation changes. Yuck.

We also considered transforming at the AST layer. However, Python's
ast module is quite gnarly and doing AST transforms is quite
complicated, even for trivial rewrites. There are whole Python packages
that exist to make AST transformations usable. AST transforms would
still require import machinery, so the choice was basically to
perform source-level, token-level, or ast-level transforms.

Token-level rewriting delivers the metadata we need to rewrite
intelligently while being relatively easy to understand. So it won.

General consensus seems to be that this approach is the best available
to avoid bulk rewriting of '' to b''. However, we aren't confident
that this approach will never be a future maintenance burden. This
approach does unblock serious Python 3 porting efforts. So we can
re-evaulate once more work is done to support Python 3.
2016-07-04 11:18:03 -07:00
Pulkit Goyal
aef2bdd39a py3: make files use absolute_import and print_function
This patch includes addition of absolute_import and print_function to the
 files where they are missing. The modern importing conventions are also followed.
2016-07-03 22:28:24 +05:30
Pulkit Goyal
af9d7f66d0 py3: conditionalize httplib import
The httplib library is renamed to http.client in python 3. So the
import is conditionalized and a test is added in check-code to warn
to use util.httplib
2016-06-28 16:01:53 +05:30
Pulkit Goyal
38a359ce5c py3: conditionalize SocketServer import
The SocketServer is renamed to socketserver in python 3
2016-06-27 16:48:54 +05:30
Pulkit Goyal
fdc0861e35 py3: conditionalize xmlrpclib import
The xmlrpclib library is renamed to xmlrpc.client in python 3
2016-06-27 16:37:37 +05:30
Pulkit Goyal
5fcc6a2628 py3: conditionalize the urlparse import
The urlparse library is renamed to urllib.parse in python 3
2016-06-27 16:16:10 +05:30
Pulkit Goyal
175e976a5b py3: update tests/test-check-py3-compat.t
The lower part of the test runs with python 3 and hence remain unchanged.
2016-06-27 15:53:38 +05:30