sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 08:47:12 +03:00

Author	SHA1	Message	Date
Yuya Nishihara	47edeae90b	policy: drop custom importer for pure modules	2016-08-13 12:28:52 +09:00
Yuya Nishihara	4563e16232	parsers: switch to policy importer # no-check-commit	2016-08-13 12:23:56 +09:00
Yuya Nishihara	5fe7742660	mpatch: switch to policy importer	2016-08-13 12:18:58 +09:00
Yuya Nishihara	6130be9a6c	diffhelpers: switch to policy importer # no-check-commit	2016-08-13 12:15:49 +09:00
Yuya Nishihara	50b316b748	bdiff: switch to policy importer # no-check-commit	2016-08-13 12:12:50 +09:00
Yuya Nishihara	a9b78ccb21	base85: switch to policy importer	2016-08-13 12:08:23 +09:00
Yuya Nishihara	70995f9aa9	osutil: switch to policy importer "make clean" is recommended to test this change, though C API compatibility should be preserved.	2016-08-12 11:35:17 +09:00
Martin von Zweigbergk	c3406ac3db	cleanup: use set literals We no longer support Python 2.6, so we can now use set literals.	2017-02-10 16:56:29 -08:00
Yuya Nishihara	5314481bbf	policy: eliminate ".pure." from module name only if marked as dual So we can switch cext/pure modules to new layout one by one.	2016-08-13 17:21:58 +09:00
Pulkit Goyal	9a233abc0a	py3: add pycompat.unicode and add it to importer On python 3, builtins.unicode does not exist.	2017-04-07 23:35:51 +05:30
Yuya Nishihara	02b9a6d074	py3: rewrite itervalues() as values() by importer I'm not a great fan of these importer magics, but this should be okay since "itervalues" seems as unique name as "iteritems".	2017-03-13 08:44:57 -07:00
FUJIWARA Katsunori	9d450170ba	py3: add "b" prefix to string literals related to module policy String literals without explicit prefix in __init__.py and policy.py are treated as unicode object on Python3, because these modules are loaded before setup of our specific code transformation (the later module is imported at the beginning of __init__.py). BTW, "modulepolicy" in __init__.py is initialized by "policy.policy". This causes issues below; - checking "policy" value in other modules causes unintentional result For example, "b'py' not in (u'c', u'py')" returns True unintentionally on Python3. - writing "policy" out fails at conversion from unicode to bytes db1ebf457295 fixed this issue for default code path, but "policy" can be overridden by HGMODULEPOLICY environment variable (it should be rare case for developer using Python3, though). This patch does: - add "b" prefix to all string literals, which are related to module policy, in modules above. - check existence of HGMODULEPOLICY, and overwrite "policy" only if it exists For simplicity, this patch omits checking "supports_bytes_environ", switching os.environ/os.environb, and so on (Yuya agreed this in personal talking)	2017-03-13 04:06:36 +09:00
Augie Fackler	afdb0af8d1	init: zstd is already python3-ready, so don't run it through our importer	2017-03-08 18:11:19 -05:00
Pulkit Goyal	1da791b8bb	py3: drop unrequired code from __init__.py Once we are using pycompat.open(), we don't need this hack to convert the second argument to unicodes. So removing this.	2017-03-03 15:30:48 +05:30
Pulkit Goyal	cb0521463d	py3: add pycompat.open and replace open() calls open() requires mode argument as unicodes on Python 3. This patch introduces pycompat.open() which is inserted to files using transformer and replaces builtins.open() calls.	2017-03-03 13:04:32 +05:30
Martijn Pieters	217717bef0	py3: use namedtuple._replace to produce new tokens	2016-10-13 09:27:37 +01:00
Martijn Pieters	f29fa5e5a1	py3: refactor token parsing to handle call args properly The token parsing was getting unwieldy and was too naive about accessing arguments.	2016-10-14 17:55:02 +01:00
Martijn Pieters	e20d571ccf	py3: a second argument to open can't be bytes This fixes open(filename, 'r'), open(filename, 'w'), etc. calls. In Python 3, that second argument must be a string, you can't use bytes. The fix is the same as used with getattr() (where the second argument must also always be a string); in the tokenizer, where we detect calls, if there is something that looks like a call to open (and is not an attribute, so the previous token is not a "." dot) then make sure that that second argument is not converted to a `bytes` object instead. There is some remaining issue where the current transformer will also rewrite open(f('foo')). However this also affect function for which we perform similar rewrite ('getattr', 'setattr', 'hasattr', 'safehasattr') and will be dealt with in a follow up.	2016-10-09 14:10:01 +02:00
Pulkit Goyal	924bdac6e1	py3: switch to .items() using transformer .iteritems() don't exist in Python 3 world. Used the transformer to replace .iteritems() to .items()	2016-10-07 15:29:57 +02:00
Pulkit Goyal	757d64b218	py3: handle multiple arguments in .encode() and .decode() There is a case and more can be present where these functions have multiple arguments. Our transformer used to handle the first argument, so added a loop to handle more arguments if present.	2016-10-07 14:04:49 +02:00
Yuya Nishihara	12eca1889e	py3: import builtin wrappers automagically by code transformer This should be less invasive than mucking builtins. Since tokenize.untokenize() looks start/end positions of tokens, we calculates them from the NEWLINE token of the future import.	2016-08-16 12:35:15 +09:00
Gregory Szorc	2125e5d7d4	mercurial: implement a source transforming module loader on Python 3 The most painful part of ensuring Python code runs on both Python 2 and 3 is string encoding. Making this difficult is that string literals in Python 2 are bytes and string literals in Python 3 are unicode. So, to ensure consistent types are used, you have to use "from __future__ import unicode_literals" and/or prefix literals with their type (e.g. b'foo' or u'foo'). Nearly every string in Mercurial is bytes. So, to use the same source code on both Python 2 and 3 would require prefixing nearly every string literal with "b" to make it a byte literal. This is ugly and not something mpm is willing to do at this point in time. This patch implements a custom module loader on Python 3 that performs source transformation to convert string literals (unicode in Python 3) to byte literals. In effect, it changes Python 3's string literals to behave like Python 2's. In addition, the module loader recognizes well-known built-in functions (getattr, setattr, hasattr) and methods (encode and decode) that barf when bytes are used and prevents these from being rewritten. This prevents excessive source changes to accommodate this change (we would have to rewrite every occurrence of these functions passing string literals otherwise). The module loader is only used on Python packages belonging to Mercurial. The loader works by tokenizing the loaded source and replacing "string" tokens if necessary. The modified token stream is untokenized back to source and loaded like normal. This does add some overhead. However, this all occurs before caching: .pyc files will cache the transformed version. This means the transformation penalty is only paid on first load. As the extensive inline comments explain, the presence of a custom source transformer invalidates assumptions made by Python's built-in bytecode caching mechanism. So, we have to wrap bytecode loading and writing and add an additional header to bytecode files to facilitate additional cache validation when the source transformations change in the future. There are still a few things this code doesn't handle well, namely support for zip files as module sources and for extensions. Since Mercurial doesn't officially support Python 3 yet, I'm inclined to leave these as to-do items: getting a basic module loading mechanism in place to unblock further Python 3 porting effort is more important than comprehensive module importing support. check-py3-compat.py has been updated to ignore frames. This is necessary because CPython has built-in code to strip frames from the built-in importer. When our custom code is present, this doesn't work and the frames get all messed up. The new code is not perfect. It works for now. But once you start chasing import failures you find some edge cases where the files aren't being printed properly. This only burdens people doing future Python 3 porting work so I'm inclined to punt on the issue: the most important thing is for the source transforming module loader to land. There was a bit of churn in test-check-py3-compat.t because we now trip up on str/unicode/bytes failures as a result of source transformation. This is unfortunate but what are you going to do. It's worth noting that other approaches were investigated. We considered using a custom file encoding whose decode() would apply source transformations. This was rejected because it would require each source file to declare its custom Mercurial encoding. Furthermore, when changing the source transformation we'd need to version bump the encoding name otherwise the module caching layer wouldn't know the .pyc file was invalidated. This would mean mass updating every file when the source transformation changes. Yuck. We also considered transforming at the AST layer. However, Python's ast module is quite gnarly and doing AST transforms is quite complicated, even for trivial rewrites. There are whole Python packages that exist to make AST transformations usable. AST transforms would still require import machinery, so the choice was basically to perform source-level, token-level, or ast-level transforms. Token-level rewriting delivers the metadata we need to rewrite intelligently while being relatively easy to understand. So it won. General consensus seems to be that this approach is the best available to avoid bulk rewriting of '' to b''. However, we aren't confident that this approach will never be a future maintenance burden. This approach does unblock serious Python 3 porting efforts. So we can re-evaulate once more work is done to support Python 3.	2016-07-04 11:18:03 -07:00
Maciej Fijalkowski	7f30335cfb	policy: add cffi policy for PyPy This adds cffi policy in the case where we don't want to use C modules, but instead we're happy to rely on cffi (bundled with pypy)	2016-06-07 15:35:58 +02:00
timeless	0dcdb26a9d	debuginstall: expose modulepolicy With this, you can check for pure easily: $ HGMODULEPOLICY=py ./hg debuginstall -T "{hgmodulepolicy}" py	2016-03-09 19:55:45 +00:00
Gregory Szorc	f23e0380e1	mercurial: use pure Python module policy on Python 3 The C extensions don't yet work with Python 3. Let's minimize the work required to get Mercurial running on Python 3 by always using the pure Python module policy on Python 3.	2016-03-12 13:19:19 -08:00
timeless	d54666f50a	setup: create a module for the modulepolicy Instead of rewriting __init__ to define the modulepolicy, write out a __modulepolicy__.py file like __version__.py This should work for both system-wide installation and in-place build. Therefore we can avoid relying on two separate modulepolicy rules, '@MODULELOADPOLICY@' and 'mercurial/modulepolicy'.	2016-03-09 15:47:01 +00:00
Gregory Szorc	f1bd627c00	mercurial: support loading modules from zipimporter The previous refactor to module importing broke module loading when mercurial.* modules were loaded from a zipfile (using a zipimporter). This scenario is likely encountered when using py2exe. Supporting zipimporter and the traditional importer side-by-side turns out to be quite a pain. In Python 2.x, the standard, file-based import mechanism is partially implemented in C. The sys.meta_path and sys.path_hooks hook points exist to allow custom importers in Python/userland. zipimport.zipimporter and our "hgimporter" class from earlier in this patch series are 2 of these. In a standard Python installation (no matter if running in py2exe or similar or not), zipimport.zipimporter appears to be registered in sys.path_hooks. This means that as each sys.path entry is consulted, it will ask zipimporter if it supports that path and zipimporter will be used if that entry is a zip file. In a py2exe environment, sys.path contains an entry with the path to the zip file containing the Python standard library along with Mercurial's Python files. The way the importer mechanism works is the first importer that declares knowledge of a module (via find_module() returning an object) gets to load it. Since our "hgimporter" is registered in sys.meta_path and returns an interest in specific mercurial.* modules, the zipimporter registered on sys.path_hooks never comes into play for these modules. So, we need to be zipimporter aware and call into zipimporter to load modules. This patch teaches "hgimporter" how to call out into zipimporter when necessary. We detect the necessity of zipimporter by looking at the loader for the "mercurial" module. If it is a zipimporter instance, we load via zipimporter. The behavior of zipimporter is a bit wonky. You appear to need separate zipimporter instances for each directory in the zip file. I'm not sure why this is. I suspect it has something to do with the low-level importing mechanism (implemented in C) operating on a per-directory basis. PEP-302 makes some references to this. I was not able to get a zipimporter to import modules outside of its immediate directory no matter how I specified the module name. This is why we use separate zipimporter instances for the ".zip/mercurial" and ".zip/mercurial/pure" locations. The zipimporter documentation for Python 2.7 explicitly states that zipimporter does not import dynamic modules (C extensions). Yet from a py2exe distribution on Windows - where the .pyd files are not in the zip archive - zipimporter imported these dynamic modules just fine! I'm not sure if dynamic modules can't be imported from inside the zip archive or whether zipimporter looks for dynamic modules outside the zip archive. All I know is zipimporter does manage to import the .pyd files on Windows and this patch makes our new importer compatible with py2exe. In the ideal world, We'd probably reimplement or fall back to parts of the built-in import mechanism instead of handling zipimporter specially. After all, if we're loading Mercurial modules via something that isn't the built-in file-based importer or zipimporter, our custom importer will likely fail because it doesn't know how to call into it. I'd like to think that we'll never encounter this in the wild, but you never know. If we do encounter it, we can come up with another solution. It's worth nothing that Python 3 has moved a lot of the importing code from C to Python. Python 3 gives you near total control over the import mechanism. So in the very distant future when Mercurial drops Python 2 support, it's likely that our custom importer code can be refactored to something a bit saner.	2015-12-03 21:25:05 -08:00
Gregory Szorc	767a462bbd	mercurial: don't load C extensions from PyPy PyPy isn't compatible with Python C extensions. With this patch, the module load policy is automatically to "Python only" when run under PyPy. `hg` and other Python scripts importing mercurial.* modules will run from the source checkout or any installation when executed with PyPy. This should enable people to more easily experiment with PyPy and its potentially significant performance benefits over CPython!	2015-11-24 22:21:51 -08:00
Gregory Szorc	3dd11a41cf	mercurial: be more strict about loading dual implemented modules With this change in place, we should have slightly stronger guarantees about how modules with both Python and C implementations are loaded. Before, our module loader's default policy looked under both mercurial/* and mercurial/pure/* and imported whatever it found, C or pure. The fact it looked in both locations by default was a temporary regression from the beginning of this series. This patch does 2 things: 1) Changes the default module load policy to only load C modules 2) Verifies that files loaded from mercurial/* are actually C modules This 2nd behavior change makes our new module loading mechanism stricter than from before this series. Before, it was possible to load a .py-based module from mercurial/*. This could happen if an old installation orphaned a file and then somehow didn't install the C version for the new install. We now detect this odd configuration and fall back to loading the pure Python module, assuming it is allowed. In the case of a busted installation, we fail fast. While we could fall back, we explicitly decide not to do this because we don't want people accidentally not running the C modules and having slow performance as a result.	2015-11-24 22:50:04 -08:00
Gregory Szorc	d7669a769a	mercurial: implement import hook for handling C/Python modules There are a handful of modules that have both pure Python and C extension implementations. Currently, setup.py copies files from mercurial/pure/.py to mercurial/ during the install process if C extensions are not available. This way, "import mercurial.X" will work whether C extensions are available or not. This approach has a few drawbacks. First, there aren't run-time checks verifying the C extensions are loaded when they should be. This could lead to accidental use of the slower pure Python modules. Second, the C extensions aren't compatible with PyPy and running Mercurial with PyPy requires installing Mercurial - you can't run ./hg from a source checkout. This makes developing while running PyPy somewhat difficult. This patch implements a PEP-302 import hook for finding and loading the modules with both C and Python implementations. When a module with dual implementations is requested for import, its import is handled by our import hook. The importer has a mechanism that controls what types of modules we allow to load. We call this loading behavior the "module load policy." There are 3 settings: Only load C extensions * Only load pure Python * Try to load C and fall back to Python An environment variable allows overriding this policy at run time. This is mainly useful for developers and for performing actions against the source checkout (such as installing), which require overriding the default (strict) policy about requiring C extensions. The default mode for now is to allow both. This isn't proper and is technically backwards incompatible. However, it is necessary to implement a sane patch series that doesn't break the world during future bisections. The behavior will be corrected in future patch. We choose the main mercurial/__init__.py module for this code out of necessity: in a future world, if the custom module importer isn't registered, we'll fail to find/import certain modules when running from a pure installation. Without the magical import-time side-effects, any importer of mercurial.* modules would be required to call a function to register our importer. I'm not a fan of import time side effects and I initially attempted to do this. However, I was foiled by our own test harness, which has numerous `python` invoked scripts that "import mercurial" and fail because the importer isn't registered. Realizing this problem is probably present in random Python scripts that have been written over the years, I decided that sacrificing purity for backwards compatibility is necessary. Plus, if you are programming Python, "import" should probably "just work." It's worth noting that now that we have a custom module loader, it would be possible to hook up demand module proxies at this level instead of replacing __import__. We leave this work for another time, if it's even desired. This patch breaks importing in environments where Mercurial modules are loaded from a zip file (such as py2exe distributions). This will be addressed in a subsequent patch.	2015-12-03 21:37:01 -08:00
mpm@selenic.com	ca8cb8ba67	Add back links from file revisions to changeset revisions Add simple transaction support Add hg verify Improve caching in revlog Fix a bunch of bugs Self-hosting now that the metadata is close to finalized	2005-05-03 13:16:10 -08:00

31 Commits