sapling/mercurial/__init__.py
Gregory Szorc f1bd627c00 mercurial: support loading modules from zipimporter
The previous refactor to module importing broke module loading when
mercurial.* modules were loaded from a zipfile (using a zipimporter).
This scenario is likely encountered when using py2exe.

Supporting zipimporter and the traditional importer side-by-side
turns out to be quite a pain. In Python 2.x, the standard, file-based
import mechanism is partially implemented in C. The sys.meta_path
and sys.path_hooks hook points exist to allow custom importers in
Python/userland. zipimport.zipimporter and our "hgimporter" class
from earlier in this patch series are 2 of these.

In a standard Python installation (whether or not it runs under
py2exe or similar), zipimport.zipimporter appears to be registered
in sys.path_hooks. This means that as each sys.path entry is
consulted, the import machinery asks zipimporter whether it supports
that path, and zipimporter is used if that entry is a zip file. In a
py2exe environment, sys.path contains an entry with the path to
the zip file containing the Python standard library along with
Mercurial's Python files.

The importer mechanism works on a first-come basis: the first
importer that declares knowledge of a module (via find_module()
returning an object) gets to load it. Since our "hgimporter" is
registered in sys.meta_path and claims specific mercurial.*
modules, the zipimporter registered on sys.path_hooks never comes
into play for these modules. So, we need to be zipimporter aware
and call into zipimporter to load modules.
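
The precedence described above can be illustrated with a minimal
sketch. It assumes Python 3 (find_spec() rather than the Python 2
find_module() protocol this patch targets), and the names
ClaimingFinder, _NoteLoader and claimed_demo are hypothetical, made
up for illustration only. A finder on sys.meta_path claims a module
name, so the same-named file reachable via sys.path is never loaded:

```python
import importlib.abc
import importlib.util
import os
import sys
import tempfile


class _NoteLoader(importlib.abc.Loader):
    """Hypothetical loader that builds a tiny in-memory module."""

    def create_module(self, spec):
        return None  # fall back to default module creation

    def exec_module(self, module):
        module.ORIGIN = 'meta_path finder'


class ClaimingFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that claims a fixed set of module names."""

    def __init__(self, names):
        self.names = frozenset(names)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.names:
            return None  # decline, so later finders get their turn
        return importlib.util.spec_from_loader(fullname, _NoteLoader())


# A same-named module that the ordinary sys.path machinery could load.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, 'claimed_demo.py'), 'w') as fh:
    fh.write("ORIGIN = 'sys.path entry'\n")

finder = ClaimingFinder(['claimed_demo'])
sys.path.insert(0, moddir)
sys.meta_path.insert(0, finder)
try:
    import claimed_demo
finally:
    sys.meta_path.remove(finder)
    sys.path.remove(moddir)

print(claimed_demo.ORIGIN)
```

Because the finder sits at the front of sys.meta_path, the path-based
importers are not consulted for the claimed name at all.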

This patch teaches "hgimporter" how to call out into zipimporter
when necessary. We detect the necessity of zipimporter by looking
at the loader for the "mercurial" module. If it is a zipimporter
instance, we load via zipimporter.
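
As a rough illustration of that detection step, the sketch below
builds a throwaway zip archive, imports a package from it and checks
the package's loader. It assumes Python 3; loaded_from_zip,
bundle.zip and zippkg_demo are hypothetical names, and the fallback
to __spec__.loader is an addition for newer interpreters, not
something the patch does:

```python
import importlib
import os
import sys
import tempfile
import zipfile
import zipimport


def loaded_from_zip(module):
    """Return True if the module's loader is a zipimporter."""
    loader = getattr(module, '__loader__', None)
    if loader is None:
        # Assumption: fall back to the spec on interpreters where
        # __loader__ is not set on the module object.
        spec = getattr(module, '__spec__', None)
        loader = getattr(spec, 'loader', None)
    return isinstance(loader, zipimport.zipimporter)


# Build a zip archive holding one empty package.
archive = os.path.join(tempfile.mkdtemp(), 'bundle.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('zippkg_demo/__init__.py', '')

sys.path.insert(0, archive)
try:
    zippkg_demo = importlib.import_module('zippkg_demo')
finally:
    sys.path.remove(archive)

print(loaded_from_zip(zippkg_demo))
print(loaded_from_zip(sys))
```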

The behavior of zipimporter is a bit wonky.

You appear to need separate zipimporter instances for each directory
in the zip file. I'm not sure why this is. I suspect it has
something to do with the low-level importing mechanism (implemented
in C) operating on a per-directory basis. PEP-302 makes some
references to this. I was not able to get a zipimporter to
import modules outside of its immediate directory no matter how
I specified the module name. This is why we use separate
zipimporter instances for the ".zip/mercurial" and
".zip/mercurial/pure" locations.
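
The per-directory behavior can be reproduced with a small sketch. It
assumes Python 3 and uses zipimporter.get_source() only as a
convenient way to ask "can this importer see the module?"; the
archive layout (pkg, pkg/pure, fast.py) and the helper
source_seen_by are hypothetical stand-ins for the Mercurial layout:

```python
import os
import tempfile
import zipfile
import zipimport

# Build an archive with the same module name at two directory levels.
archive = os.path.join(tempfile.mkdtemp(), 'bundle.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('pkg/__init__.py', '')
    zf.writestr('pkg/fast.py', "KIND = 'top-level'\n")
    zf.writestr('pkg/pure/__init__.py', '')
    zf.writestr('pkg/pure/fast.py', "KIND = 'pure'\n")


def source_seen_by(importer, fullname):
    """Return the module source, or None if the importer can't find it."""
    try:
        return importer.get_source(fullname)
    except zipimport.ZipImportError:
        return None


# One zipimporter per directory inside the archive.
root = zipimport.zipimporter(archive)
top = zipimport.zipimporter(os.path.join(archive, 'pkg'))
pure = zipimport.zipimporter(os.path.join(archive, 'pkg', 'pure'))

for label, importer in (('root', root), ('pkg', top), ('pkg/pure', pure)):
    print(label, repr(source_seen_by(importer, 'pkg.fast')))
```

Each importer only uses the last component of the dotted name and
looks for it directly under its own prefix, which is consistent with
needing one instance per directory.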

The zipimporter documentation for Python 2.7 explicitly states that
zipimporter does not import dynamic modules (C extensions). Yet from
a py2exe distribution on Windows - where the .pyd files are *not*
in the zip archive - zipimporter imported these dynamic modules
just fine! I'm not sure if dynamic modules can't be imported from
*inside* the zip archive or whether zipimporter looks for dynamic
modules outside the zip archive. All I know is zipimporter does
manage to import the .pyd files on Windows and this patch makes
our new importer compatible with py2exe.

In an ideal world, we'd probably reimplement or fall back to parts
of the built-in import mechanism instead of handling zipimporter
specially. After all, if we're loading Mercurial modules via
something that isn't the built-in file-based importer or zipimporter,
our custom importer will likely fail because it doesn't know how to
call into it. I'd like to think that we'll never encounter this
in the wild, but you never know. If we do encounter it, we can
come up with another solution.

It's worth noting that Python 3 has moved a lot of the importing
code from C to Python. Python 3 gives you near total control over
the import mechanism. So in the very distant future when Mercurial
drops Python 2 support, it's likely that our custom importer code
can be refactored to something a bit saner.
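
For comparison, here is a rough sketch of what a Python 3 finder with
similar "C first, then pure" behavior might look like. This is not
Mercurial's code: DualModuleFinder and its policy argument are
hypothetical, and the sketch leans on importlib.machinery.PathFinder
instead of the imp module (which was removed in Python 3.12):

```python
import importlib.abc
import importlib.machinery
import os


class DualModuleFinder(importlib.abc.MetaPathFinder):
    """Hypothetical Python 3 counterpart to a C-or-pure import hook."""

    def __init__(self, names, policy='allow'):
        self.names = frozenset(names)
        self.policy = policy  # 'c', 'allow' or 'py'

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.names or path is None:
            return None  # not ours; let the regular finders handle it

        pathfinder = importlib.machinery.PathFinder
        if self.policy != 'py':
            # Look only in the parent package's own directories and
            # accept the result only if it is a C extension.
            spec = pathfinder.find_spec(fullname, list(path))
            if spec is not None and isinstance(
                    spec.loader, importlib.machinery.ExtensionFileLoader):
                return spec
            if self.policy == 'c':
                raise ImportError('no C version of %s found' % fullname)

        # Fall back to the "pure" subdirectory of the parent package.
        puredirs = [os.path.join(p, 'pure') for p in path]
        spec = pathfinder.find_spec(fullname, puredirs)
        if spec is None:
            raise ImportError('could not find module %s' % fullname)
        return spec
```

Such a finder would be registered with something like
sys.meta_path.insert(0, DualModuleFinder([...])), and the standard
machinery would then take care of loading whichever spec it returns.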
2015-12-03 21:25:05 -08:00

145 lines
5.3 KiB
Python

# __init__.py - Startup and module loading logic for Mercurial.
#
# Copyright 2015 Gregory Szorc <gregory.szorc@gmail.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import imp
import os
import sys
import zipimport

__all__ = []

# Rules for how modules can be loaded. Values are:
#
#    c - require C extensions
#    allow - allow pure Python implementation when C loading fails
#    py - only load pure Python modules
modulepolicy = '@MODULELOADPOLICY@'

# By default, require the C extensions for performance reasons.
if modulepolicy == '@' 'MODULELOADPOLICY' '@':
    modulepolicy = 'c'

# PyPy doesn't load C extensions.
#
# The canonical way to do this is to test platform.python_implementation().
# But we don't import platform and don't bloat for it here.
if '__pypy__' in sys.builtin_module_names:
    modulepolicy = 'py'

# Environment variable can always force settings.
modulepolicy = os.environ.get('HGMODULEPOLICY', modulepolicy)

# Modules that have both Python and C implementations. See also the
# set of .py files under mercurial/pure/.
_dualmodules = set([
    'mercurial.base85',
    'mercurial.bdiff',
    'mercurial.diffhelpers',
    'mercurial.mpatch',
    'mercurial.osutil',
    'mercurial.parsers',
])

class hgimporter(object):
    """Object that conforms to import hook interface defined in PEP-302."""
    def find_module(self, name, path=None):
        # We only care about modules that have both C and pure implementations.
        if name in _dualmodules:
            return self
        return None

    def load_module(self, name):
        mod = sys.modules.get(name, None)
        if mod:
            return mod

        mercurial = sys.modules['mercurial']

        # The zip importer behaves sufficiently differently from the default
        # importer to warrant its own code path.
        loader = getattr(mercurial, '__loader__', None)
        if isinstance(loader, zipimport.zipimporter):
            def ziploader(*paths):
                """Obtain a zipimporter for a directory under the main zip."""
                path = os.path.join(loader.archive, *paths)
                zl = sys.path_importer_cache.get(path)
                if not zl:
                    zl = zipimport.zipimporter(path)
                return zl

            try:
                if modulepolicy == 'py':
                    raise ImportError()

                zl = ziploader('mercurial')
                mod = zl.load_module(name)
                # Unlike imp, ziploader doesn't expose module metadata that
                # indicates the type of module. So just assume what we found
                # is OK (even though it could be a pure Python module).
            except ImportError:
                if modulepolicy == 'c':
                    raise
                zl = ziploader('mercurial', 'pure')
                mod = zl.load_module(name)

            sys.modules[name] = mod
            return mod

        # Unlike the default importer which searches special locations and
        # sys.path, we only look in the directory where "mercurial" was
        # imported from.

        # imp.find_module doesn't support submodules (modules with ".").
        # Instead you have to pass the parent package's __path__ attribute
        # as the path argument.
        stem = name.split('.')[-1]

        try:
            if modulepolicy == 'py':
                raise ImportError()

            modinfo = imp.find_module(stem, mercurial.__path__)

            # The Mercurial installer used to copy files from
            # mercurial/pure/*.py to mercurial/*.py. Therefore, it's possible
            # for some installations to have .py files under mercurial/*.
            # Loading Python modules when we expected C versions could result
            # in a) poor performance b) loading a version from a previous
            # Mercurial version, potentially leading to incompatibility. Either
            # scenario is bad. So we verify that modules loaded from
            # mercurial/* are C extensions. If the current policy allows the
            # loading of .py modules, the module will be re-imported from
            # mercurial/pure/* below.
            if modinfo[2][2] != imp.C_EXTENSION:
                raise ImportError('.py version of %s found where C '
                                  'version should exist' % name)

        except ImportError:
            if modulepolicy == 'c':
                raise

            # Could not load the C extension and pure Python is allowed. So
            # try to load them.
            from . import pure
            modinfo = imp.find_module(stem, pure.__path__)
            if not modinfo:
                raise ImportError('could not find mercurial module %s' %
                                  name)

        mod = imp.load_module(name, *modinfo)
        sys.modules[name] = mod
        return mod

# We automagically register our custom importer as a side-effect of loading.
# This is necessary to ensure that any entry points are able to import
# mercurial.* modules without having to perform this registration themselves.
if not any(isinstance(x, hgimporter) for x in sys.meta_path):
    # meta_path is used before any implicit finders and before sys.path.
    sys.meta_path.insert(0, hgimporter())