Module 'appdirs' tries to import win32com.shell (and catch ImportError as an
indication of failure) to check whether some further functionality should
be implemented one or another way [1]. Of course, demandimport lets it down, so
if we want appdirs to work we have to add it to demandimport's ignore list.
The reason we want appdirs to work is becuase it is used by setuptools [2] to
determine egg cache location. Only fairly recent versions of setuptools depend
on this so people don't see this often.
[1] https://github.com/ActiveState/appdirs/blob/master/appdirs.py#L560
[2] aae0a92811/pkg_resources/__init__.py (L1369)
This is the behavior of the default __import__() function, which doesn't
validate the existence of the fromlist items. Later on, the missing attribute
is detected while processing the import statement.
https://hg.python.org/cpython/file/v2.7.13/Python/import.c#l2575
The comtypes library relies on this (maybe) undocumented behavior, and we
got a bug report to TortoiseHg, sigh.
https://bitbucket.org/tortoisehg/thg/issues/4647/
The test added at 0be19b069edf verifies the behavior of the import statement,
so this patch only adds the test of __import__() function and works around
CPython/PyPy difference.
Demandimport uses the "try to import __builtin__, else use builtins" trick to
handle Python 3. External libraries and extensions might do something similar.
On Fedora 25 subversion-python-1.9.4-4.fc25.x86_64 will do just that (except
the opposite) ... and it failed all subversion convert tests because
demandimport was hiding that it didn't have builtins but should use
__builtin__.
The builtin module has already been imported when demandimport is loaded so
there is no point in trying to import it on demand. Just always ignore both
variants in demandimport.
Before this patch, importing sub-module might (1) fail or (2) success
but import incorrect module, because demandimport tries to import
sub-module with level=-1 (on Python 2.x) or level=0 (on Python 3.x),
which is default value of "level" argument at construction of
"_demandmod" proxy object.
(1) on Python 3.x, importing sub-module always fails to import
existing sub-module
(2) both on Python 2.x and 3.x, importing sub-module might import
same name module on root level unintentionally
On Python 2.x, existing sub-module is prior to this unexpected
module. Therefore, this problem hasn't appeared.
To import sub-module relatively as expected, this patch specifies "1"
as import level explicitly at construction of "_demandmod" proxy
object for sub-module.
Before this patch, importing C module on Windows environment causes
infinite recursion call, if py2exe is used with -b2 option.
At importing C module "a.b", extra hooking by zipextimporter of py2exe
causes:
0. assumption before accessing "b" of "a":
- built-in module object is created for "a",
(= "a" is actually imported)
- _demandmod is created for "a.b" as a proxy object, and
(= "a.b" is not yet imported)
- an attribute "b" of "a" is initialized by the latter
1. invocation of __import__ via _hgextimport() in _demandmod._load()
for "a.b" implies _demandimport() for "a.b"
This is unintentional, because _demandmod might be returned by
_hgextimport() instead of built-in module object.
2. _demandimport() at (1) is invoked with not context of "a", but
context of zipextimporter
Just after invocation of _hgextimport() in _demandimport(), an
attribute "b" of the built-in module object for "a" is still
bound to the proxy object for "a.b", because context of "a" isn't
updated by actual importing "a.b". even though the built-in
module object for "a.b" already appears in sys.modules.
Therefore, chainmodules() returns _demandmod for "a.b", which is
gotten from the attribute "b" of "a".
3. processfromitem() on "a.b" causes _demandmod._load() for "a.b"
again
_demandimport() takes context of "a" in this case.
Therefore, attributes below are bound to built-in module object
for "a.b", as expected:
- "b" of built-in module object for "a"
- _module of _demandmod for "a.b"
4. but _demandimport() invoked at (1) returns _demandmod object
because _demandimport() just returns the object returned by
chainmodules() at (3) above.
5. then, _demandmod._load() causes infinite recursion call
_demandimport() returns _demandmod for "a.b", and it is "self" at
_demandmod._load().
To avoid infinite recursion at actual module importing, this patch
uses self._module, if _hgextimport() returns _demandmod itself. If
_demandmod._module isn't yet bound at this point, execution should be
aborted, because actual importing failed.
In this patch, _demandmod._module is examined not on _demandimport()
side, but on _demandmod._load() side, because:
- the former has some exit points
- only the latter uses _hgextimport(), except for _demandimport()
BTW, this issue occurs only in the code path for non .py/.pyc files in
zipextimporter (strictly speaking, in _memimporter) of py2exe.
Even if zipextimporter is enabled, .py/.pyc files are handled by
zipimporter, and it doesn't imply unintentional _demandimport() at
invocation of __import__ via _hgextimport().
Before this patch, "from a import b" doesn't delay loading module "b",
if absolute_import is enabled, even though "from . import b" does.
For example:
- it is assumed that extension X has "from P import M" for module M
under package P with absolute_import feature
- if importing module M is already delayed before loading extension
X, loading module M in extension X is delayed until actually
referring
util, cmdutil, scmutil or so of Mercurial itself should be
imported by "from . import M" style before loading extension X
- otherwise, module M is loaded immediately at loading extension X,
even if extension X itself isn't used at that "hg" command invocation
Some minor modules (e.g. filemerge or so) of Mercurial itself
aren't imported by "from . import M" style before loading
extension X. And of course, external libraries aren't, too.
This might cause startup performance problem of hg command, because
many bundled extensions already enable absolute_import feature.
To delay loading module for "from a import b" with absolute_import
feature, this patch does below in "from a (or .a) import b" with
absolute_import case:
1. import root module of "name" by system built-in __import__
(referred as _origimport)
2. recurse down the module chain for hierarchical "name"
This logic can be shared with non absolute_import
case. Therefore, this patch also centralizes it into chainmodules().
3. and fall through to process elements in "fromlist" for the leaf
module of "name"
Processing elements in "fromlist" is executed in the code path
after "if _pypy: .... else: ..." clause. Therefore, this patch
replaces "if _pypy:" with "elif _pypy:" to share it.
At faecf59a4184 introducing original "work around" for "from a import
b" case, elements in "fromlist" were imported with "level=level". But
"level" might be grater than 1 (e.g. level=2 in "from .. import b"
case) at demandimport() invocation, and importing direct sub-module in
"fromlist" with level grater than 1 causes unexpected result.
IMHO, this seems main reason of "errors for unknown reason" described
in faecf59a4184, and we don't have to worry about it, because this
issue was already fixed by 2711f50242cf.
This is reason why this patch removes "errors for unknown reasons"
comment.
Mozilla is seeing an issue with demand importing of _imp
failing in pkg_resources/__init__.py:fixup_namespace_packages.
It strangely only reproduces when using a modern version of
setuptools/pip in certain scenarios. Adding _imp to the demand import
ignore list seems to make the problem go away.
PyPy's implementation of __import__ differs subtly from that of CPython.
If invoked without a name or fromlist, it throws an ImportError,
whereas CPython returns a reference to the level-appropriate importing
package.
Here, we achieve the same behaviour by hand.
Importing sqlalchemy.events cannot be delayed as it registers classes to
their event mechanism. It worked fine before faecf59a4184, since they use
new-style imports. But now we have to blacklist it because our demandimport
can handle new-style imports.
This patch series isn't intended for stable as we don't guarantee API
compatibility with 3rd-party extensions. They can temporarily disable the
demand importer to work around the issue.
If a module is loaded as "from . import x" form, there has been no way to
disable demand loading for that module because name is ''. This patch makes
it possible to prevent demand loading by '<package-name>.x'.
We don't use _hgextimport(_origimport) here since attr is known to be a
sub-module name. Adding hgext_ to attr wouldn't make sense.
As the fromlist gives the names of sub-modules, they should be searched in
the parent directory of the package's __init__.py, which is level=1.
I got the following error by rewriting hgweb to use absolute_import, where
the "mercurial" package is referenced as ".." (level=2):
ValueError: Attempted relative import beyond toplevel package
I know little about the import mechanism, but this change seems correct.
Before this patch, the following code did import the os module with no error:
from mercurial import demandimport
demandimport.enable()
from mercurial import os
print os.name
_demandmod instances may be referenced by multiple importing modules.
Before this patch, the _demandmod instance only maintained a reference
to its first consumer when using the "from X import Y" syntax. This is
because we only created a single _demandmod instance (attached to the
parent X module). If multiple modules A and B performed
"from X import Y", we'd produce a single _demandmod instance
"demandmod" with the following references:
X.Y = <demandmod>
A.Y = <demandmod>
B.Y = <demandmod>
The locals from the first consumer (A) would be stored in <demandmod1>.
When <demandmod1> was loaded, we'd look at the locals for the first
consumer and replace the symbol, if necessary. This resulted in state:
X.Y = <module>
A.Y = <module>
B.Y = <demandmod>
B's reference to Y wasn't updated and was still using the proxy object
because we just didn't record that B had a reference to <demandmod> that
needed updating!
With this patch, we add support for tracking which modules in addition
to the initial importer have a reference to the _demandmod instance and
we replace those references at module load time.
In the case of posix.py, this fixes an issue where the "encoding" module
was being proxied, resulting in hundreds of thousands of
__getattribute__ lookups on the _demandmod instance during dirstate
operations on mozilla-central, speeding up execution by many
milliseconds. There are likely several other operation that benefit from
this change as well.
The new mechanism isn't perfect: references in locals (not globals) may
likely linger. So, if there is an import inside a function and a symbol
from that module is used in a hot loop, we could have unwanted overhead
from proxying through _demandmod. Non-global imports are discouraged
anyway. So hopefully this isn't a big deal in practice. We could
potentially deploy a code checker that bans use of attribute lookups of
function-level-imported modules inside loops.
This deficiency in theory could be avoided by storing the set of globals
and locals dicts to update in the _demandmod instance. However, I tried
this and it didn't work. One reason is that some globals are _demandmod
instances. We could work around this, but it's a bit more work. There
also might be other module import foo at play. The solution as
implemented is better than what we had and IMO is good enough for the
time being.
It's worth noting that this sub-optimal behavior was made worse by the
introduction of absolute_import and its recommended "from . import X"
syntax for importing modules from the "mercurial" package. If we ever
wrote performance tests, measuring the amount of module imports and
__getattribute__ proxy calls through _demandmod instances would be
something I'd have it check.
Before, we didn't support lazy loading if absolute_import was in
effect and a fromlist was used. This meant that "from . import X"
wasn't lazy and performance could suffer as a result.
With this patch, we now support lazy loading for this scenario.
As part of developing this, I discovered issues when module names
are defined. Since the enforced import style only allows
"from X import Y" or "from .X import Y" in very few scenarios
when absolute_import is enabled - scenarios where Y is not a
module and thus there is nothing to lazy load - I decided to drop
support for this case instead of chasing down the errors. I don't
think much harm will come from this. But I'd like to take another
look once all modules are using absolute_import and I can see the
full extent of what is using names in absolute_import mode.
demandimport doesn't currently support absolute imports (level >= 0).
In preparation for this, we add some documentation and a code branch
to handle the absolute_import case.
The removed code was to support an __import__ function that doesn't
support the "level" argument. This argument was added in Python 2.5,
which we no longer support.
This module depends on _winreg, which is windows-only. Recent versions
of setuptools load distutils.msvc9compiler and expect it to
ImportError immediately when on non-Windows platforms, so we need to
let them do that. This breaks in an especially mystifying way, because
setuptools uses vars() on the imported module. We then throw an
exception, which vars doesn't pick up on well. For example:
In [3]: class wat(object):
...: @property
...: def __dict__(self):
...: assert False
...:
In [4]: vars(wat())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-2781ada5ffe6> in <module>()
----> 1 vars(wat())
TypeError: vars() argument must have __dict__ attribute
Which is similar to the problem we run into.
demandimport was failing in Python 3 with a ValueError because
__import__'s level=-1 has gone away (-1 means to try both relative
and absolute imports and relative imports don't exist in Python 3).
With this patch, demandimport still doesn't work in Python 3 (it
fails when importing a non-package module).
This fixes an issue introduced in 818c8992811a where, when disabling
demandimport while running hooks, it's inadvertently re-enabled even when
it was never enabled in the first place.
This doesn't affect normal command line usage of Mercurial; it only matters
when Mercurial is run with demandimport intentionally disabled.
Before this patch, python modules of each extensions can't import
another one in own extension by absolute name, because root modules of
each extensions are loaded with "hgext_" prefix.
For example, "import extroot.bar" in "extroot/foo.py" of "extroot"
extension fails, even though "import bar" in it succeeds.
Installing extensions into site-packages of python library path can
avoid this problem, but this solution is not reasonable in some cases:
using binary package of Mercurial on Windows, for example.
This patch retries to import with "hgext_" prefix after ImportError,
if the module in the extension may try to import another one in own
extension.
This patch doesn't change some "_import()"/"_origimport()" invocations
below, because ordinary extensions shouldn't cause such invocations.
- invocation of "_import()" when root module imports sub-module by
absolute path without "fromlist"
for example, "import a.b" in "a.__init__.py".
extensions are loaded with "hgext_" prefix, and this causes
execution of another (= fixed by this patch) code path.
- invocation of "_origimport()" when "level != -1" with "fromlist"
for example, importing after "from __future__ import
absolute_import" (level == 0), or "from . import b" or "from .a
import b" (0 < level),
for portability between python versions and environments,
extensions shouldn't cause "level != -1".
Before this patch, demandimport of Mercurial may fail to load external
libraries using "from __future__ import absolute_import": for example,
importing "foo" in "bar.baz" module will load "bar.foo" if it exists,
even though "absolute_import" is enabled in "bar.baz" module.
So, extensions for Mercurial can't use such external libraries.
This patch saves "level" of import request for on-demand module
loading in the future: default value of level is -1, and level is 0
when "absolute_import" is enabled.
"level" value is passed to built-in import function in
"_demandmod._load()" and it should load target module correctly.
This patch changes only one "_demandmod" construction case other than
cases below:
- construction in "_demandmod._load()"
this code path should be used only in relative sub-module
loading case
- constructions other than patched one in"_demandimport()"
these code paths shouldn't be used in "level != -1" case
We don't use util.safehasattr() here to avoid adding new dependencies
for demandimport. This change may expose previously-silenced
deprecation warnings to appear, as hasattr silently hides warnings
that occur during module import when using demandimport.
The Python default for this function is -1, indicating both relative
and absolute imports should be used.[1] Previously, we relied on the
Python VM not passing level when such semantics were
requisted. This is not the case for PyPy, however, where a level of -1
is always passed to __import__.
[1] <http://docs.python.org/library/functions.html#__import__>
Python's __import__() function has 'level' as the fourth argument, not the
third. The code path in question probably never worked.
(This was seen trying to run Mercurial in PyPy. Fixing this made it
die somewhere else...)
The 'level' argument to __import__ was added in Python 2.6, and is
specified for either relative or absolute imports. The fix introduced
in 5b0fda8ff209 allowed such imports to proceed without failure, but
effectively disabled demandimport for them. This is particularly
unfortunate in Python 3.x, where *all* imports are either relative or
absolute.
The solution introduced here is to store the level argument on the
demandimport instance, and propagate it to _origimport() when its
value isn't None.
Please note that this patch hasn't been tested in Python 3.x, and thus
may not be complete. I'm worried about how sub-imports are handled; I
don't know what they are, or whether the level argument should be
modified for them. I've added 'TODO' notes to these cases; hopefully,
someone more knowledgable of these issues will deal with them.