Commit Graph

12 Commits

Author SHA1 Message Date
Chris Jerdonek
0a2a1314d9 parsers: fail fast if Python has wrong minor version (issue4110)
This change causes an informative ImportError to be raised when importing
the parsers extension module if the minor version of the currently-running
Python interpreter doesn't match that of the Python used when compiling
the extension module.

This change also exposes a parsers.versionerrortext constant in the
C implementation of the module.  Its presence can be used to determine
whether this behavior is present in a version of the module.  The value
of the constant is the leading text of the ImportError raised and is set
to "Python minor version mismatch".

Here is an example of what the new error looks like:

  Traceback (most recent call last):
    File "test.py", line 1, in <module>
      import mercurial.parsers
  ImportError: Python minor version mismatch: The Mercurial extension
  modules were compiled with Python 2.7.6, but Mercurial is currently using
  Python with sys.hexversion=33883888: Python 2.5.6
  (r256:88840, Nov 18 2012, 05:37:10)
  [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
   at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/
    Python.app/Contents/MacOS/Python

The reason for raising an error in this scenario is that Python's C API
is known not to be compatible from minor version to minor version, even
if sys.api_version is the same.  See for example this Python bug report
about incompatibilities between 2.5 and 2.6+:

  http://bugs.python.org/issue8118

These incompatibilities can cause Mercurial to break in mysterious,
unforeseen ways.  For example, when Mercurial compiled with Python 2.7 was
run with 2.5, the following crash occurred when running "hg status":

  http://bz.selenic.com/show_bug.cgi?id=4110

After this crash was fixed, running with Python 2.5 no longer crashes, but
the following puzzling behavior still occurs:

    $ hg status
      ...
      File ".../mercurial/changelog.py", line 123, in __init__
        revlog.revlog.__init__(self, opener, "00changelog.i")
      File ".../mercurial/revlog.py", line 251, in __init__
        d = self._io.parseindex(i, self._inline)
      File ".../mercurial/revlog.py", line 158, in parseindex
        index, cache = parsers.parse_index2(data, inline)
    TypeError: data is not a string

which can be reproduced more simply with:

    import mercurial.parsers as parsers
    parsers.parse_index2("", True)

Both the crash and the TypeError occurred because the Python C API's
PyString_Check() returns the wrong value when the C header files from
Python 2.7 are run with Python 2.5.  This is an example of an
incompatibility of the sort mentioned in the Python bug report above.

Failing fast with an informative error message results in a better user
experience in cases like the above.  The information in the ImportError
also simplifies troubleshooting for those on Mercurial mailing lists, the
bug tracker, etc.

This patch only adds the version check to parsers.c, which is sufficient
to affect command-line commands like "hg status" and "hg summary".
An idea for a future improvement is to move the version-checking C code
to a more central location, and have it run when importing all
Mercurial extension modules and not just parsers.c.
2013-12-04 20:38:27 -08:00
Chris Jerdonek
ebe058e6c3 parsers: clarify documentation of test-parseindex2.py
This change updates and improves the description of test-parseindex2.py.
In particular, it removes language that can be interpreted to mean that the
test module checks only the C implementation of parsers.parse_index2().
Rather, the module checks parsers.parse_index2(), which can be either the
C or pure Python implementation, depending on which version is being used.

As of 23b69fd11636, the module also does more than just compare the return
value with the original Python implementation.
2013-12-02 07:49:49 -08:00
Matt Mackall
30d6dbb217 parsers: backout version mismatch detection from 5f712fe8433d
This introduced mandatory recompilations and breaks pure mode in tests
2013-12-01 20:46:36 -06:00
Chris Jerdonek
030ca96e57 parsers: fail fast if Python has wrong minor version (issue4110)
This change causes an informative ImportError to be raised when importing
the extension module parsers if the minor version of the currently-running
Python interpreter doesn't match that of the Python that was used when
compiling the extension module.  Here is an example of what the new error
looks like:

  Traceback (most recent call last):
    File "test.py", line 1, in <module>
      import mercurial.parsers
  ImportError: Python minor version mismatch: The Mercurial extension
  modules were compiled with Python 2.7.6, but Mercurial is currently using
  Python with sys.hexversion=33883888: Python 2.5.6
  (r256:88840, Nov 18 2012, 05:37:10)
  [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
   at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/
    Python.app/Contents/MacOS/Python

The reason for raising an error in this scenario is that Python's C API
is known not to be compatible from minor version to minor version, even
if sys.api_version is the same.  See for example this Python bug report
about incompatibilities between 2.5 and 2.6+:

  http://bugs.python.org/issue8118

These incompatibilities can cause Mercurial to break in mysterious,
unforeseen ways.  For example, when Mercurial compiled with Python 2.7 was
run with 2.5, the following crash occurred when running "hg status":

  http://bz.selenic.com/show_bug.cgi?id=4110

After this crash was fixed, running with Python 2.5 no longer crashes, but
the following puzzling behavior still occurs:

    $ hg status
      ...
      File ".../mercurial/changelog.py", line 123, in __init__
        revlog.revlog.__init__(self, opener, "00changelog.i")
      File ".../mercurial/revlog.py", line 251, in __init__
        d = self._io.parseindex(i, self._inline)
      File ".../mercurial/revlog.py", line 158, in parseindex
        index, cache = parsers.parse_index2(data, inline)
    TypeError: data is not a string

which can be reproduced more simply with:

    import mercurial.parsers as parsers
    parsers.parse_index2("", True)

Both the crash and the TypeError occurred because the Python C API's
PyString_Check returns the wrong value when the C header files from
Python 2.7 are run with Python 2.5.  This is an example of an
incompatibility of the sort mentioned in the Python bug report above.

Failing fast with an informative error message will result in a better
user experience in cases like the above.  The information in the ImportError
will also simplify troubleshooting for those on Mercurial mailing lists,
the bug tracker, etc.

This patch only adds the version check to parsers.c, which is sufficient
to affect command-line commands like "hg status" and "hg summary".
An idea for a future improvement is to move the version-checking C code
to a more central location, and have it run when importing all
Mercurial extension modules and not just parsers.c.
2013-11-29 12:36:28 -08:00
Chris Jerdonek
e02a62783a parse_index2: fix crash on bad argument type (issue4110)
Passing a non-string to parsers.parse_index2() causes Mercurial to crash
instead of raising a TypeError (found on Mac OS X 10.8.5, Python 2.7.6):

    import mercurial.parsers as parsers
    parsers.parse_index2(0, 0)

    Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
    0 parsers.so  0x000000010e071c59 _index_clearcaches + 73 (parsers.c:644)
    1 parsers.so  0x000000010e06f2d5 index_dealloc + 21 (parsers.c:1767)
    2 parsers.so  0x000000010e074e3b parse_index2 + 347 (parsers.c:1891)
    3 org.python.python 0x000000010dda8b17 PyEval_EvalFrameEx + 9911

This happens because when arguments of the wrong type are passed to
parsers.parse_index2(), indexType's initialization function index_init() in
parsers.c leaves the indexObject instance in a state that indexType's
destructor function index_dealloc() cannot handle.

This patch moves enough of the indexObject initialization code inside
index_init() from after the argument validation code to before it.
This way, when bad arguments are passed to index_init(), the destructor
doesn't crash and the existing code to raise a TypeError works.  This
patch also adds a test to check that a TypeError is raised.
2013-11-26 16:14:22 -08:00
Bryan O'Sullivan
449c0e1dbf tests: fix test-parseindex2.py when run with --pure 2012-05-11 02:32:26 -07:00
Bryan O'Sullivan
dc46676e81 parsers: use base-16 trie for faster node->rev mapping
This greatly speeds up node->rev lookups, with results that are
often user-perceptible: for instance, "hg --time log" of the node
associated with rev 1000 on a linux-2.6 repo improves from 0.3
seconds to 0.03.  I have not found any instances of slowdowns.

The new perfnodelookup command in contrib/perf.py demonstrates the
speedup more dramatically, since it performs no I/O.  For a single
lookup, the new code is about 40x faster.

These changes also prepare the ground for the possibility of further
improving the performance of prefix-based node lookups.
2012-04-12 14:05:59 -07:00
Bryan O'Sullivan
849e7f15fd parsers: incrementally parse the revlog index in C
We only parse entries in a revlog index file when they are actually
needed, and cache them when first requested.

This makes a huge difference to performance on large revlogs when
accessing the tip revision or performing a handful of numeric lookups
(very common cases).  For instance, "hg --time tip --template {node}"
on a tree with 300,000 revs takes 0.15 before, 0.02 after.

Even for revlog-intensive operations (e.g. running "hg log" to
completion), the lazy approach is about 1% faster than the eager
parse_index2.
2012-04-05 13:00:35 -07:00
Matt Mackall
846d35e24f revlog: only build the nodemap on demand 2011-01-11 17:01:04 -06:00
Matt Mackall
efaaee2894 revlog: remove lazy index 2011-01-04 14:12:52 -06:00
Martin Geisler
80dd126e92 remove unnecessary outer parenthesis in if-statements 2009-04-22 01:39:47 +02:00
Bernhard Leiner
78aef95fe0 Add parseindex2.py test case
Make sure that the new implementation in C return that same values as the
original Python implementation.
2008-10-17 01:05:10 +02:00