On Solaris, recvmsg() is provided by libsocket.so. We could try hard to look
for the library which provides 'recvmsg' symbol, but it would make little sense
now since recvfds() won't work anyway on Solaris. So this patch just disables
_recvmsg() on such platforms.
Thanks to FUJIWARA Katsunori for spotting this problem.
This is less portable than the C version, but PyPy can't load CPython
extensions. So for now, this will be used on PyPy.
I've tested it on Linux amd64 and Mac OS X.
While I was here, I removed the try..except around importing cStringIO
because cStringIO should always be importable on modern Python versions.
We already do an unconditional import in other files.
Parsers.py had a reference to util.sha1 which was unused. This commit removes
this reference as well as the unused import of util to simplify the dependency
graph. This is important for the next commit which actually relocates part
of a module to eliminate a cycle.
Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:
- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all
In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.
This patch eliminates most object creation in these cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.
The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.
parse_dirstate becomes significantly faster:
$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after: wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)
and as a result, several commands benefit:
$ time hg status # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after: 0.34s user 0.12s system 99% cpu 0.471 total
$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after: 0.76s user 0.17s system 99% cpu 0.931 total
There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
Previously we'd place files written in the last second in the lookup set. This
can lead to pathological cases where a file always remains in the lookup set if
it gets modified before the next time status is run.
With this patch, only the mtime of those files is invalidated. This means that
if a file's size or mode changes, we can immediately declare it as modified
without needing to compare file contents.
This change treat the ESRCH error as ENOENT like WindowsError class
does in python 2.5 or later. Without this change, some try..execpt
code which expects errno is ENOENT may fail. Actually hg command does
not work with python 2.4 on Windows.
CreateFile() will fail with error code ESRCH
when parent directory of specified path is not exist,
or ENOENT when parent directory exist but file is not exist.
Two errors are same in the mean of "file is not exist".
So WindowsError class treats error code ESRCH as ENOENT
in python 2.5 or later, but python 2.4 does not.
Actual results with python 2.4:
>>> errno.ENOENT
2
>>> errno.ESRCH
3
>>> WindowsError(3, 'msg').errno
3
>>> WindowsError(3, 'msg').args
(3, 'msg')
And with python 2.5 (or later):
>>> errno.ENOENT
2
>>> errno.ESRCH
3
>>> WindowsError(3, 'msg').errno
2
>>> WindowsError(3, 'msg').args
(3, 'msg')
Note that there is no need to fix osutil.c because it never be used
with python 2.4.
brendan mentioned on IRC that b64decode raises a TypeError too, but while the
previous exception type may be better in general, it is much easier to make it
behave like the related C code and changes nothing for mercurial itself.
This new Python code should be equivalent in behavior to the if
statement at line 312 of parsers.c. Without this, the pure-python
parsers improperly ignore truncated revlogs as created in
test-verify.t.
requires ctypes
Why is posixfile a class?
Because the implementation needs to use the Python library call os.fdopen [1],
which sets the 'name' attribute on the Python file object it creates to the
mostly meaningless string '<fdopen>', since file descriptors don't have a name.
But users of posixfile depend on the name attribute [2] being set to a proper
value, like Python's built-in 'open' function sets it on file objects.
Python file's name attribute is read-only, so we can't just assign to it after
the file object has alrady been created.
To solve this problem, we save the name of the file on a wrapper object,
and delegate the file function calls to the wrapped (private) file object
using __getattr__.
[1] http://docs.python.org/library/os.html#os.fdopen
[2] http://docs.python.org/library/stdtypes.html#file.name
In Python lists are implemented as arrays with overallocation. As a
result, list.insert(0, ...) is O(n), whereas list.append() has an
amortised running time of O(1). Reversing the internal representation
of the list should cause a slight speedup for pure Python builds.
The posixfile_nt class has been superseded by posixfile in osutils.c,
which works on Windows NT and above. All other systems get the regular
python file class which is assigned to posixfile in posix.py (for POSIX)
and in the pure python version of osutils.py (for Win 9x or Windows NT
in pure mode).