os.name returns unicodes on py3 and we have pycompat.osname which returns
bytes. This series of 2 patches will change every ocurrence of os.name with
pycompat.osname.
Certain instances of os.sep has been converted to pycompat.ossep where it was
sure to use bytes only. There are more such instances which needs some more
attention and will get surely.
On PyPy this version performs reasonably well compared to C version.
Example command is "hg id" which gets faster, depending on details
of your operating system and hard drive (it's bottlenecked on stat mostly)
There is potential for improvements by storing extra as a condensed struct too.
The pure version was mpatch was throwing struct.error or ValueError
for errors, whereas the C version was throwing an "mpatch.mpatchError".
Introducing an mpatch.mpatchError into pure and using it consistently
is fairly easy, but the actual form for it is mercurial.mpatch.mpatchError,
so with this commit, we change the C implementation to match the naming
convention too.
On Solaris, recvmsg() is provided by libsocket.so. We could try hard to look
for the library which provides 'recvmsg' symbol, but it would make little sense
now since recvfds() won't work anyway on Solaris. So this patch just disables
_recvmsg() on such platforms.
Thanks to FUJIWARA Katsunori for spotting this problem.
This is less portable than the C version, but PyPy can't load CPython
extensions. So for now, this will be used on PyPy.
I've tested it on Linux amd64 and Mac OS X.
While I was here, I removed the try..except around importing cStringIO
because cStringIO should always be importable on modern Python versions.
We already do an unconditional import in other files.
Parsers.py had a reference to util.sha1 which was unused. This commit removes
this reference as well as the unused import of util to simplify the dependency
graph. This is important for the next commit which actually relocates part
of a module to eliminate a cycle.
Previously, while unpacking the dirstate we'd create 3-4 new CPython objects
for most dirstate values:
- the state is a single character string, which is pooled by CPython
- the mode is a new object if it isn't 0 due to being in the lookup set
- the size is a new object if it is greater than 255
- the mtime is a new object if it isn't -1 due to being in the lookup set
- the tuple to contain them all
In some cases such as regular hg status, we actually look at all the objects.
In other cases like hg add, hg status for a subdirectory, or hg status with the
third-party hgwatchman enabled, we look at almost none of the objects.
This patch eliminates most object creation in these cases by defining a custom
C struct that is exposed to Python with an interface similar to a tuple. Only
when tuple elements are actually requested are the respective objects created.
The gains, where they're expected, are significant. The following tests are run
against a working copy with over 270,000 files.
parse_dirstate becomes significantly faster:
$ hg perfdirstate
before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35)
after: wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95)
and as a result, several commands benefit:
$ time hg status # with hgwatchman enabled
before: 0.42s user 0.14s system 99% cpu 0.563 total
after: 0.34s user 0.12s system 99% cpu 0.471 total
$ time hg add new-file
before: 0.85s user 0.18s system 99% cpu 1.033 total
after: 0.76s user 0.17s system 99% cpu 0.931 total
There is a slight regression in regular status performance, but this is fixed
in an upcoming patch.
Previously we'd place files written in the last second in the lookup set. This
can lead to pathological cases where a file always remains in the lookup set if
it gets modified before the next time status is run.
With this patch, only the mtime of those files is invalidated. This means that
if a file's size or mode changes, we can immediately declare it as modified
without needing to compare file contents.