Previously, the code only has what manpager says. In <linux/magic.h>, there
are more defined. This patch adds filesystems that appear in the current
Arch Linux's /proc/filesystems (autofs, overlay, securityfs) and f2fs, which
was seen in news.
Previously we have "static struct statfs" to return a string. That is not
multiple-thread safe. This patch moves the allocation to the caller to
address the problem.
Previously we check three things: "statfs" function, "linux/magic.h" and
"sys/vfs.h" headers. But we didn't check "struct statfs" or the "f_type"
field. That means if a system has "statfs" but "struct statfs" is not
defined in the two header files we check, or defined without the "f_type"
field, the compilation will fail.
This patch combines the checks (2 headers + 1 function + 1 field) together
and sets "HAVE_LINUX_STATFS". It makes setup.py faster (less checks), and
more reliable (immutable to the issue above).
This patch exports the "getfstype" method. So we can use it to enable
hardlinks for known safe filesystems.
The patch was tested manually via debugshell on a Linux system.
"mercurial.osutil.getfstype" works as expected. It's hard to mount
filesystem on user-space easily. I will add a test for real hardlink support
to indirectly test this patch, after turning on real hardlinks support for
certain whitelisted filesystems.
Currently it only has Linux filesystems, according to my Linux manpage,
built at 2016-03-15.
The code uses "if" instead of "switch" because there could be some
duplicated values.
This patch adds a simple setprocname method to osutil. The operation is not
defined by any standard and is platform-specific, the current implementation
tries to cover some major platforms (ex. Linux, OS X, FreeBSD) that is
relatively easy to support. Other platforms (Windows [4], other BSDs, ...)
can be added in the future.
The current implementation supports two methods to change process title:
a. setproctitle if available (works in FreeBSD).
b. rewrite argv in place (works in Linux [1] and Mac OS X). [2] [3]
[1]: Linux has "prctl(PR_SET_NAME, ...)" but 1) it has 16-byte limit, which
is too small; 2) it is not quite equivalent to what we want - it changes
"/proc/self/comm", not "/proc/self/cmdline" - "comm" change won't show up
in "ps" output unless "-o comm" is used.
[2]: The implementation does not rewrite the **environ buffer like some
other implementations do, just to make the code simpler and safer. However,
this also means the buffer size we can rewrite is significantly shorter. If
we are really greedy and want the "environ" space, we can change the
implementation later.
[3]: It requires a CPython private API: Py_GetArgcArgv to get the original
argv. Unfortunately Python 3 makes a copy of argv and returns the wchar_t
version, so it is not supported for now. (if we really want to, we could
count backwards from "char **environ", given known argc and argv, not sure
if that's a good idea - probably not)
[4]: The feature is aimed to make it easier for forked command server
processes to show what they are doing. Since Windows does not support
fork(), despite it's a major platform, its support is not added in this
patch.
PyIntObject doesn't exist in Python 3. While PyIntObject is preferred
on Python 2 because it is a fixed capacity and faster, the difference
between PyIntObject and PyLongObject for scenarios where performance
isn't critical or the caller isn't performing type checking shouldn't
be relevant.
So change recvfds to return a list of longs instead of ints on Python
2.
It appears that Solaris doesn't provide CMSG_LEN(), msg_control, etc. As
recvfds() is only necessary for chg, this patch just drops it if CMSG_LEN
isn't defined, which is the same workaround as Python 3.x.
https://hg.python.org/cpython/rev/c64216addd7f#l7.33
Some evil evil awful tool adds resource forks to files it's comparing.
Our Mac-specific code to do bulk stats was accidentally using "total
size" which includes those forks in the file size, causing them to be
reported as modified. This changes it to only care about the normal
data size and thus agree with what Mercurial's expecting.
It will be used to attach client's stdio files to a background chg command
server.
The socket module of Python 2.x doesn't provide recvmsg(). This could be
implemented by using ctypes, but it would be less portable than the C version
because the handling of socket ancillary data heavily depends on preprocessor.
Also, some length fields are wrongly typed in the Linux kernel.
This is a significant win for large repositories on OS X, especially with a
cold cache. Unfortunately we need to keep the lstat-based implementation around
for two reasons:
- Not all filesystems support this call.
- There's an edge case in which it's best to fall back to avoid a retry loop.
More about this in the comments.
The below tests are all performed on a Mac with an SSD running OS X 10.9, on a
repository with over 200k files. The results are best of 5 with simulated
best-effort conditions.
The gains with a hot cache are pretty impressive: 'hg status' goes from 5.18
seconds to 3.79 seconds.
However, a repository that large will probably already be using something like
hgwatchman [1], which helps much more (for this repo, 'hg status' with
hgwatchman is approximately 1 second). Where this really helps is when the
cache is cold [2]: hg status goes from 31.0 seconds to 9.66.
See http://lists.apple.com/archives/filesystem-dev/2014/Dec/msg00002.html for
some more discussion about this function.
This is based on a patch by Sean Farley <sean@farley.io>.
[1] https://bitbucket.org/facebook/hgwatchman
[2] There appears to be no easy way to clear the file cache (aka "vnodes") on
OS X short of rebooting. purge(8) purportedly does that but in my testing had
little effect. The workaround I came up with was to assume that vnode eviction
was LRU, make sure the kern.maxvnodes sysctl is smaller than the size of the
repository, then make sure we'd always miss the cache by running 'hg status' in
another clone of the repository before running it in the test repository.
In upcoming patches we'll add another implementation of listdir on OS X. That
implementation will have to fall back to this one under some circumstances,
though. We'll make _listdir be able to detect those circumstances and use the
right function as appropriate.
This makes a big difference to performance.
In a clean working directory containing 170,000 files, performance of
"hg --time diff" improves from 2.38 seconds to 1.69.
Needed to use 'Py_RETURN_TRUE' instead of 'return Py_True' to avoid
reference count errors which would randomly crash the Python
executable during merge. This only happened when you had something
configured in merge-tools and the merge was large enough.
The previous test assumed that 'os.name' was "mac" on Mac OS X. This
is not the case; 'mac' was classic Mac OS, whereas Mac OS X has 'os.name'
be 'posix'.
Please note that this change will break Mercurial on hypothetical
non-Mac OS X deployments of Darwin.
Credit to Brodie Rao for thinking of CGSessionCopyCurrentDictionary()
and Kevin Bullock for testing.
to work around http://support.microsoft.com/kb/899149.
Also, Microsoft's documentation of the CreateFile Windows API says (quote):
When an application creates a file across a network, it is better to use
GENERIC_READ | GENERIC_WRITE for dwDesiredAccess than to use
GENERIC_WRITE alone. The resulting code is faster, because the
redirector can use the cache manager and send fewer SMBs with more data.
This combination also avoids an issue where writing to a file across a
network can occasionally return ERROR_ACCESS_DENIED.
This patch adds support for py3k in osutil.c. This is accomplished by including
a header file responsible for abstracting the API differences between python 2
and python 3.
listdir_stat_type is also changed in the following way: A previous call to
PyObject_HEAD_INIT is substituted to a call to PyVarObject_HEAD_INIT, which
makes the object buildable in both python 2.x and 3.x without weird warnings.
After testing on windows, some modifications were also made in the posixfile
function, as it calls PyFile_FromFile and PyFile_SetBufSize, which are gone in
py3k. In py3k the PyFile_* API is, actually a wrapper over the io module, and
code has been adapted accordingly to fit py3k.
The posixfile_nt code hits the win32 file API directly, which
essentially amounts to performing a system call for every read and
write. This is slow.
We add a C extension that lets us use a Python file object instead,
but preserve our desired POSIX-like semantics (the ability to rename
or delete a file that is being accessed).
If the C extension is not available (e.g. in a VPS environment
without a compiler), we fall back to the posixfile_nt code.