Commit Graph

200 Commits

Author SHA1 Message Date
Gregory Szorc
34b225b38d parsers: avoid PySliceObject cast on Python 3
PySlice_GetIndicesEx() accepts a PySliceObject* on Python 2 and a
PyObject* on Python 3. Casting to PySliceObject* on Python 3 was
yielding a compiler warning. So stop doing that.

With this patch, I no longer see any compiler warnings when
building the core extensions for Python 3!
2016-10-13 13:34:53 +02:00
Gregory Szorc
b946a5f05c parsers: alias more PyInt* symbols on Python 3
I feel dirty for having to do this. But this is currently our approach
for dealing with PyInt -> PyLong in Python 3 for this file.

This removes a ton of compiler warnings by fixing unresolved symbols.
2016-10-13 13:22:40 +02:00
Gregory Szorc
d61a5b6632 parsers: move PyInt aliasing out of util.h
The PyInt aliasing is only used by parsers.c. Since we don't want to
encourage the use of PyInt parsing, move the aliasing to parsers.c.
2016-10-09 13:50:53 +02:00
Gregory Szorc
216178bf54 parsers: use PyVarObject_HEAD_INIT
The macro changed slightly in Python 3, introducing curly brackets
that somehow confuse Clang into issuing a ton of compiler warnings.
Using PyVarObject_HEAD_INIT makes these go away.

It's worth noting that the code is identical: the 2nd argument to
PyVarObject_HEAD_INIT is assigned to the ob_size field and is
inserted immediately after "PyObject_HEAD_INIT(type)" is generated.
Compilers are weird.
2016-10-08 22:44:02 +02:00
Gregory Szorc
bd245cb051 parsers: convert PyString* to PyBytes*
With this change, we no longer have any occurrences of "PyString" in
our C extensions.
2016-10-08 22:02:29 +02:00
Gregory Szorc
8d34ccc102 parsers: return NULL from PyInit_parsers on Python 3
This function must return a PyObject* or the compiler complains.
2016-10-08 17:51:29 +02:00
Maciej Fijalkowski
8e7a874bdf internals: move the bitmanipulation routines into its own file
This is to allow more flexibility with the C sources -- now the
bitmanipulation routines can be safely imported without importing Python.h
2016-06-06 13:08:13 +02:00
Matt Fowles
be27b58285 parsers: fix istat macro to work with single line if statement
2016-04-05 10:43:43 -04:00
Durham Goode
f5bc3ca716 parsers: optimize filtered headrevs logic
The old native head revs logic would iterate over every node, starting from 0,
and check if every node was filtered (by testing it against the filteredrevs
python set). On large repos with hundreds of thousands of commits, this could
take 150ms.

This new logic iterates over the nodes in reverse order, and skips the filtered
check if we've seen an unfiltered child of the node. This saves approximately a
bagillion filteredrevs set checks, which shaves the time down from 150ms to
20ms during every branch cache write.
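
The reverse walk described above can be sketched in Python as follows. This is an illustration of the idea only, not the C code from this commit: the function name, the `parents` list of `(p1, p2)` tuples and the `nothead` array are assumptions made for the example. It also assumes that parents always have lower revision numbers than their children, and that a revision with an unfiltered child is itself unfiltered.

```python
def headrevs(parents, filteredrevs=frozenset()):
    """Return the unfiltered head revisions in ascending order.

    parents[rev] is a (p1, p2) tuple, where -1 means "no parent".
    """
    # nothead[rev] becomes 1 once rev is known not to be a head, either
    # because it is filtered or because an unfiltered child was seen.
    nothead = bytearray(len(parents))
    for rev in range(len(parents) - 1, -1, -1):
        # Only consult the (comparatively slow) filtered set when no
        # unfiltered child has marked this revision already.
        if not nothead[rev] and rev in filteredrevs:
            nothead[rev] = 1
            continue  # a filtered revision does not mark its parents
        for p in parents[rev]:
            if p >= 0:
                nothead[p] = 1
    return [rev for rev, flag in enumerate(nothead) if not flag]
```

In a mostly linear history almost every revision has already been marked by a child by the time it is visited, so the set lookup only runs for revisions near the heads.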
2016-03-08 00:20:08 -08:00
timeless
28949d3481 cleanup: remove superfluous space after space after equals (C)
2015-12-31 08:17:15 +00:00
Laurent Charignon
056d0af816 dirstate: add a C implementation for nonnormalentries
Before this patch, there was only a Python version of nonnormalentries.
On mozilla-central we get a 10x win by putting this function in C:
% python -m timeit -s \
        'from mercurial import hg, ui, parsers; \
        repo = hg.repository(ui.ui(), "mozilla-central"); \
        m = repo.dirstate._map' \
        'parsers.nonnormalentries(m)'

100 loops, best of 3: 3.15 msec per loop

The Python implementation runs in 31ms; a similar test gives:
10 loops, best of 3: 31.7 msec per loop

On our big repos, the win is still 10x, with the Python implementation running
in 350ms and the C implementation running in 30ms.
2015-12-21 16:27:16 -08:00
Bryan O'Sullivan
509e69c982 parsers: use PyTuple_Pack instead of manual list-filling
Suggested by Yuya.
2015-12-17 13:07:34 -08:00
Bryan O'Sullivan
964c9f9835 parsers: add a missed PyErr_NoMemory
2015-12-14 10:47:27 -08:00
Bryan O'Sullivan
c7f7939b54 parsers: check results of PyInt_FromLong (issue4771)
2015-12-14 10:47:26 -08:00
Bryan O'Sullivan
be02f1c0fd parsers: simplify error logic in compute_phases_map_sets
Since Py_XDECREF and free both accept NULL pointers, we can get by
with just two exit paths: one for success, and one for error.

This considerably simplifies reasoning about the possible ways to
exit from this function.
2015-12-14 10:47:24 -08:00
Bryan O'Sullivan
5a05512f2f parsers: narrow scope of a variable to be less confusing
2015-12-12 20:57:01 -08:00
Yuya Nishihara
bbbc39e0a4 parsers: fix parse_dirstate to check len before unpacking header (issue4979)
2015-12-02 23:04:58 +09:00
Yuya Nishihara
557c963f7b parsers: fix width of datalen variable in fm1readmarkers
Because parsers.c does not define PY_SSIZE_T_CLEAN, "s#" format requires
(const char*, int), not (const char*, Py_ssize_t).

https://docs.python.org/2/c-api/arg.html

This error caused no problem before 7d13be5f72c2, when datalen wasn't used.
But now fm1readmarkers() fails with "overflow in obsstore" on Python 2.6.9
(amd64) because the upper bits of datalen seem to be filled with 1, making it
a negative integer.

This problem does not seem to be visible in our Python 2.7 environment because
the upper bits happen to be filled with 0.
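
The effect of a 4-byte store into an 8-byte variable can be imitated in Python with the struct module. This is only a simulation of the memory layout: the helper name and the idea of modelling stale stack contents as a repeated byte are assumptions for the example, and the little-endian layout matches amd64.

```python
import struct


def read_wide_slot(stale_byte, written_length):
    """Store a 4-byte int into an 8-byte slot, then read all 8 bytes back."""
    # The slot starts out holding whatever was there before (stale data).
    slot = bytearray([stale_byte] * 8)
    # The callee writes only an int: the low 4 bytes on little-endian.
    struct.pack_into("<i", slot, 0, written_length)
    # The caller then interprets the whole slot as a 64-bit signed value.
    return struct.unpack("<q", bytes(slot))[0]
```

With stale zero bytes, `read_wide_slot(0x00, 100)` reads back 100, which is why the bug can stay hidden. With stale 0xff bytes the same call reads back a negative number.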
2015-11-07 17:43:20 +09:00
Yuya Nishihara
18a70ac37c parsers: suppress warning of signed and unsigned comparison at nt_init
Spotted by CC=clang CFLAGS='-Wall -Wextra -Wno-missing-field-initializers
-Wno-unused-parameter -Wshorten-64-to-32':

  mercurial/parsers.c:1580:24: warning: comparison of integers of different
  signs: 'Py_ssize_t' (aka 'long') and 'unsigned long' [-Wsign-compare]
                  if (self->raw_length > INT_MAX / sizeof(nodetree)) {
2015-10-18 09:05:04 +09:00
Yuya Nishihara
0318e586fb parsers: correct type of temporary variables for dirstate tuple fields
These fields are defined as int. This eliminates the following warning
spotted by CC=clang CFLAGS='-Wall -Wextra -Wno-missing-field-initializers
-Wno-unused-parameter -Wshorten-64-to-32':

  mercurial/parsers.c:625:29: warning: comparison of integers of different
  signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
                  if (state == 'n' && mtime == now) {
2015-10-17 23:14:13 +09:00
FUJIWARA Katsunori
77975f1ce1 parsers: make pack_dirstate take now in integer for consistency
On recent OSes, 'stat.st_mtime' is a double precision floating point
value so that it can represent nanoseconds, but it is not wide enough
for an actual file timestamp: nowadays, only 52 - 32 = 20 bits are
available for the fractional part of a second.

Therefore, casting it to 'int' may cause an unexpected result. See also
changeset 8102a3981272 fixing issue4836 for details.

For example, changed file A may unexpectedly be treated as "clean" in
the steps below. "rounded now" is the value obtained by rounding via
'int(st.st_mtime)' or similar.

    ---------------------+--------------------+------------------------
    "now"                |                    | timestamp of A (time_t)
    float  rounded time_t| action             | FS       dirstate
    ------ ------- ------+--------------------+-------- ---------------
    N+.nnn   N       N   |                    | ---      ---
                         | update file A      |  N
                         | dirstate.normal(A) |           N
    N+.999   N+1     N   |                    |
                         | dirstate.write()   |           N (*1)
                         |    :               |
                         | change file A      |  N
                         |    :               |
    N+1.00   N+1    N+1  |                    |
                         | "hg status" (*2)   |  N        N
    ------ ------- ------+--------------------+-------- ---------------

Timestamp N of A in dirstate isn't dropped at (*1), because "rounded
now" is N+1 at that time, even if 'st_mtime' in 'time_t' is still N.

Then, file A is unexpectedly treated as "clean" at (*2) in this case.

For consistent handling of 'stat.st_mtime', this patch makes
'pack_dirstate()' take the 'now' argument as an integer rather than
as a floating point value.

This patch makes 'PyArg_ParseTuple()' in 'pack_dirstate()' use format
'i' (= checking type mismatch or overflow), even though it is ensured
that 'now' is in the range of 32bit signed integer by masking with
'_rangemask' (= 0x7fffffff) on caller side.

This check should be cheap compared to the packing itself, and it is
useful for detecting legacy code that invokes 'pack_dirstate()' with
'now' as a floating point value.
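
The rounding problem in the table above can be reproduced with plain Python floats. This is a minimal sketch: the helper names are hypothetical, `RANGEMASK` stands in for the '_rangemask' value quoted above, and `would_drop_timestamp` only mirrors the `state == 'n' && mtime == now` comparison shown elsewhere in this log.

```python
RANGEMASK = 0x7fffffff  # same value as '_rangemask' quoted above


def float_now(sec, nsec):
    """Build a float timestamp the way a double-based st_mtime would hold it."""
    return sec + nsec * 1e-9


def int_now(sec):
    """Integer 'now' masked into the signed 32-bit range."""
    return sec & RANGEMASK


def would_drop_timestamp(state, mtime, now):
    """True when an entry's mtime is ambiguous and must not be trusted."""
    return state == 'n' and mtime == now
```

For `sec = 1444757999` and `nsec = 999999999`, the double cannot hold the fraction, so `int(float_now(sec, nsec))` yields `sec + 1`. An entry with `mtime == sec` is then not dropped, even though the clock is still within the same second. Passing `int_now(sec)` instead keeps the comparison exact.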
2015-10-14 02:40:04 +09:00
Yuya Nishihara
b1dadc9002 parsers: fix infinite loop or out-of-bound read in fm1readmarkers (issue4888)
Issue4888 was caused by a 0-length obsolete marker. If msize is zero,
fm1readmarkers() never ends.

This patch adds several bound checks to fm1readmarker(). Therefore, 0-length
and invalid-size markers should be rejected.
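
The kind of bound checking described above can be sketched in Python. The record layout here is a simplification invented for the example (a big-endian uint32 total size that includes the size field itself, followed by a payload). It is not the real marker format, and `iter_records` is a hypothetical name.

```python
import struct


def iter_records(data, offset=0):
    """Yield record payloads from data, rejecting corrupted sizes."""
    stop = len(data)
    while offset < stop:
        if stop - offset < 4:
            raise ValueError("truncated size field at offset %d" % offset)
        (msize,) = struct.unpack_from(">I", data, offset)
        # A size below 4 (including 0) would never advance past the size
        # field; a size past the end of the buffer would read out of bounds.
        if msize < 4 or msize > stop - offset:
            raise ValueError("invalid size %d at offset %d" % (msize, offset))
        yield data[offset + 4:offset + msize]
        offset += msize
```

Without the `msize < 4` check, a zero size would leave `offset` unchanged and the loop would never terminate, which is the failure mode reported in the issue.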
2015-10-11 18:30:47 +09:00
Yuya Nishihara
62c0a27d40 parsers: read sizes of metadata pair of obsolete marker at once
This will make it easy to implement bound checking. Currently fm1readmarker()
has no protection against a corrupted obsstore and can cause infinite loops or
out-of-bound reads.
2015-10-11 18:41:41 +09:00
Yuya Nishihara
d786b95595 parsers: use PyTuple_New and SET_ITEM to construct metadata pair of markers
With these 2 patches, fm1readmarkers() gets slightly faster:

  obsolete._fm1readmarkers() for 78644 entries
  58.0 -> 56.2msec
2015-09-05 16:50:35 +09:00
Yuya Nishihara
7f0688b8e9 parsers: use PyTuple_SET_ITEM() to fill new marker tuples
Because we know these tuples have no member yet, PyTuple_SetItem() isn't
necessary.
2015-09-05 16:41:21 +09:00
Augie Fackler
5d1e91bd29 parsers: fix two cases of unsigned long instead of Py_ssize_t
We had to do this before because Python 2.4 didn't understand the n
format specifier in Py_BuildValue and friends. We no longer have that
problem.
2015-08-26 10:20:07 -04:00
timeless@mozdev.org
52eae47139 spelling: behaviour -> behavior
2015-08-28 10:53:55 -04:00
Yuya Nishihara
d439794357 reachableroots: silence warning of implicit integer narrowing issued by clang
Tested with CFLAGS=-Wshorten-64-to-32 CC=clang which is the default of
Mac OS X.

Because a valid revnum shouldn't exceed INT_MAX, we don't need long width for
large tovisit array.
2015-08-14 12:25:14 +09:00
Yuya Nishihara
ac624a2f60 reachableroots: narrow scope of minidx variable
minidx is never used if includepath is false, so let's define it where it
is used.
2015-08-14 12:22:08 +09:00
Augie Fackler
e4a0ef9ede parsers: avoid int/unsigned conversions
Detected with
make local CFLAGS='-Wall -Wextra -Wno-missing-field-initializers -Wno-unused-parameter' CC=clang
2015-08-21 14:33:51 -04:00
Yuya Nishihara
4c30aae351 reachableroots: unroll loop that checks if one of parents is reachable
The difference is small, but fewer loops should be better in general:

  revset #0: 0::tip
  0) 0.001609
  1) 0.001510  93%
2015-08-16 09:30:37 +09:00
Yuya Nishihara
ffe4e45775 reachableroots: handle error of PyList_Append()
2015-08-15 19:38:03 +09:00
Yuya Nishihara
ecea4ee80c reachableroots: return list of revisions instead of set
Now we don't need a set of reachable revisions, and the caller wants a sorted
list of revisions, so constructing a set is just a waste of time.

  revset #0: 0::tip
  2) 0.002536
  3) 0.001598  63%

PyList_New() should set an appropriate exception on error, so we don't need
to call PyErr_NoMemory() manually.

This patch lacks error handling of PyList_Append() as it was before for
PySet_Add(). It should be fixed later.
2015-08-14 15:52:19 +09:00
Yuya Nishihara
72474b8722 reachableroots: use internal "revstates" array to test if rev is reachable
This is faster than using PySet_Contains().

  revset #0: 0::tip
  1) 0.003678
  2) 0.002536  68%
2015-08-14 15:49:11 +09:00
Yuya Nishihara
7f0aba37f0 reachableroots: use internal "revstates" array to test if rev is a root
The main goal of this patch series is to reduce the use of PyXxx() function
that is likely to require ugly error handling and inc/decref. Plus, this is
faster than using PySet_Contains().

  revset #0: 0::tip
  0) 0.004168
  1) 0.003678  88%

This patch ignores out-of-range roots as they are in the pure implementation.
Because reachable sets are calculated from heads, and out-of-range heads raise
IndexError, we can just take out-of-range roots as unreachable. Otherwise,
the test of "hg log -Gr '. + wdir()'" would fail.

The "heads" argument is changed to a list. Should we rename the C function
now that its signature has changed?
2015-08-14 15:43:29 +09:00
Augie Fackler
7b20700303 parsers: set exception when there's too little string data to extract parents
Previously we were returning NULL from this function without actually
setting up an exception. This fixes that problem, which was detected
with cpychecker.
2015-08-18 16:40:10 -04:00
Augie Fackler
71f1912b43 parsers: drop spurious check of readlen value
We're about to check if len < 40 after assigning readlen to len, which
means that if len < 40 we'll still abort, but I'm about to add a
sensible exception to that failure, so let's just discard this useless
check.
2015-08-18 16:39:26 -04:00
Augie Fackler
d203bdeed3 parsers: correctly decref normed value after PyDict_SetItem
Previously we were leaving this PyObject* with a refcount that was one
too high. Detected with cpychecker.
2015-08-18 16:43:26 -04:00
Augie Fackler
b4e44876ff parsers: fix two leaks in index_ancestors
Both happy paths through this function leaked the returned list:

1) If the list was of size 0 or 1, it was retained an extra time and then
   returned.

2) If the list was passed to find_deepest, it was never released before
   exiting this function.

Both paths spotted by cpychecker.
2015-08-18 17:15:04 -04:00
Yuya Nishihara
fe0b1b769d reachableroots: extend "revstates" to array of bit flags
2015-08-14 15:30:52 +09:00
Yuya Nishihara
548434fd88 reachableroots: rename "seen" array to "revstates" for future extension
It will be an array of bit flags, SEEN | ROOT | REACHABLE.
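
A Python sketch of how such a flag array can replace set lookups during the walk from heads to roots. This is a simplified illustration, not the C implementation: it omits the includepath mode, and the `parents` list of `(p1, p2)` tuples and the function signature are assumptions. As described in other entries of this log, out-of-range roots are ignored and out-of-range heads raise IndexError.

```python
SEEN, ROOT, REACHABLE = 1, 2, 4


def reachableroots(parents, minroot, heads, roots):
    """Return the sorted roots reachable from heads by following parents."""
    nrevs = len(parents)
    # Indexed by rev + 1 so that the null revision (-1) fits at index 0.
    revstates = bytearray(nrevs + 1)
    for root in roots:
        if -1 <= root < nrevs:  # out-of-range roots are simply unreachable
            revstates[root + 1] |= ROOT
    tovisit = []
    for head in heads:
        if not -1 <= head < nrevs:
            raise IndexError("head out of range")
        if not revstates[head + 1] & SEEN:
            revstates[head + 1] |= SEEN
            tovisit.append(head)
    reachable = []
    while tovisit:
        rev = tovisit.pop()
        if revstates[rev + 1] & ROOT:  # flag test instead of a set lookup
            revstates[rev + 1] |= REACHABLE
            reachable.append(rev)
            continue  # do not walk past a root
        if rev == -1:
            continue  # the null revision has no parents
        for p in parents[rev]:
            if p >= minroot and not revstates[p + 1] & SEEN:
                revstates[p + 1] |= SEEN
                tovisit.append(p)
    return sorted(reachable)
```

Each test is a single byte read and a mask, which avoids allocating an integer object per lookup.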
2015-08-14 15:23:42 +09:00
Yuya Nishihara
8267d3bb5d reachableroots: give anonymous name to short-lived "numheads" variable
I'll reuse it for the length of the roots list.
2015-08-15 18:29:58 +09:00
Yuya Nishihara
8743607a4d reachableroots: reduce nesting level by jumping to next iteration by continue
This can eliminate lines over 80 columns. No code change except for the
outermost "if" condition.
2015-08-15 18:03:47 +09:00
Yuya Nishihara
dd91337869 reachableroots: fix memleak of integer objects at includepath loop
In the first visit loop, val is decref-ed correctly after PySet_Add().
Let's do the same for the includepath loop.
2015-08-14 12:36:41 +09:00
Yuya Nishihara
814039db26 reachableroots: bail if integer object cannot be allocated
This patch also replaces Py_XDECREF() by Py_DECREF() because we know "val"
and "p" are not NULL.

BTW, we can eliminate some of this allocation and error handling of int objects
if the internal "seen" array has more information. For example,

  enum { SEEN = 1, ROOT = 2, REACHABLE = 4 };
  /* ... build ROOT mask from roots argument ... */
  if (seen[revnum + 1] & ROOT) {  /* instead of PySet_Contains(roots, val) */

From my quick hack, it is 2x faster.
2015-08-14 12:31:56 +09:00
Yuya Nishihara
1566598e14 reachableroots: verify type of each item of heads argument
Though PyInt_AS_LONG() returns a value even if its argument isn't an int object,
the read could exceed the boundary of the underlying struct. I think the C API
should be defensive against such errors.
2015-08-13 18:59:49 +09:00
Yuya Nishihara
01d4a46e15 reachableroots: verify integer range of heads argument (issue4775)
Now it raises IndexError instead of SEGV for 'wdir()' as it was before.
2015-08-13 18:38:46 +09:00
Yuya Nishihara
d210a7a2bf reachableroots: unify bail cases to raise exception correctly
Before this patch, release_seen_and_tovisit did not return NULL, so the
exception was not raised immediately. As Py_XDECREF() and free() are safe
for NULL, we can simply bail in any case.
2015-08-13 18:29:38 +09:00
Yuya Nishihara
d44f168a4b reachableroots: pass NULL to PySet_New() as it expects a pointer, not an int
2015-08-13 17:58:33 +09:00
Augie Fackler
4d8670352a reachableroots: return NULL if we're throwing an exception
Based on my reading of [0] and surrounding sections, if we want an
exception to be properly raised when something goes wrong in the C
code, we need to make sure we return NULL here. Do so.

[0] https://docs.python.org/2/extending/extending.html#back-to-the-example
2015-08-11 14:53:47 -04:00