sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 00:45:18 +03:00

Author	SHA1	Message	Date
Laurent Charignon	3a3486745e	phases: add set per phase in C phase computation To speed up the computation of draft(), secret(), divergent(), obsolete() and unstable() we need to have a fast way of getting the list of revisions that are in draft(), secret() or the union of both: not public(). This patch extends the work on phase computation in C and makes the phase computation code also return a list of sets, one for each non-public phase. Using these sets we can quickly obtain all the revisions of a given phase. We do not return a set for the public phase as we expect it to be roughly the size of the repo. Also, it can be computed easily by subtracting the entries in the non-public phases from all the revs in the repo.	2015-04-01 11:17:17 -07:00
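The per-phase sets described in the commit above can be pictured with a minimal sketch. It assumes a flat array holding one phase byte per revision (0 = public, 1 = draft, 2 = secret); the function name `collect_phase` and the array layout are hypothetical and are not taken from parsers.c.

```c
#include <assert.h>
#include <stddef.h>

enum { PHASE_PUBLIC = 0, PHASE_DRAFT = 1, PHASE_SECRET = 2 };

/* Hypothetical helper: copy every rev whose phase equals `want` into
 * `out` (which must have room for n entries) and return how many were
 * found. Only non-public phases are collected, because the public set
 * is expected to be roughly as large as the whole repository. */
size_t collect_phase(const char *phases, size_t n, char want, int *out)
{
	size_t r, count = 0;

	assert(want != PHASE_PUBLIC);
	for (r = 0; r < n; r++) {
		if (phases[r] == want)
			out[count++] = (int)r;
	}
	return count;
}
```

The public revisions are then every rev that appears in neither the draft nor the secret result, which is the subtraction the commit message mentions.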
Yuya Nishihara	1af765ff19	parsers: avoid signed integer overflow in calculation of leaf-node index If v = -INT_MAX - 1, -v would exceed INT_MAX. I don't think this would cause problems such as issue4627, but we can't blame it as a compiler bug because signed integer overflow is undefined in C.	2015-04-29 23:07:34 +09:00
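The overflow described in the commit above can be illustrated with a small sketch. The helper name `leaf_index` and the `-v - 1` encoding are assumptions made for illustration; the actual expression in parsers.c may differ.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: decode a negative slot value v into the
 * non-negative index -v - 1 without ever evaluating -v. */
int leaf_index(int v)
{
	assert(v < 0);
	/* -v is undefined behavior when v == INT_MIN. For any v < 0,
	 * v + 1 cannot overflow, and -(v + 1) lies in [0, INT_MAX]. */
	return -(v + 1);
}
```

Reordering the arithmetic this way keeps every intermediate value representable, so the result no longer depends on how a particular compiler treats signed overflow.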
Siddharth Agarwal	f6292cfad2	parsers: when available, use a presized dictionary for the file foldmap On a repo with over 300,000 files, this speeds up perffilefoldmap: before: wall 0.178421 comb 0.180000 user 0.160000 sys 0.020000 (best of 55) after: wall 0.164462 comb 0.160000 user 0.140000 sys 0.020000 (best of 59)	2015-04-15 14:35:44 -07:00
Bryan O'Sullivan	9b4a17636e	parsers: check for memory allocation overflows more carefully	2015-04-06 08:23:27 -07:00
André Sintzoff	150315ad76	parsers.c: avoid implicit conversion loses integer precision warning This warning is raised by Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn) and was introduced in 521875cc28be	2015-04-04 11:27:15 +02:00
Siddharth Agarwal	1b263fddd0	parsers: add a C function to create a file foldmap This is a hot path on case-insensitive filesystems -- it's guaranteed to be called every time 'hg status' is run. This is significantly faster than the equivalent Python code: see the following patch for numbers.	2015-03-31 23:32:27 -07:00
Siddharth Agarwal	6beb10735e	parsers._asciitransform: also accept a fallback function This function will be used in upcoming patches to provide a C implementation of the function to generate the foldmap.	2015-03-31 23:22:03 -07:00
Siddharth Agarwal	863cc77d5f	parsers: introduce an asciiupper function	2015-03-31 13:46:21 -07:00
Siddharth Agarwal	a7ee003c9b	parsers: make _asciilower a generic _asciitransform function We can now pass in whatever table we like. For example, an upcoming patch will introduce asciiupper.	2015-03-31 10:28:17 -07:00
Siddharth Agarwal	d3a8c5ea20	parsers._asciilower: use an explicit return object No functional change, but this will make upcoming patches cleaner.	2015-04-01 13:58:51 -07:00
Siddharth Agarwal	e455eaff24	parsers: factor out most of asciilower into an internal function We're going to reuse this in upcoming patches. The change to Py_ssize_t is necessary because parsers.c doesn't define PY_SSIZE_T_CLEAN. That macro changes the behavior of PyArg_ParseTuple but not PyBytes_GET_SIZE.	2015-03-31 10:25:29 -07:00
André Sintzoff	aadf1ad8aa	parsers.c: avoid implicit conversion loses integer warnings These warnings are raised by Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn) and were introduced in 37171a30314d	2015-03-29 19:06:23 +02:00
Laurent Charignon	0cf109868d	phase: compute phases in C Previously, the phase computation would grow much slower as the oldest draft commit in the repository grew older (which is very common in repos with evolve on) and the number of commits increased. By rewriting the computation in C we can speed it up from 700ms to 7ms on a large repository whose oldest draft commit is a year old.	2015-03-24 11:00:09 -07:00
Augie Fackler	b619fe8004	manifest.c: new extension code to lazily parse manifests This lets us iterate manifests in order, but do a _lot_ less work in the common case when we only care about a few manifest entries. Many thanks to Mike Edgar for reviewing this in advance of it going out to the list, which caught many things I missed. This version of the patch includes C89 fixes from Sean Farley and many correctness/efficiency cleanups from Martin von Zweigbergk. Thanks to both!	2015-01-13 14:31:38 -08:00
Augie Fackler	dd265ebee0	parsers: use k instead of n for PyArg_ParseTuple because python 2.4 is awful	2015-02-04 11:38:30 -05:00
Martin von Zweigbergk	2601c25066	_fm1readmarkers: generate list in C This moves perfloadmarkers from ! result: 63866 ! wall 0.239217 comb 0.250000 user 0.240000 sys 0.010000 (best of 42) to ! result: 63866 ! wall 0.218795 comb 0.210000 user 0.210000 sys 0.000000 (best of 46)	2015-01-27 09:22:59 -05:00
Augie Fackler	836f3a16e3	parsers: add fm1readmarker This lets us do most of the interesting work of parsing obsolete markers in C, which should provide significant time savings. Thanks to Martin von Zweigbergk for some cleanups on this code.	2015-01-23 15:11:25 -05:00
Martin von Zweigbergk	a0409817bd	parsers: rewrite index_ancestors() in terms of index_commonancestorsheads() The first 80% of index_ancestors() is identical to index_commonancestorsheads(), so just call that function instead.	2015-01-23 14:09:49 -08:00
Augie Fackler	f4bab3a3b0	parsers: avoid leaking several PyObjects in index_stats Found with cpychecker.	2015-01-23 15:55:36 -05:00
Augie Fackler	73a9456239	parsers: don't leak a reference to raise_revlog_error on success Found with cpychecker.	2015-01-23 15:50:40 -05:00
Augie Fackler	97b88be1c1	parsers: don't leak a tuple in pack_dirstate Spotted with cpychecker.	2015-01-23 15:48:18 -05:00
Augie Fackler	e20c9721be	parsers.c: fix a memory leak in index_commonancestorsheads Spotted with cpychecker.	2015-01-23 15:41:46 -05:00
Augie Fackler	4ef5247a75	parsers: avoid leaking obj in index_ancestors PySequence_GetItem returns a new reference. Found with cpychecker.	2015-01-23 15:33:27 -05:00
Augie Fackler	b5ac153d88	parsers: don't leak references to sys et al in check_python_version Found with cpychecker.	2015-01-23 15:30:21 -05:00
Augie Fackler	da503cae80	parsers: fix leak of err when asciilower hits a unicode decode error This is one of many errors detected in parsers.c by cpychecker[1]. I haven't gone through all of them yet. 1: https://gcc-python-plugin.readthedocs.org/en/latest/index.html	2015-01-23 15:19:04 -05:00
Mike Edgar	343ab73738	parsers: ensure revlog index node tree is initialized before insertion Currently, the revlog index C implementation assumes its node tree will be initialized before a new element is inserted by revnum. For example, revlog.py executes 'self.index.insert(-1, e)' in _addrevision(). This is only safe because the node tree has been initialized by a "node in self.nodemap" check made in addrevision(). (For context, this was discovered while developing an experimental revlog mixin which stores "elided nodes" via a separate code path from _addrevision(); that new code path segfaults without this patch.)	2014-12-04 12:02:02 -05:00
Mads Kiilerich	40c407ae08	parsers: introduce headrevsfiltered in C extension All extensions that have this function do support filtering. The existing headrevs function may support filtering but we cannot reliably detect whether it does.	2014-10-26 12:14:10 +01:00
Mads Kiilerich	20e288b0f3	parsers: use 'k' format for Py_BuildValue instead of 'n' because Python 2.4 'n' was introduced in Mercurial in 5d1adb6683fa and broke Python 2.4 support in mysterious ways that only showed failure in test-glog.t. Py_BuildValue failed because of the unknown format and a TypeError was thrown ... but it never showed up on the Python side and it happily continued processing with wrong data. Quoting https://docs.python.org/2/c-api/arg.html : n (integer) [Py_ssize_t] Convert a Python integer or long integer to a C Py_ssize_t. New in version 2.5. k (integer) [unsigned long] Convert a Python integer or long integer to a C unsigned long without overflow checking. This will use unsigned long instead of Py_ssize_t. That is not a good solution, but good is not an option when we have to support Python 2.4.	2014-10-23 02:42:57 +02:00
Siddharth Agarwal	e56ab5399b	parsers: add a function to efficiently lowercase ASCII strings We need a way to efficiently lowercase ASCII strings. For example, 'hg status' needs to build up the fold map -- a map from a canonical case (for OS X, lowercase) to the actual case of each file and directory in the dirstate. The current way we do that is to try decoding to ASCII and then calling lower() on the string, labeled 'orig' below: str.decode('ascii') return str.lower() This is pretty inefficient, and it turns out we can do much better. I also tested out a condition-based approach, labeled 'cond' below: (c >= 'A' && c <= 'Z') ? (c + ('a' - 'A')) : c 'cond' turned out to be slower in all cases. A 256-byte lookup table with invalid values for everything past 127 performed similarly, but this was less verbose. On OS X 10.9 with LLVM version 6.0 (clang-600.0.51), the asciilower function was run against two corpuses. Corpus 1 (list of files from real-world repo, > 100k files): orig: wall 0.428567 comb 0.430000 user 0.430000 sys 0.000000 (best of 24) cond: wall 0.077204 comb 0.070000 user 0.070000 sys 0.000000 (best of 100) lookup: wall 0.060714 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) Corpus 2 (mozilla-central, 113k files): orig: wall 0.238406 comb 0.240000 user 0.240000 sys 0.000000 (best of 42) cond: wall 0.040779 comb 0.040000 user 0.040000 sys 0.000000 (best of 100) lookup: wall 0.037623 comb 0.040000 user 0.040000 sys 0.000000 (best of 100) On a Linux server-class machine with GCC 4.4.6 20120305 (Red Hat 4.4.6-4): Corpus 1 (real-world repo, > 100k files): orig: wall 0.260899 comb 0.260000 user 0.260000 sys 0.000000 (best of 38) cond: wall 0.054818 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) lookup: wall 0.048489 comb 0.050000 user 0.050000 sys 0.000000 (best of 100) Corpus 2 (mozilla-central, 113k files): orig: wall 0.153082 comb 0.150000 user 0.150000 sys 0.000000 (best of 65) cond: wall 0.031007 comb 0.040000 user 0.040000 sys 0.000000 (best of 100) lookup: wall 0.028793 comb 0.030000 user 0.030000 sys 0.000000 (best of 100) SSE instructions might help even more, but I didn't experiment with those.	2014-10-03 18:42:39 -07:00
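The 'lookup' approach from the commit above can be sketched roughly as follows. This is a minimal illustration, assuming a 128-entry table filled at first use and a separate high-bit check for non-ASCII bytes; the names `lowertable`, `init_lowertable` and `ascii_lower` are hypothetical, and the real parsers.c code operates on Python string objects instead of raw buffers.

```c
#include <assert.h>
#include <stddef.h>

static char lowertable[128];
static int lowertable_ready = 0;

/* Fill the table once: identity for every ASCII byte except 'A'..'Z',
 * which map to 'a'..'z'. */
static void init_lowertable(void)
{
	int c;

	for (c = 0; c < 128; c++)
		lowertable[c] = (char)((c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c);
	lowertable_ready = 1;
}

/* Lowercase len bytes from src into dst. Returns 0 on success, or -1
 * as soon as a byte outside the ASCII range is seen, so that a caller
 * can fall back to a slower, encoding-aware path. */
int ascii_lower(const char *src, size_t len, char *dst)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	if (!lowertable_ready)
		init_lowertable();
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		if (c & 0x80)
			return -1;
		dst[i] = lowertable[c];
	}
	return 0;
}
```

The inner loop does one table load per byte instead of two comparisons and a conditional add, which is the difference the 'cond' and 'lookup' timings in the commit message compare.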
Matt Mackall	05afef1210	parsers: fix Py2.4 argument parsing issue Since d0ec15e840e5, we were getting this strange message with Py2.4: TypeError: argument 1 must be impossible<bad format char>, not int ..because we were using the 'n' type specifier introduced in 2.5. It turns out that offset is actually a revision number index, which ought to be an int anyway. So we store it in an int, use the 'i' specifier, rely on PyArg_ParseTuple for range checking, and rename it to avoid type confusion.	2014-10-01 14:44:24 -05:00
David Soria Parra	4a8632512f	parsers: fix uninitialized variable warning The heads pointer is not initialized correctly if filter is false, causing both clang and gcc to issue a warning. Correctly initialize heads to NULL.	2014-09-24 13:16:20 -07:00
Durham Goode	1b59b1e4c0	obsolete: use C code for headrevs calculation Previously, if there were filtered revs the repository could not use the C fast path for computing the head revs in the changelog. This slowed down many operations in large repositories. This adds the ability to filter revs to the C fast path. This speeds up histedit on repositories with filtered revs by 30% (13s to 9s). This could be improved further by sorting the filtered revs and walking the sorted list while we walk the changelog, but even this initial version that just calls __contains__ is still massively faster. The new C api is compatible for both new and old python clients, and the new python client can call both new and old C apis.	2014-09-16 16:03:21 -07:00
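A filtered head computation of the kind the commit above describes might look like the sketch below. It is only an illustration under stated assumptions: parents are stored as a flat array of pairs with -1 for "no parent", and the filter is a plain byte array (`hidden`) instead of the Python object whose `__contains__` the real code calls. The name `head_revs` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* parents[2*r] and parents[2*r+1] are the parent revs of r, or -1.
 * hidden[r] != 0 marks a filtered rev; hidden may be NULL for "no
 * filter". On return ishead[r] is 1 for each visible rev that has no
 * visible child. Returns the number of heads. */
int head_revs(const int *parents, int n, const char *hidden, char *ishead)
{
	int r, j, count = 0;

	assert(n >= 0);
	/* Start by treating every visible rev as a head. */
	for (r = 0; r < n; r++)
		ishead[r] = !(hidden && hidden[r]);
	/* Each visible rev disqualifies its parents; filtered revs are
	 * skipped so they never hide a visible parent. */
	for (r = 0; r < n; r++) {
		if (hidden && hidden[r])
			continue;
		for (j = 0; j < 2; j++) {
			int p = parents[2 * r + j];
			if (p >= 0)
				ishead[p] = 0;
		}
	}
	for (r = 0; r < n; r++)
		count += ishead[r];
	return count;
}
```

Passing NULL for `hidden` reproduces the unfiltered behavior, which loosely mirrors the commit's note that the new C API stays usable by callers that do not filter.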
Henrik Stuart	e2838d5249	parsers: avoid signed/unsigned comparison mismatch Based on warning from Microsoft Visual C++ 2008.	2014-09-08 20:57:44 +02:00
Henrik Stuart	78f716f0fd	parsers: use correct type for file offset Now using Py_ssize_t instead of long to denote offset in file whose length is already measured using Py_ssize_t. Length and offset are now consistent. Based on warning from Microsoft Visual C++ 2008.	2014-09-08 20:22:10 +02:00
Henrik Stuart	a276be3478	parsers: ensure correct return type for inline_scan The returned data type for inline_scan should be Py_ssize_t rather than long. Based on warning from Microsoft Visual C++ 2008.	2014-09-08 20:20:17 +02:00
Henrik Stuart	a18556978b	parsers: fix typing issue when constructing Python integer object The passed variable is a Py_ssize_t, not a long, and consequently should use PyInt_FromSsize_t rather than PyInt_FromLong. Fixed based on warning from Microsoft Visual C++ 2008.	2014-09-11 12:05:23 -05:00
Henrik Stuart	fc62e1d1bc	parsers: use bitmask type consistently in find_gca_candidates Normalized type usage in find_gca_candidates triggered by warning from Microsoft Visual C++ 2008.	2014-09-08 20:06:52 +02:00
Siddharth Agarwal	b06eb2e341	parsers: remove unused getintat function Warning detected by clang.	2014-07-14 15:42:31 -07:00
Siddharth Agarwal	f40a94a790	parsers: inline fields of dirstate values in C version Previously, while unpacking the dirstate we'd create 3-4 new CPython objects for most dirstate values: - the state is a single character string, which is pooled by CPython - the mode is a new object if it isn't 0 due to being in the lookup set - the size is a new object if it is greater than 255 - the mtime is a new object if it isn't -1 due to being in the lookup set - the tuple to contain them all In some cases such as regular hg status, we actually look at all the objects. In other cases like hg add, hg status for a subdirectory, or hg status with the third-party hgwatchman enabled, we look at almost none of the objects. This patch eliminates most object creation in these cases by defining a custom C struct that is exposed to Python with an interface similar to a tuple. Only when tuple elements are actually requested are the respective objects created. The gains, where they're expected, are significant. The following tests are run against a working copy with over 270,000 files. parse_dirstate becomes significantly faster: $ hg perfdirstate before: wall 0.186437 comb 0.180000 user 0.160000 sys 0.020000 (best of 35) after: wall 0.093158 comb 0.100000 user 0.090000 sys 0.010000 (best of 95) and as a result, several commands benefit: $ time hg status # with hgwatchman enabled before: 0.42s user 0.14s system 99% cpu 0.563 total after: 0.34s user 0.12s system 99% cpu 0.471 total $ time hg add new-file before: 0.85s user 0.18s system 99% cpu 1.033 total after: 0.76s user 0.17s system 99% cpu 0.931 total There is a slight regression in regular status performance, but this is fixed in an upcoming patch.	2014-05-27 14:27:41 -07:00
Siddharth Agarwal	b0368f2bfe	parsers: remove no longer used dirstate_unset	2014-05-27 15:22:23 -07:00
Siddharth Agarwal	a23ea931cf	pack_dirstate: in C version, for invalidation set dict to what we write to disk For files written out in the last second, Mercurial used to invalidate all the stat data (state, size, mode, mtime) while persisting to disk. This included invalidating the data in the dirstate dict as well. In commit c7a2ac9361a7, this was found to be unnecessary, and Mercurial switched to invalidating only the mtime. However, in the C version of pack_dirstate the value set in the dict was still the fully invalidated one. Switch to invalidating just the mtime in the dict as well.	2014-05-27 15:17:38 -07:00
Danek Duvall	7280554540	parsers.c: fix a couple of memory leaks	2014-06-11 15:31:04 -07:00
Mads Kiilerich	6ffb3587ef	parsers: remove unnecessary gca variable in index_commonancestorsheads	2014-04-17 19:58:08 +02:00
Mads Kiilerich	b73865b95b	parsers: introduce index_commonancestorsheads This is an exact copy of index_ancestors but without the final "deepest" pruning.	2014-02-24 22:42:14 +01:00
Matt Harbison	aba75e33f4	parsers: fix compiler errors on MSVC 2008 This broke in 945eb8bd8f12.	2014-03-20 00:01:59 -04:00
Chris Jerdonek	0a2a1314d9	parsers: fail fast if Python has wrong minor version (issue4110) This change causes an informative ImportError to be raised when importing the parsers extension module if the minor version of the currently-running Python interpreter doesn't match that of the Python used when compiling the extension module. This change also exposes a parsers.versionerrortext constant in the C implementation of the module. Its presence can be used to determine whether this behavior is present in a version of the module. The value of the constant is the leading text of the ImportError raised and is set to "Python minor version mismatch". Here is an example of what the new error looks like: Traceback (most recent call last): File "test.py", line 1, in <module> import mercurial.parsers ImportError: Python minor version mismatch: The Mercurial extension modules were compiled with Python 2.7.6, but Mercurial is currently using Python with sys.hexversion=33883888: Python 2.5.6 (r256:88840, Nov 18 2012, 05:37:10) [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/ Python.app/Contents/MacOS/Python The reason for raising an error in this scenario is that Python's C API is known not to be compatible from minor version to minor version, even if sys.api_version is the same. See for example this Python bug report about incompatibilities between 2.5 and 2.6+: http://bugs.python.org/issue8118 These incompatibilities can cause Mercurial to break in mysterious, unforeseen ways. For example, when Mercurial compiled with Python 2.7 was run with 2.5, the following crash occurred when running "hg status": http://bz.selenic.com/show_bug.cgi?id=4110 After this crash was fixed, running with Python 2.5 no longer crashes, but the following puzzling behavior still occurs: $ hg status ... File ".../mercurial/changelog.py", line 123, in __init__ revlog.revlog.__init__(self, opener, "00changelog.i") File ".../mercurial/revlog.py", line 251, in __init__ d = self._io.parseindex(i, self._inline) File ".../mercurial/revlog.py", line 158, in parseindex index, cache = parsers.parse_index2(data, inline) TypeError: data is not a string which can be reproduced more simply with: import mercurial.parsers as parsers parsers.parse_index2("", True) Both the crash and the TypeError occurred because the Python C API's PyString_Check() returns the wrong value when the C header files from Python 2.7 are run with Python 2.5. This is an example of an incompatibility of the sort mentioned in the Python bug report above. Failing fast with an informative error message results in a better user experience in cases like the above. The information in the ImportError also simplifies troubleshooting for those on Mercurial mailing lists, the bug tracker, etc. This patch only adds the version check to parsers.c, which is sufficient to affect command-line commands like "hg status" and "hg summary". An idea for a future improvement is to move the version-checking C code to a more central location, and have it run when importing all Mercurial extension modules and not just parsers.c.	2013-12-04 20:38:27 -08:00
Mads Kiilerich	451439c52b	ancestors: remove unnecessary handling of 'left' If one of the initial nodes is also an ancestor then that must be the only ancestor. There is no need for additional bookkeeping.	2014-02-24 22:42:13 +01:00
Mads Kiilerich	e38a229881	parsers: remove unreachable and invalid code in index_ancestors The function normally returns a list. Returning a single element instead of a list with one element would be weird.	2014-02-24 22:42:13 +01:00
David Soria Parra	e5be8c59d8	parsers: fix 'unsigned expression is always true' warning (issue4142) On Mac OS gcc-llvm throws a -Wtautological-compare warning because flen is defined as an unsigned integer, so the outcome of comparing flen with 0 is known at compile time (flen < 0 can never be true).	2014-01-23 19:08:26 +01:00
Matt Mackall	b73357aaab	merge with stable	2013-12-13 17:23:02 -06:00


133 Commits