sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 08:47:12 +03:00

Author	SHA1	Message	Date
Mads Kiilerich	6ffb3587ef	parsers: remove unnecessary gca variable in index_commonancestorsheads	2014-04-17 19:58:08 +02:00
Mads Kiilerich	b73865b95b	parsers: introduce index_commonancestorsheads This is an exact copy of index_ancestors but without the final "deepest" pruning.	2014-02-24 22:42:14 +01:00
Matt Harbison	aba75e33f4	parsers: fix compiler errors on MSVC 2008 This broke in 945eb8bd8f12.	2014-03-20 00:01:59 -04:00
Chris Jerdonek	0a2a1314d9	parsers: fail fast if Python has wrong minor version (issue4110) This change causes an informative ImportError to be raised when importing the parsers extension module if the minor version of the currently-running Python interpreter doesn't match that of the Python used when compiling the extension module. This change also exposes a parsers.versionerrortext constant in the C implementation of the module. Its presence can be used to determine whether this behavior is present in a version of the module. The value of the constant is the leading text of the ImportError raised and is set to "Python minor version mismatch". Here is an example of what the new error looks like: Traceback (most recent call last): File "test.py", line 1, in <module> import mercurial.parsers ImportError: Python minor version mismatch: The Mercurial extension modules were compiled with Python 2.7.6, but Mercurial is currently using Python with sys.hexversion=33883888: Python 2.5.6 (r256:88840, Nov 18 2012, 05:37:10) [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/ Python.app/Contents/MacOS/Python The reason for raising an error in this scenario is that Python's C API is known not to be compatible from minor version to minor version, even if sys.api_version is the same. See for example this Python bug report about incompatibilities between 2.5 and 2.6+: http://bugs.python.org/issue8118 These incompatibilities can cause Mercurial to break in mysterious, unforeseen ways. For example, when Mercurial compiled with Python 2.7 was run with 2.5, the following crash occurred when running "hg status": http://bz.selenic.com/show_bug.cgi?id=4110 After this crash was fixed, running with Python 2.5 no longer crashes, but the following puzzling behavior still occurs: $ hg status ... 
File ".../mercurial/changelog.py", line 123, in __init__ revlog.revlog.__init__(self, opener, "00changelog.i") File ".../mercurial/revlog.py", line 251, in __init__ d = self._io.parseindex(i, self._inline) File ".../mercurial/revlog.py", line 158, in parseindex index, cache = parsers.parse_index2(data, inline) TypeError: data is not a string which can be reproduced more simply with: import mercurial.parsers as parsers parsers.parse_index2("", True) Both the crash and the TypeError occurred because the Python C API's PyString_Check() returns the wrong value when the C header files from Python 2.7 are run with Python 2.5. This is an example of an incompatibility of the sort mentioned in the Python bug report above. Failing fast with an informative error message results in a better user experience in cases like the above. The information in the ImportError also simplifies troubleshooting for those on Mercurial mailing lists, the bug tracker, etc. This patch only adds the version check to parsers.c, which is sufficient to affect command-line commands like "hg status" and "hg summary". An idea for a future improvement is to move the version-checking C code to a more central location, and have it run when importing all Mercurial extension modules and not just parsers.c.	2013-12-04 20:38:27 -08:00
Mads Kiilerich	451439c52b	ancestors: remove unnecessary handling of 'left' If one of the initial nodes is also an ancestor, then that must be the only ancestor. There is no need for additional bookkeeping.	2014-02-24 22:42:13 +01:00
Mads Kiilerich	e38a229881	parsers: remove unreachable and invalid code in index_ancestors The function normally returns a list. Returning a single element instead of a list with one element would be weird.	2014-02-24 22:42:13 +01:00
David Soria Parra	e5be8c59d8	parsers: fix 'unsigned expression is always true' warning (issue4142) On Mac OS gcc-llvm throws an -Wtautological-compare warning because flen is defined as an unsigned integer, therefore flen < 0 is always true.	2014-01-23 19:08:26 +01:00
Matt Mackall	b73357aaab	merge with stable	2013-12-13 17:23:02 -06:00
Matt Mackall	c4f5764d33	mpatch: rewrite pointer overflow checks	2013-12-11 18:33:42 -06:00
Matt Mackall	30d6dbb217	parsers: backout version mismatch detection from 5f712fe8433d This introduced mandatory recompilations and breaks pure mode in tests	2013-12-01 20:46:36 -06:00
Chris Jerdonek	030ca96e57	parsers: fail fast if Python has wrong minor version (issue4110) This change causes an informative ImportError to be raised when importing the extension module parsers if the minor version of the currently-running Python interpreter doesn't match that of the Python that was used when compiling the extension module. Here is an example of what the new error looks like: Traceback (most recent call last): File "test.py", line 1, in <module> import mercurial.parsers ImportError: Python minor version mismatch: The Mercurial extension modules were compiled with Python 2.7.6, but Mercurial is currently using Python with sys.hexversion=33883888: Python 2.5.6 (r256:88840, Nov 18 2012, 05:37:10) [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/ Python.app/Contents/MacOS/Python The reason for raising an error in this scenario is that Python's C API is known not to be compatible from minor version to minor version, even if sys.api_version is the same. See for example this Python bug report about incompatibilities between 2.5 and 2.6+: http://bugs.python.org/issue8118 These incompatibilities can cause Mercurial to break in mysterious, unforeseen ways. For example, when Mercurial compiled with Python 2.7 was run with 2.5, the following crash occurred when running "hg status": http://bz.selenic.com/show_bug.cgi?id=4110 After this crash was fixed, running with Python 2.5 no longer crashes, but the following puzzling behavior still occurs: $ hg status ... 
File ".../mercurial/changelog.py", line 123, in __init__ revlog.revlog.__init__(self, opener, "00changelog.i") File ".../mercurial/revlog.py", line 251, in __init__ d = self._io.parseindex(i, self._inline) File ".../mercurial/revlog.py", line 158, in parseindex index, cache = parsers.parse_index2(data, inline) TypeError: data is not a string which can be reproduced more simply with: import mercurial.parsers as parsers parsers.parse_index2("", True) Both the crash and the TypeError occurred because the Python C API's PyString_Check returns the wrong value when the C header files from Python 2.7 are run with Python 2.5. This is an example of an incompatibility of the sort mentioned in the Python bug report above. Failing fast with an informative error message will result in a better user experience in cases like the above. The information in the ImportError will also simplify troubleshooting for those on Mercurial mailing lists, the bug tracker, etc. This patch only adds the version check to parsers.c, which is sufficient to affect command-line commands like "hg status" and "hg summary". An idea for a future improvement is to move the version-checking C code to a more central location, and have it run when importing all Mercurial extension modules and not just parsers.c.	2013-11-29 12:36:28 -08:00
Chris Jerdonek	e02a62783a	parse_index2: fix crash on bad argument type (issue4110) Passing a non-string to parsers.parse_index2() causes Mercurial to crash instead of raising a TypeError (found on Mac OS X 10.8.5, Python 2.7.6): import mercurial.parsers as parsers parsers.parse_index2(0, 0) Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 parsers.so 0x000000010e071c59 _index_clearcaches + 73 (parsers.c:644) 1 parsers.so 0x000000010e06f2d5 index_dealloc + 21 (parsers.c:1767) 2 parsers.so 0x000000010e074e3b parse_index2 + 347 (parsers.c:1891) 3 org.python.python 0x000000010dda8b17 PyEval_EvalFrameEx + 9911 This happens because when arguments of the wrong type are passed to parsers.parse_index2(), indexType's initialization function index_init() in parsers.c leaves the indexObject instance in a state that indexType's destructor function index_dealloc() cannot handle. This patch moves enough of the indexObject initialization code inside index_init() from after the argument validation code to before it. This way, when bad arguments are passed to index_init(), the destructor doesn't crash and the existing code to raise a TypeError works. This patch also adds a test to check that a TypeError is raised.	2013-11-26 16:14:22 -08:00
Bryan O'Sullivan	0fb3ccb7e6	Merge	2013-11-26 21:55:21 -08:00
Abhay Kadam	fbe291d0cf	mercurial/parsers.c: fix compiler warning When I try to compile on x64 OS X, I get this warning: mercurial/parsers.c:931:27: warning: implicit conversion loses integer precision : 'long' to 'int' [-Wshorten-64-to-32] ? 4 : self->raw_length / 2; The patch verifies that the value of self->raw_length falls below INT_MAX; if not, it raises a ValueError exception. If the value of self->raw_length is greater than 4, it is cast to int to eliminate the warning.	2013-11-19 23:49:11 +05:30
Siddharth Agarwal	f64894b936	parse_manifest: rewrite to use memchr memchr is usually smarter than a simple for loop. With gcc 4.4.6 and glibc 2.12 on x86-64, for a 20 MB, 200,000 file manifest, parse_manifest goes from 0.116 seconds to 0.095 seconds.	2013-09-06 23:47:59 -07:00
Bryan O'Sullivan	87700a719c	parsers: correctly handle a failed allocation	2013-09-16 12:17:55 -07:00
Bryan O'Sullivan	fb0fb16051	parsers: use Py_INCREF safely	2013-09-16 12:12:37 -07:00
Bryan O'Sullivan	3cfc802f1f	parsers: state is a char, not an int	2013-09-16 12:10:28 -07:00
Siddharth Agarwal	b2531e9232	parsers: use a lookup table to convert hex to binary This is a hotspot for parse_manifest. With this patch, for a 20 MB, 200,000 file manifest, parse_manifest goes down from 0.153 seconds to 0.116.	2013-09-07 00:59:24 -07:00
Siddharth Agarwal	cf3f4a0258	pack_dirstate: only invalidate mtime for files written in the last second Previously we'd place files written in the last second in the lookup set. This can lead to pathological cases where a file always remains in the lookup set if it gets modified before the next time status is run. With this patch, only the mtime of those files is invalidated. This means that if a file's size or mode changes, we can immediately declare it as modified without needing to compare file contents.	2013-08-17 20:48:49 -07:00
Siddharth Agarwal	52db69329a	ancestor.deepest: ignore ninteresting while building result (issue3984) ninteresting indicates the number of non-zero elements in the interesting array, not the number of elements in the final list. Since elements in interesting can stand for more than one gca, limiting the number of results to ninteresting is an error. Tests for issue3984 are included.	2013-07-25 14:43:15 -07:00
Wei, Elson	ec2b8c873f	ancestor.deepest: decrement ninteresting correctly (issue3984) The invariant this code tries to hold is that ninteresting is the number of non-zero elements in the interesting array. interesting[nsp] is incremented at the same time as interesting[sp] is decremented. So if interesting[nsp] was previously 0, ninteresting shouldn't be decremented.	2013-07-25 17:35:53 +08:00
Siddharth Agarwal	2267993ebc	ancestor.deepest: sort revs in C version This isn't strictly necessary, but it makes the code more consistent with the Python version.	2013-07-25 14:20:37 -07:00
André Sintzoff	cf733b64f6	parsers: remove warning: format ‘%ld’ expects argument of type ‘long int’ gcc 4.6.3 on 12.04 Ubuntu machine emits warnings: mercurial/parsers.c: In function ‘find_deepest’: mercurial/parsers.c:1288:9: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘Py_ssize_t’ [-Wformat] mercurial/parsers.c:1288:9: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘Py_ssize_t’ [-Wformat]	2013-04-18 20:28:38 +02:00
Matt Mackall	a6cee1569b	parsers: fix variable declaration position issue	2013-04-17 12:57:26 -05:00
Bryan O'Sullivan	c6b9f1099d	parsers: a C implementation of the new ancestors algorithm The performance of both the old and new Python ancestor algorithms depends on the number of revs they need to traverse. Although the new algorithm performs far better than the old when revs are numerically and topologically close, both algorithms become slow under other circumstances, taking up to 1.8 seconds to give answers in a Linux kernel repo. This C implementation of the new algorithm is a fairly straightforward transliteration. The only corner case of interest is that it raises an OverflowError if the number of GCA candidates found during the first pass is greater than 24, to avoid the dual perils of fixnum overflow and trying to allocate too much memory. (If this exception is raised, the Python implementation is used instead.) Performance numbers are good: in a Linux kernel repo, time for "hg debugancestors" on two distant revs (24bf01de7537 and c2a8808f5943) is as follows: Old Python: 0.36 sec New Python: 0.42 sec New C: 0.02 sec For a case where the new algorithm should perform well: Old Python: 1.84 sec New Python: 0.07 sec New C: measures as zero when using --time (This commit includes a paranoid cross-check to ensure that the Python and C implementations give identical answers. The above performance numbers were measured with that check disabled.)	2013-04-16 10:08:20 -07:00
Bryan O'Sullivan	8f78d582d5	scmutil: rewrite dirs in C, use if available This is over twice as fast as the Python dirs code. Upcoming changes will nearly double its speed again. perfdirs results for a working dir with 170,000 files: Python 638 msec C 244	2013-04-10 15:08:27 -07:00
Siddharth Agarwal	9334236621	dirstate: move pure python dirstate packing to pure/parsers.py	2013-01-17 23:46:08 -08:00
Yuya Nishihara	26e354a203	parsers: fix memleak of revlog cache entries on strip Since 2852b9b207e9, raw_length can be reduced on strip, but corresponding cache entries still have refcount. They are not dereferenced by _index_clearcache(), and never freed. To reproduce the problem, run "hg pull" and "hg strip null" several times in the same process.	2013-01-28 19:05:35 +09:00
Bryan O'Sullivan	dea2c50032	store: implement lowerencode in C	2012-12-12 13:09:33 -08:00
Bryan O'Sullivan	a150198558	store: implement fncache basic path encoding in C (This is not yet enabled; it will be turned on in a followup patch.) The path encoding performed by fncache is complex and (perhaps surprisingly) slow enough to negatively affect the overall performance of Mercurial. For a short path (< 120 bytes), the Python code can be reduced to a fairly tractable state machine that either determines that nothing needs to be done in a single pass, or performs the encoding in a second pass. For longer paths, we avoid the more complicated hashed encoding scheme for now, and fall back to Python. Raw performance: I measured in a repo containing 150,000 files in its tip manifest, with a median path name length of 57 bytes, and 95th percentile of 96 bytes. In this repo, the Python code takes 3.1 seconds to encode all path names, while the hybrid C-and-Python code (called from Python) takes 0.21 seconds, for a speedup of about 14. Across several other large repositories, I've measured the speedup from the C code at between 26x and 40x. For path names above 120 bytes where we must fall back to Python for hashed encoding, the speedup is about 1.7x. Thus absolute performance will depend strongly on the characteristics of a particular repository.	2012-09-18 15:42:19 -07:00
Adrian Buehlmann	fd6785ba1c	pathencode: new C module with fast encodedir() function Not yet used (will be enabled in a later patch). This patch is a stripped down version of patches originally created by Bryan O'Sullivan <bryano@fb.com>	2012-09-18 11:43:30 +02:00
Bryan O'Sullivan	4045197307	parsers: fix an integer size warning issued by clang	2012-08-13 14:04:52 -07:00
sorcerer	b04ae8ca03	revlog: don't try to partialmatch strings whose length > 40 _partialmatch() does prefix matching against nodes. The string passed to _partialmatch() may actually be any string, not only a prefix. For example, "63af8381691a9e5c52ee57c4e965eb306f86826e or 300" is a valid argument for _partialmatch(). When _partialmatch() searches using the radix tree, the index_partialmatch() C function shouldn't try to match strings that are too long.	2012-08-02 19:10:45 +04:00
Mads Kiilerich	6dedbb6378	parsers.c: remove warning: 'size' may be used uninitialized in this function Some compilers / compiler options (such as gcc 4.7) would emit warnings: mercurial/parsers.c: In function 'pack_dirstate': mercurial/parsers.c:306:18: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized] mercurial/parsers.c:306:12: warning: 'mode' may be used uninitialized in this function [-Wmaybe-uninitialized] The compiler is apparently not smart enough to figure out how the 'err' arithmetic makes sure that this can't happen. 'err' is now replaced with simple checks and goto. That might also help the optimizer when it is inlining getintat().	2012-07-06 00:48:45 +02:00
Bryan O'Sullivan	ce2a30609e	parsers: add a C function to pack the dirstate This is about 9 times faster than the Python dirstate packing code. The relatively small speedup is due to the poor locality and memory access patterns caused by traversing dicts and other boxed Python values.	2012-05-30 12:55:33 -07:00
Bryan O'Sullivan	7edc7069a5	parsers: replace magic number 64 with symbolic constant	2012-06-01 15:19:08 -07:00
Bryan O'Sullivan	1e9deb3b01	parsers: cache the result of index_headrevs Although index_headrevs is much faster than its Python counterpart, it's still somewhat expensive when history is large. Since headrevs is called several times when the tag cache is stale or missing (e.g. after a strip or rebase), there's a win to be gained from caching the result, which we do here.	2012-05-19 20:21:48 -07:00
Bryan O'Sullivan	a49ea963d7	revlog: switch to a C version of headrevs The C implementation is more than 100 times faster than the Python version (which is still available as a fallback). In a repo with 330,000 revs and a stale .hg/cache/tags file, this patch improves the performance of "hg tip" from 2.2 to 1.6 seconds.	2012-05-19 19:44:58 -07:00
Bryan O'Sullivan	f9c29929d4	parsers: reduce raw_length when truncating When stripping revs, we now update raw_length to correctly reflect the new end of the index.	2012-05-19 19:44:18 -07:00
Bryan O'Sullivan	226bc14024	parsers: use Py_CLEAR where appropriate	2012-05-13 11:56:50 +02:00
Matt Mackall	05e48d4041	merge with stable	2012-05-13 12:52:24 +02:00
Bryan O'Sullivan	058dfb801d	revlog: speed up prefix matching against nodes The radix tree already contains all the information we need to determine whether a short string is an unambiguous node identifier. We now make use of this information. In a kernel tree, this improves the performance of "hg log -q -r24bf01de75" from 0.27 seconds to 0.06.	2012-05-12 10:55:08 +02:00
Bryan O'Sullivan	f29187cd15	parsers: ensure that nullid is always present in the radix tree	2012-05-12 10:55:08 +02:00
Bryan O'Sullivan	4fe1bcbdb1	parsers: allow hex keys	2012-05-12 10:55:07 +02:00
Matt Mackall	48471fd098	merge with stable	2012-05-12 00:06:11 +02:00
Adrian Buehlmann	f5ca6da4d6	parser: use PyInt_FromSsize_t in index_stats Eliminates mercurial/parsers.c(515) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'long', possible loss of data mercurial/parsers.c(520) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'long', possible loss of data mercurial/parsers.c(521) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'long', possible loss of data when compiling for Windows x64 target using the Microsoft compiler. PyInt_FromSsize_t does not exist for Python 2.4 and earlier, so we define a fallback in util.h to use PyInt_FromLong when compiling for Python 2.4.	2012-05-09 09:58:50 +02:00
Matt Mackall	6d78ec67ed	merge with stable	2012-05-11 14:48:24 +02:00
Bryan O'Sullivan	9daeaf8600	parsers: change the type of nt_level We should generally prefer Py_ssize_t whenever we are talking about lengths.	2012-05-08 14:48:50 -07:00
Bryan O'Sullivan	2d6b967125	parsers: change the type signature of hexdigit An upcoming change will make use of this.	2012-05-08 14:48:48 -07:00
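
The "fail fast if Python has wrong minor version" commits above describe comparing the interpreter's version against the one used at compile time. The real check lives in C inside parsers.c; what follows is only a minimal Python sketch of the same idea, assuming the comparison is made on the major and minor fields of a `sys.hexversion`-style value. The function names `minor_version_mismatch` and `check_version` are hypothetical and do not appear in Mercurial; the error text prefix is the one quoted in the commit message.

```python
import sys

# Leading text of the error, as quoted in the commit message
# (exposed there as parsers.versionerrortext).
VERSION_ERROR_TEXT = "Python minor version mismatch"


def minor_version_mismatch(compiled_hex, running_hex=None):
    """Return True if the major.minor parts of two hexversion values differ.

    Hypothetical helper: the top 16 bits of a hexversion value encode the
    major and minor version, so micro and release-level bits are ignored.
    """
    if running_hex is None:
        running_hex = sys.hexversion
    return (compiled_hex >> 16) != (running_hex >> 16)


def check_version(compiled_hex, running_hex=None):
    """Raise ImportError early instead of failing later in obscure ways."""
    if minor_version_mismatch(compiled_hex, running_hex):
        raise ImportError(
            "%s: compiled with hexversion=%d, running with hexversion=%d"
            % (VERSION_ERROR_TEXT, compiled_hex,
               sys.hexversion if running_hex is None else running_hex)
        )


if __name__ == "__main__":
    # 0x020706F0 is Python 2.7.6 final; 33883888 (0x020506F0) is the
    # sys.hexversion value for Python 2.5.6 shown in the commit message.
    try:
        check_version(0x020706F0, 33883888)
    except ImportError as exc:
        print(exc)
```

Comparing only the upper 16 bits means that two micro releases of the same minor version (for example 2.7.3 and 2.7.6) are treated as compatible, which matches the commit's stated concern with minor-version differences only.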


91 Commits