sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 00:45:18 +03:00

Author	SHA1	Message	Date
Durham Goode	847f748e59	manifest: add __nonzero__ method This adds a __nonzero__ method to manifestdict. This isn't strictly necessary in the vanilla Mercurial implementation, since Python will handle nonzero checks by using __len__, but having it implemented here makes it easier for alternative implementations to implement __nonzero__ and have them be plug-n-play with the normal implementation.	2016-11-03 17:31:14 -07:00
Pulkit Goyal	41a3214683	py3: have bytes version of sys.argv sys.argv returns unicode strings on Python 3, and we need a bytes version. There was a Python bug/feature request asking for one to be implemented. It was rejected, and one of the comments notes that we can use fsencode() to get a bytes version of sys.argv, though I am not sure about its correctness. Link to the comment: http://bugs.python.org/issue8776#msg217416 After this patch we will have pycompat.sysargv, which gives us a bytes version of sys.argv. If this patch goes in, I would like to make the transformer rewrite sys.argv with pycompat.argv because there are a lot of occurrences.	2016-11-06 04:36:26 +05:30
Augie Fackler	23fc422908	util: use '\\' rather than using r'\' We need bytes, and I find this just a little more immediately obvious than doing rb'\'.	2016-10-09 09:00:47 -04:00
Augie Fackler	96ff73d801	util: use pycompat urlunquote function	2016-10-09 09:03:10 -04:00
Augie Fackler	c2678fbe9d	pycompat: introduce an alias for urllib.unquote We have to use unquote_to_bytes on Python 3, so we need an abstraction for this.	2016-10-09 09:02:25 -04:00
Christian Ebert	bfdeb77b45	keyword: handle filectx _customcmp Suggested by Yuya Nishihara: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089461.html Related to issue5364.	2016-10-17 17:42:46 +02:00
Yuya Nishihara	6d3379c3c6	mail: do not print(), use ui.debug() instead Since print() can't take a bytes output, it's pretty useless in Mercurial on Python 3. As this is a debug message, switching to ui.debug() seems fine.	2016-10-20 22:20:31 +09:00
Yuya Nishihara	be2b0e18a4	progress: obtain stderr from ui This will help Python 3 porting.	2016-10-20 22:12:48 +09:00
Yuya Nishihara	1a2cfb06b7	simplemerge: obtain stdout from ui This will help Python 3 porting.	2016-10-20 22:09:50 +09:00
Yuya Nishihara	e23e6c94ff	profiling: obtain stderr from ui This will help Python 3 porting.	2016-10-20 22:07:03 +09:00
Gregory Szorc	9c0cfd877d	bdiff: replace hash algorithm This patch replaces lyhash with the hash algorithm used by diffutils. The algorithm has its origins in Git commit 2e9d1410, which is all the way back from 1992. The license header in the code at that revision is GPL v2. I have not performed an extensive analysis of the distribution (and therefore buckets) of hash output. However, `hg perfbdiff` gives some clear wins. I'd like to think that if it is good enough for diffutils it is good enough for us? From the mozilla-unified repository: $ perfbdiff -m 3041e4d59df2 ! wall 0.053271 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) ! wall 0.035827 comb 0.040000 user 0.040000 sys 0.000000 (best of 100) $ perfbdiff 0e9928989e9c --alldata --count 100 ! wall 6.204277 comb 6.200000 user 6.200000 sys 0.000000 (best of 3) ! wall 4.309710 comb 4.300000 user 4.300000 sys 0.000000 (best of 3) From the hg repo: $ perfbdiff 35000 --alldata --count 1000 ! wall 0.660358 comb 0.660000 user 0.660000 sys 0.000000 (best of 15) ! wall 0.534092 comb 0.530000 user 0.530000 sys 0.000000 (best of 19) Looking at the generated assembly and statistical profiler output from the kernel level, I believe there is room to make this function even faster. Namely, we're still consuming data character by character instead of at the word level. This translates to more loop iterations and more instructions. At this juncture though, the real performance killer is that we're hashing every line. We should get a significant speedup if we change the algorithm to find the longest prefix, longest suffix, treat those as single "lines" and then only do the line splitting and hashing on the parts that are different. That will require a lot of C code, however. I'm optimistic this approach could result in a ~2x speedup.	2016-11-06 18:51:57 -08:00
Gregory Szorc	d1e86f5ae3	profiling: make statprof the default profiler (BC) The statprof sampling profiler runs with significantly less overhead. Its data is therefore more useful. Furthermore, its default output shows the hotpath by default, which I've found to be way more useful than the default profiler's function time table. There is one behavioral regression with this change worth noting: the statprof profiler currently doesn't profile individual hgweb requests like lsprof does. This is because the current implementation of statprof only profiles the thread that started profiling. The ability for lsprof to profile individual hgweb requests is relatively new and likely not widely used. Furthermore, I have plans to modify statprof to support profiling multiple threads. I expect that change to go through several iterations. I'm submitting this patch first so there is more time to test statprof. Perfect is the enemy of good.	2016-11-04 21:44:25 -07:00
Gregory Szorc	c60af834ba	profiling: use vendored statprof and upstream enhancements (BC) Now that the statprof module is vendored and suitable for use, we switch our statprof profiler to use it. This required some minor changes because of drift between the official statprof profiler and the vendored copy. We also incorporate Facebook's improvements from the "statprofext" extension at https://bitbucket.org/facebook/hg-experimental, notably support for different display formats. Because statprof output is different, this is marked as BC. Although most users likely won't notice since most users don't profile.	2016-11-04 20:50:38 -07:00
Yuya Nishihara	f9eda5312c	crecord: use scmutil.termsize()	2016-10-20 23:16:32 +09:00
Yuya Nishihara	d1b623a5c1	scmutil: extend termwidth() to return terminal height, renamed to termsize() It appears crecord.py has its own termsize() function. I want to get rid of it. The fallback height is chosen from the default of cmd.exe on Windows, and VT100 on Unix.	2016-10-20 23:09:05 +09:00
Yuya Nishihara	143bd6fcbb	scmutil: clarify that we explicitly do termwidth - 1 on Windows I was a bit confused since we didn't add 1 to the width, which is different from the example shown in StackOverflow. http://stackoverflow.com/a/12642749	2016-10-20 22:57:12 +09:00
Yuya Nishihara	a85ef4fbe3	scmutil: remove superfluous indent from termwidth()	2016-10-20 21:57:32 +09:00
Yuya Nishihara	da9da141c4	scmutil: narrow ImportError handling in termwidth() The array module must exist. It's sufficient to suppress the ImportError of termios. Also salvaged the comment why we have to handle AttributeError, from 54db81f689bd.	2016-10-20 21:50:29 +09:00
Yuya Nishihara	5b9ee7c321	scmutil: make termwidth() obtain stdio from ui I'm getting rid of direct sys.stderr\|out\|in references so Py3 porting will be slightly easier.	2016-10-20 21:42:11 +09:00
Yuya Nishihara	52e59de51c	scmutil: move util.termwidth() I'm going to get rid of sys.stderr\|out\|in references from posix.termwidth(). In order to do that, termwidth() needs to take a ui, but functions in util.py shouldn't depend on a ui object. So moves termwidth() to scmutil.py.	2016-10-20 21:38:44 +09:00
Gregory Szorc	e6dba2eac7	bdiff: don't check border condition in loop `plast = a + len - 1`. So, this "for" loop iterates from "a" to "plast", inclusive. So, `p == plast` can only be true on the final iteration of the loop. So checking for it on every loop iteration is wasteful. This patch simply decreases the upper bound of the loop by 1 and adds an explicit check after iteration for the `p == plast` case. We can't simply add 1 to the initial value for "i" because that doesn't do the correct thing on empty input strings. `perfbdiff -m 3041e4d59df2` on the Firefox repo becomes significantly faster: ! wall 0.072763 comb 0.070000 user 0.070000 sys 0.000000 (best of 100) ! wall 0.053221 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) For the curious, this code has its origins in 3605ecc924ef, which is the changeset that introduced bdiff.c in 2005. Also, GNU diffutils is able to perform a similar line-based diff in under 20ms. So there's likely more perf wins to be found in this code. One of them is the hashing algorithm. But it looks like mpm spent some time testing hash collisions in c22788816627. I'd like to do the same before switching away from lyhash, just to be on the safe side.	2016-11-06 00:37:50 -07:00
Gregory Szorc	51504da4ad	perf: add perfbdiff bdiff shows up a lot in profiling. I think it would be useful to have a perf command that runs bdiff over and over so we can find hot spots.	2016-11-05 23:41:52 -07:00
Pulkit Goyal	96b2983c6a	help: show help for disabled extensions (issue5228) This patch does not exactly solve issue5228 but it improves the situation. For disabled extensions, we used to parse the module, get the first occurrence of a docstring, and return its first line as an introductory heading for the extension. This is what we get today. This patch returns the whole docstring of the module as help for the extension, which is more informative. Some modules don't have much of a docstring at top level except the heading, so those are unaffected by this change. To follow the existing trend of showing commands, we would either have to load the extension or use a very ugly parsing method which doesn't even ensure correctness.	2016-11-06 06:54:31 +05:30
Pulkit Goyal	22fabd6f33	py3: make scmutil.rcpath() return bytes This patch makes sure scmutil.rcpath() returns bytes independent of which platform is used on Python 3. If we want to change the type for Windows we can just conditionalize the return variable.	2016-11-06 04:17:19 +05:30
Pulkit Goyal	af51d6f213	py3: use pycompat.ossep at certain places Certain instances of os.sep have been converted to pycompat.ossep where it was certain that only bytes are used. There are more such instances which need further attention and will be handled later.	2016-11-06 04:10:33 +05:30
Pulkit Goyal	92076ed1a3	py3: have pycompat.ospathsep and pycompat.ossep We needed bytes version of os.sep and os.pathsep in py3 as they return unicodes.	2016-11-06 03:44:44 +05:30
Pulkit Goyal	214b36d54b	py3: add a bytes version of os.name os.name returns unicode on py3. Most of our checks are like os.name == 'nt'. Because of the transformer, on the right hand side we have b'nt'. The condition will never be satisfied even if os.name returns 'nt', as that will be a unicode string. We either need to encode every occurrence of os.name or have a new variable, which is much cleaner. Now we have pycompat.osname. There are around 53 occurrences of os.name in the codebase which need to be replaced by pycompat.osname to support Python 3.	2016-11-06 03:33:22 +05:30
Pulkit Goyal	0f327320e7	py3: make util.datapath a bytes variable In this patch we make util.datapath a bytes variable, but we have to pass a unicode to gettext.translation otherwise it will cry. Used pycompat.fsdecode() to decode it back to unicode as it was converted to bytes using pycompat.fsencode().	2016-11-06 12:18:23 +09:00
Pulkit Goyal	3c0a6ae01d	py3: add os.fsdecode() as pycompat.fsdecode() We need to use os.fsdecode() but this was not present in Python 2. So added the function in pycompat.py	2016-11-06 03:12:40 +05:30
Gregory Szorc	8ed15e9613	statprof: return state from stop() I don't like global variables. Have stop() return the captured state so callers can pass data to the display function.	2016-11-04 20:22:37 -07:00
Yuya Nishihara	6aeed209ab	hghave: check darcs version more strictly test-convert-darcs.t suddenly started failing on my Debian sid machine. The reason was Darcs was upgraded from 2.12.0 to 2.12.4 so the original pattern got to match the last two digits. Fix the pattern to match 2.2+.	2016-11-05 13:20:53 +09:00
Yuya Nishihara	4560adf4ef	tests: silence output of darcs command It appears darcs is more verbose by default these days. I got test failure with Darcs 2.12.4.	2016-11-05 13:16:40 +09:00
Durham Goode	d793e01462	manifest: remove manifest.readshallowdelta This removes manifest.readshallowdelta and converts its one consumer to use manifestlog instead.	2016-11-02 17:10:47 -07:00
Durham Goode	f952eca1af	manifest: get rid of manifest.readshallowfast This removes manifest.readshallowfast and converts its one user to use manifestlog instead.	2016-11-02 17:10:47 -07:00
Durham Goode	b393b6387d	manifest: add shallow option to treemanifestctx.readdelta and readfast The old manifest had different functions for performing shallow reads, shallow readdeltas, and shallow readfasts. Since a lot of the code is duplicate (and since those functions don't make sense on a normal manifestctx), let's unify them into flags on the existing readdelta and readfast functions. A future diff will change consumers of these functions to use the manifestctx versions and will delete the old apis.	2016-11-02 17:10:47 -07:00
Durham Goode	974eda820c	manifest: change manifestlog mancache to be directory based In the last patch we added a get() function that allows fetching directory level treemanifestctxs. It didn't handle caching at directory level though, so we need to change our mancache to support multiple directories.	2016-11-02 17:10:47 -07:00
Durham Goode	cd705bb046	manifest: add manifestlog.get to obtain subdirectory instances Previously manifestlog only allowed obtaining root level manifests. Future patches will need direct access to subdirectory manifests as part of changegroup creation, so let's add a get() function that knows how to deal with subdirectories.	2016-11-02 17:24:06 -07:00
Durham Goode	c70bd2fb82	manifest: throw LookupError if node not in revlog When accessing a manifest via manifestlog[node], let's verify that the node actually exists and throw a LookupError if it doesn't. This matches the old read behavior, so we don't accidentally return invalid manifestctxs. We do this in manifestlog instead of in the manifestctx/treemanifestctx constructors because the treemanifest code currently relies on the fact that certain code paths can produce treemanifests without touching the revlogs (and it has tests that verify things work if certain revlogs are missing entirely, so they break if we add validation that tries to read them).	2016-11-02 17:33:31 -07:00
Gregory Szorc	83ab000007	revlog: optimize _chunkraw when startrev==endrev In many cases, _chunkraw() is called with startrev==endrev. When this is true, we can avoid an extra index lookup and some other minor operations. On the mozilla-unified repo, `hg perfrevlogchunks -c` says this has the following impact: ! read w/ reused fd ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27) ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30) ! read batch w/ reused fd ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197) ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196) So, we've gone from ~25x slower than batch to ~22.5x slower. At this point, there's probably not much else we can do except implement an optimized function in the index itself, including in C.	2016-10-23 10:40:33 -07:00
Gregory Szorc	4d79c96e22	revlog: inline start() and end() for perf reasons When I implemented `hg perfrevlogchunks`, one of the things that stood out was N * _chunk() calls was ~38x slower than 1 _chunks() call. Specifically, on the mozilla-unified repo: N_chunk: 0.528997s 1_chunks: 0.013735s This repo has 352,097 changesets. So the average time per changeset comes out to: N_chunk: 1.502us 1_chunks: 0.039us If you extrapolate these numbers to a repository with 1M changesets, that comes out to 1.502s versus 0.039s, which is significant. At these latencies, Python attribute lookups and function calls matter. So, this patch inlines some code to cut down on that overhead. The impact of this patch on N*_chunk() calls is clear: ! wall 0.528997 comb 0.520000 user 0.500000 sys 0.020000 (best of 19) ! wall 0.367723 comb 0.370000 user 0.350000 sys 0.020000 (best of 27) So, we go from ~38x slower to ~27x. A nice improvement. But there's still a long way to go. It's worth noting that functionality like revsets perform changelog lookups one revision at a time. So this code path is worth optimizing.	2016-10-22 15:41:23 -07:00
Gregory Szorc	52757f4357	revlog: reorder index accessors to match data structure order Index entries are ordered tuples. We have accessors in the revlog class to map tuple offsets to names. To help reinforce the order, reorder the methods so they match the order of elements in the tuple. While I'm here, also sneak in some minimal documentation.	2016-10-23 09:34:55 -07:00
Pierre-Yves David	1a69a24d99	color: add the ability to display configured style to 'debugcolor' The 'hg debugcolor' command gains a '--style' flag to display all the configured labels and their styles. This has many benefits: * discovering documented labels, * checking consistency between labels' styles, * showing the actual style of a label.	2016-11-03 15:17:02 +01:00
Pierre-Yves David	a87b2dcecc	color: sort output of 'debugcolor' The previous ordering was provided by the set. The new output is more stable and rational. In addition we have some logic to keep the '_background' version together to help readability.	2016-11-03 15:15:47 +01:00
Pierre-Yves David	d254535be6	color: extract color and effect display from 'debugcolor' We are about to introduce a second mode for 'hg debugcolor' that would list the known label and their configuration, so we split the code related to color and effect out of the main function.	2016-11-03 14:48:47 +01:00
Pierre-Yves David	9bb3c03012	color: restore _style global after debugcolor ran Before this change, running 'debugcolor' would destroy all color style for the rest of the process life. We now properly backup and restore the variable content. Using a global variable is sketchy in general and could probably be removed. However, this is a quest for another adventure.	2016-11-03 14:29:19 +01:00
Pierre-Yves David	0551e941c2	color: add basic documentation to 'debugcolor' This does not hurt.	2016-11-03 14:12:32 +01:00
Pierre-Yves David	938c5644af	tests: merge 'test-push-hook-lock.t' into 'test-push.t' That test file is very small and is merged into the new 'test-push.t'. No logic is changed. We don't register this as a copy because it is actually a "ypoc" merging two files together without replacing the destination, and Mercurial cannot express that.	2016-11-03 05:12:23 +01:00
Pierre-Yves David	8eb3ae733d	tests: merge 'test-push-validation.t' into 'test-push.t' That test file is very small and is merged into the new 'test-push.t'. No logic is changed but repository names are updated to avoid collisions. We don't register this as a copy because it is actually a "ypoc" merging two files together without replacing the destination, and Mercurial cannot express that.	2016-11-03 05:10:14 +01:00
Pierre-Yves David	7cb95b138e	test: rename 'test-push-r.t' to 'test-push.t' We do not have a simple test for 'hg push' but we have multiple tiny tests for various aspect of it. We'll unify them into a single file, and we start with 'test-push-r.t'. The code is unchanged but we renamed the repository used to avoid collision with other tests we'll import in coming changesets. Test timing for the record: start end cuser csys real Test 1.850 2.640 0.650 0.090 0.790 test-push-validation.t 2.640 3.520 0.760 0.090 0.880 test-push-hook-lock.t 0.000 1.850 1.560 0.210 1.850 test-push-r.t	2016-11-03 04:58:46 +01:00
Pierre-Yves David	91d0a7cccf	tests: simplify command script in 'test-push-r.t' I came across this code by chance. The script of this test is a bit messy with a lot of unnecessary intermediate commands. We simplify the script and unify repository access through '-R'. In the process the update after the unbundle is dropped as it does not add anything to the tests.	2016-11-03 05:05:34 +01:00
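
Several of the py3 commits listed above (bytes versions of sys.argv, os.name, os.sep and os.pathsep, plus fsdecode) follow one pattern: compute a bytes counterpart once, then compare against bytes literals everywhere else. The following is a minimal sketch of that pattern only. The module-level names mirror those mentioned in the commit messages, but the bodies are assumptions for illustration, not the actual Mercurial pycompat module.

```python
import os
import sys

# Illustrative stand-ins for the names mentioned in the commit messages.
# The implementations here are assumptions, not Mercurial's real code.
ispy3 = sys.version_info[0] >= 3

if ispy3:
    fsencode = os.fsencode
    fsdecode = os.fsdecode
    # os.name, os.sep and os.pathsep are str on Python 3; keep bytes copies.
    osname = os.name.encode("ascii")
    ossep = os.sep.encode("ascii")
    ospathsep = os.pathsep.encode("ascii")
    # sys.argv holds str on Python 3; fsencode() gives the bytes form.
    sysargv = [fsencode(arg) for arg in sys.argv]
else:
    # On Python 2 these values are already byte strings.
    def fsencode(value):
        return value

    fsdecode = fsencode
    osname = os.name
    ossep = os.sep
    ospathsep = os.pathsep
    sysargv = sys.argv


def iswindows():
    # Comparing bytes with bytes works on both major versions, whereas
    # os.name == b'nt' is always False on Python 3.
    return osname == b"nt"
```

With this in place, a check such as `os.name == 'nt'` (whose literal the source transformer turns into `b'nt'`) can be written against `osname` instead, which is the replacement the os.name commit describes.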

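The `__nonzero__` commit at the top of the log notes that Python falls back to `__len__` for truth checks, and that an explicit method makes alternative implementations easier to plug in. A small sketch of that behaviour follows; the class names are hypothetical and are not Mercurial's manifestdict.

```python
class LenOnlyManifest(object):
    """Hypothetical container that relies on __len__ for truthiness."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def __len__(self):
        return len(self._entries)


class ExplicitManifest(LenOnlyManifest):
    """Same container with an explicit truth method.

    An alternative implementation could answer this without
    computing a full length.
    """

    def __nonzero__(self):  # consulted by Python 2
        return bool(self._entries)

    __bool__ = __nonzero__  # consulted by Python 3


empty = ExplicitManifest()
populated = ExplicitManifest({b"README": b"0123"})
```

Both classes behave the same in `if manifest:` checks; the explicit method only matters when computing the length would be expensive or unavailable.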

30091 Commits