sapling/mercurial
Bryan O'Sullivan a150198558 store: implement fncache basic path encoding in C
(This is not yet enabled; it will be turned on in a followup patch.)

The path encoding performed by fncache is complex and (perhaps
surprisingly) slow enough to negatively affect the overall performance
of Mercurial.

For a short path (< 120 bytes), the Python code can be reduced to a fairly
tractable state machine that either determines that nothing needs to be
done in a single pass, or performs the encoding in a second pass.
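
The two-pass idea above can be sketched roughly as follows. This is an illustration only, not the code in pathencode.c: the escape rules here (a small safe set, uppercase letters written as "_" plus lowercase, everything else as "~" plus two hex digits) are a deliberately simplified assumption, and the names MAX_SHORT_PATH, SAFE, encoded_length and encode_short are hypothetical. The real fncache encoding has more rules than this.

```python
MAX_SHORT_PATH = 120  # threshold quoted in the commit message

# Hypothetical safe set, for illustration only; the real fncache rules differ.
SAFE = frozenset(b"abcdefghijklmnopqrstuvwxyz0123456789/.-")


def encoded_length(path):
    """Pass 1: walk the bytes once and compute the encoded size."""
    total = 0
    for byte in path:
        if byte in SAFE:
            total += 1          # copied through unchanged
        elif 0x41 <= byte <= 0x5A:
            total += 2          # 'A'..'Z' becomes '_' + lowercase letter
        else:
            total += 3          # anything else becomes '~' + two hex digits
    return total


def encode_short(path):
    """Encode a short path, or return None so the caller can fall back."""
    if len(path) >= MAX_SHORT_PATH:
        return None             # long paths are left to the slower fallback
    if encoded_length(path) == len(path):
        return path             # nothing to do: reuse the input as-is
    # Pass 2: only reached when at least one byte needs escaping.
    out = bytearray()
    for byte in path:
        if byte in SAFE:
            out.append(byte)
        elif 0x41 <= byte <= 0x5A:
            out += b"_"
            out.append(byte + 0x20)
        else:
            out += b"~%02x" % byte
    return bytes(out)
```

In this sketch the common case (a path that needs no escaping) costs one scan and no allocation, which is the property the first pass is meant to provide.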

For longer paths, we avoid the more complicated hashed encoding scheme
for now, and fall back to Python.

Raw performance: I measured in a repo containing 150,000 files in its tip
manifest, with a median path name length of 57 bytes, and 95th percentile
of 96 bytes.

In this repo, the Python code takes 3.1 seconds to encode all path
names, while the hybrid C-and-Python code (called from Python) takes
0.21 seconds, for a speedup of about 15x (3.1 / 0.21 is roughly 14.8).

Across several other large repositories, I've measured the speedup from
the C code at between 26x and 40x.

For path names above 120 bytes where we must fall back to Python for
hashed encoding, the speedup is about 1.7x.  Thus absolute performance
will depend strongly on the characteristics of a particular repository.
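
To see how strongly the mix of path lengths matters, the per-class speedups can be combined the way Amdahl's law combines them. The sketch below is a back-of-the-envelope model, not a measurement: blended_speedup is a hypothetical helper, and long_fraction is an assumed share of the original Python encoding time spent on paths that still fall back to Python.

```python
def blended_speedup(long_fraction, short_speedup, long_speedup):
    """Overall speedup when a fraction of the old runtime gets long_speedup
    and the remainder gets short_speedup (Amdahl-style combination)."""
    if not 0.0 <= long_fraction <= 1.0:
        raise ValueError("long_fraction must be between 0 and 1")
    new_time = (long_fraction / long_speedup
                + (1.0 - long_fraction) / short_speedup)
    return 1.0 / new_time


# Example with the figures quoted above and an assumed time split:
# short paths sped up 26x, long paths 1.7x.
for fraction in (0.0, 0.1, 0.5, 1.0):
    print(fraction, round(blended_speedup(fraction, 26.0, 1.7), 1))
```

Under this model, even a modest share of time spent on long paths pulls the overall figure well below the short-path speedup.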
2012-09-18 15:42:19 -07:00
help help: fix literal block syntax 2012-09-07 00:42:42 +09:00
hgweb hgweb: respond 403 forbidden for ssl required error 2012-09-05 23:59:27 +09:00
httpclient avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
pure declare local constants instead of using magic values and comments 2012-08-27 23:16:22 +02:00
templates Merge with stable 2012-09-17 21:53:50 +02:00
__init__.py Add back links from file revisions to changeset revisions 2005-05-03 13:16:10 -08:00
ancestor.py check-code: flag 0/1 used as constant Boolean expression 2011-06-01 12:38:46 +02:00
archival.py declare local constants instead of using magic values and comments 2012-08-27 23:16:22 +02:00
base85.c base85: cast Py_ssize_t values to int (issue3481) 2012-06-04 16:59:34 +02:00
bdiff.c bdiff: check and cast first parameter value on putbe32() calls 2012-05-15 22:36:47 +02:00
bookmarks.py bookmark: take successors into account when updating (issue3561) 2012-08-26 01:28:22 +02:00
bundlerepo.py peer: introduce canpush and improve error message 2012-07-13 21:52:28 +02:00
byterange.py spelling: primarily 2012-08-17 13:58:18 -07:00
changegroup.py changegroup: decompress GZ algorithm in larger chunks for better performance 2012-04-29 20:58:50 +02:00
changelog.py fix trivial spelling errors 2012-08-15 22:38:42 +02:00
cmdutil.py amend: preserve phase of amended revision (issue3602) 2012-08-30 16:47:08 +02:00
commands.py Merge spelling fixes 2012-09-11 08:36:09 -07:00
commandserver.py fix wording and not-completely-trivial spelling errors and bad docstrings 2012-08-15 22:39:18 +02:00
config.py grammar: it-handles 2012-08-17 13:58:19 -07:00
context.py obsolete: introduce caches for all meaningful sets 2012-08-28 20:52:04 +02:00
copies.py copies: re-include root directory in directory rename detection (issue3511) 2012-06-27 13:41:04 -05:00
dagparser.py en-us: labeled 2012-08-17 13:58:18 -07:00
dagutil.py cleanup: "raise SomeException()" -> "raise SomeException" 2012-05-12 16:00:58 +02:00
demandimport.py demandimport: determine at load time if __import__ has level argument 2011-08-22 22:50:52 +02:00
diffhelpers.c diffhelpers: use Py_ssize_t in testhunk() 2012-05-12 14:00:51 +02:00
dirstate.py dirstate: drop assert 2012-07-16 16:19:53 -05:00
discovery.py bookmarks: extract valid destination logic in a dedicated function 2012-08-26 00:28:56 +02:00
dispatch.py check-code: indent 4 spaces in py files 2012-07-31 03:30:42 +02:00
encoding.py spelling: successfully 2012-08-17 13:58:19 -07:00
error.py wireproto: add out-of-band error class to allow remote repo to report errors 2011-08-02 15:21:10 -04:00
exewrapper.c exewrapper: use generic term script 2012-06-29 08:10:43 +02:00
extensions.py hooks: print out more information when loading a python hook fails 2012-07-06 18:41:25 +02:00
fancyopts.py globally: use safehasattr(x, '__call__') instead of hasattr(x, '__call__') 2011-07-25 16:24:37 -05:00
filelog.py filelog: add file function to open other filelogs 2011-05-10 17:38:58 +02:00
filemerge.py merge with stable 2012-03-13 16:29:13 -05:00
fileset.py fileset: fix generator vs list bug in fast path 2012-08-15 22:50:23 +02:00
formatter.py formatter: add basic formatters 2012-02-20 16:42:47 -06:00
graphmod.py graphlog: extract ascii drawing code into graphmod 2012-07-11 17:13:39 +02:00
hbisect.py spelling: recursion 2012-08-17 13:58:18 -07:00
help.py help: add 'mergetools' alias for the 'merge-tools' help topic 2012-08-01 00:20:10 +02:00
hg.py clone: don't fail with --update for non-local clones (issue3578) 2012-08-08 10:04:02 -05:00
hook.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
httpconnection.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
httppeer.py httprepo: ensure Content-Type header exists when pushing data 2012-07-13 13:21:20 +02:00
i18n.py i18n: use getattr instead of hasattr 2011-07-25 20:46:30 -05:00
ignore.py misc: adding missing file close() calls 2011-11-03 11:24:55 -05:00
keepalive.py fix trivial spelling errors 2012-08-15 22:38:42 +02:00
localrepo.py bookmarks: extract valid destination logic in a dedicated function 2012-08-26 00:28:56 +02:00
lock.py Merge spelling fixes 2012-09-11 08:36:09 -07:00
lsprof.py lsprof: report units correctly 2012-05-30 13:57:41 -07:00
lsprofcalltree.py drop unused imports 2009-05-14 15:35:46 +02:00
mail.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
manifest.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
match.py fix wording and not-completely-trivial spelling errors and bad docstrings 2012-08-15 22:39:18 +02:00
mdiff.py mdiff: fix diff header generation for files with spaces (issue3357) 2012-04-05 15:39:07 +02:00
merge.py merge: warn about file deleted in one branch and renamed in other (issue3074) 2012-05-23 20:50:16 +02:00
minirst.py spelling: indented 2012-08-17 13:58:18 -07:00
mpatch.c mpatch: use Py_ssize_t for string length 2012-05-20 01:28:31 +02:00
node.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
obsolete.py obsolete: import modules within mercurial/ without "from mercurial" 2012-08-28 11:15:34 -05:00
osutil.c osutil: handle deletion race with readdir/stat (issue3463) 2012-05-18 14:34:33 -05:00
parser.py en-us: labeled 2012-08-17 13:58:18 -07:00
parsers.c store: implement fncache basic path encoding in C 2012-09-18 15:42:19 -07:00
patch.py check-code: indent 4 spaces in py files 2012-07-31 03:30:42 +02:00
pathencode.c store: implement fncache basic path encoding in C 2012-09-18 15:42:19 -07:00
peer.py peer: delete double definition of method peer 2012-07-28 22:36:22 +02:00
phases.py Merge spelling fixes 2012-09-11 08:36:09 -07:00
posix.py util: implement a faster os.path.split for posix systems 2012-09-14 12:08:17 -07:00
pushkey.py pushkey: do not exchange obsole markers if feature is disabled 2012-07-28 13:33:06 +02:00
pvec.py fix trivial spelling errors 2012-08-15 22:38:42 +02:00
py3kcompat.py spelling: relies 2012-08-17 13:58:18 -07:00
repair.py strip: fix revset usage (issue3604) 2012-08-31 23:27:26 +02:00
revlog.py spelling: descendants 2012-08-17 13:58:18 -07:00
revset.py obsolete: introduce caches for all meaningful sets 2012-08-28 20:52:04 +02:00
scmutil.py scmutil: use the new faster path split 2012-09-14 12:08:55 -07:00
setdiscovery.py delete some dead comments and docstrings 2012-08-21 02:41:20 +02:00
similar.py cleanup: eradicate long lines 2012-05-12 15:54:54 +02:00
simplemerge.py cleanup: "raise SomeException()" -> "raise SomeException" 2012-05-12 16:00:58 +02:00
sshpeer.py peer: introduce real peer classes 2012-07-13 21:47:06 +02:00
sshserver.py sshserver: avoid a multi-dot attribute lookup in a hot loop 2012-09-14 12:09:44 -07:00
sslutil.py ui: optionally quiesce ssl verification warnings on python 2.5 2012-04-09 14:36:16 -07:00
statichttprepo.py peer: introduce canpush and improve error message 2012-07-13 21:52:28 +02:00
store.py store: refactor hashed encoding into its own function 2012-09-18 14:37:32 -07:00
strutil.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
subrepo.py subrepo: encode unicode path names (issue3610) 2012-09-04 15:46:04 -07:00
tags.py spelling: supersede 2012-08-17 13:58:19 -07:00
templatefilters.py cleanup: "not x in y" -> "x not in y" 2012-05-12 16:00:57 +02:00
templatekw.py templatekw: add parent1, parent1node, parent2, parent2node keywords 2012-07-10 08:43:32 -07:00
templater.py templater: abort when a template filter raises an exception (issue2987) 2012-08-17 15:12:01 -07:00
transaction.py spelling: journaling 2012-08-17 13:58:18 -07:00
treediscovery.py util: subclass deque for Python 2.4 backwards compatibility 2012-06-01 17:05:31 -07:00
ui.py fix trivial spelling errors 2012-08-15 22:38:42 +02:00
url.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
util.h store: implement fncache basic path encoding in C 2012-09-18 15:42:19 -07:00
util.py util: implement a faster os.path.split for posix systems 2012-09-14 12:08:17 -07:00
verify.py verify: do not choke on valid changelog without manifest 2012-08-21 20:51:16 +02:00
win32.py avoid using abbreviations that look like spelling errors 2012-08-27 23:14:27 +02:00
windows.py util: implement a faster os.path.split for posix systems 2012-09-14 12:08:17 -07:00
wireproto.py wireproto: workaround for yield inside try/finally incompatible with python2.4 2012-09-18 17:00:58 +02:00