sapling/mercurial
David Greenaway ae788f807e findrenames: Optimise "addremove -s100" by matching files by their SHA1 hashes.
We speed up 'findrenames' for the use case where a user specifies they
want a similarity of 100% by matching files by their exact SHA1 hash
value. This reduces the number of comparisons required to find exact
matches from O(n^2) to O(n).
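
A minimal sketch of the idea, not the code from this change: index the
removed files by the SHA1 of their content, then look each added file up in
that index. The function name and the dict-based inputs are hypothetical and
chosen only for illustration.

```python
import hashlib


def find_exact_renames(removed, added):
    """Pair added files with removed files that have identical content.

    removed, added: dicts mapping path -> file content as bytes
    (hypothetical inputs for this sketch).
    Returns a list of (old_path, new_path) tuples.
    """
    # One pass over the removed files: index them by content hash.
    # If several removed files share the same content, keep the first path.
    by_hash = {}
    for path in sorted(removed):
        digest = hashlib.sha1(removed[path]).digest()
        by_hash.setdefault(digest, path)

    # One pass over the added files: a dict lookup replaces the
    # pairwise comparison against every removed file.
    renames = []
    for path in sorted(added):
        digest = hashlib.sha1(added[path]).digest()
        old = by_hash.get(digest)
        if old is not None:
            renames.append((old, path))
    return renames
```

Each file is hashed once and each lookup is a constant-time dict access on
average, so the number of comparisons grows linearly with the number of files.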

While it would be nice if we could just use Mercurial's pre-calculated
SHA1 hash for existing files, this hash includes the file's ancestor
information, making it unsuitable for our purposes. Instead, we calculate
the hash of old content from scratch.
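
To illustrate why the stored hash cannot be reused, here is a hedged sketch.
It assumes the stored hash is the SHA1 of the two parent ids in sorted order
followed by the file text; the helper names below are hypothetical and are
not Mercurial's API.

```python
import hashlib

# Assumption: the "no parent" id is 20 zero bytes.
NULLID = b"\0" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    """Hash of a file revision including its ancestry (assumed scheme)."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


def content_hash(text):
    """Hash of the file content alone, as used for exact-match detection."""
    return hashlib.sha1(text).digest()


text = b"identical content\n"
first = node_hash(text)             # revision with no parents
second = node_hash(text, p1=first)  # same text, different ancestry

# Same bytes on disk, yet the ancestry-aware hashes differ.
print(first != second)
# The content-only hash depends on nothing but the bytes.
print(content_hash(text) == content_hash(b"identical content\n"))
```

Under that assumption, two files with byte-identical content but different
history get different stored hashes, so only a hash of the raw content can be
compared against the newly added files.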

The following benchmarks were taken on the current head of crew:

addremove 100% similarity:
  rm -rf *; hg up -C; mv tests tests.new
  hg --time addremove -s100 --dry-run

  before:  real 176.350 secs (user 128.890+0.000 sys 47.430+0.000)
  after:   real   2.130 secs (user   1.890+0.000 sys  0.240+0.000)

addremove 75% similarity:
  rm -rf *; hg up -C; mv tests tests.new; \
      for i in tests.new/*; do echo x >> $i; done
  hg --time addremove -s75  --dry-run

  before: real 264.560 secs (user 215.130+0.000 sys 49.410+0.000)
  after:  real 218.710 secs (user 172.790+0.000 sys 45.870+0.000)
2010-04-03 11:58:16 +11:00
help Merge with stable 2010-04-29 22:14:14 -05:00
hgweb hgweb: make hgweb.hgweb a unified interface to hgweb/hgwebdir 2010-04-26 11:03:40 -05:00
pure style: use consistent variable names (*mod) with imports which would shadow 2010-03-11 17:43:44 +01:00
templates Merge with stable 2010-04-19 17:00:02 -05:00
__init__.py Add back links from file revisions to changeset revisions 2005-05-03 13:16:10 -08:00
ancestor.py Merge with stable 2010-01-19 22:45:09 -06:00
archival.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
base85.c many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
bdiff.c bdiff: do not use recursion / avoid stackoverflow (issue1940) 2010-02-18 10:32:51 +01:00
bundlerepo.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
byterange.py pylint, pyflakes: remove unused or duplicate imports 2010-04-14 17:58:10 +09:00
changegroup.py localrepo: show indeterminate progress for incoming data 2010-02-07 12:00:40 -06:00
changelog.py Merge with stable 2010-02-11 17:44:01 -06:00
cmdutil.py remoteui: copy http_proxy settings 2010-04-08 00:13:33 +09:00
commands.py commands: revised documentation of 'default' and 'default-push' 2010-04-27 00:44:06 +05:30
config.py config: handle short continuations (issue1999) 2010-01-28 23:07:28 -06:00
context.py context: remove parents parameter to workingctx 2010-04-21 01:18:31 +02:00
copies.py copies: properly visit file context ancestors on working file contexts 2010-04-07 21:31:47 +02:00
demandimport.py demandimport: blacklist _ssl (issue1964) 2010-03-09 16:03:57 +01:00
diffhelpers.c many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
dirstate.py dirstate: more explicit name, rename normaldirty() to otherparent() 2010-04-20 11:17:01 +02:00
dispatch.py Merge with stable 2010-05-01 16:15:27 +02:00
encoding.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
error.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
extensions.py dispatch: provide help for disabled extensions and commands 2010-02-07 14:01:43 +01:00
fancyopts.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
filelog.py merge with stable 2010-03-16 01:16:19 +01:00
filemerge.py filemerge: use working dir parent as ancestor for backward wdir merge 2010-04-19 20:41:53 +02:00
graphmod.py hgweb/graph: edge should be same color as the destination 2010-03-07 17:44:43 +01:00
hbisect.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
help.py help: add some help for hgweb.config files 2010-04-26 11:03:40 -05:00
hg.py Merge with stable 2010-03-18 14:36:24 -07:00
hook.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
httprepo.py httprepo: normalize output from unbundle with ssh 2010-02-24 12:35:26 -05:00
i18n.py ui: add HGPLAIN environment variable for easier scripting 2010-02-07 14:56:18 +01:00
ignore.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
keepalive.py fix coding style (reported by pylint) 2010-02-08 15:36:34 +01:00
localrepo.py tags: return tags in sorted order 2010-04-26 15:58:36 -04:00
lock.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
lsprof.py fix spaces/identation issues 2010-02-05 18:50:08 +01:00
lsprofcalltree.py drop unused imports 2009-05-14 15:35:46 +02:00
mail.py Merge with stable 2010-01-19 22:45:09 -06:00
manifest.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
match.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
mdiff.py remove header handling out of mdiff.bunidiff, rename it 2010-03-09 18:31:57 +01:00
merge.py dirstate: more explicit name, rename normaldirty() to otherparent() 2010-04-20 11:17:01 +02:00
minirst.py minirst: support all recommended title adornments 2010-04-25 18:19:54 +02:00
mpatch.c many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
node.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
osutil.c many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
parsers.c parsers: fix some signed comparison issues 2010-02-13 17:37:44 -06:00
patch.py patch: strip paths in leaked git patchmeta objects 2010-04-26 13:21:03 +02:00
posix.py util: fix default termwidth() under Windows 2010-04-26 22:30:40 +02:00
repair.py localrepo: add desc parameter to transaction 2010-04-09 17:23:35 -05:00
repo.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
revlog.py merge with stable 2010-04-15 15:35:06 +02:00
similar.py findrenames: Optimise "addremove -s100" by matching files by their SHA1 hashes. 2010-04-03 11:58:16 +11:00
simplemerge.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
sshrepo.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
sshserver.py streamclone: allow uncompressed clones by default 2010-02-07 15:31:53 +01:00
statichttprepo.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
store.py store: only add new entries to the fncache file 2010-03-03 14:50:35 +01:00
streamclone.py pylint, pyflakes: remove unused or duplicate imports 2010-04-14 17:58:10 +09:00
strutil.py Update license to GPLv2+ 2010-01-19 22:20:08 -06:00
subrepo.py subrepo: fix repo root path handling in svn subrepo 2010-04-18 14:20:08 -07:00
tags.py tags: delete unnecessary close() of atomictempfile 2010-04-24 18:08:06 +09:00
templatefilters.py templatefilters: fix check-code warning 2010-03-29 16:11:40 -05:00
templatekw.py fix coding style (reported by pylint) 2010-02-08 15:36:34 +01:00
templater.py templater: drop \ when handling escaped { 2010-04-05 15:25:08 -05:00
transaction.py many, many trivial check-code fixups 2010-01-25 00:05:27 -06:00
ui.py ui: fix check-code error 2010-04-28 21:00:07 +02:00
url.py schemes: fix // breakage with Python 2.6.5 (issue2111) 2010-04-08 11:00:46 -04:00
util.py util: fix default termwidth() under Windows 2010-04-26 22:30:40 +02:00
verify.py verify: improve progress descriptions 2010-04-13 23:12:23 -05:00
win32.py win32: detect console width on Windows 2010-04-25 18:27:12 +02:00
windows.py util: fix default termwidth() under Windows 2010-04-26 22:30:40 +02:00