Instead of only finding similarities in the added/removed files found
by the addremove step, follow the match object:
hg addremove -s80 foo -> add and removes files in foo
+ find similarities between files in foo
hg addremove -s80 -> add and removes files in the whole repo
+ find similarities between files in the whole repo
hg import --similarity will still work correctly (only find similarities
between files found in the patch).
The original code uses the similary score
1 - len(diff(after, before)) / len(after)
The diff can at most be the size of the 'before' file, so any small
'before' file would be considered very similar. Removing an empty file
would cause all files added in the same revision to be considered
copies of the removed file.
This changes the metric to
bytes_overlap(before, after) / len(before + after)
i.e. the actual percentage of bytes shared between the two files.