Commit Graph

4 Commits

Author SHA1 Message Date
Gregory Szorc
68f48d2e6f bdiff: use Python memory allocator in fixws
Python has its own memory allocation APIs. For allocations
<= 512 bytes, it allocates memory from arenas. This means that
average small allocations don't call the system allocator, which
makes them faster. Also, arena allocations cut down on memory
fragmentation, which can matter for performance in long-running
processes.

Another advantage of using the Python memory allocator is that
allocations are tracked by Python. This is a bigger deal in
Python 3, as modern versions of Python have some decent built-in
tools for examining memory usage, leaks, etc.

This patch converts a trivial malloc() + free() in the bdiff code
to use the Python allocator APIs. Since the object being
operated on is a line, chances are it will use an arena. So,
this could have a net positive impact on performance (although
I didn't measure it).
2017-03-09 11:54:25 -08:00
Mads Kiilerich
ecd76cc19e bdiff: early pruning of common prefix before doing expensive computations
It seems quite common that files don't change completely. New lines are often
pretty much appended, and modifications will often only change a small section
of the file which on average will be in the middle.

There can thus be a big win by pruning a common prefix before starting the more
expensive search for longest common substrings.

Worst case, it will scan through a long sequence of similar bytes without
encountering a newline. Splitlines will then have to do the same again ...
twice for each side. If similar lines are found, splitlines will save the
double iteration and hashing of the lines ... plus there will be less lines to
find common substrings in.

This change might in some cases make the algorith pick shorter or less optimal
common substrings. We can't have the cake and eat it.

This make hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
increase.

On mozilla-unified:
perfbdiff -m 3041e4d59df2
! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
perfbdiff 0e9928989e9c --alldata --count 10
! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)
2016-11-16 19:45:35 +01:00
Gregory Szorc
5f9c903265 bdiff: include util.h
Without this, IS_PY3K isn't define and the preprocessor uses the
incorrect module loading code, causing the module fail to load at
run-time.

After this patch, all our C extensions (except for watchman's) appear
to import correctly in Python 3!
2016-10-13 13:27:14 +02:00
Maciej Fijalkowski
04dfc79402 bdiff: split bdiff into cpy-aware and cpy-agnostic part 2016-07-13 10:46:26 +02:00