The histedit command uses a revset like:
(_intlist('1234\x001235')) and merge()
Previously the optimizer gave a weight of 1.5 to the _intlist side (1 for the
function, 0.5 for the string) which caused it to process the merge() side first.
This caused it to evaluate merge against every commit in the repo, which took
2.5 seconds on a large repo.
I changed the weight of _intlist to 0, since it's a trivial calculation, which
makes it process intlist first, which makes merge apply only to the revs in the
list. Which makes the revset take 0.15 seconds now. Cutting off 2.4 seconds off
our histedit performance.
>From the revset benchmark:
revset #25: (_intlist('20000\x0020001')) and merge()
0) obsolete feature not enabled but 54243 markers found!
! wall 0.036767 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.000198 comb 0.000000 user 0.000000 sys 0.000000 (best of 9084)
Strip executes a revset like this:
max(parents(_intlist('1234\x001235')) - _intlist('1234\x001235'))
Previously the parents() revset would do 'subset & parents' which iterates over
each item in the subset and checks if it's in parents. subset is usually the
entire repo (a spanset) so this takes a while.
Reversing the parameters to be 'parents & subset' means the operation becomes
O(number of parents) instead of O(size of repo). It also means the result gets
evaluated immediately (since parents isn't a lazy set), but I think this is a
win in most scenarios.
This shaves 0.3 seconds off strip (amend/histedit/rebase/etc) for large repositories.
revset #0: parents(20000)
0) obsolete feature not enabled but 54243 markers found!
! wall 0.006256 comb 0.010000 user 0.010000 sys 0.000000 (best of 289)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.000391 comb 0.000000 user 0.000000 sys 0.000000 (best of 4323)
Previously descendants() would force the provided subset to become a set. In
the case of revsets like '(%ld::) - (%ld)' (as used by histedit) this would
force the '- (%ld)' set to be evaluated, which produced a set containing every
commit in the repo (except %ld). This takes 0.6s on large repos.
This changes descendants to trust the subset to implement __contains__
efficiently, which improves the above revset to 0.16s. Shaving 0.4 seconds off
of histedit.
revset #27: (20000::) - (20000)
0) obsolete feature not enabled but 54243 markers found!
! wall 0.023640 comb 0.020000 user 0.020000 sys 0.000000 (best of 100)
1) obsolete feature not enabled but 54243 markers found!
! wall 0.019589 comb 0.020000 user 0.020000 sys 0.000000 (best of 100)
This commit removes the final revset related perf hotspot from histedit.
Combined with the previous two patches, they shave a little over 3 seconds off
histedit on large repos.
The internal commit API was changed in 2eef89bfd70d to expect None from the
filectx function when a file is to be deleted, not an IOError. This change
keeps synthrepo up-to-date.
Simplifies the rpm build process.
We will use platform specific rpmbuild directories and will not clean them and
will drop the explicit copy to build directory.
Docker can be run by ordinary users if they are in the docker group. The build
process would however be run as a root user, only protected by the sandboxing.
That caused problems with the shared directory where rpmbuild would be picky
about building from sources owned by less privileged users and producing files
owned by root.
Instead, add a build user with the right uid/gid to the image and run the
docker process as that user.
The added revset is used by obsolescence and currently results in
recursion in __contains__ between 2 lazysets. We should have
coverage of this revset.
Matt Mackall said:
The goal of simplemerge should have always been to be a drop-in
replacement for RCS merge. Please nuke this minimization thing entirely.
This whole things is now dead.
f5a63a5506d2 regressed performance of baseset.__sub__ by introducing
a lazyset. This patch restores that lost performance by eagerly
evaluating baseset.__sub__ if the other set is a baseset.
revsetbenchmark.py results impacted by this change:
revset #6: roots(0::tip)
0) wall 2.923473 comb 2.920000 user 2.920000 sys 0.000000 (best of 4)
1) wall 0.077614 comb 0.080000 user 0.080000 sys 0.000000 (best of 100)
revset #23: roots((0:tip)::)
0) wall 2.875178 comb 2.880000 user 2.880000 sys 0.000000 (best of 4)
1) wall 0.154519 comb 0.150000 user 0.150000 sys 0.000000 (best of 61)
On the author's machine, this slowdown manifested during evaluation of
'roots(%ln::)' in phases.retractboundary after unbundling the Firefox
repository. Using `time hg unbundle firefox.hg` as a benchmark:
Before: 8:00
After: 4:28
Delta: -3:32
For reference, the subset and cs baseset instances impacted by this
change were of lengths 193634 and 193627, respectively.
Explicit test coverage of roots(%ln::), while similar to the existing
roots(0::tip) benchmark, has been added.
This patch changes the calling signature of memfilectx's __init__ to fall in
line with the other file contexts.
Calling code and tests have been updated accordingly.
On my Linux machine, paths seen by 2to3 include the build directory. We
switch from an exact to substring match to allow 2to3 to work in more
environments.
The ``editmerge`` script is shipped in contrib and opens an editor on
every conflicting file. It needs minimal configuration to inject the
config marker in the file before opening. Otherwise it behaves the
same as ``internal:local`` and bad things happen.
We add a -R/--repo option to run the benchmarks on another repository. This is
very useful as some repository are bigger/more interesting than the mercurial one.
Before this changeset, you had to stand in the root of the mercurial repo to run
the `revsetbenchmark.py` script. Otherwise, the perf extension would not be
found a `./contrib/perf.py` and the script would crash in panic.
We now figure out the contrib directory from the location of this script. This
makes it possible to run the script from other location that the mercurial repo
root (but you still need to be in the core mercurial repository)
This revset is used in evolve. The new revset lazyness should make it all faster
but in practice it is significantly slower.
Below is a timing for this entry on my Mercurial repo.
2.9.2) ! wall 0.034598 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
3.0+@) ! wall 0.062268 comb 0.060000 user 0.060000 sys 0.000000 (best of 100)
The ~20 have been taken arbitrary.
The check-code tool now expects the "desc" keyword to be followed by the
"websub" filter, with the following exceptions:
a) It has no filters at all, e.g. a changeset description in the raw style
templates or the repository description in the summary page.
b) It is followed by the "firstline" filter, e.g. the first line of the
changeset description is displayed as a summary or title.
This matters as `author(mpm)` have a lot of matches evenly split in the repo,
while `author(lmoscoviz)` have less match (and later). This changes the execution
path of the "or" operator a lot.
Calling a function is super expensive in python. We inline the trivial range
comparison to get back to more sensible performance on common revset operation.
Benchmark result below:
Revision mapping:
0) bced32a3fd6c 2.9.2 release
1) 2ab64f462d81 current @
2) This revision
revset #0: public()
0) wall 0.010890 comb 0.010000 user 0.010000 sys 0.000000 (best of 201)
1) wall 0.012109 comb 0.010000 user 0.010000 sys 0.000000 (best of 199)
2) wall 0.012211 comb 0.020000 user 0.020000 sys 0.000000 (best of 197)
revset #1: :10000 and public()
0) wall 0.007141 comb 0.010000 user 0.010000 sys 0.000000 (best of 361)
1) wall 0.014139 comb 0.010000 user 0.010000 sys 0.000000 (best of 186)
2) wall 0.008334 comb 0.010000 user 0.010000 sys 0.000000 (best of 308)
revset #2: draft()
0) wall 0.009610 comb 0.010000 user 0.010000 sys 0.000000 (best of 279)
1) wall 0.010942 comb 0.010000 user 0.010000 sys 0.000000 (best of 243)
2) wall 0.011036 comb 0.010000 user 0.010000 sys 0.000000 (best of 239)
revset #3: :10000 and draft()
0) wall 0.006852 comb 0.010000 user 0.010000 sys 0.000000 (best of 383)
1) wall 0.014641 comb 0.010000 user 0.010000 sys 0.000000 (best of 183)
2) wall 0.008314 comb 0.010000 user 0.010000 sys 0.000000 (best of 299)
We can see this changeset gains back the regression for `and` operation on
spanset. We are still a bit slowerfor the `public()` and `draft()`. Predicates
not touched by this changeset.
There can be good reasons to disable premerge no matter which merge tool is
used. Most tools will do just fine without premerge and handle the simple
merges more or less automatic and silent. We _could_ thus disable premerge for
most tools. But without premerge, the merge tool will be launched for each file
- that makes it a slow and expensive process to perform big simple merges. It
is thus better to consistently stick to the default premerge=True.
The mergetools.hgrc configuration for most tools implicitly use the default
premerge=True but Araxis and the Linux entry for Beyond Compare had
premerge=False. These lines has been removed.
These settings were introduced by de7cda55270e without further explanation of
why they should be good.
(We have seen some crashes on Windows with Araxis where a merge failed after a
lot of Araxis flashing. I haven't been able to reproduce it and do not know
exactly what happened. Enabling premerge avoids the problems.)
Before this patch, "contrib/check-code.py" can't detect "% inside _()"
correctly, when there are leading whitespaces before the format
string, like below:
_(
"format string %s" % v)
This patch adds regexp pattern "[ \t\n]*" before the pattern matching
against the format string.
"[\s\n]" can't be used in this purpose, because "\s" is automatically
replaced with "[ \t]" by "_preparepats()" and "\s" in "[]" causes
nested "[]" unexpectedly.
Previously revset._descendants would iterate over the entire subset (which is
often the entire repo) and test if each rev was in the descendants list. This is
really slow on large repos (3+ seconds).
Now we iterate over the descendants and test if they're in the subset.
This affects advancing and retracting the phase boundary (3.5 seconds down to
0.8 seconds, which is even faster than it was in 2.9). Also affects commands
that move the phase boundary (commit and rebase, presumably).
The new revsetbenchmark indicates an improvement from 0.2 to 0.12 seconds. So
future revset changes should be able to notice regressions.
I removed a bad test. It was recently added and tested '1:: and reverse(all())',
which has an amibiguous output direction. Previously it printed in reverse order,
because we iterated over the subset (the reverse part). Now it prints in normal
order because we iterate over the 1:: . Since the revset itself doesn't imply an
order, I removed the test.
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that format string consists of multiple string
components concatenated implicitly or explicitly,
This patch does below for that regexp pattern to detect "% inside _()"
problems in such case.
- put "+" into separator part ("[ \t\n]") for explicit concatenation
("...." + "...." style)
- enclose "component and separator" part by "(?:....)+" for
concatenation itself ("...." "...." or "...." + "....")
Before this patch, "contrib/check-code.py" can't detect these
problems, because the regexp pattern to detect "% inside _()" doesn't
suppose the case that the format string and "%" aren't placed in the
same line.
This patch replaces "\s" in that regexp pattern with "[ \t\n]" to
detect "% inside _()" problems in such case.
"[\s\n]" can't be used in this purpose, because "\s" is automatically
replaced with "[ \t]" by "_preparepats()" and "\s" in "[]" causes
nested "[]" unexpectedly.
The script is now in python. That translation is very raw, more improvement to
comes:
The "current code" and "base" entry have been dropped. This is trivial to get
same result using a tagged revision or "." in the list of benchmarked revision.
With the new lazy revset implementation, we need to actually read all elements
to trigger all the computations. Otherwise a no-op if of course much faster than
the full work.
Newly added "hgperf" command repeats "dispatch.runcommand()" for
specified Mercurial command and measure performance of it in the same
manner of "contrib/perf.py" extension.
Users (mainly developers) can examine performance of the target
command easily, without adding new entry for it to "contrib/perf.py"
extension.
Use the env binary to figure out the correct bash to use.
Certain systems ships with an ancient version of bash, but the
user might have installed a newer one that is earlier in $PATH.
For example the current version of Mac OS X ships version 3.2.51
of bash, which does not understand new fancy builtins such as
readarray. A user might install a newer version of bash, use that
as their shell and add that path before bin.
If a developer doesn't have the perf extension enabled the revset
benchmarking will fail. This patch explicitly enables the
perf extension when running the tests.
All revsets added to benchmark so far are aimed to show an improvement of
performance from laziness. We had more wider version to track impact of laziness
on them.
Port a patch from gitk. Original description:
On windows, mouse input follows the keyboard focus, so to allow selecting
text from the patch canvas we must not shift focus back to the top level.
This change has no negative impact on X, so we don't explicitly test
for Win32 on this change. This provides similar selection capability
as already available using X-Windows.
Signed-off-by: Mark Levedahl <mdl123@verizon.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Port a patch from gitk. Original description:
Cygwin's Tcl is configured to honor any occurence of ctrl-z as an
end-of-file marker, while some commits in the git repository and possibly
elsewhere include that character in the commit comment. This causes gitk
ignore commit history following such a comment and incorrect graphs. This
change affects only Windows as Tcl on other platforms already has
eofchar == {}. This fixes problems noted by me and by Ray Lehtiniemi, and
the fix was suggested by Shawn Pierce.
Signed-off-by: Mark Levedahl <mdl123@verizon.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The script now selection revision to run benchmark against using a revset query
instead of a revision range.
It is expected that people benchmarking revset have some knowledge of revset.
This script takes two arguments (starting revision, ending revision) and tests
for each revision in between the entire list of revsets in the script using
perfrevset.
Without this patch, 2to3's rewrites to httpclient cause it to fail to
import. With this patch, it's probably hopelessly broken, but at least
won't block forward progress on non-http2 functionality on Python 3.
Solaris diff -u isn't silent when two files are identical, and tests that
don't account for that will fail. Fix those tests, and introduce a check
that prevents reintroduction.
When we add two newlines after ".. note::" translators will not see this
entry. And all versions of docutils interpret this paragraph correctly
(details in 89e31d6e438f).