This script scans files for lines that look like either ui.config
usage or config variable documentation. It then ensures:
- ui.config calls for each option agree on types and defaults
- every option appears to be mentioned in documentation
It doesn't complain about devel/experimental options and allows
marking options that are not intended to be public.
Since we haven't been able to come up with a good scheme for
documenting config options at point of use, this will help close the
loop of making sure all options that should be documented are.
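As a rough illustration of the check described above, the following sketch scans lines for ui.config calls and reports conflicting types/defaults and undocumented options. The regular expression, the `check` function and the sample option names are assumptions made for this example; they are not the script's actual patterns.

```python
import re

# Hypothetical pattern: ui.config*('section', 'name'[, default]).
# Defaults containing parentheses are not handled by this sketch.
CONFIG_RE = re.compile(
    r"""ui\.config(?P<type>bool|int|list)?\(\s*
        ['"](?P<section>[^'"]+)['"]\s*,\s*
        ['"](?P<name>[^'"]+)['"]
        (?:\s*,\s*(?P<default>[^)]+?))?\s*\)""",
    re.VERBOSE,
)


def check(lines, documented):
    """Return a list of problems found in the given source lines."""
    seen = {}
    problems = []
    for line in lines:
        for m in CONFIG_RE.finditer(line):
            key = "%s.%s" % (m.group("section"), m.group("name"))
            usage = (m.group("type") or "str", m.group("default"))
            # Every call for the same option must agree on type and default.
            if key in seen and seen[key] != usage:
                problems.append(
                    "conflict on %s: %r != %r" % (key, seen[key], usage))
            seen.setdefault(key, usage)
    for key in sorted(seen):
        # devel/experimental options are exempt from the documentation check.
        if key.startswith(("devel.", "experimental.")):
            continue
        if key not in documented:
            problems.append("undocumented: %s" % key)
    return problems
```

Calling `check(source_lines, {"ui.debug"})` returns one message per conflict or undocumented option; an empty list means nothing was found.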
If Mercurial was installed into a directory other than site-packages,
test-module-imports.t failed because 'mercurial.node' was listed in stdlib_modules:
testpackage/latesymbolimport.py relative import of stdlib module
Instead, we should exclude our packages explicitly.
We can't assume that site-packages is the only directory that has Python
files but is not handled as a package. For example, there is a dist-packages
directory on Debian.
Before this patch, `import-checker.py` exits with a non-zero code even if no
error is detected. This is unusual for a Unix command.
This change may be one of the preparations for issue4677, because it
avoids the need for extra explanation about the unusual exit code of
`import-checker.py` for third party tool developers.
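The conventional behavior can be sketched as follows. This is a minimal illustration, not the checker's code: `main` and the `check` callable are hypothetical names, and `check(path)` is assumed to return a list of error strings.

```python
import sys


def main(paths, check):
    """Return 1 if any path has errors, 0 otherwise."""
    any_errors = False
    for path in paths:
        for error in check(path):
            print("%s: %s" % (path, error))
            any_errors = True
    return 1 if any_errors else 0
```

A script would end with `sys.exit(main(sys.argv[1:], check))`, so that a clean run reports success to the calling shell or tool.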
We introduce a new convention for declaring imports and enforce it via
the import checker script.
The new convention is only active when absolute imports are used, which is
currently nowhere. Keying off "from __future__ import absolute_import" to
engage the new import convention seems like the easiest solution. It is
also beneficial for Mercurial to use this mode because it means less work
and ambiguity for the importer and potentially better performance due to
fewer stat() system calls because the importer won't look for modules in
relative paths unless explicitly asked.
Once all files are converted to use absolute import, we can refactor
this code to again only have a single import convention and we can
require use of absolute import in the style checker.
The rules for the new convention are documented in the docstring of the
added function. Tests have been added to test-module-imports.t. Some
tests are sensitive to newlines and source column position, which makes
docstring testing difficult and/or impossible.
A future patch will formalize the modern import convention. In
preparation for that, introduce a new wrapper function that will invoke
the proper function.
"from . import X" will produce an ImportFrom ast node with .module =
None. This resulted in a run-time error from attempting to concatenate
None with a str.
Another problem with relative imports is that the prefix may be dynamic
based on the "level" attribute of the import. e.g. "from ." has level 1
and "from .." has level 2.
We teach the "fromlocal" function how to cope with relative imports.
Where appropriate, the consumer passes in the level so relative module
names may be resolved properly.
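The behavior of the ast module described above, and one way to resolve the level, can be sketched as follows. `resolve` and `parse` are hypothetical helpers written for this example (they are not the checker's "fromlocal" function), and the importing module is assumed to be a plain module rather than a package __init__.

```python
import ast


def parse(source):
    """Return the first statement of source as an ast node."""
    return ast.parse(source).body[0]


def resolve(importer, node):
    """Return the absolute module name an ImportFrom node refers to.

    importer is the dotted name of the module containing the import.
    """
    if node.level == 0:
        # Absolute import: the module name is already complete.
        return node.module
    parts = importer.split(".")
    # Level 1 is the package holding the importer; each additional
    # level goes up one more package.
    base = parts[:len(parts) - node.level]
    # node.module is None for "from . import X", so only append it
    # when it is set instead of concatenating unconditionally.
    if node.module:
        base.append(node.module)
    return ".".join(base)
```

For a module named "mercurial.hgweb.server", "from . import X" resolves to "mercurial.hgweb" and "from .. import X" resolves to "mercurial".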
We just rewrote all files to use modern exception syntax. Ban the old
form.
This will detect the "except type, instance" and
"except (type1, type2), instance" forms.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
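A check for the two old forms could look like the sketch below. The pattern and the `usesoldexcept` name are assumptions for illustration; the actual rule added to the style checker is not shown here.

```python
import re

# Matches "except type, instance:" and "except (type1, type2), instance:".
# It does not match "except (type1, type2):" or "except type as instance:".
OLD_EXCEPT = re.compile(
    r"^\s*except\s+(?:\([^()]*\)|[\w.]+)\s*,\s*\w+\s*:")


def usesoldexcept(line):
    """Return True if line uses the pre-2.6 exception syntax."""
    return OLD_EXCEPT.match(line) is not None
```

The tuple alternative is matched first so that a comma inside the parentheses is not mistaken for the separator before the instance name.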
The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
implementation used to be. However, computing children is expensive because it
is unbounded. Any changeset in the repository may be a child of '0', so you
have to look at all changesets in the repository to compute children(0).
Moreover, the current revset implementation of children is not lazy, leading to
bad performance when fetching the first result.
There is a more restricted algorithm to compute roots:
roots(X) = [r for r in X if not parents(r) & X]
This achieves the same result while only looking at parent/children relations in
the X set itself, making the algorithm 'O(len(X))' membership operations.
Another advantage is that it turns the check into a simple filter, preserving
all laziness properties of the underlying revsets.
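The filter can be sketched on a toy DAG as follows. This is an illustration only: `roots`, `PARENTS` and the plain Python containers stand in for the revset classes, and `revs` is assumed to support both iteration and membership tests.

```python
def roots(revs, parents):
    """Lazily yield the members of revs that have no parent in revs."""
    for r in revs:
        # Only membership in revs is tested; children are never computed.
        if not any(p in revs for p in parents(r)):
            yield r


# Toy history: 0 <- 1 <- {2, 3} <- 4 (a merge), and 5 is an unrelated root.
PARENTS = {0: [], 1: [0], 2: [1], 3: [1], 4: [2, 3], 5: []}
```

Because `roots` is a generator, asking for the first result with `next(roots(revs, PARENTS.get))` stops as soon as one root is found instead of walking the whole set.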
The speedup is significant and some laziness is restored.
-) revset without 'roots(...)' to compare to baseline
0) before this change
1) after this change
revset #0: roots((tip~100::) - (tip~100::tip))
   plain         min           last
-) 0.001082      0.000993      0.000790
0) 0.001366      0.001385      0.001339
1) 0.001257  92% 0.001028  74% 0.000821  61%
revset #1: roots((0::) - (0::tip))
   plain         min           last
-) 0.134551      0.144682      0.068453
0) 0.161822      0.171786      0.157683
1) 0.137583  85% 0.146204  85% 0.070012  44%
revset #2: roots(tip~100:)
   plain         min           first         last
-) 0.000219      0.000225      0.000231      0.000229
0) 0.000513      0.000529      0.000507      0.000539
1) 0.000463  90% 0.000269  50% 0.000267  52% 0.000463  85%
revset #3: roots(:42)
   plain         min           first         last
-) 0.000119      0.000146      0.000146      0.000146
0) 0.000231      0.000254      0.000253      0.000260
1) 0.000216  93% 0.000186  73% 0.000184  72% 0.000244  93%
revset #4: roots(not public())
   plain         min           first
-) 0.000478      0.000502      0.000504
0) 0.000611      0.000639      0.000634
1) 0.000604      0.000560  87% 0.000558
revset #5: roots((0:tip)::)
   plain         min           max           first         last
-) 0.057795      0.004905      0.058260      0.004908      0.038812
0) 0.132845      0.118931      0.130306      0.114280      0.127742
1) 0.111659  84% 0.005023   4% 0.111658  85% 0.005022   4% 0.092490  72%
revset #6: roots(0::tip)
   plain         min           max           first         last
-) 0.032971      0.033947      0.033460      0.032350      0.033125
0) 0.083671      0.081953      0.084074      0.080364      0.086069
1) 0.074720  89% 0.035547  43% 0.077025  91% 0.033729  41% 0.083197
revset #7: 42:68 and roots(42:tip)
   plain         min           max           first         last
-) 0.006827      0.000251      0.006830      0.000254      0.006771
0) 0.000337      0.000353      0.000366      0.000350      0.000366
1) 0.000318  94% 0.000297  84% 0.000353      0.000293  83% 0.000351
revset #8: roots(0:tip)
   plain         min           max           first         last
-) 0.002119      0.000145      0.000147      0.000147      0.000147
0) 0.047441      0.040660      0.045662      0.040284      0.043435
1) 0.038057  80% 0.000187   0% 0.034919  76% 0.000186   0% 0.035097  80%
revset #0: roots(:42 + tip~42:)
   plain         min           max           first         last          sort
-) 0.000321      0.000317      0.000319      0.000308      0.000369      0.000343
0) 0.000772      0.000751      0.000811      0.000750      0.000802      0.000783
1) 0.000632  81% 0.000369  49% 0.000617  76% 0.000358  47% 0.000601  74% 0.000642  81%
If the computation of a set for each phase (done in C) is available,
we use it directly instead of applying a simple filter. This gives a
massive speed-up in the vast majority of cases.
On my mercurial repo with about 15000 out of 40000 draft changesets:
revset: draft()
   plain         min           first         last
0) 0.011201      0.019950      0.009844      0.000074
1) 0.000284   2% 0.000312   1% 0.000314   3% 0.000315 x4.3
Bad performance for "last" comes from the handling of the 15000-element set
(memory allocation, filtering hidden changesets (99% of it), etc.) compared to
applying the filter only on a handful of revisions (the first draft changesets
being close to tip).
This is not seen as an issue since:
* Timing is still pretty good and in line with all the others,
* Current users of vanilla Mercurial will not have 1/3 of their repo draft.
This bad effect disappears when the phase's set is smaller (about 200 secret
changesets):
revset: secret()
   plain         min           first         last
0) 0.011181      0.022228      0.010851      0.000452
1) 0.000058   0% 0.000084   0% 0.000087   0% 0.000087  19%
Using 'repo[X]' is much slower because it creates a 'changectx' object and goes
through multiple layers of code to do so. It is also error prone if there are
tags, bookmarks, branches or other names that could map to a node hash and take
precedence (users are wicked).
This provides a significant performance boost on repositories with a lot of
heads. Benchmark results for a repo with 1181 heads:
revset: head()
   plain         min           last          reverse
0) 0.014853      0.014371      0.014350      0.015161
1) 0.001402   9% 0.000975   6% 0.000874   6% 0.001415   9%
revset: head() - public()
   plain         min           last          reverse
0) 0.015121      0.014420      0.014560      0.015028
1) 0.001674  11% 0.001109   7% 0.000980   6% 0.001693  11%
revset: draft() and head()
   plain         min           last          reverse
0) 0.015976      0.014490      0.014214      0.015892
1) 0.002335  14% 0.001018   7% 0.000887   6% 0.002340  14%
The speedup is visible even when other, more costly revsets are in use:
revset: head() and author("mpm")
   plain         min           last          reverse
0) 0.105419      0.090046      0.017169      0.108180
1) 0.090721  86% 0.077602  86% 0.003556  20% 0.093324  86%
This file should gather all revsets ever thought interesting by
anyone. That way one can check the impact of a change when touching
something revset-ish. See inline comments for details.
This file has been refilled with all the entries I could automatically
find from changeset descriptions. I assume we missed some that did not use
'revsetbenchmarks.py' output.
We rename the file and document its purpose. We'll be introducing another file
gathering revsets useful for benchmarking the predicates themselves in a coming
changeset.
We remove revsets making use of min and max as this is covered by the variants.
We could use a variant for roots too, but it is not in the defaults, so we keep
it here.
We need more advanced variants in some cases. For example, "The last
rev of the sorted version".
We introduce a syntax for this: `reverse+last` means `last(reverse(REVSET))`.
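The expansion of such a variant can be sketched as below. `applyvariant` is a hypothetical name and "plain" is assumed to mean the unmodified revset; the benchmark script's own implementation may differ.

```python
def applyvariant(revset, variant):
    """Wrap revset according to a variant spec such as 'reverse+last'."""
    if variant == "plain":
        return revset
    # Functions are applied left to right, so the last name in the
    # spec ends up as the outermost call.
    for name in variant.split("+"):
        revset = "%s(%s)" % (name, revset)
    return revset
```

With this reading, `applyvariant("all()", "reverse+last")` produces `last(reverse(all()))`.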
We now use an 8 char display for timing (down from 10) and add some logic to
drop precision if the number grows too large (as we do not care about the least
significant digits in this case). This allows packing more variants on a single
screen.
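One possible way to fit a timing into a fixed width while dropping precision is sketched below. `formattiming` and its defaults are assumptions for illustration, not the script's actual logic.

```python
def formattiming(value, width=8):
    """Format seconds in at most width characters when possible.

    Starts with six decimals and removes one at a time until the
    text fits; very large values may still exceed width.
    """
    text = "%f" % value
    for decimals in range(6, -1, -1):
        text = "%.*f" % (decimals, value)
        if len(text) <= width:
            break
    return text
```

Small timings keep their six decimals, while a value such as 1234.5 is shortened to three decimals so that it still occupies eight characters.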
The current benchmarks were only testing the whole iteration. This is suboptimal
because some changes are meaningful for things like first result, minimum or
sorting.
We introduce a "variants" feature that lets you systematically add some variants
to all revsets tested.
A typical variants value would be 'plain,min,last,sort'. When testing 'all()' it
will also provide testing for:
- all()
- min(all())
- last(all())
- sort(all())
and output:
   plain         min           last          sort
0) 0.034568      0.037857      0.000074      0.034238
1) 0.011358  32% 0.020181  53% 0.000080 108% 0.011405  33%
Using revsets (which hit the API) instead of the internal API adds some overhead,
but the overhead should be the same everywhere, so it still allows comparison.
This is simpler to implement and allows comparison with older versions
that do not have the same API.
If the time difference is more than 5% from the previous run, we'll display
relative information. This makes it much simpler to spot performance changes in
a sea of benchmarks.
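The rule can be sketched as follows. `relative` is a hypothetical helper; the 5% threshold comes from the description above, and the truncation to a whole percentage is an assumption based on the benchmark output quoted in these changesets.

```python
def relative(new, previous, threshold=0.05):
    """Return e.g. ' 92%' if new differs from previous by more than
    threshold, or an empty string when the change is too small."""
    if not previous:
        return ""
    ratio = new / previous
    if abs(ratio - 1) <= threshold:
        return ""
    # Truncate rather than round (assumption, see above).
    return "%3d%%" % int(ratio * 100)
```

Timings that moved by 5% or less produce no annotation, which keeps the output quiet for unchanged revsets.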
We mostly only care about total time. Dropping this output gives us some room to
display more useful information (like percentage difference) in future
changesets.
The file doc was saying something, the code was doing something else, the
argument validation was doing a third thing.
Doc and behavior now comply with the argument defined in the code.
We cannot just ask perfrevset to provide debug output because we usually want
to compare output from old versions of Mercurial that do not support it. So, we
are using a regular expression.
(/we now have \d problems/).
This makes the root install folder (on Windows) nice and tidy. The
only files left in the root folder are:
hg.exe
python27.dll
COPYING.rtf
ReadMe.html
the last of which was probably out-of-date 7 years ago
\s is equivalent to the character class [ \t\n\r\f\v]. Using \s+ in
a regular expression against input with multiple lines may match across
multiple lines.
For the regexp in question, "\+\s+" would match "+\n " and similar
sequences, leading to false positives for functions that were included
in diff context, after a modified hunk.
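The effect is easy to reproduce. In the sketch below the diff fragment and both patterns are made up for illustration (the checker's full expression is not shown above); the point is only that \s+ crosses the newline while [ \t]+ does not.

```python
import re

# Hypothetical diff fragment: an added blank line ("+") followed by an
# unchanged context line that happens to contain a function definition.
diff = "+\n def unrelated():\n"

loose = re.compile(r"\+\s+def")      # \s+ also matches "\n"
strict = re.compile(r"\+[ \t]+def")  # limited to spaces and tabs

crossed = loose.search(diff) is not None      # matches "+\n def"
same_line = strict.search(diff) is not None   # cannot cross the newline
```

The stricter pattern still matches a genuinely added line such as "+    def method(self):".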
In their infinite wisdom, the Python maintainers stripped bytes of its
% and format() methods for 3.x. They've now added % back to 3.5, but
format() is still missing. Since we don't have any particular need for
it, we should keep avoiding it.
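A short check of the behavior described above, assuming Python 3.5 or later. The variable names are illustrative only.

```python
name = b"default"

# %-formatting on bytes works on Python 3.5+ (and on Python 2).
line = b"branch: %s" % name

# bytes has no format() method on Python 3.
has_format = hasattr(bytes, "format")
```

Code that sticks to % therefore behaves the same on Python 2 and on Python 3.5+, whereas a call to format() on a bytes object would raise AttributeError on Python 3.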