Previously, the code was similar to what revlog._chunks() was doing,
which took a raw data segment and delta chain, obtained buffers for
the raw revlog chunks within, and decompressed them.
This commit splits the "get raw chunks" action from "decompress." The
goal of this change is to more accurately measurely decompression
performance.
On a ~50k deltachain for a manifest in mozilla-central:
! full
! wall 0.430548 comb 0.440000 user 0.410000 sys 0.030000 (best of 24)
! deltachain
! wall 0.016053 comb 0.010000 user 0.010000 sys 0.000000 (best of 181)
! read
! wall 0.008078 comb 0.010000 user 0.000000 sys 0.010000 (best of 362)
! rawchunks
! wall 0.033785 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
! decompress
! wall 0.327126 comb 0.320000 user 0.320000 sys 0.000000 (best of 31)
! patch
! wall 0.032391 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
! hash
! wall 0.012587 comb 0.010000 user 0.010000 sys 0.000000 (best of 233)
Now, all URL in Mercurial source tree should refer mercurial-scm.org
domain instead of selenic.com.
*.po files are ignored in this patch, because they might contain
msgid/msgstr coming from old source files.
This ignorance seems safe enough, because such msgstr should be
ignored at runtime, because:
- msgid corresponded to it should be invalid, or
- msgstr itself should be marked as fuzzy at synchronized to recent hg.pot
If any additional examination for *.po files is needed in the future,
let i18n/check-translation.py achieve such examination.
BTW, some binary files (e.g. *.png) are meaningless for checking
reference to old domain in this patch, but aren't ignored like as *.po
files, because excluding multiple suffixes is difficult for regexp
matching.
Before this patch, check-code.py applies filtering on the file
content, to which filtering of previous check is already applied.
This might hide issues, which should be detected by a subsequent check
in "checks" list.
Fortunately, this problem hasn't appeared, because there is no
overlapping of filename matching (examined in the order below).
1. *.py or *.cgi
2. test-* (not *.t suffix)
3. *.c or *.h
4. *.t
5. *.txt
6. *.tmpl
For example, adding a test, which wants to examine raw comment text in
*.py files, at the end of current "checks" list doesn't work as
expected, because a filter for *.py files normalizes comment text in
them.
Putting such test at the beginning of "checks" list also resolves this
problem, but such dependence on the order decreases maintainability of
check-code.py itself.
This patch discards filtering result of previous check at the
beginning of each checks, for independence of each checks.
The old @property on manifestlog was broken. It meant that we would always
recreate the manifestlog instance, which meant the cache was never hit. Since
we'll eventually remove repo.manifest and make manifestlog the only property,
let's go ahead and make manifestlog the @storecache property, have manifestlog
own the manifest instance, and have repo.manifest refer to it via manifestlog.
This means all accesses go through repo.manifestlog, which is now invalidated
correctly.
On systems with large repositories and slow disks,
the calls to 'hg status' make autocomplete annoyingly slow.
This fix makes it possible to avoid the slowdown.
Before this patch, "hg perftags" command doesn't measure performance
of "repo.tags()" correctly, because it doesn't clear tags cache
correctly.
a043ed82a5cd replaced repo._tags with repo._tagscache, but didn't
change the code path to clear tags cache in perftags() at that time.
BTW, full history of "tags cache" is:
- b8d757d45f24 (or 0.6) introduced repo.tagscache as the first "tags cache"
- 4cbf51c74e8c (or 1.4) replaced repo.tagscache with repo._tags
- a043ed82a5cd (or 2.0) replaced repo._tags with repo._tagscache
- 04c204f1ed65 (or 2.5) made repo._tagscache filteredpropertycache
To make perftags clear tags cache correctly, and to increase
"historical portability" of perftags, this patch examines existence of
attributes in repo object, and guess appropriate procedure to clear
tags cache.
To avoid examining existence of attributes at each repetition, this
patch makes repocleartagscachefunc() return the function, which
actually clears tags cache.
mozilla-central repo (85 tags on 308365 revs) with each Mercurial
version between before and after this patch.
==== ========= =========
ver before after
==== ========= =========
1.9 0.476062 0.466464
------- *1 -------
2.0 0.346309 0.458327
2.1 0.343106 0.454489
------- *2 -------
2.2 0.069790 0.071263
2.3 0.067829 0.069340
2.4 0.068075 0.069573
------- *3 -------
2.5 0.021896 0.022406
2.6 0.021900 0.022374
2.7 0.021883 0.022379
2.8 0.021949 0.022327
2.9 0.021877 0.022330
3.0 0.021860 0.022314
3.1 0.021869 0.022669
3.2 0.021831 0.022668
3.3 0.021809 0.022691
3.4 0.021861 0.022916
3.5 0.019335 0.020749
3.6 0.019319 0.020866
3.7 0.018781 0.020251
------- *4 -------
3.8 0.068262 0.072558
3.9 0.069682 0.073773
==== ========= =========
(*1) repo._tags was replaced with repo._tagscache at this point
"repo._tags = None" in perftags "before" this patch doesn't clear
tags cache for Mercurial 2.0 or later. This causes significant
gap of "before" between 1.9 and 2.0 .
(*2) I'm not sure about significant gap at this point, but release
note of 2.2 described "a number of significant performance
improvements for large repositories"
(*3) filtered changelog was cached in repoview as repoview.changelog
at this point (by 131b01a4654d)
This avoids calculation of filtered changelog at each repetition
of t().
(*4) calculation of filtered changelog was included into wall time at
this point (by adf01efe43a5), again
See below for detail about this significant gap:
https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-April/083410.html
Before this patch, using ui.configint() prevents perf.py from
measuring performance with Mercurial earlier than 1.9 (or
12e7e9fbf243), because ui.configint() isn't available in such
Mercurial, even though there are some code paths for Mercurial earlier
than 1.9 in perf.py.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
This patch replaces ui.configint() invocations by newly introduced
getint().
This patch also adds check-perf-code.py an extra check entry to detect
direct usage of ui.configint() in perf.py.
BTW, this patch doesn't choose adding configint() method at runtime by
replacing ui.__class__ like below, even though this is the recommended
way to modern Mercurial extensions.
def uisetup(ui):
if not util.safehasattr(ui, 'configint'):
class uiwrap(ui.__class__):
def configint(self, section, name, ....):
....
ui.__class__ = uiwrap
Because changes to ui.__class__ by uisetup() of loaded extension have
been propagated since 1.6.1 (or 07a6e7bd0cc1), the recommended way
above doesn't work as expected with Mercurial earlier than it.
Before this patch, referring ui.ferr prevents perf.py from measuring
performance with Mercurial earlier than 1.9 (or bac01d164cc1), because
ui.ferr isn't available in such Mercurial, even though there are some
code paths for Mercurial earlier than 1.9 in perf.py.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
Before this patch, using ui.formatter() prevents perf.py from
measuring performance with Mercurial earlier than 2.2 (or
045c8375c770), because ui.formatter() isn't available in such
Mercurial, even though there are some code paths for Mercurial earlier
than 2.2 in perf.py.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
This patch defines formatter class locally, and use it instead of the
value returned by ui.formatter(), if perf.py is used with Mercurial
earlier than 2.2.
In this case, we don't need to think about -T/--template option for
formatter, because previous patch made -T/--template disabled for
perf.py with Mercurial earlier than 3.2 (or 44a82ed65df7).
Before this patch, using svfs prevents perf.py from measuring
performance of Mercurial earlier than 2.3 (or 12df7401e8cd), because
svfs isn't available in such Mercurial, even though there are some
code paths for Mercurial earlier than 2.3 in perf.py.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
To get appropriate vfs-like object to access files under .hg/store,
this patch adds getsvfs() (and also getvfs(), for future use).
To avoid examining existence of attribute at each repetition while
measuring performance, getsvfs() is invoked outside the function to be
called repeatedly.
This patch also adds check-perf-code.py an extra check entry to detect
direct usage of repo.(vfs|svfs|opener|sopener) in perf.py.
Mercurial 2.5 (or 0eb7dcc721cb) introduced "perfbranchmap" command,
and tried to avoid actual writing branch cache out by replacing
write() of branchcache class in branchmap.py with no-op function
(probably, for elimination of noisy and heavy file I/O factor).
But its implementation isn't correct, because 0eb7dcc721cb replaced
not branchmap.branchcache.write() but branchmap.write(). The latter
doesn't exist, even at that change.
To avoid actual writing branch cache out correctly, this patch
replaces branchmap.branchcache.write() with no-op function.
To detect mistake of replacement or change of API in the future
quickly, this patch uses safeattrsetter() instead of direct attribute
assignment. For similarity between replacements, this patch also
changes replacement of branchmap.read().
In this patch, replacement of read()/write() can run safely outside
"try" block, because two safeattrsetter() invocations ensure that
replacement doesn't cause exception.
FYI, the table below compares "base" filter wall time of perfbranchmap
on recent mozilla-central repo with each Mercurial version between
before and after this patch.
==== ========= =========
ver before after
==== ========= =========
2.5 18.492334 18.232455
2.6 18.733858 18.156702
2.7 18.245598 18.349210
2.8 18.289070 18.528422
2.9 17.572742 16.989655
3.0 17.406953 17.615012
3.1 17.228419 17.689805
3.2 17.862961 17.718367
3.3 2.632110 2.707960
3.4 3.285683 3.272060
3.5 3.370141 3.352176
3.6 3.366939 3.242455
3.7 3.300778 3.367328
3.8 3.300132 3.267298
3.9 3.418996 3.370265
==== ========= =========
IMHO, there is no serious overlooking performance regression.
Before this patch, using branchmap.subsettable prevents perfbranchmap
from measuring performance of Mercurial earlier than 2.9 (or
aad678a92970), because aad678a92970 moved subsettable from repoview.py
to branchmap.py, even though there are some code paths for Mercurial
earlier than 2.9 in perf.py.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
To get subsettable from appropriate module, this patch examines
existence of subsettable in branchmap and repoview.
This patch also adds check-perf-code.py an extra check entry to detect
direct usage of subsettable attribute in perf.py.
Referring not-existing attribute immediately causes failure, but
assigning a value to such attribute doesn't.
For example, perf.py has code paths below, which assign a value to
not-existing attribute. This causes incorrect performance measurement,
but these code paths are executed successfully.
- "repo._tags = None" in perftags()
recent Mercurial has tags cache information in repo._tagscache
- "branchmap.write = lambda repo: None" in perfbranchmap()
branchmap cache is written out by branchcache.write() in branchmap.py
"util.safehasattr() before assignment" can avoid this issue, but might
increase mistake at "copy & paste" attribute name or so.
To centralize (1) examining existence of, (2) assigning a value to,
and (3) restoring an old value to the attribute, this patch introduces
safeattrsetter(). This is used to replace direct attribute assignment
in subsequent patches.
Encapsulation of restoring is needed to completely remove direct
attribute assignment from perf.py, even though restoring isn't needed
so often.
Otherwise no code transformation would be applied to the modules which are
imported only by imp.load_module().
This change means modules are imported from PYTHONPATH, not from the paths
given by command arguments. This isn't always correct, but seems acceptable.
Here are the relevant symbol descriptions from [0]
optspec:...
The colon indicates handling for one or more arguments to the option; if it
is not present, the option is assumed to take no arguments.
-optname+
The first argument may appear immediately after optname in the same word,
or may appear as a separate word after the option. For example, ‘-foo+:...’
specifies that the completed option and argument will look like either
‘-fooarg’ or ‘-foo arg’.
-optname=
The argument may appear as the next word, or in same word as the option
name provided that it is separated from it by an equals sign, for example
‘-foo=arg’ or ‘-foo arg’.
There are 3 types of changes in this patch:
- addition of '=': means that '--repository ~/foo' can now also be spelled as
'--reporitory=~/foo'
- addition of '+': means '-r 0' can now also be spelled as '-r0'
- removal of '+', '=' and ':': means that '-u|--untrusted' doesn't take any
arguments, so '-uq' is definitely '-u -q' and not '--untrusted=q'
Occasionally, ':' had to be added together with '=' and '+'.
This patch is mostly just making zsh_completion file look more like zsh's own
hg completion file so that we can more easily take improvements from them or
send patches to them. Some context for this patch from zsh project: [2] [3].
[0]: http://zsh.sourceforge.net/Doc/Release/Completion-System.html
[1]: https://sourceforge.net/p/zsh/code/ci/master/tree/Completion/Unix/Command/_hg
[2]: 92584634d3/
[3]: http://www.zsh.org/mla/workers/2015/msg02551.html
This command can be used for testing the performance of producing the
changelog portion of a changegroup.
We could use additional perf* commands for testing other parts of
changegroup. Those can be written another time, when they are needed.
(And those may want to refactor the changegroup generation API so code
can be reused.) Speaking of code reuse, yes, this command does reinvent
a small wheel. I didn't want to scope bloat to change the changegroup
API because that will invite bikeshedding.
The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.
This patch adds the beginnings of "internals" documentation on the
wire protocol.
The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanims, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.
As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.
This patch starts with the scaffolding for a new internals page.
I've caught multiple extensions in the wild lying about being
'internal', so it's time to move the goalposts on people. Goalpost
moving will continue until third party extensions stop trying to
defeat the system.
These signals are meant to send to a process group, instead of a single
process: SIGINT is usually emitted by the terminal and sent to the process
group. SIGHUP usually happens to a process group if termination of a process
causes that process group to become orphaned.
Before this patch, chg will only forward these signals to the single server
process. This patch changes it to the server process group.
This will allow us to properly kill processes started by the forked server
process, like a ssh process. The behavior difference can be observed by
setting SSH_ASKPASS to a dummy script doing "sleep 100" and then run
"chg push ssh://dest-need-password-auth". Before this patch, the first Ctrl+C
will kill the hg process while ssh-askpass and ssh will remain alive. This
patch will make sure they are killed properly.
We recently discovered a case in production that chg uses 100% CPU and is
trying to read data forever:
recvfrom(4, "", 1814012019, 0, NULL, NULL) = 0
Using gdb, apparently readchannel() got wrong data. It was reading in an
infinite loop because rsize == 0 does not exit the loop, while the server
process had ended.
(gdb) bt
#0 ... in recv () at /lib64/libc.so.6
#1 ... in readchannel (...) at /usr/include/bits/socket2.h:45
#2 ... in readchannel (hgc=...) at hgclient.c:129
#3 ... in handleresponse (hgc=...) at hgclient.c:255
#4 ... in hgc_runcommand (hgc=..., args=<optimized>, argsize=<optimized>)
#5 ... in main (argc=...486922636, argv=..., envp=...) at chg.c:661
(gdb) frame 2
(gdb) p *hgc
$1 = {sockfd = 4, pid = 381152, ctx = {ch = 108 'l',
data = 0x7fb05164f010 "st):\nTraceback (most recent call last):\n"
"Traceback (most recent call last):\ne", maxdatasize = 1814065152,"
" datasize = 1814064225}, capflags = 16131}
This patch addresses the infinite loop issue by detecting continuously empty
responses and abort in that case.
Note that datasize can be translated to ['l', ' ', 'l', 'a']. Concatenate
datasize and data, it forms part of "Traceback (most recent call last):".
This may indicate a server-side channeledoutput issue. If it is a race
condition, we may want to use flock to protect the channels.
This patch makes an extra check pattern to be prepared by
"_preparepats()" as similarly as existing patterns, if it is added to
"checks" array before invocation of "main()" in check-code.py.
This is a part of preparation for adding check-code.py extra checks by
another python script in subsequent patch.
This is also useful for SkeletonExtensionPlan.
https://www.mercurial-scm.org/wiki/SkeletonExtensionPlan
demandimport of early Mercurial loads an imported module immediately,
if a module is imported absolutely by "from a import b" style. Recent
perf.py satisfies this condition, because it does:
- have "from __future__ import absolute_import" line
- use "from a import b" style for modules in "mercurial" package
Before this patch, importing modules below prevents perf.py from being
loaded by earlier Mercurial, because these aren't available in such
Mercurial, even though there are some code paths for Mercurial earlier
than 1.9.
- branchmap 2.5 (or e3354ba12eef)
- repoview 2.5 (or 7d207cb7e38a)
- obsolete 2.3 (or b50a017cd9ac)
- scmutil 1.9 (or 065064cdde5f)
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with Mercurial earlier than 1.8 (or
1299f0c14572).
After this patch, "mercurial.error" is the only blocker in "from
mercurial import" statement for loading perf.py with Mercurial earlier
than 1.2. This patch ignores it, because just importing it separately
isn't enough.
The BaseHTTPServer, SimpleHTTPServer and CGIHTTPServer has been merged into
http.server in python 3. All of them has been merged as util.httpserver to use
in both python 2 and 3. This patch adds a regex to check-code to warn against
the use of BaseHTTPServer. Moreover this patch also includes updates to lower
part of test-check-py3-compat.t which used to remain unchanged.
The most painful part of ensuring Python code runs on both Python 2
and 3 is string encoding. Making this difficult is that string
literals in Python 2 are bytes and string literals in Python 3 are
unicode. So, to ensure consistent types are used, you have to
use "from __future__ import unicode_literals" and/or prefix literals
with their type (e.g. b'foo' or u'foo').
Nearly every string in Mercurial is bytes. So, to use the same source
code on both Python 2 and 3 would require prefixing nearly every
string literal with "b" to make it a byte literal. This is ugly and
not something mpm is willing to do at this point in time.
This patch implements a custom module loader on Python 3 that performs
source transformation to convert string literals (unicode in Python 3)
to byte literals. In effect, it changes Python 3's string literals to
behave like Python 2's.
In addition, the module loader recognizes well-known built-in
functions (getattr, setattr, hasattr) and methods (encode and decode)
that barf when bytes are used and prevents these from being rewritten.
This prevents excessive source changes to accommodate this change
(we would have to rewrite every occurrence of these functions passing
string literals otherwise).
The module loader is only used on Python packages belonging to
Mercurial.
The loader works by tokenizing the loaded source and replacing
"string" tokens if necessary. The modified token stream is
untokenized back to source and loaded like normal. This does add some
overhead. However, this all occurs before caching: .pyc files will
cache the transformed version. This means the transformation penalty
is only paid on first load.
As the extensive inline comments explain, the presence of a custom
source transformer invalidates assumptions made by Python's built-in
bytecode caching mechanism. So, we have to wrap bytecode loading and
writing and add an additional header to bytecode files to facilitate
additional cache validation when the source transformations
change in the future.
There are still a few things this code doesn't handle well, namely
support for zip files as module sources and for extensions. Since
Mercurial doesn't officially support Python 3 yet, I'm inclined to
leave these as to-do items: getting a basic module loading mechanism
in place to unblock further Python 3 porting effort is more important
than comprehensive module importing support.
check-py3-compat.py has been updated to ignore frames. This is
necessary because CPython has built-in code to strip frames from the
built-in importer. When our custom code is present, this doesn't work
and the frames get all messed up. The new code is not perfect. It
works for now. But once you start chasing import failures you find
some edge cases where the files aren't being printed properly. This
only burdens people doing future Python 3 porting work so I'm inclined
to punt on the issue: the most important thing is for the source
transforming module loader to land.
There was a bit of churn in test-check-py3-compat.t because we now
trip up on str/unicode/bytes failures as a result of source
transformation. This is unfortunate but what are you going to do.
It's worth noting that other approaches were investigated.
We considered using a custom file encoding whose decode() would
apply source transformations. This was rejected because it would
require each source file to declare its custom Mercurial encoding.
Furthermore, when changing the source transformation we'd need to
version bump the encoding name otherwise the module caching layer
wouldn't know the .pyc file was invalidated. This would mean mass
updating every file when the source transformation changes. Yuck.
We also considered transforming at the AST layer. However, Python's
ast module is quite gnarly and doing AST transforms is quite
complicated, even for trivial rewrites. There are whole Python packages
that exist to make AST transformations usable. AST transforms would
still require import machinery, so the choice was basically to
perform source-level, token-level, or ast-level transforms.
Token-level rewriting delivers the metadata we need to rewrite
intelligently while being relatively easy to understand. So it won.
General consensus seems to be that this approach is the best available
to avoid bulk rewriting of '' to b''. However, we aren't confident
that this approach will never be a future maintenance burden. This
approach does unblock serious Python 3 porting efforts. So we can
re-evaulate once more work is done to support Python 3.
Prior to this, check-code wouldn't notice things like (glob)
annotations or similar in a test if they were after a # anywhere in
the line. This resolves a defect in a future change, and also exposed
a couple of small spots that needed some attention.
Before this patch, using cmdutil.command() for "@command" annotation
prevents perf.py from being loaded by Mercurial earlier than 1.9 (or
d4096ee63f8e), because cmdutil.command() isn't available in such
Mercurial, even though there are some code paths for Mercurial earlier
than 1.9.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
In addition to it, "norepo" option of command annotation has been
available since 3.1 (or 9cbb59f03d57), and this is another blocker for
loading perf.py with earlier Mercurial.
============ ============ ======
command of
hg version cmdutil norepo
============ ============ ======
3.1 or later o o
1.9 or later o x
earlier x x
============ ============ ======
This patch defines "command()" for annotation locally as below:
- define wrapper of existing cmdutil.command(), if cmdutil.command()
doesn't support "norepo"
(for Mercurial earlier than 3.1)
- define full command() locally with minimum function, if
cmdutil.command() isn't available at runtime
(for Mercurial earlier than 1.9)
This patch also defines parsealiases() locally without examining
whether it is available or not, because it is small enough to define
locally.
Before this patch, referring commands.formatteropts prevents perf.py
from being loaded by Mercurial earlier than 3.2 (or 44a82ed65df7),
because it isn't available in such Mercurial, even though formatting
itself has been available since 2.2 (or 045c8375c770).
In addition to it, there are some code paths for Mercurial earlier
than 3.2. For example, setting "_prereadsize" attribute in perfindex()
and perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
This patch uses empty option list as formatteropts, if it isn't
available in commands module at runtime.
Disabling -T/--template option for earlier Mercurial should be
reasonable, because:
- since 0c03b2a1206a, -T/--template for formatter has been available
- since 44a82ed65df7, commands.formatteropts has been available
- the latter revision is direct child of the former
Before this patch, referring commands.debugrevlogopts prevents perf.py
from being loaded by Mercurial earlier than 3.7 (or 1a89336e03aa),
because it isn't available in such Mercurial, even though
cmdutil.openrevlog(), a user of these options, has been available
since 1.9 (or f32fd7cab084).
In addition to it, there are some code paths for Mercurial earlier
than 3.7. For example, setting "_prereadsize" attribute in perfindex()
and perfnodelookup() is effective only with hg earlier than 1.8 (or
1299f0c14572).
But just "using locally defined revlog option list" might cause
unexpected behavior at runtime. If --dir option is specified to
cmdutil.openrevlog() of Mercurial earlier than 3.5 (or e3ab0b30c05e),
it is silently ignored without any warning or so.
============ ============ ===== ===============
debugrevlogopts
hg version openrevlog() --dir of commands
============ ============ ===== ===============
3.7 or later o o o
3.5 or later o o x
1.9 or later o x x
earlier x x x
============ ============ ===== ===============
Therefore, this patch does:
- use locally defined option list, if commands.debugrevlogopts isn't
available (for Mercurial earlier than 3.7)
- wrap cmdutil.openrevlog(), if it is ambiguous whether
cmdutil.openrevlog() can recognize --dir option correctly
(for Mercurial earlier than 3.5)
This wrapper function aborts execution, if:
- --dir option is specified, and
- localrepository doesn't have "dirlog" attribute, which indicates
that localrepository has a function for '--dir'
BTW, extensions.wrapfunction() has been available since 1.1 (or
56ba0b824b91), and this seems old enough for "historical
portability" of perf.py, which has been available since 1.1 (or
bca5e7427e89).
Before this patch, using util.safehasattr() prevents perf.py from
being loaded by Mercurial earlier than 1.9.3 (or 36f152380d7d),
because util.safehasattr() isn't available in such Mercurial, even
though there are some code paths for Mercurial earlier than 1.9.3.
For example, setting "_prereadsize" attribute in perfindex() and
perfnodelookup() is effective only with Mercurial earlier than 1.8 (or
1299f0c14572).
This patch is a preparation for using util.safehasattr() safely in
subsequent patches.
This patch defines util.safehasattr() forcibly without examining
whether it is available or not, because:
- examining existence of "safehasattr" safely itself needs similar logic
- safehasattr() is small enough to define locally
The httplib library is renamed to http.client in python 3. So the
import is conditionalized and a test is added in check-code to warn
to use util.httplib
If the user press 'q' to leave the 'less' pager, it is expected to end the
hg process immediately. We currently rely on SIGPIPE for this behavior. But
SIGPIPE won't arrive if we don't write anything (like doing heavy
computation, reading from network etc). If that happens, the user will feel
that the hg process just hangs.
The patch address the issue by adding a SIGCHLD signal handler and sends
SIGPIPE to the server as soon as the pager exits.
This is also an issue with hg's pager implementation.
Rebuilding translation table (256 size) at each repquote() invocations
is redundant.
For example, this patch decreases user time of command invocation
below from 18.297s to 13.445s (about -27%) on a Linux box. This
command is main part of test-check-code.t.
hg locate | xargs python contrib/check-code.py --warnings --per-file=0
This patch adds "_repquote" prefix to functions and variables factored
out from repquote() to avoid conflict of name in the future.
Before this patch, "missing _() in ui message" rule overlooks
translatable message, which starts with other than alphabet.
To detect "missing _() in ui message" more exactly, this patch
improves the regexp with assumptions below.
- sequence consisting of below might precede "translatable message"
in same string token
- formatting string, which starts with '%'
- escaped character, which starts with 'b' (as replacement of '\\'), or
- characters other than '%', 'b' and 'x' (as replacement of alphabet)
- any string tokens might precede a string token, which contains
"translatable message"
This patch builds an input file, which is used to examine "missing _()
in ui message" detection, before '"$check_code" stringjoin.py' in
test-contrib-check-code.t, because this reduces amount of change churn
in subsequent patch.
This patch also applies "()" instead of "_()" on messages below to
hide false-positives:
- messages for ui.debug() or debug commands/tools
- contrib/debugshell.py
- hgext/win32mbcs.py (ui.write() is used, though)
- mercurial/commands.py
- _debugchangegroup
- debugindex
- debuglocks
- debugrevlog
- debugrevspec
- debugtemplate
- untranslatable messages
- doc/gendoc.py (ReST specific text)
- hgext/hgk.py (permission string)
- hgext/keyword.py (text written into configuration file)
- mercurial/cmdutil.py (formatting strings for JSON)
When auto-completing hg commands, aliases are listed, but not the available
switches for an alias, because `HGPLAIN=1` filters these out. Add a
`HGPLAINEXCEPT=alias` exception to resolve this.
We make heavy use of aliases that drive hg log with custom revsets, sorting and
the -G switch, but want our users to be able to auto-complete any additional
command-line switches.
Before this patch, fromlocalfunc() assumes that "module" attribute of
ast.ImportFrom is None for "from . import a", and Python 2.7.x
satisfies this assumption.
On the other hand, with Python 2.6.x, "module" attribute of
ast.ImportFrom is an empty string for "from . import a", and this
causes failure of test-check-module-imports.t.
Our signal handlers forward signals to the server process, but it will
disappear soon after hgc_close(). So we should unregister handlers before
hgc_close(). Otherwise chg would abort due to kill(perrpid, sig) failure.
The problem is spotted by SIGWINCH while waiting pager termination.
Before this patch, chg will give up when it cannot connect to the new server
within 10 seconds. If the host has high load during that time, 10 seconds
is not enough.
This patch makes it adjustable using the CHGTIMEOUT environment variable.
Before this patch, chg uses the old pager behavior (pre 55f6f7fb60d2), which
executes pager in the main process. The user will see the exit code of the
pager, instead of the hg command.
Like 55f6f7fb60d2, this patch fixes the behavior by executing the pager in
the child process, and wait for it at the end of the main process.
This patch makes repquote() distinguish more characters below, as a
preparation for exact detection in subsequent patch.
- "%" as "%"
- "\\" as "b"(ackslash)
- "*" as "A"(sterisk)
- "+" as "P"(lus)
- "-" as "M"(inus)
Characters other than "%" don't use itself as replacement, because
they are treated as special ones in regexp.
d19c9c93ad10 tried to detect '.. note::' more exactly. But
implementation of it seems not correct, because:
- fromc.find(c) returns -1 for other than "." and ":"
- tochr[-1] returns "q" for such characters, but
- expected result for them is "o"
This patch uses dict to manage replacement instead of replacing
str.find() by str.index(), for improvement/refactoring in subsequent
patches. Examination by fixedmap is placed just after examination for
' ' and '\n', because subsequent patch will integrate the latter into
the former.
This patch also changes regexp for 'string join across lines with no
space' rule, and adds detailed test for it, because d19c9c93ad10 did:
- make repquote() distinguish "." (as "p") and ":" (as "q") from
others (as "o"), but
- not change this regexp without any reason (in commit log, at
least), even though this regexp depends on what "o" means
This patch doesn't focuses on deciding whether "." and/or ":" should
be followed by whitespace or not in translatable messages.
It doesn't make sense that (a) is allowed whereas (b) is disallowed.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
Since (b) is banned, we should do the same for (a) for consistency.
a) from mercurial import hg
from mercurial.i18n import _
b) from . import hg
from .i18n import _
This is needed so that launchpad and friends have a unique version number for
each distroseries (trusty, wily, xenial, etc). It was discovered when trying to
upload 3.8 to launchpad.
Debian maintainers already have this and lintian warns us about not
listing 'wish' as a dependency or suggestion so this patch does indeed
just that. The issue, by the way, is that we are shipping hgk (which is
written in tcl/tk) so we should be good citizens and list wish (a meta
package for tcl/tk) as a dependency.
Instead of using bdist_mpkg, we use the modern Apple-provided tools to
build an OS X Installer package directly. This has several advantages:
* Avoids bdist_mpkg which seems to be barely maintained and is hard to
use.
* Creates a single unified .pkg instead of a .mpkg.
* The package we produce is in the modern, single-file format instead of
a directory bundle that we have to zip up for download.
In addition, this way of building the package now correctly:
* Installs the manpages, bringing the `make osx`-generated package in
line with the official Mac packages we publish on the website.
* Installs files with the correct permissions instead of encoding the
UID of the user who happened to build the package.
Thanks to Augie for updating the test expectations.
As we don't use sockdirfd yet, this is the simplest workaround to compile chg
on old Unices where AT_FDCWD does not exist. Foozy pointed out Mac OS X 10.10
is required for AT_FDCWD as well as xxxat() functions.
This bug is hidden by the current bias towards matches at the
beginning of the file. When this bias is tweaked later to address
recursion balancing, the normalization code could cause the next block
to shrink to a negative length, thus creating invalid delta chunks. We
add checks here to disallow that.
This bug requires test cases that are an awkwardly large size for the test
suite, but is very rapidly picked up by the included torture tester.
So far fromlocal recognizes relative imports of the form:
from . import D
from .. import E
It wasn't prepared for recognizing relative imports like:
from ..F import G
The bug was not found so far because all relative imports starting
from the parent was in the list of allowsymbolicimports like:
from ..i18n import
from ..node import
Before this patch, check-py3-compat.py implies unintentional ".pure"
sub-package name at loading pure modules, because module name is
composed by just replacing "/" in the path to actual ".py" file by
".".
This makes pure modules belong to "mercurial.pure" package, and
prevents them from importing a module belonging to "mercurial" package
relatively by "from . import foo" or so.
This is reason why pure modules fail to import another module
relatively only at examination by check-py3-compat.py.
This had the unfortunate side effect of causing the environment to have a
newline due to the fact that some 'cd' outputs the result of the directory
change. So, let's just redirect the meaningless output.
Before this patch, if the user uses chg and ncurses interface, resizing the
terminal window will mess up its content.
This patch fixes the issue by forwarding SIGWINCH to the worker process.
We don't really need to report SyntaxErrors, since in theory
docchecker or a test will catch them, but they happen, and
we can't just have the code crash, so for now, we're reporting
them.
Before this patch, if the server started by chg has exited with code 0 without
creating a connectable unix domain socket at the specified address, chg will
exit with code 0, which is not the correct behavior. It can happen, for
example, CHGHG is set to /bin/true.
This patch addresses the issue by checking the exit code of the server and
printing a new error message if the server exited normally but cannot be
reached.
We check for sockdirfd at freecmdserveropts but not lockfd, which is a bit
strange to people new to the code. Add a comment and an assert to make it
clear that lockfd should be closed earlier.
As part of the series to support long socket paths, we need to add the fd of
the directory to the cmdserveropts structure so we can use basenames instead
of full paths for sockname, redirectsockname, and lockfile.
This is a style fix. I was using tabstop=4 for some early patches, although
I realized we use tabstop=8 later but these early style issues remains. Let's
fix them.
Before this patch, chg always uses color in its debugmsg and abortmsg and
there is no way to turn it off.
This patch adds a global flag to control whether chg should use color or
not and only enables it when stderr is a tty and HGPLAIN is not set.
Before this patch, "connect to" debug message is printed repeatedly because
a previous patch changed how the chg client decides the server is ready to be
connected.
This patch revises the places we print connect debug messages so they are less
repetitive without losing useful information.
On pypy datetime and cProfile are modules written in Python, not in C.
For the purpose of this test, just list them explicitely as builtins,
which silences warnings about them being imported before stdlib modules.
Before this patch, chg will fall back to "hg" if neither CHGHG nor HG are set.
This may have trouble if the "hg" in PATH is not compatible with chg, which
can happen, for example, an old hg is installed in a virtualenv.
Since it's very hard to do a quick hg version check from chg, after discussion
in IRC with smf and marmoute, the quickest solution is to build a package with
a hardcoded absolute hg path in chg. This patch makes it possible by adding a
C macro HGPATH.
Before this patch, if __GNUC__ is not defined, PRINTF_FORMAT_ will not be
defined and will cause compilation error.
This patch solves the issue by making sure PRINTF_FORMAT_ is defined. It
allows chg to be compiled with tcc (http://bellard.org/tcc/) by:
tcc -o chg *.c
All of mercurial.* is now using absolute_import. Most of
mercurial.* is able to ast parse with Python 3. The next big
hurdle is being able to import modules using Python 3.
This patch adds testing of hgext.* and mercurial.* module imports
in Python 3. As the new test output shows, most modules can't
import under Python 3. However, many of the failures are due
to a common problem in a highly imported module (e.g. the bytes vs
str issue in node.py).
Previously, test-check-py3-compat.t parsed Python files with Python 2
and looked for known patterns that are incompatible with Python 3.
Now that we have a mechanism for invoking Python 3 interpreters from
tests, we can expand check-py3-compat.py and its corresponding .t
test to perform an additional AST parse using Python 3.
As the test output shows, we identify a number of new parse failures
on Python 3. There are some redundant warnings for missing parentheses
for the print function. Given the recent influx of patches around
fixing these, the redundancy shouldn't last for too long.
Redirecting stdout to /dev/null has unwanted side effects, namely ui.write
will stop working. This patch removes the redirection code and helps chg to
pass test-bad-extension.t.
If the server has an uncaught exception, it will exit without being able to
write the channel information. In this case, the client is likely to complain
about "failed to read channel", which looks inconsistent with original hg.
This patch silences the error message and makes uncaught exception behavior
more like original hg. It will help chg to pass test-fileset.t.
In some rare cases (next patch), we may want validate to do "unlink" without
forcing the client reconnect. This patch addes a new "reconnect" instruction
and makes "unlink" not to reconnect by default.
Currently the validate command in chgserver expects config can be loaded
without issues but the config can be broken and chg will print a stacktrace
instead of the parsing error, if a chg server is already running.
This patch adds a handler for ParseError in validate and a new instruction
"exit" to make the client exit without abortmsg. A test is also added to make
sure it will behave as expected.
See the previous patch for details. Since the socket will be closed by the
server, handleresponse() will never return:
Traceback (most recent call last):
...
chg: abort: failed to read channel
If chgserver aborts during startup, for example, error.ParseError when parsing
a config file, chg client probably just wants to exit with a same exit code
without printing other unrelated text. This patch changes the text "cmdserver
exited with status" from abortmsg to debugmsg and exits with a same exit code.
It couldn't exclude an empty file containing comments. That's why
hgext/__init__.py had been listed in test-check-py3-compat.t before
"hgext: officially turn 'hgext' into a namespace package".
One problem reported by lintian is "bad-distribution-in-changes-file unstable"
in changelog, but the current changelog for the official package in Ubuntu also
uses that distribution name (unstable), because they import from Debian. This
certainly doesn't stop the build process.
Current pidfile logic will only keep the pid of the newest server, which is
not very useful if we want to kill all servers, and will become outdated if
the server auto exits after being idle for too long.
Besides, the server-side pidfile writing logic runs before chgserver gets
confighash so it's not trivial to append confighash to pidfile basename like
we did for socket file.
This patch removes --pidfile from the command starting chgserver and switches
to an alternative way (unlink socket file) to stop the server.
chgserver now validates and reloads configs automatically. Manually reloading
is no longer necessary. Besides, we are deprecating pid files since the server
will periodically check its ownership of the socket file and exit if it does
not own the socket file any longer, which works more reliable than a pid file.
This patch removes the SIGHUP reload logic from both chg server and client.
The chgserver is designed to load repo config from current directory. "--cwd /"
will prevent chgserver from loading repo config and generate a wrong
confighash, which will result in a redirect loop. This patch removes "--cwd /"
and uses "--daemon-postexec chdir:/" instead.
The custom porting fixers are removed. A comment related to 2to3
has been removed from the import checker.
After this patch, no references to 2to3 remain.
Some users may have hg as a wrapper script which sets sensitive environment
variables (like setting up virtualenv). This will make chg redirect forever
because the environment variables are never considered up to date.
This patch adds a limit (10) for reconnect attempts and warn the user with
a possible solution if the limit is exceeded.
This patch uses the newly added validate method to make sure the server has
loaded the up-to-date config and extensions. If the server cannot validate
itself, the client will receive instructions and follow them to try to reach
another server that is more likely to validate itself. The instructions can
be a redirect (connect to another server address) and/or an unlink (stops an
out-dated server).
It was necessary to go through progress.uisetup() to set up the progressui
wrapper. Since the progress extension has got into the core, progress.assume-tty
is no longer necessary.
Sometimes people may create a symbol link from hg to chg, or write a wrapper
script named hg calling chg. Without $HG and $CHGHG set, this will lead to
chg executes itself causing deadlock. The user will notice chg hangs for some
time and aborts with a timed out message, without knowing the root cause and
how to solve it.
This patch sets a dummy environment variable before executing hg to detect
this situation, and print a fatal message with some possible solutions.
CHGINTERNALMARK is set by chg client to detect the situation that chg is
started by chg. It is temporary and should be dropped to avoid possible
side effects.
There are some known unsupported commands or flags for chg, such as hg serve -d
and hg foo --time. This patch detects these situations and transparently fall
back to the original hg. So the users won't bother remembering what chg can and
cannot do by themselves.
The current detection is not 100% accurate since we do not have an equivalent
command line parser in C. But it tries not to cause false positives that
prevents people from using chg for legit cases. In the future we may want to
implement a more accurate "unsupported" check server-side.
This is a part of the one server per config series. In multiple-server setup,
multiple clients may try to start different servers (on demand) at the same
time. The old lock will not guarantee a client to connect to the server it
just started, and is not crash friendly.
This patch addressed above issues by using flock and does not release the lock
until the client actually connects to the server or times out.
Initially we use --daemon-pipefds to pass file descriptors for synchronization.
Later, in order to support Windows, --daemon-pipefds is changed to accept a
file path to unlink instead. The name is outdated since then.
chg client is designed to use flock, which will be held before starting a
server and until the client actually connects to the server it started. The
unlink synchronization approach is not so helpful in this case.
To address the issues, this patch renames pipefds to postexec and the following
patch will allow the value of --daemon-postexec to be things like
'unlink:/path/to/file' or 'none'.
I've updated the script to reflect changes in Mercurial and to include a much
more through installation guide with configuration examples and details on how
to configure IIS. I've used the script to set up a working server from scratch.
We are going to make chgserver load repo config, remember what it is, and
load repo config again to detect config change. This is the first step that
passes config, repo, cwd options to server. Traceback is passed as well to
cover errors before hitting chgserver.runcommand.
They are like malloc and realloc but will abort the program on error.
A lot of places use {m,re}alloc and check their results. This patch
can simplify them.
This is necessary to suspend/resume long pulls, interactive curses session,
etc.
The implementation is based on emacsclient, but our version doesn't test if
chg process is foreground or not before propagating SIGCONT. This is because
chg isn't always an interactive session. If we copy the SIGTTIN/SIGTTOU
emulation from emacsclient, non-interactive session can't be moved to a
background job.
$ chg pull
^Z
suspended
$ bg %1
[1] continued
[1] suspended (tty input) # wrong
https://github.com/emacs-mirror/emacs/blob/0e96320/lib-src/emacsclient.c#L1094
It seems calling memset() and sigemptyset() is common pattern to initialize
sigaction. And strictly speaking, sigset_t must be initialized by sigemptyset()
or sigfillset(). I saw git and uwsgi do that way, so let's follow them.
If the revset being benchmarked has an exception, the handling code was
encountering an error because the exception did not always have an "output"
attribute (I think it's a python 2.7 thing).
This rule detects "hg extdiff" invocation without -p/--program and
-o/--option.
This patch specifies "-p diff" explicitly in test-extdiff.t to avoid
false positive matching.
Before this patch, check-code examines "magic" pattern (e.g.
'^#!.*python') matching against not contents of a file, but name of
it.
This unintentionally omits code checking against Python source file,
of which filename doesn't end with "*.py" or "*.cgi", even though
contents of it starts with "#!/bin/python" or so.
In this change, 'pre' refers contents of file 'f'.
This is fixing for 'legacy exception syntax; use "as" instead of ","'
check-code rule.
check-code has overlooked these, because files aren't recognized as
one to be checked (this problem is fixed by subsequent patch).
This is fixing for 'missing _() in ui message (use () to hide
false-positives)' check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
This is fixing for 'no whitespace around = for named parameters'
check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
This is fixing for 'line too long' check-code rule.
check-code has overlooked this, because a file isn't recognized as one
to be checked (this problem is fixed by subsequent patch).
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because system standard "diff" (=
/usr/bin/diff) on Solaris always formats chunk header in the style
below:
@@ -X.x +Y.y @@
even though "diff" on Linux sometimes omits ".x" and/or ".y" in it.
This patch makes chunk header of external diff glob-ed for portability
of tests, and adds check-code.py rules to detect such diff output in
tests.
This patch also changes "hg diff" output in test-subrepo-git to
simplify detection rules, even though it is certainly portable because
these lines are generated by "git" command.
This patch is a part of making tests using external "diff" portable,
and tests below aren't yet portable even after this patch.
test-largefiles-update.t
test-subrepo-deep-nested-change.t
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because system standard "diff" (=
/usr/bin/diff) on Solaris doesn't display timezone for timestamp of
each files in diff output.
This patch makes timezone in external diff output glob-ed for
portability of tests, and adds check-code.py a rule to detect such
Before this patch, some tests using external "diff" command via
extdiff extension fail on Solaris, because "-p" (show which C function
each change is in) option isn't supported by system standard "diff" on
Solaris, even though extdiff passes it to external "diff" by default.
Fortunately, this non-portable option isn't important for (current, at
least) tests using external "diff" command via extdiff extension.
This patch omits "-p" for external "diff" command via extdiff
extension for portability of tests, and adds check-code.py a rule to
detect invocation of "diff" with "-p".
Newly added check-code.py rule examines only lines generated by
external "diff" with "-r", because strict examination might
misidentify "hg diff -p" or other complicated lines consisting of
"diff" string as wrong one.
This patch is a part of making tests using external "diff" portable,
and tests below aren't yet portable even after this patch.
test-graft.t
test-largefiles-update.t
test-subrepo-deep-nested-change.t
Before this patch, test-check-config.t fails on Solaris, because
"xargs" doesn't invoke check-config.py with all filenames at once.
"xargs" may invoke specified command multiple times with part of
arguments given from stdin: according to "xargs(1)" man page, this
dividing arguments is system-dependent.
For portability of test-check-config.t, this patch adds "xargs" like
mode to check-config.py and executes it in test-check-config.t without
"xargs".
We're trying to forbid "grep -a" and unintentionally complained even
if the "-a" was part of the filename. Requiring a space before "-a" to
match is probably good enough.
The old code did not understand the difference between the first line of the summary,
and a random line in the summary that happened to include a #, or a
random line in the changes that happened to include it.
7c3798ffdc0c is an example where it fails
For reasons I can't explain (but likely have something to do with a
combination of __import__ inferring default values for arguments and
the demand importer mechanism further assuming defaults), the demand
importer isn't playing well with IPython. Without this patch, we get
a failure "ValueError: Attempted relative import in non-package" when
attempting to import "IPython." The stack has numerous demandimport
calls on it and adding "IPython" to the exclude list in demandimport
isn't enough to make the problem go away, which means the issue is
likely somewhere in the bowells of IPython. It's easier to just disable
the demand importer when importing the debugger.
The goal of commit summary keywords is to help us sort, categorize,
and filter our voluminous commits for our release notes in a way
that's helpful and meaningful to end users. Lately, there have been a
huge number of "keywords" that are neither words nor particularly key.
This patch tries to discourage that by narrowing the allowed
characters to alphanumeric. In particular, it doesn't allow "."
(method, function names, and file extensions) and "/" (filenames). It
also gives a short reminder of what a keyword ought to be.
This addition to the inno installer script means that the windows uninstaller
registry key “DisplayVersion" is set to the application version number and
will show in Add/Remove Programs.
This makes the changes in 68b7b759ebff and 71a3703364df available on Windows.
I'm not setup to make the installer, so someone with experience in this area
should probably give it a look. In looking around to try to figure out how to
build the installer, it looks like the Makefile may need an update to $DOCFILES.
My version of docker (1.8.3) have a different formating for 'docker version'
that broke the build script. We make the version matching more generic in to
work with both version.
A subsequent patch will refactor _chunks() and the calculation of the
offset will no longer occur in that function. Prepare by returning the
offset from _chunkraw().
There are make targets for building mercurial packages for various
distributions using docker. One of the preparation steps before building is to
create inside the docker image a user with the same uid/gid as the current user
on the host system, so that the resulting files have appropriate
ownership/permissions.
It's possible to run `make docker-<distro>` as a user with uid or gid that is
already present in a vanilla docker container of that distibution. For example,
issue4657 is about failing to build fedora packages as a user with uid=999 and
gid=999 because these ids are already used in fedora, and groupadd fails.
useradd would fail too, if the flow ever got to it (and there was a user with
such uid already).
A straightforward (maybe too much) way to fix this is to allow non-unique uid
and gid for the new user and group that get created inside the image. I'm not
sure of the implications of this, but marmoute encouraged me to try and send
this patch for stable.
Previously the -rc in our rc tags got dropped, meaning that those
packages looked newer to the packaging system than the later release
build. This rectifies the issue, though some damage may already have
been done on 3.6-rc builds.
I'm mostly cargo-culting the RPM version format - there don't appear
to be rules for RPM about how to handle this. Hopefully an RPM
enthusiast can fix up what I've done as a followup.
We want to support editors with parameters, eg EDITOR="vim -O" or whatever.
So remove the quotes from around $ED and assume that the editor variable is
properly escaped already.
Now, 'dirstate.write(tr)' delays writing in-memory changes out, if a
transaction is running.
This may cause treating this revision as "the first bad one" at
bisecting in some cases using external hook process inside transaction
scope, because some external hooks and editor process are still
invoked without HG_PENDING and pending changes aren't visible to them.
'dirstate.write()' callers below in localrepo.py explicitly use 'None'
as 'tr', because they can assume that no transaction is running:
- just before starting transaction
- at closing transaction, or
- at unlocking wlock
This fix adds a caret to the start of the regex looking for merge markers. This
avoids the issue arises when you've real merge conflicts in a file that tests
for the existance of merge markers in test output. Editmerge will not open on
the fake/tested merge markers because they'll be indented in.
This adds a test extension to check that the non-normal set contains the
expected entries. It wraps several methods of the dirstate to check that
the non-normal set has the correct values before and after the call. The
extension lives in contrib so that paranoid developers can easily
enable it to make sure that the non-normal set is consistent across more
complex operations than the included tests.
Whenever check-code finds something wrong, the diffs it
generated were fairly hard to read.
The problem is that check-code before this change
would list files that were white listed using
no- check- code but without a glob marker.
Whereas, the test-check-code.t expected output has
no-che?k-code (glob) in order to avoid having itself
flagged as a file to skip.
Thus, in addition to any lines relating to things you
did wrong, all of the white-listed files are listed as
changed.
There is no reason for things to be this painful.
This change makes the output from check-code.py match
the expected output in test-check-code.t
This came up before, but the tests in check-code.py don't find -U (only -u)
and they don't work when the diff is inside a shell function. This fixes
the offending tests and beefs up check-code.py.
Not having this caused warnings on Windows:
mercurial/pure/osutil.py:12: stdlib import follows local import: os
mercurial/pure/osutil.py:13: stdlib import follows local import: socket
mercurial/pure/osutil.py:14: stdlib import follows local import: stat
mercurial/pure/osutil.py:15: stdlib import follows local import: sys
We have a convention of using -c|-m|FILE elsewhere for reading from
revlogs. Use it for `hg perfrevlog`.
While I was here, I also added a docstring to document what this
command does, as "perfrevlog" is ambiguous.
As part of investigating performance improvements to revlog reading,
I needed a mechanism to measure every part of revlog reading so I knew
where time was spent and how effective optimizations were.
This patch implements a perf command for benchmarking the various
stages of reading a single revlog revision.
When executed against a manifest revision at the end of a 30,000+
long delta chain in mozilla-central, the command demonstrates that
~80% of time is spent in zlib decompression.
The old code only partially cleared the caches. Now that we have a
comprehensive method for wiping all caches, let's call it.
This appears to introduce a marginal regression in `hg perfmanifest`
on mozilla-central. This is good because the new result is more
accurate since caches aren't being used.
The /dev/null redirect was causing the following error:
The system cannot find the path specified.
Adjusting HGRCPATH as part of the command line causes the system to try to
execute 'HGRCPATH'.
Once we get a matcher down into manifestmerge, we can make narrowhg
work more easily and potentially let manifest.match().diff() do less
work in manifestmerge.
Python 3 is inevitable. There have been incremental movements towards
converting the code base to be Python 3 compatible. Unfortunately, we
don't have any tests that look for Python 3 compatibility. This patch
changes that.
We introduce a check-py3-compat.py script whose role is to verify
Python 3 compatibility of the files passed in. We add a test that
calls this script with all .py files from the source checkout.
The script currently only verifies that absolute_import and
print_function are used. These are the low hanging fruits for Python
compatbility. Over time, we can include more checks, including
verifying we're able to load each Python file with Python 3. You
have to start somewhere.
Accepting this patch means that all new .py files must have
absolute_import and print_function (if "print" is used) to avoid
a new warning about Python 3 incompatibility. We've already
converted several files to use absolute_import and print_function
is in the same boat, so I don't think this is such a radical
proposition.
Before this patch, import-checker.py didn't know if a name in ImportFrom
statement are module or not. Therefore, it complained the following example
did "direct symbol import from mercurial".
# hgext/foo.py
from mercurial import hg
This patch reuses the dict of local modules to filter out sub-module names.
Because OpenSSL is compiled without SSLv3 support on Debian sid, Python 2.6.9
can't be built without this hack. Python 2.7 is patched appropriately, but
2.6 isn't.
http://bugs.python.org/issue22935
This allows builddeb to handle distributions that are not Debian.
Distributor ID is reported by lsb_release --id, and in case of builddeb it's
usually Debian or Ubuntu.
Debian and Ubuntu releases have both codenames and traditional version numbers.
An entire "branch" of releases is referred to by its codename, and version
numbers (e.g. 8.2, 14.04.3) are used to address individual releases.
Since we use codenames for building .deb packages, let's call the option and
the variable appropriately.
The pull url header can easily grow over 80 chars. The check-commit script was
confusing this with a too long summary line. We update the regular expression to
not match other header.
Many revset consumers construct changectx instances for each returned
result. Add support for benchmarking this to our revset benchmark
script.
In the future, we might want to have some kind of special syntax in
the parsed revset files to engage this mode automatically. This would
enable us to load changectxs for revsets that do that in the code and
would more accurately benchmark what's actually happening. For now,
running all revsets with or without changectxs is sufficient.