These have annoyed me for a long time, and I'm tired of commenting on
them in reviews. I'm sorry for how complicated the regular expression
is, but I was too lazy to go crack open pylint's code and add the
check there.
Rebuilding translation table (256 size) at each repquote() invocations
is redundant.
For example, this patch decreases user time of command invocation
below from 18.297s to 13.445s (about -27%) on a Linux box. This
command is main part of test-check-code.t.
hg locate | xargs python contrib/check-code.py --warnings --per-file=0
This patch adds "_repquote" prefix to functions and variables factored
out from repquote() to avoid conflict of name in the future.
Before this patch, "missing _() in ui message" rule overlooks
translatable message, which starts with other than alphabet.
To detect "missing _() in ui message" more exactly, this patch
improves the regexp with assumptions below.
- sequence consisting of below might precede "translatable message"
in same string token
- formatting string, which starts with '%'
- escaped character, which starts with 'b' (as replacement of '\\'), or
- characters other than '%', 'b' and 'x' (as replacement of alphabet)
- any string tokens might precede a string token, which contains
"translatable message"
This patch builds an input file, which is used to examine "missing _()
in ui message" detection, before '"$check_code" stringjoin.py' in
test-contrib-check-code.t, because this reduces amount of change churn
in subsequent patch.
This patch also applies "()" instead of "_()" on messages below to
hide false-positives:
- messages for ui.debug() or debug commands/tools
- contrib/debugshell.py
- hgext/win32mbcs.py (ui.write() is used, though)
- mercurial/commands.py
- _debugchangegroup
- debugindex
- debuglocks
- debugrevlog
- debugrevspec
- debugtemplate
- untranslatable messages
- doc/gendoc.py (ReST specific text)
- hgext/hgk.py (permission string)
- hgext/keyword.py (text written into configuration file)
- mercurial/cmdutil.py (formatting strings for JSON)
This patch makes repquote() distinguish more characters below, as a
preparation for exact detection in subsequent patch.
- "%" as "%"
- "\\" as "b"(ackslash)
- "*" as "A"(sterisk)
- "+" as "P"(lus)
- "-" as "M"(inus)
Characters other than "%" don't use itself as replacement, because
they are treated as special ones in regexp.
d19c9c93ad10 tried to detect '.. note::' more exactly. But
implementation of it seems not correct, because:
- fromc.find(c) returns -1 for other than "." and ":"
- tochr[-1] returns "q" for such characters, but
- expected result for them is "o"
This patch uses dict to manage replacement instead of replacing
str.find() by str.index(), for improvement/refactoring in subsequent
patches. Examination by fixedmap is placed just after examination for
' ' and '\n', because subsequent patch will integrate the latter into
the former.
This patch also changes regexp for 'string join across lines with no
space' rule, and adds detailed test for it, because d19c9c93ad10 did:
- make repquote() distinguish "." (as "p") and ":" (as "q") from
others (as "o"), but
- not change this regexp without any reason (in commit log, at
least), even though this regexp depends on what "o" means
This patch doesn't focuses on deciding whether "." and/or ":" should
be followed by whitespace or not in translatable messages.
This test (making sure the 'check-code' script run as intended) have been
confused with the test making that the mercurial code base comply with our
coding still by multiple generations of contributors.
We are moving it out of the way so that all tests starting with
'test-check' are now doing compliance testing.