The purpose of this new filter is to make it possible to partially replace the
functionality of the interhg extension. The idea is to be able to define regular
expression based substitutions on a new "websub" config section. hgweb will then
be able to apply these substitutions wherever the "websub" filter is used on a
template.
This first revision just adds the code necessary to load the websub expressions
and adds the websub filter, but it does not add any calls to the websub filter
itself on any of the templates. That will be done on the following revisions.
Currently, the 'user' filter is using util.shortuser(text) (which clearly
doesn't extract only the user portion of an email address, even though the
help text says it does).
The new 'emailuser' filter uses the new util.emailuser(text) function which,
instead, does exactly that.
The help text on the 'user' filter has been modified accordingly.
Add a doctest with an hopefuly-comprehensive list of combinations
we can expect in real-life situations.
This does not cover corner cases, for example when a CR or LF is
embedded in the name (allowed by RFC 5322!).
Code in tests/test-doctest.py contributed by:
Martin Geisler <mg@aragost.com>
Thanks!
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
RFC5322 (Internet Message Format) [0] says that the 'display name' of
an internet address [1] (what Mercurial calls 'person') can be quoted
with DQUOTE (ASCII 34: ") if it contains non-atom characters [2].
For example, dot '.' is a non-atom character. Also, DQUOTEs in a
quoted string will be escaped using "\" [2][3].
The current {author|person} template+filter just extracts the part
before an email address as-is. This can look ugly, especially on the
web interface, or when generating output for post-processing...
Moreover, as an example, the Mercurial repository has a bunch of
incoherent uses of DQUOTES in author names. As per Matt's digging:
$ hg log --template "{author|person}\n" | grep '"' | sort | uniq
"Andrei Vermel
"Aurelien Jacobs
"Daniel Santa Cruz
"Hidetaka Iwai
"Hiroshi Funai"
"Mathieu Clabaut
"Paul Moore
"Peter Arrenbrecht"
"Rafael Villar Burke
"Shun-ichi GOTO"
"Wallace, Eric S"
"Yann E. MORIN"
Josef "Jeff" Sipek
Radoslaw "AstralStorm" Szkodzinski
Fix the 'person' filter to remove leading and trailing DQUOTES,
and unescape remaining DQUOTES.
Given this author: "J. \"random\" DOE" <john@doe.net>
before: {author|person} : "J. \"random\" DOE"
after: {author|person} : J. "random" DOE
For the Mercurial repository, that leaves us with two authors with
DQUOTES, in acceptable positions:
$ hg log --template "{author|person}\n" | grep '"' | sort | uniq
Josef "Jeff" Sipek
Radoslaw "AstralStorm" Szkodzinski
[0] https://tools.ietf.org/html/rfc5322
[1] https://tools.ietf.org/html/rfc5322#section-3.4
[2] https://tools.ietf.org/html/rfc5322#section-3.2.4
[3] https://tools.ietf.org/html/rfc5322#section-3.2.1
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
This new 'bisect' template expands to a cset's bisection status (good,
bad and so on...). There is also a new 'shortbisect' filter that yields
a single char representing the cset's bisection status.
It uses the two recently-added hbisect.label() and .shortlabel() functions.
Example output using the repository in test-bisect2.t, and some made-up
state of the 'end at merge' test (with graphlog, it's so explicit):
$ hg glog --template '{rev}:{node|short} {bisect}\n' \
-r 'bisect(range)|bisect(ignored)'
o 17:228c06deef46: bad
|
o 16:609d82a7ebae: bad (implicit)
|
o 15:857b178a7cf3: bad
|\
| o 13:b0a32c86eb31: good
| |
| o 12:9f259202bbe7: good (implicit)
| |
| o 11:82ca6f06eccd: good
| |
@ | 10:429fcd26f52d: untested
|\ \
| o | 9:3c77083deb4a: skipped
| |/
| o 8:dab8161ac8fc: good
| |
o | 6:a214d5d3811a: ignored
|\ \
| o | 5:385a529b6670: ignored
| | |
o | | 4:5c668c22234f: ignored
| | |
o | | 3:0950834f0a9c: ignored
|/ /
o / 2:051e12f87bf1: ignored
|/
And now the same with the short label:
$ hg log --template '{bisect|shortbisect} {rev}:{node|short}\n'
18:d42e18c7bc9b
B 17:228c06deef46
B 16:609d82a7ebae
B 15:857b178a7cf3
14:faa450606157
G 13:b0a32c86eb31
G 12:9f259202bbe7
G 11:82ca6f06eccd
U 10:429fcd26f52d
S 9:3c77083deb4a
G 8:dab8161ac8fc
7:50c76098bbf2
I 6:a214d5d3811a
I 5:385a529b6670
I 4:5c668c22234f
I 3:0950834f0a9c
I 2:051e12f87bf1
1:4ca5088da217
0:33b1f9bc8bc5
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
This will highly simplify the docstring integration. I measured "hg log
--style=changelog" duration on mercurial itself and could not detect any
difference.
It aims to fix javascript error of hgweb's graph view in Japanese 'cp932'
encoding.
'cp932' contains multibyte characters ending with '\x5c' (backslash),
e.g. '\x94\x5c' for Japanese Kanji 'Noh'.
Due to json filter escapes '\' to '\\', multibyte string ending with
'\x5c' is translated to "xxx\", resulting javascript parse error on
a web browser.
This patch changes json() to pass unicode to jsonescape().
Unicode decoding error handler changed to 'replace' by Patrick Mézard.
Mercurial has problem around text wrapping/filling in MBCS encoding
environment, because standard 'textwrap' module of Python can not
treat it correctly. It splits byte sequence for one character into two
lines.
According to unicode specification, "east asian width" classifies
characters into:
W(ide), N(arrow), F(ull-width), H(alf-width), A(mbiguous)
W/N/F/H can be always recognized as 2/1/2/1 bytes in byte sequence,
but 'A' can not. Size of 'A' depends on language in which it is used.
Unicode specification says:
If the context(= language) cannot be established reliably they
should be treated as narrow characters by default
but many of class 'A' characters are full-width, at least, in Japanese
environment.
So, this patch treats class 'A' characters as full-width always for
safety wrapping.
This patch focuses only on MBCS safe-ness, not on writing/printing
rule strict wrapping for each languages
MBCS sensitive textwrap class is originally implemented
by ITO Nobuaki <daydream.trippers@gmail.com>.
It ensures that at least "(none)" is returned in case the argument
passed is None or ''. This is primarily useful to render empty
changelog messages for hgweb but may be useful for others, too.
Adds a new template filter for removing directory levels from a
string. Examples:
{foo|stripdir} -> 'foo'
{foo/bar|stripdir} -> 'foo'
{foo/bar/more|stripdir} -> 'foo/bar'
{foo/bar/more|stripdir|stripdir} -> 'foo'