Before this patch, the first and last characters were stripped from
ui.logtemplate and template.* if they were the same. It could lead to a
strange result as quotes are optional. See the test for example.
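The pitfall can be sketched as follows. This is only an illustration: the
helper names strip_if_same() and unquote() are hypothetical, not Mercurial
functions, and the second one shows just one possible stricter rule.

```python
def strip_if_same(s):
    # Behavior described above: drop the first and last characters
    # whenever they are equal, whether or not they are quotes.
    if len(s) >= 2 and s[0] == s[-1]:
        return s[1:-1]
    return s


def unquote(s):
    # Hypothetical stricter variant: only strip a matching pair of quotes.
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "'\"":
        return s[1:-1]
    return s
```

With these, an unquoted value such as '|{rev}|' loses its delimiters under
strip_if_same() but is left alone by unquote().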
SyntaxError is the class representing syntax errors in Python code. We should
use a dedicated exception class for our needs. With this change, unnecessary
re-wrapping of SyntaxError can be eliminated.
This patch eliminates a nested data structure other than the parsed tree.
('template', [(op, data), ..]) -> ('template', (op, data), ..)
The new expanded tree can be processed by common parser functions. This change
will help implement template aliases.
Because a (template ..) node should have at least one child node, an empty
template (template []) is mapped to (string ''). Also a trivial string
(template [(string ..)]) node is unwrapped to (string ..) at parsing phase,
instead of compiling phase.
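A rough sketch of the mapping described above, assuming nodes are plain
tuples whose first element is the operator name. The function name
expand_template() is made up for this example.

```python
def expand_template(children):
    """Turn a list of parsed nodes into a single tree node."""
    if not children:
        # A (template ..) node needs at least one child, so an empty
        # template is mapped to an empty string node.
        return ('string', '')
    if len(children) == 1 and children[0][0] == 'string':
        # A template made of one string node is just that string.
        return children[0]
    # ('template', [(op, data), ..]) -> ('template', (op, data), ..)
    return ('template',) + tuple(children)
```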
Now that compiled template fragments are packed into a generic type,
(func, data), a string can be a valid template. This change allows us to unwrap
a trivial string node. See the next patch for details.
Before this patch, parsed and compiled templates were kept as lists. That
was inconvenient for applying transformation such as alias expansion.
This patch changes the types of the outermost objects as follows:
  stage     old             new
  --------  --------------  ------------------------------
  parsed    [(op, ..)]      ('template', [(op, ..)])
  compiled  [(func, data)]  (runtemplate, [(func, data)])
The new templater.parse() function has the same signature as revset.parse()
and fileset.parse().
Silent failure hides bugs and makes it harder to track down the issue. It's
worse than raising an exception.
In future patches, I plan to sort out template functions that require 'ui',
'ctx', 'fctx', etc. so that incompatible functions are excluded and the doc can
say in which context these functions are usable.
  @templatefunc('label', requires=('ui',))
  def label(context, mapping, args):
      ...
If your mercurial/templates/ directory is dirty, then the template system would
otherwise import duplicate templates from the .orig files and potentially try to
parse .rej files.
Since editing/reverting these templates isn't an unexpected action, and since
they're in .hgignore, it's best that the template system know to skip them.
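The filtering idea, as a small sketch. The helper name
template_candidates() is hypothetical and the real loader may be structured
differently.

```python
def template_candidates(filenames):
    # Skip leftovers from reverts and failed patches so that .orig files
    # are not loaded as duplicate templates and .rej files are not parsed.
    return [f for f in filenames if not f.endswith(('.orig', '.rej'))]
```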
To describe the bug this fix is addressing, one can do
``$ hg status -T "{label('red', path)}\n" --color=debug``
and observe that the label is not applied before my fix and applied with it.
Instead of the mapping hack introduced by d4686e0c15c9, this patch changes the
way a label symbol is evaluated. This is still hackish, but should be more
predictable in that it doesn't depend on the known color effects.
This change is intended to eliminate the reference to color._effects so that
color.templatelabel() can be merged with templater.label().
Before this, "{noniterable % template}" raised an exception. This tries to
provide a better indication for the common case, where a left-hand-side
expression is a keyword.
A function argument may be an integer. In this case, it isn't necessary to
convert a value to string and back to integer.
Because an argument may be an arbitrary object (e.g. date tuple), TypeError
should be caught as well.
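A minimal sketch of the conversion, assuming a hypothetical helper
eval_integer() and using ValueError as a stand-in for whatever error type the
templater actually raises.

```python
def eval_integer(value, err):
    # int() accepts an integer as-is and converts numeric strings.
    # It raises ValueError for a non-numeric string and TypeError for an
    # arbitrary object such as a date tuple, so both are caught here.
    try:
        return int(value)
    except (TypeError, ValueError):
        raise ValueError(err)
```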
If a key is constructed from a template expression, it may be a generator.
In that case, the key has to be stringified.
A dictarg should never be a generator, but this patch also changes it to
call evalfuncarg() for consistency.
High-level use case: printing a list of objects with formatter
when each object in turn contains a list of properties (like
when the % template symbol is used in {things % '{thing}'}).
Let the top-level list contain one thing with two properties:
  objs = [{
      'props': [
          { 'value': 1, 'show': 1 },
          { 'value': 2 }]
  }]
(please note that the second property does not have a 'show' key)
If a templateformatter is used to print this with the template
"{props % '{if(show, value)}'}"
the current implementation will print the value for both properties,
which is a bug. This happens because in `templater.runmap`
function we only rewrite mapping values with existing new
values for each item. If some mapping value is missing in
the item, it will not be removed.
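The leak can be reproduced with a simplified model. The names below are
illustrative only: render_buggy() and render_fixed() are not the real
runmap(), and if_show_value() merely mimics "{if(show, value)}".

```python
def render_buggy(items, show):
    mapping = {}
    out = []
    for item in items:
        # Only keys present in this item are overwritten, so 'show'
        # from a previous item leaks into the next one.
        mapping.update(item)
        out.append(show(mapping))
    return out


def render_fixed(items, show):
    out = []
    for item in items:
        # A fresh mapping per item keeps missing keys missing.
        mapping = dict(item)
        out.append(show(mapping))
    return out


def if_show_value(mapping):
    # Mimics "{if(show, value)}".
    return str(mapping['value']) if mapping.get('show') else ''


props = [{'value': 1, 'show': 1}, {'value': 2}]
```

Here render_buggy(props, if_show_value) gives ['1', '2'], while
render_fixed(props, if_show_value) gives ['1', ''].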
In this case, a template is parsed recursively with no thunk for lazy
evaluation. This patch prevents recursion by putting a dummy of the same name
into a cache that will be referenced while parsing if there's a recursion.
  changeset = {files % changeset}\n
                       ~~~~~~~~~
                       = [(_runrecursivesymbol, 'changeset')]
It would be nice if we could detect recursion at the parsing phase, but we
can't because a template can refer to a keyword of the same name. For example,
"rev = {rev}" is valid if rev is a keyword, and we don't know if rev is a
keyword or a template while parsing.
This is necessary to obtain a _hybrid object from a dict. If get() yields
a value, it would be stringified.
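The difference between yielding and returning a value can be shown with a
small sketch. Hybrid is a hypothetical stand-in for a _hybrid object, and
get_lazy() and get_eager() are illustrative names.

```python
class Hybrid:
    """Hypothetical stand-in for a list-like template value."""

    def __init__(self, values):
        self.values = values


def get_lazy(d, key):
    # A generator function: the caller receives a generator object,
    # not the stored value, so the value's type is hidden.
    yield d.get(key)


def get_eager(d, key):
    # Returns the stored object itself, so its type is preserved.
    return d.get(key)
```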
I see no benefit in making get() lazy, so this patch just changes "yield" to
"return".
In templater, a callable symbol exists for lazy evaluation, which should have
f(**mapping) signature. On the other hand, _hybrid.__call__(), which was
introduced by 4e182fb53989, generates mapping for each element.
This patch renames _hybrid.__call__() to _hybrid.itermaps() so that a _hybrid
object can be a value of a mapping dict.
  {namespaces % "{namespace}: {names % "{name }"}\n"}
                               ~~~~~
                               a _hybrid object
The home of 'Abort' is 'error', not 'util'. However, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
It was introduced by 236440938a03, but the important code was removed by
fcf2407610f4. So there was no positive effect other than exhausting memory.
The problem spotted by 236440938a03 is that you can't use a generator keyword
more than once. For example, in hgweb template, "{child} {child}" doesn't work
because the first "{child}" consumes the generator. But as fcf2407610f4 says,
the fix was wrong because it could overwrite a callable keyword that returns
a generator. Also the fix didn't work for a generator of generator such as
"{diff}" keyword. So, the proper fix for that problem would be to not put
a generator in a keyword table. Instead, it should be a factory of a generator.
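Why a factory helps, as a small illustration. children() and render_twice()
are made-up names, and the rendering is reduced to joining strings.

```python
def children():
    yield 'c1'
    yield 'c2'


def render_twice(value):
    # Models "{child} {child}": call the value if it is a factory,
    # otherwise iterate over it directly.
    def expand():
        v = value() if callable(value) else value
        return ','.join(v)
    return '%s %s' % (expand(), expand())
```

Passing the generator object, render_twice(children()), yields 'c1,c2 '
because the first expansion consumes it. Passing the factory,
render_twice(children), yields 'c1,c2 c1,c2'.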
Note that this should fix the memory issue in hgweb, but my Firefox was killed
by OOM instead. Be careful not to use a modern web browser to test issue4868.
This allows the latest class of tag to be found, such as a release candidate or
final build, instead of just the absolute latest.
It doesn't appear that the existing keyword can be given an optional argument.
There is a keyword, function and filter for 'date', so it doesn't seem harmful
to introduce a new function with the same name as an existing keyword. Most
functions are pretty Mercurial agnostic, but there is {revset()} as precedent.
Even though templatekw.getlatesttags() returns a single tuple, one entry of
which is a list, it is simplest to present this as a list of tags instead of a
single item, with each tag having a distance and change count attribute. It is
also closer to how {latesttag} returns a list of tags, and how this function
works when not given a '%' operator.
Because revset() function generates a list of revisions, it seems sensible
to switch the ctx as well where a list expression will be evaluated. I think
"{revset(...) % "..."}" expression wasn't considered well when it was
introduced at 45e0e191755f.
The keyword extension uses "utcdate" for a different function, so we can't
add a new "utcdate" filter or function. Instead, this patch extends "localdate"
to a general timezone converter.
This will allow us to define both a primary expression, ":", and a prefix
operator, ":y". The ambiguity will be resolved by the next patch.
Prefix actions in elements table are adjusted as follows:
  original prefix        primary   prefix
  -----------------      --------  -----------------
  ("group", 1, ")")  ->  n/a       ("group", 1, ")")
  ("negate", 19)     ->  n/a       ("negate", 19)
  ("symbol",)        ->  "symbol"  n/a
"rawstring" was introduced by cd1b50e99ed8, but it's no longer necessary
because 4f14a9644001 and e99f4f59d2e9 changed the way of processing string
literals.
This patch moves string decoding to the parsing phase as it was before:
  ('rawstring', s) -> ('string', s)
  ('string', s)    -> ('string', s.decode('string-escape'))
Instead of re-parsing quoted strings as templates, the tokenizer can delegate
the parsing of nested template strings to the parser. It has two benefits:
1. syntax errors can be reported with absolute positions
2. nested template can use quotes just like shell: "{"{rev}"}"
It doesn't sound nice that the tokenizer recurses into the parser. We could
instead make the tokenizer itself recursive, but it would be much more
complicated because we would have to adjust binding strengths carefully and
put dummy infix operators to concatenate template fragments.
Now "string" token without r"" never appears. It will be removed by the next
patch.
This patch backs out 297d563e92af which should no longer be needed.
The test for '{\"invalid\"}' is removed because the parser is permissive for
\"...\" literal.
As of Mercurial 3.4, there were several syntax rules to process nested
template strings. Unfortunately, they were inconsistent and conflicted with
each other.
a. buildmap() rule
   - template string is _parsed_ as string, and parsed as template
   - <\"> is not allowed in nested template:
     {xs % "{f(\"{x}\")}"} -> parse error
   - template escaping <\{> is handled consistently:
     {xs % "\{x}"} -> escaped
b. _evalifliteral() rule
   - template string is _interpreted_ as string, and parsed as template
     in crafted environment to avoid double processing of escape sequences
   - <\"> is allowed in nested template:
     {if(x, "{f(\"{x}\")}")}
   - <\{> and escape sequences in string literal in nested template are not
     handled well
c. pad() rule
   - template string is first interpreted as string, and parsed as template,
     which means escape sequences are processed twice
   - <\"> is allowed in nested template:
     {pad("{xs % \"{x}\"}", 10)}
Because of the issue of template escaping, issue4714, 56e0b66a4c27 (in stable)
unified the rule (b) to (a). Then, 41e044cfb1ef (in default) unified the rule
(c) to (b) = (a). But they disabled the following syntax that was somewhat
considered valid.
{if(rev, "{if(rev, \"{rev}\")}")}
{pad("{files % \"{file}\"}", 10)}
So, this patch introduces \"...\" literal to work around the escaped-quoted
nested template strings. Because this parsing rule exists only for the backward
compatibility, it is designed to copy the behavior of old _evalifliteral() as
closely as possible.
Future patches will introduce a better parsing rule similar to a command
substitution of POSIX shells or a string interpolation of Ruby, where extra
escapes won't be necessary at all.
  {pad("{files % "{file}"}", 10)}
        ~~~~~~~~~~~~~~~~~~
        parsed as a template, not as a string
Because <\> character wasn't allowed in a template fragment, this patch won't
introduce more breakages. But because the syntax of nested templates is
interpreted differently by different people, there might be unknown issues. So
if we want, we could
instead remove e926f2ef639a, 72be08a15d8d and 56e0b66a4c27 from the stable
branch as the bug fixed by these patches existed for longer periods.
554d6fcc3c8, "strip single backslash before quotation mark in quoted template",
should be superseded by this patch. I'll remove it later.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
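For reference, the rewritten form looks like this. The function
describe_failure() is only an example and is not taken from the code base.

```python
def describe_failure(text):
    try:
        int(text)
    except ValueError as inst:
        # "as" binds the exception instance. This spelling is valid on
        # Python 2.6+ and on Python 3.
        return '%s: %s' % (type(inst).__name__, inst)
    return None
```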
The backslash character should start escape sequences no matter if a string is
prefixed with 'r'. They are just not interpreted as escape sequences in raw
strings. revset.tokenize() handles them correctly, but templater didn't.
https://docs.python.org/2/reference/lexical_analysis.html#string-literals
This can simplify the interface of parse() function. Our tokenizer tends to
have optional arguments other than the message to be parsed.
Before this patch, the "lookup" argument existed only for the revset, and the
templater had to pack [program, start, end] to be passed to its tokenizer.
The problem was spotted at cd1b50e99ed8, that says "this patch invokes it
with "strtoken='rawstring'" in "_evalifliteral()", because "t" is the result
of "arg" evaluation and it should be "string-escape"-ed if "arg" is "string"
expression." This workaround is no longer valid since 72be08a15d8d introduced
strict parsing of '\{'.
Instead, we should interpret bare token as "string" or "rawstring" template.
This is what buildmap() does at parsing phase.
Because double backslashes are processed as a string escape sequence, '\\{'
should start the template syntax. On the other hand, r'' disables any sort
of \-escapes, so r'\{' can go either way, never start the template syntax
or always start it. I simply chose the latter, which means r'\{' is the same
as '\\{'.
This patch brings back pre-2.8.1 behavior.
The result of parsestring() is stored in templater's cache, t.cache, and then
it is parsed as a template string by compiletemplate(). So t.cache should keep
an unparsed string no matter if it is sourced from config value. Otherwise
backslashes would be processed twice.
The test vector is borrowed from 83ff877959a6.
The content of "hg help templating" is largely derived from docstrings
on functions providing functionality. Template functions are the long
holdout.
Prepare for generating them dynamically by defining docstrings for all
template functions.
There are numerous ways these docs could be improved. Right now, the
help output simply shows function names and arguments. So literally
any accurate data is better than what is there now.
I've tried to unify gettemplate() with buildtemplate(), but it didn't go well
because gettemplate() has to bypass the mapping dict.
For example, web templates have '{tags%changelogtag}' and 'changelogtag' is
defined in both mapping, the default, and context.cache, sourced from map file.
In general, mapping shadows context variables, but gettemplate() has to pick
it from context.cache.
The previous patch made 'string' always be interpreted as a template. So
this patch removes the special handling of r'rawstring' instead. Now r''
disables template processing entirely.
This patch series is intended to unify the interpretation of string literals.
It is a breaking change that boldly assumes
a. string literal "..." never contains template-like fragment or it is
intended to be a template
b. we tend to use raw string literal r"..." for regexp pattern in which "{"
should have different meaning
Currently, we don't have a comprehensible rule for how string literals are
evaluated in template functions. For example, fill() takes "initialindent"
and "hangindent" as templates, but not "text", whereas "text" is a
template in pad() function.
  date(date, fmt)
  diff(includepattern, excludepattern)
  fill(text, width, initialindent: T, hangindent: T)
  get(dict, key)
  if(expr, then: T, else: T)
  ifcontains(search, thing, then: T, else: T)
  ifeq(expr1, expr2, then: T, else: T)
  indent(text, indentchars, firstline)
  join(list, sep)
  label(label: T, expr: T)
  pad(text: T, width, fillchar, right)
  revset(query[, formatargs...])
  rstdoc(text, style)
  shortest(node, minlength)
  startswith(pattern, text)
  strip(text, chars)
  sub(pattern, replacement, expression: T)
  word(number, text, separator)
  expr % template: T
  T: interpret "string" or r"rawstring" as template
This patch series adjusts the rule as follows:
a. string literal, '' or "", starts template processing (BC)
b. raw string literal, r'' or r"", disables both \-escape and template
processing (BC, done by subsequent patches)
c. fragment not surrounded by {} is non-templated string
  "ccc{'aaa'}{r'bbb'}"
   ------------------    *: template
   ---                   c: string
        ---              a: template
                ---      b: rawstring
Because this can eliminate the compilation of template arguments from the
evaluation phase, "hg log -Tdefault" gets faster.
% cd mozilla-central
% LANG=C HGRCPATH=/dev/null hg log -Tdefault -r0:10000 --time > /dev/null
before: real 4.870 secs (user 4.860+0.000 sys 0.010+0.000)
after: real 3.480 secs (user 3.440+0.000 sys 0.030+0.000)
Also, this will allow us to parse nested templates at once for better error
indication.
The next patch will introduce buildtemplate function that should be defined
near runtemplate. But I don't want to insert it between buildmap and runmap.
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.
Instead, we change all of these one-liners to 4-liners. This was
executed with the following perlism:
find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1 $2 = $4\n$1else:\n$1 $2 = $5,' {} \;
I tweaked the following cases from the automatic Perl output:
prev = (parents and parents[0]) or nullid
port = (use_ssl and 443 or 80)
cwd = (pats and repo.getcwd()) or ''
rename = fctx and webutil.renamelink(fctx) or []
ctx = fctx and fctx or ctx
self.base = (mapfile and os.path.dirname(mapfile)) or ''
I also added some newlines wherever they seemed appropriate for readability.
There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
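One reason the idiom is obscure: `COND and Y or Z` silently picks Z whenever
Y is falsy. A small sketch with made-up function names:

```python
def port_and_or(use_ssl, ssl_port, plain_port):
    # Old idiom: gives plain_port when ssl_port is falsy (e.g. 0),
    # even though use_ssl is true.
    return use_ssl and ssl_port or plain_port


def port_explicit(use_ssl, ssl_port, plain_port):
    # The 4-line form has no such trap.
    if use_ssl:
        port = ssl_port
    else:
        port = plain_port
    return port
```

Both agree for (True, 443, 80), but for (True, 0, 80) the first returns 80
and the second returns 0.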
A style name should not contain "/", "\", "." and "..". Otherwise, templates
could be loaded from outside of the specified templates directory by invalid
?style= parameter. hgweb should not allow such requests.
This change means subdir/name is also rejected.
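A possible check, as a sketch only. It assumes "." and ".." are meant as
whole names rather than substrings, and is_safe_style() is a hypothetical
name, not the hgweb implementation.

```python
def is_safe_style(style):
    # Reject the current and parent directory names outright.
    if style in ('.', '..'):
        return False
    # Reject anything containing a path separator, which also rules
    # out "subdir/name".
    return '/' not in style and '\\' not in style
```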
Template functions use "yield"s assuming that the result will be combined
into a string, which means both "f -> str" and "f -> generator" should behave
in the same way.
Before this patch, piping generator function resulted in a cryptic error.
We had to insert "|stringify" in this case.
$ hg log --template '{if(author, author)|user}\n'
abort: template filter 'userfilter' is not compatible with keyword
'[(<function runsymbol at 0x7f5af2e8d8c0>, 'author'),
(<function runsymbol at 0x7f5af2e8d8c0>, 'author')]'
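The idea of treating "f -> str" and "f -> generator" alike can be sketched as
below. These are simplified stand-ins: stringify() and user() here are not
the real implementations, and apply_filter() is a hypothetical name.

```python
def stringify(thing):
    # Flatten nested generators or lists of fragments into one string.
    if isinstance(thing, str):
        return thing
    if hasattr(thing, '__iter__'):
        return ''.join(stringify(t) for t in thing)
    return str(thing)


def if_author(author):
    # A template function written with "yield".
    if author:
        yield author


def user(text):
    # Simplified stand-in for the "user" filter: the part before "@".
    return text.split('@', 1)[0]


def apply_filter(filt, value):
    # Stringify first so that a generator result can be piped to a
    # filter that expects a string.
    return filt(stringify(value))
```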
7678263f920c is fine for "{revset()}", but "i.values()[0]" does not work if
each item has more than one value such as "{bookmarks}".
This fixes the problem by using list.__contains__ or dict.__contains__
appropriately.
This change is intended to avoid exposing the implementation detail to
callers. I'm going to extend fullreposet to support "null" revision, so
these mfunc calls will have to use fullreposet() instead of spanset().
"pad" function and "rawstring" type were introduced in parallel, 89145c35f76e
in default and cd1b50e99ed8 in stable respectively. Therefore, "pad" function
lacked handling of "rawstring" unintentionally.
Before this patch, we had to quote integer literals to pass to template
functions. It was error-prone, so we should allow "word(0, x)" syntax.
Currently only decimal integers are allowed. It's easy to support 0x, 0b and 0
prefixes, but I don't think they are useful.
This patch assumes that template keywords and names defined in map files do
not start with digits, except for positional variables seen in the schemes
extension.
The next patch will introduce integer literals, but the schemes extension
expects that '{1}', '{2}', ... are interpreted as keywords. This patch allows
us to process '{foo(1)}' as 'func(integer)', whereas '{1}' as 'symbol'.
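One way to picture the distinction, as a hedged sketch. The node shapes and
the classify() helper are assumptions for illustration and may not match how
the parser actually separates the two cases.

```python
def classify(node, toplevel):
    """Decide how an ('integer', text) or ('symbol', name) node is read."""
    kind, value = node
    if kind == 'integer' and toplevel:
        # '{1}' keeps its meaning as a positional keyword, as expected
        # by the schemes extension.
        return ('symbol', value)
    if kind == 'integer':
        # Inside a function call, e.g. '{foo(1)}', it is a real integer.
        return ('integer', int(value))
    return node
```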