In input and output files, literals are actually insufficient to represent the
most negative values of the signed integer types.
Given that we are likely to need richer forms of input and output expressions
anyways, at this point it seems best to allow any expression at the grammatical
level, and add static semantic requirements on the allowed expressions. As done
with all the other requirements, these will be stated in the Leo Reference,
formalized in ACL2, and enforced in the Leo compiler.
This is for the current simplified version of Leo. When the main function is
executed, it returns a value. According to this grammar, the output (i.e.
`.out`) file contains a single literal, which describes the output value,
preceded by an `[output]` title. This may be extended in the future.
Since we only allow four kinds of input section titles, corresponding to the
public/private/constant/const characterization of function inputs, it seems
beneficial to put this into the grammar, where it is easily captured.
Note that the previous version of the rule, which uses `identifier`, is not
quite right, because, for example, `public` is not an identiifer (it is a
keyword). So the rule would have to be modified anyways.
Given that the Leo Reference explains well that the format string grammar
applies to the character sequence after processing escapes, we do not need to
say this here. Instead, we just explain the rule similarly to what we do for
others.
This mentions embeddings and isolates besides overrides, in accordance with the
official terminology at https://www.unicode.org/reports/tr9/.
This update was discussed and agreed with @bendyarm.
Note: The Leo Reference includes text explaining these exclusions, with links to
the document above as well to the web site and paper about Trojan source
exploits.
This is the (sub)grammar for input files. It is an initial draft, written based
on the Notion page 'Leo Input File Doc/Spec'. This should be compared with the
currently implemented parser of input (i.e. .in) files.
As the Leo Reference will describe (that part has not been written yet), the
input grammar is based on the lexical grammar, i.e. an input file consists of
tokens, comments, and whitespace. However, only some tokens (compared to the
syntactic grammar for Leo code files) are used, namely the ones reachable from
the `input-file` nonterminal.
Currently (i.e. im this initial version of Leo) `input-type` is (any) `type` and
`input-expression` is just a `literal`, but these may evolve as we extend the
language (e.g. we'll probably disallow circuit types and allow tuple and array
constructions). The intent is that `input-type` will be a subset of `type` and
that `input-expression` will be a subset of `expression`.
This does not change the language. It just adds a rule to name binary
expressions explicitly. This makes the relation with ternary expressions
clearer, and as usual it explicates more terminology.
* [ABNF] Add a rule for function calls.
This does not change the language. It just slightly reformulates the grammar for
greater clarity and to help establish a nomenclature for constructs.
Also remove a trailing space.
* [ABNF] Re-generate markdown.
Co-authored-by: collin <16715212+collinc97@users.noreply.github.com>
This mirrors the structure of the rules for types, where there is an
`integer-type` consisting of `unsigned-type` and `signed-type`.
There is no change to the language.
In the currently restricted version of Leo, this is necessary for the numerals
in affine group literals to be tokens.
No change necessary to the lexer/parser, which already handle this properly.
Expression statements were removed from the rule for statements, but the rule
for expression statements itself had not been removed. This commit fixes that.