The ending semicolon was missing.
This was found by @bendyarm, while investigating a discrepancy between the Leo
parser in Rust and the Leo parser in ACL2: the latter was correctly following
the erroneous grammar rule; it will be changed to be consistent with the fixed
rule.
The rule failed to allow expressions like circuit C {}, i.e. with no member
variables. This has been fixed by making the (non-empty) list of circuit inline
elements optional as a whole.
Now that the stdlib includes the type alias declaration
type string = [char; _];
in order for that type declaration to be legal and not "special", 'string' must
be an identifier.
Extending array dimensions for types to allow underscore has the unintended
effect of allowing that also for expressions. This is not necessarily an error,
as the static semantics can rule out underscores from array expressions.
However, it is a simple syntactic constraint that is best captured by the
grammar.
This commit differentiates array dimensions for types and array dimensions for
expressions, using them in the appropriate places.
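
As a rough illustration of the distinction (a sketch only, not the Leo parser:
the function names and the simplification to a single dimension are
assumptions made here), the type form accepts an underscore while the
expression form does not:

```python
import re

# Hypothetical, simplified model: a single array dimension is a natural
# number, or (for types only) an underscore. Tuples of dimensions are
# deliberately left out of this sketch.
_NATURAL = re.compile(r"[0-9]+")


def is_type_dimension(text: str) -> bool:
    """Dimension as used in an array type: a natural number or '_'."""
    return text == "_" or _NATURAL.fullmatch(text) is not None


def is_expression_dimension(text: str) -> bool:
    """Dimension as used in an array expression: a natural number only."""
    return _NATURAL.fullmatch(text) is not None
```

With two separate checks, the underscore is rejected syntactically in
expressions instead of being left to the static semantics.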
Replace 'circuit-or-alias-type' with 'identifier-or-self-type'.
This makes the nomenclature for types clearer and more extensible:
- A type may be an identifier, which may be the name of a circuit type or the
name of a type alias.
- In the future, an identifier used as a type could refer to other kinds of
types that we may add, such as enumerations.
- While both 'alias type' and 'type alias' could be acceptable terms, it seems
best to standardize on 'type alias': the latter describes an alias of a type
(which is the right concept), while the former suggests a type of the "alias"
kind (cf. 'circuit type', 'field type', 'integer type', etc.). Type aliases
are not another kind of type like the others: they are aliases of (any of)
those kinds of types. So by not having 'circuit-or-alias-type' we avoid
suggesting a notion of 'alias type'.
This does not change the language described by the grammar; it merely changes
some nomenclature in the grammar. Thus, no change to the parser is
needed. Aligning the nomenclature in the abstract syntax and parser to the ABNF
would be good, but entirely optional at this point.
These were removed in a previous commit, because they are already classified as
(boolean) literals, but they were accidentally re-introduced, presumably due to
the type alias RFC referencing the previous version of the keyword grammar rule.
Since, with the introduction of (ASCII and Unicode escapes for) characters, we
now have digits in base 10, 8, and 16, it seems worth being more explicit in the
naming of decimal digits in the grammar.
Just a nonterminal name change, not a structural change to the grammar.
Move the rule for 'digit' just before the ones for 'octal-digit' and
'hexadecimal-digit'.
Update the comments accordingly.
No real change to the grammar here.
Before this commit, 'true' and 'false' were both keywords and boolean literals,
making the grammar slightly more ambiguous than it needs to be. This manifests
in the formal specification of lexing and parsing, which would need an
additional extra-grammatical predicate requiring 'true' and 'false' to be
boolean literals and not keywords. By making 'true' and 'false' just boolean
literals, we obviate that need.
Extend the comment about identifiers and package names not being keywords or
aleo1... to also exclude boolean literals.
This is similar to the grammar of Java, in which 'true', 'false', and 'null' are
not keywords.
This does not necessitate any change to the lexer/parser, which already performs
its own procedural disambiguation of this and other aspects of the grammar.
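
A minimal sketch of why disjoint token classes remove the need for an extra
predicate (the keyword set below is a small illustrative subset, not the full
Leo keyword list, and classify_word is a hypothetical helper that assumes its
input already has the lexical shape of an identifier):

```python
# Illustrative subset only; the real keyword rule lists more keywords.
KEYWORDS = {"circuit", "const", "function", "import", "let"}
BOOLEAN_LITERALS = {"true", "false"}

# With 'true' and 'false' removed from the keywords, the two classes do not
# overlap, so each word belongs to at most one of them.
assert KEYWORDS.isdisjoint(BOOLEAN_LITERALS)


def classify_word(word: str) -> str:
    """Classify a word that already matches the identifier shape."""
    if word in BOOLEAN_LITERALS:
        return "boolean-literal"
    if word in KEYWORDS:
        return "keyword"
    return "identifier"
```

Because the sets are disjoint, the order of the two membership tests does not
affect the result.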
Add to the rule for package names, as a comment, the same exclusion added to the
rule for identifiers. Also add a few lines to describe it.
Also discuss the disambiguation of identifiers and package names.
Given that we have string literals now, there is no need for the special notion
of format strings. Some other grammar rules go away as they were only involved
in the definition of format strings.
The well-formedness of containers in format strings is now delegated to the
static semantics of Leo: at the grammar level, any string literal is accepted
in console print calls.
Explain the new syntax for circuit member variables.
Explain the tighter syntax for import declarations.
Keep lines to 80 columns max, so that they fit well in the figures in the LaTeX
document.
Now the comment says that an identifier must not only be distinct from a
keyword, but also not be or start with 'aleo1'. Even though the grammar does not
capture these extra-grammatical requirements, we use comments to at least
mention them prominently.
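
The extra-grammatical requirement described by the comment can be sketched as
a predicate (an illustration under assumptions: SAMPLE_KEYWORDS is a small
made-up subset of the keywords, and is_allowed_identifier is a hypothetical
name, not a function of the Leo implementation):

```python
# Illustrative subset only; not the full Leo keyword list.
SAMPLE_KEYWORDS = {"circuit", "function", "import", "let"}


def is_allowed_identifier(name: str) -> bool:
    """Check the two requirements stated in the grammar comment.

    The name is assumed to already match the identifier rule itself; this
    only checks that it is not a keyword and does not start with 'aleo1'
    (which also covers the case where it is exactly 'aleo1').
    """
    return name not in SAMPLE_KEYWORDS and not name.startswith("aleo1")
```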
Explain better variable and constant declarations.
Leave one blank line between the rules for variable and constant declarations
(not necessary for ABNF, but consistent with the rest of the file and actually
expected by the ABNF-to-LaTeX converter).
Limit lines to 80 columns, by putting the rules for variable and constant
declarations over two lines each, with proper indentation.
By using markdown in the documentation comments of the grammar, the markdown
file generated from the grammar includes those markdown features in the text,
making it more readable and better-looking.
Also fixed a few typos in the documentation comments.
Also updated a few documentation comments that were out of date after making
changes to the grammar.
Also removed a now-obsolete grammar rule for "input" parameters of functions.