The ending semicolon was missing.
This was found by @bendyarm, while investigating a discrepancy between the Leo
parser in Rust and the Leo parser in ACL2: the latter was correctly following
the erroneous grammar rule; it will be changed to be consistent with the fixed
rule.
Now that the stdlib includes the type alias declaration
type string = [char; _];
in order for that type declaration to be legal and not "special", 'string' must
be an identifier.
Extending array dimensions for types to allow underscore has the unintended
effect of allowing that also for expressions. This is not necessarily an error,
as the static semantics can rule out underscores from array expressions.
However, it is a simple syntactic constraint that is best captured by the
grammar.
This commit differentiates array dimensions for types from array dimensions for
expressions, using each in the appropriate places.
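
As a rough illustration of the distinction (not the actual ABNF rules, and not
the Rust or ACL2 parser), the two notions of dimension can be sketched as two
separate recognizers. The names below are hypothetical, and the sketch assumes
a single dimension written as a decimal natural number or an underscore:

```python
import re

# Hypothetical, simplified stand-ins for the two grammar notions.
# Assumption: a single dimension is a decimal natural number, and only
# array *types* may additionally use an underscore (as in [char; _]).
NATURAL = r"[0-9]+"
TYPE_DIMENSION = re.compile(rf"(?:{NATURAL}|_)")
EXPRESSION_DIMENSION = re.compile(NATURAL)


def is_type_dimension(text: str) -> bool:
    """Accept a natural number or an underscore."""
    return TYPE_DIMENSION.fullmatch(text) is not None


def is_expression_dimension(text: str) -> bool:
    """Accept a natural number only; an underscore is rejected."""
    return EXPRESSION_DIMENSION.fullmatch(text) is not None
```

With two recognizers, the underscore is rejected for expressions at the
syntactic level, so the static semantics does not need a separate check.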
Replace 'circuit-or-alias-type' with 'identifier-or-self-type'.
This makes the nomenclature for types clearer and more extensible:
- A type may be an identifier, which may be the name of a circuit type or the
name of a type alias.
- In the future, an identifier used as a type could refer to other kinds of
types that we may add, such as enumerations.
- While both 'alias type' and 'type alias' could be acceptable terms, it seems
best to standardize on 'type alias': the latter describes an alias of a type
(which is the right concept), while the former suggests a type of the "alias"
kind (cf. 'circuit type', 'field type', 'integer type', etc.). Type aliases
are not another kind of type like the others: they are aliases of (any of)
those kinds of types. So by not having 'circuit-or-alias-type' we avoid
suggesting a notion of 'alias type'.
This does not change the language described by the grammar; it merely changes
some nomenclature in the grammar. Thus, no change to the parser is
needed. Aligning the nomenclature in the abstract syntax and parser to the ABNF
would be good, but entirely optional at this point.
These were removed in a previous commit, because they are already classified as
(boolean) literals, but they were accidentally re-introduced, presumably due to
the type alias RFC referencing the previous version of the keyword grammar rule.
Since, with the introduction of (ASCII and Unicode escapes for) characters, we
now have digits in base 10, 8, and 16, it seems worth being more explicit in the
naming of decimal digits in the grammar.
Just a nonterminal name change, not a structural change to the grammar.
Move the rule for 'digit' just before the ones for 'octal-digit' and
'hexadecimal-digit'.
Update the comments accordingly.
No real change to the grammar here.
Before this commit, 'true' and 'false' were both keywords and boolean literals,
making the grammar slightly more ambiguous than it needs to be. This manifests
in the formal specification of lexing and parsing, which would need an
additional extra-grammatical predicate requiring 'true' and 'false' to be
boolean literals and not keywords. By making 'true' and 'false' just boolean
literals, we obviate that need.
Extend the comment about identifiers and package names not being keywords or
aleo1... to also exclude boolean literals.
This is similar to the grammar of Java, in which 'true', 'false', and 'null' are
not keywords.
This does not necessitate any change to the lexer/parser, which already performs
its own procedural disambiguation of this and other aspects of the grammar.
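
A minimal sketch of why disjoint categories remove the need for an extra
predicate. This is not the Leo lexer: the keyword set is a small illustrative
subset, the function name is hypothetical, and the input is assumed to already
have the lexical shape of an identifier:

```python
# Illustrative subset only; the real keyword list lives in the grammar.
KEYWORDS = frozenset({"circuit", "function", "let", "return", "type"})
BOOLEAN_LITERALS = frozenset({"true", "false"})


def classify(word: str) -> str:
    """Classify a word that already has the lexical shape of an identifier.

    Because the two sets are disjoint, the order of the checks does not
    matter and no additional disambiguation rule is needed.
    """
    if word in BOOLEAN_LITERALS:
        return "boolean-literal"
    if word in KEYWORDS:
        return "keyword"
    return "identifier"
```

For example, classify("true") yields "boolean-literal", while
classify("string") yields "identifier", in line with 'string' being an
ordinary identifier rather than a special type name.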
Add to the rule for package names, as a comment, the same exclusion added to the
rule for identifiers. Also add a few lines to describe it.
Also discuss the disambiguation of identifiers and package names.