mirror of
https://github.com/ProvableHQ/leo.git
synced 2024-12-27 04:01:47 +03:00
Merge pull request #835 from AleoHQ/feature/abnf-doc-improvements
[ABNF] Improve the documentation of the grammar and fix a rule
This commit is contained in:
commit
68419f4e10
@ -22,7 +22,7 @@ but its lines will always be terminated by CR LF on any system.
|
|||||||
|
|
||||||
Note that this CR LF requirement only applies to the grammar files themselves.
|
Note that this CR LF requirement only applies to the grammar files themselves.
|
||||||
It does not apply to the lines of the languages described by the grammar.
|
It does not apply to the lines of the languages described by the grammar.
|
||||||
ABNF grammar may describe any kinds of languages,
|
ABNF grammars may describe any kind of languages,
|
||||||
with any kind of line terminators,
|
with any kind of line terminators,
|
||||||
or even without line terminators at all (e.g. for "binary" languages).
|
or even without line terminators at all (e.g. for "binary" languages).
|
||||||
|
|
||||||
@ -33,29 +33,23 @@ or even without line terminators at all (e.g. for "binary" languages).
|
|||||||
Introduction
|
Introduction
|
||||||
------------
|
------------
|
||||||
|
|
||||||
This file contains an initial draft of
|
This file contains an ABNF (Augmented Backus-Naur Form) grammar of Leo.
|
||||||
a complete ABNF (Augmented Backus-Naur Form) grammar of Leo.
|
|
||||||
Background on ABNF is provided later in this file.
|
Background on ABNF is provided later in this file.
|
||||||
|
|
||||||
The initial motivation for creating an ABNF grammar of Leo was that
|
This grammar provides an official definition of the syntax of Leo
|
||||||
we have a formally verified parser of ABNF grammars
|
that is both human-readable and machine-readable.
|
||||||
|
It will be part of an upcoming Leo language reference.
|
||||||
|
It may also be used to generate parser tests at some point.
|
||||||
|
|
||||||
|
We are also using this grammar
|
||||||
|
as part of a mathematical formalization of the Leo language,
|
||||||
|
which we are developing in the ACL2 theorem prover
|
||||||
|
and which we plan to publish at some point.
|
||||||
|
In particular, we have used a formally verified parser of ABNF grammars
|
||||||
(at https://github.com/acl2/acl2/tree/master/books/kestrel/abnf;
|
(at https://github.com/acl2/acl2/tree/master/books/kestrel/abnf;
|
||||||
also see the paper at https://www.kestrel.edu/people/coglio/vstte18.pdf)
|
also see the paper at https://www.kestrel.edu/people/coglio/vstte18.pdf)
|
||||||
which we have used to parse this ABNF grammar of Leo
|
to parse this grammar into a formal representation of the Leo concrete syntax
|
||||||
into a formal representation, in the ACL2 theorem prover,
|
and to validate that the grammar satisfies certain consistency properties.
|
||||||
of the Leo concrete syntax.
|
|
||||||
The parsing of this grammar file into an ACL2 representation
|
|
||||||
happens every time the ACL2 formalization of Leo is built.
|
|
||||||
|
|
||||||
In addition to that initial motivation,
|
|
||||||
this ABNF grammar has now the additional and primary purpose of
|
|
||||||
providing an official definition of the syntax of Leo
|
|
||||||
that is both human-readable and machine-readable.
|
|
||||||
This grammar will be part of the (upcoming) Leo language reference,
|
|
||||||
of the Leo Language Formal Specification
|
|
||||||
(i.e. the LaTeX document in the leo-semantics repo),
|
|
||||||
and of the ACL2 formalization of Leo (which was the initial motivation).
|
|
||||||
It has also been suggested that it may be used to generate tests.
|
|
||||||
|
|
||||||
|
|
||||||
--------
|
--------
|
||||||
@ -65,13 +59,10 @@ Background on ABNF
|
|||||||
------------------
|
------------------
|
||||||
|
|
||||||
ABNF is an Internet standard:
|
ABNF is an Internet standard:
|
||||||
see https://www.rfc-editor.org/info/rfc5234
|
see RFC 5234 at https://www.rfc-editor.org/info/rfc5234
|
||||||
and https://www.rfc-editor.org/info/rfc7405.
|
and RFC 7405 at https://www.rfc-editor.org/info/rfc7405.
|
||||||
It is used to specify the syntax of JSON, HTTP, and other standards.
|
It is used to specify the syntax of JSON, HTTP, and other standards.
|
||||||
|
|
||||||
The following is copied (and "un-LaTeX'd") from the aforementioned paper
|
|
||||||
(at https://www.kestrel.edu/people/coglio/vstte18.pdf).
|
|
||||||
|
|
||||||
ABNF adds conveniences and makes slight modifications
|
ABNF adds conveniences and makes slight modifications
|
||||||
to Backus-Naur Form (BNF),
|
to Backus-Naur Form (BNF),
|
||||||
without going beyond context-free grammars.
|
without going beyond context-free grammars.
|
||||||
@ -88,15 +79,15 @@ ABNF uses only natural numbers as terminals,
|
|||||||
and denotes them via:
|
and denotes them via:
|
||||||
(i) binary, decimal, or hexadecimal sequences,
|
(i) binary, decimal, or hexadecimal sequences,
|
||||||
e.g. %b1.11.1010, %d1.3.10, and %x.1.3.A
|
e.g. %b1.11.1010, %d1.3.10, and %x.1.3.A
|
||||||
all denote the string '1 3 10';
|
all denote the sequence of terminals '1 3 10';
|
||||||
(ii) binary, decimal, or hexadecimal ranges,
|
(ii) binary, decimal, or hexadecimal ranges,
|
||||||
e.g. %x30-39 denotes any string 'n' with 48 <= n <= 57
|
e.g. %x30-39 denotes any singleton sequence of terminals
|
||||||
(an ASCII digit);
|
'n' with 48 <= n <= 57 (an ASCII digit);
|
||||||
(iii) case-sensitive ASCII strings,
|
(iii) case-sensitive ASCII strings,
|
||||||
e.g. "Ab" denotes the string '65 98';
|
e.g. "Ab" denotes the sequence of terminals '65 98';
|
||||||
and (iv) case-insensitive ASCII strings,
|
and (iv) case-insensitive ASCII strings,
|
||||||
e.g. %i"ab", or just "ab", denotes
|
e.g. %i"ab", or just "ab", denotes
|
||||||
any string among
|
any sequence of terminals among
|
||||||
'65 66',
|
'65 66',
|
||||||
'65 98',
|
'65 98',
|
||||||
'97 66', and
|
'97 66', and
|
||||||
@ -120,7 +111,7 @@ Instead of BNF's |, ABNF uses / to separate alternatives.
|
|||||||
Repetition prefixes have precedence over juxtapositions,
|
Repetition prefixes have precedence over juxtapositions,
|
||||||
which have precedence over /.
|
which have precedence over /.
|
||||||
Round brackets group things and override the aforementioned precedence rules,
|
Round brackets group things and override the aforementioned precedence rules,
|
||||||
e.g. *(WSP / CRLF WSP) denotes strings
|
e.g. *(WSP / CRLF WSP) denotes sequences of terminals
|
||||||
obtained by repeating, zero or more times,
|
obtained by repeating, zero or more times,
|
||||||
either (i) a WSP or (ii) a CRLF followed by a WSP.
|
either (i) a WSP or (ii) a CRLF followed by a WSP.
|
||||||
Square brackets also group things but make them optional,
|
Square brackets also group things but make them optional,
|
||||||
@ -149,49 +140,42 @@ and comments (semicolons to line endings).
|
|||||||
Structure
|
Structure
|
||||||
---------
|
---------
|
||||||
|
|
||||||
Language specifications often define the syntax of their languages via
|
This ABNF grammar consists of two (sub-)grammars:
|
||||||
(i) a lexical grammar that describes how
|
(i) a lexical grammar that describes how
|
||||||
sequence of characters are parsed into tokens, and
|
sequence of characters are parsed into tokens, and
|
||||||
(ii) a syntactic grammar that described how
|
(ii) a syntactic grammar that described how
|
||||||
tokens are parsed into expressions, statements, etc.
|
tokens are parsed into expressions, statements, etc.
|
||||||
(The adjectives 'lexical' and 'syntactic' are
|
The adjectives 'lexical' and 'syntactic' are
|
||||||
the ones used in the Java language specification;
|
the same ones used in the Java language reference,
|
||||||
other terms may be used by other languages,
|
for instance;
|
||||||
but the essence is similar.)
|
alternative terms may be used in other languages,
|
||||||
The situation is sometimes more complex,
|
but the separation into these two components is quite common
|
||||||
with multiple passes (e.g. Unicode escape processing in Java),
|
(the situation is sometimes a bit more complex, with multiple passes,
|
||||||
but the division between lexical and syntactic (in the sense above) stands.
|
e.g. Unicode escape processing in Java).
|
||||||
|
|
||||||
In the aforementioned language specifications,
|
This separation enables
|
||||||
both the lexical and syntactic grammars
|
|
||||||
are normally written in a context-free grammar notation,
|
|
||||||
augmented with natural language that may assert, for instance,
|
|
||||||
that tokenization always takes the longest sequence that constitutes a token.
|
|
||||||
|
|
||||||
This dual structure appears to be motivated by the fact that
|
|
||||||
concerns of white space, line endings, etc.
|
concerns of white space, line endings, etc.
|
||||||
can be handled by the lexical grammar,
|
to be handled by the lexical grammar,
|
||||||
allowing the syntactic grammar to focus on the more important structure.
|
with the syntactic grammar focused on the more important structure.
|
||||||
Handling both aspects in a single context-free grammar may be unwieldy,
|
Handling both aspects in a single grammar may be unwieldy,
|
||||||
so having two grammars provides more clarity and readability.
|
so having two grammars provides more clarity and readability.
|
||||||
|
|
||||||
In contrast, PEG (Parsing Expression Grammar) formalisms like Pest
|
ABNF is a context-free grammar notation, with no procedural interpretation.
|
||||||
naturally embody a procedural interpretation
|
The two grammars conceptually define two subsequent processing phases,
|
||||||
that can handle white space and tokenization in just one manageable grammar.
|
|
||||||
However, this procedural interpretaion may be sometimes
|
|
||||||
less clear and readable to humans than context-free rules.
|
|
||||||
Indeed, context-free grammar are commonly used to documentat languages.
|
|
||||||
|
|
||||||
ABNF is a context-free grammar notation,
|
|
||||||
with no procedural interpretation,
|
|
||||||
and therefore it makes sense to define
|
|
||||||
separate lexical and syntactic ABNF grammars for Leo.
|
|
||||||
Conceptually, the two grammars define two subsequent processing phases,
|
|
||||||
as detailed below.
|
as detailed below.
|
||||||
However, a parser implementation does not need to perform
|
However, a parser implementation does not need to perform
|
||||||
two strictly separate phases (in fact, it typically does not),
|
two strictly separate phases (in fact, it typically does not),
|
||||||
so long as it produces the same final result.
|
so long as it produces the same final result.
|
||||||
|
|
||||||
|
The grammar is accompanied by some extra-grammatical requirements,
|
||||||
|
which are not conveniently expressible in a context-free grammar like ABNF.
|
||||||
|
These requirements are needed to make the grammar unambiguous,
|
||||||
|
i.e. to ensure that, for each sequence of terminals,
|
||||||
|
there is exactly one parse tree for that sequence terminals
|
||||||
|
that satisfies not only the grammar rules
|
||||||
|
but also the extra-grammatical requirements.
|
||||||
|
These requirements are expressed as comments in this file.
|
||||||
|
|
||||||
|
|
||||||
--------
|
--------
|
||||||
|
|
||||||
@ -225,8 +209,9 @@ additive-expression =
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
this rule tells us that the additive operators '+' and '-' have
|
These rules tell us
|
||||||
lower precedence than the multiplicative operators '*' and '/',
|
that the additive operators '+' and '-' have lower precedence
|
||||||
|
than the multiplicative operators '*' and '/',
|
||||||
and that both the additive and multiplicative operators associate to the left.
|
and that both the additive and multiplicative operators associate to the left.
|
||||||
This may be best understood via the examples given below.
|
This may be best understood via the examples given below.
|
||||||
|
|
||||||
@ -319,14 +304,14 @@ as second sub-expression, as it would in the second tree above.
|
|||||||
Naming Convention
|
Naming Convention
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
For this ABNF grammar, we choose nonterminal names
|
This ABNF grammar uses nonterminal names
|
||||||
that consist of complete English words, separated by dashes,
|
that consist of complete English words, separated by dashes,
|
||||||
and that describe the construct the way it is in English.
|
and that describe the construct the way it is in English.
|
||||||
For instance, we use the name 'conditional-statement'
|
For instance, we use the name 'conditional-statement'
|
||||||
to describe conditional statements.
|
to describe conditional statements.
|
||||||
|
|
||||||
At the same time, we attempt to establish
|
At the same time, this grammar establishes
|
||||||
a precise and "official" nomenclature for the Leo constructs,
|
a precise and official nomenclature for the Leo constructs,
|
||||||
by way of the nonterminal names that define their syntax.
|
by way of the nonterminal names that define their syntax.
|
||||||
For instance, the rule
|
For instance, the rule
|
||||||
|
|
||||||
@ -350,7 +335,7 @@ The only exception to the nomenclature-establishing role of the grammar
|
|||||||
is the fact that, as discussed above,
|
is the fact that, as discussed above,
|
||||||
we write the grammar rules in a way that determines
|
we write the grammar rules in a way that determines
|
||||||
the relative precedence and the associativity of expression operators,
|
the relative precedence and the associativity of expression operators,
|
||||||
and that therefore we have rules like
|
and therefore we have rules like
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -363,12 +348,14 @@ unary-expression = primary-expression
|
|||||||
|
|
||||||
|
|
||||||
In order to allow the recursion of the rule to stop,
|
In order to allow the recursion of the rule to stop,
|
||||||
we need to regard, in the grammar, a primary expression as a unary expression.
|
we need to regard, in the grammar, a primary expression as a unary expression
|
||||||
|
(i.e. a primary expression is also a unary expression in the grammar;
|
||||||
|
but note that the opposite is not true).
|
||||||
However, this is just a grammatical artifact:
|
However, this is just a grammatical artifact:
|
||||||
ontologically, a primary expression is not really a unary expression,
|
ontologically, a primary expression is not really a unary expression,
|
||||||
because a unary expression is one that consists of
|
because a unary expression is one that consists of
|
||||||
a unary operator and an operand sub-expression.
|
a unary operator and an operand sub-expression.
|
||||||
These terminological "exceptions" should be easy to identify in the rules.
|
These terminological exceptions should be easy to identify in the rules.
|
||||||
|
|
||||||
|
|
||||||
--------
|
--------
|
||||||
@ -382,11 +369,12 @@ represented as Unicode code points,
|
|||||||
which are numbers in the range form 0 to 10FFFFh.
|
which are numbers in the range form 0 to 10FFFFh.
|
||||||
These are captured by the ABNF rule 'character' below.
|
These are captured by the ABNF rule 'character' below.
|
||||||
|
|
||||||
The lexical grammar defines how, conceptually,
|
The lexical grammar defines how, at least conceptually,
|
||||||
the sequence of characters is turned into
|
the sequence of characters is turned into
|
||||||
a sequence of tokens, comments, and whitespaces.
|
a sequence of tokens, comments, and whitespaces:
|
||||||
|
these entities are all defined by the grammar rules below.
|
||||||
|
|
||||||
As stated, the lexical grammar is ambiguous.
|
As stated, the lexical grammar alone is ambiguous.
|
||||||
For example, the sequence of characters '**' (i.e. two stars)
|
For example, the sequence of characters '**' (i.e. two stars)
|
||||||
could be equally parsed as two '*' symbol tokens or one '**' symbol token
|
could be equally parsed as two '*' symbol tokens or one '**' symbol token
|
||||||
(see rule for 'symbol' below).
|
(see rule for 'symbol' below).
|
||||||
@ -401,10 +389,6 @@ the extra-grammatical requirement that
|
|||||||
the longest possible sequence of characters is always parsed.
|
the longest possible sequence of characters is always parsed.
|
||||||
This way, '**' must be parsed as one '**' symbol token,
|
This way, '**' must be parsed as one '**' symbol token,
|
||||||
and '<CR><LF>' must be parsed as one line terminator.
|
and '<CR><LF>' must be parsed as one line terminator.
|
||||||
(We should formalize this requirement,
|
|
||||||
along with the other extra-grammatical requirements given below,
|
|
||||||
and formally prove that they indeed make
|
|
||||||
the lexical grammar of Leo unambiguous.)
|
|
||||||
|
|
||||||
As mentioned above, a character is any Unicode code point.
|
As mentioned above, a character is any Unicode code point.
|
||||||
This grammar does not say how those are encoded in files (e.g. UTF-8):
|
This grammar does not say how those are encoded in files (e.g. UTF-8):
|
||||||
@ -473,14 +457,15 @@ a single carriage return,
|
|||||||
a line feed,
|
a line feed,
|
||||||
or a carriage return immediately followed by a line feed.
|
or a carriage return immediately followed by a line feed.
|
||||||
Note that the latter combination constitutes a single line terminator,
|
Note that the latter combination constitutes a single line terminator,
|
||||||
according to the extra-grammatical rule of the longest sequence.
|
according to the extra-grammatical requirement of the longest sequence,
|
||||||
|
described above.
|
||||||
|
|
||||||
<a name="newline"></a>
|
<a name="newline"></a>
|
||||||
```abnf
|
```abnf
|
||||||
newline = line-feed / carriage-return / carriage-return line-feed
|
newline = line-feed / carriage-return / carriage-return line-feed
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[carriage-return](#user-content-carriage-return), [line-feed](#user-content-line-feed)_;
|
Go to: _[line-feed](#user-content-line-feed), [carriage-return](#user-content-carriage-return)_;
|
||||||
|
|
||||||
|
|
||||||
Line terminators form whitespace, along with spaces and horizontal tabs.
|
Line terminators form whitespace, along with spaces and horizontal tabs.
|
||||||
@ -490,18 +475,18 @@ Line terminators form whitespace, along with spaces and horizontal tabs.
|
|||||||
whitespace = space / horizontal-tab / newline
|
whitespace = space / horizontal-tab / newline
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[space](#user-content-space), [newline](#user-content-newline), [horizontal-tab](#user-content-horizontal-tab)_;
|
Go to: _[newline](#user-content-newline), [horizontal-tab](#user-content-horizontal-tab), [space](#user-content-space)_;
|
||||||
|
|
||||||
|
|
||||||
There are two kinds of comments in Leo, as in other languages.
|
There are two kinds of comments in Leo, as in other languages.
|
||||||
One is block comments of the form '/* ... */',
|
One is block comments of the form '/* ... */',
|
||||||
and the other is end-of-line comments '// ...'.
|
and the other is end-of-line comments of the form '// ...'.
|
||||||
The first kind start at '/*' and end at the first '*/',
|
The first kind start at '/*' and end at the first '*/',
|
||||||
possibly spanning multiple (partial) lines;
|
possibly spanning multiple (partial) lines;
|
||||||
they do no nest.
|
these do no nest.
|
||||||
The second kind start at '//' and extend till the end of the line.
|
The second kind start at '//' and extend till the end of the line.
|
||||||
The rules about comments given below are similar to
|
The rules about comments given below are similar to
|
||||||
the ones used in the Java language specification.
|
the ones used in the Java language reference.
|
||||||
|
|
||||||
<a name="comment"></a>
|
<a name="comment"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -525,7 +510,7 @@ rest-of-block-comment = "*" rest-of-block-comment-after-star
|
|||||||
/ not-star rest-of-block-comment
|
/ not-star rest-of-block-comment
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[rest-of-block-comment](#user-content-rest-of-block-comment), [rest-of-block-comment-after-star](#user-content-rest-of-block-comment-after-star), [not-star](#user-content-not-star)_;
|
Go to: _[rest-of-block-comment](#user-content-rest-of-block-comment), [not-star](#user-content-not-star), [rest-of-block-comment-after-star](#user-content-rest-of-block-comment-after-star)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="rest-of-block-comment-after-star"></a>
|
<a name="rest-of-block-comment-after-star"></a>
|
||||||
@ -643,9 +628,9 @@ Thus an address always consists of 63 characters.
|
|||||||
address = "aleo1" 58( lowercase-letter / digit )
|
address = "aleo1" 58( lowercase-letter / digit )
|
||||||
```
|
```
|
||||||
|
|
||||||
A format string is a sequence of characters, other than double quote,
|
A formatted string is a sequence of characters, other than double quote,
|
||||||
surrounded by double quotes.
|
surrounded by double quotes.
|
||||||
Within a format string, substrings '{}' are distinguished as containers
|
Within a formatted string, sub-strings '{}' are distinguished as containers
|
||||||
(these are the ones that may be matched with values
|
(these are the ones that may be matched with values
|
||||||
whose textual representation replaces the containers
|
whose textual representation replaces the containers
|
||||||
in the printed string).
|
in the printed string).
|
||||||
@ -671,8 +656,8 @@ Go to: _[double-quote](#user-content-double-quote)_;
|
|||||||
|
|
||||||
|
|
||||||
Here is (part of this ABNF comment),
|
Here is (part of this ABNF comment),
|
||||||
an alternative way to specify format strings,
|
an alternative way to specify formatted strings,
|
||||||
which captures the extra-grammatical requirement above in the grammar
|
which captures the extra-grammatical requirement above in the grammar,
|
||||||
but is more complicated:
|
but is more complicated:
|
||||||
|
|
||||||
|
|
||||||
@ -704,11 +689,9 @@ formatted-string = double-quote *formatted-string-element double-quote
|
|||||||
|
|
||||||
|
|
||||||
It is not immediately clear which approach is better; there are tradeoffs.
|
It is not immediately clear which approach is better; there are tradeoffs.
|
||||||
Regardless, we should choose one eventually.
|
We may choose to adopt this one in future revisions of the grammar.
|
||||||
|
|
||||||
Annotations are built out of names and arguments, which are tokens.
|
Annotations have names, which are identifiers immediately preceded by '@'.
|
||||||
Two names are currently supported.
|
|
||||||
An argument is a sequence of one or more letters, digits, and underscores.
|
|
||||||
|
|
||||||
<a name="annotation-name"></a>
|
<a name="annotation-name"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -719,7 +702,7 @@ Go to: _[identifier](#user-content-identifier)_;
|
|||||||
|
|
||||||
|
|
||||||
A natural (number) is a sequence of one or more digits.
|
A natural (number) is a sequence of one or more digits.
|
||||||
Note that we allow leading zeros, e.g. '007'.
|
We allow leading zeros, e.g. '007'.
|
||||||
|
|
||||||
<a name="natural"></a>
|
<a name="natural"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -727,7 +710,7 @@ natural = 1*digit
|
|||||||
```
|
```
|
||||||
|
|
||||||
An integer (number) is either a natural or its negation.
|
An integer (number) is either a natural or its negation.
|
||||||
Note that we also allow leading zeros in negative numbers, e.g. '-007'.
|
We allow leading zeros also in negative numbers, e.g. '-007'.
|
||||||
|
|
||||||
<a name="integer"></a>
|
<a name="integer"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -809,7 +792,9 @@ address-literal = "address" "(" address ")"
|
|||||||
Go to: _[address](#user-content-address)_;
|
Go to: _[address](#user-content-address)_;
|
||||||
|
|
||||||
|
|
||||||
The ones above are all the literals, as defined by the following rule.
|
The ones above are all the atomic literals
|
||||||
|
(in the sense that they are tokens, without whitespace allowed in them),
|
||||||
|
as defined by the following rule.
|
||||||
|
|
||||||
<a name="atomic-literal"></a>
|
<a name="atomic-literal"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -822,21 +807,21 @@ atomic-literal = untyped-literal
|
|||||||
/ address-literal
|
/ address-literal
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[address-literal](#user-content-address-literal), [untyped-literal](#user-content-untyped-literal), [field-literal](#user-content-field-literal), [boolean-literal](#user-content-boolean-literal), [unsigned-literal](#user-content-unsigned-literal), [product-group-literal](#user-content-product-group-literal), [signed-literal](#user-content-signed-literal)_;
|
Go to: _[product-group-literal](#user-content-product-group-literal), [unsigned-literal](#user-content-unsigned-literal), [address-literal](#user-content-address-literal), [boolean-literal](#user-content-boolean-literal), [untyped-literal](#user-content-untyped-literal), [signed-literal](#user-content-signed-literal), [field-literal](#user-content-field-literal)_;
|
||||||
|
|
||||||
|
|
||||||
After defining the (mostly) alphanumeric tokens above,
|
After defining the (mostly) alphanumeric tokens above,
|
||||||
it remains to define tokens for non-alphanumeric symbols such as "+" and "(".
|
it remains to define tokens for non-alphanumeric symbols such as "+" and "(".
|
||||||
Different programming languages used different terminologies for these,
|
Different programming languages used different terminologies for these,
|
||||||
e.g. operators, separators, punctuators, etc.
|
e.g. operators, separators, punctuators, etc.
|
||||||
Here we use 'symbol', for all of them, but we can do something different.
|
Here we use 'symbol', for all of them.
|
||||||
We also include a token consisting of
|
We also include a token consisting of
|
||||||
a closing parenthesis immediately followed by 'group':
|
a closing parenthesis immediately followed by 'group':
|
||||||
as defined in the syntactic grammar,
|
as defined in the syntactic grammar,
|
||||||
this is the final part of an affine group literal.
|
this is the final part of an affine group literal;
|
||||||
Even though it includes letters,
|
even though it includes letters,
|
||||||
it seems appropriate to still consider it a symbol,
|
it seems appropriate to still consider it a symbol,
|
||||||
particularly since it starts with a symbol.
|
particularly since it starts with a proper symbol.
|
||||||
|
|
||||||
We could give names to all of these symbols,
|
We could give names to all of these symbols,
|
||||||
via rules such as
|
via rules such as
|
||||||
@ -895,7 +880,7 @@ token = keyword
|
|||||||
/ symbol
|
/ symbol
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[identifier](#user-content-identifier), [keyword](#user-content-keyword), [package-name](#user-content-package-name), [formatted-string](#user-content-formatted-string), [annotation-name](#user-content-annotation-name), [atomic-literal](#user-content-atomic-literal), [symbol](#user-content-symbol)_;
|
Go to: _[formatted-string](#user-content-formatted-string), [atomic-literal](#user-content-atomic-literal), [identifier](#user-content-identifier), [symbol](#user-content-symbol), [keyword](#user-content-keyword), [package-name](#user-content-package-name), [annotation-name](#user-content-annotation-name)_;
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -952,11 +937,11 @@ group-type = "group"
|
|||||||
arithmetic-type = integer-type / field-type / group-type
|
arithmetic-type = integer-type / field-type / group-type
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[integer-type](#user-content-integer-type), [group-type](#user-content-group-type), [field-type](#user-content-field-type)_;
|
Go to: _[integer-type](#user-content-integer-type), [field-type](#user-content-field-type), [group-type](#user-content-group-type)_;
|
||||||
|
|
||||||
|
|
||||||
The arithmetic types, along with the boolean and address types,
|
The arithmetic types, along with the boolean and address types,
|
||||||
form the scalar types, i.e. the ones whose values do not contain (sub)values.
|
form the scalar types, i.e. the ones whose values do not contain (sub-)values.
|
||||||
|
|
||||||
<a name="boolean-type"></a>
|
<a name="boolean-type"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -973,12 +958,12 @@ address-type = "address"
|
|||||||
scalar-type = boolean-type / arithmetic-type / address-type
|
scalar-type = boolean-type / arithmetic-type / address-type
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[boolean-type](#user-content-boolean-type), [arithmetic-type](#user-content-arithmetic-type), [address-type](#user-content-address-type)_;
|
Go to: _[address-type](#user-content-address-type), [boolean-type](#user-content-boolean-type), [arithmetic-type](#user-content-arithmetic-type)_;
|
||||||
|
|
||||||
|
|
||||||
Circuit types are denoted by identifiers and the keyword 'Self'.
|
Circuit types are denoted by identifiers and the keyword 'Self'.
|
||||||
The latter is only allowed inside a circuit definition,
|
The latter is only allowed inside a circuit definition,
|
||||||
to denote the defined circuit.
|
to denote the circuit being defined.
|
||||||
|
|
||||||
<a name="self-type"></a>
|
<a name="self-type"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -990,7 +975,7 @@ self-type = "Self"
|
|||||||
circuit-type = identifier / self-type
|
circuit-type = identifier / self-type
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[identifier](#user-content-identifier), [self-type](#user-content-self-type)_;
|
Go to: _[self-type](#user-content-self-type), [identifier](#user-content-identifier)_;
|
||||||
|
|
||||||
|
|
||||||
A tuple type consists of zero, two, or more component types.
|
A tuple type consists of zero, two, or more component types.
|
||||||
@ -1005,16 +990,15 @@ Go to: _[type](#user-content-type)_;
|
|||||||
|
|
||||||
An array type consists of an element type
|
An array type consists of an element type
|
||||||
and an indication of dimensions.
|
and an indication of dimensions.
|
||||||
There is either a single dimension (a number),
|
There is either a single dimension,
|
||||||
or a tuple of one or more dimensions
|
or a tuple of one or more dimensions.
|
||||||
(we could restrict the latter to two or more dimensions).
|
|
||||||
|
|
||||||
<a name="array-type"></a>
|
<a name="array-type"></a>
|
||||||
```abnf
|
```abnf
|
||||||
array-type = "[" type ";" array-dimensions "]"
|
array-type = "[" type ";" array-dimensions "]"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[type](#user-content-type), [array-dimensions](#user-content-array-dimensions)_;
|
Go to: _[array-dimensions](#user-content-array-dimensions), [type](#user-content-type)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="array-dimensions"></a>
|
<a name="array-dimensions"></a>
|
||||||
@ -1027,7 +1011,7 @@ Go to: _[natural](#user-content-natural)_;
|
|||||||
|
|
||||||
|
|
||||||
Circuit, tuple, and array types form the aggregate types,
|
Circuit, tuple, and array types form the aggregate types,
|
||||||
i.e. types whose values contain (sub)values
|
i.e. types whose values contain (sub-)values
|
||||||
(with the corner-case exception of the empty tuple value).
|
(with the corner-case exception of the empty tuple value).
|
||||||
|
|
||||||
<a name="aggregate-type"></a>
|
<a name="aggregate-type"></a>
|
||||||
@ -1035,7 +1019,7 @@ i.e. types whose values contain (sub)values
|
|||||||
aggregate-type = tuple-type / array-type / circuit-type
|
aggregate-type = tuple-type / array-type / circuit-type
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[tuple-type](#user-content-tuple-type), [circuit-type](#user-content-circuit-type), [array-type](#user-content-array-type)_;
|
Go to: _[circuit-type](#user-content-circuit-type), [tuple-type](#user-content-tuple-type), [array-type](#user-content-array-type)_;
|
||||||
|
|
||||||
|
|
||||||
Scalar and aggregate types form all the types.
|
Scalar and aggregate types form all the types.
|
||||||
@ -1045,15 +1029,15 @@ Scalar and aggregate types form all the types.
|
|||||||
type = scalar-type / aggregate-type
|
type = scalar-type / aggregate-type
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[aggregate-type](#user-content-aggregate-type), [scalar-type](#user-content-scalar-type)_;
|
Go to: _[scalar-type](#user-content-scalar-type), [aggregate-type](#user-content-aggregate-type)_;
|
||||||
|
|
||||||
|
|
||||||
The lexical grammar above defines product group literals.
|
The lexical grammar given earlier defines product group literals.
|
||||||
The other kind of group literal is a pair of integer coordinates,
|
The other kind of group literal is a pair of integer coordinates,
|
||||||
which are reduced modulo the prime to identify a point,
|
which are reduced modulo the prime to identify a point,
|
||||||
which must be on the elliptic curve.
|
which must be on the elliptic curve.
|
||||||
It is also allowed to omit one (not both) coordinates,
|
It is also allowed to omit one coordinate (not both),
|
||||||
with an indication of how to infer the missing coordinate
|
with an indication of how to fill in the missing coordinate
|
||||||
(i.e. sign high, sign low, or inferred).
|
(i.e. sign high, sign low, or inferred).
|
||||||
This is an affine group literal,
|
This is an affine group literal,
|
||||||
because it consists of affine point coordinates.
|
because it consists of affine point coordinates.
|
||||||
@ -1068,22 +1052,20 @@ Go to: _[integer](#user-content-integer)_;
|
|||||||
|
|
||||||
<a name="affine-group-literal"></a>
|
<a name="affine-group-literal"></a>
|
||||||
```abnf
|
```abnf
|
||||||
affine-group-literal = "(" group-coordinate "," group-coordinate ")" "group"
|
affine-group-literal = "(" group-coordinate "," group-coordinate ")group"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[group-coordinate](#user-content-group-coordinate)_;
|
Go to: _[group-coordinate](#user-content-group-coordinate)_;
|
||||||
|
|
||||||
|
|
||||||
A literal is either an atomic one or an affine group literal.
|
A literal is either an atomic one or an affine group literal.
|
||||||
Here 'atomic' refers to being a token or not,
|
|
||||||
since no whitespace is allowed within a token.
|
|
||||||
|
|
||||||
<a name="literal"></a>
|
<a name="literal"></a>
|
||||||
```abnf
|
```abnf
|
||||||
literal = atomic-literal / affine-group-literal
|
literal = atomic-literal / affine-group-literal
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[affine-group-literal](#user-content-affine-group-literal), [atomic-literal](#user-content-atomic-literal)_;
|
Go to: _[atomic-literal](#user-content-atomic-literal), [affine-group-literal](#user-content-affine-group-literal)_;
|
||||||
|
|
||||||
|
|
||||||
The following rule is not directly referenced in the rules for expressions
|
The following rule is not directly referenced in the rules for expressions
|
||||||
@ -1106,13 +1088,12 @@ of operators and other expression constructs,
|
|||||||
and the (left or right) associativity of binary operators.
|
and the (left or right) associativity of binary operators.
|
||||||
|
|
||||||
The primary expressions are self-contained in a way,
|
The primary expressions are self-contained in a way,
|
||||||
i.e. they have clear deliminations.
|
i.e. they have clear deliminations:
|
||||||
Some consist of single tokens:
|
Some consist of single tokens,
|
||||||
identifiers, the keywords 'self' and 'input', and literals.
|
while others have explicit endings.
|
||||||
Primary expressions also include parenthesized expressions,
|
Primary expressions also include parenthesized expressions,
|
||||||
i.e. any expression may be turned into a primary one
|
i.e. any expression may be turned into a primary one
|
||||||
by putting parentheses around it.
|
by putting parentheses around it.
|
||||||
The other kinds are defined and explained below.
|
|
||||||
|
|
||||||
<a name="primary-expression"></a>
|
<a name="primary-expression"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1126,19 +1107,11 @@ primary-expression = identifier
|
|||||||
/ circuit-expression
|
/ circuit-expression
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[tuple-expression](#user-content-tuple-expression), [identifier](#user-content-identifier), [literal](#user-content-literal), [array-expression](#user-content-array-expression), [circuit-expression](#user-content-circuit-expression), [expression](#user-content-expression)_;
|
Go to: _[expression](#user-content-expression), [array-expression](#user-content-array-expression), [identifier](#user-content-identifier), [circuit-expression](#user-content-circuit-expression), [tuple-expression](#user-content-tuple-expression), [literal](#user-content-literal)_;
|
||||||
|
|
||||||
|
|
||||||
There are tuple expressions to construct and deconstruct tuples.
|
Tuple expressions construct tuples.
|
||||||
A construction consists of zero, two, or more component expressions.
|
Each consists of zero, two, or more component expressions.
|
||||||
A deconstruction uses a component index (zero-indexed).
|
|
||||||
Note that constructions are delimited by closing parentheses
|
|
||||||
and deconstructions are delimited by natural tokens.
|
|
||||||
The rule below, and similar rules for other aggregate types,
|
|
||||||
use the perhaps more familiar 'access',
|
|
||||||
but note that 'deconstruction' has a nice symmetry to 'construction';
|
|
||||||
the term 'destructor' has a different meaning in other languages,
|
|
||||||
so we may want to avoid it, but not necessarily.
|
|
||||||
|
|
||||||
<a name="tuple-construction"></a>
|
<a name="tuple-construction"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1156,20 +1129,12 @@ tuple-expression = tuple-construction
|
|||||||
Go to: _[tuple-construction](#user-content-tuple-construction)_;
|
Go to: _[tuple-construction](#user-content-tuple-construction)_;
|
||||||
|
|
||||||
|
|
||||||
The are array expressions to construct and deconstruct arrays.
|
Array expressions construct arrays.
|
||||||
There are two kinds of constructions:
|
There are two kinds:
|
||||||
one lists the element expressions (at least one),
|
one lists the element expressions (at least one),
|
||||||
including spreads (via '...') which are arrays being spliced in;
|
including spreads (via '...') which are arrays being spliced in;
|
||||||
the other repeats (the value of) a single expression
|
the other repeats (the value of) a single expression
|
||||||
across one or more dimensions.
|
across one or more dimensions.
|
||||||
There are two kinds of deconstructions:
|
|
||||||
one selects a single element by index (zero-indexed);
|
|
||||||
the other selects a range via two indices,
|
|
||||||
the first inclusive and the second exclusive --
|
|
||||||
both are optional,
|
|
||||||
the first defaulting to 0 and the second to the array length.
|
|
||||||
Note that these expressions are all delimited
|
|
||||||
by closing square brackets.
|
|
||||||
|
|
||||||
<a name="array-inline-construction"></a>
|
<a name="array-inline-construction"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1214,17 +1179,14 @@ array-expression = array-construction
|
|||||||
Go to: _[array-construction](#user-content-array-construction)_;
|
Go to: _[array-construction](#user-content-array-construction)_;
|
||||||
|
|
||||||
|
|
||||||
There are circuit expressions to construct and deconstruct circuit values.
|
Circuit expressions construct circuit values.
|
||||||
A construction lists values for all the member variables (in any order);
|
Each lists values for all the member variables (in any order);
|
||||||
there must be at least one member variable currently.
|
there must be at least one member variable.
|
||||||
A single identifier abbreviates
|
A single identifier abbreviates
|
||||||
a pair consisting of the same identifier separated by dot;
|
a pair consisting of the same identifier separated by colon;
|
||||||
note that, in the expansion, the left one denotes a member name,
|
note that, in the expansion, the left one denotes a member name,
|
||||||
while the right one denotes an expression (a variable),
|
while the right one denotes an expression (a variable),
|
||||||
so they are syntactically identical but semantically different.
|
so they are syntactically identical but semantically different.
|
||||||
A deconstruction selects a member variable by name.
|
|
||||||
Note that these expressions are delimited,
|
|
||||||
by closing curly braces or identifiers.
|
|
||||||
|
|
||||||
<a name="circuit-construction"></a>
|
<a name="circuit-construction"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1233,7 +1195,7 @@ circuit-construction = circuit-type "{"
|
|||||||
"}"
|
"}"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[circuit-inline-element](#user-content-circuit-inline-element), [circuit-type](#user-content-circuit-type)_;
|
Go to: _[circuit-type](#user-content-circuit-type), [circuit-inline-element](#user-content-circuit-inline-element)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="circuit-inline-element"></a>
|
<a name="circuit-inline-element"></a>
|
||||||
@ -1252,12 +1214,25 @@ circuit-expression = circuit-construction
|
|||||||
Go to: _[circuit-construction](#user-content-circuit-construction)_;
|
Go to: _[circuit-construction](#user-content-circuit-construction)_;
|
||||||
|
|
||||||
|
|
||||||
|
After primary expressions, postfix expressions have highest precedence.
|
||||||
|
They apply to primary expressions, and recursively to postfix expressions.
|
||||||
|
|
||||||
|
There are postfix expressions to access parts of aggregate values.
|
||||||
|
A tuple access selects a component by index (zero-based).
|
||||||
|
There are two kinds of array accesses:
|
||||||
|
one selects a single element by index (zero-based);
|
||||||
|
the other selects a range via two indices,
|
||||||
|
the first inclusive and the second exclusive --
|
||||||
|
both are optional,
|
||||||
|
the first defaulting to 0 and the second to the array length.
|
||||||
|
A circuit access selects a member variable by name.
|
||||||
|
|
||||||
|
Function calls are also postfix expressions.
|
||||||
There are three kinds of function calls:
|
There are three kinds of function calls:
|
||||||
top-level function calls,
|
top-level function calls,
|
||||||
instance (i.e. non-static) member function calls, and
|
instance (i.e. non-static) member function calls, and
|
||||||
static member function calls.
|
static member function calls.
|
||||||
What changes is the start, but they all end in an argument list,
|
What changes is the start, but they all end in an argument list.
|
||||||
delimited by a closing parenthesis.
|
|
||||||
|
|
||||||
<a name="function-arguments"></a>
|
<a name="function-arguments"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1267,11 +1242,6 @@ function-arguments = "(" [ expression *( "," expression ) ] ")"
|
|||||||
Go to: _[expression](#user-content-expression)_;
|
Go to: _[expression](#user-content-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Postfix expressions have highest precedence.
|
|
||||||
They apply to primary expressions.
|
|
||||||
Contains access expressions for arrays, tuples, and circuits.
|
|
||||||
Contains function call types.
|
|
||||||
|
|
||||||
<a name="postfix-expression"></a>
|
<a name="postfix-expression"></a>
|
||||||
```abnf
|
```abnf
|
||||||
postfix-expression = primary-expression
|
postfix-expression = primary-expression
|
||||||
@ -1284,11 +1254,11 @@ postfix-expression = primary-expression
|
|||||||
/ postfix-expression "[" [expression] ".." [expression] "]"
|
/ postfix-expression "[" [expression] ".." [expression] "]"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[postfix-expression](#user-content-postfix-expression), [natural](#user-content-natural), [primary-expression](#user-content-primary-expression), [function-arguments](#user-content-function-arguments), [circuit-type](#user-content-circuit-type), [identifier](#user-content-identifier), [expression](#user-content-expression)_;
|
Go to: _[postfix-expression](#user-content-postfix-expression), [function-arguments](#user-content-function-arguments), [primary-expression](#user-content-primary-expression), [identifier](#user-content-identifier), [circuit-type](#user-content-circuit-type), [natural](#user-content-natural), [expression](#user-content-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Unary operators have the highest operator precedence.
|
Unary operators have the highest operator precedence.
|
||||||
They apply to postfix expressions
|
They apply to postfix expressions,
|
||||||
and recursively to unary expressions.
|
and recursively to unary expressions.
|
||||||
|
|
||||||
<a name="unary-expression"></a>
|
<a name="unary-expression"></a>
|
||||||
@ -1302,15 +1272,20 @@ Go to: _[postfix-expression](#user-content-postfix-expression), [unary-expressio
|
|||||||
|
|
||||||
|
|
||||||
Next in the operator precedence is casting.
|
Next in the operator precedence is casting.
|
||||||
The current rule below makes exponentiation left-associative,
|
|
||||||
|
<a name="cast-expression"></a>
|
||||||
|
```abnf
|
||||||
cast-expression = unary-expression
|
cast-expression = unary-expression
|
||||||
/ cast-expression "as" type
|
/ cast-expression "as" type
|
||||||
|
```
|
||||||
|
|
||||||
|
Go to: _[cast-expression](#user-content-cast-expression), [type](#user-content-type), [unary-expression](#user-content-unary-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Next in the operator precedence is exponentiation,
|
Next in the operator precedence is exponentiation,
|
||||||
following mathematical practice.
|
following mathematical practice.
|
||||||
The current rule below makes exponentiation left-associative,
|
The current rule below makes exponentiation left-associative,
|
||||||
i.e. 'a ** b ** c' must be parsed as '(a ** b) ** c'.
|
i.e. 'a ** b ** c' must be parsed as '(a ** b) ** c'.
|
||||||
This is easy to change if we want it to be right-associative instead.
|
|
||||||
|
|
||||||
<a name="exponential-expression"></a>
|
<a name="exponential-expression"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1318,7 +1293,7 @@ exponential-expression = cast-expression
|
|||||||
/ exponential-expression "**" cast-expression
|
/ exponential-expression "**" cast-expression
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[cast-expression](#user-content-cast-expression), [exponential-expression](#user-content-exponential-expression)_;
|
Go to: _[exponential-expression](#user-content-exponential-expression), [cast-expression](#user-content-cast-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Next in precedence come multiplication and division, both left-associative.
|
Next in precedence come multiplication and division, both left-associative.
|
||||||
@ -1342,7 +1317,7 @@ additive-expression = multiplicative-expression
|
|||||||
/ additive-expression "-" multiplicative-expression
|
/ additive-expression "-" multiplicative-expression
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[multiplicative-expression](#user-content-multiplicative-expression), [additive-expression](#user-content-additive-expression)_;
|
Go to: _[additive-expression](#user-content-additive-expression), [multiplicative-expression](#user-content-multiplicative-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Next in the precedence order are ordering relations.
|
Next in the precedence order are ordering relations.
|
||||||
@ -1360,8 +1335,8 @@ ordering-expression = additive-expression
|
|||||||
Go to: _[additive-expression](#user-content-additive-expression)_;
|
Go to: _[additive-expression](#user-content-additive-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Equalities return booleans but may also operate on boolean,
|
Equalities return booleans but may also operate on booleans;
|
||||||
so we make them left-associative.
|
the rule below makes them left-associative.
|
||||||
|
|
||||||
<a name="equality-expression"></a>
|
<a name="equality-expression"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1370,7 +1345,7 @@ equality-expression = ordering-expression
|
|||||||
/ equality-expression "!=" ordering-expression
|
/ equality-expression "!=" ordering-expression
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[ordering-expression](#user-content-ordering-expression), [equality-expression](#user-content-equality-expression)_;
|
Go to: _[equality-expression](#user-content-equality-expression), [ordering-expression](#user-content-ordering-expression)_;
|
||||||
|
|
||||||
|
|
||||||
Next come conjunctive expressions, left-associative.
|
Next come conjunctive expressions, left-associative.
|
||||||
@ -1408,7 +1383,7 @@ conditional-expression = disjunctive-expression
|
|||||||
Go to: _[disjunctive-expression](#user-content-disjunctive-expression), [conditional-expression](#user-content-conditional-expression), [expression](#user-content-expression)_;
|
Go to: _[disjunctive-expression](#user-content-disjunctive-expression), [conditional-expression](#user-content-conditional-expression), [expression](#user-content-expression)_;
|
||||||
|
|
||||||
|
|
||||||
These are all the expressions.
|
Those above are all the expressions.
|
||||||
Recall that conditional expressions
|
Recall that conditional expressions
|
||||||
may be disjunctive expressions,
|
may be disjunctive expressions,
|
||||||
which may be conjunctive expressions,
|
which may be conjunctive expressions,
|
||||||
@ -1422,9 +1397,8 @@ expression = conditional-expression
|
|||||||
Go to: _[conditional-expression](#user-content-conditional-expression)_;
|
Go to: _[conditional-expression](#user-content-conditional-expression)_;
|
||||||
|
|
||||||
|
|
||||||
There are various kinds of statements,
|
There are various kinds of statements, including blocks.
|
||||||
including blocks, which are
|
Blocks are possibly empty sequences of statements surrounded by curly braces.
|
||||||
possibly empty sequences of statements surounded by curly braces.
|
|
||||||
|
|
||||||
<a name="statement"></a>
|
<a name="statement"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1438,7 +1412,7 @@ statement = expression-statement
|
|||||||
/ block
|
/ block
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[expression-statement](#user-content-expression-statement), [variable-definition-statement](#user-content-variable-definition-statement), [loop-statement](#user-content-loop-statement), [return-statement](#user-content-return-statement), [conditional-statement](#user-content-conditional-statement), [block](#user-content-block), [assignment-statement](#user-content-assignment-statement), [console-statement](#user-content-console-statement)_;
|
Go to: _[conditional-statement](#user-content-conditional-statement), [loop-statement](#user-content-loop-statement), [variable-definition-statement](#user-content-variable-definition-statement), [assignment-statement](#user-content-assignment-statement), [expression-statement](#user-content-expression-statement), [console-statement](#user-content-console-statement), [return-statement](#user-content-return-statement), [block](#user-content-block)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="block"></a>
|
<a name="block"></a>
|
||||||
@ -1446,7 +1420,7 @@ Go to: _[expression-statement](#user-content-expression-statement), [variable-de
|
|||||||
block = "{" *statement "}"
|
block = "{" *statement "}"
|
||||||
```
|
```
|
||||||
|
|
||||||
An expression (that returns the empty tuple)
|
An expression (that must return the empty tuple, as semantically required)
|
||||||
can be turned into a statement by appending a semicolon.
|
can be turned into a statement by appending a semicolon.
|
||||||
|
|
||||||
<a name="expression-statement"></a>
|
<a name="expression-statement"></a>
|
||||||
@ -1458,7 +1432,7 @@ Go to: _[expression](#user-content-expression)_;
|
|||||||
|
|
||||||
|
|
||||||
A return statement always takes an expression,
|
A return statement always takes an expression,
|
||||||
and does not end with a semicolon (but we may want to change that).
|
and does not end with a semicolon.
|
||||||
|
|
||||||
<a name="return-statement"></a>
|
<a name="return-statement"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1481,7 +1455,7 @@ variable-definition-statement = ( "let" / "const" )
|
|||||||
[ ":" type ] "=" expression ";"
|
[ ":" type ] "=" expression ";"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[expression](#user-content-expression), [type](#user-content-type), [identifier-or-identifiers](#user-content-identifier-or-identifiers)_;
|
Go to: _[type](#user-content-type), [identifier-or-identifiers](#user-content-identifier-or-identifiers), [expression](#user-content-expression)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="identifier-or-identifiers"></a>
|
<a name="identifier-or-identifiers"></a>
|
||||||
@ -1497,14 +1471,14 @@ A conditional statement always starts with a condition and a block
|
|||||||
(which together form a branch).
|
(which together form a branch).
|
||||||
It may stop there, or it may continue with an alternative block,
|
It may stop there, or it may continue with an alternative block,
|
||||||
or possibly with another conditional statement, forming a chain.
|
or possibly with another conditional statement, forming a chain.
|
||||||
Note that we require blocks in all branches, not merely statements.
|
Note that blocks are required in all branches, not merely statements.
|
||||||
|
|
||||||
<a name="branch"></a>
|
<a name="branch"></a>
|
||||||
```abnf
|
```abnf
|
||||||
branch = "if" expression block
|
branch = "if" expression block
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[block](#user-content-block), [expression](#user-content-expression)_;
|
Go to: _[expression](#user-content-expression), [block](#user-content-block)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="conditional-statement"></a>
|
<a name="conditional-statement"></a>
|
||||||
@ -1526,7 +1500,7 @@ The body is a block.
|
|||||||
loop-statement = "for" identifier "in" expression ".." expression block
|
loop-statement = "for" identifier "in" expression ".." expression block
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[identifier](#user-content-identifier), [expression](#user-content-expression), [block](#user-content-block)_;
|
Go to: _[block](#user-content-block), [identifier](#user-content-identifier), [expression](#user-content-expression)_;
|
||||||
|
|
||||||
|
|
||||||
An assignment statement is straightforward.
|
An assignment statement is straightforward.
|
||||||
@ -1543,7 +1517,7 @@ assignment-operator = "=" / "+=" / "-=" / "*=" / "/=" / "**="
|
|||||||
assignment-statement = expression assignment-operator expression ";"
|
assignment-statement = expression assignment-operator expression ";"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[assignment-operator](#user-content-assignment-operator), [expression](#user-content-expression)_;
|
Go to: _[expression](#user-content-expression), [assignment-operator](#user-content-assignment-operator)_;
|
||||||
|
|
||||||
|
|
||||||
Console statements start with the 'console' keyword,
|
Console statements start with the 'console' keyword,
|
||||||
@ -1551,10 +1525,10 @@ followed by a console function call.
|
|||||||
The call may be an assertion or a print command.
|
The call may be an assertion or a print command.
|
||||||
The former takes an expression (which must be boolean) as argument.
|
The former takes an expression (which must be boolean) as argument.
|
||||||
The latter takes either no argument,
|
The latter takes either no argument,
|
||||||
or a format string followed by expressions,
|
or a formatted string followed by expressions,
|
||||||
whose number must match the number of containers '{}' in the format string.
|
whose number must match the number of containers '{}' in the formatted string.
|
||||||
Note that the console function names are identifiers, not keywords.
|
Note that the console function names are identifiers, not keywords.
|
||||||
There are three kind of printing.
|
There are three kinds of print commands.
|
||||||
|
|
||||||
<a name="console-statement"></a>
|
<a name="console-statement"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1570,7 +1544,7 @@ console-call = assert-call
|
|||||||
/ print-call
|
/ print-call
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[print-call](#user-content-print-call), [assert-call](#user-content-assert-call)_;
|
Go to: _[assert-call](#user-content-assert-call), [print-call](#user-content-print-call)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="assert-call"></a>
|
<a name="assert-call"></a>
|
||||||
@ -1603,7 +1577,7 @@ Go to: _[print-function](#user-content-print-function), [print-arguments](#user-
|
|||||||
|
|
||||||
|
|
||||||
An annotation consists of an annotation name (which starts with '@')
|
An annotation consists of an annotation name (which starts with '@')
|
||||||
with optional annotation arguments.
|
with optional annotation arguments, which are identifiers.
|
||||||
Note that no parentheses are used if there are no arguments.
|
Note that no parentheses are used if there are no arguments.
|
||||||
|
|
||||||
<a name="annotation"></a>
|
<a name="annotation"></a>
|
||||||
@ -1612,16 +1586,14 @@ annotation = annotation-name
|
|||||||
[ "(" identifier *( "," identifier ) ")" ]
|
[ "(" identifier *( "," identifier ) ")" ]
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[identifier](#user-content-identifier), [annotation-name](#user-content-annotation-name)_;
|
Go to: _[annotation-name](#user-content-annotation-name), [identifier](#user-content-identifier)_;
|
||||||
|
|
||||||
|
|
||||||
A function declaration defines a function.
|
A function declaration defines a function.
|
||||||
This could be called 'function-definition' instead,
|
The output type is optional, defaulting to the empty tuple type.
|
||||||
but see the comments about the 'declaration' rule below.
|
|
||||||
The output type is optional (it defaults to the empty tuple type).
|
|
||||||
In general, a function input consists of an identifier and a type,
|
In general, a function input consists of an identifier and a type,
|
||||||
with an optional 'const' modifier.
|
with an optional 'const' modifier.
|
||||||
However, functions inside circuits
|
Additionally, functions inside circuits
|
||||||
may start with a 'mut self' or 'const self' or 'self' parameter.
|
may start with a 'mut self' or 'const self' or 'self' parameter.
|
||||||
Furthermore, any function may end with an 'input' parameter.
|
Furthermore, any function may end with an 'input' parameter.
|
||||||
|
|
||||||
@ -1632,7 +1604,7 @@ function-declaration = *annotation "function" identifier
|
|||||||
block
|
block
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[function-parameters](#user-content-function-parameters), [block](#user-content-block), [identifier](#user-content-identifier), [type](#user-content-type)_;
|
Go to: _[identifier](#user-content-identifier), [function-parameters](#user-content-function-parameters), [block](#user-content-block), [type](#user-content-type)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="function-parameters"></a>
|
<a name="function-parameters"></a>
|
||||||
@ -1643,7 +1615,7 @@ function-parameters = self-parameter [ "," input-parameter ]
|
|||||||
/ input-parameter
|
/ input-parameter
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[input-parameter](#user-content-input-parameter), [self-parameter](#user-content-self-parameter), [function-inputs](#user-content-function-inputs)_;
|
Go to: _[function-inputs](#user-content-function-inputs), [self-parameter](#user-content-self-parameter), [input-parameter](#user-content-input-parameter)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="self-parameter"></a>
|
<a name="self-parameter"></a>
|
||||||
@ -1674,8 +1646,6 @@ input-parameter = "input"
|
|||||||
|
|
||||||
A circuit member variable declaration consists of an identifier and a type.
|
A circuit member variable declaration consists of an identifier and a type.
|
||||||
A circuit member function declaration consists of a function declaration.
|
A circuit member function declaration consists of a function declaration.
|
||||||
We could call these 'member-definition' etc.,
|
|
||||||
but see the comments about the 'declaration' rule below.
|
|
||||||
|
|
||||||
<a name="member-declaration"></a>
|
<a name="member-declaration"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1683,7 +1653,7 @@ member-declaration = member-variable-declaration
|
|||||||
/ member-function-declaration
|
/ member-function-declaration
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[member-function-declaration](#user-content-member-function-declaration), [member-variable-declaration](#user-content-member-variable-declaration)_;
|
Go to: _[member-variable-declaration](#user-content-member-variable-declaration), [member-function-declaration](#user-content-member-function-declaration)_;
|
||||||
|
|
||||||
|
|
||||||
<a name="member-variable-declaration"></a>
|
<a name="member-variable-declaration"></a>
|
||||||
@ -1702,9 +1672,8 @@ member-function-declaration = function-declaration
|
|||||||
Go to: _[function-declaration](#user-content-function-declaration)_;
|
Go to: _[function-declaration](#user-content-function-declaration)_;
|
||||||
|
|
||||||
|
|
||||||
A circuit declaration defines a circuit type. It is straightforward.
|
A circuit declaration defines a circuit type,
|
||||||
This could be called 'circuit-definition' instead,
|
as consisting of member variables and functions.
|
||||||
but see the comments about the 'declaration' rule below.
|
|
||||||
|
|
||||||
<a name="circuit-declaration"></a>
|
<a name="circuit-declaration"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1712,7 +1681,7 @@ circuit-declaration = *annotation "circuit" identifier
|
|||||||
"{" member-declaration *( "," member-declaration ) "}"
|
"{" member-declaration *( "," member-declaration ) "}"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[member-declaration](#user-content-member-declaration), [identifier](#user-content-identifier)_;
|
Go to: _[identifier](#user-content-identifier), [member-declaration](#user-content-member-declaration)_;
|
||||||
|
|
||||||
|
|
||||||
An import declaration consists of the 'import' keyword
|
An import declaration consists of the 'import' keyword
|
||||||
@ -1723,7 +1692,7 @@ a package name followed by a path, recursively;
|
|||||||
or a parenthesized list of package paths,
|
or a parenthesized list of package paths,
|
||||||
which are "fan out" of the initial path.
|
which are "fan out" of the initial path.
|
||||||
Note that we allow the last element of the parenthesized list
|
Note that we allow the last element of the parenthesized list
|
||||||
to be followed by a comma, presumably for convenience.
|
to be followed by a comma, for convenience.
|
||||||
|
|
||||||
<a name="import-declaration"></a>
|
<a name="import-declaration"></a>
|
||||||
```abnf
|
```abnf
|
||||||
@ -1741,16 +1710,10 @@ package-path = "*"
|
|||||||
/ "(" package-path *( "," package-path ) [","] ")"
|
/ "(" package-path *( "," package-path ) [","] ")"
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to: _[package-path](#user-content-package-path), [package-name](#user-content-package-name), [identifier](#user-content-identifier)_;
|
Go to: _[package-name](#user-content-package-name), [identifier](#user-content-identifier), [package-path](#user-content-package-path)_;
|
||||||
|
|
||||||
|
|
||||||
Finally, we define a file as a sequence of zero or more declarations.
|
Finally, we define a file as a sequence of zero or more declarations.
|
||||||
This is why we used 'function-declaration' and 'circuit-declaration'
|
|
||||||
instead of 'function-definition' and 'ciruit-definition':
|
|
||||||
this way, they are all declarations of some sort.
|
|
||||||
An import declaration cannot really called an import definition,
|
|
||||||
because it does not define anything.
|
|
||||||
But we may revisit this, and use 'definition' instead of 'declaration'.
|
|
||||||
|
|
||||||
<a name="declaration"></a>
|
<a name="declaration"></a>
|
||||||
```abnf
|
```abnf
|
||||||
|
@ -18,7 +18,7 @@
|
|||||||
|
|
||||||
; Note that this CR LF requirement only applies to the grammar files themselves.
|
; Note that this CR LF requirement only applies to the grammar files themselves.
|
||||||
; It does not apply to the lines of the languages described by the grammar.
|
; It does not apply to the lines of the languages described by the grammar.
|
||||||
; ABNF grammar may describe any kinds of languages,
|
; ABNF grammars may describe any kind of languages,
|
||||||
; with any kind of line terminators,
|
; with any kind of line terminators,
|
||||||
; or even without line terminators at all (e.g. for "binary" languages).
|
; or even without line terminators at all (e.g. for "binary" languages).
|
||||||
|
|
||||||
@ -27,29 +27,23 @@
|
|||||||
; Introduction
|
; Introduction
|
||||||
; ------------
|
; ------------
|
||||||
|
|
||||||
; This file contains an initial draft of
|
; This file contains an ABNF (Augmented Backus-Naur Form) grammar of Leo.
|
||||||
; a complete ABNF (Augmented Backus-Naur Form) grammar of Leo.
|
|
||||||
; Background on ABNF is provided later in this file.
|
; Background on ABNF is provided later in this file.
|
||||||
|
|
||||||
; The initial motivation for creating an ABNF grammar of Leo was that
|
; This grammar provides an official definition of the syntax of Leo
|
||||||
; we have a formally verified parser of ABNF grammars
|
; that is both human-readable and machine-readable.
|
||||||
|
; It will be part of an upcoming Leo language reference.
|
||||||
|
; It may also be used to generate parser tests at some point.
|
||||||
|
|
||||||
|
; We are also using this grammar
|
||||||
|
; as part of a mathematical formalization of the Leo language,
|
||||||
|
; which we are developing in the ACL2 theorem prover
|
||||||
|
; and which we plan to publish at some point.
|
||||||
|
; In particular, we have used a formally verified parser of ABNF grammars
|
||||||
; (at https://github.com/acl2/acl2/tree/master/books/kestrel/abnf;
|
; (at https://github.com/acl2/acl2/tree/master/books/kestrel/abnf;
|
||||||
; also see the paper at https://www.kestrel.edu/people/coglio/vstte18.pdf)
|
; also see the paper at https://www.kestrel.edu/people/coglio/vstte18.pdf)
|
||||||
; which we have used to parse this ABNF grammar of Leo
|
; to parse this grammar into a formal representation of the Leo concrete syntax
|
||||||
; into a formal representation, in the ACL2 theorem prover,
|
; and to validate that the grammar satisfies certain consistency properties.
|
||||||
; of the Leo concrete syntax.
|
|
||||||
; The parsing of this grammar file into an ACL2 representation
|
|
||||||
; happens every time the ACL2 formalization of Leo is built.
|
|
||||||
|
|
||||||
; In addition to that initial motivation,
|
|
||||||
; this ABNF grammar has now the additional and primary purpose of
|
|
||||||
; providing an official definition of the syntax of Leo
|
|
||||||
; that is both human-readable and machine-readable.
|
|
||||||
; This grammar will be part of the (upcoming) Leo language reference,
|
|
||||||
; of the Leo Language Formal Specification
|
|
||||||
; (i.e. the LaTeX document in the leo-semantics repo),
|
|
||||||
; and of the ACL2 formalization of Leo (which was the initial motivation).
|
|
||||||
; It has also been suggested that it may be used to generate tests.
|
|
||||||
|
|
||||||
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
|
||||||
@ -57,13 +51,10 @@
|
|||||||
; ------------------
|
; ------------------
|
||||||
|
|
||||||
; ABNF is an Internet standard:
|
; ABNF is an Internet standard:
|
||||||
; see https://www.rfc-editor.org/info/rfc5234
|
; see RFC 5234 at https://www.rfc-editor.org/info/rfc5234
|
||||||
; and https://www.rfc-editor.org/info/rfc7405.
|
; and RFC 7405 at https://www.rfc-editor.org/info/rfc7405.
|
||||||
; It is used to specify the syntax of JSON, HTTP, and other standards.
|
; It is used to specify the syntax of JSON, HTTP, and other standards.
|
||||||
|
|
||||||
; The following is copied (and "un-LaTeX'd") from the aforementioned paper
|
|
||||||
; (at https://www.kestrel.edu/people/coglio/vstte18.pdf).
|
|
||||||
|
|
||||||
; ABNF adds conveniences and makes slight modifications
|
; ABNF adds conveniences and makes slight modifications
|
||||||
; to Backus-Naur Form (BNF),
|
; to Backus-Naur Form (BNF),
|
||||||
; without going beyond context-free grammars.
|
; without going beyond context-free grammars.
|
||||||
@ -80,15 +71,15 @@
|
|||||||
; and denotes them via:
|
; and denotes them via:
|
||||||
; (i) binary, decimal, or hexadecimal sequences,
|
; (i) binary, decimal, or hexadecimal sequences,
|
||||||
; e.g. %b1.11.1010, %d1.3.10, and %x.1.3.A
|
; e.g. %b1.11.1010, %d1.3.10, and %x.1.3.A
|
||||||
; all denote the string '1 3 10';
|
; all denote the sequence of terminals '1 3 10';
|
||||||
; (ii) binary, decimal, or hexadecimal ranges,
|
; (ii) binary, decimal, or hexadecimal ranges,
|
||||||
; e.g. %x30-39 denotes any string 'n' with 48 <= n <= 57
|
; e.g. %x30-39 denotes any singleton sequence of terminals
|
||||||
; (an ASCII digit);
|
; 'n' with 48 <= n <= 57 (an ASCII digit);
|
||||||
; (iii) case-sensitive ASCII strings,
|
; (iii) case-sensitive ASCII strings,
|
||||||
; e.g. %s"Ab" denotes the string '65 98';
|
; e.g. %s"Ab" denotes the sequence of terminals '65 98';
|
||||||
; and (iv) case-insensitive ASCII strings,
|
; and (iv) case-insensitive ASCII strings,
|
||||||
; e.g. %i"ab", or just "ab", denotes
|
; e.g. %i"ab", or just "ab", denotes
|
||||||
; any string among
|
; any sequence of terminals among
|
||||||
; '65 66',
|
; '65 66',
|
||||||
; '65 98',
|
; '65 98',
|
||||||
; '97 66', and
|
; '97 66', and
|
||||||
@ -112,7 +103,7 @@
|
|||||||
; Repetition prefixes have precedence over juxtapositions,
|
; Repetition prefixes have precedence over juxtapositions,
|
||||||
; which have precedence over /.
|
; which have precedence over /.
|
||||||
; Round brackets group things and override the aforementioned precedence rules,
|
; Round brackets group things and override the aforementioned precedence rules,
|
||||||
; e.g. *(WSP / CRLF WSP) denotes strings
|
; e.g. *(WSP / CRLF WSP) denotes sequences of terminals
|
||||||
; obtained by repeating, zero or more times,
|
; obtained by repeating, zero or more times,
|
||||||
; either (i) a WSP or (ii) a CRLF followed by a WSP.
|
; either (i) a WSP or (ii) a CRLF followed by a WSP.
|
||||||
; Square brackets also group things but make them optional,
|
; Square brackets also group things but make them optional,
|
||||||
@ -139,49 +130,42 @@
|
|||||||
; Structure
|
; Structure
|
||||||
; ---------
|
; ---------
|
||||||
|
|
||||||
; Language specifications often define the syntax of their languages via
|
; This ABNF grammar consists of two (sub-)grammars:
|
||||||
; (i) a lexical grammar that describes how
|
; (i) a lexical grammar that describes how
|
||||||
; sequence of characters are parsed into tokens, and
|
; sequence of characters are parsed into tokens, and
|
||||||
; (ii) a syntactic grammar that described how
|
; (ii) a syntactic grammar that described how
|
||||||
; tokens are parsed into expressions, statements, etc.
|
; tokens are parsed into expressions, statements, etc.
|
||||||
; (The adjectives 'lexical' and 'syntactic' are
|
; The adjectives 'lexical' and 'syntactic' are
|
||||||
; the ones used in the Java language specification;
|
; the same ones used in the Java language reference,
|
||||||
; other terms may be used by other languages,
|
; for instance;
|
||||||
; but the essence is similar.)
|
; alternative terms may be used in other languages,
|
||||||
; The situation is sometimes more complex,
|
; but the separation into these two components is quite common
|
||||||
; with multiple passes (e.g. Unicode escape processing in Java),
|
; (the situation is sometimes a bit more complex, with multiple passes,
|
||||||
; but the division between lexical and syntactic (in the sense above) stands.
|
; e.g. Unicode escape processing in Java).
|
||||||
|
|
||||||
; In the aforementioned language specifications,
|
; This separation enables
|
||||||
; both the lexical and syntactic grammars
|
|
||||||
; are normally written in a context-free grammar notation,
|
|
||||||
; augmented with natural language that may assert, for instance,
|
|
||||||
; that tokenization always takes the longest sequence that constitutes a token.
|
|
||||||
|
|
||||||
; This dual structure appears to be motivated by the fact that
|
|
||||||
; concerns of white space, line endings, etc.
|
; concerns of white space, line endings, etc.
|
||||||
; can be handled by the lexical grammar,
|
; to be handled by the lexical grammar,
|
||||||
; allowing the syntactic grammar to focus on the more important structure.
|
; with the syntactic grammar focused on the more important structure.
|
||||||
; Handling both aspects in a single context-free grammar may be unwieldy,
|
; Handling both aspects in a single grammar may be unwieldy,
|
||||||
; so having two grammars provides more clarity and readability.
|
; so having two grammars provides more clarity and readability.
|
||||||
|
|
||||||
; In contrast, PEG (Parsing Expression Grammar) formalisms like Pest
|
; ABNF is a context-free grammar notation, with no procedural interpretation.
|
||||||
; naturally embody a procedural interpretation
|
; The two grammars conceptually define two subsequent processing phases,
|
||||||
; that can handle white space and tokenization in just one manageable grammar.
|
|
||||||
; However, this procedural interpretaion may be sometimes
|
|
||||||
; less clear and readable to humans than context-free rules.
|
|
||||||
; Indeed, context-free grammar are commonly used to documentat languages.
|
|
||||||
|
|
||||||
; ABNF is a context-free grammar notation,
|
|
||||||
; with no procedural interpretation,
|
|
||||||
; and therefore it makes sense to define
|
|
||||||
; separate lexical and syntactic ABNF grammars for Leo.
|
|
||||||
; Conceptually, the two grammars define two subsequent processing phases,
|
|
||||||
; as detailed below.
|
; as detailed below.
|
||||||
; However, a parser implementation does not need to perform
|
; However, a parser implementation does not need to perform
|
||||||
; two strictly separate phases (in fact, it typically does not),
|
; two strictly separate phases (in fact, it typically does not),
|
||||||
; so long as it produces the same final result.
|
; so long as it produces the same final result.
|
||||||
|
|
||||||
|
; The grammar is accompanied by some extra-grammatical requirements,
|
||||||
|
; which are not conveniently expressible in a context-free grammar like ABNF.
|
||||||
|
; These requirements are needed to make the grammar unambiguous,
|
||||||
|
; i.e. to ensure that, for each sequence of terminals,
|
||||||
|
; there is exactly one parse tree for that sequence terminals
|
||||||
|
; that satisfies not only the grammar rules
|
||||||
|
; but also the extra-grammatical requirements.
|
||||||
|
; These requirements are expressed as comments in this file.
|
||||||
|
|
||||||
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
|
||||||
; Operator Precedence
|
; Operator Precedence
|
||||||
@ -203,8 +187,9 @@
|
|||||||
; / additive-expression "+" multiplicative-expression
|
; / additive-expression "+" multiplicative-expression
|
||||||
; / additive-expression "-" multiplicative-expression
|
; / additive-expression "-" multiplicative-expression
|
||||||
;
|
;
|
||||||
; this rule tells us that the additive operators '+' and '-' have
|
; These rules tell us
|
||||||
; lower precedence than the multiplicative operators '*' and '/',
|
; that the additive operators '+' and '-' have lower precedence
|
||||||
|
; than the multiplicative operators '*' and '/',
|
||||||
; and that both the additive and multiplicative operators associate to the left.
|
; and that both the additive and multiplicative operators associate to the left.
|
||||||
; This may be best understood via the examples given below.
|
; This may be best understood via the examples given below.
|
||||||
|
|
||||||
@ -259,14 +244,14 @@
|
|||||||
; Naming Convention
|
; Naming Convention
|
||||||
; -----------------
|
; -----------------
|
||||||
|
|
||||||
; For this ABNF grammar, we choose nonterminal names
|
; This ABNF grammar uses nonterminal names
|
||||||
; that consist of complete English words, separated by dashes,
|
; that consist of complete English words, separated by dashes,
|
||||||
; and that describe the construct the way it is in English.
|
; and that describe the construct the way it is in English.
|
||||||
; For instance, we use the name 'conditional-statement'
|
; For instance, we use the name 'conditional-statement'
|
||||||
; to describe conditional statements.
|
; to describe conditional statements.
|
||||||
|
|
||||||
; At the same time, we attempt to establish
|
; At the same time, this grammar establishes
|
||||||
; a precise and "official" nomenclature for the Leo constructs,
|
; a precise and official nomenclature for the Leo constructs,
|
||||||
; by way of the nonterminal names that define their syntax.
|
; by way of the nonterminal names that define their syntax.
|
||||||
; For instance, the rule
|
; For instance, the rule
|
||||||
;
|
;
|
||||||
@ -284,19 +269,21 @@
|
|||||||
; is the fact that, as discussed above,
|
; is the fact that, as discussed above,
|
||||||
; we write the grammar rules in a way that determines
|
; we write the grammar rules in a way that determines
|
||||||
; the relative precedence and the associativity of expression operators,
|
; the relative precedence and the associativity of expression operators,
|
||||||
; and that therefore we have rules like
|
; and therefore we have rules like
|
||||||
;
|
;
|
||||||
; unary-expression = primary-expression
|
; unary-expression = primary-expression
|
||||||
; / "!" unary-expression
|
; / "!" unary-expression
|
||||||
; / "-" unary-expression
|
; / "-" unary-expression
|
||||||
;
|
;
|
||||||
; In order to allow the recursion of the rule to stop,
|
; In order to allow the recursion of the rule to stop,
|
||||||
; we need to regard, in the grammar, a primary expression as a unary expression.
|
; we need to regard, in the grammar, a primary expression as a unary expression
|
||||||
|
; (i.e. a primary expression is also a unary expression in the grammar;
|
||||||
|
; but note that the opposite is not true).
|
||||||
; However, this is just a grammatical artifact:
|
; However, this is just a grammatical artifact:
|
||||||
; ontologically, a primary expression is not really a unary expression,
|
; ontologically, a primary expression is not really a unary expression,
|
||||||
; because a unary expression is one that consists of
|
; because a unary expression is one that consists of
|
||||||
; a unary operator and an operand sub-expression.
|
; a unary operator and an operand sub-expression.
|
||||||
; These terminological "exceptions" should be easy to identify in the rules.
|
; These terminological exceptions should be easy to identify in the rules.
|
||||||
|
|
||||||
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
|
||||||
@ -308,11 +295,12 @@
|
|||||||
; which are numbers in the range form 0 to 10FFFFh.
|
; which are numbers in the range form 0 to 10FFFFh.
|
||||||
; These are captured by the ABNF rule 'character' below.
|
; These are captured by the ABNF rule 'character' below.
|
||||||
|
|
||||||
; The lexical grammar defines how, conceptually,
|
; The lexical grammar defines how, at least conceptually,
|
||||||
; the sequence of characters is turned into
|
; the sequence of characters is turned into
|
||||||
; a sequence of tokens, comments, and whitespaces.
|
; a sequence of tokens, comments, and whitespaces:
|
||||||
|
; these entities are all defined by the grammar rules below.
|
||||||
|
|
||||||
; As stated, the lexical grammar is ambiguous.
|
; As stated, the lexical grammar alone is ambiguous.
|
||||||
; For example, the sequence of characters '**' (i.e. two stars)
|
; For example, the sequence of characters '**' (i.e. two stars)
|
||||||
; could be equally parsed as two '*' symbol tokens or one '**' symbol token
|
; could be equally parsed as two '*' symbol tokens or one '**' symbol token
|
||||||
; (see rule for 'symbol' below).
|
; (see rule for 'symbol' below).
|
||||||
@ -327,10 +315,6 @@
|
|||||||
; the longest possible sequence of characters is always parsed.
|
; the longest possible sequence of characters is always parsed.
|
||||||
; This way, '**' must be parsed as one '**' symbol token,
|
; This way, '**' must be parsed as one '**' symbol token,
|
||||||
; and '<CR><LF>' must be parsed as one line terminator.
|
; and '<CR><LF>' must be parsed as one line terminator.
|
||||||
; (We should formalize this requirement,
|
|
||||||
; along with the other extra-grammatical requirements given below,
|
|
||||||
; and formally prove that they indeed make
|
|
||||||
; the lexical grammar of Leo unambiguous.)
|
|
||||||
|
|
||||||
; As mentioned above, a character is any Unicode code point.
|
; As mentioned above, a character is any Unicode code point.
|
||||||
; This grammar does not say how those are encoded in files (e.g. UTF-8):
|
; This grammar does not say how those are encoded in files (e.g. UTF-8):
|
||||||
@ -369,7 +353,8 @@ not-star-or-slash = %x0-29 / %x2B-2E / %x30-10FFFF ; anything but * or /
|
|||||||
; a line feed,
|
; a line feed,
|
||||||
; or a carriage return immediately followed by a line feed.
|
; or a carriage return immediately followed by a line feed.
|
||||||
; Note that the latter combination constitutes a single line terminator,
|
; Note that the latter combination constitutes a single line terminator,
|
||||||
; according to the extra-grammatical rule of the longest sequence.
|
; according to the extra-grammatical requirement of the longest sequence,
|
||||||
|
; described above.
|
||||||
|
|
||||||
newline = line-feed / carriage-return / carriage-return line-feed
|
newline = line-feed / carriage-return / carriage-return line-feed
|
||||||
|
|
||||||
@ -379,13 +364,13 @@ whitespace = space / horizontal-tab / newline
|
|||||||
|
|
||||||
; There are two kinds of comments in Leo, as in other languages.
|
; There are two kinds of comments in Leo, as in other languages.
|
||||||
; One is block comments of the form '/* ... */',
|
; One is block comments of the form '/* ... */',
|
||||||
; and the other is end-of-line comments '// ...'.
|
; and the other is end-of-line comments of the form '// ...'.
|
||||||
; The first kind start at '/*' and end at the first '*/',
|
; The first kind start at '/*' and end at the first '*/',
|
||||||
; possibly spanning multiple (partial) lines;
|
; possibly spanning multiple (partial) lines;
|
||||||
; they do no nest.
|
; these do no nest.
|
||||||
; The second kind start at '//' and extend till the end of the line.
|
; The second kind start at '//' and extend till the end of the line.
|
||||||
; The rules about comments given below are similar to
|
; The rules about comments given below are similar to
|
||||||
; the ones used in the Java language specification.
|
; the ones used in the Java language reference.
|
||||||
|
|
||||||
comment = block-comment / end-of-line-comment
|
comment = block-comment / end-of-line-comment
|
||||||
|
|
||||||
@ -467,9 +452,9 @@ package-name = 1*( lowercase-letter / digit )
|
|||||||
|
|
||||||
address = %s"aleo1" 58( lowercase-letter / digit )
|
address = %s"aleo1" 58( lowercase-letter / digit )
|
||||||
|
|
||||||
; A format string is a sequence of characters, other than double quote,
|
; A formatted string is a sequence of characters, other than double quote,
|
||||||
; surrounded by double quotes.
|
; surrounded by double quotes.
|
||||||
; Within a format string, substrings '{}' are distinguished as containers
|
; Within a formatted string, sub-strings '{}' are distinguished as containers
|
||||||
; (these are the ones that may be matched with values
|
; (these are the ones that may be matched with values
|
||||||
; whose textual representation replaces the containers
|
; whose textual representation replaces the containers
|
||||||
; in the printed string).
|
; in the printed string).
|
||||||
@ -486,8 +471,8 @@ formatted-string = double-quote
|
|||||||
double-quote
|
double-quote
|
||||||
|
|
||||||
; Here is (part of this ABNF comment),
|
; Here is (part of this ABNF comment),
|
||||||
; an alternative way to specify format strings,
|
; an alternative way to specify formatted strings,
|
||||||
; which captures the extra-grammatical requirement above in the grammar
|
; which captures the extra-grammatical requirement above in the grammar,
|
||||||
; but is more complicated:
|
; but is more complicated:
|
||||||
;
|
;
|
||||||
; not-double-quote-or-open-brace = %x0-22 / %x24-7A / %x7C-10FFFF
|
; not-double-quote-or-open-brace = %x0-22 / %x24-7A / %x7C-10FFFF
|
||||||
@ -501,21 +486,19 @@ formatted-string = double-quote
|
|||||||
; formatted-string = double-quote *formatted-string-element double-quote
|
; formatted-string = double-quote *formatted-string-element double-quote
|
||||||
;
|
;
|
||||||
; It is not immediately clear which approach is better; there are tradeoffs.
|
; It is not immediately clear which approach is better; there are tradeoffs.
|
||||||
; Regardless, we should choose one eventually.
|
; We may choose to adopt this one in future revisions of the grammar.
|
||||||
|
|
||||||
; Annotations are built out of names and arguments, which are tokens.
|
; Annotations have names, which are identifiers immediately preceded by '@'.
|
||||||
; Two names are currently supported.
|
|
||||||
; An argument is a sequence of one or more letters, digits, and underscores.
|
|
||||||
|
|
||||||
annotation-name = "@" identifier
|
annotation-name = "@" identifier
|
||||||
|
|
||||||
; A natural (number) is a sequence of one or more digits.
|
; A natural (number) is a sequence of one or more digits.
|
||||||
; Note that we allow leading zeros, e.g. '007'.
|
; We allow leading zeros, e.g. '007'.
|
||||||
|
|
||||||
natural = 1*digit
|
natural = 1*digit
|
||||||
|
|
||||||
; An integer (number) is either a natural or its negation.
|
; An integer (number) is either a natural or its negation.
|
||||||
; Note that we also allow leading zeros in negative numbers, e.g. '-007'.
|
; We allow leading zeros also in negative numbers, e.g. '-007'.
|
||||||
|
|
||||||
integer = [ "-" ] natural
|
integer = [ "-" ] natural
|
||||||
|
|
||||||
@ -552,7 +535,9 @@ boolean-literal = %s"true" / %s"false"
|
|||||||
|
|
||||||
address-literal = %s"address" "(" address ")"
|
address-literal = %s"address" "(" address ")"
|
||||||
|
|
||||||
; The ones above are all the literals, as defined by the following rule.
|
; The ones above are all the atomic literals
|
||||||
|
; (in the sense that they are tokens, without whitespace allowed in them),
|
||||||
|
; as defined by the following rule.
|
||||||
|
|
||||||
atomic-literal = untyped-literal
|
atomic-literal = untyped-literal
|
||||||
/ unsigned-literal
|
/ unsigned-literal
|
||||||
@ -566,14 +551,14 @@ atomic-literal = untyped-literal
|
|||||||
; it remains to define tokens for non-alphanumeric symbols such as "+" and "(".
|
; it remains to define tokens for non-alphanumeric symbols such as "+" and "(".
|
||||||
; Different programming languages used different terminologies for these,
|
; Different programming languages used different terminologies for these,
|
||||||
; e.g. operators, separators, punctuators, etc.
|
; e.g. operators, separators, punctuators, etc.
|
||||||
; Here we use 'symbol', for all of them, but we can do something different.
|
; Here we use 'symbol', for all of them.
|
||||||
; We also include a token consisting of
|
; We also include a token consisting of
|
||||||
; a closing parenthesis immediately followed by 'group':
|
; a closing parenthesis immediately followed by 'group':
|
||||||
; as defined in the syntactic grammar,
|
; as defined in the syntactic grammar,
|
||||||
; this is the final part of an affine group literal.
|
; this is the final part of an affine group literal;
|
||||||
; Even though it includes letters,
|
; even though it includes letters,
|
||||||
; it seems appropriate to still consider it a symbol,
|
; it seems appropriate to still consider it a symbol,
|
||||||
; particularly since it starts with a symbol.
|
; particularly since it starts with a proper symbol.
|
||||||
|
|
||||||
; We could give names to all of these symbols,
|
; We could give names to all of these symbols,
|
||||||
; via rules such as
|
; via rules such as
|
||||||
@ -646,7 +631,7 @@ group-type = %s"group"
|
|||||||
arithmetic-type = integer-type / field-type / group-type
|
arithmetic-type = integer-type / field-type / group-type
|
||||||
|
|
||||||
; The arithmetic types, along with the boolean and address types,
|
; The arithmetic types, along with the boolean and address types,
|
||||||
; form the scalar types, i.e. the ones whose values do not contain (sub)values.
|
; form the scalar types, i.e. the ones whose values do not contain (sub-)values.
|
||||||
|
|
||||||
boolean-type = %s"bool"
|
boolean-type = %s"bool"
|
||||||
|
|
||||||
@ -656,7 +641,7 @@ scalar-type = boolean-type / arithmetic-type / address-type
|
|||||||
|
|
||||||
; Circuit types are denoted by identifiers and the keyword 'Self'.
|
; Circuit types are denoted by identifiers and the keyword 'Self'.
|
||||||
; The latter is only allowed inside a circuit definition,
|
; The latter is only allowed inside a circuit definition,
|
||||||
; to denote the defined circuit.
|
; to denote the circuit being defined.
|
||||||
|
|
||||||
self-type = %s"Self"
|
self-type = %s"Self"
|
||||||
|
|
||||||
@ -668,9 +653,8 @@ tuple-type = "(" [ type 1*( "," type ) ] ")"
|
|||||||
|
|
||||||
; An array type consists of an element type
|
; An array type consists of an element type
|
||||||
; and an indication of dimensions.
|
; and an indication of dimensions.
|
||||||
; There is either a single dimension (a number),
|
; There is either a single dimension,
|
||||||
; or a tuple of one or more dimensions
|
; or a tuple of one or more dimensions.
|
||||||
; (we could restrict the latter to two or more dimensions).
|
|
||||||
|
|
||||||
array-type = "[" type ";" array-dimensions "]"
|
array-type = "[" type ";" array-dimensions "]"
|
||||||
|
|
||||||
@ -678,7 +662,7 @@ array-dimensions = natural
|
|||||||
/ "(" natural *( "," natural ) ")"
|
/ "(" natural *( "," natural ) ")"
|
||||||
|
|
||||||
; Circuit, tuple, and array types form the aggregate types,
|
; Circuit, tuple, and array types form the aggregate types,
|
||||||
; i.e. types whose values contain (sub)values
|
; i.e. types whose values contain (sub-)values
|
||||||
; (with the corner-case exception of the empty tuple value).
|
; (with the corner-case exception of the empty tuple value).
|
||||||
|
|
||||||
aggregate-type = tuple-type / array-type / circuit-type
|
aggregate-type = tuple-type / array-type / circuit-type
|
||||||
@ -687,23 +671,21 @@ aggregate-type = tuple-type / array-type / circuit-type
|
|||||||
|
|
||||||
type = scalar-type / aggregate-type
|
type = scalar-type / aggregate-type
|
||||||
|
|
||||||
; The lexical grammar above defines product group literals.
|
; The lexical grammar given earlier defines product group literals.
|
||||||
; The other kind of group literal is a pair of integer coordinates,
|
; The other kind of group literal is a pair of integer coordinates,
|
||||||
; which are reduced modulo the prime to identify a point,
|
; which are reduced modulo the prime to identify a point,
|
||||||
; which must be on the elliptic curve.
|
; which must be on the elliptic curve.
|
||||||
; It is also allowed to omit one (not both) coordinates,
|
; It is also allowed to omit one coordinate (not both),
|
||||||
; with an indication of how to infer the missing coordinate
|
; with an indication of how to fill in the missing coordinate
|
||||||
; (i.e. sign high, sign low, or inferred).
|
; (i.e. sign high, sign low, or inferred).
|
||||||
; This is an affine group literal,
|
; This is an affine group literal,
|
||||||
; because it consists of affine point coordinates.
|
; because it consists of affine point coordinates.
|
||||||
|
|
||||||
group-coordinate = integer / "+" / "-" / "_"
|
group-coordinate = integer / "+" / "-" / "_"
|
||||||
|
|
||||||
affine-group-literal = "(" group-coordinate "," group-coordinate ")" %s"group"
|
affine-group-literal = "(" group-coordinate "," group-coordinate %s")group"
|
||||||
|
|
||||||
; A literal is either an atomic one or an affine group literal.
|
; A literal is either an atomic one or an affine group literal.
|
||||||
; Here 'atomic' refers to being a token or not,
|
|
||||||
; since no whitespace is allowed within a token.
|
|
||||||
|
|
||||||
literal = atomic-literal / affine-group-literal
|
literal = atomic-literal / affine-group-literal
|
||||||
|
|
||||||
@ -721,13 +703,12 @@ group-literal = product-group-literal / affine-group-literal
|
|||||||
; and the (left or right) associativity of binary operators.
|
; and the (left or right) associativity of binary operators.
|
||||||
|
|
||||||
; The primary expressions are self-contained in a way,
|
; The primary expressions are self-contained in a way,
|
||||||
; i.e. they have clear deliminations.
|
; i.e. they have clear deliminations:
|
||||||
; Some consist of single tokens:
|
; Some consist of single tokens,
|
||||||
; identifiers, the keywords 'self' and 'input', and literals.
|
; while others have explicit endings.
|
||||||
; Primary expressions also include parenthesized expressions,
|
; Primary expressions also include parenthesized expressions,
|
||||||
; i.e. any expression may be turned into a primary one
|
; i.e. any expression may be turned into a primary one
|
||||||
; by putting parentheses around it.
|
; by putting parentheses around it.
|
||||||
; The other kinds are defined and explained below.
|
|
||||||
|
|
||||||
primary-expression = identifier
|
primary-expression = identifier
|
||||||
/ %s"self"
|
/ %s"self"
|
||||||
@ -738,35 +719,19 @@ primary-expression = identifier
|
|||||||
/ array-expression
|
/ array-expression
|
||||||
/ circuit-expression
|
/ circuit-expression
|
||||||
|
|
||||||
; There are tuple expressions to construct and deconstruct tuples.
|
; Tuple expressions construct tuples.
|
||||||
; A construction consists of zero, two, or more component expressions.
|
; Each consists of zero, two, or more component expressions.
|
||||||
; A deconstruction uses a component index (zero-indexed).
|
|
||||||
; Note that constructions are delimited by closing parentheses
|
|
||||||
; and deconstructions are delimited by natural tokens.
|
|
||||||
; The rule below, and similar rules for other aggregate types,
|
|
||||||
; use the perhaps more familiar 'access',
|
|
||||||
; but note that 'deconstruction' has a nice symmetry to 'construction';
|
|
||||||
; the term 'destructor' has a different meaning in other languages,
|
|
||||||
; so we may want to avoid it, but not necessarily.
|
|
||||||
|
|
||||||
tuple-construction = "(" [ expression 1*( "," expression ) ] ")"
|
tuple-construction = "(" [ expression 1*( "," expression ) ] ")"
|
||||||
|
|
||||||
tuple-expression = tuple-construction
|
tuple-expression = tuple-construction
|
||||||
|
|
||||||
; The are array expressions to construct and deconstruct arrays.
|
; Array expressions construct arrays.
|
||||||
; There are two kinds of constructions:
|
; There are two kinds:
|
||||||
; one lists the element expressions (at least one),
|
; one lists the element expressions (at least one),
|
||||||
; including spreads (via '...') which are arrays being spliced in;
|
; including spreads (via '...') which are arrays being spliced in;
|
||||||
; the other repeats (the value of) a single expression
|
; the other repeats (the value of) a single expression
|
||||||
; across one or more dimensions.
|
; across one or more dimensions.
|
||||||
; There are two kinds of deconstructions:
|
|
||||||
; one selects a single element by index (zero-indexed);
|
|
||||||
; the other selects a range via two indices,
|
|
||||||
; the first inclusive and the second exclusive --
|
|
||||||
; both are optional,
|
|
||||||
; the first defaulting to 0 and the second to the array length.
|
|
||||||
; Note that these expressions are all delimited
|
|
||||||
; by closing square brackets.
|
|
||||||
|
|
||||||
array-inline-construction = "["
|
array-inline-construction = "["
|
||||||
array-inline-element
|
array-inline-element
|
||||||
@ -781,17 +746,14 @@ array-construction = array-inline-construction / array-repeat-construction
|
|||||||
|
|
||||||
array-expression = array-construction
|
array-expression = array-construction
|
||||||
|
|
||||||
; There are circuit expressions to construct and deconstruct circuit values.
|
; Circuit expressions construct circuit values.
|
||||||
; A construction lists values for all the member variables (in any order);
|
; Each lists values for all the member variables (in any order);
|
||||||
; there must be at least one member variable currently.
|
; there must be at least one member variable.
|
||||||
; A single identifier abbreviates
|
; A single identifier abbreviates
|
||||||
; a pair consisting of the same identifier separated by dot;
|
; a pair consisting of the same identifier separated by colon;
|
||||||
; note that, in the expansion, the left one denotes a member name,
|
; note that, in the expansion, the left one denotes a member name,
|
||||||
; while the right one denotes an expression (a variable),
|
; while the right one denotes an expression (a variable),
|
||||||
; so they are syntactically identical but semantically different.
|
; so they are syntactically identical but semantically different.
|
||||||
; A deconstruction selects a member variable by name.
|
|
||||||
; Note that these expressions are delimited,
|
|
||||||
; by closing curly braces or identifiers.
|
|
||||||
|
|
||||||
circuit-construction = circuit-type "{"
|
circuit-construction = circuit-type "{"
|
||||||
circuit-inline-element *( "," circuit-inline-element )
|
circuit-inline-element *( "," circuit-inline-element )
|
||||||
@ -801,20 +763,28 @@ circuit-inline-element = identifier ":" expression / identifier
|
|||||||
|
|
||||||
circuit-expression = circuit-construction
|
circuit-expression = circuit-construction
|
||||||
|
|
||||||
|
; After primary expressions, postfix expressions have highest precedence.
|
||||||
|
; They apply to primary expressions, and recursively to postfix expressions.
|
||||||
|
|
||||||
|
; There are postfix expressions to access parts of aggregate values.
|
||||||
|
; A tuple access selects a component by index (zero-based).
|
||||||
|
; There are two kinds of array accesses:
|
||||||
|
; one selects a single element by index (zero-based);
|
||||||
|
; the other selects a range via two indices,
|
||||||
|
; the first inclusive and the second exclusive --
|
||||||
|
; both are optional,
|
||||||
|
; the first defaulting to 0 and the second to the array length.
|
||||||
|
; A circuit access selects a member variable by name.
|
||||||
|
|
||||||
|
; Function calls are also postfix expressions.
|
||||||
; There are three kinds of function calls:
|
; There are three kinds of function calls:
|
||||||
; top-level function calls,
|
; top-level function calls,
|
||||||
; instance (i.e. non-static) member function calls, and
|
; instance (i.e. non-static) member function calls, and
|
||||||
; static member function calls.
|
; static member function calls.
|
||||||
; What changes is the start, but they all end in an argument list,
|
; What changes is the start, but they all end in an argument list.
|
||||||
; delimited by a closing parenthesis.
|
|
||||||
|
|
||||||
function-arguments = "(" [ expression *( "," expression ) ] ")"
|
function-arguments = "(" [ expression *( "," expression ) ] ")"
|
||||||
|
|
||||||
; Postfix expressions have highest precedence.
|
|
||||||
; They apply to primary expressions.
|
|
||||||
; Contains access expressions for arrays, tuples, and circuits.
|
|
||||||
; Contains function call types.
|
|
||||||
|
|
||||||
postfix-expression = primary-expression
|
postfix-expression = primary-expression
|
||||||
/ postfix-expression "." natural
|
/ postfix-expression "." natural
|
||||||
/ postfix-expression "." identifier
|
/ postfix-expression "." identifier
|
||||||
@ -825,7 +795,7 @@ postfix-expression = primary-expression
|
|||||||
/ postfix-expression "[" [expression] ".." [expression] "]"
|
/ postfix-expression "[" [expression] ".." [expression] "]"
|
||||||
|
|
||||||
; Unary operators have the highest operator precedence.
|
; Unary operators have the highest operator precedence.
|
||||||
; They apply to postfix expressions
|
; They apply to postfix expressions,
|
||||||
; and recursively to unary expressions.
|
; and recursively to unary expressions.
|
||||||
|
|
||||||
unary-expression = postfix-expression
|
unary-expression = postfix-expression
|
||||||
@ -833,7 +803,7 @@ unary-expression = postfix-expression
|
|||||||
/ "-" unary-expression
|
/ "-" unary-expression
|
||||||
|
|
||||||
; Next in the operator precedence is casting.
|
; Next in the operator precedence is casting.
|
||||||
; The current rule below makes exponentiation left-associative,
|
|
||||||
cast-expression = unary-expression
|
cast-expression = unary-expression
|
||||||
/ cast-expression %s"as" type
|
/ cast-expression %s"as" type
|
||||||
|
|
||||||
@ -841,7 +811,6 @@ cast-expression = unary-expression
|
|||||||
; following mathematical practice.
|
; following mathematical practice.
|
||||||
; The current rule below makes exponentiation left-associative,
|
; The current rule below makes exponentiation left-associative,
|
||||||
; i.e. 'a ** b ** c' must be parsed as '(a ** b) ** c'.
|
; i.e. 'a ** b ** c' must be parsed as '(a ** b) ** c'.
|
||||||
; This is easy to change if we want it to be right-associative instead.
|
|
||||||
|
|
||||||
exponential-expression = cast-expression
|
exponential-expression = cast-expression
|
||||||
/ exponential-expression "**" cast-expression
|
/ exponential-expression "**" cast-expression
|
||||||
@ -867,8 +836,8 @@ ordering-expression = additive-expression
|
|||||||
/ additive-expression "<=" additive-expression
|
/ additive-expression "<=" additive-expression
|
||||||
/ additive-expression ">=" additive-expression
|
/ additive-expression ">=" additive-expression
|
||||||
|
|
||||||
; Equalities return booleans but may also operate on boolean,
|
; Equalities return booleans but may also operate on booleans;
|
||||||
; so we make them left-associative.
|
; the rule below makes them left-associative.
|
||||||
|
|
||||||
equality-expression = ordering-expression
|
equality-expression = ordering-expression
|
||||||
/ equality-expression "==" ordering-expression
|
/ equality-expression "==" ordering-expression
|
||||||
@ -891,7 +860,7 @@ conditional-expression = disjunctive-expression
|
|||||||
"?" expression
|
"?" expression
|
||||||
":" conditional-expression
|
":" conditional-expression
|
||||||
|
|
||||||
; These are all the expressions.
|
; Those above are all the expressions.
|
||||||
; Recall that conditional expressions
|
; Recall that conditional expressions
|
||||||
; may be disjunctive expressions,
|
; may be disjunctive expressions,
|
||||||
; which may be conjunctive expressions,
|
; which may be conjunctive expressions,
|
||||||
@ -899,9 +868,8 @@ conditional-expression = disjunctive-expression
|
|||||||
|
|
||||||
expression = conditional-expression
|
expression = conditional-expression
|
||||||
|
|
||||||
; There are various kinds of statements,
|
; There are various kinds of statements, including blocks.
|
||||||
; including blocks, which are
|
; Blocks are possibly empty sequences of statements surrounded by curly braces.
|
||||||
; possibly empty sequences of statements surounded by curly braces.
|
|
||||||
|
|
||||||
statement = expression-statement
|
statement = expression-statement
|
||||||
/ return-statement
|
/ return-statement
|
||||||
@ -914,13 +882,13 @@ statement = expression-statement
|
|||||||
|
|
||||||
block = "{" *statement "}"
|
block = "{" *statement "}"
|
||||||
|
|
||||||
; An expression (that returns the empty tuple)
|
; An expression (that must return the empty tuple, as semantically required)
|
||||||
; can be turned into a statement by appending a semicolon.
|
; can be turned into a statement by appending a semicolon.
|
||||||
|
|
||||||
expression-statement = expression ";"
|
expression-statement = expression ";"
|
||||||
|
|
||||||
; A return statement always takes an expression,
|
; A return statement always takes an expression,
|
||||||
; and does not end with a semicolon (but we may want to change that).
|
; and does not end with a semicolon.
|
||||||
|
|
||||||
return-statement = %s"return" expression
|
return-statement = %s"return" expression
|
||||||
|
|
||||||
@ -941,7 +909,7 @@ identifier-or-identifiers = identifier
|
|||||||
; (which together form a branch).
|
; (which together form a branch).
|
||||||
; It may stop there, or it may continue with an alternative block,
|
; It may stop there, or it may continue with an alternative block,
|
||||||
; or possibly with another conditional statement, forming a chain.
|
; or possibly with another conditional statement, forming a chain.
|
||||||
; Note that we require blocks in all branches, not merely statements.
|
; Note that blocks are required in all branches, not merely statements.
|
||||||
|
|
||||||
branch = %s"if" expression block
|
branch = %s"if" expression block
|
||||||
|
|
||||||
@ -968,10 +936,10 @@ assignment-statement = expression assignment-operator expression ";"
|
|||||||
; The call may be an assertion or a print command.
|
; The call may be an assertion or a print command.
|
||||||
; The former takes an expression (which must be boolean) as argument.
|
; The former takes an expression (which must be boolean) as argument.
|
||||||
; The latter takes either no argument,
|
; The latter takes either no argument,
|
||||||
; or a format string followed by expressions,
|
; or a formatted string followed by expressions,
|
||||||
; whose number must match the number of containers '{}' in the format string.
|
; whose number must match the number of containers '{}' in the formatted string.
|
||||||
; Note that the console function names are identifiers, not keywords.
|
; Note that the console function names are identifiers, not keywords.
|
||||||
; There are three kind of printing.
|
; There are three kinds of print commands.
|
||||||
|
|
||||||
console-statement = %s"console" "." console-call
|
console-statement = %s"console" "." console-call
|
||||||
|
|
||||||
@ -987,19 +955,17 @@ print-arguments = "(" [ formatted-string *( "," expression ) ] ")"
|
|||||||
print-call = print-function print-arguments
|
print-call = print-function print-arguments
|
||||||
|
|
||||||
; An annotation consists of an annotation name (which starts with '@')
|
; An annotation consists of an annotation name (which starts with '@')
|
||||||
; with optional annotation arguments.
|
; with optional annotation arguments, which are identifiers.
|
||||||
; Note that no parentheses are used if there are no arguments.
|
; Note that no parentheses are used if there are no arguments.
|
||||||
|
|
||||||
annotation = annotation-name
|
annotation = annotation-name
|
||||||
[ "(" identifier *( "," identifier ) ")" ]
|
[ "(" identifier *( "," identifier ) ")" ]
|
||||||
|
|
||||||
; A function declaration defines a function.
|
; A function declaration defines a function.
|
||||||
; This could be called 'function-definition' instead,
|
; The output type is optional, defaulting to the empty tuple type.
|
||||||
; but see the comments about the 'declaration' rule below.
|
|
||||||
; The output type is optional (it defaults to the empty tuple type).
|
|
||||||
; In general, a function input consists of an identifier and a type,
|
; In general, a function input consists of an identifier and a type,
|
||||||
; with an optional 'const' modifier.
|
; with an optional 'const' modifier.
|
||||||
; However, functions inside circuits
|
; Additionally, functions inside circuits
|
||||||
; may start with a 'mut self' or 'const self' or 'self' parameter.
|
; may start with a 'mut self' or 'const self' or 'self' parameter.
|
||||||
; Furthermore, any function may end with an 'input' parameter.
|
; Furthermore, any function may end with an 'input' parameter.
|
||||||
|
|
||||||
@ -1022,8 +988,6 @@ input-parameter = %s"input"
|
|||||||
|
|
||||||
; A circuit member variable declaration consists of an identifier and a type.
|
; A circuit member variable declaration consists of an identifier and a type.
|
||||||
; A circuit member function declaration consists of a function declaration.
|
; A circuit member function declaration consists of a function declaration.
|
||||||
; We could call these 'member-definition' etc.,
|
|
||||||
; but see the comments about the 'declaration' rule below.
|
|
||||||
|
|
||||||
member-declaration = member-variable-declaration
|
member-declaration = member-variable-declaration
|
||||||
/ member-function-declaration
|
/ member-function-declaration
|
||||||
@ -1032,9 +996,8 @@ member-variable-declaration = identifier ":" type
|
|||||||
|
|
||||||
member-function-declaration = function-declaration
|
member-function-declaration = function-declaration
|
||||||
|
|
||||||
; A circuit declaration defines a circuit type. It is straightforward.
|
; A circuit declaration defines a circuit type,
|
||||||
; This could be called 'circuit-definition' instead,
|
; as consisting of member variables and functions.
|
||||||
; but see the comments about the 'declaration' rule below.
|
|
||||||
|
|
||||||
circuit-declaration = *annotation %s"circuit" identifier
|
circuit-declaration = *annotation %s"circuit" identifier
|
||||||
"{" member-declaration *( "," member-declaration ) "}"
|
"{" member-declaration *( "," member-declaration ) "}"
|
||||||
@ -1047,7 +1010,7 @@ circuit-declaration = *annotation %s"circuit" identifier
|
|||||||
; or a parenthesized list of package paths,
|
; or a parenthesized list of package paths,
|
||||||
; which are "fan out" of the initial path.
|
; which are "fan out" of the initial path.
|
||||||
; Note that we allow the last element of the parenthesized list
|
; Note that we allow the last element of the parenthesized list
|
||||||
; to be followed by a comma, presumably for convenience.
|
; to be followed by a comma, for convenience.
|
||||||
|
|
||||||
import-declaration = %s"import" package-path
|
import-declaration = %s"import" package-path
|
||||||
|
|
||||||
@ -1057,12 +1020,6 @@ package-path = "*"
|
|||||||
/ "(" package-path *( "," package-path ) [","] ")"
|
/ "(" package-path *( "," package-path ) [","] ")"
|
||||||
|
|
||||||
; Finally, we define a file as a sequence of zero or more declarations.
|
; Finally, we define a file as a sequence of zero or more declarations.
|
||||||
; This is why we used 'function-declaration' and 'circuit-declaration'
|
|
||||||
; instead of 'function-definition' and 'ciruit-definition':
|
|
||||||
; this way, they are all declarations of some sort.
|
|
||||||
; An import declaration cannot really called an import definition,
|
|
||||||
; because it does not define anything.
|
|
||||||
; But we may revisit this, and use 'definition' instead of 'declaration'.
|
|
||||||
|
|
||||||
declaration = import-declaration
|
declaration = import-declaration
|
||||||
/ function-declaration
|
/ function-declaration
|
||||||
|
Loading…
Reference in New Issue
Block a user