diff --git a/grammar/abnf-grammar.txt b/grammar/abnf-grammar.txt
new file mode 100644
index 0000000000..ff93d19c09
--- /dev/null
+++ b/grammar/abnf-grammar.txt
@@ -0,0 +1,1089 @@
+; Leo Library
+;
+; Copyright (C) 2021 Aleo Systems Inc.
+;
+; Author: Alessandro Coglio (acoglio on GitHub)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Format Note
+; -----------
+
+; The ABNF standard requires grammars to consist of lines terminated by CR LF
+; (i.e. carriage return followed by line feed, DOS/Windows-style),
+; as explained in the background on ABNF later in this file.
+; This file's lines are therefore terminated by CR LF.
+; To avoid losing this requirement across systems,
+; this file is marked as 'text eol=crlf' in .gitattributes:
+; this means that the file is textual, enabling visual diffs,
+; but its lines will always be terminated by CR LF on any system.
+
+; Note that this CR LF requirement only applies to the grammar files themselves.
+; It does not apply to the lines of the languages described by the grammar.
+; ABNF grammar may describe any kinds of languages,
+; with any kind of line terminators,
+; or even without line terminators at all (e.g. for "binary" languages).
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Introduction
+; ------------
+
+; This file contains an initial draft of
+; a complete ABNF (Augmented Backus-Naur Form) grammar of Leo.
+; Background on ABNF is provided later in this file.
+
+; The initial motivation for creating an ABNF grammar of Leo was that
+; we have a formally verified parser of ABNF grammars
+; (at https://github.com/acl2/acl2/tree/master/books/kestrel/abnf;
+; also see the paper at https://www.kestrel.edu/people/coglio/vstte18.pdf)
+; which we have used to parse this ABNF grammar of Leo
+; into a formal representation, in the ACL2 theorem prover,
+; of the Leo concrete syntax.
+; The parsing of this grammar file into an ACL2 representation
+; happens every time the ACL2 formalization of Leo is built.
+
+; In addition to that initial motivation,
+; this ABNF grammar has now the additional and primary purpose of
+; providing an official definition of the syntax of Leo
+; that is both human-readable and machine-readable.
+; This grammar will be part of the (upcoming) Leo language reference,
+; of the Leo Language Formal Specification
+; (i.e. the LaTeX document in the leo-semantics repo),
+; and of the ACL2 formalization of Leo (which was the initial motivation).
+; It has also been suggested that it may be used to generate tests.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Background on ABNF
+; ------------------
+
+; ABNF is an Internet standard:
+; see https://www.rfc-editor.org/info/rfc5234
+; and https://www.rfc-editor.org/info/rfc7405.
+; It is used to specify the syntax of JSON, HTTP, and other standards.
+
+; The following is copied (and "un-LaTeX'd") from the aforementioned paper
+; (at https://www.kestrel.edu/people/coglio/vstte18.pdf).
+
+; ABNF adds conveniences and makes slight modifications
+; to Backus-Naur Form (BNF),
+; without going beyond context-free grammars.
+
+; Instead of BNF's angle-bracket notation for nonterminals,
+; ABNF uses case-insensitive names consisting of letters, digits, and dashes,
+; e.g. HTTP-message and IPv6address.
+; ABNF includes an angle-bracket notation for prose descriptions,
+; e.g. <host, see [RFC3986], Section 3.2.2>,
+; usable as last resort in the definiens of a nonterminal.
+
+; While BNF allows arbitrary terminals,
+; ABNF uses only natural numbers as terminals,
+; and denotes them via:
+; (i) binary, decimal, or hexadecimal sequences,
+; e.g. %b1.11.1010, %d1.3.10, and %x.1.3.A
+; all denote the string '1 3 10';
+; (ii) binary, decimal, or hexadecimal ranges,
+; e.g. %x30-39 denotes any string 'n' with 48 <= n <= 57
+; (an ASCII digit);
+; (iii) case-sensitive ASCII strings,
+; e.g. %s"Ab" denotes the string '65 98';
+; and (iv) case-insensitive ASCII strings,
+; e.g. %i"ab", or just "ab", denotes
+; any string among
+; '65 66',
+; '65 98',
+; '97 66', and
+; '97 98'.
+; ABNF terminals in suitable sets represent ASCII or Unicode characters.
+
+; ABNF allows repetition prefixes n*m,
+; where n and m are natural numbers in decimal notation;
+; if absent,
+; n defaults to 0, and
+; m defaults to infinity.
+; For example,
+; 1*4HEXDIG denotes one to four HEXDIGs,
+; *3DIGIT denotes up to three DIGITs, and
+; 1*OCTET denotes one or more OCTETs.
+; A single n prefix
+; abbreviates n*n,
+; e.g. 3DIGIT denotes three DIGITs.
+
+; Instead of BNF's |, ABNF uses / to separate alternatives.
+; Repetition prefixes have precedence over juxtapositions,
+; which have precedence over /.
+; Round brackets group things and override the aforementioned precedence rules,
+; e.g. *(WSP / CRLF WSP) denotes strings
+; obtained by repeating, zero or more times,
+; either (i) a WSP or (ii) a CRLF followed by a WSP.
+; Square brackets also group things but make them optional,
+; e.g. [":" port] is equivalent to 0*1(":" port).
+
+; Instead of BNF's ::=, ABNF uses = to define nonterminals,
+; and =/ to incrementally add alternatives
+; to previously defined nonterminals.
+; For example, the rule BIT = "0" / "1"
+; is equivalent to BIT = "0" followed by BIT =/ "1".
+
+; The syntax of ABNF itself is formally specified in ABNF
+; (in Section 4 of the aforementioned RFC 5234,
+; after the syntax and semantics of ABNF
+; are informally specified in natural language
+; (in Sections 1, 2, and 3 of the aforementioned RFC 5234).
+; The syntax rules of ABNF prescribe the ASCII codes allowed for
+; white space (spaces and horizontal tabs),
+; line endings (carriage returns followed by line feeds),
+; and comments (semicolons to line endings).
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Structure
+; ---------
+
+; Language specifications often define the syntax of their languages via
+; (i) a lexical grammar that describes how
+; sequence of characters are parsed into tokens, and
+; (ii) a syntactic grammar that described how
+; tokens are parsed into expressions, statements, etc.
+; (The adjectives 'lexical' and 'syntactic' are
+; the ones used in the Java language specification;
+; other terms may be used by other languages,
+; but the essence is similar.)
+; The situation is sometimes more complex,
+; with multiple passes (e.g. Unicode escape processing in Java),
+; but the division between lexical and syntactic (in the sense above) stands.
+
+; In the aforementioned language specifications,
+; both the lexical and syntactic grammars
+; are normally written in a context-free grammar notation,
+; augmented with natural language that may assert, for instance,
+; that tokenization always takes the longest sequence that constitutes a token.
+
+; This dual structure appears to be motivated by the fact that
+; concerns of white space, line endings, etc.
+; can be handled by the lexical grammar,
+; allowing the syntactic grammar to focus on the more important structure.
+; Handling both aspects in a single context-free grammar may be unwieldy,
+; so having two grammars provides more clarity and readability.
+
+; In contrast, PEG (Parsing Expression Grammar) formalisms like Pest
+; naturally embody a procedural interpretation
+; that can handle white space and tokenization in just one manageable grammar.
+; However, this procedural interpretaion may be sometimes
+; less clear and readable to humans than context-free rules.
+; Indeed, context-free grammar are commonly used to documentat languages.
+
+; ABNF is a context-free grammar notation,
+; with no procedural interpretation,
+; and therefore it makes sense to define
+; separate lexical and syntactic ABNF grammars for Leo.
+; Conceptually, the two grammars define two subsequent processing phases,
+; as detailed below.
+; However, a parser implementation does not need to perform
+; two strictly separate phases (in fact, it typically does not),
+; so long as it produces the same final result.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Operator Precedence
+; -------------------
+
+; We formulate the grammar rules for expressions
+; in a way that describes the relative precedence of operators,
+; as often done in language syntax specifications.
+
+; For instance, consider the rules
+;
+;  multiplicative-expression =
+;          exponential-expression
+;        / multiplicative-expression "*" exponential-expression
+;        / multiplicative-expression "/" exponential-expression
+;
+;  additive-expression =
+;          multiplicative-expression
+;        / additive-expression "+" multiplicative-expression
+;        / additive-expression "-" multiplicative-expression
+;
+; this rule tells us that the additive operators '+' and '-' have
+; lower precedence than the multiplicative operators '*' and '/',
+; and that both the additive and multiplicative operators associate to the left.
+; This may be best understood via the examples given below.
+
+; According to the rules, the expression
+;
+;  x + y * z
+;
+; can only be parsed as
+;
+;    +
+;   / \
+;  x   *
+;     / \
+;    y   z
+;
+; and not as
+;
+;      *
+;     / \
+;    +   z
+;   / \
+;  x   y
+;
+; because a multiplicative expression cannot have an additive expression
+; as first sub-expression, as it would in the second tree above.
+
+; Also according to the rules, the expression
+;
+;  x + y + z
+;
+; can only be parsed as
+;
+;      +
+;     / \
+;    +   z
+;   / \
+;  x   y
+;
+; and not as
+;
+;    +
+;   / \
+;  x   +
+;     / \
+;    y   z
+;
+; because an additive expression cannot have an additive expression
+; as second sub-expression, as it would in the second tree above.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Naming Convention
+; -----------------
+
+; For this ABNF grammar, we choose nonterminal names
+; that consist of complete English words, separated by dashes,
+; and that describe the construct the way it is in English.
+; For instance, we use the name 'conditional-statement'
+; to describe conditional statements.
+
+; At the same time, we attempt to establish
+; a precise and "official" nomenclature for the Leo constructs,
+; by way of the nonterminal names that define their syntax.
+; For instance, the rule
+;
+;  group-literal = product-group-literal
+;                / affine-group-literal
+;
+; tells us that there are two kinds of group literals,
+; namely product group literals and affine group literals.
+; This is more precise than describing them as
+; integers (which are not really group elements per se),
+; or points (they are all points, just differently specified),
+; or being singletons vs. pairs (which is a bit generic).
+
+; The only exception to the nomenclature-establishing role of the grammar
+; is the fact that, as discussed above,
+; we write the grammar rules in a way that determines
+; the relative precedence and the associativity of expression operators,
+; and that therefore we have rules like
+;
+;  unary-expression = primary-expression
+;                   / "!" unary-expression
+;                   / "-" unary-expression
+;
+; In order to allow the recursion of the rule to stop,
+; we need to regard, in the grammar, a primary expression as a unary expression.
+; However, this is just a grammatical artifact:
+; ontologically, a primary expression is not really a unary expression,
+; because a unary expression is one that consists of
+; a unary operator and an operand sub-expression.
+; These terminological "exceptions" should be easy to identify in the rules.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Lexical Grammar
+; ---------------
+
+; A Leo file is a finite sequence of Unicode characters,
+; represented as Unicode code points,
+; which are numbers in the range form 0 to 10FFFFh.
+; These are captured by the ABNF rule 'character' below.
+
+; The lexical grammar defines how, conceptually,
+; the sequence of characters is turned into
+; a sequence of tokens, comments, and whitespaces.
+
+; As stated, the lexical grammar is ambiguous.
+; For example, the sequence of characters '**' (i.e. two stars)
+; could be equally parsed as two '*' symbol tokens or one '**' symbol token
+; (see rule for 'symbol' below).
+; As another example, the sequence or characters '<CR><LF>'
+; (i.e. carriage return followed by line feed)
+; could be equally parsed as two line terminators or one
+; (see rule for 'newline').
+
+; Thus, as often done in language syntax definitions,
+; the lexical grammar is disambiguated by
+; the extra-grammatical requirement that
+; the longest possible sequence of characters is always parsed.
+; This way, '**' must be parsed as one '**' symbol token,
+; and '<CR><LF>' must be parsed as one line terminator.
+; (We should formalize this requirement,
+; along with the other extra-grammatical requirements given below,
+; and formally prove that they indeed make
+; the lexical grammar of Leo unambiguous.)
+
+; As mentioned above, a character is any Unicode code point.
+; This grammar does not say how those are encoded in files (e.g. UTF-8):
+; it starts with a decoded sequence of Unicode code points.
+; Note that we allow any value,
+; even though some values may not be used according to the Unicode standard.
+
+character = %x0-10FFFF   ; any Unicode code point
+
+; We give names to certain ASCII characters.
+
+horizontal-tab = %x9
+
+line-feed = %xA
+
+carriage-return = %xD
+
+space = %x20
+
+double-quote = %x22
+
+; We give names to complements of certain ASCII characters.
+; These consist of all the Unicode characters except for one or two.
+
+not-double-quote = %x0-22 / %x24-10FFFF   ; anything but "
+
+not-star = %x0-29 / %x2B-10FFFF   ; anything but *
+
+not-line-feed-or-carriage-return = %x0-9 / %xB-C / %xE-10FFFF
+                                   ; anything but LF or CR
+
+not-star-or-slash = %x0-29 / %x2B-2E / %x30-10FFFF   ; anything but * or /
+
+; Lines in Leo may be terminated via
+; a single carriage return,
+; a line feed,
+; or a carriage return immediately followed by a line feed.
+; Note that the latter combination constitutes a single line terminator,
+; according to the extra-grammatical rule of the longest sequence.
+
+newline = line-feed / carriage-return / carriage-return line-feed
+
+; Line terminators form whitespace, along with spaces and horizontal tabs.
+
+whitespace = space / horizontal-tab / newline
+
+; There are two kinds of comments in Leo, as in other languages.
+; One is block comments of the form '/* ... */',
+; and the other is end-of-line comments '// ...'.
+; The first kind start at '/*' and end at the first '*/',
+; possibly spanning multiple (partial) lines;
+; they do no nest.
+; The second kind start at '//' and extend till the end of the line.
+; The rules about comments given below are similar to
+; the ones used in the Java language specification.
+
+comment = block-comment / end-of-line-comment
+
+block-comment = "/*" rest-of-block-comment
+
+rest-of-block-comment = "*" rest-of-block-comment-after-star
+                      / not-star rest-of-block-comment
+
+rest-of-block-comment-after-star = "/"
+                                 / "*" rest-of-block-comment-after-star
+                                 / not-star-or-slash rest-of-block-comment
+
+end-of-line-comment = "//" *not-line-feed-or-carriage-return newline
+
+; Below are the keywords in the Leo language.
+; They cannot be used as identifiers.
+
+keyword = %s"address"
+        / %s"as"
+        / %s"bool"
+        / %s"circuit"
+        / %s"console"
+        / %s"const"
+        / %s"else"
+        / %s"false"
+        / %s"field"
+        / %s"for"
+        / %s"function"
+        / %s"group"
+        / %s"i8"
+        / %s"i16"
+        / %s"i32"
+        / %s"i64"
+        / %s"i128"
+        / %s"if"
+        / %s"import"
+        / %s"in"
+        / %s"input"
+        / %s"let"
+        / %s"mut"
+        / %s"return"
+        / %s"Self"
+        / %s"self"
+        / %s"static"
+        / %s"string"
+        / %s"true"
+        / %s"u8"
+        / %s"u16"
+        / %s"u32"
+        / %s"u64"
+        / %s"u128"
+
+; The following rules define (ASCII) digits
+; and (uppercase and lowercase) letters.
+
+digit = %x30-39   ; 0-9
+
+uppercase-letter = %x41-5A   ; A-Z
+
+lowercase-letter = %x61-7A   ; a-z
+
+letter = uppercase-letter / lowercase-letter
+
+; An identifier is a non-empty sequence of letters, digits, and underscores,
+; starting with a letter.
+; It must not be a keyword: this is an extra-grammatical constraint.
+
+identifier = letter *( letter / digit / "_" )   ; but not a keyword
+
+; A package name consists of one or more segments separated by single dashes,
+; where each segment is a non-empty sequence of lowercase letters and digits.
+
+package-name = 1*( lowercase-letter / digit )
+               *( "-" 1*( lowercase-letter / digit ) )
+
+; An address starts with 'aleo1'
+; and continues with exactly 58 lowercase letters and digits.
+; Thus an address always consists of 63 characters.
+
+address = %s"aleo1" 58( lowercase-letter / digit )
+
+; A format string is a sequence of characters, other than double quote,
+; surrounded by double quotes.
+; Within a format string, substrings '{}' are distinguished as containers
+; (these are the ones that may be matched with values
+; whose textual representation replaces the containers
+; in the printed string).
+; There is an implicit extra-grammatical requirements that
+; the explicit 'format-container' instances include
+; all the occurrences of '{}' in the parsed character sequence:
+; that is, there may not be two contiguous 'not-double-quote' instances
+; that are '{' and '}'.
+
+format-string-container = "{}"
+
+format-string = double-quote
+                *( not-double-quote / format-string-container )
+                double-quote
+
+; Here is (part of this ABNF comment),
+; an alternative way to specify format strings,
+; which captures the extra-grammatical requirement above in the grammar
+; but is more complicated:
+;
+;  not-double-quote-or-open-brace = %x0-22 / %x24-7A / %x7C-10FFFF
+;
+;  not-double-quote-or-close-brace = %x0-22 / %x24-7C / %x7E-10FFFF
+;
+;  format-string-element = not-double-quote-or-open-brace
+;                        / "{" not-double-quote-or-close-brace
+;                        / format-container
+;
+;  format-string = double-quote *format-string-element double-quote
+;
+; It is not immediately clear which approach is better; there are tradeoffs.
+; Regardless, we should choose one eventually.
+
+; Annotations are built out of names and arguments, which are tokens.
+; Two names are currently supported.
+; An argument is a sequence of one or more letters, digits, and underscores.
+
+annotation-name = %s"@context" / %s"@test"
+
+annotation-argument = 1*( letter / digit / "_" )
+
+; A natural (number) is a sequence of one or more digits.
+; Note that we allow leading zeros, e.g. '007'.
+
+natural = 1*digit
+
+; An integer (number) is either a natural or its negation.
+; Note that we also allow leading zeros in negative numbers, e.g. '-007'.
+
+integer = [ "-" ] natural
+
+; An untyped literal is just an integer.
+
+untyped-literal = integer
+
+; Unsigned literals are naturals followed by unsigned types.
+
+unsigned-literal = natural ( %s"u8" / %s"u16" / %s"u32" / %s"u64" / %s"u128" )
+
+; Signed literals are integers followed by signed types.
+
+signed-literal = integer ( %s"i8" / %s"i16" / %s"i32" / %s"i64" / %s"i128" )
+
+; Field literals are integers followed by the type of field elements.
+
+field-literal = integer %s"field"
+
+; There are two kinds of group literals.
+; One is a single integer followed by the type of group elements,
+; which denotes the scalar product of the generator point by the integer.
+; The other is a pair of integer coordinates,
+; which are reduced modulo the prime to identify a point,
+; which must be on the elliptic curve.
+; It is also allowed to omit one (not both) coordinates,
+; with an indication of how to infer the missing coordinate
+; (i.e. sign high, sign low, or inferred).
+
+product-group-literal = integer %s"group"
+
+group-coordinate = integer / "+" / "-" / "_"
+
+affine-group-literal = "(" group-coordinate "," group-coordinate ")" %s"group"
+
+group-literal = product-group-literal / affine-group-literal
+
+; Note that the rule for group literals above
+; allows no whitespace between coordinates.
+; If we want to allow whitespace,
+; e.g. '(3, 4)group' as opposed to requiring '(3,4)group',
+; then we should define affine group literals
+; in the syntactic grammar instead of in the lexical grammar.
+; We should have a notion of atomic literal in the lexical grammar,
+; and (non-atomic) literal in the syntactic grammar.
+; The lexical grammar should define a token for ')group'
+; if we want no whitespace between the closing parenthesis and 'group'.
+; More precisely, the rule for 'literal' below in the lexical grammar
+; would be replaced with
+;
+;  atomic-literal = ... / product-group-literal
+;
+; where the '...' stands for all the '...-literal' alternatives
+; in the current rule for 'literal' below, except 'group-literal'.
+; Furthermore, the rule for 'symbol' below in the lexical grammar
+; would be extended to
+;
+;  symbol = ... / %s")group"
+;
+; where '...' stands for the current definiens of the rule.
+; We would also have to adjust the rule for 'token' below in the lexical grammar
+; to reference 'atomic-literal' instead of 'literal' in the definiens.
+; We would then add to the syntactic grammar the following rules
+;
+;  affine-group-literal = "(" group-coordinate "," group-coordinate %s")group"
+;
+;  literal = atomic-literal / affine-group-literal
+;
+; which would now define literals in the syntactic grammar.
+; Note that now an affine group literal would have the form
+;
+;  ( <ow> <coordinate> <ow> , <ow> <coordinate> <ow> )group
+;
+; where <ow> is optional whitespace;
+; however, no whitespace is allowed between the closing ')' and 'group'.
+
+; Boolean literals are the usual two.
+
+boolean-literal = %s"true" / %s"false"
+
+; An address literal is an address wrapped into an indication of address,
+; to differentiate it from an identifier.
+
+address-literal = %s"address" "(" address ")"
+
+; The ones above are all the literals, as defined by the following rule.
+
+literal = untyped-literal
+        / unsigned-literal
+        / signed-literal
+        / field-literal
+        / group-literal
+        / boolean-literal
+        / address-literal
+
+; After defining the (mostly) alphanumeric tokens above,
+; it remains to define tokens for non-alphanumeric symbols such as "+" and "(".
+; Different programming languages used different terminologies for these,
+; e.g. operators, separators, punctuators, etc.
+; Here we use 'symbol', for all of them, but we can do something different.
+; We could give names to all of these symbols,
+; via rules such as
+;
+;  equality-operator = "=="
+;
+; and defining 'symbol' in terms of those
+;
+;  symbol = ... / equality-operator / ...
+;
+; This may or may not make the grammar more readable,
+; but it would help establish a terminology in the grammar,
+; namely the exact names of some of these token.
+; On the other hand, at least some of them are perhaps simple enough
+; that they could be just described in terms of their symbols,
+; e.g. 'double dot', 'question mark', etc.
+
+symbol = "!" / "&&" / "||"
+       / "==" / "!="
+       / "<" / "<=" / ">" / ">="
+       / "+" / "-" / "*" / "/" / "**"
+       / "=" / "+=" / "-=" / "*=" / "/=" / "**="
+       / "(" / ")"
+       / "[" / "]"
+       / "{" / "}"
+       / "," / "." / ".." / "..." / ";" / ":" / "::" / "?"
+       / "->"
+
+; Everything defined above, other than comments and whitespace,
+; is a token, as defined by the following rule.
+
+token = keyword
+      / identifier
+      / literal
+      / package-name
+      / format-string
+      / annotation-name
+      / annotation-argument
+      / symbol
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Syntactic Grammar
+; -----------------
+
+; The processing defined by the lexical grammar above
+; turns the initial sequence of characters
+; into a sequence of tokens, comments, and whitespaces.
+; The purpose of comments and whitespaces, from a syntactic point of view,
+; is just to separate tokens:
+; they are discarded, leaving a sequence of tokens.
+; The syntactic grammar describes how to turn
+; a sequence of tokens into concrete syntax trees.
+
+; There are unsigned and signed integer types, for five sizes.
+
+unsigned-type = %s"u8" / %s"u16" / %s"u32" / %s"u64" / %s"u128"
+
+signed-type = %s"i8" / %s"i16" / %s"i32" / %s"i64" / %s"i128"
+
+integer-type = unsigned-type / signed-type
+
+; The integer types, along with the field and group types,
+; for the arithmetic types, i.e. the ones that support arithmetic operations.
+
+field-type = %s"field"
+
+group-type = %s"group"
+
+arithmetic-type = integer-type / field-type / group-type
+
+; The arithmetic types, along with the boolean and address types,
+; form the scalar types, i.e. the ones whose values do not contain (sub)values.
+
+boolean-type = %s"bool"
+
+address-type = %s"address"
+
+scalar-type =  boolean-type / arithmetic-type / address-type
+
+; Circuit types are denoted by identifiers and the keyword 'Self'.
+; The latter is only allowed inside a circuit definition,
+; to denote the defined circuit.
+
+self-type = %s"Self"
+
+circuit-type = identifier / self-type
+
+; A tuple type consists of zero, two, or more component types.
+
+tuple-type = "(" [ type 1*( "," type ) ] ")"
+
+; An array type consists of an element type
+; and an indication of dimensions.
+; There is either a single dimension (a number),
+; or a tuple of one or more dimensions
+; (we could restrict  the latter to two or more dimensions).
+
+array-type = "[" type ";" array-dimensions "]"
+
+array-dimensions = natural
+                 / "(" natural *( "," natural ) ")"
+
+; Circuit, tuple, and array types form the aggregate types,
+; i.e. types whose values contain (sub)values
+; (with the corner-case exception of the empty tuple value).
+
+aggregate-type = tuple-type / array-type / circuit-type
+
+; Scalar and aggregate types form all the types.
+
+type = scalar-type / aggregate-type
+
+; As often done in grammatical language syntax specifications,
+; we define rules for different kinds of expressions,
+; which also defines the relative precedence
+; of operators and other expression constructs,
+; and the (left or right) associativity of binary operators.
+
+; The primary expressions are self-contained in a way,
+; i.e. they have clear deliminations.
+; Some consist of single tokens:
+; identifiers, the keywords 'self' and 'input', and literals.
+; Primary expressions also include parenthesized expressions,
+; i.e. any expression may be turned into a primary one
+; by putting parentheses around it.
+; The other kinds are defined and explained below.
+
+primary-expression = identifier
+                   / %s"self"
+                   / %s"input"
+                   / literal
+                   / "(" expression ")"
+                   / tuple-expression
+                   / array-expression
+                   / circuit-expression
+                   / function-call
+
+; There are tuple expressions to construct and deconstruct tuples.
+; A construction consists of zero, two, or more component expressions.
+; A deconstruction uses a component index (zero-indexed).
+; Note that constructions are delimited by closing parentheses
+; and deconstructions are delimited by natural tokens.
+; The rule below, and similar rules for other aggregate types,
+; use the perhaps more familiar 'access',
+; but note that 'deconstruction' has a nice symmetry to 'construction';
+; the term 'destructor' has a different meaning in other languages,
+; so we may want to avoid it, but not necessarily.
+
+tuple-construction = "(" [ expression 1*( "," expression ) ] ")"
+
+tuple-access = primary-expression "." natural
+
+tuple-expression = tuple-construction / tuple-access
+
+; The are array expressions to construct and deconstruct arrays.
+; There are two kinds of constructions:
+; one lists the element expressions (at least one),
+; including spreads (via '...') which are arrays being spliced in;
+; the other repeats (the value of) a single expression
+; across one or more dimensions.
+; There are two kinds of deconstructions:
+; one selects a single element by index (zero-indexed);
+; the other selects a range via two indices,
+; the first inclusive and the second exclusive --
+; both are optional,
+; the first defaulting to 0 and the second to the array length.
+; Note that these expressions are all delimited
+; by closing square brackets.
+
+array-inline-construction = "["
+                            array-inline-element
+                            *( "," array-inline-element )
+                            "]"
+
+array-inline-element = expression / "..." expression
+
+array-repeat-construction = "[" expression ";" array-dimensions "]"
+
+array-construction = array-inline-construction / array-repeat-construction
+
+array-element-access = primary-expression "[" expression "]"
+
+array-range-access = primary-expression "[" [expression] ".." [expression] "]"
+
+array-access = array-element-access / array-range-access
+
+array-expression = array-construction / array-access
+
+; There are circuit expressions to construct and deconstruct circuit values.
+; A construction lists values for all the member variables (in any order);
+; there must be at least one member variable currently.
+; A deconstruction selects a member variable by name.
+; Note that these expressions are delimited,
+; by closing curly braces or identifiers.
+
+circuit-construction = circuit-type "{"
+                       circuit-inline-element *( "," circuit-inline-element )
+                       "}"
+
+circuit-inline-element = identifier ":" expression
+
+circuit-access = primary-expression "." identifier
+
+circuit-expression = circuit-construction / circuit-access
+
+; There are three kinds of function calls:
+; top-level function calls,
+; instance (i.e. non-static) member function calls, and
+; static member function calls.
+; What changes is the start, but they all end in an argument list,
+; delimited by a closing parenthesis.
+
+top-level-function-call = identifier function-arguments
+
+instance-member-function-call = primary-expression "."
+                                identifier function-arguments
+
+static-member-function-call = circuit-type "::" identifier function-arguments
+
+function-call = top-level-function-call
+              / instance-member-function-call
+              / static-member-function-call
+
+function-arguments = "(" [ expression *( "," expression ) ] ")"
+
+; Unary operators have the highest precedence.
+; They apply to primary expressions
+; and recursively to unary expressions.
+
+unary-expression = primary-expression
+                 / "!" unary-expression
+                 / "-" unary-expression
+
+; Next in the operator precedence is exponentiation,
+; following mathematical practice.
+; The current rule below makes exponentiation left-associative,
+; i.e. 'a ** b ** c' must be parsed as '(a ** b) ** c'.
+; This is easy to change if we want it to be right-associative instead.
+
+exponential-expression = unary-expression
+                       / exponential-expression "**" unary-expression
+
+; Next in precedence come multiplication and division, both left-associative.
+
+multiplicative-expression = exponential-expression
+                          / multiplicative-expression "*" exponential-expression
+                          / multiplicative-expression "/" exponential-expression
+
+; Then there are addition and subtraction, both left-assocative.
+
+additive-expression = multiplicative-expression
+                    / additive-expression "+" multiplicative-expression
+                    / additive-expression "-" multiplicative-expression
+
+; Next in the precedence order are ordering relations.
+; These are not associative, because they return boolean values.
+
+ordering-expression = additive-expression
+                    / additive-expression "<" additive-expression
+                    / additive-expression ">" additive-expression
+                    / additive-expression "<=" additive-expression
+                    / additive-expression ">=" additive-expression
+
+; Equalities return booleans but may also operate on boolean,
+; so we make them left-associative.
+
+equality-expression = ordering-expression
+                    / equality-expression "==" ordering-expression
+                    / equality-expression "!=" ordering-expression
+
+; Next come conjunctive expressions, left-associative.
+
+conjunctive-expression = equality-expression
+                       / conjunctive-expression "&&" equality-expression
+
+; Next come disjunctive expressions, left-associative.
+
+disjunctive-expression = conjunctive-expression
+                       / disjunctive-expression "||" conjunctive-expression
+
+; Finally we have conditional expressions.
+
+conditional-expression = disjunctive-expression
+                       / conditional-expression
+                         "?" expression
+                         ":" conditional-expression
+
+; These are all the expressions.
+; Recall that conditional expressions
+; may be disjunctive expressions,
+; which may be conjunctive expressions,
+; and so on all the way to primary expressions.
+
+expression = conditional-expression
+
+; There are various kinds of statements,
+; including blocks, which are
+; possibly empty sequences of statements surounded by curly braces.
+
+statement = expression-statement
+          / return-statement
+          / variable-definition-statement
+          / conditional-statement
+          / loop-statement
+          / assignment-statement
+          / console-statement
+          / block
+
+block = "{" *statement "}"
+
+; An expression (that returns the empty tuple)
+; can be turned into a statement by appending a semicolon.
+
+expression-statement = expression ";"
+
+; A return statement always takes an expression,
+; and does not end with a semicolon (but we may want to change that).
+
+return-statement = %s"return" expression
+
+; There are two kinds of variable definition statements,
+; which only differ in the starting keyword.
+; The variables are either a single one or a tuple of two or more;
+; in all cases, there is just one optional type
+; and just one initializing expression.
+
+variable-definition-statement = ( %s"let" / %s"const" )
+                                identifier-or-identifiers
+                                [ ":" type ] "=" expression ";"
+
+identifier-or-identifiers = identifier
+                          / "(" identifier 1*( "," identifier ) ")"
+
+; A conditional statement always starts with a condition and a block
+; (which together form a branch).
+; It may stop there, or it may continue with an alternative block,
+; or possibly with another conditional statement, forming a chain.
+; Note that we require blocks in all branches, not merely statements.
+
+branch = %s"if" expression block
+
+conditional-statement = branch
+                      / branch %s"else" block
+                      / branch %s"else" conditional-statement
+
+; A loop statement implicitly defines a loop variable
+; that goes from a starting value (inclusive) to an ending value (exclusive).
+; The body is a block.
+
+loop-statement = %s"for" identifier %s"in" expression ".." expression block
+
+; An assignment statement is straightforward.
+; Based on the operator, the assignment may be simple (i.e. '=')
+; or compound (i.e. combining assignment with an arithmetic operation).
+
+assignment-operator = "=" / "+=" / "-=" / "*=" / "/=" / "**="
+
+assignment-statement = expression assignment-operator expression ";"
+
+; Console statements start with the 'console' keyword,
+; followed by a console function call.
+; The call may be an assertion or a print command.
+; The former takes an expression (which must be boolean) as argument.
+; The latter takes either no argument,
+; or a format string followed by expressions,
+; whose number must match the number of containers '{}' in the format string.
+; Note that the console function names are identifiers, not keywords.
+; There are three kind of printing.
+
+console-statement = %s"console" "." console-call
+
+console-call = assert-call
+             / print-call
+
+assert-call = %s"assert" "(" expression ")"
+
+print-function = %s"debug" / %s"error" / %s"log"
+
+print-arguments = "(" [ format-string *( "," expression ) ] ")"
+
+print-call = print-function print-arguments
+
+; An annotation consists of an annotation name (which starts with '@')
+; with optional annotation arguments.
+; Note that no parentheses are used if there are no arguments.
+
+annotation = annotation-name
+             [ "(" annotation-argument *( "," annotation-argument ) ")" ]
+
+; A function declaration defines a function.
+; This could be called 'function-definition' instead,
+; but see the comments about the 'declaration' rule below.
+; The output type is optional (it defaults to the empty tuple type).
+; In general, a function input consists of an identifier and a type,
+; with an optional 'const' modifier.
+; However, functions inside circuits
+; may start with a 'mut self' or 'self' parameter,
+; which may be the only parameter in fact.
+; Furthermore, any function may end with an 'input' parameter,
+; which may be the only parameter in fact.
+
+function-declaration = *annotation %s"function" identifier
+                       "(" [ function-parameters ] ")" [ "->" type ]
+                       block
+
+function-parameters = self-parameter [ "," input-parameter ]
+                    / self-parameter "," function-inputs [ "," input-parameter ]
+                    / function-inputs [ "," input-parameter ]
+                    / input-parameter
+
+self-parameter = [%s"mut"] %s"self"
+
+function-inputs = function-input *( "," function-input )
+
+function-input = [ %s"const" ] identifier ":" type
+
+input-parameter = %s"input"
+
+; A circuit member variable declaration consists of an identifier and a type.
+; A circuit member function declaration consists of a function declaration.
+; We could call these 'member-definition' etc.,
+; but see the comments about the 'declaration' rule below.
+
+member-declaration = member-variable-declaration
+                   / member-function-declaration
+
+member-variable-declaration = identifier ":" type
+
+member-function-declaration = function-declaration
+
+; A circuit declaration defines a circuit type. It is straightforward.
+; This could be called 'circuit-definition' instead,
+; but see the comments about the 'declaration' rule below.
+
+circuit-declaration = *annotation %s"circuit" identifier
+                      "{" member-declaration *( "," member-declaration ) "}"
+
+; An import declaration consists of the 'import' keyword
+; followed by a package path, which may be one of the following:
+; a single wildcard;
+; an identifier, optionally followed by a local renamer;
+; a package name followed by a path, recursively;
+; or a parenthesized list of package paths,
+; which are "fan out" of the initial path.
+; Note that we allow the last element of the parenthesized list
+; to be followed by a comma, presumably for convenience.
+
+import-declaration = %s"import" package-path
+
+package-path = "*"
+             / identifier [ %s"as" identifier ]
+             / package-name "." package-path
+             / "(" package-path *( "," package-path ) [","] ")"
+
+; Finally, we define a file as a sequence of zero or more declarations.
+; This is why we used 'function-declaration' and 'circuit-declaration'
+; instead of 'function-definition' and 'ciruit-definition':
+; this way, they are all declarations of some sort.
+; An import declaration cannot really called an import definition,
+; because it does not define anything.
+; But we may revisit this, and use 'definition' instead of 'declaration'.
+
+declaration = import-declaration
+            / function-declaration
+            / circuit-declaration
+
+file = *declaration