Clean up outdated docs

This commit is contained in:
Pranav Gaddamadugu 2022-08-18 15:45:30 -07:00
parent 7f37b2f3d8
commit eb4ee1428c
45 changed files with 9 additions and 4680 deletions

FAQs.md
View File

@ -1,26 +0,0 @@
# FAQs
#### For some given code, changing the value in a constant variable changes the number of constraints in the generated circuit. Is this behavior correct?
**Yes.** Take integers as an example. In Leo, an integer is represented by its binary decomposition,
with each bit occupying one field element (which takes the value 0 or 1). For an expression such as `a == 4u32`, the equality check
comprises a linear pass of bitwise `AND` operations, comparing every bit in the **variable** value with the corresponding bit in the **constant** value.
Because the constant value is already known to the compiler during circuit synthesis, the compiler can complete part of the equality evaluation ahead of time:
any bit in the constant value that is `0` trivially evaluates to `0`, so no constraint is needed for it. As a result, depending on the value of the constant integer in your code,
the total number of constraints in the generated circuit can vary.
Here are two examples to illustrate the difference:
```
constant = 00000001
variable = abcdefgh
---------------------------------
output = 0000000h (1 constraint)
```
```
constant = 01110001
variable = abcdefgh
---------------------------------
output = 0bcd000h (4 constraints)
```
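For concreteness, here is a minimal sketch of the kind of program this applies to (the names and values are illustrative, not taken from any particular project):
```js
function main(a: u32) -> bool {
    // `c` is known at circuit synthesis time, so only its `1` bits
    // contribute equality constraints; its `0` bits are checked "for free".
    const c: u32 = 1;
    return a == c;
}
```
Changing `c` to a value with more `1` bits (e.g. `113u32`, which is `01110001` in binary) increases the constraint count accordingly.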

View File

@ -35,9 +35,8 @@ integrate private applications into your stack. Leo compiles to circuits making
The syntax of Leo is influenced by traditional programming languages like JavaScript, Scala, and Rust, with a strong emphasis on readability and ease-of-use.
Leo offers developers tools to sanity check circuits, including unit tests, integration tests, and console functions.
Leo is one part of a greater ecosystem for building private applications on [Aleo](https://aleo.org/). If your goal is to build a user experience
on the web that is both truly personal and truly private, then we recommend downloading the [Aleo Studio IDE](https://aleo.studio/)
and checking out the [Aleo Package Manager](https://aleo.pm/).
Leo is one part of a greater ecosystem for building private applications on [Aleo](https://aleo.org/).
The language is currently in an alpha stage and is subject to breaking changes.
## 2. Build Guide
@ -58,22 +57,9 @@ We recommend installing Rust using [rustup](https://www.rustup.rs/). You can ins
Download the [Windows 32-bit executable](https://win.rustup.rs/i686) and follow the on-screen instructions.
### 2.2a Build from Crates.io
### 2.2 Build from Source Code
We recommend installing Leo this way. In your terminal, run:
```bash
cargo install leo-lang
```
Now to use Leo, in your terminal, run:
```bash
leo
```
### 2.2b Build from Source Code
Alternatively, you can install Leo by building from the source code as follows:
We recommend installing Leo by building from the source code as follows:
```bash
# Download the source code
@ -95,8 +81,8 @@ Use the Leo CLI to create a new project
```bash
# create a new `hello-world` Leo project
leo new hello-world
cd hello-world
leo new helloworld
cd helloworld
# build & setup & prove & verify
leo run
@ -104,7 +90,7 @@ leo run
The `leo new` command creates a new Leo project with a given name.
The `leo run` command will compile the main program, generate keys for a trusted setup, fetch inputs, generate a proof and verify it.
The `leo run` command will compile the program into Aleo instructions and run it.
Congratulations! You've just run your first Leo program.
@ -112,8 +98,7 @@ Congratulations! You've just run your first Leo program.
* [Hello World - Next Steps](https://developer.aleo.org/developer/getting_started/hello_world)
* [Leo Language Documentation](https://developer.aleo.org/developer/language/layout)
* [Leo ABNF Grammar](./grammar/README.md)
* [Leo CLI Documentation](https://developer.aleo.org/developer/cli/new)
* [Leo ABNF Grammar](./docs/grammar/abnf-grammar.txt)
* [Homepage](https://developer.aleo.org/developer/getting_started/overview)
## 5. Contributing

View File

@ -2,16 +2,8 @@
The following describes our procedure for addressing major and minor security concerns in Leo.
## Testnet I
As Aleo is currently in the prototype stage and does not operate a platform intended for production use,
our security procedures are designed to promote public disclosure and quick security resolution.
In preparation for the production stage, we will release new security guidelines and
issue new procedures for addressing the disclosure of sensitive security vulnerabilities.
### Reporting a Bug
During Testnet I, all software bugs should be reported by filing a Github issue.
During Testnet 3, all software bugs should be reported by filing a Github issue.
If you are unsure and would like to reach out to us directly, please email security \_at\_ aleo.org to elaborate on the issue.

View File

@ -4,349 +4,3 @@
[![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS)
[![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md)
This directory contains the code for the AST of a Leo Program.
## Node Types
There are several types of nodes in the AST, which are broken down further below.
All nodes store a Span, which is useful for tracking the lines and
columns of where the node was taken from in the Leo Program.
### [Program/File](./src/program.rs)
The top level nodes in a Leo Program.
#### [Imports](./src/imports/import.rs)
Represents an import statement in a Leo Program.
A list of these are stored on the Program.
It stores the path to an import and what is being imported.
**NOTE**: The import does not contain the source code of the imported Leo Program.
#### [Circuits](./src/circuits/circuit.rs)
A circuit node represents a defined Circuit in a Leo Program.
An order-preserving map of these is stored on the Program.
Contains the Circuit's name, as well as its members.
Each member is a function, a variable, or a constant.
The Circuit preserves the names of all of its members.
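For reference, here is a small, purely illustrative circuit in the syntax these nodes represent (the name and members are hypothetical):
```js
circuit Point {
    x: u32,
    y: u32,

    function sum(self) -> u32 {
        return self.x + self.y;
    }
}
```
The circuit node records the name `Point` and its three members: the variables `x` and `y` and the function `sum`.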
#### [Annotations](./src/functions/annotation.rs)
An annotation node is a decorator that can be applied to a function.
They are stored on the functions themselves despite being top-level nodes.
The node stores the name of the annotation, as well as any arguments passed to it.
#### [Functions](./src/functions/mod.rs)
A function node represents a defined function in a Leo Program.
An order-preserving map of these is stored on the Program.
A function node stores the following (see the sketch after this list):
- The annotations applied to the function.
- An identifier naming the function.
- The inputs to the function, both their names and types.
- The output type of the function, if it exists.
- The function body, stored as a block statement.
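A sketch of a function declaration showing each of the pieces listed above (the annotation and names are illustrative):
```js
@test
function double(a: u32) -> u32 {
    return a + a;
}
```
Here the node stores the `@test` annotation, the identifier `double`, the input `a: u32`, the output type `u32`, and the block body.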
#### [Global Consts](./src/program.rs)
A global const has no dedicated node of its own; rather, it is represented as a definition statement.
An order-preserving map of these is stored on the Program.
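For example, a hypothetical global const and its use:
```js
const LIMIT: u32 = 100;

function main(n: u32) -> bool {
    return n < LIMIT;
}
```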
### [Types](./src/types/type_.rs)
The different types in a Leo Program.
Types themselves are not nodes, but rather information stored on nodes.
#### Address
The address type follows the [BIP_0173](https://en.bitcoin.it/wiki/BIP_0173) format starting with `aleo1`.
#### Boolean
The boolean type consists of two values **true** and **false**.
#### Char
The char type represents a character in the inclusive range [0, 10FFFF].
#### Field
The field type represents an unsigned integer less than the modulus of the field.
#### Group
The group type represents a point in the set of affine points on the underlying elliptic curve.
#### [IntegerType](./src/types/integer_type.rs)
The integer type represents a range of integer types.
##### U8
An integer in the inclusive range [0, 255].
##### U16
An integer in the inclusive range [0, 65535].
##### U32
An integer in the inclusive range [0, 4294967295].
##### U64
An integer in the inclusive range [0, 18446744073709551615].
##### U128
An integer in the inclusive range [0, 340282366920938463463374607431768211455].
##### I8
An integer in the inclusive range [-128, 127].
##### I16
An integer in the inclusive range [-32768, 32767].
##### I32
An integer in the inclusive range [-2147483648, 2147483647].
##### I64
An integer in the inclusive range [-9223372036854775808, 9223372036854775807].
##### I128
An integer in the inclusive range [-170141183460469231731687303715884105728, 170141183460469231731687303715884105727].
#### Array
The array type contains an element type,
along with either a single element count greater than 0
for one-dimensional arrays,
or a list of such element counts
for multidimensional arrays.
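A sketch of both forms (the values are illustrative):
```js
function main() {
    let flat: [u8; 3] = [1, 2, 3];          // one-dimensional: element type and a single count
    let grid: [u8; (2, 3)] = [0; (2, 3)];   // multidimensional: element type and a list of counts
}
```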
#### Tuple
The tuple type contains n types, where n is greater than or equal to 0.
#### Identifier
An identifier type is either a circuit type or a type alias;
every circuit type represents a different type.
#### SelfType
The self type is represented by `Self` and is only usable inside a circuit.
### [Statements](./src/statements/statement.rs)
The statement level nodes in a Leo Program.
#### [Assignment Statements](./src/statements/assign/)
An assignment statement node stores the following (see the sketch after this list):
- The operation.
- **=**
- **+=**
- **-=**
- **\*=**
- **/=**
- **\*\*=**
- **&&=**
- **||=**
- The assignee, which is a variable together with any access expressions applied to it.
- The value, which is an expression.
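A sketch of a few of these assignment operations (purely illustrative):
```js
function main() -> u32 {
    let x = 1u32;
    x += 4;      // assignee `x`, operation `+=`, value `4`
    x *= 2;
    x **= 2;
    return x;    // 100
}
```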
#### [Block Statements](./src/statements/block.rs)
A block statement node stores the following:
- The list of statements inside the block.
#### [Conditional Statements](./src/statements/conditional.rs)
A conditional statement node stores the following:
- The condition which is an expression.
- The block statement.
- The next block of the conditional if it exists.
#### [Console Statements](./src/statements/)
A console statement node stores the following:
- The console function being called, which stores the kind of console function it is and its arguments (see the example below).
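For example, a console statement node would represent calls such as the following (assuming the `{}` format placeholder accepted by Leo's console functions):
```js
function main(x: u32) {
    console.log("x is {}", x);   // `{}` is the console format placeholder
    console.assert(x == x);
}
```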
#### [Definition Statements](./src/statements/definition/mod.rs)
A definition statement node stores the following (see the sketch after this list):
- The declaration type:
- `let` for mutable definitions.
- `const` for constant definitions.
- The names of the variables defined.
- The optional type.
- The values to be assigned to the variables.
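A sketch covering both declaration types (the multi-variable form and names are illustrative):
```js
function main() {
    let (a, b): (u32, u32) = (1, 2);  // `let`: mutable, two names, explicit type, two values
    const c: u8 = 3;                  // `const`: constant definition
}
```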
#### [Expression Statements](./src/statements/expression.rs)
An expression statement node stores the following:
- The expression.
#### [Iteration Statements](./src/statements/iteration.rs)
An iteration statement node stores the following (see the sketch after this list):
- The loop iterator variable name.
- The expression to define the starting loop value.
- The expression to define the stopping loop value.
- A flag indicating whether the stopping value is inclusive or not.
- The block to run for the loop.
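A sketch of a loop showing the pieces above (types are left to inference; the stopping value here is exclusive):
```js
function main() -> u32 {
    let sum = 0u32;
    for i in 0..5 {   // iterator `i`, start `0`, stop `5`
        sum += i;
    }
    return sum;       // 0 + 1 + 2 + 3 + 4 = 10
}
```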
#### [Return Statements](./src/statements/return_statement.rs)
A return statement node stores the following:
- The expression that is being returned.
### [Expressions](./src/expressions/mod.rs)
The expression nodes in a Leo Program.
#### [ArrayAccess Expressions](./src/accesses/array_access.rs)
An array access expression node stores the following:
- The array expression.
- The index represented by an expression.
#### [ArrayInit Expressions](./src/expression/array_init.rs)
An array init expression node stores the following:
- The element expression to fill the array with.
- The dimensions of the array to build.
#### [ArrayInline Expressions](./src/expression/array_inline.rs)
An array inline expression node stores the following:
- The elements of an array, each of which is either a spread or an expression.
#### [ArrayRangeAccess Expressions](./src/accesses/array_range_access.rs)
An array range access expression node stores the following:
- The array expression.
- The optional left side of the range of the array bounds to access.
- The optional right side of the range of the array bounds to access.
#### [Binary Expressions](./src/expression/binary.rs)
A binary expression node stores the following:
- The left side of the expression.
- The right side of the expression.
- The binary operation of the expression:
- **+**
- **-**
- **\***
- **/**
- **\*\***
- **||**
- **&&**
- **==**
- **!=**
- **>=**
- **>**
- **<=**
- **<**
#### [Call Expressions](./src/expression/call.rs)
A call expression node stores the following:
- The function expression being called.
- The arguments, a list of expressions.
#### [CircuitInit Expressions](./src/expression/circuit_init.rs)
A circuit init expression node stores the following:
- The name of the circuit expression being initialized.
- The arguments, a list of expressions.
#### [MemberAccess Expressions](./src/accesses/member_access.rs)
A member access expression node stores the following:
- The expression being accessed.
- The name of the member being accessed.
- The optional inferred type.
#### [StaticAccess Expressions](./src/accesses/static_access.rs)
A static function access expression node stores the following:
- The expression being accessed.
- The name of the member being statically accessed.
- The optional inferred type.
#### [Identifier Expressions](./src/common/identifier.rs)
An identifier expression node stores the following:
- The string name of the identifier.
#### [Ternary Expressions](./src/expression/ternary.rs)
A ternary expression node stores the following (see the example after this list):
- The condition of the ternary stored as an expression.
- The expression returned if the condition is true.
- The expression returned if the condition is false.
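For example, a ternary expression node would represent:
```js
function main(flag: bool) -> u8 {
    return flag ? 1u8 : 0u8;   // condition `flag`, true branch `1u8`, false branch `0u8`
}
```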
#### [TupleAccess Expressions](./src/accesses/tuple_access.rs)
A tuple access expression node stores the following:
- The tuple expression being accessed.
- The index, a non-negative integer.
#### [TupleInit Expressions](./src/expression/tuple_init.rs)
A tuple init expression node stores the following:
- The element expressions to fill the tuple with.
#### [Unary Expressions](./src/expression/unary.rs)
A unary expression node stores the following:
- The inner expression.
- The unary operator:
- **!**
- **-**
#### [Value Expressions](./src/expression/value.rs)
A value expression node stores one of the following:
- Address and its value and span.
- Boolean and its value and span.
- Char and its value and span.
- Field and its value and span.
- Group and its value and span.
- Implicit and its value and span.
- Integer and its value and span.
- String and its value and span.

View File

@ -3,72 +3,3 @@
[![Crates.io](https://img.shields.io/crates/v/leo-ast.svg?color=neon)](https://crates.io/crates/leo-core)
[![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS)
[![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md)
This directory includes the core library for Leo.
## Usage
Leo's core library is statically built into the compiler.
All modules in the `leo-core` rust module will be automatically included in a `core` Leo module that can be imported into a Leo program.
## Implementations
For a full explanation of core library functions, see the [Aleo developer documentation](https://developer.aleo.org/)
### Core Library (Account)
#### Compute Key
```ts
import core.account.ComputeKey;
function foo(public compute_key: ComputeKey);
function foo(compute_key: ComputeKey);
```
#### Private Key
```ts
import core.account.PrivateKey;
function foo(public private_key: PrivateKey);
function foo(private_key: PrivateKey);
```
#### Record
```ts
import core.account.Record;
function foo(public record: Record);
function foo(record: Record);
```
#### Signature
```ts
import core.account.Signature;
function foo(public signature: Signature);
function foo(signature: Signature);
```
#### View Key
```ts
import core.account.ViewKey;
function foo(public view_key: ViewKey);
function foo(view_key: ViewKey);
```
### Core Library (Algorithms)
#### Poseidon
```ts
import core.algorithms.Poseidon;
function foo(public poseidon: Poseidon);
function foo(poseidon: Poseidon);
```

View File

@ -3,123 +3,3 @@
[![Crates.io](https://img.shields.io/crates/v/leo-parser.svg?color=neon)](https://crates.io/crates/leo-parser)
[![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS)
[![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md)
This directory contains the code to tokenize, lex and parse Leo files to the [Leo AST](./../ast/README.md).
## Tokenizer
The tokenizer contains all tokens in Leo.
It also decides which tokens are keywords,
meaning that keywords are a subset of tokens.
The lexer goes through the file character by character, as bytes, and converts the bytes into tokens.
### Tokens
Bolded ones are also keywords.
#### Literals
- CommentLine
- CommentBlock
- StringLit
- Ident
- Int
- **True**
- **False**
- AddressLit
- CharLit
#### Symbols
- At
- Not
- And (`&&`)
- Ampersand (`&`)
- Or
- Eq
- NotEq
- Lt
- LtEq
- Gt
- GtEq
- Add
- Minus
- Mul
- Div
- Exp
- Assign
- AddEq
- MinusEq
- MulEq
- DivEq
- ExpEq
- LeftParen
- RightParen
- LeftSquare
- RightSquare
- LeftCurly
- RightCurly
- Comma
- Dot
- DotDot
- DotDotDot
- Semicolon
- Colon
- DoubleColon
- Question
- Arrow
- Underscore
#### Types
- **U8**
- **U16**
- **U32**
- **U64**
- **U128**
- **I8**
- **I16**
- **I32**
- **I64**
- **I128**
- **Field**
- **Group**
- **Bool**
- **Address**
- **Char**
- **BigSelf**
#### Words
- **Input**
- **LittleSelf**
- **Import**
- **As**
- **Circuit**
- **Console**
- **Const**
- **Else**
- **For**
- **Function**
- **If**
- **In**
- **Let**
- **Return**
- **Static**
- **String**
#### Meta
- Eof
## Parser
The parser converts the tokens to the [Leo AST](./../ast/README.md).
The parser is broken down into different files that correspond to different aspects of the AST:
- [File](./src/parser/file.rs) - Parses the top level nodes in Leo.
- [Types](./src/parser/type_.rs) - Parses the type declarations in Leo.
- [Statements](./src/parser/statement.rs) - Parses the different kinds of statements.
- [Expressions](./src/parser/expression.rs) - Parses the different kinds of expressions.
For more information on those please read the Leo AST README, linked above.
## Grammar Relation
All function and token names are as close as possible to the [Leo Grammar](./../grammar/README.md)

View File

@ -3,11 +3,3 @@
[![Crates.io](https://img.shields.io/crates/v/leo-ast.svg?color=neon)](https://crates.io/crates/leo-passes)
[![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS)
[![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md)
## Usage
The code here is split into several passes. Each pass represents a different modification or transformation applied to a given AST.
## Structure
Each different type of pass is located in its own directory within the src directory.

Binary file not shown.

View File

@ -1,51 +0,0 @@
# Empty array dimensions
## Example
This error occurs when specifying an empty tuple as the dimensions of an array.
Erroneous code example:
```js
function main() {
let foo = [42; ()];
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370023]: Array dimensions specified as a tuple cannot be empty.
--> test.leo:2:20
|
2 | let foo = [42; ()];
| ^^
```
## Solution
If you wanted a single dimensional array, you can achieve that by specifying the length like so:
```js
function main() {
let foo = [42; 4];
}
```
This will give you the array `[42, 42, 42, 42]`.
If instead you wanted a multi-dimensional array, e.g., a 2 x 3 matrix, you can achieve that with:
```js
function main() {
let foo = [42; (2, 3)];
}
```
Alternatively, you can use the simple syntax all the way instead:
```js
function main() {
let foo = [[42; 2]; 3];
}
```

View File

@ -1,35 +0,0 @@
# `@context function` is deprecated
## Example
This error occurs when a function is prefixed with `@context`.
Erroneous code example:
```js
@context()
function foo() {
// logic...
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370017]: "@context(...)" is deprecated. Did you mean @test annotation?
--> test.leo:1:2
|
1 | @context()
| ^^^^^^^
```
## Solution
The `@context function` syntax is deprecated, but you can use `@test function` instead:
```js
@test
function foo() {
// logic...
}
```

View File

@ -1,44 +0,0 @@
# Invalid address literal
## Example
This error occurs when a syntactically invalid address is specified.
Erroneous code example:
```js
function main() {
let addr = aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370001]: invalid address literal: 'aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9'
--> test.leo:2:16
|
2 | let addr = aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
A valid address literal must start with `aleo1`,
followed by 58 characters, each of which is either a lowercase letter
or an ASCII digit (`0` to `9`).
In the example above, the problem is `Q`, an uppercase letter,
which appears immediately after `aleo1`.
## Solution
To fix the issue, we can write...:
```js
function main() {
let addr = aleo1qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9;
}
```
...and the compiler will accept it.
Note however that the compiler does not check whether the address is valid on-chain, but merely that the written program follows the rules of the language grammar.

View File

@ -1,32 +0,0 @@
# Invalid assignment target
## Example
This error currently occurs when a `static const` member or a member function
is used as the target of an assignment statement.
Erroneous code example:
```js
circuit Foo {
static const static_const: u8 = 0;
}
function main() {
Foo::static_const = 0;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370011]: invalid assignment target
--> test.leo:6:5
|
6 | Foo::static_const = 0;
| ^^^^^^^^^^^^^^^^^
```
It's not possible to assign to `static const` members or member functions,
so this is not allowed syntax.
The solution is likely to rethink your approach to the problem you are solving.

View File

@ -1,53 +0,0 @@
# An empty `import` list
## Example
This error occurs when no sub-packages
or items were specified in an import list.
Erroneous code example:
```js
import gardening.();
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370002]: Cannot import empty list
--> test.leo:1:18
|
1 | import gardening.();
| ^^
```
...as the compiler does not know what to import in `gardening`.
## Solutions
There are different solutions to this problem.
Here are two of them to consider.
### Comment out the `import`
If you don't know yet what to import from `gardening`,
comment out the `import` like so:
```js
// import gardening.();
```
Later, you can come back and specify what to import like below.
You can also remove the `import` line entirely,
which will have the same effect.
### Specify items to `import`
If you know that you'd like to import, for example,
the functions `water_flowers` and `prune`,
you can specify them in the import list like so:
```js
import gardening.(water_flowers, prune);
```

View File

@ -1,33 +0,0 @@
# `let mut` is deprecated
## Example
This error occurs when a variable declaration is marked with `mut`.
Erroneous code example:
```js
function main() {
let mut x = 0;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370015]: let mut = ... is deprecated. `let` keyword implies mutabality by default.
--> test.leo:2:5
|
2 | let mut x = 0;
| ^^^^^^^
```
## Solution
As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier:
```js
function main() {
let x = 0;
}
```

View File

@ -1,37 +0,0 @@
# `static const` after circuit functions
## Example
This error occurs when `static const` circuit members occur after circuit member functions.
Erroneous code example:
```js
circuit Foo {
function bar() {}
static const baz: bool = true;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370021]: Member functions must come after member consts.
--> test.leo:4:18
|
4 | static const baz: bool = true;
| ^^^^^^^^^^^^^^^^
```
## Solution
The issue can be solved by moving all `static const` members before circuit member functions...:
```js
circuit Foo {
static const baz: bool = true;
function bar() {}
}
```

View File

@ -1,37 +0,0 @@
# `static const` after normal variables
## Example
This error occurs when `static const` circuit members occur after normal member variables.
Erroneous code example:
```js
circuit Foo {
bar: u8,
static const baz: bool = true;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370020]: Member variables must come after member consts.
--> test.leo:4:18
|
4 | static const baz: bool = true;
| ^^^^^^^^^^^^^^^^
```
## Solution
The issue can be solved by moving all `static const` members before normal member variables...:
```js
circuit Foo {
static const baz: bool = true;
bar: u8,
}
```

View File

@ -1,37 +0,0 @@
# Member variables after circuit functions
## Example
This error occurs when circuit member variables occur after circuit member functions.
Erroneous code example:
```js
circuit Foo {
function bar() {}
baz: bool;
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370022]: Member functions must come after member variables.
--> test.leo:4:5
|
4 | baz: bool;
| ^^^
```
## Solution
The issue can be solved by moving all member variables before any circuit member functions...:
```js
circuit Foo {
baz: bool;
function bar() {}
}
```

View File

@ -1,48 +0,0 @@
# Mixed commas and semicolons in circuit definitions
## Example
This error occurs when mixing semicolons, `;`,
and commas, `,` together in the list of member variables in a circuit definition.
Erroneous code example:
```js
circuit A {
foo: u8,
bar: u16;
}
```
The compiler will reject this code with:
```js
Error [EPAR0370006]: Cannot mix use of commas and semi-colons for circuit member variable declarations.
--> test.leo:3:13
|
3 | bar: u16;
| ^
```
## Solutions
The solution is simply to consistently use `;` or `,` after each member variable,
and avoid mixing `;` and `,` together. So we could write either...:
```js
circuit A {
foo: u8,
bar: u16,
}
```
...or write...:
```js
circuit A {
foo: u8;
bar: u16;
}
```
...and the compiler would accept it.

View File

@ -1,37 +0,0 @@
# Deprecated `mut` parameter
## Example
This error occurs when a function parameter is marked as `mut`.
Erroneous code example:
```js
circuit Foo {
function bar(mut x: u8) {
x = 0;
}
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370014]: function func(mut a: u32) { ... } is deprecated. Passed variables are mutable by default.
--> test.leo:2:18
|
2 | function bar(mut x: u8) {
| ^^^^^
```
## Solution
As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier:
```js
circuit Foo {
function bar(x: u8) {
x = 0;
}
}
```

View File

@ -1,41 +0,0 @@
# Deprecated `mut self` parameter
## Example
This error occurs when the `self` parameter of a circuit member function is marked as `mut`.
Erroneous code example:
```js
circuit Foo {
bar: u8,
function bar(mut self) {
self.bar = 0;
}
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370019]: `mut self` is no longer accepted. Use `&self` if you would like to pass in a mutable reference to `self`
--> test.leo:4:18
|
4 | function bar(mut self) {
| ^^^^^^^^
```
## Solution
As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier from `self`:
```js
circuit Foo {
bar: u8,
function bar(self) {
self.bar = 0;
}
}
```

View File

@ -1,43 +0,0 @@
# Illegal spread expression in array initializer
## Example
This error occurs when a spread expression, e.g., `...foo` occurs in an array initializer.
Erroneous code example:
```js
function main() {
let foo = [0, 1];
let array = [...foo; 3];
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370010]: illegal spread in array initializer
--> test.leo:3:17
|
3 | let array = [...foo; 3];
| ^^^^^^^
```
## Solution
The Leo language does not allow `...foo` as the element to repeat
in an array repeat expression like the one above.
This is because `foo` is not an element but rather a full array.
One could imagine that the expression above means `[...foo, ...foo, ...foo]`.
That is, `...foo` repeated as many times as was specified in the array size.
However, that is ambiguous with `[element; 3]` resulting in an array with size `3`.
To solve the issue, disambiguate your intention.
Most likely, you really wanted `[...foo, ...foo, ...foo]`, so the solution is to write that out...:
```js
function main() {
let foo = [0, 1];
let array = [...foo, ...foo, ...foo];
}
```

View File

@ -1,34 +0,0 @@
# `test function` is deprecated
## Example
This error occurs when a function is prefixed with `test`.
Erroneous code example:
```js
test function foo() {
// logic...
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370016]: "test function..." is deprecated. Did you mean @test annotation?
--> test.leo:1:1
|
1 | test function foo() {
| ^^^^
```
## Solution
The `test function` syntax is deprecated, but you can achieve the same result with `@test function`:
```js
@test
function foo() {
// logic...
}
```

View File

@ -1,34 +0,0 @@
# Unable to parse array dimensions
## Example
This error occurs when there is a syntax error in the array dimensions of an array initializer.
Erroneous code example:
```js
function main() {
let x = [1; +];
}
```
The compiler will reject this code with, for example...:
```js
Error [EPAR0370018]: unable to parse array dimensions
--> test.leo:2:13
|
2 | let x = [1; +];
| ^
```
## Solution
In the case above, the error occurs due to the `+`.
The issue can be resolved by specifying the number of elements desired, e.g., `5`...:
```js
function main() {
let x = [1; 5];
}
```

View File

@ -1,37 +0,0 @@
# Expected "x" -- got "y"
## Example
This error occurs when a specific token, e.g., `class` was encountered but a different one,
e.g., `circuit` was expected instead.
Erroneous code example:
```js
class A {}
```
The compiler will reject this code with:
```js
Error: --> main.leo:1:1
|
1 | class A {}
| ^^^^^
|
= expected 'import', 'circuit', 'function', 'test', '@' -- got 'class'
```
## Solutions
The error message above says that `class` cannot be used at that location,
and also lists a few tokens that are valid. Note that this is context specific,
and depends on what tokens preceded the current token.
Using the list of tokens that are valid, and knowing that `circuit A {}` is valid syntax,
we replace `class` with `circuit`...:
```js
circuit A {}
```
...and the error is now resolved.

View File

@ -1,49 +0,0 @@
# An unexpected end of file
## Example
This error occurs when the Leo compiler tries to parse your program
and unexpectedly reaches the end of a `.leo` file.
Erroneous code example:
```js
function main() {
```
The compiler will reject this code with:
```js
Error [EPAR0370003]: unexpected EOF
--> test.leo:1:17
|
1 | function main() {
| ^
```
## Solutions
The problem typically occurs when there are unbalanced delimiters,
which we have an instance of above.
More specifically, in the example,
the issue is that there is no `}` to close the opening brace `{`.
An even simpler variant of this is:
```js
function main(
```
The solution here is to close the opening delimiter, in this case `(`.
## The general issue
To illustrate the heart of the problem, consider this invalid file:
```js
// ↳ main.leo
function
```
When parsing the file, the compiler expects something, in this case,
the function's name, but instead, the parser reaches the end of the file.

View File

@ -1,42 +0,0 @@
# Unexpected identifier: expected "x" -- got "y"
## Example
This error occurs when a specific *identifier*, e.g., `error` was expected but a different one,
e.g., `fail` was encountered instead.
Erroneous code example:
```js
function main() {
console.fail("Houston we have a problem!");
}
```
The compiler will reject this code with:
```js
Error [EPAR0370007]: unexpected identifier: expected 'assert', 'error', 'log' -- got 'fail'
--> test.leo:2:11
|
2 | console.fail("Houston we have a problem!");
| ^^^^
```
## Solutions
The error message above says that `fail` cannot be used at that location,
and also lists a few identifiers that are valid. Note that this is context specific,
and depends on what preceded the identifier at that location.
The error message lists identifiers that are valid, e.g., `error`.
Here, since we used `.fail(...)`, we most likely wanted to trigger a compile error,
which `.error(...)` will achieve, so we use that instead...:
```js
function main() {
console.error("Houston we have a problem!");
}
```
Note that this error currently only occurs when using `console`.

View File

@ -1,42 +0,0 @@
# Unexpected statement: expected "x" -- got "y"
## Example
This error occurs when a statement, which isn't `if`, follows `else` directly.
Erroneous code example:
```js
function main () {
if true {
console.log("It was true.");
} else
console.log("It was false.");
}
```
The compiler will reject this code with:
```js
Error [EPAR0370008]: unexpected statement: expected 'Block or Conditional', got 'console.log("It was false.", );'
--> test.leo:5:9
|
5 | console.log("It was false.");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
## Solutions
To fix the problem, wrap the statement in a block. By turning the snippet above into...:
```js
function main () {
if true {
console.log("It was true.");
} else {
console.log("It was false.");
}
}
```
...the error is fixed.

View File

@ -1,36 +0,0 @@
# Expected string "x" -- got "y"
## Example
This error occurs when a specific "string" (in reality a token),
was expected but a different one was expected instead.
Erroneous code example:
```js
function main () {
let x: [u8; (!)] = [0];
}
```
The compiler will reject this code with:
```js
Error [EPAR0370009]: unexpected string: expected 'int', got '!'
--> test.leo:2:18
|
2 | let x: [u8; (!)] = [0];
| ^
```
## Solutions
The error message "unexpected string" depends on the context.
In the example above, we need to replace `!` with `1`...:
```js
function main () {
let x: [u8; 1] = [0];
}
```

View File

@ -1,40 +0,0 @@
# An unexpected token
## Example
This error occurs when the Leo compiler tries to parse your program.
More specifically, during a phase called 'lexing'.
In this phase, the compiler first takes your code,
consisting of characters, and interprets it as a list of tokens.
These tokens are a sort of *alphabet* internal to Leo.
Consider the English language. It only has 26 letters in its alphabet.
So there are some letters, e.g., `Γ` from the greek alphabet,
which would not fit if we tried to "tokenize" English.
Leo, while being a programming language, is similar here.
There are characters or sequences of characters,
that Leo does not understand and cannot lex into tokens.
Since this error occurred, that is what has happened.
Erroneous code example:
```js
~
```
The compiler will reject this code with:
```js
Error [EPAR0370000]: ~
--> test.leo:1:1
|
1 | ~
| ^
```
## Solutions
What the solution to an unexpected token is depends on what you wanted to achieve.
Most likely, you made a typo somewhere.
For a more complete overview of valid Leo tokens, consult the [Leo grammar](https://github.com/AleoHQ/leo/blob/testnet3/grammar/README.md).

View File

@ -1,34 +0,0 @@
# Unexpected whitespace
## Example
This error occurs when there is unexpected white space somewhere in your program.
Typically, the white space occurs in a literal with a typed suffix.
Erroneous code example:
```js
function main() {
let x = 1 u8;
}
```
The compiler will reject this code with:
```js
Error [EPAR0370004]: Unexpected white space between terms 1 and u8
--> test.leo:2:13
|
2 | let x = 1 u8;
| ^
```
## Solutions
The problem is solved by removing the white space between the literal and its suffix. So given the example above, we can fix it by writing:
```js
function main() {
let x = 1u8;
}
```

View File

@ -1,430 +0,0 @@
# Leo RFC 001: Initial String Support
## Authors
The Aleo Team.
## Status
IMPLEMENTED
## Summary
The purpose of this proposal is to provide initial support for strings in Leo.
Since strings are sequences of characters,
the proposal inextricably also involves characters.
This proposal is described as initial,
because it provides some basic features that we may extend in the future;
the initial features should be sufficiently simple and conservative
that they should not limit the design of the future features.
This proposal adds a new scalar type for characters,
along with a new kind of literals to denote characters.
A string is then simply an array of characters,
but this proposal also adds a new kind of literals to denote strings
more directly than via character array construction expressions.
Along with equality and inequality, which always apply to every Leo type,
this proposal also introduces some operations on characters and strings
that can be implemented over time.
By not prescribing a new type for strings,
this initial proposal leaves the door open
to a future more flexible type of resizable strings.
## Motivation
Strings (and characters) are common in programming languages.
Use cases for Leo include
simple ones like URLs and token ticker symbols,
and more complex ones like Bech32 encoding,
edit distance in strings representing proteins,
and zero-knowledge proofs of occurrences or absences of patterns in textual logs.
## Design
Since strings are sequences of characters,
a design for strings inextricably also involves a design for characters.
Thus, we first present a design for both characters and strings.
### Characters
We add a new scalar type, `char` for characters.
In accord with Leo's strong typing,
this new type is separate from all the other scalar types.
Explicit constructs will be needed to convert between `char` and other types.
The set of values of type `char` is isomorphic to
the set of Unicode code points from 0 to 10FFFF (both inclusive).
That is, we support Unicode characters, more precisely code points
(this may include some invalid code points,
but it is simpler to allow every code point in that range).
A character is an atomic entity:
there is no notion of Unicode encoding (e.g. UTF-8) that applies here.
We add a new kind of literals for characters,
consisting of single characters or escapes,
surrounded by single quotes.
Any single Unicode character except a single quote is allowed,
e.g. `'a'`, `'*'`, and `'"'`.
Single quotes must be escaped with a backslash, i.e. `'\''`;
backslashes must be escaped as well, i.e. `'\\'`.
We allow other backslash escapes
for commonly used characters that are not otherwise easily denoted.
This is the complete list of single-character backslash escapes:
* `\'` for code point 39 (single quote)
* `\"` for code point 34 (double quote)
* `\\` for code point 92 (backslash)
* `\n` for code point 10 (line feed)
* `\r` for code point 13 (carriage return)
* `\t` for code point 9 (horizontal tab)
* `\0` for code point 0 (the null character)
We also allow ASCII escapes of the form `\xOH`,
where `O` is an octal digit and `H` is a hexadecimal digit.
Both uppercase and lowercase hex digits are allowed.
The `x` must be lowercase.
These represent ASCII code points, i.e. from 0 to 127 (both inclusive).
We also allow Unicode escapes of the form `'\u{X}'`,
where `X` is a sequence of one to six hex digits.
Both uppercase and lowercase letters are allowed.
The `u` must be lowercase.
The value must be between 0 and 10FFFF, inclusive.
Note that this syntax for character literals is very close to the Rust syntax documented here (as of 2021-05-26):
https://doc.rust-lang.org/reference/tokens.html#character-literals
The only difference is that this syntax does not support Unicode escapes with underbars in them.
The following is true in Rust but not in this proposal for Leo:
`'\u{1_____0__F____F______FF__________________________}' == '\u{10FFFF}'`.
Note that the literal character is assembled by the compiler---for
creating literals, there is no need for the circuit to know
which code points are disallowed.
The equality operators `==` and `!=` are automatically available for `char`.
The following code sample illustrates three ways of defining characters:
character literal, single-character escapes, and Unicode escapes.
```js
function main() -> [char; 5] {
// using char literals to form an array
const world: [char; 5] = ['w', 'o', 'r', 'l', 'd'];
// escaped characters
const escaped: [char; 4] = ['\n', '\t', '\\', '\''];
// unicode escapes - using emoji character 😊
const smiling_face: char = '\u{1F60A}';
return [smiling_face, ...escaped];
}
```
### Strings
In this initial design proposal, we do not introduce any new type for strings.
Instead, we rely on the fact that Leo already has arrays
and that arrays of characters can be regarded as strings.
Existing array operations, such as element and range access,
apply to these strings without the need of language extensions.
To ease the common use case of writing a string value in the code,
we add a new kind of literal for strings (i.e. character arrays),
consisting of a sequence of **one or more** single characters or escapes
surrounded by double quotes;
this is just syntactic sugar for the literal array construction.
Any Unicode character except double quote or backslash is allowed without escape.
Examples: `"Aleo"`, `"it's"`, and `"x + y"`.
Double quotes must be escaped with a backslash, e.g. `"say \"hi\""`;
backslashes must be escaped as well, e.g. `"c:\\dir"`.
We also allow the same backslash escapes allowed for character literals
(see the section on characters above).
We also allow the same Unicode escapes allowed in character literals
(described in the section on characters above).
Note that this syntax for string literals is very close to the Rust syntax documented here (as of 2021-05-26):
https://doc.rust-lang.org/reference/tokens.html#string-literals.
The main difference is that this syntax does not support the Rust `STRING_CONTINUE` syntax.
In this syntax a backslash may not be followed by a newline, and newlines have no special handling.
Another difference is that this syntax does **not** permit the empty string `""`.
Also, this syntax does not allow underbars in Unicode escapes in string literals.
The type of a string literal is `[char; N]`,
where `N` is the length of the string measured in characters,
i.e. the size of the array.
Note that in this language design there is no notion of Unicode encoding (e.g. UTF-8)
that applies to string literals.
The rationale for not introducing a new type for strings initially,
and instead, piggybacking on the existing array types and operations,
is twofold.
First, it is an economical design
that lets us reuse the existing array machinery,
both at the language level (e.g. readily use array operations)
and at the R1CS compilation level
(see the section on compilation to R1CS below).
Second, it leaves the door open to providing,
in a future design iteration,
a richer type for strings,
as discussed in the section about future extensions below.
Recall that empty arrays are disallowed in Leo.
(The reason is that arrays,
which must have a size known at compile time and are not resizable,
are flattened into their elements when compiling to R1CS;
thus, an empty array would be flattened into nothing.)
Therefore, in this initial design empty strings must be disallowed as well.
A future type of resizable strings will support empty strings.
Because array, and therefore string, sizes must be known at compile time,
there is no point to having an operation to return the length of a string.
This operation will be supported for a future type of resizable strings.
Below are some examples of array operations
that are also common for strings in other programming languages:
* `[...s1, ...s2]` concatenates the strings `s1` and `s2`.
* `[c, ...s]` adds the character `c` in front of the string `s`.
* `s[i]` extracts the `i`-th character from the string `s`.
* `s[1..]` removes the first character from the string `s`.
The following code shows a string literal and its actual transformation into an
array of characters as well as possible array-like operations on strings:
concatenation and comparison.
```js
function main() -> bool {
// double quotes create char array from string
let hello: [char; 5] = "hello";
let world: [char; 5] = ['w','o','r','l','d'];
// string concatenation can be performed using array syntax
let hello_world: [char; 11] = [...hello, ' ', ...world];
// string comparison is also implemented via array type
return hello_world == "hello world";
}
```
### Format Strings
Leo currently supports format strings as their own entity,
usable exclusively as first arguments of console print calls.
This proposal eliminates this very specific notion,
which is subsumed by the string literals described above.
In other words, a console print call
will take a string literal as the first argument,
which will be interpreted as a format string
according to the semantics of console print calls.
The internal UTF-32 string will be translated to UTF-8 for output.
### Compilation to R1CS
So far, the discussion has been independent from R1CS
(except for a brief reference when discussing the rationale behind the design).
This is intentional because the syntax and semantics of Leo
should be understandable independently from the compilation of Leo to R1CS.
However, compilation to R1CS is a critical consideration
that affects the design of Leo.
This section discusses R1CS compilation considerations
for this proposal for characters and strings.
Values of type `char` can be represented directly as field elements,
since the prime of the field is (much) larger than 10FFFF.
This is more efficient than using a bit representation of characters.
By construction, field elements that represent `char` values
are never above 10FFFF.
Note that `field` and `char` remain separate types in Leo:
it is only in the compilation to R1CS
that everything is reduced to field elements.
Since strings are just arrays of characters,
there is nothing special about compiling strings to R1CS,
compared to other types of arrays.
In particular, the machinery to infer array sizes at compile time,
necessary for the flattening to R1CS,
applies to strings without exception.
String literals are just syntactic sugar for
suitable array inline construction expressions.
There are at least two approaches to implementing
ordering operations `<` and `<=` on `char` values.
Recalling that characters are represented as field values
that are (well) below `(p-1)/2` where `p` is the prime,
we can compare two field values `x` and `y`,
both below `(p-1)/2`, via the constraints
```
(2) (x - y) = (b0 + 2*b1 + 4*b2 + ...)
(b0) (1 - b0) = 0
(b1) (1 - b1) = 0
(b2) (1 - b2) = 0
...
```
that take the difference, double it, and convert to bits.
If `x >= y`, the difference is below `(p-1)/2`,
and doubling results in an even number below `p`,
with therefore `b0 = 0`.
If `x < y`, the difference is above `(p-1)/2` (when reduced modulo `p`),
and doubling results in an odd number when reduced modulo `p`,
with therefore `b0 = 1`.
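As a toy illustration (using a small prime rather than an actual circuit field), take `p = 11`, so `(p-1)/2 = 5`:
```
x = 4, y = 2:  x - y = 2             2*2 = 4 mod 11        -> even, b0 = 0  (x >= y)
x = 2, y = 4:  x - y = -2 = 9 mod 11   2*9 = 18 = 7 mod 11   -> odd,  b0 = 1  (x < y)
```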
Note that we need one variable and one constraint for every bit of `p`.
The other approach is to convert the `x` and `y` to bits
and compare them as integers;
in this case we only need 21 bits for each.
We need more analysis to determine which approach is more efficient.
The details of implementing other character and string operations in R1CS
will be fleshed out as each operation is added.
## Drawbacks
This proposal does not appear to bring any real drawbacks,
other than making the language inevitably slightly more complex.
But the need to support characters and strings justifies the extra complexity.
## Effect on Ecosystem
With the ability of Leo programs to process strings,
it may be useful to have external tools that convert Leo strings
to/from common formats, e.g. UTF-8.
See the discussion of input files in the design section.
## Alternatives
### 32-bit Unsigned Integers for Characters
We could avoid the new `char` type altogether,
and instead, rely on the existing `u32` to represent Unicode code points,
and provide character-oriented operations on `u32` values.
(Note that both `u8` and `u16` are too small for 10FFFF,
and that signed integer types include negative integers
which are not Unicode code points:
this makes `u32` the obvious choice.)
However, many values of type `u32` are above 10FFFF,
and many operations on `u32` do not really make sense on code points.
We would probably want a notation for character literals anyhow,
which could be (arguably mis)used for non-character unsigned integers.
All in all, introducing a new type for characters
is consistent with Leo's strong typing approach.
Furthermore, for compilation to R1CS, `u32`,
even if restricted to the number of bits needed for Unicode code points,
is less efficient than the field representation described earlier
because `u32` requires a field element for each bit.
### Separate String Type, Isomorphic to Character Arrays
Instead of representing strings as character arrays,
we could introduce a new type `string`
whose values are finite sequences of zero or more characters.
These strings would be isomorphic to, but distinct from, character arrays.
However, for compilation to R1CS, it would be necessary to
perform the same kind of known-size analysis on strings
that is already performed on arrays,
possibly necessitating the inclusion of the size as part of the type, i.e. `string(N)`,
which is obviously isomorphic to `[char; N]`.
Thus, using character arrays avoids duplication.
Furthermore, as noted in the section on future extensions,
this leaves the door open to
introducing a future type `string` for resizable strings.
### Field Elements for Characters
Yet another option could be to use directly `field` to represent characters
and `[field; N]` to represent strings of `N` characters.
However, many values of type `field` are not valid Unicode code points,
and many field operations do not make sense for characters.
Thus, having a separate type `char` for characters seems better,
and more in accordance with Leo's strong typing.
## Future Extensions
### Operations on Characters
Given that characters are essentially code points,
we may also support the ordering operators `<`, `<=`, `>`, and `>=`;
these may be useful to check whether a character is in certain range.
Below is a list of possible operations we could support on characters.
It should be fairly easy to add more.
- [ ] `is_alphabetic` - Returns `true` if the `char` has the `Alphabetic` property.
- [ ] `is_ascii` - Returns `true` if the `char` is in the `ASCII` range.
- [ ] `is_ascii_alphabetic` - Returns `true` if the `char` is in the `ASCII Alphabetic` range.
- [ ] `is_lowercase` - Returns `true` if the `char` has the `Lowercase` property.
- [ ] `is_numeric` - Returns `true` if the `char` has one of the general categories for numbers.
- [ ] `is_uppercase` - Returns `true` if the `char` has the `Uppercase` property.
- [ ] `is_whitespace` - Returns `true` if the `char` has the `White_Space` property.
- [ ] `to_digit` - Converts the `char` to the given `radix` format.
- [ ] `from_digit` - Inverse of to_digit.
- [ ] `to_uppercase` - Converts lowercase to uppercase, leaving others unchanged.
- [ ] `to_lowercase` - Converts uppercase to lowercase, leaving others unchanged.
It seems natural to convert between `char` values
and `u8` or `u16` or `u32` values, under suitable range conditions;
perhaps also between `char` values and
(non-negative) `i8` or `i16` or `i32` values.
This may be accomplished as part of the type casting extension of Leo.
### Operations on Strings
Below is a list of possible operations we could support on strings.
It should be fairly easy to add more.
- [ ] `u8` to `[char; 2]` hexstring, .., `u128` to `[char; 32]` hexstring.
- [ ] Field element to `[char; 64]` hexstring. (Application can test leading zeros and slice them out if it needs to return, say, a 40-hex-digit string.)
- [ ] Apply `to_uppercase` (see above) to every character.
- [ ] Apply `to_lowercase` (see above) to every character.
Note that the latter two could be also realized via simple loops through the string.
Given the natural conversions between `char` values and integer values discussed earlier,
it may be natural to also support element-wise conversions between strings and arrays of integers.
This may be accomplished as part of the type casting extensions of Leo.
### Circuit Types for Character and String Operations
The operations on characters and lists described earlier, e.g. `is_ascii`,
are provided as static member functions of two new built-in or library circuit types `Char` and `String`.
Thus, an example call is `Char::is_ascii(c)`.
This seems a general good way to organize built-in or library operations,
and supports the use of the same name with different circuit types,
e.g. `Char::to_uppercase` and `String::to_uppercase`.
These circuit types could also include constants, e.g. for certain ASCII characters.
However, currently Leo does not support constants in circuit types,
so that would have to be added separately first.
These two circuit types are just meant to collect static member functions for characters and strings.
They are not meant to be the types of characters and strings:
as mentioned previously, `char` is a new scalar (not circuit) type (like `bool`, `address`, `u8`, etc.)
and there is no string type as such for now, but we use character arrays for strings.
In the future we may want all the Leo types to be circuit types of some sort,
but that is a separate feature that would have to be designed and developed independently.
### Input and Output of Literal Characters and Strings
Since UTF-8 is a standard encoding, it would make sense for
the literal characters and strings in the `.in` file
to be automatically converted to UTF-32 by the Leo compiler.
However, the size of a string can be confusing since multiple
Unicode code points can be composed into a single glyph which
then appears to be a single character. If a parameter of type `[char; 10]`
[if that is the syntax we decide on] is passed a literal string
of a different size, the error message should explain that the
size must be the number of codepoints needed to encode the string.
### String Type
As alluded to in the design section above,
for now, we are avoiding the introduction of a string type,
isomorphic to but separate from character arrays,
because we may want to introduce later a more flexible type of strings,
in particular, one that supports resizing.
This may be realized via a built-in or library circuit type
that includes a character array and a fill index.
This may be a special case of a built-in or library circuit type
for resizable vectors,
possibly realized via an array and a fill index.
This hypothetical type of resizable vectors
may have to be parameterized over the element type,
requiring an extension of the Leo type system
that is much more general than strings.

View File

@ -1,518 +0,0 @@
# Leo RFC 002: Bounded Recursion
## Authors
The Aleo Team.
## Status
FINAL
## Summary
This proposal provides support for recursion in Leo,
via a user-configurable limit to the allowed depth of the recursion.
If the recursion can be completely inlined without exceeding the limit,
compilation succeeds;
otherwise, an informative message is shown to the user,
who can try and increase the limit.
Compilation may also fail
if a circularity is detected before exceeding the limit.
Future analyses may also recognize cases in which the recursion terminates,
informing the user and setting or suggesting a sufficient limit.
A similar approach could be also used for loops in the future.
User-configurable limits may be also appropriate for
other compiler transformations that are known to terminate
but could result in a very large number of R1CS constraints.
## Motivation
Leo currently allows functions to call other functions,
but recursion is disallowed:
a function cannot call itself, directly or indirectly.
However, recursion is a natural programming idiom in some cases,
compared to iteration (i.e. loops).
## Background
### Function Inlining
Since R1CS are flat collections of constraints,
compiling Leo to R1CS involves _flattening_ the Leo code:
unrolling loops, inlining functions, decomposing arrays, etc.
Of interest to this RFC is the inlining of functions,
in which a function call is replaced with the function body,
after binding the formal parameters to the actual arguments,
and taking care to rename variables if needed to avoid conflicts.
Since the `main` function is the entry point into a Leo program,
conceptually, for the purpose of this RFC,
we can think of function inlining as transitively inlining
all the functions into `main`
(this is just a conceptual model;
it does not mean that it should be necessarily implemented this way).
This is a simple example,
where '`===> {<description>}`' indicates a transformation
described in the curly braces:
```js
function f(x: u32) -> u32 {
return 2 * x;
}
function main(a: u32) -> u32 {
return f(a + 1);
}
===> {inline call f(a + 1)}
function main(a: u32) -> u32 {
let x = a + 1; // bind actual argument to formal argument
return 2 * x; // replace call with body
}
```
### Constants and Variables
A Leo program has two kinds of inputs: constants and variables;
both come from input files.
They are passed as arguments to the `main` function:
the parameters marked with `const` receive the constant inputs,
while the other parameters receive the variable inputs.
Leo has constants and variables,
of which the just mentioned `main` parameters are examples;
constants may only depend on literals and other constants,
and therefore only on the constant inputs of the program;
variables have no such restrictions.
The distinction between constants and variables
is significant to the compilation of Leo to R1CS.
Even though specific values of both constant and variable inputs
are known when the Leo program is compiled and the zk-proof is generated,
the generated R1CS does not depend
on the specific values of the variable inputs;
it only depends on the specific values of the constant inputs.
Stated another way, Leo variables are represented by R1CS variables,
while Leo constants are folded into the R1CS.
For instance, in
```js
function main(base: u32, const exponent: u32) -> u32 {
return base ** exponent; // raise base to exponent
}
```
the base is a variable while the exponent is a constant.
Both base and exponent are known, supplied in the input file,
e.g. the base is 2 and the exponent is 5.
However, only the information about the exponent being 5
is folded into the R1CS, which retains the base as a variable.
Conceptually, the R1CS corresponds to the _partially evaluated_ Leo program
```js
function main(base: u32) -> u32 {
return base ** 5;
}
```
where the constant `exponent` has been replaced with its value 5.
This partial evaluation is carried out
as part of the Leo program flattening transformations mentioned earlier.
This also involves constant propagation and folding,
e.g. a constant expression `exponent + 1` is replaced with 6
when the constant `exponent` is known to be 5.
(The example program above does not need any constant propagation and folding.)
Circling back to the topic of Leo function inlining,
it is the case that, due to the aforementioned partial evaluation,
the `const` arguments of function calls have known values
when the flattening transformations are carried out.
### Inlining Recursive Functions
In the presence of recursion,
attempting to exhaustively inline function calls does not terminate in general.
However, in conjunction with the partial evaluation discussed above,
inlining of recursive functions may terminate, under appropriate conditions.
This is an example:
```js
function double(const count: u32, sum: u32) -> u32 {
if count > 1 {
return double(count - 1, sum + sum);
} else {
return sum + sum;
}
}
function main(x: u32) -> u32 {
return double(3, x);
}
===> {inline call double(3, x)}
function main(x: u32) -> u32 {
let sum1 = x; // bind and rename parameter of function sum
if 3 > 1 {
return double(2, sum1 + sum1);
} else {
return sum1 + sum1;
}
}
===> {evaluate 3 > 1 to true and simplify if statement}
function main(x: u32) -> u32 {
let sum1 = x;
return double(2, sum1 + sum1);
}
===> {inline call double(2, sum1 + sum1)}
function main(x: u32) -> u32 {
let sum1 = x;
let sum2 = sum1 + sum1; // bind and rename parameter of function sum
if 2 > 1 {
return double(1, sum2 + sum2);
} else {
return sum2 + sum2;
}
}
===> {evaluate 2 > 1 to true and simplify if statement}
function main(x: u32) -> u32 {
let sum1 = x;
let sum2 = sum1 + sum1;
return double(1, sum2 + sum2)
}
===> {inline call double(1, sum2 + sum2)}
function main(x: u32) -> u32 {
let sum1 = x;
let sum2 = sum1 + sum1;
let sum3 = sum2 + sum2; // bind and rename parameter of function sum
if 1 > 1 {
return double(0, sum3 + sum3);
} else {
return sum3 + sum3;
}
}
===> {evaluate 1 > 1 to false and simplify if statement}
function main(x: u32) -> u32 {
let sum1 = x;
let sum2 = sum1 + sum1;
let sum3 = sum2 + sum2;
return sum3 + sum3;
}
```
This is a slightly more complex example
```js
function double(const count: u32, sum: u32) -> u32 {
if count > 1 && sum < 30 {
return double(count - 1, sum + sum);
} else {
return sum + sum;
}
}
function main(x: u32) -> u32 {
return double(3, x);
}
===> {inline call double(3, x)}
function main(x: u32) -> u32 {
let sum1 = x; // bind and rename parameter of function sum
if 3 > 1 && sum1 < 30 {
return double(2, sum1 + sum1);
} else {
return sum1 + sum1;
}
}
===> {evaluate 3 > 1 to true and simplify if test}
function main(x: u32) -> u32 {
let sum1 = x;
if sum1 < 30 {
return double(2, sum1 + sum1);
} else {
return sum1 + sum1;
}
}
===> {inline call double(2, sum1 + sum1)}
function main(x: u32) -> u32 {
let sum1 = x;
if sum1 < 30 {
let sum2 = sum1 + sum1; // bind and rename parameter of function sum
if 2 > 1 && sum2 < 30 {
return double(1, sum2 + sum2);
} else {
return sum2 + sum2;
}
} else {
return sum1 + sum1;
}
}
===> {evaluate 2 > 1 to true and simplify if test}
function main(x: u32) -> u32 {
let sum1 = x;
if sum1 < 30 {
let sum2 = sum1 + sum1;
if sum2 < 30 {
return double(1, sum2 + sum2);
} else {
return sum2 + sum2;
}
} else {
return sum1 + sum1;
}
}
===> {inline call double(1, sum2 + sum2)}
function main(x: u32) -> u32 {
let sum1 = x;
if sum1 < 30 {
let sum2 = sum1 + sum1;
if sum2 < 30 {
let sum3 = sum2 + sum2; // bind and rename parameter of function sum
if 1 > 1 && sum3 < 30 {
return double(0, sum3 + sum3);
} else {
return sum3 + sum3;
}
} else {
return sum2 + sum2;
}
} else {
return sum1 + sum1;
}
}
===> {evaluate 1 > 1 to false and simplify if statement}
function main(x: u32) -> u32 {
let sum1 = x;
if sum1 < 30 {
let sum2 = sum1 + sum1;
if sum2 < 30 {
let sum3 = sum2 + sum2;
return sum3 + sum3;
} else {
return sum2 + sum2;
}
} else {
return sum1 + sum1;
}
}
```
But here is an example in which the inlining does not terminate:
```js
function forever(const n: u32) {
forever(n);
}
function main() {
forever(5);
}
===> {inline call forever(5)}
function main() {
forever(5);
}
===> {inline call forever(5)}
...
```
## Design
### Configurable Limit
Our proposed approach to avoid non-termination
when inlining recursive functions is to
(i) keep track of the depth of the inlining call stack and
(ii) stop when a certain limit is reached.
If the limit is reached,
the compiler will provide an informative message to the user,
explaining which recursive calls caused the limit to be reached.
The limit is configurable by the user.
In particular, based on the informative message described above,
the user may decide to re-attempt compilation with a higher limit.
Both variants of the `double` example given earlier reach depth 3
(if we start with depth 0 at `main`).
The default limit (i.e. when the user does not specify one)
should be chosen in a way that
the compiler does not take too long to reach the limit.
Since inlining larger functions
takes more time than inlining smaller functions,
we may consider adjusting the default limit
based on some measure of the complexity of the functions.
In any case, compiler responsiveness is a broader topic.
As the Leo compiler sometimes performs expensive computations,
it may be important that it provide periodic output to the user,
to reassure them that the compiler is not stuck.
We will add a flag to the `leo` CLI whose long form is
```
--inline-limit
```
and whose short form is
```
-il
```
This option is followed by a number (more precisely, a positive integer)
that specifies the limit to the depth of the inlining stack.
The name of this option has been chosen
according to a `--...-limit` template
that may be used to specify other kinds of limits,
as discussed later.
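For illustration, a hypothetical invocation with an increased limit might look as follows (the value 100 is arbitrary):
```bash
leo build --inline-limit 100
# or, using the short form:
leo build -il 100
```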
In Aleo Studio, this compiler option is presumably specified
via GUI preferences and build configurations.
### Circularity Detection
Besides the depth of the inlining call stack,
the compiler could also keep track of
the values of the `const` arguments at each recursive call.
If the same argument values are encountered twice,
they indicate a circularity
(see the discussion on termination analysis below):
in that case, there is no need to continue inlining until the limit is reached,
and the compiler can show to the user the trace of circular calls.
This approach would readily reject the `forever` example given earlier.
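As a hedged sketch, circularity detection would also catch cases in which the `const` arguments repeat with a period greater than one, for example:
```js
// hypothetical example: the const argument cycles 4, 2, 4, 2, ...
function g(const n: u32, x: u32) -> u32 {
    if n > 100 {
        return x;
    } else {
        return g(6 - n, x + x);
    }
}

function main(x: u32) -> u32 {
    return g(4, x);
}
```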
## Drawbacks
This proposal does not appear to bring any real drawbacks,
other than making the compiler inevitably more complex.
But the benefit of supporting recursion justifies the extra complexity.
## Effect on Ecosystem
This proposal does not appear to have any direct effects on the ecosystem.
It simply enables certain Leo programs to be written in a more natural style.
## Alternatives
An alternative approach is to treat recursion analogously to loops.
That is, we could restrict the forms of allowed recursion
to ones whose inlining is known to terminate at compile time.
However, the configurable limit approach seems more flexible.
It does not even preclude a termination analysis (discussed below).
Furthermore, in practical terms,
non-termination is not much different from excessively long computation,
and the configurable limit approach may be uniformly suitable
to avoid both non-termination and excessively long computation (discussed below).
## Future Extensions
### Termination Analysis
In general, a recursive function
(a generic kind of function, not necessarily a Leo function)
terminates when
there exists a _measure_ of its arguments
that decreases at each recursive call,
under the tests that govern the recursive call,
according to a _well-founded relation_.
This is well known in theorem proving,
where termination of recursive functions
is needed for logical consistency.
For example, the mathematical factorial function
on the natural numbers (i.e. non-negative integers)
```
n! =def= [IF n = 0 THEN 1 ELSE n * (n-1)!]
```
terminates because, if `n` is not 0, we have that `n-1 < n`,
and `<` is well-founded on the natural numbers;
in this example, the measure of the argument is the argument itself.
(A relation is well-founded when
it has no infinite strictly decreasing sequence;
note that, in the factorial example,
we are considering the `<` relation on natural numbers only,
not on all the integers).
This property is undecidable in general,
but there are many cases in which termination can be proved automatically,
as routinely done in theorem provers.
In Leo, we are interested in
the termination of the inlining transformations.
Therefore, the termination condition above
must involve the `const` parameters of recursive functions:
a recursive inlining in Leo terminates when
there exists a measure of the `const` arguments
that decreases at each recursive call,
under the tests that govern the recursive call,
according to a well-founded relation.
The governing test must necessarily involve the `const` parameters,
but they may involve variable parameters as well,
as one of the `double` examples given earlier shows.
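For instance, a Leo version of the factorial function (assuming recursion as proposed in this RFC) satisfies the condition: the measure is the `const` parameter `n` itself, which decreases under `<` at each recursive call governed by the test `n != 0`:
```js
function fact(const n: u32) -> u32 {
    if n == 0 {
        return 1;
    } else {
        return n * fact(n - 1);
    }
}
```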
We could have the Leo compiler attempt to recognize
recursive functions whose `const` parameters
satisfy the termination condition given above.
(This does not have to involve any proof search;
the compiler could just recognize structures
for which a known proof can be readily instantiated.)
If the recognition succeeds,
we know that the recursive inlining will terminate,
and possibly also in how many steps,
enabling the information to be presented to the user
in case the configurable limit is insufficient.
If the recognition fails,
the compiler falls back to inlining until
either inlining terminates or the limit is reached.
### Application to Loops
Loops are conceptually not different from recursion.
Loops and recursion are two ways to repeat computations,
and it is well-known that each can emulate the other in various ways.
Currently Leo restricts the allowed loop statements
to a form whose unrolling always terminates at compile time.
If we were to use a similar approach for recursion,
we would only allow certain forms of recursion
whose inlining always terminates at compile time
(see the discussion above about termination analysis).
Turning things around,
we could consider allowing general forms of loops (e.g. `while` loops)
and use a configurable limit to unroll loops.
We could also detect circularities
(when the values of the local constants of the loop repeat themselves).
We could also perform a termination analysis on loops,
which in particular would readily recognize
the currently allowed loop forms to terminate.
All of this should be the topic of a separate RFC.
### Application to Potentially Slow Transformations
Some flattening transformations in the Leo compiler are known to terminate,
but they may take an excessively long time to do so.
Examples include decomposing large arrays into their elements
or decomposing large integers (e.g. of type `u128`) into their bits.
Long compilation times have been observed for cases like these.
Thus, we could consider using configurable limits
not only for flattening transformations that may not otherwise terminate,
but also for ones that may take a long time to do so.
This is a broader topic that should be discussed in a separate RFC.

View File

@ -1,238 +0,0 @@
# Leo RFC 003: Imports Stabilization
## Authors
The Aleo Team.
## Status
IMPLEMENTED
## Summary
This proposal aims to improve the import management system in Leo programs to
make the program environment more reproducible, predictable and compatible. To achieve
that we suggest a few changes to the Leo CLI and Manifest:
- add a "dependencies" section to the Leo Manifest and add a command to pull those dependencies;
- allow custom names for imports to manually resolve name conflicts;
- add "curve" and "proving system" sections to the Manifest;
- add "include" and "exclude" parameters for "proving system" and "curve";
- add a lock file to store imported dependencies and their relations;
## Motivation
The current design of imports does not provide any guarantees on what's stored
in program imports and published with the program to Aleo Package Manager.
When a dependency is "added," it is stored inside the imports folder, and it is possible
to manually edit and/or add packages in this folder.
Also, imports are stored under the package name, which makes it impossible to import
two different packages with the same name.
Another important detail in the scope of this proposal is that, in the future, Leo
programs will have the ability to run with different proving systems
and curves, possibly creating incompatibility between programs written
for different proving systems or curves. To make a foundation for these features,
imports need to be managed with include/exclude lists for allowed (compatible)
proving systems and curves.
## Background
Leo supports packages and importing, similarly to other languages.
A Leo _project_ consists of a `main.leo` file
and zero or more additional `.leo` files;
the latter may be organized in subdirectories.
The content of each file is as defined by `file` in the ABNF grammar:
it consists of types, functions, etc.
Each non-main file forms a Leo _package_ (because it packages the types, functions, etc. that it defines),
which can be referenced via its path, without the `.leo` extension and with `/` replaced by `.`
(restrictions on the file and directory names derive from the definition of `package-name` in the ABNF grammar,
and ensure that the package names can be resolved to file system paths).
A Leo project also forms a package, which can be published in the Aleo Package Manager (APM),
a repository of Aleo packages, similar to `crates.io` in Rust.
The files in a Leo project can import entities (types, functions, etc.) from
not only local packages (i.e. other files in the same project),
but also packages in the APM.
This RFC is focused on importing packages from the APM.
Each package in the APM is uniquely identified by:
* The author, who must have a registered account on the APM, with a unique username.
* The package name, which is unique within each author's account.
* The package version, which allows different versions of the same package to be treated like different packages.
## Design
### Leo Manifest - target section
To lay the foundation for the future of the Leo ecosystem and to start integrating
information about programs compatibility, we suggest adding two new fields to
the new `[target]` section of the Leo Manifest: `proving_system` and `curve`.
Currently, the Leo compiler only supports `Groth16` for the proving system and `Bls12_377`
for the curve, which are meant to be default values in Leo Manifest.
```toml
[project]
name = "first"
version = "0.1.0"
description = "The first package"
license = "MIT"
[target]
curve = "Bls12_377"
proving_system = "Groth16"
```
These fields are meant to be used to determine whether the imported program is
compatible to the original when support for different curves and proving systems
is added.
### Leo Manifest - dependencies
Dependencies section:
```toml
[dependencies]
name = { author = "author", package = "package", version = "version" }
# alternative way of adding dependency record
[dependencies.name]
author = "author"
package = "package"
version = "1.0"
```
#### Parameters description
The `name` field sets the name of the dependency in Leo code. That way we allow
the developer to resolve collisions in import names manually. So, for example,
if a developer is adding the `howard/silly-sudoku` package to their program, they
might define its in-code name as `sudoku` and import it with that name:
```ts
import sudoku;
```
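For illustration, the corresponding manifest entry might look as follows (the version shown is hypothetical):
```toml
[dependencies]
sudoku = { author = "howard", package = "silly-sudoku", version = "0.1.0" }
```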
`package`, `author` and `version` are the package name, package author, and
version, respectively. They are already used as arguments in the `leo add`
command, so these fields are already familiar to Leo developers.
### Leo CLI
To support updating the Manifest, a new command should be added to Leo CLI.
```bash
# pull imports
leo fetch
```
### Imports Restructurization
One of the goals of the proposed changes is to allow importing packages with the
same name but different authors. This has to be solved not only on the
language level but also on the level of storing program imports.
We suggest using the set of all 3 possible program identifiers for the import
folder name: `author-package@version`. Later it can be extended to
include a hash for the version, but having the initial set already solves name
collisions.
So, the updated imports would look like:
```
leo-program
├── Leo.toml
├── README.md
├── imports
│ ├── author1-program@0.1.0
│ │ └── ...
│ ├── author2-program2@1.0.4
│ └── ...
├── inputs
│ └── ...
└── src
└── main.leo
```
This change also affects the way imports are being processed on the ASG
level, and we need to add an imports map as an argument to the Leo compiler.
The Leo Manifest's dependencies section needs to be parsed and passed as
a hashmap to the compiler:
```
first-program => author1-program@0.1.0
second-program => author2-program2@1.0.4
```
### Leo.lock
For the imports map to be generated and read by the Leo binary and then by the Leo compiler,
a lock file needs to be created. The lock file should be generated by the `leo fetch` command,
which will pull the dependencies, process their manifests, and put the required information
to the file in the root directory of the program called `Leo.lock`.
The suggested structure of this file is similar to the Cargo.lock file:
```
[[package]]
name = "suit-mk2"
version = "0.2.0"
author = "ironman"
import_name = "suit-mk2"
[package.dependencies]
garbage = "ironman-suit@0.1.0"
[[package]]
name = "suit"
version = "0.1.0"
author = "ironman"
import_name = "garbage"
```
In the example above, you can see that all program dependencies are defined as an
array called `package`. Each of the dependencies records its main information,
including the `import_name` field, which is the imported package's name in
the Leo program. It also stores relationships between these dependencies in the
`dependencies` field.
The format described here allows the Leo binary to form an imports map which can be
passed to the compiler.
It is important to note that the Leo.lock file is created only when a package has dependencies.
For programs with no dependencies, a lock file is not required and not created.
### Recursive Dependencies
This improvement introduces recursive dependencies. To solve this case preemptively,
the Leo CLI needs to check the dependency tree and throw an error when a recursive dependency
is encountered. We suggest implementing simple dependency tree checking while fetching
imports: if an imported dependency is already present at a higher level, abort the execution.
Later this solution can be improved by building a lock file containing all the
information on program dependencies, and the file itself will have enough data
to track and prevent recursion.
## Drawbacks
This change might require updating already published programs on the Aleo PM due to
the Leo Manifest change. However, it is possible to implement it in a backward-compatible
way.
It also introduces the danger of having recursive dependencies, but this problem is addressed in the Design section above.
## Effect on Ecosystem
The proposed improvement provides safety inside Leo programs and should not affect
the ecosystem except for the tools which use Leo directly (such as Aleo Studio).
It is possible that some of the proposed features will open new features on Aleo PM.
## Alternatives
Another approach to the stated cases is to keep everything as we have now but change
the way programs are imported and stored and make names unique. Also, the current
implementation provides some guarantees on import stability and consistency.

View File

@ -1,223 +0,0 @@
# Leo RFC 004: Integer Type Casts
## Authors
The Aleo Team.
## Status
FINAL
## Summary
This proposal provides support for casts among integer types in Leo.
The syntax is similar to Rust.
The semantics is _value-preserving_,
i.e. the casts just serve to change types
but cause errors when the mathematical values are not representable in the new types.
## Motivation
Currently the Leo integer types are "siloed":
arithmetic integer operations require operands of the same type
and return results of the same type.
There are no implicit or explicit ways to turn, for example,
a `u8` into a `u16`, even though
every non-negative integer that fits in 8 bits also fits in 16 bits.
However, the ability to convert values between different (integer) types
is a useful feature that is normally found in programming languages.
## Background
Leo supports the following _integer types_:
```
u8 u16 u32 u64 u128
i8 i16 i32 i64 i128
```
Those are for unsigned and signed integers of 8, 16, 32, 64, and 128 bits.
## Design
### Scope
This RFC proposes type casts between any two integer types,
but not between two non-integer types
or between an integer type and a non-integer type.
This RFC does not propose any implicit cast,
even widening casts (i.e. upcasts)
from a type to another type with the same signedness
and with the same or larger size
(e.g. from `u8` to `u16`).
All the type casts must be explicit.
### Syntax and Static Semantics
The proposed syntax is
```
<expression> as <integer-type>
```
where `<expression>` must have an integer type.
The ABNF grammar of Leo is modified as follows:
```
; add this rule:
cast-expression = unary-expression
/ cast-expression %s"as" integer-type
; modify this rule:
exponential-expression = cast-expression
/ cast-expression "**" exponential-expression
```
There is no need to modify the `keyword` rule
because it already includes `as` as one of the keywords.
Note the use of `integer-type` in the `cast-expression` rule.
The above grammar rules imply that casts bind
tighter than binary operators and looser than unary operators.
For instance,
```
x + - y as u8
```
is like
```
x + ((- y) as u8)
```
This precedence is the same as in Rust:
see [here](https://doc.rust-lang.org/stable/reference/expressions.html#expression-precedence).
### Dynamic Semantics
When the mathematical integer value of the expression
is representable in the type that the expression is cast to,
there is no question that the cast must succeed
and merely change the type of the Leo value,
but not its mathematical integer value.
This is always the case when the cast is to a type
with the same signedness and with the same or larger size.
This is also the case when
the cast is to a type whose range does not cover the range of the source type
but the value in question is in the intersection of the two ranges.
When the mathematical integer value of the expression
is not representable in the type that the expression is cast to,
there are two possible approaches:
_value-preserving casts_,
which just serve to change types
but cause errors when values are not representable in the new types;
and _value-changing casts_,
which never cause errors but may change the mathematical values.
This RFC proposes value-preserving casts;
value-changing casts are discussed in the 'Alternatives' section,
for completeness.
With value-preserving casts,
when the mathematical integer value of the expression
is not representable in the type that the expression is cast to,
it is an error.
That is, we require casts to always preserve the mathematical integer values.
Recall that all inputs are known at compile time in Leo,
so these checks can be performed easily.
Thus integer casts only serve to change types, never values.
When values are to be changed, separate (built-in) functions can be used,
e.g. to mask bits and achieve the same effect as
the value-changing casts discussed below.
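As a hedged illustration of the proposed semantics (the specific values are just examples):
```
let a: u16 = 300;
let b: u8 = a as u8; // error: 300 is not representable in u8, so the cast is rejected
let c: u16 = 200;
let d: u8 = c as u8; // ok: the value 200 is preserved, only the type changes
```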
This approach is consistent with Leo's treatment of potentially erroneous situations like integer overflows.
The principle is that developers should explicitly use
operations that may overflow if that is their intention,
rather than having those situations possibly occur unexpectedly.
A value-preserving cast to a type
whose range does not cover the original type's range
implicitly expresses a developer expectation that the value
is actually in the intersection of the two types' ranges,
in the same way that the use of integer addition
implicitly expresses the expectation that the addition does not overflow.
Consider this somewhat abstract example:
```
... // some computations on u32 values, which could not be done with u16
let r: u32 = ...; // this is the final result of the u32 operations above
let s: u16 = r as u16; // but r is expected to fit in u16, so we cast it here
```
With value-preserving casts, the expectation mentioned above
is checked by the Leo compiler during proof generation,
in the same way as with integer overflow.
In the example above,
if instead the variable `s` is meant to contain the low 16 bits of `r`,
e.g. in a cryptographic computation,
then the value-preserving cast should be preceded by
an explicit operation to obtain the low 16 bits, making the intent clear:
```
... // some computations on u32 values, which could not be done with u16
let r: u32 = ...; // this is the final result of the u32 operations above
let r_low16: u32 = r & 0xFFFF; // assuming we have bitwise ops and hex literals
let s: u16 = r_low16 as u16; // no value change here
```
### Compilation to R1CS
It may be more efficient (in terms of number of R1CS constraints)
to compile Leo casts as if they had a value-changing semantics.
If the R1CS constraints represent Leo integers as bits,
the bits of the new value can be determined from the bits of the old value,
with additional zero or sign extension bits when needed
(see the details of the value-changing semantics in the 'Alternatives' section).
There is no need to add checks to the R1CS constraints
because the compiler ensures that the cast values do not actually change given the known inputs,
and therefore the value-changing and value-preserving semantics are equivalent on the known inputs.
The idea is that the R1CS constraints can have a "don't care" behavior on inputs that cause errors in Leo.
## Drawbacks
This proposal does not appear to bring any drawbacks,
other than making the language and compiler inevitably more complex.
But the benefits to support type casts justifies the extra complexity.
## Effect on Ecosystem
This proposal does not appear to have any direct effects on the ecosystem.
## Alternatives
As mentioned above, an alternative semantics for casts is value-changing:
1. `uN` to `uM` with `N < M`: just change type of value.
2. `uN` to `uM` with `N > M`: take low `M` bits of value.
3. `iN` to `iM` with `N < M`: just change type of value.
4. `iN` to `iM` with `N > M`: take low `M` bits of value.
5. `uN` to `iM` with `N < M`: zero-extend to `M` bits and re-interpret as signed.
6. `uN` to `iM` with `N > M`: take low `M` bits and re-interpret as signed.
7. `uN` to `iN`: re-interpret as signed
8. `iN` to `uM` with `N < M`: sign-extend to `M` bits and re-interpret as unsigned.
9. `iN` to `uM` with `N > M`: take low `M` bits and re-interpret as unsigned.
10. `iN` to `uN`: re-interpret as unsigned
Except for the 1st, 3rd, and 5th cases, the value may change.
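For instance, under this value-changing semantics, casting `300` from `u16` to `u8` falls under case 2 and silently takes the low 8 bits:
```
let r: u16 = 300;    // 300 = 0b1_0010_1100
let s: u8 = r as u8; // low 8 bits: 0b0010_1100 = 44 (value changed)
```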
This value-changing approach is common in other programming languages.
However, it should be noted that other programming languages
typically do not check for overflow in integer operations either
(at least, not for production code).
Presumably, the behavior of type casts in those programming languages
is motivated by efficiency of execution, at least in part.
Since in Leo the input data is available at compile time,
considerations that apply to typical programming languages
do not necessarily apply to Leo.
Back to the somewhat abstract example in the section on value-preserving casts,
note that, with value-changing casts, the expectation that the final result fits in `u16`
would have to be checked with explicit code:
```
... // some computations on u32 values, which could not be done with u16
let r: u32 = ...; // this is the final result of the u32 operations above
if (r > 0xFFFF) {
... // error
}
let s: u16 = r as u16; // could change value in principle, but does not here
```
However, it would be easy for a developer to neglect to add the checking code,
and thus have the Leo code silently produce an unexpected result.

View File

@ -1,285 +0,0 @@
# Leo RFC 005: Countdown Loops
## Authors
The Aleo Team.
## Status
IMPLEMENTED
# Summary
This proposal suggests adding countdown loops and inclusive loop ranges into the Leo language.
# Motivation
In the current design of the language, only incrementing ranges are allowed, though
in some cases there is a need for loops going in the reverse direction. These examples
demonstrate the shaker sort and bubble sort algorithms, where countdown loops are mocked:
```ts
function shaker_sort(a: [u32; 10], const rounds: u32) -> [u32; 10] {
for k in 0..rounds {
for i in 0..9 {
if a[i] > a[i + 1] {
let tmp = a[i];
a[i] = a[i + 1];
a[i + 1] = tmp;
}
}
for j in 0..9 { // j goes from 0 to 8
let i = 8 - j; // j is flipped
if a[i] > a[i + 1] {
let tmp = a[i];
a[i] = a[i + 1];
a[i + 1] = tmp;
}
}
}
return a;
}
```
```ts
function bubble_sort(a: [u32; 10]) -> [u32; 10] {
for i in 0..9 { // i counts up
for j in 0..9-i { // i is flipped
if (a[j] > a[j+1]) {
let tmp = a[j];
a[j] = a[j+1];
a[j+1] = tmp;
}
}
}
return a
}
```
Having a countdown loop in the examples above could improve readability and
usability of the language by making it more natural to the developer.
However, if we imagined this example using a countdown loop, we would see that
it wouldn't be possible to count down to 0, because the first bound of the range is
inclusive, the second is exclusive, and loop ranges must use only unsigned integers.
```ts
// loop goes 0,1,2,3,4,5,6,7,8
for i in 0..9 { /* ... */ }
// loop goes 9,8,7,6,5,4,3,2,1
for i in 9..0 { /* ... */ }
```
Hence a direct implementation of countdown loop ranges would create asymmetry (1)
and would not allow loops to count down to 0 (2). To implement countdown loops and
solve these two problems, we suggest adding inclusive range bounds.
# Design
## Countdown loops
Countdown ranges do not need any changes to the existing syntax. However, their
functionality needs to be implemented in the compiler.
```ts
for i in 5..0 {}
```
## Inclusive ranges
To solve loop asymmetry and to improve loop ranges in general, we suggest adding
an inclusive range operator to Leo. An inclusive range would extend the second bound
of the loop, making it inclusive (instead of the default, exclusive),
therefore allowing countdown loops to reach the value 0.
```ts
// default loop: 0,1,2,3,4
for i in 0..5 {}
// inclusive range: 0,1,2,3,4,5
for i in 0..=5 {}
```
## Step and Direction
We remark that the step of both counting-up and counting-down loops is implicitly 1;
that is, the loop variable is incremented or decremented by 1.
Whether the loop counts up or down is determined by how the starting and ending bounds compare.
Note that the bounds are not necessarily literals;
they may be more complex `const` expressions, and thus in general their values are resolved at code flattening time.
Because of the type restrictions on bounds, their values are always non-negative integers.
If `S` is the integer value of the starting bound and `E` is the integer value of the ending bound,
there are several cases to consider:
1. If `S == E` and the ending bound is exclusive, there is no actual loop; the range is empty.
2. If `S == E` and the ending bound is inclusive, the loop consists of just one iteration; the loop counts neither up nor down.
3. If `S < E` and the ending bound is exclusive, the loop counts up, from `S` to `E-1`.
4. If `S < E` and the ending bound is inclusive, the loop counts up, from `S` to `E`.
5. If `S > E` and the ending bound is exclusive, the loop counts down, from `S` to `E+1`.
6. If `S > E` and the ending bound is inclusive, the loop counts down, from `S` to `E`.
Cases 3 and 5 consist of one or more iterations; cases 4 and 6 consist of two or more iterations.
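A hedged sketch of the six cases with concrete literal bounds:
```ts
for i in 3..3  { /* ... */ } // case 1: empty range, no iterations
for i in 3..=3 { /* ... */ } // case 2: one iteration, i = 3
for i in 1..4  { /* ... */ } // case 3: counts up, i = 1, 2, 3
for i in 1..=4 { /* ... */ } // case 4: counts up, i = 1, 2, 3, 4
for i in 4..1  { /* ... */ } // case 5: counts down, i = 4, 3, 2
for i in 4..=1 { /* ... */ } // case 6: counts down, i = 4, 3, 2, 1
```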
## Examples
The code examples demonstrated in the Motivation part of this document
could be extended (or simplified) with the suggested syntax:
```ts
function shaker_sort(a: [u32; 10], const rounds: u32) -> [u32; 10] {
for k in 0..rounds {
for i in 0..9 { // i goes from 0 to 8
if a[i] > a[i + 1] {
let tmp = a[i];
a[i] = a[i + 1];
a[i + 1] = tmp;
}
}
for i in 8..=0 { // i goes from 8 to 0
if a[i] > a[i + 1] {
let tmp = a[i];
a[i] = a[i + 1];
a[i + 1] = tmp;
}
}
}
return a;
}
```
```ts
function bubble_sort(a: [u32; 10]) -> [u32; 10] {
for i in 9..0 { // counts down
for j in 0..i { // no flipping
if (a[j] > a[j+1]) {
let tmp = a[j];
a[j] = a[j+1];
a[j+1] = tmp;
}
}
}
return a
}
```
# Drawbacks
No obvious drawback.
# Effect on Ecosystem
Suggested change should have no effect on ecosystem because of its backward compatibility.
# Alternatives
## Mocking
Countdown loops can be mocked manually.
## Exclusive Starting Bounds
While the ability to designate the ending bound of a loop as either exclusive or inclusive is critical as discussed below,
we could also consider adding the ability to designate the starting bound of a loop as either exclusive or inclusive.
If we do that, we run into a sort of asymmetry in the defaults for starting and ending bounds:
the default for the starting bound is inclusive, while the default for ending bounds is exclusive.
The most symmetric but verbose approach is exemplified as follows:
* `0=..=5` for `0 1 2 3 4 5`
* `0<..=5` for `1 2 3 4 5`
* `0=..<5` for `0 1 2 3 4`
* `0<..<5` for `1 2 3 4`
* `5=..=0` for `5 4 3 2 1 0`
* `5>..=0` for `4 3 2 1 0`
* `5=..>0` for `5 4 3 2 1`
* `5>..>0` for `4 3 2 1`
That is, this approach makes exclusiveness and inclusiveness explicit.
The use of `<` vs. `>` also indicates a loop direction, which can be inferred anyhow when the `const` bounds are resolved,
so that would entail an additional consistency check,
namely that the inequality signs are consistent with the inferred loop direction.
Within the symmetric approach above, there are different options for defaults.
The most symmetric default would be perhaps `=` for both bounds,
but that would be a different behavior from current Leo.
We could instead go for different defaults for starting and ending bounds,
i.e. `=` for the starting bound and `<` or `>` (depending on direction) for the ending bound.
A drawback of this approach is that it is somewhat verbose.
Furthermore, some of the authors of this RFC do not find it very readable.
## Flipping Bound Defaults for Countdown
In the proposed design, there is an asymmetry between the treatment of loops that count up vs. down.
This can be seen clearly by thinking how to iterate through an array of size `N`:
```ts
for i in 0..n { ... a[i] ... } // count up -- 0 1 2 ... n-1
for i in n-1..=0 { ... a[i] ... } // count down -- n-1 ... 2 1 0
```
While the loop that counts up has nice and simple bounds `0` and `n`,
the loop that counts down needs `n-1` and `=0`.
So a possible idea is to use different defaults depending on the loop direction:
* For a loop that counts up:
* The starting (i.e. lower) bound is always inclusive.
* The ending (i.e. upper) bound is exclusive by default, inclusive with `=`.
* For a loop that counts down:
* The ending (i.e. lower) bound is always inclusive.
* The starting (i.e. upper) bound is exclusive by default, inclusive with `=`.
That is, different defaults apply to lower vs. upper bound, rather than to starting and ending bounds.
Things become more symmetric in a way:
```ts
for i in 0..n { ... a[i] ... } // count up -- 0 1 2 ... n-1
for i in n..0 { ... a[i] ... } // count down -- n-1 ... 2 1 0
```
This is also consistent with Rust in a way,
where countdown loops are obtained by reversing the increasing range into a decreasing range, which flips the bounds.
However, if we consider a possible extension in which the step may be larger than 1, we run into some awkwardness.
Imagine an extension in which `step` is specified:
```ts
for i in 10..0 step 2 ... // i = 8 6 4 2 0 -- starts at 10-2 = 8
for i in 10..0 step 3 ... // i = 9 6 3 0 -- starts at 10-1 = 9
```
Note how the actual starting index does not depend on starting/upper bound and step,
but rather on ending/lower bound and step, and must be calculated explicitly;
it doesn't "jump out" at the reader.
## Explicit Indication of Loop Direction
Another idea that was brought up is to always write the range as `<lower>..<upper>`,
but include an explicit indication when the loop must count down, e.g.
```ts
for i in 0..n down { ... array[i] ... } // where 'down' indicates count down
```
The advantages are that
we retain the default that the first/lower bound is inclusive and the second/upper bound is exclusive,
and the direction is explicit and does not have to be inferred.
The direction matches starting/ending bound to lower/upper bound or upper/lower bound.
But the awkwardness with larger steps than 1 remains:
```ts
for i in 0..10 down step 2 ... // i = 8 6 4 2 0 -- starts at 10-2 = 8
for i in 0..10 down step 3 ... // i = 9 6 3 0 -- starts at 10-1 = 9
```
## Variable in the Middle of Range with Equalities or Inequalities
Another approach is to put the variable in the middle of the range,
along with equality or inequality signs around the variable, e.g.
```ts
for 0 <= i < 5 // 0 1 2 3 4
for 0 <= i <= 5 // 0 1 2 3 4 5
for 5 > i >= 0 // 4 3 2 1 0
```
This maximizes explicitness, but it may need tweaking to avoid parsing ambiguities or difficulties
(recall that the bounds may be complex `const` expressions).
This could be a future addition to consider, but it seems that it would not replace the current Rust-like syntax.

View File

@ -1,155 +0,0 @@
# Leo RFC 006: Array Types with Unspecified Size
## Authors
The Aleo Team.
## Status
IMPLEMENTED
# Summary
This RFC proposes the addition, at the user level, of array types with unspecified size,
of the form `[T; _]`, where `T` is the element type and the underscore stands for an unspecified size.
It must be possible to infer the size at compile time.
When these types are used in a function,
different calls of the function (which are inlined) may resolve the sizes of these types to different values.
To make this extension more useful, this RFC also proposes the addition of
an operator to return the length of an array, whose result is resolved at compile time.
# Motivation
The initial motivation was the ability to have a type `string` for Leo strings,
which are currently represented as character arrays,
therefore requiring a size indication, i.e. `[char; <size>]`.
Allowing a `[char; _]` type, where `_` stands for an unspecified size,
makes it possible to define a type alias `string` for it,
once we also have an (orthogonal) extension of Leo to support type aliases.
However, allowing `[T; _]` for any `T` (not just `char`) is a more generally useful feature.
This kind of type is already used internally in the Leo compiler.
Allowing its use externally should provide additional convenience to the user.
Some examples are shown in the 'Design' section.
# Design
User-facing array types currently have a specified size, indicated as a positive integer according to the grammar and static semantics
(the grammar allows 0, via the `natural` rule, but the static semantics checks that the natural is not 0).
Internally, the Leo compiler uses array types with unspecified size in some cases, e.g. when taking a slice with non-literal bounds.
These internal unspecified sizes must be resolved at compile time (by evaluating the constant bound expressions), in order for compilation to succeed.
This RFC proposes to make array types with unspecified size available at the user level,
with the same requirement that their sizes must be resolved in order for compilation to succeed.
The ABNF grammar changes as follows:
```
; new rule:
array-dimension = natural / "_"
; modified rule:
array-dimensions = array-dimension
/ "(" array-dimension *( "," array-dimension ) ")"
```
That is, an array dimension may be unspecified; this is also the case for multidimensional array types.
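For example (a hedged sketch; the nested literal follows the existing multidimensional array syntax), one dimension may be specified and another left unspecified:
```
let m: [u8; (2, _)] = [[1, 2, 3], [4, 5, 6]]; // the unspecified dimension is resolved to 3
```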
Note that `array-dimension` is also referenced in this rule of the ABNF grammar:
```
; existing rule:
array-repeat-construction = "[" expression ";" array-dimensions "]"
```
The compiler will enforce, post-parsing, that array dimensions in array repeat expressions are positive integers, i.e. non-zero naturals.
This will be part of the static semantics of Leo.
Array types may appear, either directly or within other types, in the following constructs:
- Constant declarations, global or local to functions.
- Variable declarations, local to functions.
- Function inputs.
- Function outputs.
- Member variable declarations.
Thus, those are also the places where array types with unspecified size may occur.
An array type with unspecified size that occurs in a global constant declaration must be resolved to a unique size.
On the other hand, an array type with unspecified size that occurs in a function
(whether a variable declaration, function input, or function output)
could be resolved to different sizes for different inlined calls of the function.
Finally, there seems to be no point in allowing array types of unspecified sizes in member variable declarations:
the circuit type must be completely known, including the types of its member variables;
therefore, this RFC prescribes that array types with unspecified size be disallowed in member variable declarations.
(This may be revisited if a good use case, and procedure for resolution, comes up.)
## Examples
In the following example, the array type with unspecified size obviates the need to spell out the size (3),
since it can be resolved by the compiler:
```
let x: [u8; _] = [1, 2, 3];
```
Currently it is possible to omit the type of `x` altogether of course,
but then at least one of the elements must have a type suffix, e.g. `1u8`.
Using an array type of unspecified size for a function input makes the function generic over the size:
```
function f(x: [u8; _]) ...
```
That is, `f` can take an array of `u8` of any size, and perform some generic computation on it,
because different inlined calls of `f` may resolve the size to different values (at compile time).
But this brings up the issue discussed below.
## Array Size Operator
Currently Leo has no array size operator, which makes sense because arrays have known sizes.
However, if we allow array types with unspecified size as explained above,
we may also need to extend Leo with an array size operator.
For example, consider a function `f` as above, which takes as input an array of `u8` of unspecified size.
In order to do something with the array, e.g. add all its elements and return the sum,
`f` should be able to access the size of the array.
Thus, this RFC also proposes to extend Leo with such an operator.
A possibility is `<expression>.length`, where `<expression>` is an expression of array type.
A variation is `<expression>.len()`, if we want it look more like a built-in method on arrays.
Yet another option is `length(<expression>)`, which is more like a built-in function.
A shorter name could be `len`, leading to the three possibilities
`<expression>.len`, `<expression>.len()`, and `len(<expression>)`.
So one dimension of the choice is the name (`length` vs. `len`),
and another dimension is the style:
member variable style,
member function style,
or global function style.
The decision on the latter should be driven by broader considerations
of how we want to treat this kind of built-in operators.
Note that the result of this operator can, and in fact must, be calculated at compile time;
not as part of the Leo interpreter, but rather as part of the flattening of Leo to R1CS.
In other words, this is really a compile-time operator, akin to `sizeof` in C.
With that operator, the following function can be written:
```
function f(x: [u8; _]) -> u8 {
let sum = 0u8;
for i in 0..length(x) {
sum += x[i];
}
return sum;
}
```
# Drawbacks
None, aside from inevitably making the language and compiler slightly more complex.
# Effect on Ecosystem
None.
# Alternatives
None.
# Implementation Decisions
For the length of an array, we decided on `<expression>.len()`, where `<expression>` is any expression of array type.
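With that decision, the example function given earlier could be written as:
```
function f(x: [u8; _]) -> u8 {
    let sum = 0u8;
    for i in 0..x.len() {
        sum += x[i];
    }
    return sum;
}
```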

View File

@ -1,206 +0,0 @@
# Leo RFC 007: Type Aliases
## Authors
The Aleo Team.
## Status
IMPLEMENTED
# Summary
This RFC proposes the addition of type aliases to Leo,
i.e. identifiers that abbreviate types and can be used wherever the latter can be used.
A new top-level construct is proposed to define type aliases; no circularities are allowed.
Type aliases are expanded away during compilation.
# Motivation
Many programming languages provide the ability to create aliases (i.e. synonyms) of types, such as C's `typedef`.
The purpose may be to abbreviate a longer type,
such as an alias `matrix` for `[i32; (3, 3)]` in an application in which 3x3 matrices of 32-bit integers are relevant
(e.g. for 3-D rotations, even though fractional numbers may be more realistic).
The purpose may also be to clarify the intent and use of an existing type,
such as an alias `balance` for `u64` in an application that keeps track of balances.
The initial motivation that inspired this RFC (along with other RFCs)
was the ability to have a type `string` for strings.
Strings are arrays of characters according to RFC 001.
With the array types of unspecified size proposed in RFC 006,
`[char; _]` becomes a generic type for strings, which is desirable to alias with `string`.
# Design
## Syntax
The ABNF grammar changes as follows:
```
; modified rule:
keyword = ...
/ %s"string"
/ %s"type" ; new
/ %s"u8"
/ ...
; new rule:
type-alias-declaration = %s"type" identifier "=" type ";"
; modified rule:
declaration = import-declaration
/ function-declaration
/ circuit-declaration
/ constant-declaration
/ type-alias-declaration ; new
```
A type alias declaration introduces the identifier to stand for the type.
Only top-level type alias declarations are supported;
they are not supported inside functions or circuit types.
In addition, the following changes to the grammar are appropriate.
First, the rule
```
circuit-type = identifier / self-type ; replace with the one below
```
should be replaced with the rule
```
circuit-or-alias-type = identifier / self-type
```
The reason is that, at parsing time, an identifier is not necessarily a circuit type;
it may be a type alias that may expand to a (circuit or non-circuit type).
Thus, the nomenclature `circuit-or-alias-type` is appropriate.
Consequently, references to `circuit-type` in the following rules must be replaced with `circuit-or-alias-type`:
```
; modified rule:
circuit-construction = circuit-or-alias-type "{" ; modified
circuit-inline-element
*( "," circuit-inline-element ) [ "," ]
"}"
; modified rule:
postfix-expression = primary-expression
/ postfix-expression "." natural
/ postfix-expression "." identifier
/ identifier function-arguments
/ postfix-expression "." identifier function-arguments
/ circuit-or-alias-type "::" identifier function-arguments ; modified
/ postfix-expression "[" expression "]"
/ postfix-expression "[" [expression] ".." [expression] "]"
```
Second, the rule
```
aggregate-type = tuple-type / array-type / circuit-type
```
should be removed, because if we replaced `circuit-type` with `circuit-or-alias-type` there,
the identifier could be a type alias, not necessarily an aggregate type.
(The notion of aggregate type remains at a semantic level, but has no longer a place in the grammar.)
Consequently, the rule
```
type = scalar-type / aggregate-type
```
should be rephrased as
```
type = scalar-type / tuple-type / array-type / circuit-or-alias-type
```
which "inlines" the previous `aggregate-type` with `circuit-type` replaced with `circuit-or-alias-type`.
## Semantics
There must be no direct or indirect circularity in the type aliases.
That is, it must be possible to expand all the type aliases away,
obtaining an equivalent program without any type aliases.
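For example, declarations like the following would be rejected (a hedged sketch):
```ts
type a = b; // direct circularity: a -> b -> a
type b = a;
```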
Note that the built-in `Self` is a bit like a type alias, standing for the enclosing circuit type;
and `Self` is replaced with the enclosing circuit type during canonicalization.
Thus, canonicalization could be a natural place to expand user-defined type aliases;
after all, type aliases introduce multiple ways to denote the same types
(and not just via direct aliasing, but also via indirect aliasing, or via aliasing of components),
and canonicalization serves exactly to reduce multiple ways to say the same thing to one canonical way.
On the other hand, expanding type aliases is more complicated than the current canonicalization transformations,
which are all local and relatively simple.
Expanding type aliases requires not only checking for circularities,
but also taking into account references to type aliases from import declarations.
For this reason, we may perform type alias expansion after canonicalization,
such as just before type checking and inference.
We could also make the expansion a part of the type checking and inference process,
which already transforms the program by inferring missing types,
so it could also expand type aliases away.
In any case, it seems beneficial to expand type aliases away
(whether during canonicalization or as part or preamble to type checking and inference)
prior to performing more processing of the program for eventual compilation to R1CS.
## Examples
The aforementioned 3x3 matrix example could be written as follows:
```ts
type matrix = [u32; (3, 3)];
function matrix_multiply(x: matrix, y: matrix) -> matrix {
...
}
```
The aforementioned balance example could be written as follows:
```ts
type balance = u64;
function f(...) -> (..., balance, ...) {
...
}
```
The aforementioned string example could be written as follows:
```ts
type string = [char; _];
function f(str: string) -> ... {
...
}
```
# Drawbacks
As other extensions of the language, this makes things inherently a bit more complicated.
# Effect on Ecosystem
None; this is just a convenience for the Leo developer.
# Alternatives
An alternative to creating a type alias
```ts
type T = U;
```
is to create a circuit type
```ts
circuit T { get: U }
```
that contains a single member variable.
This is clearly not equivalent to a type alias, because it involves conversions between `T` and `U`
```ts
T { get: u } // convert u:U to T
t.get // convert t:T to U
```
whereas a type alias involves no conversions:
if `T` is an alias of `U`, then `T` and `U` are the same type,
more precisely two syntactically different ways to designate the same semantic type.
While the conversions generally cause overhead in traditional programming languages,
this may not be the case for Leo's compilation to R1CS,
in which everything is flattened, including member variables of circuit types.
Thus, it may be the case that the circuit `T` above reduces to just its member `U` in R1CS.
It might also be argued that wrapping a type into a one-member-variable circuit type
could be a better practice than aliasing the type, to enforce better type separation and safety.
We need to consider the pros and cons of the two approaches,
particularly in light of Leo's non-traditional compilation target.

View File

@ -1,52 +0,0 @@
# Leo RFC 008: Built-in Declarations
## Authors
The Aleo Team.
## Status
IMPLEMENTED
## Summary
This RFC proposes a framework for making certain (top-level) declarations (e.g. type aliases) available in every Leo program without the need to write those declarations explicitly. These may be hardwired into the language or provided by standard libraries/packages; in the latter case, the libraries may be implicitly imported or required to be explicitly imported.
## Motivation
It is common for programming languages to provide predefined types, functions, etc.
that can be readily used in programs. The initial motivation for this in Leo was to have a type alias `string` for character arrays of unspecified sizes (array types of unspecified sizes and type aliases are discussed in separate RFCs), but the feature is clearly more general.
## Design
Leo supports five kinds of top-level declarations:
- Import declarations.
- Function declarations.
- Circuit type declarations.
- Global constant declarations.
- Type alias declarations. (Proposed in a separate RFC.)
Leaving import declarations aside for the moment since they are "meta" in some sense
(as they bring in names of entities declared elsewhere),
it may make sense for any of the four kinds of declarations above to have built-in instances, i.e., we could have some built-in functions, circuit types, global constants, and type aliases. These features are why this RFC talks of built-in declarations, more broadly than just built-in type aliases that inspired it.
The built-in status of the envisioned declarations will be done through explicitly declared standard library(stdlib) files. Then these stdlib files must expressly be imported, except the files found in stdlib/prelude/*. The ones found in the prelude are features determined to be helpful enough in standard programs and are auto-imported.
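As a hedged illustration: a type alias such as `string` (assumed here to live in the prelude) would be usable without any import declaration, while other stdlib files would still require an explicit `import`:
```ts
// hypothetical: `string` comes from the auto-imported prelude, so no import is needed
function main(s: string) -> string {
    return s;
}
```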
## Drawbacks
This does not seem to bring any drawbacks.
## Effect on Ecosystem
This change may interact with libraries and packages in some way.
But it should not be much different from standard libraries/packages.
## Alternatives
Some alternative approaches are:
1. Having all stdlib files auto-imported.
2. Requiring that all stdlib files be explicitly imported.
The difference between the two approaches above and the chosen one is just how many imports must be written explicitly.

View File

@ -1,277 +0,0 @@
# Leo RFC 009: Conversions with Bits and Bytes
## Authors
The Aleo Team.
## Status
FINAL
# Summary
This RFC proposes the addition of natively implemented global functions to perform conversions
between Leo integer values and sequences of bits or bytes in big endian or little endian order.
This RFC also proposes a future transition from these functions to methods associated to the integer types.
# Motivation
Conversions of integers to bits and bytes are fairly common in programming languages.
Use cases include communication with the external world
(since external data is sometimes represented as bits and bytes rather than higher-level data structures),
and serialization/deserialization for cryptographic purposes (e.g. hashing data).
# Design
## Concepts
The Leo integer values can be thought of as sequences of bits.
Therefore, it makes sense to convert between integer values and their corresponding sequences of bits;
the sequences of bits can be in little or big endian order (i.e. least vs. most significant bit first),
naturally leading to two possible conversions.
Obviously, the bits represent the integers in base 2.
Since all the Leo integer values consist of multiples of 8 bits,
it also makes sense to convert between integer values and sequences of bytes,
which represent the integers in base 256.
Again, the bytes may be in little or big endian order.
It could also make sense to convert between integers consisting of `N` bits
and sequences of "words" of `M` bits if `N` is a multiple of `M`,
e.g. convert a `u32` into a sequence of two `u16`s, or convert a `u128` into a sequence of four `u32`s.
However, the case in which `M` is 1 (bits) or 8 (bytes) is by far the most common,
and therefore the initial focus of this RFC;
nonetheless, it seems valuable to keep these possible generalizations in mind as we work through this initial design.
Another possible generalization is to lift these conversions to sequences,
e.g. converting from a sequence of integer values to a sequence of bits or bytes
by concatenating the results of converting the integer values,
and converting from a sequence of bits or bytes to a sequence of integer values
by grouping the bits or bytes into chunks and converting each chunk into an integer.
For instance, a sequence of 4 `u32` values can be turned into a sequence of 32 bytes or a sequence of 128 bits.
Note that, in these cases, the endianness only applies to the individual element conversion,
not to the ordering of the integer values, which should be preserved by the conversion.
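As a sketch of this lifting (using the per-element function `u32_to_bytes_le` proposed later in this RFC, and assuming mutable local arrays and expression-based indexing):
```ts
// Convert a sequence of four u32 values to 16 bytes by concatenating
// the little-endian byte conversions of the individual elements.
function u32x4_to_bytes_le(xs: [u32; 4]) -> [u8; 16] {
    let out: [u8; 16] = [0u8; 16];
    for i in 0..4 {
        let bytes: [u8; 4] = u32_to_bytes_le(xs[i]);
        for j in 0..4 {
            out[i * 4 + j] = bytes[j];
        }
    }
    return out;
}
```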
Besides integers, it could make sense to consider converting other Leo values between bits and bytes,
namely characters, field elements, group elements, and addresses (but perhaps not booleans).
If this is further extended to aggregate values (tuples, arrays, and circuits),
then this moves towards a general serialization/deserialization library for Leo, which could be a separate feature.
## Representation of Bits
In Leo's current type system, bits can be represented as `bool` values.
These are not quite the numbers 0 and 1, but they are isomorphic, and it is easy to convert between booleans and bits:
```ts
// convert a boolean x to a bit:
(x ? 1 : 0)
// convert a bit y to a boolean:
(y == 1)
```
If Leo had a type `u1` for unsigned 1-bit integers, we could use that instead of `bool`.
Separately from this RFC, such a type could be added.
There is also an outstanding proposal (not in an RFC currently) to support types `uN` and `iN` for every positive `N`,
in which case `u1` would be an instance of that.
## Representation of Bytes
The type `u8` is the natural way to represent a byte.
The type `i8` is isomorphic to that, but we tend to think of bytes as unsigned.
## Representation of Sequences
This applies to the sequence of bits or bytes that a Leo integer converts to or from.
E.g. a `u32` is converted to/from a sequence of bits or bytes.
Sequences in Leo may be naturally represented as arrays or tuples.
Arrays are more flexible; in particular, they allow indexing via expressions rather than just numbers, unlike tuples.
Thus, arrays are the natural choice to represent these sequences.
## Conversion Functions
We propose the following global functions,
for which we write declarations without bodies below,
since the implementation is native.
(It is a separate issue whether the syntax below should be allowed,
in order to represent natively implemented functions,
or whether there should be a more explicit indication, such as the `native` keyword in Java.)
These are tentative names, which we can tweak.
What is more important is the selection of operations, and their input/output types.
### Conversions between Integers and Bits
```ts
// unsigned to bits, little and big endian
function u8_to_bits_le(x: u8) -> [bool; 8];
function u8_to_bits_be(x: u8) -> [bool; 8];
function u16_to_bits_le(x: u16) -> [bool; 16];
function u16_to_bits_be(x: u16) -> [bool; 16];
function u32_to_bits_le(x: u32) -> [bool; 32];
function u32_to_bits_be(x: u32) -> [bool; 32];
function u64_to_bits_le(x: u64) -> [bool; 64];
function u64_to_bits_be(x: u64) -> [bool; 64];
function u128_to_bits_le(x: u128) -> [bool; 128];
function u128_to_bits_be(x: u128) -> [bool; 128];
// signed to bits, little and big endian
function i8_to_bits_le(x: i8) -> [bool; 8];
function i8_to_bits_be(x: i8) -> [bool; 8];
function i16_to_bits_le(x: i16) -> [bool; 16];
function i16_to_bits_be(x: i16) -> [bool; 16];
function i32_to_bits_le(x: i32) -> [bool; 32];
function i32_to_bits_be(x: i32) -> [bool; 32];
function i64_to_bits_le(x: i64) -> [bool; 64];
function i64_to_bits_be(x: i64) -> [bool; 64];
function i128_to_bits_le(x: i128) -> [bool; 128];
function i128_to_bits_be(x: i128) -> [bool; 128];
// unsigned from bits, little and big endian
function u8_from_bits_le(x: [bool; 8]) -> u8;
function u8_from_bits_be(x: [bool; 8]) -> u8;
function u16_from_bits_le(x: [bool; 16]) -> u16;
function u16_from_bits_be(x: [bool; 16]) -> u16;
function u32_from_bits_le(x: [bool; 32]) -> u32;
function u32_from_bits_be(x: [bool; 32]) -> u32;
function u64_from_bits_le(x: [bool; 64]) -> u64;
function u64_from_bits_be(x: [bool; 64]) -> u64;
function u128_from_bits_le(x: [bool; 128]) -> u128;
function u128_from_bits_be(x: [bool; 128]) -> u128;
// signed from bits, little and big endian
function i8_from_bits_le(x: [bool; 8]) -> i8;
function i8_from_bits_be(x: [bool; 8]) -> i8;
function i16_from_bits_le(x: [bool; 16]) -> i16;
function i16_from_bits_be(x: [bool; 16]) -> i16;
function i32_from_bits_le(x: [bool; 32]) -> i32;
function i32_from_bits_be(x: [bool; 32]) -> i32;
function i64_from_bits_le(x: [bool; 64]) -> i64;
function i64_from_bits_be(x: [bool; 64]) -> i64;
function i128_from_bits_le(x: [bool; 128]) -> i128;
function i128_from_bits_be(x: [bool; 128]) -> i128;
```
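As a hypothetical usage example, converting to bits and back should be the identity:
```ts
let x: u16 = 300u16;
let bits: [bool; 16] = u16_to_bits_le(x);
console.assert(u16_from_bits_le(bits) == x);
```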
### Conversions between Integers and Bytes
```ts
// unsigned to bytes, little and big endian
function u16_to_bytes_le(x: u16) -> [u8; 2];
function u16_to_bytes_be(x: u16) -> [u8; 2];
function u32_to_bytes_le(x: u32) -> [u8; 4];
function u32_to_bytes_be(x: u32) -> [u8; 4];
function u64_to_bytes_le(x: u64) -> [u8; 8];
function u64_to_bytes_be(x: u64) -> [u8; 8];
function u128_to_bytes_le(x: u128) -> [u8; 16];
function u128_to_bytes_be(x: u128) -> [u8; 16];
// signed to bytes, little and big endian
function i16_to_bytes_le(x: i16) -> [u8; 2];
function i16_to_bytes_be(x: i16) -> [u8; 2];
function i32_to_bytes_le(x: i32) -> [u8; 4];
function i32_to_bytes_be(x: i32) -> [u8; 4];
function i64_to_bytes_le(x: i64) -> [u8; 8];
function i64_to_bytes_be(x: i64) -> [u8; 8];
function i128_to_bytes_le(x: i128) -> [u8; 16];
function i128_to_bytes_be(x: i128) -> [u8; 16];
// unsigned from bytes, little and big endian
function u16_from_bytes_le(x: [u8; 2]) -> u16;
function u16_from_bytes_be(x: [u8; 2]) -> u16;
function u32_from_bytes_le(x: [u8; 4]) -> u32;
function u32_from_bytes_be(x: [u8; 4]) -> u32;
function u64_from_bytes_le(x: [u8; 8]) -> u64;
function u64_from_bytes_be(x: [u8; 8]) -> u64;
function u128_from_bytes_le(x: [u8; 16]) -> u128;
function u128_from_bytes_be(x: [u8; 16]) -> u128;
// signed from bytes, little and big endian
function i16_from_bytes_le(x: [u8; 2]) -> i16;
function i16_from_bytes_be(x: [u8; 2]) -> i16;
function i32_from_bytes_le(x: [u8; 4]) -> i32;
function i32_from_bytes_be(x: [u8; 4]) -> i32;
function i64_from_bytes_le(x: [u8; 8]) -> i64;
function i64_from_bytes_be(x: [u8; 8]) -> i64;
function i128_from_bytes_le(x: [u8; 16]) -> i128;
function i128_from_bytes_be(x: [u8; 16]) -> i128;
```
## Handling of the Native Functions
Given the relatively large number and regular structure of the functions above,
it makes sense to generate them programmatically (e.g. via Rust macros),
rather than enumerating all of them explicitly in the implementation.
It may also make sense, at R1CS generation time,
to use generated or suitably parameterized code to recognize them and turn them into the corresponding gadgets.
## Transition to Methods
Once a separate proposal for adding methods to Leo scalar types is realized,
we may want to turn the global functions listed above into methods,
deprecating the global functions, and eventually eliminating them.
Conversions to bits or bytes will be instance methods of the integer types,
e.g. `u8` will include an instance method `to_bits_le` that takes no arguments and that returns a `[bool; 8]`.
Example:
```ts
let int: u8 = 12;
let bits: [bool; 8] = int.to_bits_le();
console.assert(bits == [false, false, true, true, false, false, false, false]); // 00110000 (little endian)
```
Conversions from bits or bytes will be static methods of the integer types,
e.g. `u8` will include a static method `from_bits_le` that takes a `[bool; 8]` argument and returns a `u8`.
Example:
```ts
let bits: [bool; 8] = [false, false, true, true, false, false, false, false]; // 00110000 (little endian)
let int = u8::from_bits_le(bits);
console.assert(int == 12);
```
# Drawbacks
This does not seem to bring any drawbacks.
# Effect on Ecosystem
None.
# Alternatives
## Pure Leo Implementation
These conversions can be realized in Leo (i.e. without native implementations),
provided that Leo is extended with certain operations that are already separately planned:
* Integer division and remainder, along with type casts, could be used.
* Bitwise shifts and masks, along with type casts, could be used.
However, compiling the Leo code that realizes the conversions may result in less efficient R1CS than the native ones.
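For instance, the bit conversion for `u8` could be written in Leo roughly as follows, assuming the planned bitwise shift and mask operators (a sketch, not valid Leo today):
```ts
// Extract the bits of a u8 in little-endian order using shifts and masks.
function u8_to_bits_le(x: u8) -> [bool; 8] {
    let bits: [bool; 8] = [false; 8];
    for i in 0..8 {
        bits[i] = ((x >> i) & 1u8) == 1u8;
    }
    return bits;
}
```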
## Naming Bit and Byte Types Explicitly
Names like `u8_to_bits_le` and `u32_to_bytes_le` talk about bits and bytes,
therefore relying on a choice of representation for bits and bytes,
which is `bool` for bits and `u8` for bytes as explained above.
An alternative is to have names like `u8_to_bools_le` and `u32_to_u8s_le`,
which explicate the representation of bits and bytes in the name,
and open the door to additional conversions to different representations.
In particular, if and when Leo is extended with a type `u1` for bits,
there could be additional operations like `u8_to_u1s_le`.
This more explicit naming scheme also provides a path towards extending
bit and byte conversions to more generic "word" conversions,
such as `u64_to_u16s_le`, which would turn a `u64` into a `[u16; 4]`.
In general, it makes sense to convert between `uN` or `iN` and `[uM; P]` when `N == M * P`.
If Leo were extended with types `uN` and `iN` for all positive `N` as proposed elsewhere,
there could be a family of all such conversions.
## Methods Directly
Given that we eventually plan to use methods on scalar types for these conversions,
it may make sense to do that right away.
This is predicated on having support for methods on scalar types,
for which a separate RFC is in the works.
If we decide for this approach, we will revise the above proposal to reflect that.
The concepts and (essential) names and input/output types remain unchanged,
but the conversions are packaged in slightly different form.

View File

@ -1,211 +0,0 @@
# Leo RFC 010: Improved Native Functions
## Authors
The Aleo Team.
## Status
DRAFT
# Summary
This RFC proposes an improved approach to handling natively implemented functions ('native functions', for short) in Leo,
that is, functions implemented not via Leo code but (in essence) via Rust code.
Currently there is just one such function, namely BLAKE2s.
The scope of this proposal is limited to native functions defined by the developers of Leo itself,
not by users of Leo (i.e. developers of applications written in Leo).
The approach proposed here is to allow (global and member) Leo functions to have no defining bodies,
in which case they are regarded as natively implemented;
this is only allowed in Leo files that contain standard/core libraries, provided with the Leo toolchain.
Most of the compiler can work essentially in the same way as it does now;
at R1CS generation time, native functions must be recognized, and turned into the known gadgets that implement them.
# Motivation
Many languages support native functions (here we generically use 'functions' to also denote methods),
where 'native' refers to the fact that the functions are implemented not in the language under consideration,
but rather in the language used to implement the language under consideration.
For instance, Java supports native methods that are implemented in C rather than Java.
There are two main reasons for native functions in programming languages:
1. The functionality cannot be expressed at all in the language under consideration,
e.g. Java has no constructs to print on screen, making a native implementation necessary.
2. The functionality can be realized more efficiently in the native language.
The first reason above may not apply to Leo, at least currently,
as Leo's intended use is mainly for "pure computations" rather than interactions with the external world.
However, we note that console statements could be regarded as native functions (as opposed to "ad hoc" statements),
and this may be in fact the path to pursue if we extend the scope of console features (e.g. even to full GUIs),
as has been recently proposed (we emphasize that the console code is not meant to be compiled to R1CS).
The second reason above applies to Leo right now.
While there is currently just one native function supported in Leo, namely BLAKE2s,
it is conceivable that there may be other cryptographic (or non-cryptographic) functions
for which hand-crafted R1CS gadgets are available
that are more efficient than what the Leo compiler would generate if their functionality were written in Leo.
While we will continue working towards making the Leo compiler better,
and perhaps eventually capable to generate R1CS whose efficiency is competitive with hand-crafted gadgets,
this will take time, and in the meanwhile new and more native functions may be added,
resulting in a sort of arms race.
In other words, it is conceivable that Leo will need to support native functions in the foreseeable future.
Languages typically come with standard/core libraries that application developers can readily use.
Even though the Leo standard/core libraries are currently limited (perhaps just to BLAKE2s),
it seems likely that we will want to provide more extensive standard/core libraries,
not just for cryptographic functions, but also for data structures and operations on them.
The just mentioned use case of data structures brings about an important point.
Leo circuit types are reasonable ways to provide library data structures,
as they support static and instance member functions that realize operations on those data structures.
Just like some Java library classes provide a mix of native and non-native methods,
we could imagine certain Leo library circuit types providing a mix of native and non-native member functions, e.g.:
```ts
circuit Point2D {
    x: u32;
    y: u32;
    function origin() -> Point2D { ... } // probably non-native
    function move(mut self, delta_x: u32, delta_y: u32) { ... } // probably non-native
    function distance(self, other: Point2D); // maybe native (involves square root)
}
```
Our initial motivation for native functions is limited to Leo standard/core libraries,
not to user-defined libraries or applications.
That is, only the developers of the Leo language will be able to create native functions.
Leo users, i.e. developers of Leo applications, will be able to use the provided native functions,
but not to create their own.
If support for user-defined native functions may become desired in the future, it will be discussed in a separate RFC.
# Design
## Background
### Current Approach to Native Functions
The BLAKE2s native function is currently implemented as follows (as a high-level view):
1. Prior to type checking/inference, its declaration (without a defining body)
is programmatically added to the program under consideration.
This way, the input and output types of the BLAKE2s function can be used to type-check code that calls it.
2. At R1CS generation time, when the BLAKE2s function is compiled, it is recognized as native and,
instead of translating its body to R1CS (which is not possible as the function has no Leo body),
a known BLAKE2s gadget is used.
This approach is fine for a single native function, but may not be the best for a richer collection of native functions.
In particular, consider the `Point2D` example above, which has a mix of native and non-native functions:
presumably, we would like to write at least the non-native functions of `Point2D` directly in a Leo file,
as opposed to programmatically generating them prior to type checking/inference.
### Multi-File Compilation
Leo already supports the compilation of multiple files that form a program, via packages and importing.
This capability is independent from native functions.
We note that, for the purpose of type checking code that calls a function `f`,
the defining body of `f` is irrelevant: only the input and output types of `f` are relevant.
The defining body is of course type-checked when `f` itself is type-checked,
and furthermore the defining body is obviously needed to generate R1CS,
but the point here is that only the input and output types of `f` are needed to type-check code that calls `f`.
In particular, this means that, if a Leo file imports a package,
only the type information from the package is needed to type-check the file that imports the package.
Conceptually, each package exports a symbol table, used (and sufficient) to type-check files that import that package.
## Proposal
We propose to:
1. Allow declarations of (global and member) functions to have no defining body, signaling that the function is native.
2. Extend the AST and ASG to allow functions to have no bodies.
3. Have the compiler allow empty function bodies only in standard/core library files, which should be known.
4. Have type checking/inference "skip over" absent function bodies.
5. At R1CS generation time, when a function without body is encountered, find and use the known gadget for it.
Currently the ABNF grammar requires function declarations to have a defining body (a block), i.e. to be implemented in Leo:
```
function-declaration = *annotation %s"function" identifier
                       "(" [ function-parameters ] ")" [ "->" type ]
                       block
```
We propose to relax the rule as follows:
```
function-declaration = *annotation %s"function" identifier
                       "(" [ function-parameters ] ")" [ "->" type ]
                       ( block / ";" )
```
This allows a function declaration to have a terminating semicolon instead of a block.
Since we do not have anything like abstract methods in Leo, this is a workable way to indicate native functions.
However, it is easy, if desired, to have a more prominent indication, such as a `native` keyword or an annotation.
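Under the relaxed rule, a standard/core library file could contain a bodiless declaration such as the following (the name and types are only illustrative, not a finalized signature):
```ts
// Hypothetical stdlib declaration: no body, so the compiler treats the function
// as native and maps it to a known gadget at R1CS generation time.
function blake2s(seed: [u8; 32], message: [u8; 32]) -> [u8; 32];
```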
It may be necessary to extend the AST and ASG to accommodate function bodies to be optional,
although this may be already the case for representing BLAKE2s in its current form described above.
The compiler should know which files are part of the Leo standard/core libraries and which ones are not.
Functions without bodies will be only allowed to appear in those files.
It will be an error if any other file (e.g. user-defined) contains functions without bodies.
Type checking/inference may be where we make this check, or perhaps in some other phase.
Because of the already existing support for multi-file compilation described above,
no essential change is needed in the compiler's type checking/inference.
We just need to make sure that functions without bodies are expected and properly handled
(i.e. their input and output types must be checked and added to the proper symbol tables,
but their absent bodies must be skipped);
this may already be the case, for the treatment of BLAKE2s described above.
The main change is in R1CS generation.
Normally, when a function definition is encountered, its Leo body is translated to R1CS.
For a native function, we need to find and use a known gadget instead.
The compiler must know a mapping from native functions in the standard/core libraries
to the R1CS gadgets that implement them, so it should be just a matter of selecting the appropriate one.
Some of this logic must be already present, in order to detect and select the BLAKE2s gadget.
This approach is used in Java, where Java files may declare certain methods as `native`,
without a body but with a declaration of input and output types.
The actual native implementations, i.e. the native method bodies, live in different files, as they are written in C.
# Drawbacks
This does not seem to bring any drawbacks.
A capability for native functions (for BLAKE2s) already exists;
this RFC proposes a way to make it more flexible,
with mild (and likely simplifying) changes to the compiler.
# Effect on Ecosystem
This should help support richer standard/core libraries for Leo.
# Alternatives
## Programmatic Generation
Instead of storing declarations of native functions in standard/core files as proposed above,
we could programmatically generate them as currently done for BLAKE2s.
Macros may be used to generate families of similar function declarations.
However, consider `Point2D` above, which has a mix of native and non-native functions.
One approach is to programmatically generate the whole `Point2D` declaration,
with both native and non-native functions.
But it seems that a Leo file would be clearer and more maintainable than a Rust string in the compiler.
We could think of splitting the non-native and native functions of `Point2D`:
the former in a Leo file, and the latter programmatically added.
Again, this looks more complicated than just declaring native functions in Leo files.
## Leo Code in Rust Files
It has been pointed out that it would be beneficial to have
both the Leo code (for the non-native functions)
and the Rust code (for the native functions)
in the same place (i.e. file).
This is not possible if the non-native code is in a Leo file, because Leo files cannot contain Rust code
(and there is no plan to allow that, i.e. no inline Rust code).
However, we can turn things around and leverage Rust's macro system to accommodate Leo code in Rust files.
That is, we can have Rust files that include both the non-native Leo code,
written as Leo code (with some surrounding macro call or something like that),
along with the Rust code that implements the native functions.
This may turn out to be in fact the preferred design in the end,
as it combines the advantage of writing non-native code in Leo
with the advantage of having native and non-native code in the same place.
In that case, we will revise this RFC to swap this design proposal with the one in the main section,
moving the proposal for Leo files to this section as an alternative.

View File

@ -1,94 +0,0 @@
# Leo RFC 011: Scalar Type Accesses And Methods
## Authors
The Aleo Team.
## Status
FINAL
## Summary
This RFC proposes three things:
1. The scalar types in Leo (integers, fields, etc.) can have static methods.
2. The scalar types in Leo (integers, fields, etc.) can have static constants.
3. Expressions of scalar type can have instance methods called on them.
## Motivation
This approach allows for a clean interface to provide built-in methods or static members for these basic types.
## Design
### Semantics
First, we would have to modify both the ABNF and the parsing of Leo to allow static method calls on a scalar type.
The ABNF would look as follows:
```abnf
; This is an existing old rule.
scalar-type = boolean-type / arithmetic-type / address-type / character-type

; Add this rule.
named-type = identifier / self-type / scalar-type

; Modify this rule.
postfix-expression = primary-expression
                   / postfix-expression "." natural
                   / postfix-expression "." identifier
                   / identifier function-arguments
                   / postfix-expression "." identifier function-arguments
                   / named-type "::" identifier function-arguments ; this used to be identifier-or-self-type
                   / named-type "::" identifier ; this is new to allow member constants
                   / postfix-expression "[" expression "]"
                   / postfix-expression "[" [expression] ".." [expression] "]"

; Also need to add a new static member variable declaration rule to allow for static constant members.
member-constant-declaration = %s"static" %s"const" identifier ":" type "=" literal ";"

; We then need to modify the circuit declaration rule.
circuit-declaration = %s"circuit" identifier
                      "{" *member-constant-declaration
                      [ member-variable-declarations ]
                      *member-function-declaration "}"
```
Now methods and static members would be first-class citizens of scalar types and their values. For example, the following could be done:
```ts
let x = 1u8.to_bits(); // A method call on a scalar value itself
let x = u8::MAX; // A constant value on the scalar type
let y = u8::to_bits(1u8); // A static method on the scalar type
```
It also allows for static constants for circuits in general:
```ts
circuit Point {
    static SLOPE: u32 = 3;
    x: u32;
    y: u32;

    function new(x: u32, y: u32) -> Self {
        return Self {
            x,
            y: y + Self::SLOPE
        };
    }
}
```
## Drawbacks
This change adds more complexity to the language.
## Effect on Ecosystem
None. The new parsing changes would not break any older programs.
## Alternatives
None.

View File

@ -1,367 +0,0 @@
# Leo RFC 012: Improved Record and Transaction Model
## Authors
The Aleo Team.
## Status
DRAFT
## Summary
This RFC describes an improved model for how Leo programs interact with the Aleo blockchain.
The description is oriented to the Leo developer:
it does not describe the zero-knowledge details,
as the whole purpose of Leo is to enable developers
to write applications with only a very high-level understanding of zero-knowledge.
## Motivation
While Leo can be described as a regular programming language
(albeit with certain non-standard restrictions motivated by its compilation to zero-knowledge circuits),
its purpose is to build applications for the Aleo blockchain.
It is thus important to describe precisely how Leo programs operate in the Aleo blockchain.
## Background
### Zexe
The Aleo blockchain follows the Zexe model, with some variations.
It is thus useful to briefly review some aspects of Zexe first.
In Zexe, there are _records_ that contain application-specific data,
and _transactions_ that consume _n_ old records and produce _m_ new records.
The computation of the new records from the old records
is arbitrary and unknown to the blockchain;
the blockchain only enforces that the old records satisfy known _death predicates_
and that the new records satisfy known _birth predicates_.
See the [Zexe paper](https://eprint.iacr.org/2018/962.pdf) for details.
### Aleo Blockchain
In the Aleo blockchain, a transaction always consumes 2 old records and produces 2 new records.
That is, _n_ = 2 and _m_ = 2 with respect to the Zexe model.
Other choices are possible, and may be supported in the future;
the current choice of 2 old and 2 new records is motivated by
being the minimum to represent certain computations of interest, such as token exchanges,
which may involve records owned by different parties
(and therefore need to consume more than one record, since each record has a unique owner).
One or both of the old records may be dummy,
if only one old actual record is desired,
or if new records are created "from nothing".
One or both of the new records may be dummy,
if only one new actual record is desired,
or if old records just have to be consumed.
Aleo records and transactions have a well-defined structure.
They are ordered collections of typed slots.
Of particular interest is the _payload_ slot,
which contains a fixed number of bytes (currently 128)
to store application-specific data.
(Note that the developer documentation is out of date at the time of this writing.)
In the Aleo blockchain, unlike Zexe, there is no separation among
computation of new records from old records, death predicates, and birth predicates.
Instead, a Leo program plays the role of all three, as described below.
### Current Leo Program Execution Model
A Leo program is a collection of files,
with `file` as defined in the ABNF grammar,
i.e. as a sequence of declarations.
A Leo program has one main file,
which may contain import declarations,
which resolve to other files,
which may in turn contain import declarations,
and so on until a closed set of files is obtained:
that (linked) set of files is the _program_.
In order to be used in the Aleo blockchain,
a Leo program must include a function called `main`, in its aforementioned main file.
The processing of a transaction corresponds to an invocation of this `main` function.
The word 'corresponds' in the preceding sentence is important:
unlike other blockchains like Ethereum,
the processing of the transaction does not involve executing the Leo code;
rather, it involves checking a zero-knowledge proof
of the execution of the Leo program,
which was prepared when the Leo program was compiled.
This is what 'corresponds' means, in that sentence.
However, for the high-level purpose of this RFC, these are zero-knowledge details.
In general, the `main` function takes some `const` and some non-`const` inputs (declared as parameters),
and returns an output (declared as a return type), which may be a tuple to represent "multiple" outputs.
The `const` inputs are compiled into the zero-knowledge circuit,
so they can be ignored for our purpose here,
leaving only the non-`const` inputs and the output for consideration.
The execution of `main` can be described as a mathematical function
```
main : Record x Record x Inputs -> Record x Record x Output
```
where `x` is cartesian product,
`Record` is the set of possible records,
`Inputs` is the set of possible inputs to `main`, and
`Output` is the set of possible outputs from `main`.
(These sets can be thought of as "types", but mathematically we talk about sets.)
That is, this mathematical function
takes three inputs (the two old records and the `main` inputs)
and returns three outputs (the two new records and the `main` output).
While `Record` is fixed, i.e. it is the same for all Leo programs,
both `Inputs` and `Output` differ based on the Leo input and output types of `main`.
In the Leo code, in `main` or in functions called by `main`,
the values in `Inputs` are accessed via the `main` parameters,
while the old records are accessed via the special `input` variable,
which provides access to the two old records and their slots,
including the payloads that contain application-specific data.
The picture for new records and values in `Output` is less clear from the documentation:
experimentation suggests that the new records are obtained
by serializing the output value in `Output` (which, recall, may be a tuple).
It is important to note that the values in `Inputs` do not come from the two old records.
Rather, they are private inputs, supplied by the developer
when they compile the Leo program and generate the zero-knowledge proof.
Indeed, as mentioned above, the processing of the transaction in the blockchain
does not execute the Leo code, and thus does not need to know the values in `Inputs`.
Rather, the blockchain has to verify a zero-knowledge proof asserting that
there exist values in `Inputs`, known to the creator of the transaction,
such that the execution of the Leo program's `main`
on those values and on the old records
yields the new records, along with some value in `Output`;
this is, roughly speaking, the assertion proved in zero-knowledge.
### Input and Output Files
Currently the compilation of a Leo program involves:
1. A `.in` file, containing `const` and non-`const` inputs.
2. A `.state` file, containing transaction data.
3. A `.out` file, containing results.
The compilation takes the first two files as inputs and returns the third file as output.
## Design
### Multiple Entry Points
We propose to generalize from one entry point (i.e. the `main` function) to multiple entry points,
in line with the smart contract paradigm.
Instead of implicitly designating `main` as the only entry point,
we need a mechanism to explicitly designate one or more Leo functions as entry points.
A simple approach could be to use an annotation like `@entrypoint` to designate _entry point functions_:
```
@entrypoint
function mint(...) -> ... { ... }
@entrypoint
function transfer(...) -> ... { ... }
```
This has a precedent, in the use of `@test` to designate Leo test functions that are not compiled to circuits.
Another approach is to use a keyword, e.g.
```
entrypoint function mint(...) -> ... { ... }
entrypoint function transfer(...) -> ... { ... }
```
Yet another approach is to group entrypoint functions inside a new block construct, e.g.
```
entrypoint {
    function mint(...) -> ... { ... }
    function transfer(...) -> ... { ... }
}
```
In the rest of this design section we assume the annotation approach (i.e. `@entrypoint`) for concreteness,
but that can be replaced as soon as we converge on a choice.
### Types for Transaction Inputs and Outputs
We propose to add types for transaction inputs and outputs to the Leo standard library,
and possibly include them in the prelude that is implicitly imported by every Leo program.
Given that records have a fixed structure with typed slots,
their format could be described by a Leo circuit type, e.g. called `Record`,
whose member variables correspond to the slots.
The types of the slots are fairly low-level,
i.e. byte arrays (e.g. `[u8; 128]` for the payload)
and unsigned integers (e.g. `u64` for the balance),
because they must have a clear correspondence with the serialized form of records.
This means that the Leo code may have to do
its own deserialization of the payload bytes into higher-level Leo values;
standard serialization/deserialization libraries for Leo types may be provided for this,
as an independent and more generally useful feature.
Given that a transaction input consists of two records and possibly additional information,
it makes sense to also have a circuit type `TransactionInput`,
which includes two `Record` slots and possibly additional slots.
Additionally, it makes sense to have a circuit type `TransactionOutput`
that describes the output data of a transaction that is produced by the Leo program.
This could also include two `Record` slots for the new records,
or possibly "subsets" of records if the values of some record slots are calculated
not by the Leo program but instead by the Leo CLI (i.e. build process).
All these types should be documented, as part of the standard library.
We will need to flesh out their exact definition,
but we note that this is fairly easy to change when it is in the standard library.
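A rough sketch of what these standard library types could look like follows; all member names and sizes below are placeholders for illustration, not a finalized layout:
```ts
// Hypothetical shapes of the standard library types.
circuit Record {
    owner: address;
    balance: u64;
    payload: [u8; 128];
}

circuit TransactionInput {
    record0: Record;
    record1: Record;
}

circuit TransactionOutput {
    record0: Record;
    record1: Record;
}
```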
### Entry Point Input and Output Types
We propose that each entry point function of a Leo program
explicitly produce transaction outputs from transaction inputs
by taking a `TransactionInput` input and returning a `TransactionOutput` output:
```
@entrypoint
function ...(input: TransactionInput, ...) -> TransactionOutput { ... }
```
This way, the calculation of transaction outputs from transaction inputs is made functional and explicit.
As special cases (both of which may apply to the same entry point):
1. We could allow the `TransactionInput` input to be absent,
when an entry point does not need to access the transaction input data,
e.g. when producing new records without consuming old records.
2. We could allow the function output to be `()` instead of `TransactionOutput`,
when an entry point does not need to produce transaction outputs,
e.g. when consuming old records without producing new records.
Compared to the current Leo program execution model (described earlier in the background section),
`input` is made an explicit input here, instead of being like a built-in global variable.
Furthermore, the output type is restricted to be `TransactionOutput` (or `()`),
thus eliminating the implicit serialization and the asymmetry with the treatment of transaction inputs.
There is still no restriction on the non-`TransactionInput` inputs of an entry point function;
as noted earlier, they are existentially quantified in the zero-knowledge assertion.
Thus, a Leo program entry point can be now described as a mathematical function
```
entrypoint : Record x Record x Inputs -> Record x Record
```
where `Output` is no longer present.
(If `TransactionInput` includes additional data, besides the two old records, that may affect the transaction output,
then we would need to add that to this mathematical model;
however, the model above is sufficiently accurate for the current discussion.)
We may require the `TransactionInput` input of an entry point function, if present,
to be the first input of the function, for clarity and readability.
A question is whether we should extend that requirement to non-entry-point functions
that may be passed `TransactionInput` values.
We note that none of these restrictions are necessary, though.
A necessary restriction is that each entry point function takes at most one `TransactionInput` input.
We may require the `TransactionInput` input of an entry point function, if present,
to be called `input`, or some other predefined name.
However, this is not a necessary restriction, and we may decide to demote that to a convention rather than a requirement.
(Currently `input` is a keyword and its own kind of Leo expression, which slightly complicates the language.)
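Putting the pieces together, and reusing the hypothetical `Record`, `TransactionInput`, and `TransactionOutput` shapes sketched above, a token-transfer entry point under this model might look roughly as follows (the member accesses and the balance arithmetic are assumptions; only the overall shape of the function matters here):
```ts
@entrypoint
function transfer(input: TransactionInput, amount: u64) -> TransactionOutput {
    // Read the two old records from the explicit transaction input.
    let sender: Record = input.record0;
    let receiver: Record = input.record1;
    // Compute the two new records (ownership and balance checks omitted in this sketch).
    let new_sender: Record = Record {
        owner: sender.owner,
        balance: sender.balance - amount,
        payload: sender.payload
    };
    let new_receiver: Record = Record {
        owner: receiver.owner,
        balance: receiver.balance + amount,
        payload: receiver.payload
    };
    return TransactionOutput { record0: new_sender, record1: new_receiver };
}
```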
### Access to Transaction Input and Output Types
Currently the member variables of Leo circuit types are always accessible for both reading and writing.
It is thus possible for a Leo program
to read from the member variables of `TransactionInput`
and to write to the member variables of `TransactionOutput`.
Therefore, for an initial implementation,
it suffices for these two circuit types to provide member variables for the needed slots.
We might want the member variables of `TransactionInput` to be read-only.
This is not necessary for the transaction model to work:
so long as `TransactionInput` is properly initialized before calling the entry point,
and that after the call the resulting `TransactionOutput` is used to create the transaction,
there is no harm in the Leo program modifying the copy of `TransactionInput` passed to it.
Nonetheless, we may want to enforce this restriction to encourage good coding practices
(unless we find a use case to the contrary).
There is currently no mechanism in Leo to enforce that.
Designating the transaction input as `const` is not right,
as that designation normally means that the value is compiled into the circuit.
We could provide read-only access via member functions (e.g. `payload()`, `balance()`),
but we still have to prohibit assignments to member variables (which are currently allowed on any circuit type).
As an orthogonal and more generally useful feature,
we could consider adding public/private access designations to Leo circuit members.
Another approach is to avoid exposing the member variables,
and just make the member functions available via an implicit import declaration.
All of this needs to be thought through more carefully, in the broader context of the Leo language design.
If `TransactionInput` has member functions, it may also be useful for `TransactionOutput` to have member functions,
presumably to create new instances and to set values of member variables.
A somewhat related consideration is whether it should be allowed
to make copies of the `TransactionInput` value passed to an entry point function.
There is no harm in doing that: the model still works, as explained above.
(Since a `TransactionInput` is a relatively large structure,
copying it may create a relatively large number of R1CS constraints,
but that may happen with user-defined types too, and is a separable problem.)
Nonetheless, we may want to enforce a discipline of single-threadedness,
which could also allow us to treat transaction input as an immutable reference behind the scenes,
thus reducing the number of R1CS constraints.
Analogous considerations apply to `TransactionOutput`,
namely whether it should be treated in a single-threaded way,
i.e. effectively as a built-in global variable,
which could enable compiler optimizations.
### Input and Output Files
According to the new model proposed above, we should have just two files involved in the Leo compilation process:
1. A `.in` file, from which the `TransactionInput` value is produced.
2. A `.out` file, produced from the `TransactionOutput` returned by the program.
There seems to be no longer a need for a `.state` file and for explicit registers.
## Alternatives
The 'Design' section above still outlines certain alternatives to consider.
Once we make some specific choices, we can move the other options to this section.
### Built-in Global Variable for Transaction Input
Instead of having explicit `TransactionInput` inputs in entry point functions,
we could maintain the current approach of viewing `input` as a built-in global variable, of type `TransactionInput`.
Everything else would be the same, except that `input` would be implicitly available.
An advantage is that single-threadedness would be immediately guaranteed,
if we wanted to enforce that as discussed above.
On the other hand, explicating transaction inputs as entry point inputs makes the code more functional
and simplifies certain aspects of the Leo compiler.
### Built-in Global Variable for Transaction Output
Instead of having explicit `TransactionOutput` outputs in entry point functions,
we could introduce a built-in `output` global variable, of type `TransactionOutput`.
This has similar advantages and disadvantages to the ones discussed above for `input` as a built-in global variable.
In any case, we may want this `output` global variable alternative
to go hand-in-hand with the `input` global variable alternative.
That is, we either adopt both or none.
The current treatment in Leo is asymmetric in this respect.
### Implicit Serialization of Output Values
Instead of having an explicit `TransactionOutput` type
whose values describe exactly the output data for a transaction,
we could keep something like the current model,
in which an entry point function may return values of arbitrary types,
which are implicitly serialized into output records.
This may be a bit simpler for the beginning developer,
but it also introduces less control on the output data.
Furthermore, given that records have payloads of limited size,
it is not difficult to write a program that attempts to produce too much data.
In any case, if we were to go this route, there would be an asymmetry with the treatment of transaction inputs,
unless we also allow `input` (by this we mean the value passed to an entry point function with the transaction inputs)
to consist of arbitrary Leo types (subject to serialization size limitations).
Note that this requires the Leo type of `input` to potentially vary across different programs
(which appears to be the case in current Leo),
which is more complicated than having some fixed types in the standard library.
All in all, it seems that having `TransactionInput` and `TransactionOutput` types provides more explicit control.
Furthermore, in the future the Leo standard library could provide serialization and deserialization tools
that will make it easy to map between record slots and higher-level Leo types.

View File

@ -1,119 +0,0 @@
# Leo RFC 013: Constant Functions
## Author(s)
The Aleo Team.
## Status
IMPLEMENTED
## Summary
This RFC proposes the addition of an optional `const` modifier to function declarations
to explicitly designate functions that return constant values that can be calculated at compile time.
## Motivation
Explicitly designating constant functions makes user intention explicit
and simplifies the Leo compiler's checks, as explained below.
## Background
Leo code is partially evaluated on its `const` inputs prior to being translated to R1CS.
A function that returns a value that only depends on `const` inputs directly or indirectly,
can be partially evaluated away, without having to be inlined during flattening.
## Design
### Syntax
The ABNF grammar is extended by adding an optional `const` modifier to function declarations:
```
function-declaration = *annotation [ %s"const" ] %s"function" identifier
                       "(" [ function-parameters ] ")" [ "->" type ]
                       block
```
This applies to both top-level and member functions.
### Static Semantics
A `const` function declaration must satisfy the following conditions:
* All its parameters are `const`, including the `self` parameter for instance circuit member functions.
* The body does not reference the special `input` variable.
* The body only calls other `const` functions.
### Dynamic Semantics
This has no impact on the dynamic semantics of Leo, viewed as a traditional programming language.
### Flattening
Given that `const` expressions are evaluated completely during flattening,
the values of the arguments of a `const` function call are known during flattening,
and therefore the function call can be completely evaluated as well.
If the function is recursive (individually or with others),
the evaluation involves the bounded recursion analysis described in a separate RFC.
### Implementation Considerations
ASTs for function declarations are extended with a boolean flag `const_`.
If a `const` function has a non-`const` parameter,
an ASG error occurs.
If the body of a `const` function references the `input` variable or calls a non-`const` function,
an ASG error occurs.
The description of static semantics, dynamic semantics, and flattening given above
are expressed in terms of Leo, because that is the user's view of the language.
In the implementation, flattening occurs after the Leo code is translated to the IR.
### Examples
```ts
const function len(const arr: [u8; _]) -> u32 {
    return arr.len();
}

circuit Sample {
    x: [char; 5]

    const function say_hi(const self) -> [char; 5] {
        return self.x;
    }
}
```
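A hypothetical call site with all-constant arguments could then be evaluated away completely during flattening:
```ts
function main() -> u32 {
    const arr: [u8; 3] = [1u8, 2u8, 3u8];
    return len(arr); // evaluated to 3 during flattening
}
```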
## Drawbacks
This extension does not appear to bring any drawbacks.
## Effect on Ecosystem
None.
## Alternatives
### No Constant Designation
Without an explicit designation of constant functions,
the Leo compiler needs to perform an inter-procedural analysis:
if `f` calls `g`, then in order for `f` to be constant, `g` must also be constant.
In other words, the call graph must be taken into account.
In contrast, with the `const` designation,
an intra-procedural analysis suffices,
as discussed in the static semantics section above.
## Future Extensions
In other languages like Rust, `const` functions are not required to have all constant parameters.
They are just required to return constant results for constant arguments,
i.e. they must not access global variables and they must only call other `const` functions.
In other words, these `const` functions are polymorphic over "constancy".
This could be also realized in Leo, because type inference/checking determines `const` and non-`const` expressions.
This tells the compiler which function calls have all `const` arguments and which ones do not.
Therefore, the compiler can selectively evaluate, during flattening, only the calls of `const` functions on `const` arguments.

View File

@ -1,6 +1,3 @@
# Overview
This directory contains the source code for the Leo command line interface.
For a comprehensive list of commands and usage, see the [developer documentation](https://developer.aleo.org/developer/cli/overview).
The `leo-package` module is also included in this directory for convenience.

View File

@ -3,28 +3,3 @@
[![Crates.io](https://img.shields.io/crates/v/leo-package.svg?color=neon)](https://crates.io/crates/leo-package)
[![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS)
[![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md)
## Description
This module defines the structure of a Leo project package and describes the behavior of package internals, such
as the Leo manifest (Leo.toml), the lock file (Leo.lock), source files, and imports.
It is mainly used by the Leo binary.
## Structure
Each directory in this crate mirrors a corresponding file generated in a new Leo project package:
```bash
package/src
├── errors # crate level error definitions
├── imports # program imports management
├── inputs # program inputs directory
├── outputs # program outputs directory
├── root # program root: Leo.toml, Leo.lock
└── source # source files directory
```
## Testing
The package includes functional tests in the `tests/` directory.