diff --git a/FAQs.md b/FAQs.md deleted file mode 100644 index b889f2052d..0000000000 --- a/FAQs.md +++ /dev/null @@ -1,26 +0,0 @@ -# FAQs - -#### For some given code, changing the value in a constant variable changes the number of constraints in the generated circuit. Is this behavior correct? - -**Yes**, take the integers as an example. In Leo, integers are represented as its binary decomposition, -with each bit occupying one field element (that takes on 0 or 1). Then, for an expression such as `a == 4u32`, the operation to evaluate equality -would comprise a linear pass of bitwise `AND` operations, comparing every bit in the **variable** value with each bit in the **constant** value. - -As the constant value is already known to the compiler during circuit synthesis, the compiler is already able to complete part of the equality evaluation, -by assuming that any bit in the constant value that is `0` will clearly evaluate to `0`. As such, depending on the value of the constant integer in your code, -the total number of constraints in the generate circuit can vary. - -To illustrate this, here are two examples to show the difference: -``` -constant = 00000001 -variable = abcdefgh ---------------------------------- -output = 0000000h (1 constraint) -``` - -``` -constant = 01110001 -variable = abcdefgh ---------------------------------- -output = 0bcd000h (4 constraints) -``` diff --git a/README.md b/README.md index ef46397616..86f926dc6f 100644 --- a/README.md +++ b/README.md @@ -35,9 +35,8 @@ integrate private applications into your stack. Leo compiles to circuits making The syntax of Leo is influenced by traditional programming languages like JavaScript, Scala, and Rust, with a strong emphasis on readability and ease-of-use. Leo offers developers with tools to sanity check circuits including unit tests, integration tests, and console functions. -Leo is one part of a greater ecosystem for building private applications on [Aleo](https://aleo.org/). If your goal is to build a user experience -on the web that is both truly personal and truly private, then we recommend downloading the [Aleo Studio IDE](https://aleo.studio/) -and checking out the [Aleo Package Manager](https://aleo.pm/). +Leo is one part of a greater ecosystem for building private applications on [Aleo](https://aleo.org/). +The language is currently in an alpha stage and is subject to breaking changes. ## 2. Build Guide @@ -58,22 +57,9 @@ We recommend installing Rust using [rustup](https://www.rustup.rs/). You can ins Download the [Windows 32-bit executable](https://win.rustup.rs/i686) and follow the on-screen instructions. -### 2.2a Build from Crates.io +### 2.2 Build from Source Code -We recommend installing Leo this way. In your terminal, run: - -```bash -cargo install leo-lang -``` - -Now to use Leo, in your terminal, run: -```bash -leo -``` - -### 2.2b Build from Source Code - -Alternatively, you can install Leo by building from the source code as follows: +We recommend installing Leo by building from the source code as follows: ```bash # Download the source code @@ -95,8 +81,8 @@ Use the Leo CLI to create a new project ```bash # create a new `hello-world` Leo project -leo new hello-world -cd hello-world +leo new helloworld +cd helloworld # build & setup & prove & verify leo run @@ -104,7 +90,7 @@ leo run The `leo new` command creates a new Leo project with a given name. -The `leo run` command will compile the main program, generate keys for a trusted setup, fetch inputs, generate a proof and verify it. +The `leo run` command will compile the program into Aleo instructions and run it. Congratulations! You've just run your first Leo program. @@ -112,8 +98,7 @@ Congratulations! You've just run your first Leo program. * [Hello World - Next Steps](https://developer.aleo.org/developer/getting_started/hello_world) * [Leo Language Documentation](https://developer.aleo.org/developer/language/layout) -* [Leo ABNF Grammar](./grammar/README.md) -* [Leo CLI Documentation](https://developer.aleo.org/developer/cli/new) +* [Leo ABNF Grammar](./docs/grammar/abnf-grammar.txt) * [Homepage](https://developer.aleo.org/developer/getting_started/overview) ## 5. Contributing diff --git a/SECURITY.md b/SECURITY.md index 4ea72aac7f..5af1cb5897 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -2,16 +2,8 @@ The following describes our procedure for addressing major and minor security concerns in Leo. -## Testnet I - -As Aleo is currently in the prototype stage and does not operate a platform intended for production use, -our security procedures are designed to promote public disclosure and quick security resolution. - -In preparation for the production stage, we will release new security guidelines and -issue new procedures for addressing the disclosure of sensitive security vulnerabilities. - ### Reporting a Bug -During Testnet I, all software bugs should be reported by filing a Github issue. +During Testnet 3, all software bugs should be reported by filing a Github issue. If you are unsure and would like to reach out to us directly, please email security \_at\_ aleo.org to elaborate on the issue. diff --git a/compiler/ast/README.md b/compiler/ast/README.md index 85615b9c8b..b23ea6ed90 100644 --- a/compiler/ast/README.md +++ b/compiler/ast/README.md @@ -4,349 +4,3 @@ [![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS) [![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md) -This directory contains the code for the AST of a Leo Program. - -## Node Types - -There are several types of nodes in the AST that then have further breakdowns. - -All nodes store a Span, which is useful for tracking the lines and -columns of where the node was taken from in the Leo Program. - -### [Program/File](./src/program.rs) - -The top level nodes in a Leo Program. - -#### [Imports](./src/imports/import.rs) - -Represents an import statement in a Leo Program. -A list of these are stored on the Program. -It stores the path to an import and what is being imported. - -**NOTE**: The import does not contain the source code of the imported Leo Program. - -#### [Circuits](./src/circuits/circuit.rs) - -A circuit node represents a defined Circuit in a Leo Program. -An order-preserving map of these are stored on the Program. -Contains the Circuit's name, as well as its members. -The members are a function, or a variable, or a constant. -For all of them the Circuit preserves their names. - -#### [Annotations](./src/functions/annotation.rs) - -An annotation node is a decorator that can be applied to a function. -Stored on the function themselves despite being a top-level node. -The node stores the name of the annotation, as well as any args passed to it. - -#### [Functions](./src/functions/mod.rs) - -A function node represents a defined function in a Leo Program. -An order-preserving map of these are stored on the Program. -A function node stores the following information: - -- The annotations applied to the function. -- An identifier the name of the function. -- The inputs to the function, both their names and types. -- The output of the function as a type if it exists. -- The function body stored as a block statement. - -#### [Global Consts](./src/program.rs) - -A global const is a bit special and has no special node for itself, but rather is a definition statement. -An order-preserving map of these are stored on the Program. - -### [Types](./src/types/type_.rs) - -The different types in a Leo Program. -Types themselves are not a node, but rather just information to be stored on a node. - -#### Address - -The address type follows the [BIP_0173](https://en.bitcoin.it/wiki/BIP_0173) format starting with `aleo1`. - -#### Boolean - -The boolean type consists of two values **true** and **false**. - -#### Char - -The char type resents a character from the inclusive range [0, 10FFFF]. - -#### Field - -The field type an unsigned number less than the modulus of the field. - -#### Group - -The group type a set of affine points on the elliptic curve passed. - -#### [IntegerType](./src/types/integer_type.rs) - -The integer type represents a range of integer types. - -##### U8 - -A integer in the inclusive range [0, 255]. - -##### U16 - -A integer in the inclusive range [0, 65535]. - -##### U32 - -A integer in the inclusive range [0, 4294967295]. - -##### U64 - -A integer in the inclusive range [0, 18446744073709551615]. - -##### U128 - -A integer in the inclusive range [0, 340282366920938463463374607431768211455]. - -##### I8 - -A integer in the inclusive range [-128, 127]. - -##### I16 - -A integer in the inclusive range [-32768, 32767]. - -##### I32 - -A integer in the inclusive range [-2147483648, 2147483647]. - -##### I64 - -A integer in the inclusive range [-9223372036854775808, 9223372036854775807]. - -##### I128 - -A integer in the inclusive range [-170141183460469231731687303715884105728, 170141183460469231731687303715884105727]. - -#### Array - -The array type contains another type, -then the number of elements of that type greater than 0 -for monodimensional arrays, -or a list of such numbers of elements -for multidimensional arrays. - -#### Tuple - -The tuple type contains n types, where n is greater than or equal to 0. - -#### Identifier - -An identifier type is either a circuit type or a type alias; -every circuit type represents a different type. - -#### SelfType - -The self type represented by `Self` and only usable inside a circuit. - -### [Statements](./src/statements/statement.rs) - -The statement level nodes in a Leo Program. - -#### [Assignment Statements](./src/statements/assign/) - -An assignment statement node stores the following: - -- The operation. - - **=** - - **+=** - - **-=** - - **\*=** - - **/=** - - **\*\*=** - - **&&=** - - **||=** -- The assignee which is a variable that has context of any access expressions on it. -- The value which is an expression. - -#### [Block Statements](./src/statements/block.rs) - -A block statement node stores the following: - -- The list of statements inside the block. - -#### [Conditional Statements](./src/statements/conditional.rs) - -A conditional statement node stores the following: - -- The condition which is an expression. -- The block statement. -- The next block of the conditional if it exists. - -#### [Console Statements](./src/statements/) - -A console statement node stores the following: - -- The console function being called which stores the type of console function it is and its arguments. - -#### [Definition Statements](./src/statements/definition/mod.rs) - -A definition statement node stores the following: - -- The declaration type: - - `let` for mutable definitions. - - `const` for cosntant definitions. -- The names of the variables defined. -- The optional type. -- The values to be assigned to the varaibles. - -#### [Expression Statements](./src/statements/expression.rs) - -An expression statement node stores the following: - -- The expression. - -#### [Iteration Statements](./src/statements/iteration.rs) - -A iteration statement node stores the following: - -- The loop iterator variable name. -- The expression to define the starting loop value. -- The expression to define the stopping loop value. -- A flag indicating whether the stopping value is inclusive or not. -- The block to run for the loop. - -#### [Return Statements](./src/statements/return_statement.rs) - -A return statement node stores the following: - -- The expression that is being returned. - -### [Expressions](./src/expressions/mod.rs) - -The expression nodes in a Leo Program. - -#### [ArrayAccess Expressions](./src/accesses/array_access.rs) - -An array access expression node stores the following: - -- The array expression. -- The index represented by an expression. - -#### [ArrayInit Expressions](./src/expression/array_init.rs) - -An array init expression node stores the following: - -- The element expression to fill the array with. -- The dimensions of the array to build. - -#### [ArrayInline Expressions](./src/expression/array_inline.rs) - -An array inline expression node stores the following: - -- The elements of an array, each of which is either a spread or an expression. - -#### [ArrayRangeAccess Expressions](./src/accesses/array_range_access.rs) - -An array range access expression node stores the following: - -- The array expression. -- The optional left side of the range of the array bounds to access. -- The optional right side of the range of the array bounds to access. - -#### [Binary Expressions](./src/expression/binary.rs) - -A binary expression node stores the following: - -- The left side of the expression. -- The right side of the expression. -- The binary operation of the expression: - - **+** - - **-** - - **\*** - - **/** - - **\*\*** - - **||** - - **&&** - - **==** - - **!=** - - **>=** - - **>** - - **<=** - - **<** - -#### [Call Expressions](./src/expression/call.rs) - -A call expression node stores the following: - -- The function expression being called. -- The aruments a list of expressions. - -#### [CircuitInit Expressions](./src/expression/circuit_init.rs) - -A circuit init expression node stores the following: - -- The name of the circuit expression being initialized. -- The arguments a list of expressions. - -#### [MemberAccess Expressions](./src/accesses/member_access.rs) - -A member access expression node stores the following: - -- The expression being accessed. -- The name of the member being accessed. -- The optional inferred type. - -#### [StaticAccess Expressions](./src/accesses/static_access.rs) - -A static function access expression node stores the following: - -- The expression being accessed. -- The name of the member being statically accessed. -- The optional inferred type. - -#### [Identifier Expressions](./src/common/identifier.rs) - -An identifer expression node stores the following: - -- An identifier stores the string name. - -#### [Ternary Expressions](./src/expression/ternary.rs) - -A ternary expression node stores the following: - -- The condition of the ternary stored as an expression. -- The expression returned if the condition is true. -- The expression returned if the condition is false. - -#### [TupleAccess Expressions](./src/accesses/tuple_access.rs) - -A tuple access expression node stores the following: - -- The tuple expression being accessed. -- The index a positive number greater than or equal to 0. - -#### [TupleInit Expressions](./src/expression/tuple_init.rs) - -A tuple init expression node stores the following: - -- The element expressions to fill the tuple with. - -#### [Unary Expressions](./src/expression/unary.rs) - -An unary expression node stores the following: - -- The inner expression. -- The unary operator: - - **!** - - **-** - -#### [Value Expressions](./src/expression/value.rs) - -A value expression node stores one of the following: - -- Address and its value and span. -- Boolean and its value and span. -- Char and its value and span. -- Field and its value and span. -- Group and its value and span. -- Implicit and its value and span. -- Integer and its value and span. -- String and its value and span. diff --git a/compiler/core/README.md b/compiler/core/README.md index b10a86fcc7..ee2b405f42 100644 --- a/compiler/core/README.md +++ b/compiler/core/README.md @@ -3,72 +3,3 @@ [![Crates.io](https://img.shields.io/crates/v/leo-ast.svg?color=neon)](https://crates.io/crates/leo-core) [![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS) [![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md) - -This directory includes the core library for Leo. - -## Usage - -Leo's core library is statically built into the compiler. -All modules in the `leo-core` rust module will be automatically included in a `core` Leo module that can be imported into a Leo program. - -## Implementations - -For a full explanation of core library functions, see the [Aleo developer documentation](https://developer.aleo.org/) - -### Core Library (Account) - -#### Compute Key - -```ts -import core.account.ComputeKey; - -function foo(public compute_key: ComputeKey); -function foo(compute_key: ComputeKey); -``` - -#### Private Key - -```ts -import core.account.PrivateKey; - -function foo(public private_key: PrivateKey); -function foo(private_key: PrivateKey); -``` - -#### Record - -```ts -import core.account.Record; - -function foo(public record: Record); -function foo(record: Record); -``` - -#### Signature - -```ts -import core.account.Signature; - -function foo(public signature: Signature); -function foo(signature: Signature); -``` - -#### View Key - -```ts -import core.account.ViewKey; - -function foo(public view_key: ViewKey); -function foo(view_key: ViewKey); -``` - -### Core Library (Algorithms) - -#### Poseidon - -```ts -import core.algorithms.Poseidon; - -function foo(public poseidon: Poseidon); -function foo(poseidon: Poseidon); -``` \ No newline at end of file diff --git a/compiler/parser/README.md b/compiler/parser/README.md index aa895e996d..c0ff343922 100644 --- a/compiler/parser/README.md +++ b/compiler/parser/README.md @@ -3,123 +3,3 @@ [![Crates.io](https://img.shields.io/crates/v/leo-parser.svg?color=neon)](https://crates.io/crates/leo-parser) [![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS) [![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md) - -This directory contains the code to tokenize, lex and parse Leo files to the [Leo AST](./../ast/README.md). - -## Tokenizer - -The tokenizer contains all tokens in Leo. -It also decides which tokens are keywords. -Meaning that keywords are a subset of tokens. -The lexer goes through character by character as bytes, and converts the bytes into the tokens. - -### Tokens - -Bolded ones are also keywords. - -#### Literals -- CommentLine -- CommentBlock -- StringLit -- Ident -- Int -- **True** -- **False** -- AddressLit -- CharLit - -#### Symbols -- At -- Not -- And (`&&`) -- Ampersand (`&`) -- Or -- Eq -- NotEq -- Lt -- LtEq -- Gt -- GtEq -- Add -- Minus -- Mul -- Div -- Exp -- Assign -- AddEq -- MinusEq -- MulEq -- DivEq -- ExpEq -- LeftParen -- RightParen -- LeftSquare -- RightSquare -- LeftCurly -- RightCurly -- Comma -- Dot -- DotDot -- DotDotDot -- Semicolon -- Colon -- DoubleColon -- Question -- Arrow -- Underscore - -#### Types -- **U8** -- **U16** -- **U32** -- **U64** -- **U128** -- **I8** -- **I16** -- **I32** -- **I64** -- **I128** -- **Field** -- **Group** -- **Bool** -- **Address** -- **Char** -- **BigSelf** - -#### Words -- **Input** -- **LittleSelf** -- **Import** -- **As** -- **Circuit** -- **Console** -- **Const** -- **Else** -- **For** -- **Function** -- **If** -- **In** -- **Let** -- **Return** -- **Static** -- **String** - -#### Meta -- Eof - -## Parser - -The parser converts the tokens to the [Leo AST](./../ast/README.md). - -The parser is broken down to different files that correspond to different aspects of the AST: - -- [File](./src/parser/file.rs) - Parses the top level nodes in Leo. -- [Types](./src/parser/type_.rs) - Parses the type declarations in Leo. -- [Statements](./src/parser/statement.rs) - Parses the different kinds of statements. -- [Expressions](./src/parser/expression.rs) - Parses the different kinds of expressions. - - For more information on those please read the Leo AST README, linked above. - -## Grammar Relation - -All function and token names are as close as possible to the [Leo Grammar](./../grammar/README.md) diff --git a/compiler/passes/README.md b/compiler/passes/README.md index db60cf1506..605c632869 100644 --- a/compiler/passes/README.md +++ b/compiler/passes/README.md @@ -3,11 +3,3 @@ [![Crates.io](https://img.shields.io/crates/v/leo-ast.svg?color=neon)](https://crates.io/crates/leo-passes) [![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS) [![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md) - -## Usage - -The code here is split into several usages. Each usage represents a different pass or modification when given an AST. - -## Structure - -Each different type of pass is located in its own directory within the src directory. diff --git a/docs/compiler-phases.pdf b/docs/compiler-phases.pdf deleted file mode 100644 index 57ba65a9a0..0000000000 Binary files a/docs/compiler-phases.pdf and /dev/null differ diff --git a/docs/error-guides/parser/array_tuple_dimensions_empty.md b/docs/error-guides/parser/array_tuple_dimensions_empty.md deleted file mode 100644 index 98b74f38fb..0000000000 --- a/docs/error-guides/parser/array_tuple_dimensions_empty.md +++ /dev/null @@ -1,51 +0,0 @@ -# Empty array dimensions - -## Example - -This error occurs when specifying an empty tuple as the dimensions of an array. - -Erroneous code example: - -```js -function main() { - let foo = [42; ()]; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370023]: Array dimensions specified as a tuple cannot be empty. - --> test.leo:2:20 - | - 2 | let foo = [42; ()]; - | ^^ -``` - -## Solution - -If you wanted a single dimensional array, you can achieve that by specifying the length like so: - -```js -function main() { - let foo = [42; 4]; -} -``` - -This will give you the array `[42, 42, 42, 42]`. - -If instead you wanted a multi-dimensional array, e.g., a 2 x 3 matrix, you can achieve that with: - -```js -function main() { - let foo = [42; (2, 3)]; -} -``` - -Alternatively, you can use the simple syntax all the way instead: - -```js -function main() { - let foo = [[42; 2]; 3]; -} -``` diff --git a/docs/error-guides/parser/context_annotation.md b/docs/error-guides/parser/context_annotation.md deleted file mode 100644 index 78746b6f82..0000000000 --- a/docs/error-guides/parser/context_annotation.md +++ /dev/null @@ -1,35 +0,0 @@ -# `@context function` is deprecated - -## Example - -This error occurs when a function is prefixed with `@context`. - -Erroneous code example: - -```js -@context() -function foo() { - // logic... -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370017]: "@context(...)" is deprecated. Did you mean @test annotation? - --> test.leo:1:2 - | - 1 | @context() - | ^^^^^^^ -``` - -## Solution - -The `@context function` syntax is deprecated, but you can use `@test function` instead: - -```js -@test -function foo() { - // logic... -} -``` diff --git a/docs/error-guides/parser/invalid_address_lit.md b/docs/error-guides/parser/invalid_address_lit.md deleted file mode 100644 index b138965aef..0000000000 --- a/docs/error-guides/parser/invalid_address_lit.md +++ /dev/null @@ -1,44 +0,0 @@ -# Invalid address literal - -## Example - -This error occurs when a syntactically invalid address is specified. - -Erroneous code example: - -```js -function main() { - let addr = aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370001]: invalid address literal: 'aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9' - --> test.leo:2:16 - | - 2 | let addr = aleo1Qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9; - | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -``` - -A valid address literal must start with `aleo1`, -followed by 58 characters any of which can be either a lowercase letter, -or an ASCII digit (`0` to `9`). - -In the example above, the problem is `Q`, an uppercase letter, -and the second character after `aleo1`. - -## Solution - -To fix the issue, we can write...: - -```js -function main() { - let addr = aleo1qnr4dkkvkgfqph0vzc3y6z2eu975wnpz2925ntjccd5cfqxtyu8s7pyjh9; -} -``` - -...and the compiler will accept it. - -Note however that the compiler does not check whether the address is valid on-chain, but merely that the written program follows the rules of the language grammar. diff --git a/docs/error-guides/parser/invalid_assignment_target.md b/docs/error-guides/parser/invalid_assignment_target.md deleted file mode 100644 index 78dffbd689..0000000000 --- a/docs/error-guides/parser/invalid_assignment_target.md +++ /dev/null @@ -1,32 +0,0 @@ -# Invalid assignment target - -## Example - -This error currently occurs when a `static const` member or a member function -is used as the target of an assignment statement. - -Erroneous code example: - -```js -circuit Foo { - static const static_const: u8 = 0; -} - -function main() { - Foo::static_const = 0; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370011]: invalid assignment target - --> test.leo:6:5 - | - 6 | Foo::static_const = 0; - | ^^^^^^^^^^^^^^^^^ -``` - -It's not possible to assign to `static const` members or member functions, -so this is not allowed syntax. -The solution is likely to rethink your approach to the problem you are solving. diff --git a/docs/error-guides/parser/invalid_import_list.md b/docs/error-guides/parser/invalid_import_list.md deleted file mode 100644 index 2d73b8e588..0000000000 --- a/docs/error-guides/parser/invalid_import_list.md +++ /dev/null @@ -1,53 +0,0 @@ -# An empty `import` list - -## Example - -This error occurs when no sub-packages -or items were specified in an import list. - -Erroneous code example: - -```js -import gardening.(); -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370002]: Cannot import empty list - --> test.leo:1:18 - | - 1 | import gardening.(); - | ^^ -``` - -...as the compiler does not know what to import in `gardening`. - -## Solutions - -There are different solutions to this problems. -Here are 2 of them to consider. - -### Comment out the `import` - -If don't know yet what to import from `gardening`, -comment out the `import` like so: - -```js -// import gardening.(); -``` - -Later, you can come back and specify what to import like below. - -You can also remove the `import` line entirely, -which will have the same effect. - -### Specify items to `import` - -If you know that you'd like to import, for example, -the functions `water_flowers` and `prune`, -you can specify them in the import list like so: - -```js -import gardening.(water_flowers, prune); -``` diff --git a/docs/error-guides/parser/let_mut_statement.md b/docs/error-guides/parser/let_mut_statement.md deleted file mode 100644 index 644a42bd56..0000000000 --- a/docs/error-guides/parser/let_mut_statement.md +++ /dev/null @@ -1,33 +0,0 @@ -# `let mut` is deprecated - -## Example - -This error occurs when a variable declaration is marked with `mut`. - -Erroneous code example: - -```js -function main() { - let mut x = 0; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370015]: let mut = ... is deprecated. `let` keyword implies mutabality by default. - --> test.leo:2:5 - | - 2 | let mut x = 0; - | ^^^^^^^ -``` - -## Solution - -As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier: - -```js -function main() { - let x = 0; -} -``` diff --git a/docs/error-guides/parser/member_const_after_fun.md b/docs/error-guides/parser/member_const_after_fun.md deleted file mode 100644 index 397d25ce7b..0000000000 --- a/docs/error-guides/parser/member_const_after_fun.md +++ /dev/null @@ -1,37 +0,0 @@ -# `static const` after circuit functions - -## Example - -This error occurs when `static const` circuit members occur after circuit member functions. - -Erroneous code example: - -```js -circuit Foo { - function bar() {} - - static const baz: bool = true; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370021]: Member functions must come after member consts. - --> test.leo:4:18 - | - 4 | static const baz: bool = true; - | ^^^^^^^^^^^^^^^^ -``` - -## Solution - -The issue can be solved by moving all `static const` members before circuit member functions...: - -```js -circuit Foo { - static const baz: bool = true; - - function bar() {} -} -``` diff --git a/docs/error-guides/parser/member_const_after_var.md b/docs/error-guides/parser/member_const_after_var.md deleted file mode 100644 index 566d77d713..0000000000 --- a/docs/error-guides/parser/member_const_after_var.md +++ /dev/null @@ -1,37 +0,0 @@ -# `static const` after normal variables - -## Example - -This error occurs when `static const` circuit members occur after normal member variables. - -Erroneous code example: - -```js -circuit Foo { - bar: u8, - - static const baz: bool = true; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370020]: Member variables must come after member consts. - --> test.leo:4:18 - | - 4 | static const baz: bool = true; - | ^^^^^^^^^^^^^^^^ -``` - -## Solution - -The issue can be solved by moving all `static const` members before normal member variables...: - -```js -circuit Foo { - static const baz: bool = true; - - bar: u8, -} -``` diff --git a/docs/error-guides/parser/member_var_after_fun.md b/docs/error-guides/parser/member_var_after_fun.md deleted file mode 100644 index 9ed9a0655d..0000000000 --- a/docs/error-guides/parser/member_var_after_fun.md +++ /dev/null @@ -1,37 +0,0 @@ -# Member variables after circuit functions - -## Example - -This error occurs when circuit member variables occur after circuit member functions. - -Erroneous code example: - -```js -circuit Foo { - function bar() {} - - baz: bool; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370022]: Member functions must come after member variables. - --> test.leo:4:5 - | - 4 | baz: bool; - | ^^^ -``` - -## Solution - -The issue can be solved by moving all member variables before any circuit member functions...: - -```js -circuit Foo { - baz: bool; - - function bar() {} -} -``` diff --git a/docs/error-guides/parser/mixed_commas_and_semicolons.md b/docs/error-guides/parser/mixed_commas_and_semicolons.md deleted file mode 100644 index ed060dcd47..0000000000 --- a/docs/error-guides/parser/mixed_commas_and_semicolons.md +++ /dev/null @@ -1,48 +0,0 @@ -# Mixed commas and semicolons in circuit definitions - -## Example - -This error occurs when mixing semicolons, `;`, -and commas, `,` together in the list of member variables in a circuit definition. - -Erroneous code example: - -```js -circuit A { - foo: u8, - bar: u16; -} -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370006]: Cannot mix use of commas and semi-colons for circuit member variable declarations. - --> test.leo:3:13 - | - 3 | bar: u16; - | ^ -``` - -## Solutions - -The solution is simply to consistently use `;` or `,` after each member variable, -and avoid mixing `;` and `,` together. So we could write either...: - -```js -circuit A { - foo: u8, - bar: u16, -} -``` - -...or write...: - -```js -circuit A { - foo: u8; - bar: u16; -} -``` - -...and the compiler would accept it. diff --git a/docs/error-guides/parser/mut_function_input.md b/docs/error-guides/parser/mut_function_input.md deleted file mode 100644 index c98fc0ef68..0000000000 --- a/docs/error-guides/parser/mut_function_input.md +++ /dev/null @@ -1,37 +0,0 @@ -# Deprecated `mut` parameter - -## Example - -This error occurs when a function parameter is marked as `mut`. - -Erroneous code example: - -```js -circuit Foo { - function bar(mut x: u8) { - x = 0; - } -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370014]: function func(mut a: u32) { ... } is deprecated. Passed variables are mutable by default. - --> test.leo:2:18 - | - 2 | function bar(mut x: u8) { - | ^^^^^ -``` - -## Solution - -As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier: - -```js -circuit Foo { - function bar(x: u8) { - x = 0; - } -} -``` diff --git a/docs/error-guides/parser/mut_self_parameter.md b/docs/error-guides/parser/mut_self_parameter.md deleted file mode 100644 index 65849c868b..0000000000 --- a/docs/error-guides/parser/mut_self_parameter.md +++ /dev/null @@ -1,41 +0,0 @@ -# Deprecated `mut` parameter - -## Example - -This error occurs when a function parameter is marked as `mut`. - -Erroneous code example: - -```js -circuit Foo { - bar: u8, - - function bar(mut self) { - self.bar = 0; - } -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370019]: `mut self` is no longer accepted. Use `&self` if you would like to pass in a mutable reference to `self` - --> test.leo:4:18 - | - 4 | function bar(mut self) { - | ^^^^^^^^ -``` - -## Solution - -As the `mut` modifier is implicitly assumed, the solution is to remove the `mut` modifier from `self`: - -```js -circuit Foo { - bar: u8, - - function bar(self) { - self.bar = 0; - } -} -``` diff --git a/docs/error-guides/parser/spread_in_array_init.md b/docs/error-guides/parser/spread_in_array_init.md deleted file mode 100644 index 7b966b107d..0000000000 --- a/docs/error-guides/parser/spread_in_array_init.md +++ /dev/null @@ -1,43 +0,0 @@ -# Illegal spread expression in array initializer - -## Example - -This error occurs when a spread expression, e.g., `...foo` occurs in an array initializer. - -Erroneous code example: - -```js -function main() { - let foo = [0, 1]; - let array = [...foo; 3]; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370010]: illegal spread in array initializer - --> test.leo:3:17 - | - 3 | let array = [...foo; 3]; - | ^^^^^^^ -``` - -## Solution - -The Leo language does not allow `...foo` as the element to repeat -in an array repeat expression like the one above. -This is because `foo` is not an element but rather a full array. -One could imagine that the expression above means `[...foo, ...foo, ...foo]`. -That is, `...foo` repeated as many times as was specified in the array size. -However, that is ambiguous with `[element; 3]` resulting in an array with size `3`. - -To solve the issue, disambiguate your intention. -Most likely, you really wanted `[...foo, ...foo, ...foo]`, so the solution is to write that out...: - -```js -function main() { - let foo = [0, 1]; - let array = [...foo, ...foo, ...foo]; -} -``` diff --git a/docs/error-guides/parser/test_function.md b/docs/error-guides/parser/test_function.md deleted file mode 100644 index 78d022c2da..0000000000 --- a/docs/error-guides/parser/test_function.md +++ /dev/null @@ -1,34 +0,0 @@ -# `test function` is deprecated - -## Example - -This error occurs when a function is prefixed with `test`. - -Erroneous code example: - -```js -test function foo() { - // logic... -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370016]: "test function..." is deprecated. Did you mean @test annotation? - --> test.leo:1:1 - | - 1 | test function foo() { - | ^^^^ -``` - -## Solution - -The `test function` syntax is deprecated, but you can achieve the same result with `@test function`: - -```js -@test -function foo() { - // logic... -} -``` diff --git a/docs/error-guides/parser/unable_to_parse_array_dimensions.md b/docs/error-guides/parser/unable_to_parse_array_dimensions.md deleted file mode 100644 index 92aacd06c8..0000000000 --- a/docs/error-guides/parser/unable_to_parse_array_dimensions.md +++ /dev/null @@ -1,34 +0,0 @@ -# Unable to parse array dimensions - -## Example - -This error occurs when there is a syntax error in the array dimensions of an array initializer. - -Erroneous code example: - -```js -function main() { - let x = [1; +]; -} -``` - -The compiler will reject this code with, for example...: - -```js -Error [EPAR0370018]: unable to parse array dimensions - --> test.leo:2:13 - | - 2 | let x = [1; +]; - | ^ -``` - -## Solution - -In the case above, the error occurs due to the `+`. -The issue can be resolved by specifying the number of elements desired, e.g., `5`...: - -```js -function main() { - let x = [1; 5]; -} -``` diff --git a/docs/error-guides/parser/unexpected.md b/docs/error-guides/parser/unexpected.md deleted file mode 100644 index 691ffcfd16..0000000000 --- a/docs/error-guides/parser/unexpected.md +++ /dev/null @@ -1,37 +0,0 @@ -# Expected "x" -- got "y" - -## Example - -This error occurs when a specific token, e.g., `class` was encountered but a different one, -e.g., `circuit` was expected instead. - -Erroneous code example: - -```js -class A {} -``` - -The compiler will reject this code with: - -```js -Error: --> main.leo:1:1 - | - 1 | class A {} - | ^^^^^ - | - = expected 'import', 'circuit', 'function', 'test', '@' -- got 'class' -``` - -## Solutions - -The error message above says that `class` cannot be used at that location, -and also lists a few tokens that are valid. Note that this is context specific, -and depends on what tokens preceded the current token. -Using the list of tokens that are valid, and knowing that `circuit A {}` is valid syntax, -we replace `class` with `circuit`...: - -```js -circuit A {} -``` - -...and the error is now resolved. diff --git a/docs/error-guides/parser/unexpected_eof.md b/docs/error-guides/parser/unexpected_eof.md deleted file mode 100644 index b1d76f16a5..0000000000 --- a/docs/error-guides/parser/unexpected_eof.md +++ /dev/null @@ -1,49 +0,0 @@ -# An unexpected end of file - -## Example - -This error occurs when the Leo compiler tries to parse your program -and unexpectedly reaches the end of a `.leo` file. - -Erroneous code example: - -```js -function main() { -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370003]: unexpected EOF - --> test.leo:1:17 - | - 1 | function main() { - | ^ -``` - -## Solutions - -The problem typically occurs when there are unbalanced delimiters, -which we have an instance of above. -More specifically, in the example, -the issue is that there is no `}` to close the opening brace `{`. - -An even simpler variant of this is: - -```js -function main( -``` - -The solution here is to close the opening delimiter, in this case `(`. - -## The general issue - -To illustrate the heart of the problem, consider this invalid file: - -```js -// ↳ main.leo -function -``` - -When parsing the file, the compiler expects something, in this case, -the function's name, but instead, the parser reaches the end of the file. diff --git a/docs/error-guides/parser/unexpected_ident.md b/docs/error-guides/parser/unexpected_ident.md deleted file mode 100644 index ead09d0dd7..0000000000 --- a/docs/error-guides/parser/unexpected_ident.md +++ /dev/null @@ -1,42 +0,0 @@ -# Unexpected identifier: expected "x" -- got "y" - -## Example - -This error occurs when a specific *identifier*, e.g., `error` was expected but a different one, -e.g., `fail` was encountered instead. - -Erroneous code example: - -```js -function main() { - console.fail("Houston we have a problem!"); -} -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370007]: unexpected identifier: expected 'assert', 'error', 'log' -- got 'fail' - --> test.leo:2:11 - | - 2 | console.fail("Houston we have a problem!"); - | ^^^^ -``` - -## Solutions - -The error message above says that `fail` cannot be used at that location, -and also lists a few identifiers that are valid. Note that this is context specific, -and depends on what preceded the valid tokens in the location. - -The error message lists identifiers that are valid, e.g., `error`. -Here, since we used `.fail(...)`, we most likely wanted to trigger a compile error, -which `.error(...)` will achieve, so we use that instead...: - -```js -function main() { - console.error("Houston we have a problem!"); -} -``` - -Note that this error currently only occurs when using `console`. diff --git a/docs/error-guides/parser/unexpected_statement.md b/docs/error-guides/parser/unexpected_statement.md deleted file mode 100644 index c7c03aa9af..0000000000 --- a/docs/error-guides/parser/unexpected_statement.md +++ /dev/null @@ -1,42 +0,0 @@ -# Unexpected statement: expected "x" -- got "y" - -## Example - -This error occurs when a statement, which isn't `if`, follows `else` directly. - -Erroneous code example: - -```js -function main () { - if true { - console.log("It was true."); - } else - console.log("It was false."); -} -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370008]: unexpected statement: expected 'Block or Conditional', got 'console.log("It was false.", );' - --> test.leo:5:9 - | - 5 | console.log("It was false."); - | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -``` - -## Solutions - -To fix the problem, wrap the statement in a block, so by turning the snippet above into...: - -```js -function main () { - if true { - console.log("It was true."); - } else { - console.log("It was false."); - } -} -``` - -...the error is fixed. diff --git a/docs/error-guides/parser/unexpected_str.md b/docs/error-guides/parser/unexpected_str.md deleted file mode 100644 index 02dfbc08a7..0000000000 --- a/docs/error-guides/parser/unexpected_str.md +++ /dev/null @@ -1,36 +0,0 @@ -# Expected string "x" -- got "y" - -## Example - -This error occurs when a specific "string" (in reality a token), -was expected but a different one was expected instead. - -Erroneous code example: - -```js -function main () { - let x: [u8; (!)] = [0]; -} -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370009]: unexpected string: expected 'int', got '!' - --> test.leo:2:18 - | - 2 | let x: [u8; (!)] = [0]; - | ^ - -``` - -## Solutions - -The error message "unexpected string" depends on the context. -In the example above, we need to replace `!` with `1`...: - -```js -function main () { - let x: [u8; 1] = [0]; -} -``` diff --git a/docs/error-guides/parser/unexpected_token.md b/docs/error-guides/parser/unexpected_token.md deleted file mode 100644 index 8a4653219c..0000000000 --- a/docs/error-guides/parser/unexpected_token.md +++ /dev/null @@ -1,40 +0,0 @@ -# An unexpected token - -## Example - -This error occurs when the Leo compiler tries to parse your program. -More specifically, during a phase called 'lexing'. -In this phase, the compiler first takes your code, -consisting of characters, and interprets it as a list of tokens. -These tokens are a sort of *alphabet* internal to Leo. - -Consider the English language. It only has 26 letters in its alphabet. -So there are some letters, e.g., `Γ` from the greek alphabet, -which would not fit if we tried to "tokenize" English. - -Leo, while being a programming language, is similar here. -There are characters or sequences of characters, -that Leo does not understand and cannot lex into tokens. -Since this error occured, that is what has happened. - -Erroneous code example: - -```js -~ -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370000]: ~ - --> test.leo:1:1 - | - 1 | ~ - | ^ -``` - -## Solutions - -What the solution to an unexpected token is depends on what you wanted to achieve. -Most likely, you made a typo somewhere. -For a more complete overview of valid Leo tokens, consult the [Leo grammar](https://github.com/AleoHQ/leo/blob/testnet3/grammar/README.md). diff --git a/docs/error-guides/parser/unexpected_whitespace.md b/docs/error-guides/parser/unexpected_whitespace.md deleted file mode 100644 index 422d21bb98..0000000000 --- a/docs/error-guides/parser/unexpected_whitespace.md +++ /dev/null @@ -1,34 +0,0 @@ -# Unexpected whitespace - -## Example - -This error occurs when there was unexpected white space when in your program. -Typically, the white space occurs in a literal with a typed suffix. - -Erroneous code example: - -```js -function main() { - let x = 1 u8; -} -``` - -The compiler will reject this code with: - -```js -Error [EPAR0370004]: Unexpected white space between terms 1 and u8 - --> test.leo:2:13 - | - 2 | let x = 1 u8; - | ^ -``` - -## Solutions - -The problem is solved by removing the white space between the literal and its suffix. So given the example above, we can fix it by writing: - -```js -function main() { - let x = 1u8; -} -``` diff --git a/docs/rfc/001-initial-strings.md b/docs/rfc/001-initial-strings.md deleted file mode 100644 index 50b4be6482..0000000000 --- a/docs/rfc/001-initial-strings.md +++ /dev/null @@ -1,430 +0,0 @@ -# Leo RFC 001: Initial String Support - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -## Summary - -The purpose of this proposal is to provide initial support for strings in Leo. -Since strings are sequences of characters, -the proposal inextricably also involves characters. -This proposal is described as initial, -because it provides some basic features that we may extend in the future; -the initial features should be sufficiently simple and conservative -that they should not limit the design of the future features. - -This proposal adds a new scalar type for characters, -along with a new kind of literals to denote characters. -A string is then simply an array of characters, -but this proposal also adds a new kind of literals to denote strings -more directly than via character array construction expressions. -Along with equality and inequality, which always apply to every Leo type, -this proposal also introduces some operations on characters and strings -that can be implemented over time. - -By not prescribing a new type for strings, -this initial proposal leaves the door open -to a future more flexible type of resizable strings. - -## Motivation - -Strings (and characters) are common in programming languages. -Use cases for Leo include -simple ones like URLs and token ticker symbols, -and more complex ones like Bech32 encoding, -edit distance in strings representing proteins, -and zero-knowledge proofs of occurrences or absences of patterns in textual logs. - -## Design - -Since strings are sequences of characters, -a design for strings inextricably also involves a design for characters. -Thus, we first present a design for both characters and strings. - -### Characters - -We add a new scalar type, `char` for characters. -In accord with Leo's strong typing, -this new type is separate from all the other scalar types. -Explicit constructs will be needed to convert between `char` and other types. - -The set of values of type `char` is isomorphic to -the set of Unicode code points from 0 to 10FFFF (both inclusive). -That is, we support Unicode characters, more precisely code points -(this may include some invalid code points, -but it is simpler to allow every code point in that range). -A character is an atomic entity: -there is no notion of Unicode encoding (e.g. UTF-8) that applies here. - -We add a new kind of literals for characters, -consisting of single characters or escapes, -surrounded by single quotes. -Any single Unicode character except a single quote is allowed, -e.g. `'a'`, `'*'`, and `'"'`. -Single quotes must be escaped with a backslash, i.e. `'\''`; -backslashes must be escaped as well, i.e. `'\\'` -We allow other backslash escapes -for commonly used characters that are not otherwise easily denoted. -This is the complete list of single-character backslash escapes: -* `\'` for code point 39 (single quote) -* `\"` for code point 34 (double quote) -* `\\` for code point 92 (backslash) -* `\n` for code point 10 (line feed) -* `\r` for code point 13 (carriage return) -* `\t` for core point 9 (horizontal tab) -* `\0` for code point 0 (the null character) - -We also allow ASCII escapes of the form `\xOH`, -where `O` is an octal digit and `H` is a hexadecimal digit. -Both uppercase and lowercase hex digits are allowed. -The `x` must be lowercase. -These represent ASCII code points, i.e. from 0 to 127 (both inclusive). - -We also allow Unicode escapes of the form `'\u{X}'`, -where `X` is a sequence of one to six hex digits. -Both uppercase and lowercase letters are allowed. -The `u` must be lowercase. -The value must be between 0 and 10FFFF, inclusive. - -Note that this syntax for character literals is very close to the Rust syntax documented here (as of 2021-05-26): -https://doc.rust-lang.org/reference/tokens.html#character-literals -The only difference is that this syntax does not support Unicode escapes with underbars in them. -The following is true in Rust but not in this proposal for Leo: -`'\u{1_____0__F____F______FF__________________________}' == '\u{10FFFF}'`. - -Note that the literal character is assembled by the compiler---for -creating literals, there is no need for the circuit to know -which code points are disallowed. - -The equality operators `==` and `!=` are automatically available for `char`. - -The following code sample illustrates three ways of defining characters: -character literal, single-character escapes, and Unicode escapes. - -```js -function main() -> [char; 5] { - - // using char literals to form an array - const world: [char; 5] = ['w', 'o', 'r', 'l', 'd']; - - // escaped characters - const escaped: [char; 4] = ['\n', '\t', '\\', '\'']; - - // unicode escapes - using emoji character 😊 - const smiling_face: char = '\u{1F60A}'; - - return [smiling_face, ...escaped]; -} -``` - -### Strings - -In this initial design proposal, we do not introduce any new type for strings. -Instead, we rely on the fact that Leo already has arrays -and that arrays of characters can be regarded as strings. -Existing array operations, such as element and range access, -apply to these strings without the need of language extensions. - -To ease the common use case of writing a string value in the code, -we add a new kind of literal for strings (i.e. character arrays), -consisting of a sequence of **one or more** single characters or escapes -surrounded by double quotes; -this is just syntactic sugar for the literal array construction. -Any Unicode character except double quote or backslash is allowed without escape. -Examples: `"Aleo"`, `"it's"`, and `"x + y"`. -Double quotes must be escaped with a backslash, e.g. `"say \"hi\""`; -backslashes must be escaped as well, e.g. `"c:\\dir"`. -We also allow the same backslash escapes allowed for character literals -(see the section on characters above). -We also allow the same Unicode escapes allowed in character literals -(described in the section on characters above). - -Note that this syntax for string literals is very close to the Rust syntax documented here (as of 2021-05-26): -https://doc.rust-lang.org/reference/tokens.html#string-literals. -The main difference is that this syntax does not support the Rust `STRING_CONTINUE` syntax. -In this syntax a backslash may not be followed by a newline, and newlines have no special handling. -Another differences is that this syntax does **not** permit the empty string `""`. -Also, this syntax does not allow underbars in Unicode escapes in string literals. - -The type of a string literal is `[char; N]`, -where `N` is the length of the string measured in characters, -i.e. the size of the array. -Note that in this language design there is no notion of Unicode encoding (e.g. UTF-8) -that applies to string literals. - -The rationale for not introducing a new type for strings initially, -and instead, piggybacking on the existing array types and operations, -is twofold. -First, it is an economical design -that lets us reuse the existing array machinery, -both at the language level (e.g. readily use array operations) -and at the R1CS compilation level -(see the section on compilation to R1CS below). -Second, it leaves the door open to providing, -in a future design iteration, -a richer type for strings, -as discussed in the section about future extensions below. - -Recall that empty arrays are disallowed in Leo. -(The reason is that arrays, -which must have a size known at compile time and are not resizable, -are flattened into their elements when compiling to R1CS; -thus, an empty array would be flattened into nothing.) -Therefore, in this initial design empty strings must be disallowed as well. -A future type of resizable strings will support empty strings. - -Because array, and therefore string, sizes must be known at compile time, -there is no point to having an operation to return the length of a string. -This operation will be supported for a future type of resizable strings. - -Below are some examples of array operations -that are also common for strings in other programming languages: -* `[...s1, ...s2]` concatenates the strings `s1` and `s2`. -* `[c, ...s]` adds the character `c` in front of the string `s`. -* `s[i]` extracts the `i`-th character from the string `s`. -* `s[1..]` removes the first character from the string `s`. - -The following code shows a string literal and its actual transformation into an -array of characters as well as possible array-like operations on strings: -concatenation and comparison. - -```js -function main() -> bool { - // double quotes create char array from string - let hello: [char; 5] = "hello"; - let world: [char; 5] = ['w','o','r','l','d']; - - // string concatenation can be performed using array syntax - let hello_world: [char; 11] = [...hello, ' ', ...world]; - - // string comparison is also implemented via array type - return hello_world == "hello world"; -} -``` - -### Format Strings - -Leo currently supports format strings as their own entity, -usable exclusively as first arguments of console print calls. -This proposal eliminates this very specific notion, -which is subsumed by the string literals described above. -In other words, a console print call -will take a string literal as the first argument, -which will be interpreted as a format string -according to the semantics of console print calls. -The internal UTF-32 string will be translated to UTF-8 for output. - -### Compilation to R1CS - -So far, the discussion has been independent from R1CS -(except for a brief reference when discussing the rationale behind the design). -This is intentional because the syntax and semantics of Leo -should be understandable independently from the compilation of Leo to R1CS. -However, compilation to R1CS is a critical consideration -that affects the design of Leo. -This section discusses R1CS compilation considerations -for this proposal for characters and strings. - -Values of type `char` can be represented directly as field elements, -since the prime of the field is (much) larger than 10FFFF. -This is more efficient than using a bit representation of characters. -By construction, field elements that represent `char` values -are never above 10FFFF. -Note that `field` and `char` remain separate types in Leo: -it is only in the compilation to R1CS -that everything is reduced to field elements. - -Since strings are just arrays of characters, -there is nothing special about compiling strings to R1CS, -compared to other types of arrays. -In particular, the machinery to infer array sizes at compile time, -necessary for the flattening to R1CS, -applies to strings without exception. -String literals are just syntactic sugar for -suitable array inline construction expressions. - -There are at least two approaches to implementing -ordering operations `<` and `<=` on `char` values. -Recalling that characters are represented as field values -that are (well) below `(p-1)/2` where `p` is the prime, -we can compare two field values `x` and `y`, -both below `(p-1)/2`, via the constraints -``` -(2) (x - y) = (b0 + 2*b1 + 4*b2 + ...) -(b0) (1 - b0) = 0 -(b1) (1 - b1) = 0 -(b2) (1 - b2) = 0 -... -``` -that take the difference, double it, and convert to bits. -If `x >= y`, the difference is below `(p-1)/2`, -and doubling results in an even number below `p`, -with therefore `b0 = 0`. -If `x < y`, the difference is above `(p-1)/2` (when reduced modulo `p`), -and doubling results in an odd number when reduced modulo `p`, -with therefore `b0 = 1`. -Note that we need one variable and one constraint for every bit of `p`. -The other approach is to convert the `x` and `y` to bits -and compare them as integers; -in this case we only need 21 bits for each. -We need more analysis to determine which approach is more efficient. - -The details of implementing other character and string operations in R1CS -will be fleshed out as each operation is added. - -## Drawbacks - -This proposal does not appear to bring any real drawbacks, -other than making the language inevitably slightly more complex. -But the need to support characters and strings justifies the extra complexity. - -## Effect on Ecosystem - -With the ability of Leo programs to process strings, -it may be useful to have external tools that convert Leo strings -to/from common formats, e.g. UTF-8. -See the discussion of input files in the design section. - -## Alternatives - -### 32-bit Unsigned Integers for Characters - -We could avoid the new `char` type altogether, -and instead, rely on the existing `u32` to represent Unicode code points, -and provide character-oriented operations on `u32` values. -(Note that both `u8` and `u16` are too small for 10FFFF, -and that signed integer types include negative integers -which are not Unicode code points: -this makes `u32` the obvious choice.) -However, many values of type `u32` are above 10FFFF, -and many operations on `u32` do not really make sense on code points. -We would probably want a notation for character literals anyhow, -which could be (arguably mis)used for non-character unsigned integers. -All in all, introducing a new type for characters -is consistent with Leo's strong typing approach. -Furthermore, for compilation to R1CS, `u32`, -even if restricted to the number of bits needed for Unicode code points, -is less efficient than the field representation described earlier -because `u32` requires a field element for each bit. - -### Separate String Type, Isomorphic to Character Arrays - -Instead of representing strings as character arrays, -we could introduce a new type `string` -whose values are finite sequences of zero or more characters. -These strings would be isomorphic to, but distinct form, character arrays. -However, for compilation to R1CS, it would be necessary to -perform the same kind of known-size analysis on strings -that is already performed on arrays, -possibly necessitating to include size as part of the type, i.e. `string(N)`, -which is obviously isomorphic to `[char; N]`. -Thus, using character arrays avoids duplication. -Furthermore, as noted in the section on future extensions, -this leaves the door open to -introducing a future type `string` for resizable strings. - -### Field Elements for Characters - -Yet another option could be to use directly `field` to represent characters -and `[field; N]` to represent strings of `N` characters. -However, many values of type `field` are not valid Unicode code points, -and many field operations do not make sense for characters. -Thus, having a separate type `char` for characters seems better, -and more in accordance with Leo's strong typing. - -## Future Extensions - -### Operations on Characters - -Given that characters are essentially code points, -we may also support the ordering operators `<`, `<=`, `>`, and `>=`; -these may be useful to check whether a character is in certain range. - -Below is a list of possible operations we could support on characters. -It should be fairly easy to add more. -- [ ] `is_alphabetic` - Returns `true` if the `char` has the `Alphabetic` property. -- [ ] `is_ascii` - Returns `true` if the `char` is in the `ASCII` range. -- [ ] `is_ascii_alphabetic` - Returns `true` if the `char` is in the `ASCII Alphabetic` range. -- [ ] `is_lowercase` - Returns `true` if the `char` has the `Lowercase` property. -- [ ] `is_numeric` - Returns `true` if the `char` has one of the general categories for numbers. -- [ ] `is_uppercase` - Returns `true` if the `char` has the `Uppercase` property. -- [ ] `is_whitespace` - Returns `true` if the `char` has the `White_Space` property. -- [ ] `to_digit` - Converts the `char` to the given `radix` format. -- [ ] `from_digit` - Inverse of to_digit. -- [ ] `to_uppercase` - Converts lowercase to uppercase, leaving others unchanged. -- [ ] `to_lowercase` - Converts uppercase to lowercase, leaving others unchanged. - -It seems natural to convert between `char` values -and `u8` or `u16` or `u32` values, under suitable range conditions; -perhaps also between `char` values and -(non-negative) `i8` or `i16` or `i32` values. -This may be accomplished as part of the type casting extension of Leo. - -### Operations on Strings - -Below is a list of possible operations we could support on strings. -It should be fairly easy to add more. -- [ ] `u8` to `[char; 2]` hexstring, .., `u128` to `[char; 32]` hexstring. -- [ ] Field element to `[char; 64]` hexstring. (Application can test leading zeros and slice them out if it needs to return, say, a 40-hex-digit string.) -- [ ] Apply `to_uppercase` (see above) to every character. -- [ ] Apply `to_lowercase` (see above) to every character. - -Note that the latter two could be also realized via simple loops through the string. - -Given the natural conversions between `char` values and integer values discussed earlier, -it may be natural to also support element-wise conversions between strings and arrays of integers. -This may be accomplished as part of the type casting extensions of Leo. - -### Circuit Types for Character and String Operations - -The operations on characters and lists described earlier, e.g. `is_ascii`, -are provided as static member functions of two new built-in or library circuit types `Char` and `String`. -Thus, an example call is `Char::is_ascii(c)`. -This seems a general good way to organize built-in or library operations, -and supports the use of the same name with different circuit types, -e.g. `Char::to_uppercase` and `String::to_uppercase`. - -These circuit types could also include constants, e.g. for certain ASCII characters. -However, currently Leo does not support constants in circuit types, -so that would have to be added separately first. - -These two circuit types are just meant to collect static member functions for characters and strings. -They are not meant to be the types of characters and strings: -as mentioned previously, `char` is a new scalar (not circuit) type (like `bool`, `address`, `u8`, etc.) -and there is no string type as such for now, but we use character arrays for strings. -In the future we may want all the Leo types to be circuit types of some sort, -but that is a separate feature that would have to be designed and developed independently. - -### Input and Output of Literal Characters and Strings - -Since UTF-8 is a standard encoding, it would make sense for -the literal characters and strings in the `.in` file -to be automatically converted to UTF-32 by the Leo compiler. -However, the size of a string can be confusing since multiple -Unicode code points can be composed into a single glyph which -then appears to be a single character. If a parameter of type `[char; 10]` -[if that is the syntax we decide on] is passed a literal string -of a different size, the error message should explain that the -size must be the number of codepoints needed to encode the string. - -### String Type - -As alluded to in the design section above, -for now, we are avoiding the introduction of a string type, -isomorphic to but separate from character arrays, -because we may want to introduce later a more flexible type of strings, -in particular, one that supports resizing. -This may be realized via a built-in or library circuit type -that includes a character array and a fill index. -This may be a special case of a built-in or library circuit type -for resizable vectors, -possibly realized via an array and a fill index. -This hypothetical type of resizable vectors -may have to be parameterized over the element type, -requiring an extension of the Leo type system -that is much more general than strings. diff --git a/docs/rfc/002-bounded-recursion.md b/docs/rfc/002-bounded-recursion.md deleted file mode 100644 index 19872430ae..0000000000 --- a/docs/rfc/002-bounded-recursion.md +++ /dev/null @@ -1,518 +0,0 @@ -# Leo RFC 002: Bounded Recursion - -## Authors - -The Aleo Team. - -## Status - -FINAL - -## Summary - -This proposal provides support for recursion in Leo, -via a user-configurable limit to the allowed depth of the recursion. -If the recursion can be completely inlined without exceeding the limit, -compilation succeeds; -otherwise, an informative message is shown to the user, -who can try and increase the limit. -Compilation may also fail -if a circularity is detected before exceeding the limit. - -Future analyses may also recognize cases in which the recursion terminates, -informing the user and setting or suggesting a sufficient limit. - -A similar approach could be also used for loops in the future. -User-configurable limits may be also appropriate for -other compiler transformations that are known to terminate -but could result in a very large number of R1CS constraints. - -## Motivation - -Leo currently allows functions to call other functions, -but recursion is disallowed: -a function cannot call itself, directly or indirectly. -However, recursion is a natural programming idiom in some cases, -compared to iteration (i.e. loops). - -## Background - -### Function Inlining - -Since R1CS are flat collections of constraints, -compiling Leo to R1CS involves _flattening_ the Leo code: -unrolling loops, inlining functions, decomposing arrays, etc. -Of interest to this RFC is the inlining of functions, -in which a function call is replaced with the function body, -after binding the formal parameters to the the actual arguments, -and taking care to rename variables if needed to avoid conflicts. - -Since the `main` function is the entry point into a Leo program, -conceptually, for the purpose of this RFC, -we can think of function inlining as transitively inlining -all the functions into `main` -(this is just a conceptual model; -it does not mean that it should be necessarily implemented this way). -This is a simple example, -where '`===> {}`' indicates a transformation -described in the curly braces: -```js -function f(x: u32) -> u32 { - return 2 * x; -} -function main(a: u32) -> u32 { - return f(a + 1); -} - -===> {inline call f(a + 1)} - -function main(a: u32) -> u32 { - let x = a + 1; // bind actual argument to formal argument - return 2 * x; // replace call with body -} -``` - -### Constants and Variables - -A Leo program has two kinds of inputs: constants and variables; -both come from input files. -They are passed as arguments to the `main` functions: -the parameters marked with `const` receive the constant inputs, -while the other parameters receive the variable inputs. -Leo has constants and variables, -of which the just mentioned `main` parameters are examples; -constants may only depend on literals and other constants, -and therefore only on the constant inputs of the program; -variables have no such restrictions. - -The distinction between constants and variables -is significant to the compilation of Leo to R1CS. -Even though specific values of both constant and variable inputs -are known when the Leo program is compiled and the zk-proof is generated, -the generated R1CS does not depend -on the specific values of the variable inputs; -it only depends on the specific values of the constant inputs. -Stated another way, Leo variables are represented by R1CS variables, -while Leo constants are folded into the R1CS. - -For instance, in -```js -function main(base: u32, const exponent: u32) -> u32 { - return base ** exponent; // raise base to exponent -} -``` -the base is a variable while the exponent is a constant. -Both base and exponent are known, supplied in the input file, -e.g. the base is 2 and the exponent is 5. -However, only the information about the exponent being 5 -is folded into the R1CS, which retains the base as a variable. -Conceptually, the R1CS corresponds to the _partially evaluated_ Leo program -```js -function main(base: u32) -> u32 { - return base ** 5; -} -``` -where the constant `exponent` has been replaced with its value 5. - -This partial evaluation is carried out -as part of the Leo program flattening transformations mentioned earlier. -This also involves constant propagation and folding, -e.g. a constant expression `exponent + 1` is replaced with 6 -when the constant `exponent` is known to be 5. -(The example program above does not need any constant propagation and folding.) - -Circling back to the topic of Leo function inlining, -it is the case that, due to the aforementioned partial evaluation, -the `const` arguments of function calls have known values -when the flattening transformations are carried out. - -### Inlining Recursive Functions - -In the presence of recursion, -attempting to exhaustively inline function calls does not terminate in general. -However, in conjunction with the partial evaluation discussed above, -inlining of recursive functions may terminate, under appropriate conditions. - -This is an example: -```js -function double(const count: u32, sum: u32) -> u32 { - if count > 1 { - return double(count - 1, sum + sum); - } else { - return sum + sum; - } -} -function main(x: u32) -> u32 { - return double(3, x); -} - -===> {inline call double(3, x)} - -function main(x: u32) -> u32 { - let sum1 = x; // bind and rename parameter of function sum - if 3 > 1 { - return double(2, sum1 + sum1); - } else { - return sum1 + sum1; - } -} - -===> {evaluate 3 > 1 to true and simplify if statement} - -function main(x: u32) -> u32 { - let sum1 = x; - return double(2, sum1 + sum1); -} - -===> {inine call double(2, sum1 + sum1)} - -function main(x: u32) -> u32 { - let sum1 = x; - let sum2 = sum1 + sum1; // bind and rename parameter of function sum - if 2 > 1 { - return double(1, sum2 + sum2); - } else { - return sum2 + sum2; - } -} - -===> {evaluate 2 > 1 to true and simplify if statement} - -function main(x: u32) -> u32 { - let sum1 = x; - let sum2 = sum1 + sum1; - return double(1, sum2 + sum2) -} - -===> {inline call double(1, sum2 + sum2)} - -function main(x: u32) -> u32 { - let sum1 = x; - let sum2 = sum1 + sum1; - let sum3 = sum2 + sum2; // bind and rename parameter of function sum - if 1 > 1 { - return double(0, sum3 + sum3); - } else { - return sum3 + sum3; - } -} - -===> {evaluate 1 > 1 to false and simplify if statement} - -function main(x: u32) -> u32 { - let sum1 = x; - let sum2 = sum1 + sum1; - let sum3 = sum2 + sum2; - return sum3 + sum3; -} -``` - -This is a slightly more complex example -```js -function double(const count: u32, sum: u32) -> u32 { - if count > 1 && sum < 30 { - return double(count - 1, sum + sum); - } else { - return sum + sum; - } -} -function main(x: u32) -> u32 { - return double(3, x); -} - -===> {inline call double(3, x)} - -function main(x: u32) -> u32 { - let sum1 = x; // bind and rename parameter of function sum - if 3 > 1 && sum1 < 30 { - return double(2, sum1 + sum1); - } else { - return sum1 + sum1; - } -} - -===> {evaluate 3 > 1 to true and simplify if test} - -function main(x: u32) -> u32 { - let sum1 = x; - if sum1 < 30 { - return double(2, sum1 + sum1); - } else { - return sum1 + sum1; - } -} - -===> {inline call double(2, sum1 + sum1)} - -function main(x: u32) -> u32 { - let sum1 = x; - if sum1 < 30 { - let sum2 = sum1 + sum1; // bind and rename parameter of function sum - if 2 > 1 && sum2 < 30 { - return double(1, sum2 + sum2); - } else { - return sum2 + sum2; - } - } else { - return sum1 + sum1; - } -} - -===> {evaluate 2 > 1 to true and simplify if test} - -function main(x: u32) -> u32 { - let sum1 = x; - if sum1 < 30 { - let sum2 = sum1 + sum1; - if sum2 < 30 { - return double(1, sum2 + sum2); - } else { - return sum2 + sum2; - } - } else { - return sum1 + sum1; - } -} - -===> {inline call double(1, sum2 + sum2)} - -function main(x: u32) -> u32 { - let sum1 = x; - if sum1 < 30 { - let sum2 = sum1 + sum1; - if sum2 < 30 { - let sum3 = sum2 + sum2; // bind and rename parameter of function sum - if 1 > 1 && sum3 < 30 { - return double(0, sum3 + sum3); - } else { - return sum3 + sum3; - } - } else { - return sum2 + sum2; - } - } else { - return sum1 + sum1; - } -} - -===> {evaluate 1 > 1 to false and simplify if statement} - -function main(x: u32) -> u32 { - let sum1 = x; - if sum1 < 30 { - let sum2 = sum1 + sum1; - if sum2 < 30 { - let sum3 = sum2 + sum2; - return sum3 + sum3; - } else { - return sum2 + sum2; - } - } else { - return sum1 + sum1; - } -} -``` - -But here is an example in which the inlining does not terminate: -```js -function forever(const n: u32) { - forever(n); -} -function main() { - forever(5); -} - -===> {inline call forever(5)} - -function main() { - forever(5); -} - -===> {inline call forever(5)} - -... -``` - -## Design - -### Configurable Limit - -Our proposed approach to avoid non-termination -when inlining recursive functions is to -(i) keep track of the depth of the inlining call stack and -(ii) stop when a certain limit is reached. -If the limit is reached, -the compiler will provide an informative message to the user, -explaining which recursive calls caused the limit to be reached. -The limit is configurable by the user. -In particular, based on the informative message described above, -the user may decide to re-attempt compilation with a higher limit. - -Both variants of the `double` example given earlier reach depth 3 -(if we start with depth 0 at `main`). - -The default limit (i.e. when the user does not specify one) -should be chosen in a way that -the compiler does not take too long to reach the limit. -Since inlining larger functions -takes more time than inlining smaller functions, -we may consider adjusting the default limit -based on some measure of the complexity of the functions. - -In any case, compiler responsiveness is a broader topic. -As the Leo compiler sometimes performs expensive computations, -it may be important that it provide periodic output to the user, -to reassure them that the compiler is not stuck. - -We will add a flag to the `leo` CLI whose long form is -``` ---inline-limit -``` -and whose short form is -``` --il -``` -This option is followed by a number (more precisley, a positive integer) -that specifies the limit to the depth of the inlining stack. - -The name of this option has been chosen -according to a `--...-limit` template -that may be used to specify other kinds of limits, -as discussed later. - -In Aleo Studio, this compiler option is presumably specified -via GUI preferences and build configurations. - -### Circularity Detection - -Besides the depth of the inlining call stack, -the compiler could also keep track of -the values of the `const` arguments at each recursive call. -If the same argument values are encountered twice, -they indicate a circularity -(see the discussion on termination analysis below): -in that case, there is no need to continue inlining until the limit is reached, -and the compiler can show to the user the trace of circular calls. - -This approach would readily reject the `forever` example given earlier. - -## Drawbacks - -This proposal does not appear to bring any real drawbacks, -other than making the compiler inevitably more complex. -But the benefits to support recursion justifies the extra complexity. - -## Effect on Ecosystem - -This proposal does not appear to have any direct effects on the ecosystem. -It simply enables certain Leo programs to be written in a more natural style. - -## Alternatives - -An alternative approach is to treat recursion analogously to loops. -That is, we could restrict the forms of allowed recursion -to ones whose inlining is known to terminate at compile time. - -However, the configurable limit approach seems more flexible. -It does not even preclude a termination analysis (discussed below). -Furthermore, in practical terms, -non-termination is not much different from excessively long computation. -and the configurable limit approach may be uniformly suitable -to avoid both non-termination and excessively long computation (discussed below). - -## Future Extensions - -### Termination Analysis - -In general, a recursive function -(a generic kind of function, not necessarily a Leo function) -terminates when -there exists a _measure_ of its arguments -that decreases at each recursive call, -under the tests that govern the recursive call, -according to a _well-founded relation_. -This is well known in theorem proving, -where termination of recursive functions -is needed for logical consistency. - -For example, the mathematical factorial function -on the natural numbers (i.e. non-negative integers) -``` -n! =def= [IF n = 0 THEN 1 ELSE n * (n-1)!] -``` -terminates because, if `n` is not 0, we have that `n-1 < n`, -and `<` is well-founded on the natural numbers; -in this example, the measure of the argument is the argument itself. -(A relation is well-founded when -it has no infinite strictly decreasing sequence; -note that, in the factorial example, -we are considering the `<` relation on natural numbers only, -not on all the integers). - -This property is undecidable in general, -but there are many cases in which termination can be proved automatically, -as routinely done in theorem provers. - -In Leo, we are interested in -the termination of the inlining transformations. -Therefore, the termination condition above -must involve the `const` parameters of recursive functions: -a recursive inlining in Leo terminates when -there exists a measure of the `const` arguments -that decreases at each recursive call, -under the tests that govern the recursive call, -according to a well-founded relation. -The governing test must necessarily involve the `const` parameters, -but they may involve variable parameters as well, -as one of the `double` examples given earlier shows. - -We could have the Leo compiler attempt to recognize -recursive functions whose `const` parameters -satisfy the termination condition given above. -(This does not have to involve any proof search; -the compiler could just recognize structures -for which a known proof can be readily instantiated.) -If the recognition succeed, -we know that the recursion inlining will terminate, -and also possibly in how many steps, -enabling the information to be presented to the user -in case the configurable limit is insufficient. -If the recognition fails, -the compiler falls back to inlining until -either inlining terminates or the limit is reached. - -### Application to Loops - -Loops are conceptually not different from recursion. -Loops and recursion are two ways to repeat computations, -and it is well-known that each can emulate the other in various ways. - -Currenly Leo restricts the allowed loop statements -to a form whose unrolling always terminates at compile time. -If we were to use a similar approach for recursion, -we would only allow certain forms of recursion -whose inlining always terminates at compile time -(see the discussion above about termination analysis). - -Turning things around, -we could consider allowing general forms of loops (e.g. `while` loops) -and use a configurable limit to unroll loops. -We could also detect circularities -(when the values of the local constants of the loop repeat themselves). -We could also perform a termination analysis on loops, -which in particular would readily recognize -the currently allowed loop forms to terminate. -All of this should be the topic of a separate RFC. - -### Application to Potentially Slow Transformations - -Some flattening transformations in the Leo compiler are known to terminate, -but they may take an excessively long time to do so. -Examples include decomposing large arrays into their elements -or decomposing large integers (e.g. of type `u128`) into their bits. -Long compilation times have been observed for cases like these. - -Thus, we could consider using configurable limits -not only for flattening transformations that may not otherwise terminate, -but also for ones that may take a long time to do so. -This is a broader topic that should be discussed in a separate RFC. diff --git a/docs/rfc/003-imports-stabilization.md b/docs/rfc/003-imports-stabilization.md deleted file mode 100644 index 36ee00c75b..0000000000 --- a/docs/rfc/003-imports-stabilization.md +++ /dev/null @@ -1,238 +0,0 @@ -# Leo RFC 003: Imports Stabilization - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -## Summary - -This proposal aims to improve the import management system in Leo programs to -make the program environment more reproducible, predictable and compatible. To achieve -that we suggest a few changes to the Leo CLI and Manifest: - -- add a "dependencies" section to the Leo Manifest and add a command to pull those dependencies; -- allow custom names for imports to manually resolve name conflicts; -- add "curve" and "proving system" sections to the Manifest; -- add "include" and "exclude" parameters for "proving system" and "curve"; -- add a lock file to store imported dependencies and their relations; - -## Motivation - -The current design of imports does not provide any guarantees on what's stored -in program imports and published with the program to Aleo Package Manager. -When a dependency is "added," it is stored inside the imports folder, and it is possible -to manually edit and/or add packages in this folder. - -Also, imports are stored under the package name, which makes it impossible to import -two different packages with the same name. - -Another important detail in the scope of this proposal is that, in the future, Leo -programs will have the ability to run with different proving systems -and curves, possibly creating incompatibility between programs written -for different proving systems or curves. To make a foundation for these features, -imports need to be managed with include/exclude lists for allowed (compatible) -proving systems and curves. - -## Background - -Leo supports packages and importing, similarly to other languages. - -A Leo _project_ consists of a `main.leo` file -and zero or more additional `.leo` files; -the latter may be organized in subdirectories. -The content of each file is as defined by `file` in the ABNF grammar: -it consists of types, functions, etc. -Each non-main file forms a Leo _package_ (because it packages the types, functions, etc. that it defines), -which can be referenced via its path, without the `.leo` extension and with `/` replaced by `.` -(restrictions on the file and directory names derive from the definition of `package-name` in the ABNF grammar, -and ensure that the package names can be resolved to file system paths). -A Leo project also forms a package, which can be published in the Aleo Package Manager (APM), -a repository of Aleo packages, similar to `crates.io` in Rust. - -The files in a Leo project can import entities (types, functions, etc.) from -not only local packages (i.e. other files in the same project), -but also packages in the APM. -This RFC is focused on importing packages from the APM. - -Each package in the APM is uniquely identified by: -* The author, who must have a registered account on the APM, with a unique username. -* The package name, which is unique within each author's account. -* The package version, which allows different versions of the same package to be treated like different packages. - -## Design - -### Leo Manifest - target section - -To lay the foundation for the future of the Leo ecosystem and to start integrating -information about programs compatibility, we suggest adding two new fields to -the new `[target]` section of the Leo Manifest: `proving_system` and `curve`. - -Currently, the Leo compiler only supports `Groth16` for the proving system and `Bls12_377` -for the curve, which are meant to be default values in Leo Manifest. - -```toml -[project] -name = "first" -version = "0.1.0" -description = "The first package" -license = "MIT" - -[target] -curve = "Bls12_377" -proving_system = "Groth16" -``` - -These fields are meant to be used to determine whether the imported program is -compatible to the original when support for different curves and proving systems -is added. - -### Leo Manifest - dependencies - -Dependencies section: - -```toml -[dependencies] -name = { author = "author", package = "package", version = "version" } - -# alternative way of adding dependency record -[dependencies.name] -author = "author" -package = "package" -version = "1.0" -``` - -#### Parameters description - -The `name` field sets the name of the dependency in Leo code. That way we allow -developer to resolve collisions in import names manually. So, for example, -if a developer is adding the `howard/silly-sudoku` package to his program, he -might define its in-code name as `sudoku` and import it with that name: - -```ts -import sudoku; -``` - -`package`, `author` and `version` are package name, package author and -version respectively. They are already used as arguments in `leo add` -command, so these fields are already understood by the Leo developers. - -### Leo CLI - -To support updating the Manifest, a new command should be added to Leo CLI. - -```bash -# pull imports -leo fetch -``` - -### Imports Restructurization - -One of the goals of the proposed changes is to allow importing packages with the -same name but different authors. This has to be solved not only on the -language level but also on the level of storing program imports. - -We suggest using the set of all 3 possible program identifiers for the import -folder name: `author-package@version`. Later it can be extended to -include a hash for the version, but having the inital set already solves name -collisions. - -So, the updated imports would look like: - -``` -leo-program -├── Leo.toml -├── README.md -├── imports -│ ├── author1-program@0.1.0 -│ │ └── ... -│ ├── author2-program2@1.0.4 -│ └── ... -├── inputs -│ └── ... -└── src - └── main.leo -``` - -This change also affects the way imports are being processed on the ASG -level, and we need to add an imports map as an argument to the Leo compiler. -The Leo Manifest's dependencies sections needs to be parsed and passed as -a hashmap to the compiler: - -``` -first-program => author1-program@0.1.0 -second-program => author2-program2@1.0.4 -``` - -### Leo.lock - -For the imports map to be generated and read by the Leo binary and then by the Leo compiler, -a lock file needs to be created. The lock file should be generated by the `leo fetch` command, -which will pull the dependencies, process their manifests, and put the required information -to the file in the root directory of the program called `Leo.lock`. - -The suggested structure of this file is similar to the Cargo.lock file: - -``` -[[package]] -name = "suit-mk2" -version = "0.2.0" -author = "ironman" -import_name = "suit-mk2" - -[package.dependencies] -garbage = "ironman-suit@0.1.0" - -[[package]] -name = "suit" -version = "0.1.0" -author = "ironman" -import_name = "garbage" -``` - -In the example above, you can see that all program dependencies are defined as an -array called `package`. Each of the dependencies contains main information about -it, including the `import_name` field which is the imported package's name in -the Leo program. Also, it stores relationships between these dependencies in the -field `dependencies`. - -The format described here allows the Leo binary to form an imports map which can be -passed to the compiler. - -It is important to note that the Leo.lock file is created only when a package has dependencies. -For programs with no dependencies, a lock file is not required and not created. - -### Recursive Dependencies - -This improvement introduces recursive dependencies. To solve this case preemptively -Leo CLI needs to check the dependency tree and throw an error when a recursive dependency -is met. We suggest implementing simple dependency tree checking while fetching -imports - if imported dependency is met on a higher level - abort the execution. - -Later this solution can be improved by building a lock file containing all the -information on program dependencies, and the file itself will have enough data -to track and prevent recursion. - -## Drawbacks - -This change might require the update of already published programs on Aleo PM due to -Leo Manifest change. However it is possible to implement it in a backward-compatible -way. - -It also introduces the danger of having recursive dependencies, but this problem is addressed in the Design section above. - -## Effect on Ecosystem - -The proposed improvement provides safety inside Leo programs and should not affect -the ecosystem except for the tools which use Leo directly (such as Aleo Studio). - -It is possible that some of the proposed features will open new features on Aleo PM. - -## Alternatives - -Another approach to the stated cases is to keep everything as we have now but change -the way programs are imported and stored and make names unique. Also, the current -implementation provides some guarantees on import stablitity and consistency. diff --git a/docs/rfc/004-integer-type-casts.md b/docs/rfc/004-integer-type-casts.md deleted file mode 100644 index 3e40489322..0000000000 --- a/docs/rfc/004-integer-type-casts.md +++ /dev/null @@ -1,223 +0,0 @@ -# Leo RFC 004: Integer Type Casts - -## Authors - -The Aleo Team. - -## Status - -FINAL - -## Summary - -This proposal provides support for casts among integer types in Leo. -The syntax is similar to Rust. -The semantics is _value-preserving_, -i.e. the casts just serve to change types -but cause errors when the mathematical values are not representable in the new types. - -## Motivation - -Currently the Leo integer types are "siloed": -arithmetic integer operations require operands of the same type -and return results of the same type. -There are no implicit or explicit ways to turn, for example, -a `u8` into a `u16`, even though -every non-negative integer that fits in 8 bits also fits in 16 bits. -However, the ability to convert values between different (integer) types -is a useful feature that is normally found in programming languages. - -## Background - -Leo supports the following _integer types_: -``` -u8 u16 u32 u64 u128 -i8 i16 i32 i64 i128 -``` - -Those are for unsigned and signed integers of 8, 16, 32, 64, and 128 bits. - -## Design - -### Scope - -This RFC proposes type casts between any two integer types, -but not between two non-integer types -or between an integer type and a non-integer type. - -This RFC does not propose any implicit cast, -even widening casts (i.e. upcasts) -from a type to another type with the same signedness -and with the same or larger size -(e.g. from `u8` to `u16`). -All the type casts must be explicit. - -### Syntax and Static Semantics - -The proposed syntax is -``` - as -``` -where `` must have an integer type. - -The ABNF grammar of Leo is modified as follows: -``` -; add this rule: -cast-expression = unary-expression - / cast-expression %s"as" integer-type - -; modify this rule: -exponential-expression = cast-expression - / cast-expression "**" exponential-expression -``` -There is no need to modify the `keyword` rule -because it already includes `as` as one of the keywords. -Note the use of `integer-type` in the `cast-expression` rule. - -The above grammar rules imply that casts bind -tighter than binary operators and looser than unary operators. -For instance, -``` -x + - y as u8 -``` -is like -``` -x + ((- y) as u8) -``` -This precedence is the same as in Rust: -see [here](https://doc.rust-lang.org/stable/reference/expressions.html#expression-precedence). - -### Dynamic Semantics - -When the mathematical integer value of the expression -is representable in the type that the expression is cast to, -there is no question that the cast must succeed -and merely change the type of the Leo value, -but not its mathematical integer value. -This is always the case when the cast is to a type -with the same signedness and with the same or larger size. -This is also the case when -the cast is to a type whose range does not cover the range of the source type -but the value in question is in the intersection of the two ranges. - -When the mathematical integer value of the expression -is not representable in the type that the expression is cast to, -there are two possible approaches: -_value-preserving casts_, -which just serve to change types -but cause errors when values are not representable in the new types; -and _values-changing casts_, -which never cause errors but may change the mathematical values. - -This RFC proposes value-preserving casts; -value-changing casts are discussed in the 'Alternatives' section, -for completeness. - -With value-preserving casts, -when the mathematical integer value of the expression -is not representable in the type that the expression is cast to, -it is an error. -That is, we require casts to always preserve the mathematical integer values. -Recall that all inputs are known at compile time in Leo, -so these checks can be performed easily. - -Thus integer casts only serve to change types, never values. -When values are to be changed, separate (built-in) functions can be used, -e.g. to mask bits and achieve the same effect as -the value-changing casts discussed below. - -This approach is consistent with Leo's treatment of potentially erroneous situations like integer overflows. -The principle is that developers should explicitly use -operations that may overflow if that is their intention, -rather than having those situation possibly occur unexpectedly. - -A value-preserving cast to a type -whose range does not cover the original type's range -implicitly expresses a developer expectation that the value -is actually in the intersection of the two types' ranges, -in the same way that the use of integer addition -implicitly expresses the expectation that the addition does not overflow. - -Consider this somewhat abstract example: -``` -... // some computations on u32 values, which could not be done with u16 -let r: u32 = ...; // this is the final result of the u32 operations above -let s: u16 = r as u16; // but r is expected to fit in u16, so we cast it here -``` -With value-preserving casts, the expectation mentioned above -is checked by the Leo compiler during proof generation, -in the same way as with integer overflow. - -In the example above, -if instead the variable `s` is meant to contain the low 16 bits of `r`, -e.g. in a cryptographic computation, -then the value-preserving cast should be preceded by -an explicit operation to obtain the low 16 bits, making the intent clear: -``` -... // some computations on u32 values, which could not be done with u16 -let r: u32 = ...; // this is the final result of the u32 operations above -let r_low16: u32 = r & 0xFFFF; // assuming we have bitwise ops and hex literals -let s: u16 = r_low16 as u16; // no value change here -``` - -### Compilation to R1CS - -It may be more efficient (in terms of number of R1CS constraints) -to compile Leo casts as if they had a value-changing semantics. -If the R1CS constraints represent Leo integers as bits, -the bits of the new value can be determined from the bits of the old value, -with additional zero or sign extension bits when needed -(see the details of the value-changing semantics in the 'Alternatives' section). -There is no need to add checks to the R1CS constraints -because the compiler ensures that the cast values do not actually change given the known inputs, -and therefore the value-changing and value-preserving semantics are equivalent on the known inputs. -The idea is that the R1CS constraints can have a "don't care" behavior on inputs that cause errors in Leo. - -## Drawbacks - -This proposal does not appear to bring any drawbacks, -other than making the language and compiler inevitably more complex. -But the benefits to support type casts justifies the extra complexity. - -## Effect on Ecosystem - -This proposal does not appear to have any direct effects on the ecosystem. - -## Alternatives - -As mentioned above, an alternative semantics for casts is value-changing: - 1. `uN` to `uM` with `N < M`: just change type of value. - 2. `uN` to `uM` with `N > M`: take low `M` bits of value. - 3. `iN` to `iM` with `N < M`: just change type of value. - 4. `iN` to `iM` with `N > M`: take low `M` bits of value. - 5. `uN` to `iM` with `N < M`: zero-extend to `M` bits and re-interpret as signed. - 6. `uN` to `iM` with `N > M`: take low `M` bits and re-interpret as signed. - 7. `uN` to `iN`: re-interpret as signed - 8. `iN` to `uM` with `N < M`: sign-extend to `M` bits and re-interpret as unsigned. - 9. `iN` to `uM` with `N > M`: take low `M` bits and re-interpret as unsigned. -10. `iN` to `uN`: re-interpret as unsigned -Except for the 1st and 3rd cases, the value may change. - -This value-changing approach is common in other programming languages. -However, it should be noted that other programming languages -typically do not check for overflow in integer operations either -(at least, not for production code). -Presumably, the behavior of type casts in those programming languages -is motivated by efficiency of execution, at least in part. -Since in Leo the input data is available at compile time, -considerations that apply to typical programming languages -do not necessarily apply to Leo. - -Back to the somewhat abstract example in the section on value-preserving casts, -note that, with value-changing casts, the expectation that the final result fits in `u16` -would have to be checked with explicit code: -``` -... // some computations on u32 values, which could not be done with u16 -let r: u32 = ...; // this is the final result of the u32 operations above -if (r > 0xFFFF) { - ... // error -} -let s: u16 = r as u16; // could change value in principle, but does not here -``` -However, it would be easy for a developer to neglect to add the checking code, -and thus have the Leo code silently produce an unexpected result. diff --git a/docs/rfc/005-countdown-loops.md b/docs/rfc/005-countdown-loops.md deleted file mode 100644 index ab76016ec5..0000000000 --- a/docs/rfc/005-countdown-loops.md +++ /dev/null @@ -1,285 +0,0 @@ -# Leo RFC 005: Countdown Loops - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -# Summary - -This proposal suggests adding countdown loops and inclusive loop ranges into the Leo language. - -# Motivation - -In the current design of the language only incremental ranges are allowed. Though -in some cases there's a need for loops going in the reverse direction. These examples -demonstrate the shaker sort and bubble sort algorithms where countdown loops are mocked: - -```ts -function shaker_sort(a: [u32; 10], const rounds: u32) -> [u32; 10] { - for k in 0..rounds { - for i in 0..9 { - if a[i] > a[i + 1] { - let tmp = a[i]; - a[i] = a[i + 1]; - a[i + 1] = tmp; - } - - } - for j in 0..9 { // j goes from 0 to 8 - let i = 8 - j; // j is flipped - if a[i] > a[i + 1] { - let tmp = a[i]; - a[i] = a[i + 1]; - a[i + 1] = tmp; - } - } - } - return a; -} -``` - -```ts -function bubble_sort(a: [u32; 10]) -> [u32; 10] { - for i in 0..9 { // i counts up - for j in 0..9-i { // i is flipped - if (a[j] > a[j+1]) { - let tmp = a[j]; - a[j] = a[j+1]; - a[j+1] = tmp; - } - } - } - return a -} -``` - -Having a countdown loop in the examples above could improve readability and -usability of the language by making it more natural to the developer. - -However, if we imagined this example using a countdown loop, we would see that -it wouldn't be possible to count to 0; because the first bound of the range is -inclusive and the second is exclusive, and loops ranges must use only unsigned integers. - -```ts -// loop goes 0,1,2,3,4,5,6,7,8 -for i in 0..9 { /* ... */ } - -// loop goes 9,8,7,6,5,4,3,2,1 -for i in 9..0 { /* ... */ } -``` - -Hence direct implementation of the coundown loop ranges would create asymmetry (1) -and would not allow loops to count down to 0 (2). To implement coundown loops and -solve these two problems we suggest adding an inclusive range bounds. - -# Design - -## Coundown loops - -Countdown ranges do not need any changes to the existing syntax. However their -functionality needs to be implemented in the compiler. - -```ts -for i in 5..0 {} -``` - -## Inclusive ranges - -To solve loop asymmetry and to improve loop ranges in general we suggest adding -inclusive range operator to Leo. Inclusive range would extend the second bound -of the loop making it inclusive (instead of default - exclusive) -therefore allowing countdown loops to reach 0 value. - -```ts -// default loop: 0,1,2,3,4 -for i in 0..5 {} - -// inclusive range: 0,1,2,3,4,5 -for i in 0..=5 {} -``` - -## Step and Direction - -We remark that the step of both counting-up and counting-down loops is implicitly 1; -that is, the loop variable is incremented or decremented by 1. - -Whether the loop counts up or down is determined by how the starting and ending bounds compare. -Note that the bounds are not necessarily literals; -they may be more complex `const` expressions, and thus in general their values are resolved at code flattening time. -Because of the type restrictions on bounds, their values are always non-negative integers. -If `S` is the integer value of the starting bound and `E` is the integer value of the ending bound, -there are several cases to consider: -1. If `S == E` and the ending bound is exclusive, there is no actual loop; the range is empty. -2. If `S == E` and the ending bound is inclusive, the loop consists of just one iteration; the loop counts neither up nor down. -3. If `S < E` and the ending bound is exclusive, the loop counts up, from `S` to `E-1`. -4. If `S < E` and the ending bound is inclusive, the loop counts up, from `S` to `E`. -5. If `S > E` and the ending bound is exclusive, the loop counts down, from `S` to `E+1`. -6. If `S > E` and the ending bound is inclusive, the loop counts down, from `S` to `E`. - -Cases 3 and 5 consist of one or more iterations; cases 4 and 6 consist of two or more iterations. - -## Examples - -The code examples demostrated in the Motivation part of this document -could be extended (or simplified) with the suggested syntax: - -```ts -function shaker_sort(a: [u32; 10], const rounds: u32) -> [u32; 10] { - for k in 0..rounds { - for i in 0..9 { // i goes from 0 to 8 - if a[i] > a[i + 1] { - let tmp = a[i]; - a[i] = a[i + 1]; - a[i + 1] = tmp; - } - - } - for i in 8..=0 { // i goes from 8 to 0 - if a[i] > a[i + 1] { - let tmp = a[i]; - a[i] = a[i + 1]; - a[i + 1] = tmp; - } - } - } - return a; -} -``` - -```ts -function bubble_sort(a: [u32; 10]) -> [u32; 10] { - for i in 9..0 { // counts down - for j in 0..i { // no flipping - if (a[j] > a[j+1]) { - let tmp = a[j]; - a[j] = a[j+1]; - a[j+1] = tmp; - } - } - } - return a -} -``` - -# Drawbacks - -No obvious drawback. - -# Effect on Ecosystem - -Suggested change should have no effect on ecosystem because of its backward compatibility. - -# Alternatives - -## Mocking - -Coundown loops can be mocked manually. - -## Exclusive Starting Bounds - -While the ability to designate the ending bound of a loop as either exclusive or inclusive is critical as discussed below, -we could also consider adding the ability to designate the starting bound of a loop as either exclusive or inclusive. -If we do that, we run into a sort of asymmetry in the defaults for starting and ending bounds: -the default for the starting bound is inclusive, while the default for ending bounds is exclusive. - -The most symmetric but verbose approach is exemplified as follows: -* `0=..=5` for `0 1 2 3 4 5` -* `0<..=5` for `1 2 3 4 5` -* `0=..<5` for `0 1 2 3 4` -* `0<..<5` for `1 2 3 4` -* `5=..=0` for `5 4 3 2 1 0` -* `5>..=0` for `4 3 2 1 0` -* `5=..>0` for `5 4 3 2 1` -* `5>..>0` for `4 3 2 1` -That is, this approach makes exclusivensss and inclusiveness implicit. -The use of `<` vs. `>` also indicates a loop direction, which can be inferred anyhow when the `const` bounds are resolved, -so that would entail an additional consistency check, -namely that the inequality sign/signs is/are consistent with the inferred loop direction. - -Within the symmetric approach above, there are different options for defaults. -The most symmetric default would be perhaps `=` for both bounds, -but that would be a different behavior from current Leo. -We could instead go for different defaults for starting and ending bounds, -i.e. `=` for the starting bound and `<` or `>` (depending on direction) for the ending bound. - -A drawback of this approach is that it is somewhat verbose. -Furthermore, some of the authors of this RFC do not find it very readable. - -## Flipping Bound Defaults for Countdown - -In the proposed design, there is an asymmetry between the treatment of loops that count up vs. down. -This can be seen clearly by thinking how to iterate through an array of size `N`: -```ts -for i in 0..n { ... a[i] ... } // count up -- 0 1 2 ... n-1 -for i in n-1..=0 { ... a[i] ... } // count down -- n-1 ... 2 1 0 -``` -While the loop that counts up has nice and simple bounds `0` and `n`, -the loop that counts down needs `n-1` and `=0`. - -So a possible idea is to use different defaults depending on the loop direction: -* For a loop that counts up: - * The starting (i.e. lower) bound is always inclusive. - * The ending (i.e. upper) bound is exclusive by default, inclusive with `=`. -* For loop that counts down: - * The ending (i.e. lower) bound is always inclusive. - * The starting (i.e. upper) bound is exclusive by default, inclusive with `=`. - -That is, different defaults apply to lower vs. upper bound, rather than to starting and ending bounds. - -Things become more symmetric in a way: -```ts -for i in 0..n { ... a[i] ... } // count up -- 0 1 2 ... n-1 -for i in n..0 { ... a[i] ... } // count down -- n-1 ... 2 1 0 -``` - -This is also consistent with Rust in a way, -where countdown loops are obtained by reversing the increasing range into a decreasing range, which flips the bounds. - -However, if we consider a possible extension in which the step may be larger than 1, we run into some awkwardness. -Imagine an extension in which `step` is specified: -```ts -for i in 10..0 step 2 ... // i = 8 6 4 2 0 -- starts at 10-2 = 8 -for i in 10..0 step 3 ... // i = 9 6 3 0 -- starts at 10-1 = 9 -``` - -Note how the actual starting index does not depend on starting/upper bound and step, -but rather on ending/lower bound and step, and must be calculated explicitly; -it doesn't "jump" at the reader. - -## Explicit Indication of Loop Direction - -Another idea that was brought up is to always write the range as `..`, -but include an explicit indication when the loop must count down, e.g. -```ts -for i in 0..n down { ... array[i] ... } // where 'down' indicates count down -``` - -The advantages are that -we retain the default that the first/lower bound is inclusive and the second/upper bound is exclusive, -and the direction is explicit and does not have to be inferred. -The direction matches starting/ending bound to lower/upper bound or upper/lower bound. - -But the awkwardness with larger steps than 1 remains: -```ts -for i in 0..10 down step 2 ... // i = 8 6 4 2 0 -- starts at 10-2 = 8 -for i in 0..10 down step 3 ... // i = 9 6 3 0 -- starts at 10-1 = 9 -``` - -## Variable in the Middle of Range with Equalities or Inequalities - -Another approach is to put the variable in the middle of the range, -along with equality or inequality signs around the variable, e.g. -```ts - for 0 <= i < 5 // 0 1 2 3 4 - for 0 <= i <= 5 // 0 1 2 3 4 5 - for 5 > i >= 0 // 4 3 2 1 0 -``` - -This maximizes explicitness, but it may need tweaking to avoid parsing ambiguities or difficulties -(recall that the bounds may be complex `const` expressions). - -This could be a future addition to consider, but it seems that it would not replace the current Rust-like syntax. diff --git a/docs/rfc/006-arrays-without-size.md b/docs/rfc/006-arrays-without-size.md deleted file mode 100644 index 002f057f03..0000000000 --- a/docs/rfc/006-arrays-without-size.md +++ /dev/null @@ -1,155 +0,0 @@ -# Leo RFC 006: Array Types with Unspecified Size - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -# Summary - -This RFC proposes the addition, at the user level, of array types with unspecified size, -of the form `[T, _]`, where `T` is the element type and the underscore stands for an unspecified size. -It must be possible to infer the size at compile time. - -When these types are used in a function, -different calls of the function (which are inlined) may resolve the sizes of these types to different values. - -To make this extension more useful, this RFC also proposes the addition of -an operator to return the length of an array, whose result is resolved at compile time. - -# Motivation - -The initial motivation was the ability to have a type `string` for Leo strings, -which are currently represented as character arrays, -therefore requiring a size indication, i.e. `[char; ]`. -Allowing a `[char; _]` type, where `_` stands for an unspecified size, -makes it possible to define a type alias `string` for it, -once we also have an (orthogonal) extension of Leo to support type aliases. - -However, allowing `[T; _]` for any `T` (not just `char`) is a more generally useful feature. -This kind of types is already used internally in the Leo compiler. -Allowing their use externally should provide additional convenience to the user. -Some examples are shown in the 'Design' section. - -# Design - -User-facing array types currently have a specified size, indicated as a positive integer according to the grammar and static semantics -(the grammar allows 0, via the `natural` rule, but the static semantics checks that the natural is not 0). -Internally, the Leo compiler uses array types with unspecified size in some cases, e.g. when taking a slice with non-literal bounds. -These internal unspecified sizes must be resolved at compile time (by evaluating the constant bound expressions), in order for compilation to succeed. - -This RFC proposes to make array types with unspecified size available at the user level, -with the same requirement that their sizes must be resolved in order for compilation to succeed. - -The ABNF grammar changes as follows: -``` -; new rule: -array-dimension = natural / "_" - -; modified rule: -array-dimensions = array-dimension - / "(" array-dimension *( "," array-dimension ) ")" -``` -That is, an array dimension may be unspecified; this is also the case for multidimensional array types. - -Note that `array-dimension` is also referenced in this rule of the ABNF grammar: -``` -; existing rule: -array-repeat-construction = "[" expression ";" array-dimensions "]" -``` -The compiler will enforce, post-parsing, that array dimensions in array repeat expressions are positive integers, i.e. non-zero naturals. -This will be part of the static semantics of Leo. - -Array types may appear, either directly or within other types, in the following constructs: -- Constant declarations, global or local to functions. -- Variable declarations, local to functions. -- Function inputs. -- Function outputs. -- Member variable declarations. - -Thus, those are also the places where array types with unspecified size may occur. - -An array type with unspecified size that occurs in a global constant declaration must be resolved to a unique size. -On the other hand, an array type with unspecified size that occurs in a function -(whether a variable declaration, function input, or function output) -could be resolved to different sizes for different inlined calls of the function. -Finally, there seems to be no point in allowing array types of unspecified sizes in member variable declarations: -the circuit type must be completely known, including the types of its member variables; -therefore, this RFC prescribes that array types with unspecified size be disallowed in member variable declarations. -(This may be revisited if a good use case, and procedure for resolution, comes up.) - -## Examples - -In the following example, the array type with unspecified size obviates the need to explicate the size (3), -since it can be resolved by the compiler: -``` -let x: [u8; _] = [1, 2, 3]; -``` -Currently it is possible to omit the type of `x` altogether of course, -but then at least one of the elements must have a type suffix, e.g. `1u8`. - -Using an array type of unspecified size for a function input makes the function generic over the size: -``` -function f(x: [u8; _]) ... -``` -That is, `f` can take an array of `u8` of any size, and perform some generic computation on it, -because different inlined calls of `f` may resolve the size to different values (at compile time). -But this brings up the issue discussed below. - -## Array Size Operator - -Currently Leo has no array size operator, which makes sense because arrays have known sizes. -However, if we allow array types with unspecified size as explained above, -we may also need to extend Leo with an array size operator. - -However, consider a function `f` as above, which takes as input an array of `u8` of unspecified size. -In order to do something with the array, e.g. add all its elements and return the sum, -`f` should be able to access the size of the array. - -Thus, this RFC also proposed to extend Leo with such an operator. -A possibility is `.length`, where `` is an expression of array type. -A variation is `.len()`, if we want it look more like a built-in method on arrays. -Yet another option is `length()`, which is more like a built-in function. -A shorter name could be `len`, leading to the three possibilities -`.len`, `.len()`, and `len()`. -So one dimension of the choice is the name (`length` vs. `len`), -and another dimension is the style: -member variable style, -member function style, -or global function style. -The decision on the latter should be driven by broader considerations -of how we want to treat this kind of built-in operators. - -Note that the result of this operator can, and in fact must, be calculated at compile time; -not as part of the Leo interpreter, but rather as part of the flattening of Leo to R1CS. -In other words, this is really a compile-time operator, akin to `sizeof` in C. - -With that operator, the following function can be written: -``` -function f(x: [u8; _]) -> u8 { - let sum = 0u8; - for i in 0..length(x) { - sum += x[i]; - } - return sum; -} -``` - -# Drawbacks - -None, aside from inevitably making the language and compiler slightly more complex. - -# Effect on Ecosystem - -None. - -# Alternatives - -None. - -# Implementation Decisions - -For the length of an array, we decided on `.len()`, where `` is any expression of array type. diff --git a/docs/rfc/007-type-aliases.md b/docs/rfc/007-type-aliases.md deleted file mode 100644 index 514a8d8323..0000000000 --- a/docs/rfc/007-type-aliases.md +++ /dev/null @@ -1,206 +0,0 @@ -# Leo RFC 007: Type Aliases - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -# Summary - -This RFC proposes the addition of type aliases to Leo, -i.e. identifiers that abbreviate types and can be used wherever the latter can be used. -A new top-level construct is proposed to define type aliases; no circularities are allowed. -Type aliases are expanded away during compilation. - -# Motivation - -Many programming languages provide the ability to create aliases (i.e. synonyms) of types, such as C's `typedef`. -The purpose may be to abbreviate a longer type, -such as an alias `matrix` for `[i32; (3, 3)]` in an application in which 3x3 matrices of 32-bit integers are relevant -(e.g. for 3-D rotations, even though fractional numbers may be more realistic). -The purpose may also be to clarify the intent and use of an existing type, -such as an alias `balance` for `u64` in an application that keeps track of balances. - -The initial motivation that inspired this RFC (along with other RFCs) -was the ability to have a type `string` for strings. -Strings are arrays of characters according to RFC 001. -With the array types of unspecified size proposed in RFC 006, -`[char; _]` becomes a generic type for strings, which is desirable to alias with `string`. - -# Design - -## Syntax - -The ABNF grammar changes as follows: -``` -; modified rule: -keyword = ... - / %s"string" - / %s"type" ; new - / %s"u8" - / ... - -; new rule: -type-alias-declaration = %s"type" identifier "=" type ";" - -; modified rule: -declaration = import-declaration - / function-declaration - / circuit-declaration - / constant-declaration - / type-alias-declaration ; new - - -``` - -A type alias declaration introduces the identifier to stand for the type. -Only top-level type alias declarations are supported; -they are not supported inside functions or circuit types. - -In addition, the following changes to the grammar are appropriate. - -First, the rule -``` -circuit-type = identifier / self-type ; replace with the one below -``` -should be replaced with the rule -``` -circuit-or-alias-type = identifier / self-type -``` -The reason is that, at parsing time, an identifier is not necessarily a circuit type; -it may be a type alias that may expand to a (circuit or non-circuit type). -Thus, the nomenclature `circuit-or-alias-type` is appropriate. -Consequently, references to `circuit-type` in the following rules must be replaced with `circuit-or-alias-type`: -``` -; modified rule: -circuit-construction = circuit-or-alias-type "{" ; modified - circuit-inline-element - *( "," circuit-inline-element ) [ "," ] - "}" - -; modified rule: -postfix-expression = primary-expression - / postfix-expression "." natural - / postfix-expression "." identifier - / identifier function-arguments - / postfix-expression "." identifier function-arguments - / circuit-or-alias-type "::" identifier function-arguments ; modified - / postfix-expression "[" expression "]" - / postfix-expression "[" [expression] ".." [expression] "]" -``` - -Second, the rule -``` -aggregate-type = tuple-type / array-type / circuit-type -``` -should be removed, because if we replaced `circuit-type` with `circuit-or-alias-type` there, -the identifier could be a type alias, not necessarily an aggregate type. -(The notion of aggregate type remains at a semantic level, but has no longer a place in the grammar.) -Consequently, the rule -``` -type = scalar-type / aggregate-type -``` -should be rephrased as -``` -type = scalar-type / tuple-type / array-type / circuit-or-alias-type -``` -which "inlines" the previous `aggregate-type` with `circuit-type` replaced with `circuit-or-alias-type`. - -## Semantics - -There must be no direct or indirect circularity in the type aliases. -That is, it must be possible to expand all the type aliases away, -obtaining an equivalent program without any type aliases. - -Note that the built-in `Self` is a bit like a type alias, standing for the enclosing circuit type; -and `Self` is replaced with the enclosing circuit type during canonicalization. -Thus, canonicalization could be a natural place to expand user-defined type aliases; -after all, type aliases introduce multiple ways to denote the same types -(and not just via direct aliasing, but also via indirect aliasing, or via aliasing of components), -and canonicalization serves exactly to reduce multiple ways to say the same thing to one canonical way. - -On the other hand, expanding type aliases is more complicated than the current canonicalization transformations, -which are all local and relatively simple. -Expanding type aliases requires not only checking for circularities, -but also to take into account references to type aliases from import declarations. -For this reason, we may perform type alias expansion after canonicalization, -such as just before type checking and inference. -We could also make the expansion a part of the type checking and inference process, -which already transforms the program by inferring missing types, -so it could also expand type aliases away. - -In any case, it seems beneficial to expand type aliases away -(whether during canonicalization or as part or preamble to type checking and inference) -prior to performing more processing of the program for eventual compilation to R1CS. - -## Examples - -The aforementioned 3x3 matrix example could be written as follows: -```ts -type matrix = [u32; (3, 3)]; - -function matrix_multiply(x: matrix, y: matrix) -> matrix { - ... -} -``` - -The aforementioned balance example could be written as follows: -```ts -type balance = u64; - -function f(...) -> (..., balance, ...) { - ... -} -``` - -The aforementioned string example could be written as follows: -```ts -type string = [char; _]; - -function f(str: string) -> ... { - ... -} -``` - -# Drawbacks - -As other extensions of the language, this makes things inherently a bit more complicated. - -# Effect on Ecosystem - -None; this is just a convenience for the Leo developer. - -# Alternatives - -An alternative to creating a type alias -```ts -type T = U; -``` -is to create a circuit type -```ts -circuit T { get: U } -``` -that contains a single member variable. - -This is clearly not equivalent to a type alias, because it involves conversions between `T` and `U` -```ts -T { get: u } // convert u:U to T -t.get // convert t:T to U -``` -whereas a type alias involves no conversions: -if `T` is an alias of `U`, then `T` and `U` are the same type, -more precisely two syntactically different ways to designate the same semantic type. - -While the conversions generally cause overhead in traditional programming languages, -this may not be the case for Leo's compilation to R1CS, -in which everything is flattened, including member variables of circuit types. -Thus, it may be the case that the circuit `T` above reduces to just its member `U` in R1CS. - -It might also be argued that wrapping a type into a one-member-variable circuit type -could be a better practice than aliasing the type, to enforce better type separation and safety. - -We need to consider the pros and cons of the two approaches, -particularly in light of Leo's non-traditional compilation target. diff --git a/docs/rfc/008-built-in-declarations.md b/docs/rfc/008-built-in-declarations.md deleted file mode 100644 index bee13d00c6..0000000000 --- a/docs/rfc/008-built-in-declarations.md +++ /dev/null @@ -1,52 +0,0 @@ -# Leo RFC 008: Built-in Declarations - -## Authors - -The Aleo Team. - -## Status - -IMPLEMENTED - -## Summary - -This RFC proposes a framework for making certain (top-level) declarations (e.g. type aliases) available in every Leo program without the need to write those declarations explicitly. These may be hardwired into the language or provided by standard libraries/packages; in the latter case, the libraries may be implicitly imported or required to be explicitly imported. - -## Motivation - -It is common for programming languages to provide predefined types, functions, etc. -that can be readily used in programs. The initial motivation for this in Leo was to have a type alias `string` for character arrays of unspecified sizes (array types of unspecified sizes and type aliases are discussed in separate RFCs), but the feature is clearly more general. - -## Design - -Leo supports five kinds of top-level declarations: - -- Import declarations. -- Function declarations. -- Circuit type declarations. -- Global constant declarations. -- Type alias declarations. (Proposed in a separate RFC.) - -Leaving import declarations aside for the moment since they are "meta" in some sense -(as they bring in names of entities declared elsewhere), -it may make sense for any of the four kinds of declarations above to have built-in instances, i.e., we could have some built-in functions, circuit types, global constants, and type aliases. These features are why this RFC talks of built-in declarations, more broadly than just built-in type aliases that inspired it. - -The built-in status of the envisioned declarations will be done through explicitly declared standard library(stdlib) files. Then these stdlib files must expressly be imported, except the files found in stdlib/prelude/*. The ones found in the prelude are features determined to be helpful enough in standard programs and are auto-imported. - -## Drawbacks - -This does not seem to bring any drawbacks. - -## Effect on Ecosystem - -This change may interact with libraries and packages in some way. -But it should not be much different from standard libraries/packages. - -## Alternatives - -Some alternative approaches are: - -1. Having all stdlib imports auto included. -2. Require that all stdlib imports are explicitly imported. - -The differences between the two above approaches and the chosen one are just how many imports are imported explicitly. diff --git a/docs/rfc/009-bit-byte-conversions.md b/docs/rfc/009-bit-byte-conversions.md deleted file mode 100644 index 98f587ae59..0000000000 --- a/docs/rfc/009-bit-byte-conversions.md +++ /dev/null @@ -1,277 +0,0 @@ -# Leo RFC 009: Conversions with Bits and Bytes - -## Authors - -The Aleo Team. - -## Status - -FINAL - -# Summary - -This RFC proposes the addition of natively implemented global functions to perform conversions -between Leo integer values and sequences of bits or bytes in big endian or little endian order. -This RFC also proposes a future transition from these functions to methods associated to the integer types. - -# Motivation - -Conversions of integers to bits and bytes are fairly common in programming languages. -Use case include communication with the external world -(since external data is sometimes represented as bits and bytes rather than higher-level data structures), -and serialization/deserialization for cryptographic purposes (e.g. hashing data). - -# Design - -## Concepts - -The Leo integer values can be thought of sequences of bits. -Therefore, it makes sense to convert between integer values and their corresponding sequences of bits; -the sequences of bits can be in little or big endian order (i.e. least vs. most significant bit first), -naturally leading to two possible conversions. -Obviously, the bits represent the integers in base 2. - -Since all the Leo integer values consist of multiples of 8 bits, -it also makes sense to convert between integer values and squences of bytes, -which represents the integers in base 256. -Again, the bytes may be in little or big endian order. - -It could also make sense to convert between integers consisting of `N` bits -and sequences of "words" of `M` bits if `N` is a multiple of `M`, -e.g. convert a `u32` into a sequence of two `u16`s, or convert a `u128` into a sequence of four `u32`s. -However, the case in which `M` is 1 (bits) or 8 (bytes) is by far the most common, -and therefore the initial focus of this RFC; -nonetheless, it seems valuable to keep these possible generalizations in mind as we work though this initial design. - -Another possible generalization is to lift these conversions to sequences, -e.g. converting from a sequence of integer values to a sequence of bits or bytes -by concatenating the results of converting the integer values, -and converting from a sequence of bits or bytes to a sequence of integer values -by grouping the bits or bytes into chunks and converting each chunk into an integer. -For instance, a sequence of 4 `u32` values can be turned into a sequence of 32 bytes or a sequence of 128 bits. -Note that, in these cases, the endianness only applies to the individual element conversion, -not to the ordering of the integer values, which should be preserved by the conversion. - -Besides integers, it could make sense to consider converting other Leo values between bits and bytes, -namely characters, field elements, group elements, and addresses (but perhaps not booleans). -If this is further extended to aggregate values (tuples, arrays, and circuits), -then this moves towards a general serialization/deserialization library for Leo, which could be a separate feature. - -## Representation of Bits - -In Leo's current type system, bits can be represented as `bool` values. -These are not quite the numbers 0 and 1, but they are isomorphic, and it is easy to convert between booleans and bits: -```ts -// convert a boolean x to a bit: -(x ? 1 : 0) - -// convert f bit y to a boolean: -(y == 1) -``` - -If Leo had a type `u1` for unsigned 1-bit integers, we could use that instead of `bool`. -Separately from this RFC, such a type could be added. -There is also an outstanding proposal (not in an RFC currently) to support types `uN` and `iN` for every positive `N`, -in which case `u1` would be an instance of that. - -## Representation of Bytes - -The type `u8` is the natural way to represent a byte. -The type `i8` is isomorphic to that, but we tend to think of bytes as unsigned. - -## Representation of Sequences - -This applies to the sequence of bits or bytes that a Leo integer converts to or from. -E.g. a `u32` is converted to/from a sequence of bits or bytes. - -Sequences in Leo may be ntaurally represented as arrays or tuples. -Arrays are more flexible; in particular, they allow indexing via expressions rather than just numbers, unlike tuples. -Thus, arrays are the natural choice to represent these sequences. - -## Conversion Functions - -We propose the following global functions, -for which we write declarations without bodies below, -since the implementation is native. -(It is a separate issue whether the syntax below should be allowed, -in order to represent natively implemented functions, -or whether there should be a more explicit indication such as `native` in Java). - -These are tentative names, which we can tweak. -What is more important is the selection of operations, and their input/output types. - -### Conversions between Integers and Bits - -```ts -// unsigned to bits, little and big endian -function u8_to_bits_le(x: u8) -> [bool; 8]; -function u8_to_bits_be(x: u8) -> [bool; 8]; -function u16_to_bits_le(x: u16) -> [bool; 16]; -function u16_to_bits_be(x: u16) -> [bool; 16]; -function u32_to_bits_le(x: u32) -> [bool; 32]; -function u32_to_bits_be(x: u32) -> [bool; 32]; -function u64_to_bits_le(x: u64) -> [bool; 64]; -function u64_to_bits_be(x: u64) -> [bool; 64]; -function u128_to_bits_le(x: u128) -> [bool; 128]; -function u128_to_bits_be(x: u128) -> [bool; 128]; - -// signed to bits, little and big endian -function i8_to_bits_le(x: i8) -> [bool; 8]; -function i8_to_bits_be(x: i8) -> [bool; 8]; -function i16_to_bits_le(x: i16) -> [bool; 16]; -function i16_to_bits_be(x: i16) -> [bool; 16]; -function i32_to_bits_le(x: i32) -> [bool; 32]; -function i32_to_bits_be(x: i32) -> [bool; 32]; -function i64_to_bits_le(x: i64) -> [bool; 64]; -function i64_to_bits_be(x: i64) -> [bool; 64]; -function i128_to_bits_le(x: i128) -> [bool; 128]; -function i128_to_bits_be(x: i128) -> [bool; 128]; - -// unsigned from bits, little and big endian -function u8_from_bits_le(x: [bool; 8]) -> u8; -function u8_from_bits_be(x: [bool; 8]) -> u8; -function u16_from_bits_le(x: [bool; 16]) -> u16; -function u16_from_bits_be(x: [bool; 16]) -> u16; -function u32_from_bits_le(x: [bool; 32]) -> u32; -function u32_from_bits_be(x: [bool; 32]) -> u32; -function u64_from_bits_le(x: [bool; 64]) -> u64; -function u64_from_bits_be(x: [bool; 64]) -> u64; -function u128_from_bits_le(x: [bool; 128]) -> u128; -function u128_from_bits_be(x: [bool; 128]) -> u128; - -// signed from bits, little and big endian -function i8_from_bits_le(x: [bool; 8]) -> i8; -function i8_from_bits_be(x: [bool; 8]) -> i8; -function i16_from_bits_le(x: [bool; 16]) -> i16; -function i16_from_bits_be(x: [bool; 16]) -> i16; -function i32_from_bits_le(x: [bool; 32]) -> i32; -function i32_from_bits_be(x: [bool; 32]) -> i32; -function i64_from_bits_le(x: [bool; 64]) -> i64; -function i64_from_bits_be(x: [bool; 64]) -> i64; -function i128_from_bits_le(x: [bool; 128]) -> i128; -function i128_from_bits_be(x: [bool; 128]) -> i128; -``` - -### Conversions between Integers and Bytes - -```ts -// unsigned to bytes, little and big endian -function u16_to_bytes_le(x: u16) -> [u8; 2]; -function u16_to_bytes_be(x: u16) -> [u8; 2]; -function u32_to_bytes_le(x: u32) -> [u8; 4]; -function u32_to_bytes_be(x: u32) -> [u8; 4]; -function u64_to_bytes_le(x: u64) -> [u8; 8]; -function u64_to_bytes_be(x: u64) -> [u8; 8]; -function u128_to_bytes_le(x: u128) -> [u8; 16]; -function u128_to_bytes_be(x: u128) -> [u8; 16]; - -// signed to bytes, little and big endian -function i16_to_bytes_le(x: i16) -> [u8; 2]; -function i16_to_bytes_be(x: i16) -> [u8; 2]; -function i32_to_bytes_le(x: i32) -> [u8; 4]; -function i32_to_bytes_be(x: i32) -> [u8; 4]; -function i64_to_bytes_le(x: i64) -> [u8; 8]; -function i64_to_bytes_be(x: i64) -> [u8; 8]; -function i128_to_bytes_le(x: i128) -> [u8; 16]; -function i128_to_bytes_be(x: i128) -> [u8; 16]; - -// unsigned from bytes, little and big endian -function u16_from_bytes_le(x: [u8; 2]) -> u16; -function u16_from_bytes_be(x: [u8; 2]) -> u16; -function u32_from_bytes_le(x: [u8; 4]) -> u32; -function u32_from_bytes_be(x: [u8; 4]) -> u32; -function u64_from_bytes_le(x: [u8; 8]) -> u64; -function u64_from_bytes_be(x: [u8; 8]) -> u64; -function u128_from_bytes_le(x: [u8; 16]) -> u128; -function u128_from_bytes_be(x: [u8; 16]) -> u128; - -// signed from bytes, little and big endian -function i16_from_bytes_le(x: [u8; 2]) -> i16; -function i16_from_bytes_be(x: [u8; 2]) -> i16; -function i32_from_bytes_le(x: [u8; 4]) -> i32; -function i32_from_bytes_be(x: [u8; 4]) -> i32; -function i64_from_bytes_le(x: [u8; 8]) -> i64; -function i64_from_bytes_be(x: [u8; 8]) -> i64; -function i128_from_bytes_le(x: [u8; 16]) -> i128; -function i128_from_bytes_be(x: [u8; 16]) -> i128; -``` - -## Handling of the Native Functions - -Given the relatively large number and regular structure of the functions above, -it makes sense to generate them programmatically (e.g. via Rust macros), -rather than enumerating all of them explicitly in the implementation. -It may also makes sense, at R1CS generation time, -to use generated or suitably parameterized code to recognize them and turn them into the corresponding gadgets. - -## Transition to Methods - -Once a separate proposal for adding methods to Leo scalar types is realized, -we may want to turn the global functions listed above into methods, -deprecating the global functions, and eventually eliminating them. - -Conversions to bits or bytes will be instance methods of the integer types, -e.g. `u8` will include an instance method `to_bits_le` that takes no arguments and that returns a `[bool; 8]`. -Example: -```ts -let int: u8 = 12; -let bits: [bool; 8] = int.to_bits_le(); -console.assert(bits == [false, false, true, true, false, false, false, false]); // 00110000 (little endian) -``` - -Conversions from bits or bytes will be static methods of the integer types, -e.g. `u8` will include a static metod `from_bits_le` that takes a `[bool; 8]` argument and returns a `u8`. -Example: -```ts -let bits: [bool; 8] = [false, false, true, true, false, false, false, false]; // 00110000 (little endian) -let int = u8::from_bits_le(bits); -console.assert(int == 12); -``` - -# Drawbacks - -This does not seem to bring any drawbacks. - -# Effect on Ecosystem - -None. - -# Alternatives - -## Pure Leo Implementation - -These conversions can be realized in Leo (i.e. without native implementations), -provided that Leo is extended with certain operations that are already separately planned: -* Integer division and remainder, along with type casts, could be used. -* Bitwise shifts and masks, along with type casts, could be used. - -However, compiling the Leo code that realizes the conversions may result in less efficient R1CS than the native ones. - -## Naming Bit and Byte Types Explicitly - -Names like `u8_to_bits_le` and `u32_to_bytes_le` talk about bits and bytes, -therefore relying on a choice of representation for bits and bytes, -which is `bool` for bits and `u8` for bytes as explained above. -An alternative is to have names like `u8_to_bools_le` and `u32_to_u8s_le`, -which explicate the representation of bits and bytes in the name, -and open the door to additional conversions to different representations. -In particular, if and when Leo is extended with a type `u1` for bits, -there could be additional operations like `u8_to_u1s_le`. - -This more explicit naming scheme also provides a path towards extending -bit and byte conversions to more generic "word" conversions, -such as `u64_to_u16s_le`, which would turn a `u64` into a `[u16; 4]`. -In general, it makes sense to convert between `uN` or `iN` and `[uM; P]` when `N == M * P`. -If Leo were extended with types `uN` and `iN` for all positive `N` as proposed elsewhere, -there could be a family of all such conversions. - -## Methods Directly - -Given that we eventually plan to use methods on scalar types for these conversions, -it may make sense to do that right away. -This is predicated on having support for methods on scalar types, -for which a separate RFC is in the works. - -If we decide for this approach, we will revise the above proposal to reflect that. -The concepts and (essential) names and input/output types remain unchanged, -but the conversions are packaged in slightly different form. diff --git a/docs/rfc/010-native-functions.md b/docs/rfc/010-native-functions.md deleted file mode 100644 index 3d85f58cc7..0000000000 --- a/docs/rfc/010-native-functions.md +++ /dev/null @@ -1,211 +0,0 @@ -# Leo RFC 010: Improved Native Functions - -## Authors - -The Aleo Team. - -## Status - -DRAFT - -# Summary - -This RFC proposes an improved approach to handling natively implemented functions ('native functions', for short) in Leo, -that is functions implemented not via Leo code but (in essence) via Rust code. -Currently there is just one such function, namely BLAKE2s. -The scope of this proposal is limited to native functions defined by the developers of Leo itself, -not by users of Leo (i.e. developers of applications written in Leo). - -The approach proposed here is to allow (global and member) Leo functions to have no defining bodies, -in which case they are regarded as natively implemented; -this is only allowed in Leo files that contain standard/core libraries, provided with the Leo toolchain. -Most of the compiler can work essentially in the same way as it does now; -at R1CS generation time, native functions must be recognized, and turned into the known gadgets that implement them. - -# Motivation - -Many languages support native functions (here we generically use 'functions' to also denote methods), -where 'native' refers to the fact that the functions are implemented not in the language under consideration, -but rather in the language used to implement the language under consideration. -For instance, Java supports native methods that are implemented in C rather than Java. - -There are two main reasons for native functions in programming languages: -1. The functionality cannot be expressed at all in the language under consideration, - e.g. Java has no constructs to print on screen, making a native implementation necessary. -2. The functionality can be realized more efficiently in the native language. - -The first reason above may not apply to Leo, at least currently, -as Leo's intended use is mainly for "pure computations" rather than interactions with the external world. -However, we note that console statements could be regarded as native functions (as opposed to "ad hoc" statements), -and this may be in fact the path to pursue if we extend the scope of console features (e.g. even to full GUIs), -as has been recently proposed (we emphasize that the console code is not meant to be compiled to R1CS). - -The second reason above applies to Leo right now. -While there is currently just one native function supported in Leo, namely BLAKE2s, -it is conceivable that there may be other cryptographic (or non-cryptographic) functions -for which hand-crafted R1CS gadgets are available -that are more efficient than what the Leo compiler would generate if their functionality were written in Leo. -While we will continue working towards making the Leo compiler better, -and perhaps eventually capable to generate R1CS whose efficiency is competitive with hand-crafted gadgets, -this will take time, and in the meanwhile new and more native functions may be added, -resulting in a sort of arms race. -In other words, it is conceivable that Leo will need to support native functions in the foreseeable future. - -Languages typically come with standard/core libraries that application developers can readily use. -Even though the Leo standard/core libraries are currently limited (perhaps just to BLAKE2s), -it seems likely that we will want to provide more extensive standard/core libraries, -not just for cryptographic functions, but also for data structures and operations on them. - -The just mentioned use case of data structures brings about an important point. -Leo circuit types are reasonable ways to provide library data structures, -as they support static and instance member functions that realize operations on those data structures. -Just like some Java library classes provide a mix of native and non-native methods, -we could imagine certain Leo library circuit types providing a mix of native and non-native member functions, e.g.: -```ts -circuit Point2D { - x: u32; - y: u32; - function origin() -> Point2D { ... } // probably non-native - function move(mut self, delta_x: u32, delta_y: u32) { ... } // probably non-native - function distance(self, other:Point2D); // maybe native (involves square root) -} -``` - -Our initial motivation for naive functions is limited to Leo standard/core libraries, -not to user-defined libraries or applications. -That is, only the developers of the Leo language will be able to create native functions. -Leo users, i.e. developers of Leo applications, will be able to use the provided native functions, -but not to create their own. -If support for user-defined native functions may become desired in the future, it will be discussed in a separate RFC. - -# Design - -## Background - -### Current Approach to Native Functions - -The BLAKE2s native function is currently implemented as follows (as a high-level view): -1. Prior to type checking/inference, its declaration (without a defining body) - is programmatically added to the program under consideration. - This way, the input and output types of the BLAKE2s function can be used to type-check code that calls it. -2. At R1CS generation time, when the BLAKE2s function is compiled, it is recognized as native and, - instead of translating its body to R1CS (which is not possible as the function has no Leo body), - a known BLAKE2s gadget is used. - -This approach is fine for a single native function, but may not be the best for a richer collection of native functions. -In particular, consider the `Point2D` example above, which has a mix of native and non-native functions: -presumably, we would like to write at least the non-native functions of `Point2D` directly in a Leo file, -as opposed to programmatically generating them prior to type checking/inference. - -### Multi-File Compilation - -Leo already supports the compilation of multiple files that form a program, via packages and importing. -This capability is independent from native functions. - -We note that, for the purpose of type checking code that calls a function `f`, -the defining body of `f` is irrelevant: only the input and output types of `f` are relevant. -The defining body is of course type-checked when `f` itself is type-checked, -and furthermore the defining body is obviously needed to generate R1CS, -but the point here is that only the input and output types of `f` are needed to type-check code that calls `f`. -In particular, this means that, if a Leo file imports a package, -only the type information from the package is needed to type-check the file that imports the package. -Conceptually, each package exports a symbol table, used (and sufficient) to type-check files that import that package. - -## Proposal - -we propose to: -1. Allow declarations of (global and member) functions to have no defining body, signaling that the function is native. -2. Extend the AST and ASG to allow functions to have no bodies. -3. Have the compiler allow empty function bodies only in standard/core library files, which should be known. -4. Have type checking/inference "skip over" absent function bodies. -5. At R1CS generation time, when a function without body is encountered, find and use the known gadget for it. - -Currently the ABNF grammar requires function declarations to have a defining body (a block), i.e. to be implemented in Leo: -``` -function-declaration = *annotation %s"function" identifier - "(" [ function-parameters ] ")" [ "->" type ] - block -``` -We propose to relax the rule as follows: -``` -function-declaration = *annotation %s"function" identifier - "(" [ function-parameters ] ")" [ "->" type ] - ( block / ";" ) -``` -This allows a function declaration to have a terminating semicolon instead of a block. - -Since we do not have anything like abstract methods in Leo, this is a workable way to indicate native functions. -However, it is easy, if desired, to have a more promiment indication, such as a `native` keyword, or an annotation. - -It may be necessary to extend the AST and ASG to accommodate function bodies to be optional, -although this may be already the case for representing BLAKE2s in its current form described above. - -The compiler should know which files are part of the Leo standard/core libraries and which ones are not. -Functions without bodies will be only allowed to appear in those files. -It will be an error if any other file (e.g. user-defined) contains functions without bodies. -Type checking/inference may be where we make this check, or perhaps in some other phase. - -Because of the already existing support for multi-file compilation described above, -no essential change is needed in the compiler's type checking/inference. -We just need to make sure that functions without bodies are expected and properly handled -(i.e. their input and output types must be checked and added to the proper symbol tables, -but their absent bodies must be skipped); -this may already be the case, for the treatment of BLAKE2s described above. - -The main change is in R1CS generation. -Normally, when a function definition is encountered, its Leo body is translated to R1CS. -For a native function, we need to find and use a known gadget instead. -The compiler must know a mapping from native functions in the standard/core libraries -to the R1CS gadgets that implement them, so it should be just a matter of selecting the appropriate one. -Some of this logic must be already present, in order to detect and select the BLAkE2s gadget. - -This approach is used in Java, where Java files may declare certain methods as `native`, -without a body but with a declaration of input and output types. -The actual native implementations, i.e. the native method bodies live in different files, as they are written in C. - -# Drawbacks - -This does not seem to bring any drawbacks. -A capability for native functions (for BLAKE2s) already exists; -this RFC proposes a way to make it more flexible, -with mild (and likely simplifying) changes to the compiler. - -# Effect on Ecosystem - -This should help support richer standard/core libraries for Leo. - -# Alternatives - -## Programmatic Generation - -Instead of storing declarations of native functions in standard/core files as proposed above, -we could programmatically generate them as currently done for BLAKE2s. -Macros may be used to generate families of similar function declarations. - -However, consider `Point2D` above, which has a mix or native and non-native functions. -One approach is to programmatically generate the whole `Point2D` declarative, -with both native and non-native functions. -But it seems that a Leo file would be clearer and more maintainable than a Rust string in the compiler. -We could think of splitting the non-native and native functions of `Point2D`: -the former in a Leo file, and the latter programmatically added. -Again, this looks more complicated than just declaring native funcions in Leo files. - -## Leo Code in Rust Files - -It has been pointed out that it would be beneficial to have -both the Leo code (for the non-native functions) -and the Rust code (for the native functions) -in the same place (i.e. file). -This is not possible if the non-native code is in a Leo file, because Leo files cannot contain Rust code -(and there is no plan to allow that, i.e. no inline Rust code). - -However, we can turn things around and leverage Rust's macro system to accommodate Leo code in Rust files. -That is, we can have Rust files that include both the non-native Leo code, -written as Leo code (with some surrounding macro call or something like that), -along with the Rust code that implements the naive functions. - -This may turn out to be in fact the preferred design in the end, -as it combines the advantage of writing non-native code in Leo -with the advantage of having native and non-native code in the same place. -In that case, we will revise this RFC to swap this design proposal with the one in the main section, -moving the proposal for Leo files to this section as an alternative. \ No newline at end of file diff --git a/docs/rfc/011-scalar-type-accesses-and-methods.md b/docs/rfc/011-scalar-type-accesses-and-methods.md deleted file mode 100644 index ad51767209..0000000000 --- a/docs/rfc/011-scalar-type-accesses-and-methods.md +++ /dev/null @@ -1,94 +0,0 @@ -# Leo RFC 011: Scalar Type Accesses And Methods - -## Authors - -The Aleo Team. - -## Status - -FINAL - -## Summary - -This RFC proposes three things: - -1. The scalar types in Leo (integers, fields, etc.) can have static methods. -2. The scalar types in Leo (integers, fields, etc.) can have static constants. -3. Expressions of scalar type can have instance methods called on them. - -## Motivation - -This approach allows for a clean interface to provide built-in methods or static members for these basic types. - -## Design - -### Semantics - -Firstly we would have to modify both the ABNF and parsing of Leo to allow static method calls onto a scalar type. - -The ABNF would look as follows: - -```abnf -; This is an existing old rule. -scalar-type = boolean-type / arithmetic-type / address-type / character-type - -; Add this rule. -named-type = identifier / self-type / scalar-type - -; Modify this rule. -postfix-expression = primary-expression - / postfix-expression "." natural - / postfix-expression "." identifier - / identifier function-arguments - / postfix-expression "." identifier function-arguments - / named-type "::" identifier function-arguments ; this used to be identifier-or-self-type - / named-type "::" identifier ; this is new to allow member constants - / postfix-expression "[" expression "]" - / postfix-expression "[" [expression] ".." [expression] "]" - -; Also need to add a new static member variable declaration rule to allow for static constant members. -member-constant-declaration = %s"static" %s"const" identifier ":" type = literal ";" - -; We then need to modify the circuit declaration rule. -circuit-declaration = %s"circuit" identifier - "{" *member-constant-declaration - [ member-variable-declarations ] - *member-function-declaration "}" -``` - -Now methods and static members would be first-class citizens of scalar types and their values. For example, the following could be done: - -```ts -let x = 1u8.to_bits(); // A method call on on a scalar value itself -let x = u8::MAX; // A constant value on the scalar type -let y = u8::to_bits(1u8); // A static method on the scalar type -``` - -It also allows for static constants for circuits in general: - -```ts -circuit Point { - static SLOPE: u32 = 3; - x: u32; - y: u32; - - function new(x: u32, y: u32) -> Self { - return Self { - x, - y: y + Self::SLOPE - }; - } -} -``` - -## Drawbacks - -This change adds more complexity to the language. - -## Effect on Ecosystem - -None. The new parsing changes would not break any older programs. - -## Alternatives - -None. diff --git a/docs/rfc/012-record-transaction-model.md b/docs/rfc/012-record-transaction-model.md deleted file mode 100644 index 297cfee860..0000000000 --- a/docs/rfc/012-record-transaction-model.md +++ /dev/null @@ -1,367 +0,0 @@ -# Leo RFC 012: Improved Record and Transaction Model - -## Authors - -The Aleo Team. - -## Status - -DRAFT - -## Summary - -This RFC describes an improved model for how Leo programs interact with the Aleo blockchain. -The description is oriented to the Leo developer: -it does not describe the zero-knowledge details, -as the whole purpose of Leo is to enable developers -to write applications with only a very high-level understanding of zero-knowledge. - -## Motivation - -While Leo can be described as a regular programming language -(albeit with certain non-standard restrictions motivated by its compilation to zero-knowledge circuits), -its purpose is to build applications for the Aleo blockchain. -It is thus important to describe precisely how Leo programs operate in the Aleo blockchain. - -## Background - -### Zexe - -The Aleo blockchain follows the Zexe model, with some variations. -It is thus useful to briefly review some aspects of Zexe first. - -In Zexe, there are _records_ that contain application-specific data, -and _transactions_ that consume _n_ old records and produce _m_ new records. -The computation of the new records from the old records -is arbitrary and unknown to the blockchain; -the blockchain only enforces that the old records satisfy known _death predicates_ -and that the new records satisfy known _birth predicates_. -See the [Zexe paper](https://eprint.iacr.org/2018/962.pdf) for details. - -### Aleo Blockchain - -In the Aleo blockchain, a transaction always consumes 2 old records and produces 2 new records. -That is, _n_ = 2 and _m_ = 2 with respect to the Zexe model. -Other choices are possible, and may be supported in the future; -the current choice of 2 old and 2 new records is motivated by -being the minimum to represent certain computations of interest, such as token exchanges, -which may involve records owned by different parties -(and therefore need to consume more than one record, since each record has a unique owner). - -One or both of the old records may be dummy, -if only one old actual record is desired, -or if new records are created "from nothing". -One or both of the new records may be dummy, -if only one new actual record is desired, -or if old records just have to be consumed. - -Aleo records and transactions have a well-defined structure. -They are ordered collections of typed slots. -Of particular interest is the _payload_ slot, -which contains a fixed number of bytes (currently 128) -to store application-specific data. -(Note that the developer documentation is out of date at the time of this writing.) - -In the Aleo blockchain, unlike Zexe, there is no separation among -computation of new records from old records, death predicates, and birth predicates. -Instead, a Leo program plays the role of all three, as described below. - -### Current Leo Program Execution Model - -A Leo program is a collection of files, -with `file` as defined in the ABNF grammar, -i.e. as a sequence of declarations. -A Leo program has one main file, -which may contain import declarations, -which resolve to other files, -which may in turn contain import declarations, -and so on until a closed set of files is obtained: -that (linked) set of files is the _program_. - -In order to be used in the Aleo blockchain, -a Leo program must include a function called `main`, in its aforementioned main file. -The processing of a transaction corresponds to an invocation of this `main` function. -The word 'corresponds' in the preceding sentence is important: -unlike other blockchains like Ethereum, -the processing of the transaction does not involve executing the Leo code; -rather, it involves checking a zero-knowledge proof -of the execution of the Leo program, -which was prepared when the Leo program was compiled. -This is what 'corresponds' means, in that sentence. -However, for the high-level purpose of this RFC, these are zero-knowledge details. - -In general, the `main` function takes some `const` and some non-`const` inputs (declared as parameters), -and returns an output (declared as a return type), which may be a tuple to represent "multiple" outputs. -The `const` inputs are compiled into the zero-knowledge circuit, -so they can be ignored for our purpose here, -leaving only the non-`const` inputs and the output for consideration. - -The execution of `main` can be described as a mathematical function -``` -main : Record x Record x Inputs -> Record x Record x Output -``` -where `x` is cartesian product, -`Record` is the set of possible records, -`Inputs` is the set of possible inputs to `main`, and -`Output` is the set of possible outputs from `main`. -(These sets can be thought as "types", but mathematically we talk about sets.) -That is, this mathematical function -takes three inputs (the two old records and the `main` inputs) -and returns three outputs (the two new records and the `main` output). -While `Record` is fixed, i.e. it is the same for all Leo programs, -both `Inputs` and `Output` differ based on the Leo input and output types of `main`. - -In the Leo code, in `main` or in functions called by `main`, -the values in `Inputs` are accessed via the `main` parameters, -while the old records are accessed via the special `input` variable, -which provides access to the two old records and their slots, -including the payloads that contain application-specific data. -The picture for new records and values in `Output` is less clear from the documentation: -experimentation suggests that the new records are obtained -by serializing the output value in `Output` (which, recall, may be a tuple). - -It is important to note that the values in `Inputs` do not come from the two old records. -Rather, they are private inputs, supplied by the developer -when they compile the Leo program and generate the zero-knowledge proof. -Indeed, as mentioned above, the processing of the transaction in the blockchain -does not execute the Leo code, and thus does not need to know the values in `Inputs`. -Rather, the blockchain has to verify a zero-knowledge proof asserting that -there exist values in `Input`, known to the creator of the transaction, -such that the execution of the Leo program's `main` -on those values and on the old records -yields the new records, along with some value in `Output`; -this is, roughly speaking, the assertion proved in zero-knowledge. - -### Input and Output Files - -Currently the compilation of a Leo program involves: -1. A `.in` file, containing `const` and non-`const` inputs. -2. A `.state` file, containing transaction data. -3. A `.out` file, containing results. - -The compilation takes the first two files as inputs and returns the third file as output. - -## Design - -### Multiple Entry Points - -We propose to generalize from one entry point (i.e. the `main` function) to multiple entry points, -in line with the smart contract paradigm. - -Instead of implicitly designating `main` as the only entry point, -we need a mechanism to explicitly designate one or more Leo functions as entry points. - -A simple approach could be to use an annotation like `@entrypoint` to designate _entry point functions_: -``` -@entrypoint -function mint(...) -> ... { ... } - -@entrypoint -function transfer(...) -> ... { ... } -``` -This has a precedent, in the use of `@test` to designate Leo test functions that are not compiled to circuits. - -Another approach is to use a keyword, e.g. -``` -entrypoint function mint(...) -> ... { ... } - -entrypoint function transfer(...) -> ... { ... } -``` -Yet another approach is to group entrypoint functions inside a new block construct, e.g. -``` -entrypoint { - - function mint(...) -> ... { ... } - - function transfer(...) -> ... { ... } -} -``` - -In the rest of this design section we assume the annotation approach (i.e. `@entrypoint`) for concreteness, -but that can be replaced as soon as we converge on a choice. - -### Types for Transaction Inputs and Outputs - -We propose to add types for transaction inputs and outputs to the Leo standard library, -and possibly include them in the prelude that is implicitly imported by every Leo program. - -Given that records have a fixed structure with typed slots, -their format could be described by a Leo circuit type, e.g. called `Record`, -whose member variables correspond to the slots. -The types of the slots are fairly low-level, -i.e. byte arrays (e.g. `u8[128]` for the payload) -and unsigned integers (e.g. `u64` for the balance), -because they must have a clear correspondence with the serialized form of records. -This means that the Leo code may have to do -its own deserialization of the payload bytes into higher-level Leo values; -standard serialization/deserialization libraries for Leo types may be provided for this, -as an independent and more generally useful feature. - -Given that a transaction input consists of two records and possibly additional information, -it makes sense to also have a circuit type `TransactionInput`, -which includes two `Record` slots and possibly additional slots. - -Additionally, it makes sense to have a circuit type `TransactionOutput` -that describes the output data of a transaction that is produced by the Leo program. -This could also include two `Record` slots for the new records, -or possibly "subsets" of records if the values of some record slots are calculated -not by the Leo program but instead by the Leo CLI (i.e. build process). - -All these types should be documented, as part of the standard library. -We will need to flesh out their exact definition, -but we note that this is fairly easy to change when it is in the standard library. - -### Entry Point Input and Output Types - -We propose that each entry point function of a Leo program -explicitly produce transaction outputs from transaction inputs -by taking a `TransactionInput` input and returning a `TransactionOutput` output: -``` -@entrypoint -function ...(input: TransactionInput, ...) -> TransactionOutput { ... } -``` -This way, the calculation of transaction outputs from transaction inputs is made functional and explicit. - -As special cases (both of which may apply to the same entry point): -1. We could allow the `TransactionInput` input to be absent, - when an entry point does not need access the transaction input data, - e.g. when producing new records without consuming old records. -2. We could allow the function output to be `()` instead of `TransactionOutput`, - when an entry point does not need to produce transaction outputs, - e.g. when consuming old records without producing new records. - -Compared to the current Leo program execution model (described earlier in the background section), -`input` is made an explicit input here, instead of being like a built-in global variable. -Furthermore, the output type is restricted to be `TransactionOutput` (or `()`), -thus eliminating the implicit serialization and the asymmetry with the treatment of transaction inputs. -There is still no restriction on the non-`TransactionInput` inputs of an entry point function; -as noted earlier, they are existentially quantified in the zero-knowledge assertion. - -Thus, a Leo program entry point can be now described as a mathematical function -``` -entrypoint : Record x Record x Inputs -> Record x Record -``` -where `Output` is no longer present. -(If `TransactionInput` includes additional data, besides the two old records, that may affect the transaction output, -then we would need to add that to this mathematical model; -however, the model above is sufficiently accurate for the current discussion.) - -We may require the `TransactionInput` input of an entry point function, if present, -to be the first input of the function, for clarity and readability. -A question is whether we should extend that requirement to non-entry-point functions -that may be passed `TransactionInput` values. -We note that none of these restrictions are necessary, though. -A necessary restriction is that each entry point function takes at most one `TransactionInput` input. - -We may require the `TransactionInput` input of an entry point function, if present, -to be called `input`, or some other predefined name. -However, this is not a necessary restriction, and we may decide to demote that to a convention rather than a requirement. -(Currently `input` is a keyword and its own kind of Leo expression, which slightly complicates the language.) - -### Access to Transaction Input and Output Types - -Currently the member variables of Leo circuit types are always accessible for both reading and writing. -It is thus possible for a Leo program -to read from the member variables of `TransactionInput` -and to write to the member variables of `TransactionOutput`. -Therefore, for an initial implementation, -it suffices for these two circuit types to provide member variables for the needed slots. - -We might want the member variables of `TransactionInput` to be read-only. -This is not necessary for the transaction model to work: -so long as `TransactionInput` is properly initialized before calling the entry point, -and that after the call the resulting `TransactionOutput` is used to create the transaction, -there is no harm in the Leo program to modify the copy of `TransactionInput` passed to the program. -Nonetheless, we may want to enforce this restriction to encourage good coding practices -(unless we find a use case to the contrary). - -There is currently no mechanism in Leo to enforce that. -Designating the transaction input as `const` is not right, -as that designation normally means that the value is compiled into the circuit. - -We could provide read-only access via member function (e.g. `payload()`, `balance()`), -but we still have to prohibit assignments to member variables (which is currently allowed on any circuit type). -As an orthogonal and more generally useful feature, -we could consider adding public/private access designations to Leo circuit members. -Another approach is to avoid exposing the member variables, -and just make the member functions available via an implicit import declaration. -All of this needs to be thought through more carefully, in the broader context of the Leo language design. - -If `TransactionInput` has member functions, it may also be useful for `TransactionOutput` to have member functions, -presumably to create new instances and to set values of member variables. - -A somewhat related consideration is whether it should be allowed -to make copies of the `TransactionInput` value passed to an entry point function. -There is no harm in doing that: the model still works, as explained above. -(Since a `TransactionInput` is a relatively large structure, -there may be harm consisting in creating a relatively large number of R1CS constraints, -but that may happen with user-defined types too, and is a separable problem.) -Nonetheless, we may want to enforce a discipline of single-threadedness, -which could also allow us to treat transaction input as an immutable reference behind the scenes, -thus reducing the number of R1CS constraints. - -Analogous considerations apply to `TransactionOutput`, -namely whether it should be treated in a single-threaded way, -i.e. effectively as a built-in global variable, -which could enable compiler optimizations. - -### Input and Output Files - -According to the new model proposed above, we should have just two files involved in the Leo compilation process: -1. A `.in` file, from which the `TransactionInput` value is produced. -2. A `.out` file, produced from the `TransactionOutput` returned by the program. - -There seems to be no longer a need for a `.state` file and for explicit registers. - -## Alternatives - -The 'Design' section above still outlines certain alternatives to consider. -Once we make some specific choices, we can move the other options to this section. - -### Built-in Global Variable for Transaction Input - -Instead of having explicit `TransactionInput` inputs in entry point functions, -we could maintain the current approach of viewing `input` as a built-in global variable, of type `TransactionInput`. -Everything else would be the same, except that `input` would be implicitly available. - -An advantage is that single-threadedness would be immediately guaranteed, -if we wanted to enforced that as discussed above. - -On the other hand, explicating transaction inputs as entry point inputs makes the code more functional -and simplifies certain aspects of the Leo compiler. - -### Built-in Global Variable for Transaction Output - -Instead of having explicit `TransactionOutput` outputs in entry point functions, -we could introduce a built-in `output` global variable, of type `TransactionOutput`. - -This has similar advantages and disadvantages to the ones discussed above for `input` as a built-in global variable. - -In any case, we may want this `output` global variable alternative -to go hand-in-hand with the `input` global variable alternative. -That is, we either adopt both or none. -The current treatment in Leo is asymmetric in this respect. - -### Implicit Serialization of Output Values - -Instead of having an explicit `TransactionOutput` type -whose values describe exactly the output data for a transaction, -we could keep something like the current model, -in which an entry point function may return values of arbitrary types, -which are implicitly serialized into output records. - -This may be a bit simpler for the beginning developer, -but it also introduces less control on the output data. -Futhermore, given that records have payloads of limited size, -it is not difficult to write a program that attempts to produce too much data. - -In any case, if we were to go this route, there would be an asymmetry with the treatment of transaction inputs, -unless we also allow `input` (by this we mean the value passed to an entry point function with the transaction inputs) -to consist of arbitrary Leo types (subject to serialization size limitations). -Note that this requires the Leo type of `input` to potentially vary across different programs -(which appears to be the case in current Leo), -which is more complicated than having some fixed types in the standard library. - -All in all, it seems that having `TransactionInput` and `TransactionOutput` types provides more explicit control. -Furthermore, in the future the Leo standard library could provide serialization and deserialization tools -that will make it easy to map between record slots and higher-level Leo types. diff --git a/docs/rfc/013-const-functions.md b/docs/rfc/013-const-functions.md deleted file mode 100644 index 68b9431a79..0000000000 --- a/docs/rfc/013-const-functions.md +++ /dev/null @@ -1,119 +0,0 @@ -# Leo RFC 013: Constant Functions - -## Author(s) - -The Aleo Team. - -## Status - -IMPLEMENTED - -## Summary - -This RFC proposes the additional of an optional `const` modifier to function declarations -to explicitly designate functions that return constant values that can be calculated at compile time. - -## Motivation - -Explicitly designating constant functions makes user intention explicit -and simplifies the Leo compiler's checks, as explained below. - -## Background - -Leo code is partially evaluated on its `const` inputs prior to being translated to R1CS. -A function that returns a value that only depends on `const` inputs directly or indirectly, -can be partially evaluated away, without having to be inlined during flattening. - -## Design - -### Syntax - -The ABNF grammar is extended by adding an optional `const` modifier to function declarations: -``` -function-declaration = *annotation [ %s"const" ] %s"function" identifier - "(" [ function-parameters ] ")" [ "->" type ] - block -``` - -This applies to both top-level and member functions. - -### Static Semantics - -A `const` function declaration must satisfy the following conditions: -* All its parameters are `const`, including the `self` parameter for instance circuit member functions. -* The body does not reference the special `input` variable. -* The body only calls other `const` functions. - -### Dynamic Semantics - -This has no impact on the dynamic semantics of Leo, viewed as a traditional programming language. - -### Flattening - -Given that `const` expressions are evaluated completely during flattening, -the values of the arguments of a `const` function call are known during flattening, -and therefore the function call can be completely evaluated as well. - -If the function is recursive (individually or with others), -the evaluation involves the bounded recursion analysis described in a separate RFC. - -### Implementation Considerations - -ASTs for function declarations are extended with a boolean flag `const_`. - -If a `const` function has a non-`const` parameter, -an ASG error occurs. - -If the body of a `const` function references the `input` variable or calls a non-`const` function, -an ASG error occurs. - -The description of static semantics, dynamic semantics, and flattening given above -are expressed in terms of Leo, because that is the user's view of the language. -In the implementation, flattening occurs after the Leo code is translated to the IR. - -### Examples - -```ts -const function len(const arr: [u8; _]) -> u32 { - return arr.len(); -} - -circuit Sample { - x: [char; 5] - const function say_hi(const self) -> [char; 5] { - return self.x; - } -} -``` - -## Drawbacks - -This extension does not appear to bring any drawbacks. - -## Effect on Ecosystem - -None. - -## Alternatives - -### No Constant Designation - -Without an explicit designation of constant functions, -the Leo compiler needs to perform an inter-procedural analysis: -if `f` calls `g`, in order for `f` to be constant, also `g` must be constant. -In other words, the call graph must be taken into account. - -In contrast, with the `const` designation, -an intra-procedural analysis suffices, -as discussed in the static semantics section above. - -## Future Extensions - -In other languages like Rust, `const` functions are not required to have all constant parameters. -They are just required to return constant results for constant arguments, -i.e. they must not access global variables and they must only call other `const` functions. -In other words, these `const` functions are polymorphic over "constancy". - -This could be also realized in Leo, because type inference/checking determines `const` and non-`const` expressions. -This tells the compiler which function calls have all `const` arguments and which ones do not. -Therefore, the compiler can selectively evaluate, during flattening, only the calls of `const` functions on `const` arguments. diff --git a/leo/README.md b/leo/README.md index 551c4c0c71..234403ef5c 100644 --- a/leo/README.md +++ b/leo/README.md @@ -1,6 +1,3 @@ # Overview This directory contains the source code for the Leo command line interface. -For a comprehensive list of commands and usage see the [developer documentation](https://developer.aleo.org/developer/cli/overview). - -The `leo-package` module is also included in this directory for convenience. diff --git a/leo/package/README.md b/leo/package/README.md index aa62f7c2a7..7a139eb435 100644 --- a/leo/package/README.md +++ b/leo/package/README.md @@ -3,28 +3,3 @@ [![Crates.io](https://img.shields.io/crates/v/leo-package.svg?color=neon)](https://crates.io/crates/leo-package) [![Authors](https://img.shields.io/badge/authors-Aleo-orange.svg)](../AUTHORS) [![License](https://img.shields.io/badge/License-GPLv3-blue.svg)](./LICENSE.md) - -## Description - -This module defines the structure of a Leo project package. And describes behavior of package internals, such -as Leo Manifest (Leo.toml), Lock File (Leo.lock), source files and imports. - -Mainly used by Leo binary. - -## Structure - -Each directory in this crate mirrors a corresponding file generated in a new Leo project package: - -```bash -package/src -├── errors # crate level error definitions -├── imports # program imports management -├── inputs # program inputs directory -├── outputs # program outputs directory -├── root # program root: Leo.toml, Leo.lock -└── source # source files directory -``` - -## Testing - -Package features functional tests in the `tests/` directory.