mirror of
https://github.com/ProvableHQ/leo.git
synced 2024-11-27 02:24:15 +03:00
[RFC] Initial proposal for native functions.
This commit is contained in:
parent
2f9e315c78
commit
f1384787ee
195
docs/rfc/010-native-functions.md
Normal file
195
docs/rfc/010-native-functions.md
Normal file
@ -0,0 +1,195 @@
|
||||
# Leo RFC 010: Native Functions
|
||||
|
||||
## Authors
|
||||
|
||||
- Max Bruce
|
||||
- Collin Chin
|
||||
- Alessandro Coglio
|
||||
- Eric McCarthy
|
||||
- Jon Pavlik
|
||||
- Damir Shamanaev
|
||||
- Damon Sicore
|
||||
- Howard Wu
|
||||
|
||||
## Status
|
||||
|
||||
DRAFT
|
||||
|
||||
# Summary
|
||||
|
||||
This RFC proposes an approach to handling natively implemented functions ('native functions', for short) in Leo,
|
||||
that is functions implemented not via Leo code but (in essence) via Rust code.
|
||||
Currently there is just one such function, namely BLAKE2s.
|
||||
The scope of this proposal is limited to native functions defined by the developers of Leo itself,
|
||||
not by users of Leo (i.e. developers of applications written in Leo).
|
||||
|
||||
The approach proposed here is to allow (global and member) Leo functions to have no defining bodies,
|
||||
in which case they are regarded as natively implemented;
|
||||
this is only allowed in Leo files that contain standard/core libraries, provided with the Leo toolchain.
|
||||
Most of the compiler can work essentially in the same way as it does now;
|
||||
at R1CS generation time, native functions must be recognized, and turned into the known gadgets that implement them.
|
||||
|
||||
# Motivation
|
||||
|
||||
Many languages support native functions (here we generically use 'functions' to also denote methods),
|
||||
where 'native' refers to the fact that the functions are implemented not in the language under consideration,
|
||||
but rather in the language used to implement the language under consideration.
|
||||
For instance, Java supports native methods that are implemented in C rather than Java.
|
||||
|
||||
There are two main reasons for native functions in programming languages:
|
||||
1. The functionality cannot be expressed at all in the language under consideration,
|
||||
e.g. Java has no constructs to print on screen, making a native implementation necessary.
|
||||
2. The functionality can be realized more efficiently in the native language.
|
||||
|
||||
The first reason above may not apply to Leo, at least currently,
|
||||
as Leo's intended use is mainly for "pure computations" rather than interactions with the external world.
|
||||
However, we note that console statements could be regarded as native functions (as opposed to "ad hoc" statements),
|
||||
and this may be in fact the path to pursue if we extend the scope of console features (e.g. even to full GUIs),
|
||||
as has been recently proposed (we emphasize that the console code is not meant to be compiled to R1CS).
|
||||
|
||||
The second reason above applies to Leo right now.
|
||||
While there is currently just one native function supported in Leo, namely BLAKE2s,
|
||||
it is conceivable that there may be other cryptographic (or non-cryptographic) functions
|
||||
for which hand-crafted R1CS gadgets are available
|
||||
that are more efficient than what the Leo compiler would generate if their functionality were written in Leo.
|
||||
While we will continue working towards making the Leo compiler better,
|
||||
and perhaps eventually capable to generate R1CS whose efficiency is competitive with hand-crafted gadgets,
|
||||
this will take time, and in the meanwhile new and more native functions may be added,
|
||||
resulting in a sort of arms race.
|
||||
In other words, it is conceivable that Leo will need to support native functions in the foreseeable future.
|
||||
|
||||
Languages typically come with standard/core libraries that application developers can readily use.
|
||||
Even though the Leo standard/core libraries are currently limited (perhaps just to BLAKE2s),
|
||||
it seems likely that we will want to provide more extensive standard/core libraries,
|
||||
not just for cryptographic functions, but also for data structures and operations on them.
|
||||
|
||||
The just mentioned use case of data structures brings about an important point.
|
||||
Leo circuit types are reasonable ways to provide library data structures,
|
||||
as they support static and instance member functions that realize operations on those data structures.
|
||||
Just like some Java library classes provide a mix of native and non-native methods,
|
||||
we could imagine certain Leo library circuit types providing a mix of native and non-native member functions, e.g.:
|
||||
```ts
|
||||
circuit Point2D {
|
||||
x: u32;
|
||||
y: u32;
|
||||
function origin() -> Point2D { ... } // probably non-native
|
||||
function move(mut self, delta_x: u32, delta_y: u32) { ... } // probably non-native
|
||||
function distance(self, other:Point2D); // maybe native (involves square root)
|
||||
}
|
||||
```
|
||||
|
||||
Our initial motivation for naive functions is limited to Leo standard/core libraries,
|
||||
not to user-defined libraries or applications.
|
||||
That is, only the developers of the Leo language will be able to create native functions.
|
||||
Leo users, i.e. developers of Leo applications, will be able to use the provided native functions,
|
||||
but not to create their own.
|
||||
If support for user-defined native functions may become desired in the future, it will be discussed in a separate RFC.
|
||||
|
||||
# Design
|
||||
|
||||
## Background
|
||||
|
||||
### Current Approach to Native Functions
|
||||
|
||||
The BLAKE2s native function is currently implemented as follows (as a high-level view):
|
||||
1. Prior to type checking/inference, its declaration (without a defining body)
|
||||
is programmatically added to the program under consideration.
|
||||
This way, the input and output types of the BLAKE2s function can be used to type-check code that calls it.
|
||||
2. At R1CS generation time, when the BLAKE2s function is compiled, it is recognized as native and,
|
||||
instead of translating its body to R1CS (which is not possible as the function has no Leo body),
|
||||
a known BLAKE2s gadget is used.
|
||||
|
||||
This approach is fine for a single native function, but may not be the best for a richer collection of native functions.
|
||||
In particular, consider the `Point2D` example above, which has a mix of native and non-native functions:
|
||||
presumably, we would like to write at least the non-native functions of `Point2D` directly in a Leo file,
|
||||
as opposed to programmatically generating them prior to type checking/inference.
|
||||
|
||||
### Multi-File Compilation
|
||||
|
||||
Leo already supports the compilation of multiple files that form a program, via packages and importing.
|
||||
This capability is independent from native functions.
|
||||
|
||||
We note that, for the purpose of type checking code that calls a function `f`,
|
||||
the defining body of `f` is irrelevant: only the input and output types of `f` are relevant.
|
||||
The defining body is of course type-checked when `f` itself is type-checked,
|
||||
and furthermore the defining body is obviously needed to generate R1CS,
|
||||
but the point here is that only the input and output types of `f` are needed to type-check code that calls `f`.
|
||||
In particular, this means that, if a Leo file imports a package,
|
||||
only the type information from the package is needed to type-check the file that imports the package.
|
||||
Conceptually, each package exports a symbol table, used (and sufficient) to type-check files that import that package.
|
||||
|
||||
## Proposal
|
||||
|
||||
we propose to:
|
||||
1. Allow declarations of (global and member) functions to have no defining body, signaling that the function is native.
|
||||
2. Extend the AST and ASG to allow functions to have no bodies.
|
||||
3. Have the compiler allow empty function bodies only in standard/core library files, which should be known.
|
||||
4. Have type checking/inference "skip over" absent function bodies.
|
||||
5. At R1CS generation time, when a function without body is encountered, find and use the known gadget for it.
|
||||
|
||||
Currently the ABNF grammar requires function declarations to have a defining body (a block), i.e. to be implemented in Leo:
|
||||
```
|
||||
function-declaration = *annotation %s"function" identifier
|
||||
"(" [ function-parameters ] ")" [ "->" type ]
|
||||
block
|
||||
```
|
||||
We propose to relax the rule as follows:
|
||||
```
|
||||
function-declaration = *annotation %s"function" identifier
|
||||
"(" [ function-parameters ] ")" [ "->" type ]
|
||||
( block / ";" )
|
||||
```
|
||||
This allows a function declaration to have a terminating semicolon instead of a block.
|
||||
|
||||
Since we do not have anything like abstract methods in Leo, this is a workable way to indicate native functions.
|
||||
However, it is easy, if desired, to have a more promiment indication, such as a `native` keyword, or an annotation.
|
||||
|
||||
It may be necessary to extend the AST and ASG to accommodate function bodies to be optional,
|
||||
although this may be already the case for representing BLAKE2s in its current form described above.
|
||||
|
||||
The compiler should know which files are part of the Leo standard/core libraries and which ones are not.
|
||||
Functions without bodies will be only allowed to appear in those files.
|
||||
It will be an error if any other file (e.g. user-defined) contains functions without bodies.
|
||||
Type checking/inference may be where we make this check, or perhaps in some other phase.
|
||||
|
||||
Because of the already existing support for multi-file compilation described above,
|
||||
no essential change is needed in the compiler's type checking/inference.
|
||||
We just need to make sure that functions without bodies are expected and properly handled
|
||||
(i.e. their input and output types must be checked and added to the proper symbol tables,
|
||||
but their absent bodies must be skipped);
|
||||
this may already be the case, for the treatment of BLAKE2s described above.
|
||||
|
||||
The main change is in R1CS generation.
|
||||
Normally, when a function definition is encountered, its Leo body is translated to R1CS.
|
||||
For a native function, we need to find and use a known gadget instead.
|
||||
The compiler must know a mapping from native functions in the standard/core libraries
|
||||
to the R1CS gadgets that implement them, so it should be just a matter of selecting the appropriate one.
|
||||
Some of this logic must be already present, in order to detect and select the BLAkE2s gadget.
|
||||
|
||||
# Drawbacks
|
||||
|
||||
This does not seem to bring any drawbacks.
|
||||
A capability for native functions (for BLAKE2s) already exists;
|
||||
this RFC proposes a way to make it more flexible,
|
||||
with mild (and likely simplifying) changes to the compiler.
|
||||
|
||||
# Effect on Ecosystem
|
||||
|
||||
This should help support richer standard/core libraries for Leo.
|
||||
|
||||
# Alternatives
|
||||
|
||||
Instead of storing declarations of native functions in standard/core files as proposed above,
|
||||
we could programmatically generate them as currently done for BLAKE2s.
|
||||
Macros may be used to generate families of similar function declarations.
|
||||
|
||||
However, consider `Point2D` above, which has a mix or native and non-native functions.
|
||||
One approach is to programmatically generate the whole `Point2D` declarative,
|
||||
with both native and non-native functions.
|
||||
But it seems that a Leo file would be clearer and more maintainable than a Rust string in the compiler.
|
||||
We could think of splitting the non-native and native functions of `Point2D`:
|
||||
the former in a Leo file, and the latter programmatically added.
|
||||
Again, this looks more complicated than just declaring native funcions in Leo files.
|
||||
|
||||
In summary, accommodating native functions in Leo standard/core file
|
||||
should be doable with mild and possibly simplifying changes to the current Leo compiler.
|
Loading…
Reference in New Issue
Block a user