mirror of https://github.com/roc-lang/roc.git (synced 2024-09-22 00:09:33 +03:00)
add Roc wiring
parent c8287032b6
commit e7523ad41d
@ -4,20 +4,21 @@ Builtins are the functions and modules that are implicitly imported into every m
|
||||
|
||||
### module/src/symbol.rs
|
||||
|
||||
Towards the bottom of `symbol.rs`, the `define_builtins!` macro is invoked with many module and function names. The first level (`List`, `Int` ..) is the module name, and the second level is the function or value name (`reverse`, `mod` ..). If you wanted to add an `Int` function called `addTwo`, go to `2 Int: "Int" => {` and, inside that case, add `38 INT_ADD_TWO: "addTwo"` at the bottom (assuming there are 37 existing ones).
|
||||
|
||||
Some of these have `#` inside their name (`first#list`, `#lt` ..). This is a trick we use to hide implementation details from Roc programmers. To a Roc programmer, a name with `#` in it is invalid, because `#` means everything after it is parsed as a comment. We construct these functions manually, so we bypass the parsing step and don't have such restrictions. We get to make functions and values with `#` in their names, and as a consequence Roc programmers simply cannot reference them.
|
||||
|
||||
But we can use these values, and some of them are necessary for implementing builtins. For example, `List.get` returns tags, and it is not easy for us to create tags when composing LLVM. What is easier, however, is:
|
||||
|
||||
- ..writing `List.#getUnsafe` that has the dangerous signature of `List elem, Nat -> elem` in LLVM
|
||||
- ..writing `List elem, Nat -> Result elem [ OutOfBounds ]*` in a type safe way that uses `getUnsafe` internally, only after it checks if the `elem` at `Nat` index exists.
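
The two-step split above can be sketched in plain Rust. This is only an illustration of the pattern, not the compiler's code: `get_unsafe`, `get`, and `GetError` are hypothetical names standing in for `List.#getUnsafe`, `List.get`, and the `OutOfBounds` tag.

```rust
/// Hypothetical stand-in for `List.#getUnsafe`: performs no bounds check.
///
/// # Safety
/// The caller must guarantee that `index < list.len()`.
unsafe fn get_unsafe<T: Copy>(list: &[T], index: usize) -> T {
    // `get_unchecked` is undefined behavior when `index` is out of bounds.
    unsafe { *list.get_unchecked(index) }
}

/// Hypothetical stand-in for the `[ OutOfBounds ]*` tag union.
#[derive(Debug, PartialEq)]
enum GetError {
    OutOfBounds,
}

/// Hypothetical stand-in for `List.get`: checks the index first and only
/// then defers to the unchecked version.
fn get<T: Copy>(list: &[T], index: usize) -> Result<T, GetError> {
    if index < list.len() {
        // SAFETY: the branch condition guarantees `index` is in bounds.
        Ok(unsafe { get_unsafe(list, index) })
    } else {
        Err(GetError::OutOfBounds)
    }
}
```

The unchecked function stays small and easy to write at a low level, while the tag-returning logic lives in ordinary, type-checked code.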
|
||||
|
||||
|
||||
### can/src/builtins.rs
|
||||
|
||||
Right at the top of this module is a function called `builtin_defs`. All it does is map each `Symbol` defined in `module/src/symbol.rs` to its implementation. Some of the builtins are quite complex, such as `list_get`. What makes `list_get` complex is that it returns tags, and in order to return tags it first has to defer to lower-level functions via an if statement.
|
||||
|
||||
Let's look at `List.repeat : elem, Nat -> List elem`, which is more straightforward and points directly to its lower-level implementation:
|
||||
|
||||
```
|
||||
fn list_repeat(symbol: Symbol, var_store: &mut VarStore) -> Def {
|
||||
let elem_var = var_store.fresh();
|
||||
@ -42,29 +43,41 @@ fn list_repeat(symbol: Symbol, var_store: &mut VarStore) -> Def {
|
||||
)
|
||||
}
|
||||
```
|
||||
|
||||
In these builtin definitions you will need to allocate for and list the arguments. For `List.repeat`, the arguments are the `elem_var` and the `len_var`. So in both the `body` and `defn` we list these arguments in a vector, with `Symbol::ARG_1` and `Symbol::ARG_2` designating which argument is which.
|
||||
|
||||
Since `List.repeat` is implemented entirely as low-level functions, its `body` is a `RunLowLevel`, and the `op` is `LowLevel::ListRepeat`. Let's talk about `LowLevel` in the next section.
|
||||
|
||||
## Connecting the definition to the implementation
|
||||
|
||||
### module/src/low_level.rs
|
||||
|
||||
`LowLevel` connects the builtin defined in this module to its implementation. It's referenced in `can/src/builtins.rs` and used in `gen/src/llvm/build.rs`.
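
As a rough illustration of that wiring, here is a minimal, self-contained Rust sketch. It is not the compiler's code: `Value`, `builtin_op`, and the string-keyed lookup are hypothetical simplifications, and only the `LowLevel` variant names are borrowed from the compiler.

```rust
/// Toy version of the `LowLevel` enum: one variant per primitive operation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LowLevel {
    StrTrim,
    StrRepeat,
}

/// Hypothetical runtime value, standing in for the values the backend handles.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Nat(usize),
}

/// Definition side (the role `can/src/builtins.rs` plays): map a builtin's
/// name to the low-level op that its body runs.
fn builtin_op(name: &str) -> Option<LowLevel> {
    match name {
        "Str.trim" => Some(LowLevel::StrTrim),
        "Str.repeat" => Some(LowLevel::StrRepeat),
        _ => None,
    }
}

/// Implementation side (the role `gen/src/llvm/build.rs` plays): dispatch on
/// the op and run the matching implementation.
fn run_low_level(op: LowLevel, args: &[Value]) -> Value {
    match (op, args) {
        // Str.trim : Str -> Str
        (LowLevel::StrTrim, [Value::Str(s)]) => Value::Str(s.trim().to_string()),
        // Str.repeat : Str, Nat -> Str
        (LowLevel::StrRepeat, [Value::Str(s), Value::Nat(n)]) => Value::Str(s.repeat(*n)),
        _ => panic!("wrong arguments for {:?}", op),
    }
}
```

Because both sides share one enum, forgetting to handle a new variant in `run_low_level` is a compile error rather than a silent gap.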
|
||||
|
||||
## Bottom level LLVM values and functions
|
||||
|
||||
### gen/src/llvm/build.rs
|
||||
|
||||
This is where bottom-level functions that need to be written as LLVM are created. If the function leads to a tag, that's a good sign it should not be written here in `build.rs`. If it's simple, fundamental stuff like `INT_ADD`, then it certainly should be written here.
|
||||
|
||||
## Letting the compiler know these functions exist
|
||||
|
||||
### builtins/src/std.rs
|
||||
|
||||
It's one thing to actually write these functions; it's _another_ thing to let the Roc compiler know they exist as part of the standard library. You have to tell the compiler "Hey, this function exists, and it has this type signature". That happens in `std.rs`.
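
Conceptually, this registration is a table from each builtin to its type signature. The Rust sketch below shows the idea under simplifying assumptions: the `Type` enum and the string keys are hypothetical, and the real `std.rs` works with `Symbol` and `SolvedType` rather than these stand-ins.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified type representation.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Str,
    Nat,
    /// Argument types and return type of a function.
    Func(Vec<Type>, Box<Type>),
}

/// Build the table that tells the type checker which builtins exist and
/// what their signatures are.
fn types() -> HashMap<&'static str, Type> {
    let mut types = HashMap::new();

    // trim : Str -> Str
    types.insert(
        "Str.trim",
        Type::Func(vec![Type::Str], Box::new(Type::Str)),
    );

    // repeat : Str, Nat -> Str
    types.insert(
        "Str.repeat",
        Type::Func(vec![Type::Str, Type::Nat], Box::new(Type::Str)),
    );

    types
}
```

A builtin that is implemented but missing from this table would be invisible to type inference, which is why this step is separate from writing the function itself.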
|
||||
|
||||
## Specifying how we pass args to the function
|
||||
|
||||
### builtins/mono/src/borrow.rs
|
||||
After we have all of this, we need to specify whether the arguments we're passing are owned, borrowed, or irrelevant. Towards the bottom of this file, add a new case for your builtin and specify each arg. Be sure to read the comment, as it explains this in more detail.
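
A borrow signature is essentially one flag per argument. The sketch below is a simplified Rust model of that idea: the per-op entries mirror the cases visible in this commit's `borrow.rs` hunk, but the constant values (`OWNED = false`, `BORROWED = true`, with "irrelevant" treated as owned) and the `'static` slices in place of arena allocation are assumptions made for illustration.

```rust
// Assumed encoding: one bool per argument.
const OWNED: bool = false;
const BORROWED: bool = true;
// Assumption: arguments that are not reference counted are treated as owned.
const IRRELEVANT: bool = OWNED;

/// A few ops, named after the cases in the hunk.
#[derive(Debug, Clone, Copy)]
enum LowLevel {
    ListGetUnsafe,
    StrConcat,
    StrTrim,
    StrSplit,
}

/// Return one flag per argument of the given op.
fn lowlevel_borrow_signature(op: LowLevel) -> &'static [bool] {
    match op {
        // The list is only read; the index is a plain number.
        LowLevel::ListGetUnsafe => &[BORROWED, IRRELEVANT],
        LowLevel::StrConcat => &[OWNED, BORROWED],
        // The string is owned, so it may be reused in place when it is unique.
        LowLevel::StrTrim => &[OWNED],
        LowLevel::StrSplit => &[BORROWED, BORROWED],
    }
}
```

The slice length doubles as a check on arity: `StrTrim` takes one argument, so its signature has exactly one entry.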
|
||||
|
||||
## Testing it
|
||||
|
||||
### solve/tests/solve_expr.rs
|
||||
|
||||
To make sure that Roc is properly inferring the type of the new builtin, add a test to this file similar to:
|
||||
|
||||
```
|
||||
#[test]
|
||||
fn atan() {
|
||||
@ -78,11 +91,14 @@ fn atan() {
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
But replace `Num.atan` and the type signature with the new builtin.
|
||||
|
||||
### gen/test/\*.rs
|
||||
|
||||
In this directory, there are a few files like `gen_num.rs`, `gen_str.rs`, etc. Tests for `Str` module builtins go in `gen_str.rs`, and so on. Find the file for the new builtin's module, and add a test like:
|
||||
```
|
||||
|
||||
````
|
||||
#[test]
|
||||
fn atan() {
|
||||
assert_evals_to!("Num.atan 10", 1.4711276743037347, f64);
|
||||
@ -96,3 +112,4 @@ When implementing a new builtin, it is often easy to copy and paste the implemen
|
||||
|
||||
- `List.keepIf` did not work for a long time because in builtins its `LowLevel` was `ListMap`. This was because I copied and pasted the `List.map` implementation in `builtins.rs`.
|
||||
- `List.walkBackwards` had mysterious memory bugs for a little while because in `unique.rs` its return type was `list_type(flex(b))` instead of `flex(b)`, since it was copied and pasted from `List.keepIf`.
|
||||
````
|
||||
|
@ -121,6 +121,7 @@ comptime {
|
||||
exportStrFn(str.fromUtf8C, "from_utf8");
|
||||
exportStrFn(str.fromUtf8RangeC, "from_utf8_range");
|
||||
exportStrFn(str.repeat, "repeat");
|
||||
exportStrFn(str.strTrim, "trim");
|
||||
}
|
||||
|
||||
// Utils
|
||||
|
@ -1504,9 +1504,13 @@ test "isWhitespace" {
|
||||
try expect(!isWhitespace('x'));
|
||||
}
|
||||
|
||||
pub fn strTrim(string: RocStr) callconv(.C) RocStr {
|
||||
return @call(.{ .modifier = always_inline }, trim, .{string});
|
||||
}
|
||||
|
||||
// TODO GIESCH
|
||||
// ask & read about small & large strings
|
||||
fn strTrim(string: RocStr) RocStr {
|
||||
fn trim(string: RocStr) RocStr {
|
||||
if (string.isEmpty()) return RocStr.empty();
|
||||
|
||||
const leading_bytes = countLeadingWhitespaceBytes(string);
|
||||
@ -1518,7 +1522,7 @@ fn strTrim(string: RocStr) RocStr {
|
||||
}
|
||||
|
||||
// TODO GIESCH
|
||||
// should this just use isUnique? (are small strings safe for mutation?)
|
||||
// should this just use isUnique? (are all small strings safe for mutation?)
|
||||
// should we rename isUnique to isUnleakable or something?
|
||||
// could also just inline the unsafe reallocate call
|
||||
if (string.isRefcountOne()) {
|
||||
@ -1643,4 +1647,4 @@ test "strTrim: unique hello world" {
|
||||
}
|
||||
|
||||
// TODO GIESCH
|
||||
// wire up to actual Roc code, add top level tests
|
||||
// add top level roc tests
|
||||
|
@ -142,6 +142,7 @@ pub const STR_TO_UTF8: &str = "roc_builtins.str.to_utf8";
|
||||
pub const STR_FROM_UTF8: &str = "roc_builtins.str.from_utf8";
|
||||
pub const STR_FROM_UTF8_RANGE: &str = "roc_builtins.str.from_utf8_range";
|
||||
pub const STR_REPEAT: &str = "roc_builtins.str.repeat";
|
||||
pub const STR_TRIM: &str = "roc_builtins.str.trim";
|
||||
|
||||
pub const DICT_HASH: &str = "roc_builtins.dict.hash";
|
||||
pub const DICT_HASH_STR: &str = "roc_builtins.dict.hash_str";
|
||||
|
@ -632,6 +632,9 @@ pub fn types() -> MutMap<Symbol, (SolvedType, Region)> {
|
||||
Box::new(str_type())
|
||||
);
|
||||
|
||||
// trim : Str -> Str
|
||||
add_top_level_function_type!(Symbol::STR_TRIM, vec![str_type()], Box::new(str_type()));
|
||||
|
||||
// fromUtf8 : List U8 -> Result Str [ BadUtf8 Utf8Problem ]*
|
||||
{
|
||||
let bad_utf8 = SolvedType::TagUnion(
|
||||
|
@ -67,6 +67,7 @@ pub fn builtin_defs_map(symbol: Symbol, var_store: &mut VarStore) -> Option<Def>
|
||||
STR_TO_UTF8 => str_to_utf8,
|
||||
STR_FROM_FLOAT=> str_from_float,
|
||||
STR_REPEAT => str_repeat,
|
||||
STR_TRIM => str_trim,
|
||||
LIST_LEN => list_len,
|
||||
LIST_GET => list_get,
|
||||
LIST_SET => list_set,
|
||||
@ -1236,6 +1237,26 @@ fn str_split(symbol: Symbol, var_store: &mut VarStore) -> Def {
|
||||
)
|
||||
}
|
||||
|
||||
/// Str.trim : Str -> Str
|
||||
fn str_trim(symbol: Symbol, var_store: &mut VarStore) -> Def {
|
||||
// TODO GIESCH understand when/why this can be reused
|
||||
let str_var = var_store.fresh();
|
||||
|
||||
let body = RunLowLevel {
|
||||
op: LowLevel::StrTrim,
|
||||
args: vec![(str_var, Var(Symbol::ARG_1))],
|
||||
ret_var: str_var,
|
||||
};
|
||||
|
||||
defn(
|
||||
symbol,
|
||||
vec![(str_var, Symbol::ARG_1)],
|
||||
var_store,
|
||||
body,
|
||||
str_var,
|
||||
)
|
||||
}
|
||||
|
||||
/// Str.repeat : Str, Nat -> Str
|
||||
fn str_repeat(symbol: Symbol, var_store: &mut VarStore) -> Def {
|
||||
let str_var = var_store.fresh();
|
||||
|
@ -17,7 +17,7 @@ use crate::llvm::build_list::{
|
||||
use crate::llvm::build_str::{
|
||||
empty_str, str_concat, str_count_graphemes, str_ends_with, str_from_float, str_from_int,
|
||||
str_from_utf8, str_from_utf8_range, str_join_with, str_number_of_bytes, str_repeat, str_split,
|
||||
str_starts_with, str_starts_with_code_point, str_to_utf8,
|
||||
str_starts_with, str_starts_with_code_point, str_to_utf8, str_trim,
|
||||
};
|
||||
use crate::llvm::compare::{generic_eq, generic_neq};
|
||||
use crate::llvm::convert::{
|
||||
@ -4953,6 +4953,12 @@ fn run_low_level<'a, 'ctx, 'env>(
|
||||
|
||||
str_count_graphemes(env, scope, args[0])
|
||||
}
|
||||
StrTrim => {
|
||||
// Str.trim : Str -> Str
|
||||
debug_assert_eq!(args.len(), 1);
|
||||
|
||||
str_trim(env, scope, args[0])
|
||||
}
|
||||
ListLen => {
|
||||
// List.len : List * -> Int
|
||||
debug_assert_eq!(args.len(), 1);
|
||||
|
@ -249,6 +249,16 @@ pub fn str_count_graphemes<'a, 'ctx, 'env>(
|
||||
)
|
||||
}
|
||||
|
||||
/// Str.trim : Str -> Str
|
||||
pub fn str_trim<'a, 'ctx, 'env>(
|
||||
env: &Env<'a, 'ctx, 'env>,
|
||||
scope: &Scope<'a, 'ctx>,
|
||||
str_symbol: Symbol,
|
||||
) -> BasicValueEnum<'ctx> {
|
||||
let str_i128 = str_symbol_to_c_abi(env, scope, str_symbol);
|
||||
call_bitcode_fn(env, &[str_i128.into()], bitcode::STR_TRIM)
|
||||
}
|
||||
|
||||
/// Str.fromInt : Int -> Str
|
||||
pub fn str_from_int<'a, 'ctx, 'env>(
|
||||
env: &Env<'a, 'ctx, 'env>,
|
||||
|
@ -17,6 +17,7 @@ pub enum LowLevel {
|
||||
StrToUtf8,
|
||||
StrRepeat,
|
||||
StrFromFloat,
|
||||
StrTrim,
|
||||
ListLen,
|
||||
ListGetUnsafe,
|
||||
ListSet,
|
||||
@ -123,6 +124,7 @@ macro_rules! first_order {
|
||||
| StrFromUtf8Range
|
||||
| StrToUtf8
|
||||
| StrRepeat
|
||||
| StrTrim
|
||||
| StrFromFloat
|
||||
| ListLen
|
||||
| ListGetUnsafe
|
||||
|
@ -1015,6 +1015,7 @@ define_builtins! {
|
||||
17 STR_ALIAS_ANALYSIS_STATIC: "#aliasAnalysisStatic" // string with the static lifetime
|
||||
18 STR_FROM_UTF8_RANGE: "fromUtf8Range"
|
||||
19 STR_REPEAT: "repeat"
|
||||
20 STR_TRIM: "trim"
|
||||
}
|
||||
4 LIST: "List" => {
|
||||
0 LIST_LIST: "List" imported // the List.List type alias
|
||||
|
@ -922,6 +922,7 @@ pub fn lowlevel_borrow_signature(arena: &Bump, op: LowLevel) -> &[bool] {
|
||||
ListGetUnsafe => arena.alloc_slice_copy(&[borrowed, irrelevant]),
|
||||
ListConcat => arena.alloc_slice_copy(&[owned, owned]),
|
||||
StrConcat => arena.alloc_slice_copy(&[owned, borrowed]),
|
||||
StrTrim => arena.alloc_slice_copy(&[owned]),
|
||||
StrSplit => arena.alloc_slice_copy(&[borrowed, borrowed]),
|
||||
ListSingle => arena.alloc_slice_copy(&[irrelevant]),
|
||||
ListRepeat => arena.alloc_slice_copy(&[irrelevant, borrowed]),
|
||||
|