Kevin Gillette: markdown typo fixes

Luke Boswell: move Str.md as it looks more like a design doc than rust crate

Co-authored-by: Luke Boswell <lukewilliamboswell@gmail.com>
Kevin Gillette 2023-02-22 21:19:29 -07:00
parent 4f48873178
commit 0321f91c70
14 changed files with 272 additions and 63 deletions


@@ -4,7 +4,7 @@ If you run into any problems getting Roc built from source, please ask for help
## Using Nix
-On Macos and Linux, we highly recommend Using [nix](https://nixos.org/download.html) to quickly install all dependencies necessary to build roc.
+On MacOS and Linux, we highly recommend Using [nix](https://nixos.org/download.html) to quickly install all dependencies necessary to build roc.
:warning: If you tried to run `cargo` in the repo folder before installing nix, make sure to execute `cargo clean` first. To prevent you from executing `cargo` outside of nix, tools like [direnv](https://github.com/nix-community/nix-direnv) and [lorri](https://github.com/nix-community/lorri) can put you in a nix shell automatically when you `cd` into the directory.
@@ -177,7 +177,7 @@ error: No suitable version of LLVM was found system-wide or pointed
Add `export LLVM_SYS_130_PREFIX=/usr/lib/llvm-13` to your `~/.bashrc` or equivalent file for your shell.
-### LLVM installation on macOS
+### LLVM installation on MacOS
If installing LLVM fails, it might help to run `sudo xcode-select -r` before installing again.


@@ -96,7 +96,7 @@ In case you have multiple commits, you can sign them in two ways:
- Find the oldest commit you want to sign, using the `git log --show-signature` command.
- Run the command `git rebase --exec 'git commit --amend --no-edit -n -S' -i HASH` which would sign all commits up to commit `HASH`.
-If you already pushed unsigned commits, you mmay have to do a force push with `git push origin -f <branch_name>`.
+If you already pushed unsigned commits, you may have to do a force push with `git push origin -f <branch_name>`.
## Can we do better?


@@ -18,7 +18,7 @@ You can run your new tests locally using `cargo test-gen-llvm`. You can add a fi
Towards the bottom of `symbol.rs` there is a `define_builtins!` macro being used that takes many modules and function names. The first level (`List`, `Int` ..) is the module name, and the second level is the function or value name (`reverse`, `rem` ..). If you wanted to add a `Int` function called `addTwo` go to `2 Int: "Int" => {` and inside that case add to the bottom `38 INT_ADD_TWO: "addTwo"` (assuming there are 37 existing ones).
-Some of these have `#` inside their name (`first#list`, `#lt` ..). This is a trick we are doing to hide implementation details from Roc programmers. To a Roc programmer, a name with `#` in it is invalid, because `#` means everything after it is parsed to a comment. We are constructing these functions manually, so we are circumventing the parsing step and dont have such restrictions. We get to make functions and values with `#` which as a consequence are not accessible to Roc programmers. Roc programmers simply cannot reference them.
+Some of these have `#` inside their name (`first#list`, `#lt` ..). This is a trick we are doing to hide implementation details from Roc programmers. To a Roc programmer, a name with `#` in it is invalid, because `#` means everything after it is parsed to a comment. We are constructing these functions manually, so we are circumventing the parsing step and don't have such restrictions. We get to make functions and values with `#` which as a consequence are not accessible to Roc programmers. Roc programmers simply cannot reference them.
But we can use these values and some of these are necessary for implementing builtins. For example, `List.get` returns tags, and it is not easy for us to create tags when composing LLVM. What is easier however, is:
@@ -70,7 +70,7 @@ This `LowLevel` thing connects the builtin defined in this module to its impleme
### gen/src/llvm/build.rs
-This is where bottom-level functions that need to be written as LLVM are created. If the function leads to a tag thats a good sign it should not be written here in `build.rs`. If it's simple fundamental stuff like `INT_ADD` then it certainly should be written here.
+This is where bottom-level functions that need to be written as LLVM are created. If the function leads to a tag that's a good sign it should not be written here in `build.rs`. If it's simple fundamental stuff like `INT_ADD` then it certainly should be written here.
## Letting the compiler know these functions exist
@@ -121,7 +121,7 @@ But replace `Num.atan`, the return value, and the return type with your new buil
## Mistakes that are easy to make!!
-When implementing a new builtin, it is often easy to copy and paste the implementation for an existing builtin. This can take you quite far since many builtins are very similar, but it also risks forgetting to change one small part of what you copy and pasted and losing a lot of time later on when you cant figure out why things dont work. So, speaking from experience, even if you are copying an existing builtin, try and implement it manually without copying and pasting. Two recent instances of this (as of September 7th, 2020):
+When implementing a new builtin, it is often easy to copy and paste the implementation for an existing builtin. This can take you quite far since many builtins are very similar, but it also risks forgetting to change one small part of what you copy and pasted and losing a lot of time later on when you cant figure out why things don't work. So, speaking from experience, even if you are copying an existing builtin, try and implement it manually without copying and pasting. Two recent instances of this (as of September 7th, 2020):
- `List.keepIf` did not work for a long time because in builtins its `LowLevel` was `ListMap`. This was because I copy and pasted the `List.map` implementation in `builtins.rs`.
- `List.walkBackwards` had mysterious memory bugs for a little while because in `unique.rs` its return type was `list_type(flex(b))` instead of `flex(b)` since it was copy and pasted from `List.keepIf`.


@@ -21,7 +21,7 @@ implement them in a higher-level language like Zig, then compile
the result to LLVM bitcode, and import that bitcode into the compiler.
Compiling the bitcode happens automatically in a Rust build script at `compiler/builtins/build.rs`.
-Then `builtins/src/bitcode/rs` staticlly imports the compiled bitcode for use in the compiler.
+Then `builtins/src/bitcode/rs` statically imports the compiled bitcode for use in the compiler.
You can find the compiled bitcode in `target/debug/build/roc_builtins-[some random characters]/out/builtins.bc`.
There will be two directories like `roc_builtins-[some random characters]`, look for the one that has an
@@ -33,4 +33,4 @@ There will be two directories like `roc_builtins-[some random characters]`, look
## Calling bitcode functions
-use the `call_bitcode_fn` function defined in `llvm/src/build.rs` to call bitcode functions.
+Use the `call_bitcode_fn` function defined in `llvm/src/build.rs` to call bitcode functions.


@@ -1,11 +1,11 @@
# Dev Backend
The dev backend is focused on generating decent binaries extremely fast.
-It goes from Roc's [mono ir](https://github.com/roc-lang/roc/blob/main/crates/compiler/mono/src/ir.rs) to an object file ready to be linked.
+It goes from Roc's [Mono IR](https://github.com/roc-lang/roc/blob/main/crates/compiler/mono/src/ir.rs) to an object file ready to be linked.
## General Process
-The backend is essentially defined as two recursive match statement over the mono ir.
+The backend is essentially defined as two recursive match statement over the Mono IR.
The first pass is used to do simple linear scan lifetime analysis.
In the future it may be expanded to add a few other quick optimizations.
The second pass is the actual meat of the backend that generates the byte buffer of output binary.
@@ -14,32 +14,32 @@ The process is pretty simple, but can get quite complex when you have to deal wi
## Core Abstractions
This library is built with a number of core traits/generic types that may look quite weird at first glance.
-The reason for all of the generics and traits is to allow rust to optimize each target specific backend.
-Instead of every needing an `if linux ...` or `if arm ...` statement within the backend,
-rust should be abled compile each specific target (`linux-arm`, `darwin-x86_64`, etc) as a static optimized backend without branches on target or dynamic dispatch.
+The reason for all of the generics and traits is to allow Rust to optimize each target-specific backend.
+Instead of needing an `if linux ...` or `if arm ...` statement everywhere within the backend,
+Rust should be able to compile each specific target (`linux-arm`, `darwin-x86_64`, etc) as a static optimized backend without branches on target or dynamic dispatch.
**Note:** links below are to files, not specific lines. Just look up the specific type in the file.
### Backend
[Backend](https://github.com/roc-lang/roc/blob/main/crates/compiler/gen_dev/src/lib.rs) is the core abstraction.
-It understands Roc's mono ir and some high level ideas about the generation process.
-The main job of Backend is to do high level optimizatons (like lazy literal loading) and parse the mono ir.
+It understands Roc's Mono IR and some high level ideas about the generation process.
+The main job of Backend is to do high level optimizations (like lazy literal loading) and parse the Mono IR.
Every target specific backend must implement this trait.
### Backend64Bit
[Backend64Bit](https://github.com/roc-lang/roc/blob/main/crates/compiler/gen_dev/src/generic64/mod.rs) is more or less what it sounds like.
-It is the backend that understands 64 bit architectures.
-Currently it is the only backend implementation, but a 32 bit implementation will probably come in the future.
-This backend understands that the unit of data movement is 64 bit.
-It also knows about things common to all 64 bit architectures (general purpose registers, stack, float regs, etc).
+It is the backend that understands 64-bit architectures.
+Currently it is the only backend implementation, but a 32-bit implementation will probably come in the future.
+This backend understands that the unit of data movement is 64-bit.
+It also knows about things common to all 64-bit architectures (general purpose registers, stack, float regs, etc).
-If you look at the signiture for Backend64Bit, it is actually quite complex.
+If you look at the signature for Backend64Bit, it is actually quite complex.
Backend64Bit is generic over things like the register type, assembler, and calling convention.
This enables the backend to support multiple architectures and operating systems.
For example, the `windows-x86_64` would use the x86 register set, the x86 assembler, and the x86 windows calling convention.
-`darwin-x86_64` and `linux-x86_64` would use the same register set and assembler, but they would use the system v amd64 abi calling convention.
+`darwin-x86_64` and `linux-x86_64` would use the same register set and assembler, but they would use the System V AMD64 ABI calling convention.
Backend64Bit is generic over these types instead of containing these types within its struct to avoid the cost of dynamic dispatch.
### Assembler
@@ -71,8 +71,8 @@ This is the general procedure I follow with some helpful links:
1. Pick/write the simplest test case you can find for the new feature.
Just add `feature = "gen-dev"` to the `cfg` line for the test case.
1. Uncomment the code to print out procedures [from here](https://github.com/roc-lang/roc/blob/b03ed18553569314a420d5bf1fb0ead4b6b5ecda/compiler/test_gen/src/helpers/dev.rs#L76) and run the test.
-It should fail and print out the mono ir for this test case.
-Seeing the actual mono ir tends to be very helpful for complex additions.
+It should fail and print out the Mono IR for this test case.
+Seeing the actual Mono IR tends to be very helpful for complex additions.
1. Generally it will fail in one of the match statements in the [Backend](https://github.com/roc-lang/roc/blob/main/crates/compiler/gen_dev/src/lib.rs) trait.
Add the correct pattern matching and likely new function for your new builtin.
This will break the compile until you add the same function to places that implement the trait,
@@ -83,7 +83,7 @@ This is the general procedure I follow with some helpful links:
See the helpful resources section below for guides on figuring out assembly bytes.
1. Hopefully at some point everything compiles and the test is passing.
If so, yay. Now add more tests for the same feature and make sure you didn't miss the edge cases.
-1. If things aren't working, reach out on zulip. Get advice, maybe even pair.
+1. If things aren't working, reach out on Zulip. Get advice, maybe even pair.
1. Make a PR.
## Debugging x86_64 backend output
@@ -148,8 +148,8 @@ The output lines contain the hexadecimal representation of the x86 opcodes and f
Lets you inspect exactly what is generated in a binary.
Can inspect assembly, relocations, and more.
I use this all the time for debugging and inspecting C sample apps.
-May write a larger tutorial for this because it can be seriosly helpful.
-As a note, when dealing with relocatoins, please make sure to compile with PIC.
+May write a larger tutorial for this because it can be seriously helpful.
+As a note, when dealing with relocations, please make sure to compile with PIC.
- [Online Assembler](https://defuse.ca/online-x86-assembler.htm#disassembly) -
Useful for seeing the actual bytes generated by assembly instructions.
A lot of the time it gives one out of multiple options because x86_64 has many ways to do things.
@@ -172,5 +172,5 @@ The output lines contain the hexadecimal representation of the x86 opcodes and f
- If there is anything else basic that you want to know,
there is a good chance it is included in lectures from compiler courses.
Definitely look at some of the free MOOCs, lectures, or YouTube class recordings on the subject.
-- If you have any specific questions feel free to ping Brendan Hansknecht on zulip.
+- If you have any specific questions feel free to ping Brendan Hansknecht on Zulip.
- If you have any other resource that you find useful, please add them here.


@@ -1,4 +1,4 @@
-# Important things to add (Not necessarily in order)
+# Important things to add (not necessarily in order)
- Expand to way more builtins and assembly calls.
- Deal with function calling, basic layouts, and all the fun basics of argument passing.
@@ -10,7 +10,7 @@
Otherwise, many will always be inlined.
- Add basic const folding? It should be really easy to add as an optimization.
It should be a nice optimization for little cost. Just be sure to make it optional, otherwise our tests will do nothing.
-- Automatically build the zig builtins .o file and make it available here.
+- Automatically build the Zig builtins .o file and make it available here.
We will need to link against it and use it whenever we call specific builtins.
- Add unwind tables and landing pads.
- Add ability to wrap functions with exceptions or return a result.


@@ -6,7 +6,7 @@ This restriction enables ignoring most of linking.
## General Overview
-This linker is run in 2 phases: preprocessing and surigical linking.
+This linker is run in 2 phases: preprocessing and surgical linking.
### Platform Preprocessor


@@ -95,14 +95,14 @@ e.g. you have a test `calculate_sum_test` that only uses the function `add`, whe
- [Unreal Engine 4](https://www.unrealengine.com/en-US/)
- [Blueprints](https://docs.unrealengine.com/en-US/Engine/Blueprints/index.html) visual scripting (not suggesting visual scripting for Roc)
-- [Live Programing](https://www.microsoft.com/en-us/research/project/live-programming/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fprojects%2Fliveprogramming%2Ftypography.aspx#!publications) by [Microsoft Research] it contains many interesting research papers.
+- [Live Programming](https://www.microsoft.com/en-us/research/project/live-programming/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fprojects%2Fliveprogramming%2Ftypography.aspx#!publications) by [Microsoft Research] it contains many interesting research papers.
- [Math Inspector](https://mathinspector.com/), [github](https://github.com/MathInspector/MathInspector)
- [Lamdu](http://www.lamdu.org/) live functional programming.
- [Sourcetrail](https://www.sourcetrail.com/) nice tree-like source explorer.
- [Unisonweb](https://www.unisonweb.org), definition based [editor](https://twitter.com/shojberg/status/1364666092598288385) as opposed to file based.
- [Utopia](https://utopia.app/) integrated design and development environment for React. Design and code update each other, in real time.
- [Paredit](https://calva.io/paredit/) structural clojure editing, navigation and selection. [Another overview](http://danmidwood.com/content/2014/11/21/animated-paredit.html)
-- [tylr](https://tylr.fun/) projectional editor ux that helps you make it easier to do edits that are typically difficult with projectional editors but are easy with classic editors.
+- [tylr](https://tylr.fun/) projectional editor UX that helps you make it easier to do edits that are typically difficult with projectional editors but are easy with classic editors.
### Project exploration
@@ -110,7 +110,7 @@ e.g. you have a test `calculate_sum_test` that only uses the function `add`, whe
#### Inspiration
-- [Github Next](https://next.github.com/projects/repo-visualization) each file and folder is visualised as a circle: the circles color is the type of file, and the circles size represents the size of the file. Sidenote, a cool addition to this might be to use heatmap colors for the circles; circles for files that have had lots of commits could be more red, files with few commits would be blue.
+- [Github Next](https://next.github.com/projects/repo-visualization) each file and folder is visualised as a circle: the circles color is the type of file, and the circles size represents the size of the file. Side note: a cool addition to this might be to use heatmap colors for the circles; circles for files that have had lots of commits could be more red, files with few commits would be blue.
- [AppMap](https://appland.com/docs/appmap-overview.html) records code execution traces, collecting information about how your code works and what it does. Then it presents this information as interactive diagrams that you can search and navigate. In the diagrams, you can see exactly how functions, web services, data stores, security, I/O, and dependent services all work together when application code runs.
- [Discussion on flow based ( nodes and wires) programming](https://marianoguerra.github.io/future-of-coding-weekly/history/weekly/2022/08/W1/thinking-together.html#2022-07-25T00:47:49.408Z) if the wires are a mess, is your program a mess?
@@ -195,16 +195,16 @@ e.g. you have a test `calculate_sum_test` that only uses the function `add`, whe
- look for similar errors in github issues of the relevant libraries
- search stackoverflow questions
- search a local history of previously encountered errors and fixes
-- search through a database of our zullip questions
+- search through a database of our Zulip questions
- ...
- smart insert: press a shortcut and enter a plain english description of a code snippet you need. Examples: "convert string to list of chars", "sort list of records by field foo descending", "plot this list with date on x-axis"...
- After the user has refactored code to be simpler, try finding other places in the code base where the same simplification can be made.
- Show most commonly changed settings on first run so new users can quickly customize their experience. Keeping record of changed settings should be opt-in.
- Detection of multiple people within same company/team working on same code at the same time (opt-in).
- Autocorrect likely typos for stuff like `-<` when not in string.
-- If multiple functions are available for import, use function were types would match in insetion position.
+- If multiple functions are available for import, use function were types would match in insertion position.
- Recommend imports based on imports in other files in same project.
-- Machine Learning model to determine confidence in a possiblte auto import. Automatically add the importt if confidence is very high.
+- Machine Learning model to determine confidence in a possible auto import. Automatically add the import if confidence is very high.
- Ability to print logs in different color depending on which file they come from.
- Clicking on a log print should take you to the exact line of code that called the log function
- When detecting that the user is repeating a transformation such as replacing a string in a text manually, offer to do the replacement for all occurrences in this string/function/file/workspace.
@@ -342,7 +342,7 @@ e.g. you have a test `calculate_sum_test` that only uses the function `add`, whe
## Tutorials
-- Inclusion of step-by-step tutrials in Roc libraries, platforms or business specific code.
+- Inclusion of step-by-step tutorials in Roc libraries, platforms or business specific code.
- Having to set up your own website for a tutorial can be a lot of work, making it easy to make quality tutorials would make for a more delightful experience.
## High performance
@@ -369,25 +369,25 @@ e.g. you have a test `calculate_sum_test` that only uses the function `add`, whe
## Accessibility
-- Visual Imapirments
-No Animation is most benign form of cognitive disabity but really important base line of people with tense nerve system.
+- Visual Impairments
+No Animation is most benign form of cognitive disability but really important base line of people with tense nerve system.
Insensitivity to certain or all colors.
-Need of highcontrast
+Need of high contrast
Or Everything Magnified for me with no glasses.
Or Total blindness, where we need to communicate to the user through sound.
-Screen readers read trees of labeled elements. Each OS has different apis, but I think they are horrible. Just close your eyes and imagine listening to screen reader all day while you are using this majectic machines called computers.
-But blind people walk with a tool and they can react much better to sound/space relations than full on visal majority does. They are acute to sound as a spatial hint. And a hand for most of them is a very sensitive tool that can make sounds in space.
-Imagine if everytime for the user doesnt want to rely on shining rendered pixels on the screen for a feedback from machine, we make a acoustic room simulation, where with moving the "stick", either with mouse or with key arrows, we bump into one of the objects and that produces certain contextually appropriate sound (clean)*ding*
+Screen readers read trees of labeled elements. Each OS has different APIs, but I think they are horrible. Just close your eyes and imagine listening to screen reader all day while you are using these majestic machines called computers.
+But blind people walk with a tool and they can react much better to sound/space relations than full on visual majority does. They are acute to sound as a spatial hint. And a hand for most of them is a very sensitive tool that can make sounds in space.
+Imagine if every time the user doesn't want to rely on shining rendered pixels on the screen for a feedback from machine, we make a acoustic room simulation, where with moving the "stick", either with mouse or with key arrows, we bump into one of the objects and that produces certain contextually appropriate sound (clean)*ding*
-On the each level of abstraction they can make sounds more deeper, so then when you type letters you feel like you are playing with the sand (soft)*shh*. We would need help from some sound engineer about it, but imagine moving down, which can be voice triggered command for motion impaired, you hear (soft)*pup* and the name of the module, and then you have options and commands appropriate for the module, they could map to those basic 4 buttons that we trained user on, and he would shortcut all the soft talk with click of a button. Think of the satisfaction when you can skip the dialog of the game and get straight into action. (X) Open functions! each function would make a sound and say its name, unless you press search and start searching for a specific function inside module, if you want one you select or move to next.
+On the each level of abstraction they can make sounds more deeper, so then when you type letters you feel like you are playing with the sand (soft)*shh*. We would need help from some sound engineer about it, but imagine moving down, which can be voice triggered command for motion impaired, you hear (soft)*pup* and the name of the module, and then you have options and commands appropriate for the module, they could map to those basic 4 buttons that we trained user on, and he would shortcut all the soft talk with click of a button. Think of the satisfaction when you can skip the dialog of the game and get straight into action. (X) Open functions! Each function would make a sound and say its name, unless you press search and start searching for a specific function inside module, if you want one you select or move to next.
- Related idea: Playing sounds in rapid succession for different expressions in your program might be a high throughput alternative to stepping through your code line by line. I'd bet you quickly learn what your program should sound like. The difference in throughput would be even larger for those who need to rely on voice transcription.
-- Motor impariments
+- Motor impairments
[rant]BACKS OF CODERS ARE NOT HEALTHY! We need to change that![/neverstop]
Too much mouse waving and sitting for too long is bad for humans.
-Keyboard is basic accessability tool but
+Keyboard is basic accessibility tool but
Keyboard is also optional, some people have too shaky hands even for keyboard.
-They rely on eye tracking to move mouse cursor arond.
+They rely on eye tracking to move mouse cursor around.
If we employ *some* voice recognition functions we could make same interface as we could do for consoles where 4+2 buttons and directional pad would suffice.
That is 10 phrases that need to be pulled through as many possible translations so people don't have to pretend that they are from Maine or Texas so they get voice recognition to work. Believe me I was there with Apple's Siri :D That is why we have 10 phrases for movement and management and most basic syntax.
- Builtin fonts that can be read more easily by those with dyslexia.


@@ -5,7 +5,7 @@ Status: we invite you to try out abilities for beta use, and are working on reso
This design idea addresses a variety of problems in Roc at once. It also unlocks some very exciting benefits that I didn't expect at the outset! It's a significant addition to the language, but it also means two other language features can be removed, and numbers can get a lot simpler.
-Thankfully it's a nonbreaking change for most Roc code, and in the few places where it actually is a breaking change, the fix should consist only of shifting a handful of characters around. Still, it feels like a big change because of all the implications it brings. Here we go!
+Thankfully it's a non-breaking change for most Roc code, and in the few places where it actually is a breaking change, the fix should consist only of shifting a handful of characters around. Still, it feels like a big change because of all the implications it brings. Here we go!
## Background
Elm has a few specially constrained type variables: `number`, `comparable`, `appendable`, and the lesser-known `compappend`. Roc does not have these; it has no `appendable` or `compappend` equivalent, and instead of `number` and `comparable` it has:

design/language/RocStr.md (new file, 209 lines)

@@ -0,0 +1,209 @@
# RocStr
This is the in-memory representation for Roc's `Str`. To explain how `Str` is laid out in memory, it's helpful to start with how `List` is laid out.
## Empty list
An empty `List Str` is essentially this Rust type with all 0s in memory:
```rust
struct List {
    pointer: *const Str, // pointers are the same size as `usize`
    length: usize,
}
```
On a 64-bit system, this `struct` would take up 16B in memory. On a 32-bit system, it would take up 8B.
Here's what the fields mean:
- `pointer` is the memory address of the heap-allocated memory containing the `Str` elements. For an empty list, the pointer is null (that is, 0).
- `length` is the number of `Str` elements in the list. For an empty list, this is also 0.
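To sanity-check those sizes, here is a minimal Rust sketch. The struct is only an illustration of the layout described above, not the compiler's actual definition:

```rust
use std::mem::size_of;

// Illustrative stand-in for the list header described above:
// one pointer-sized field plus one `usize` length field.
#[repr(C)]
struct ListHeader {
    pointer: *const u8, // pointers are the same size as `usize`
    length: usize,
}

fn main() {
    // Two usize-sized fields: 16 bytes on a 64-bit target, 8 on 32-bit.
    assert_eq!(size_of::<ListHeader>(), 2 * size_of::<usize>());
    println!("ListHeader is {} bytes", size_of::<ListHeader>());
}
```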
## Nonempty list
Now let's say we define a `List Str` with two elements in it, like so: `["foo", "bar"]`.
First we'd have the `struct` above, with `length` set to 2. Then, we'd have some memory allocated on the heap, and `pointer` would store that memory's address.
Here's how that heap memory would be laid out on a 64-bit system. It's a total of 48 bytes.
```text
|------16B------|------16B------|---8B---|---8B---|
string #1 string #2 refcount unused
```
Just like how `List` is a `struct` that takes up `2 * usize` bytes in memory, `Str` takes up the same amount of memory - namely, 16B on a 64-bit system. That's why each of the two strings takes up 16B of this heap-allocated memory. (Those structs may also point to other heap memory, but they could also be empty strings! Either way we just store the structs in the list, which take up 16B.)
We'll get to what the refcount is for shortly, but first let's talk about the memory layout. The refcount is a `usize` integer, so 8B on our 64-bit system. Why is there 8B of unused memory after it?
This is because of memory alignment. Whenever a system loads some memory from a memory address, it's much more efficient if the address is a multiple of the number of bytes it wants to get. So if we want to load a 16B string struct, we want its address to be a multiple of 16.
When we're allocating memory on the heap, the way we specify what alignment we want is to say how big each element is, and how many of them we want. In this case, we say we want 16B elements, and we want 3 of them. Then we use the first 16B slot to store the 8B refcount, and the 8B after it are unused.
This is memory-inefficient, but it's the price we pay for having all the 16B strings stored in addresses that are multiples of 16. It'd be worse for performance if we tried to pack everything tightly, so we accept the memory inefficiency as a cost of achieving better overall execution speed.
> Note: if we happened to have 8B elements instead of 16B elements, the alignment would be 8 anyway and we'd have no unused memory.
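The padding arithmetic above can be sketched with a standard round-up-to-alignment helper (the helper name is made up for illustration, not taken from Roc's code):

```rust
// Round `size` up to the next multiple of `align` (a power of two).
fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) & !(align - 1)
}

fn main() {
    // An 8B refcount stored ahead of 16B-aligned string structs occupies
    // a full 16B slot: 8B used, 8B padding.
    assert_eq!(round_up(8, 16), 16);
    // With 8B elements the alignment is 8, so no padding is wasted.
    assert_eq!(round_up(8, 8), 8);
}
```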
## Reference counting
Let's go back to the refcount - short for "reference count."
The refcount is a `usize` integer which counts how many times this `List` has been shared. For example, if we named this list `myList` and then wrote `[myList, myList, myList]` then we'd increment that refcount 3 times because `myList` is now being shared three more times.
If we were to later call `List.pop` on that list, and the result was an in-place mutation that removed one of the `myList` entries, we'd decrement the refcount. If we did that again and again until the refcount got all the way down to 0, meaning nothing is using it anymore, then we'd deallocate these 48B of heap memory because nobody is using them anymore.
In some cases, the compiler can detect that no reference counting is necessary. In that scenario, it doesn't bother allocating extra space for the refcount; instead, it inserts an instruction to allocate the memory at the appropriate place, another to free it later, and that's it.
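As a toy model of the sharing rule just described (an illustration only, not the compiler's implementation):

```rust
// Toy model of the refcounting rule described above.
struct Allocation {
    refcount: usize,
    live: bool, // whether the heap memory is still allocated
}

fn share(a: &mut Allocation) {
    a.refcount += 1;
}

fn unshare(a: &mut Allocation) {
    a.refcount -= 1;
    if a.refcount == 0 {
        a.live = false; // nothing references the memory, so free it
    }
}

fn main() {
    // `[myList, myList, myList]` shares `myList` three more times...
    let mut a = Allocation { refcount: 1, live: true };
    for _ in 0..3 {
        share(&mut a);
    }
    assert_eq!(a.refcount, 4);
    // ...and once every reference is gone, the heap memory is deallocated.
    for _ in 0..4 {
        unshare(&mut a);
    }
    assert!(!a.live);
}
```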
## Pointing to the first element
The fact that the reference count may or may not be present could create a tricky situation for some `List` operations.
For example, should `List.get 0` return the first 16B of the heap-allocated bytes, or the second 16B? If there's a reference count in the first 16B, it should return the second 16B. If there's no refcount, it should return the first 16B.
To solve this, the pointer in the List struct *always* points to the first element in the list. That means to access the reference count, it does negative pointer arithmetic to get the address at 16B *preceding* the memory address it has stored in its pointer field.
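That negative pointer arithmetic can be sketched as follows. The function name and generic signature are assumptions for illustration; the padding rule (the slot before the elements, widened to the elements' alignment) follows the layout described above.

```rust
use std::mem::{align_of, size_of};

/// Given a pointer to a list's first element, compute the address of the
/// slot holding its refcount (or capacity). The slot sits immediately
/// before the elements, padded out to the elements' alignment.
unsafe fn rc_slot<T>(first_elem: *const T) -> *const isize {
    // The gap is whichever is larger: the refcount itself, or the
    // alignment the elements require (16B in the `List Str` example).
    let offset = size_of::<usize>().max(align_of::<T>());
    first_elem.cast::<u8>().sub(offset).cast::<isize>()
}

fn main() {
    // Simulate a heap allocation: one usize slot for the refcount,
    // followed by usize elements (so no extra padding is needed).
    let storage = [0usize; 4];
    let first_elem: *const usize = &storage[1];
    unsafe {
        let slot = rc_slot(first_elem);
        assert_eq!(slot as usize, &storage[0] as *const usize as usize);
    }
}
```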
## Growing lists
If uniqueness typing tells us that a list is Unique, we know two things about it:
1. It doesn't need a refcount, because nothing else ever references it.
2. It can be mutated in-place.
One of the in-place mutations that can happen to a list is that its length can increase. For example, if I call `List.append list1 list2`, and `list1` is unique, then we'll attempt to append `list2`'s contents in-place into `list1`.
Calling `List.append` on a Shared list results in allocating a new chunk of heap memory large enough to hold both lists (with a fresh refcount, since nothing is referencing the new memory yet), then copying the contents of both lists into the new memory, and finally decrementing the refcount of the old memory.
Calling `List.append` on a Unique list can potentially be done in-place instead.
First, `List.append` repurposes the `usize` slot normally used to store refcount, and stores a `capacity` counter in there instead of a refcount. (After all, unique lists don't need to be refcounted.) A list's capacity refers to how many elements the list *can* hold given the memory it has allocated to it, which is always guaranteed to be at least as many as its length.
When calling `List.append list1 list2` on a unique `list1`, first we'll check whether `list1.capacity >= list1.length + list2.length`. If so, we can copy in the new values without needing to allocate more memory for `list1`.
If there is not enough capacity to fit both lists, then we can try to call [`realloc`](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/realloc?view=vs-2019) to hopefully extend the size of our allocated memory. If `realloc` succeeds (meaning there happened to be enough free memory right after our current allocation), then we update `capacity` to reflect the new amount of space, and move on.
> **Note:** The reason we store capacity right after the last element in the list is because of how memory cache lines work. Whenever we need to access `capacity`, it's because we're about to increase the length of the list, which means we will almost certainly be writing to the memory location right after its last element. That in turn means we'll need to have that memory location in cache, which in turn means that looking up the `capacity` there is guaranteed not to cause a cache miss. (It's possible that writing the new capacity value to a later address could cause a cache miss, but this strategy minimizes the chance of that happening.)
>
> An alternate design would be to store the capacity right before the first element in the list. In that design we wouldn't have to re-write the capacity value at the end of the list every time we grew it, but we'd be much more likely to incur cache misses that way, because we'd be working at the end of the list, not at the beginning. Cache misses are many times more expensive than an extra write to a memory address that's already in cache, not to mention the potential extra load instruction to add the length to the memory address of the first element (instead of subtracting 1 from that address). So we optimize for minimizing the highly expensive cache misses by always paying a tiny additional cost when increasing the length of the list, as well as a potentially even tinier cost (zero, if the length already happens to be in a register) when looking up its capacity or refcount.
If `realloc` fails, then we have to fall back on the same "allocate new memory and copy everything" strategy that we do with shared lists.
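The grow-or-copy decision can be sketched like this. It is a simplified stand-in: Roc's real runtime also has to handle the refcount/capacity slot and element counts, which are omitted here.

```rust
use std::alloc::{alloc, dealloc, realloc, Layout};
use std::ptr;

/// Grow a unique buffer from `old_layout` to `new_size` bytes.
/// Tries `realloc` first; if that fails, falls back to the same
/// allocate-and-copy strategy used for shared lists.
unsafe fn grow(buf: *mut u8, old_layout: Layout, new_size: usize) -> *mut u8 {
    let grown = realloc(buf, old_layout, new_size);
    if !grown.is_null() {
        return grown; // extended (or moved) in one call
    }
    // Fallback: fresh allocation, copy the old contents, free the old block.
    let new_layout = Layout::from_size_align(new_size, old_layout.align()).unwrap();
    let fresh = alloc(new_layout);
    ptr::copy_nonoverlapping(buf, fresh, old_layout.size());
    dealloc(buf, old_layout);
    fresh
}

fn main() {
    unsafe {
        let layout = Layout::from_size_align(8, 8).unwrap();
        let buf = alloc(layout);
        buf.write(42u8); // first byte of the list's contents
        let bigger = grow(buf, layout, 16);
        assert_eq!(bigger.read(), 42u8); // contents survive the grow
        dealloc(bigger, Layout::from_size_align(16, 8).unwrap());
    }
}
```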
When you have a way to anticipate that a list will want to grow incrementally to a certain size, you can avoid such extra allocations by using `List.reserve` to guarantee more capacity up front. (`List.reserve` works like Rust's [`Vec::reserve`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve).)
> **Note:** Calling `List.reserve 0 myList` will have no effect on a Unique list, but on a Shared list it will clone `myList` and return a Unique one. If you want to do a bunch of in-place mutations on a list, but it's currently Shared, calling `List.reserve 0` on it to get a Unique clone could actually be an effective performance optimization!
## Capacity and alignment
Some lists may end up beginning with excess capacity due to memory alignment requirements. Since the refcount is `usize`, all lists need a minimum of that alignment. For example, on a 64-bit system, a `List Bool` has an alignment of 8B even though bools can fit in 1B.
This means the list `[True, True, False]` would have a memory layout like this:
```text
|--------------8B--------------|--1B--|--1B--|--1B--|-----5B-----|
either refcount or capacity bool1 bool2 bool3 unused
```
As such, if this list is Unique, it would start out with a length of 3 and a capacity of 8.
Since each bool value is a byte, it's okay for them to be packed side-by-side even though the overall alignment of the list elements is 8. This is fine because each of their individual memory addresses will end up being a multiple of their size in bytes.
Note that unlike in the `List Str` example before, there wouldn't be any unused memory between the refcount (or capacity, depending on whether the list was shared or unique) and the first element in the list. That will always be the case when the size of the refcount is no bigger than the alignment of the list's elements.
## Distinguishing between refcount and capacity in the host
If I'm a platform author, and I receive a `List` from the application, it's important that I be able to tell whether I'm dealing with a refcount or a capacity. (The uniqueness type information will have been erased by this time, because some applications will return a Unique list while others return a Shared list, so I need to be able to tell using runtime information only which is which.) This way, I can know to either increment the refcount, or to feel free to mutate it in-place using the capacity value.
We use a very simple system to distinguish the two: if the high bit in that `usize` value is 0, then it's a capacity. If it's 1, then it's a refcount.
This has a couple of implications:
- `capacity` actually has a maximum of `isize::MAX`, not `usize::MAX` - because if its high bit flips to 1, then now suddenly it's considered a refcount by the host. As it happens, capacity's maximum value is naturally `isize::MAX` anyway, so there's no downside here.
- `refcount` actually begins at `isize::MIN` and increments towards 0, rather than beginning at 0 and incrementing towards a larger number. When a decrement instruction is executed and the refcount is already `isize::MIN`, the memory gets freed instead. Since all refcounts do is count up and down, and the only comparison is against a fixed constant, there's no real performance cost to comparing to `isize::MIN` instead of to 0. So this has no significant downside either.
Using this representation, hosts can trivially distinguish any list they receive as being either refcounted or having a capacity value, without any runtime cost in either the refcounted case or the capacity case.
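In a host, that check is a single sign test on the `usize` word. The function name here is assumed for illustration; only the high-bit rule comes from the design above.

```rust
/// Interpret the slot before a list's elements by its high bit:
/// high bit 1 means refcount, high bit 0 means capacity.
fn is_refcount(word: usize) -> bool {
    (word as isize) < 0 // high bit set ⇒ negative as isize ⇒ refcount
}

fn main() {
    let capacity: usize = 1024; // high bit 0: a capacity value
    let refcount: usize = isize::MIN as usize; // "one reference"
    assert!(!is_refcount(capacity));
    assert!(is_refcount(refcount));
}
```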
### Saturated reference count
What happens if the reference count overflows? As in, what if we try to reference the same list more times than an `isize` can count?

In this situation, the reference count becomes unreliable. Suppose we try to increment it 3 more times after it has already reached -1. Since we can't store any higher number without flipping the high bit from 1 to 0 (meaning it would become a capacity value instead of a refcount), we'd have to leave it at -1. If we later decremented it all the way back down to `isize::MIN`, we'd free the memory, even though 3 things were still referencing it!

This would be a total disaster, so what we do instead is leak the memory. Once the reference count hits -1, we never increment or decrement it again, which in turn means we will never free it. So -1 is a special reference count meaning "this memory has become unreclaimable, and must never be freed."
This has the downside of being potentially wasteful of the program's memory, but it's less detrimental to user experience than a crash, and it doesn't impact correctness at all.
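The saturating behavior can be sketched as follows (helper names are assumed for illustration):

```rust
/// Refcounts live in [isize::MIN, -1]: isize::MIN means one reference,
/// and -1 is the saturation point ("leaked, never free").
fn increment(rc: isize) -> isize {
    if rc == -1 { -1 } else { rc + 1 } // saturated counts stay put
}

/// Returns None when the last reference is dropped (time to free);
/// a saturated refcount is never decremented and never freed.
fn decrement(rc: isize) -> Option<isize> {
    match rc {
        -1 => Some(-1),        // saturated: leak forever
        isize::MIN => None,    // last reference gone: free the memory
        _ => Some(rc - 1),
    }
}

fn main() {
    assert_eq!(increment(isize::MIN), isize::MIN + 1);
    assert_eq!(increment(-1), -1);
    assert_eq!(decrement(isize::MIN), None);
    assert_eq!(decrement(-1), Some(-1));
}
```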
## Summary of Lists
Lists are a `2 * usize` struct which contains a length and a pointer.
That pointer is a memory address (null in the case of an empty list) which points to the first element in a sequential array of memory.
If that pointer is shared in multiple places, then there will be a `usize` reference count stored right before the first element of the list. There may be unused memory after the refcount if `usize` is smaller than the alignment of one of the list's elements.
Refcounts get incremented each time a list gets shared somewhere, and decremented each time that shared value is no longer referenced by anything else (for example, by going out of scope). Once there are no more references, the list's heap memory can be safely freed. If a reference count gets all the way up to -1, then it will never be decremented again and the memory will never be freed.
Whenever a list grows, it will grow in-place if it's Unique and there is enough capacity. (Capacity is stored where a refcount would be in a Shared list.) If there isn't enough capacity - even after trying `realloc` - or if the list is Shared, then instead new heap memory will be allocated, all the necessary elements will get copied into it, and the original list's refcount will be decremented.
## Strings
Strings have several things in common with lists:
- They are a `2 * usize` struct, sometimes with a non-null pointer to some heap memory
- They have a length and a capacity, and they can grow in basically the same way
- They are reference counted in basically the same way
However, they also have two things going on that lists do not:
- The Small String Optimization
- Literals stored in read-only memory
## The Small String Optimization
In practice, a lot of strings are pretty small. For example, the string `"Richard Feldman"` can be stored in 15 UTF-8 bytes. If we stored that string the same way we store a list, then on a 64-bit system we'd need a 16B struct, which would include a pointer to 24B of heap memory (including the refcount/capacity and one unused byte for alignment).
That's a total of 40B to store 15B of data, when we could have fit the whole string into the original 16B we needed for the struct, with one byte left over.
The Small String Optimization is where we store strings directly in the struct, assuming they can fit in there. We reserve one of those bytes to indicate whether this is a Small String or a larger one that actually uses a pointer.
## String Memory Layout
How do we tell small strings apart from nonsmall strings?
We make use of the fact that lengths (for both strings *and* lists) are `usize` values which have a maximum value of `isize::MAX` rather than `usize::MAX`. This is because `List.get` compiles down to an array access operation, and LLVM uses `isize` indices for those because they do signed arithmetic on the pointer in case the caller wants to add a negative number to the address. (We don't want to, as it happens, but that's what the low-level API supports, so we are bound by its limitations.)
Since the string's length is a `usize` value with a maximum of `isize::MAX`, we can be sure that its most significant bit will always be 0, not 1. (If it were a 1, that would be a negative `isize`!) We can use this fact to use that spare bit as a flag indicating whether the string is small: if that bit is a 1, it's a small string; otherwise, it's a nonsmall string.
This makes calculating the length of the string a multi-step process:
1. Get the length field out of the struct.
2. Look at its highest bit. If that bit is 0, return the length as-is.
3. If the bit is 1, then this is a small string, and its length is packed into the highest byte of the `usize` length field we're currently examining. Take that byte and bit shift it by 1 (to drop the `1` flag we used to indicate this is a small string), cast the resulting byte to `usize`, and that's our length. (Actually we bit shift by 4, not 1, because we only need 4 bits to store a length of 0-15, and the leftover 3 bits can potentially be useful in the future.)
Using this strategy with a [conditional move instruction](https://stackoverflow.com/questions/14131096/why-is-a-conditional-move-not-vulnerable-for-branch-prediction-failure), we can always get the length of a `Str` in 2-3 cheap instructions on a single `usize` value, without any chance of a branch misprediction.
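A sketch of that decode follows. The exact bit packing here (flag in the top bit of the top byte, length in that byte's low bits) is an assumption for illustration; only the high-bit test itself is dictated by the design above.

```rust
/// Decode a 64-bit Str length field (assumed bit layout).
fn str_len(len_field: u64) -> u64 {
    if (len_field as i64) < 0 {
        // Small string: the flag bit is set, and the length is packed
        // into the same top byte; strip the flag (and spare) bits.
        let top_byte = (len_field >> 56) as u8;
        (top_byte & 0x0F) as u64
    } else {
        // Nonsmall string: the field is the length itself.
        len_field
    }
}

fn main() {
    // A nonsmall string of length 1000:
    assert_eq!(str_len(1000), 1000);
    // A small string of length 15 ("Richard Feldman"):
    let small = 0b1000_1111u64 << 56;
    assert_eq!(str_len(small), 15);
}
```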
Thus, the layout of a small string on a 64-bit big-endian architecture would be:
```text
|-----------usize length field----------|-----------usize pointer field---------|
|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|
len 'R' 'i' 'c' 'h' 'a' 'r' 'd' ' ' 'F' 'e' 'l' 'd' 'm' 'a' 'n'
```
The `len` value here would be the number 15, plus a 1 (to flag that this is a small string) that would always get bit-shifted away. The capacity of a small Unique string is always equal to `2 * usize`, because that's how much you can fit without promoting to a nonsmall string.
## Endianness
The preceding memory layout example works on a big-endian architecture, but most CPUs are little-endian. That means the high bit where we want to store the flag (the 0 or 1
that would make an `isize` either negative or positive) will actually be the `usize`'s last byte rather than its first byte.
That means we'd have to swap the order of the struct's length and pointer fields. Here's how the string `"Roc string"` would be stored on a little-endian system:
```text
|-----------usize pointer field---------|-----------usize length field----------|
|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|-1B-|
'R' 'o' 'c' ' ' 's' 't' 'r' 'i' 'n' 'g' 0 0 0 0 0 len
```
Here, `len` would have the same format as before (including the extra 1 in the same position, which we'd bit shift away) except that it'd store a length of 10 instead of 15.
Notice that the leftover bytes are stored as zeroes. This is handy because it means we can convert small Roc strings into C strings (which are 0-terminated) for free as long as they have at least one unused byte. Also notice that `usize pointer field` and `usize length field` have been swapped compared to the preceding example!
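Here's a sketch of building that 16-byte little-endian representation. It uses the same assumed bit packing as before (flag in the top bit of the final byte, length in its low bits); the real compiler may pack these differently.

```rust
/// Pack a short string into an assumed 16-byte small-string layout:
/// content bytes first, zero padding, then flag + length in the last byte.
fn small_str(s: &str) -> [u8; 16] {
    assert!(s.len() <= 15, "must leave room for the length byte");
    let mut buf = [0u8; 16];
    buf[..s.len()].copy_from_slice(s.as_bytes());
    buf[15] = 0b1000_0000 | (s.len() as u8); // flag bit + length
    buf
}

fn main() {
    let buf = small_str("Roc string");
    assert_eq!(&buf[..10], b"Roc string");
    // A zero right after the content: a free C-string terminator.
    assert_eq!(buf[10], 0);
    assert_eq!(buf[15], 0b1000_1010); // flag + length 10
}
```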
## Storing string literals in read-only memory

View File

@ -1,4 +1,4 @@
# Roc installation guide for x86_64 macOS systems
# Roc installation guide for x86_64 MacOS systems
## How to install Roc

View File

@ -1323,24 +1323,24 @@ Some differences to note:
Here are various Roc expressions involving operators, and what they desugar to.
| Expression        | Desugars to        |
| ----------------- | ------------------ |
| `a + b`           | `Num.add a b`      |
| `a - b`           | `Num.sub a b`      |
| `a * b`           | `Num.mul a b`      |
| `a / b`           | `Num.div a b`      |
| `a // b`          | `Num.divTrunc a b` |
| `a ^ b`           | `Num.pow a b`      |
| `a % b`           | `Num.rem a b`      |
| `a >> b`          | `Num.shr a b`      |
| `a << b`          | `Num.shl a b`      |
| `-a`              | `Num.neg a`        |
| `-f x y`          | `Num.neg (f x y)`  |
| `a == b`          | `Bool.isEq a b`    |
| `a != b`          | `Bool.isNotEq a b` |
| `a && b`          | `Bool.and a b`     |
| `a \|\| b`        | `Bool.or a b`      |
| `!a`              | `Bool.not a`       |
| `!f x y`          | `Bool.not (f x y)` |
| `a \|> b`         | `b a`              |
| `a b c \|> f x y` | `f (a b c) x y`    |

View File

@ -4,7 +4,7 @@ To view the website after you've made a change, execute:
```bash
./www/build.sh
cd www/build
simple-http-server --nocache # If you're using the nix flake simple-http-server will already be installed. Withouth nix you can install it with `cargo install simple-http-server`.
simple-http-server --nocache # If you're using the nix flake simple-http-server will already be installed. Without nix you can install it with `cargo install simple-http-server`.
```
Open http://0.0.0.0:8000 in your browser.

View File

@ -1054,7 +1054,7 @@ You can tell some interesting things about functions based on the type parameter
Similarly, the only way to have a function whose type is `a -> a` is if the function's implementation returns its argument without modifying it in any way. This is known as [the identity function](https://en.wikipedia.org/wiki/Identity_function).
### [Tag Union Types](#tag-union-tyes) {#tag-union-tyes}
### [Tag Union Types](#tag-union-types) {#tag-union-types}
We can also annotate types that include tags:
@ -1239,7 +1239,7 @@ Ideally, Roc programs would never crash. However, there are some situations wher
1. When doing normal integer arithmetic (e.g. `x + y`) that [overflows](https://en.wikipedia.org/wiki/Integer_overflow).
2. When the system runs out of memory.
3. When a variable-length collection (like a `List` or `Str`) gets too long to be representible in the operating system's address space. (A 64-bit operating system's address space can represent several [exabytes](https://en.wikipedia.org/wiki/Byte#Multiple-byte_units) of data, so this case should not come up often.)
3. When a variable-length collection (like a `List` or `Str`) gets too long to be representable in the operating system's address space. (A 64-bit operating system's address space can represent several [exabytes](https://en.wikipedia.org/wiki/Byte#Multiple-byte_units) of data, so this case should not come up often.)
Crashes in Roc are not like [try/catch exceptions](https://en.wikipedia.org/wiki/Exception_handling) found in some other programming languages. There is no way to "catch" a crash. It immediately ends the program, and what happens next is defined by the [platform](https://github.com/roc-lang/roc/wiki/Roc-concepts-explained#platform). For example, a command-line interface platform might exit with a nonzero [exit code](https://en.wikipedia.org/wiki/Exit_status), whereas a web server platform might have the current request respond with a [HTTP 500 error](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#500).
@ -1332,7 +1332,7 @@ So you'll want to use `roc dev` or `roc test` to get the output for `expect`.
## [Modules](#modules) {#modules}
Each `.roc` file is a separate module and contains Roc code for different purposes. Here are all of the different types of modules that Roc suppports;
Each `.roc` file is a separate module and contains Roc code for different purposes. Here are all of the different types of modules that Roc supports;
- **Builtins** provide functions that are automatically imported into every module.
- **Applications** are combined with a platform and compiled into an executable.