* Adds a RISC0 backend which generates Rust code that can be compiled
with the official RISC0 toolchain.
* The RISC0 backend is a wrapper around the Rust backend.
* Adds the `risc0-rust` to the `compile` CLI command, which creates a
directory containing host and guest Rust sources for the RISC0 zkVM. The
generated code can be compiled/run using `cargo` from inside the created
directory (requires having RISC0 installed:
https://dev.risczero.com/api/zkvm/install).
This pr introduces parallelism in the pipeline to gain performance. I've
included benchmarks at the end.
- Closes#2750.
# Flags:
There are two new global flags:
1. `-N / --threads`. It is used to set the number of capabilities.
According to [GHC
documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities):
_Set the number of Haskell threads that can run truly simultaneously (on
separate physical processors) at any given time_. When compiling in
parallel, we create this many worker threads. The default value is `-N
auto`, which sets `-N` to half the number of logical cores, capped at 8.
2. `--dev-show-thread-ids`. When given, the thread id is printed in the
compilation progress log. E.g.
![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034)
# Parallel compilation
1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the
concurrency related code. I think it is good to keep this code separated
from the actual compiler code.
2. I've added a progress log (only for the parallel driver) that outputs
a log of the compilation progress, similar to what stack/cabal do.
# Code changes:
1. I've removed the `setup` stage where we were registering
dependencies. Instead, the dependencies are registered when the
`pathResolver` is run for the first time. This way it is safer.
1. Now the `ImportTree` is needed to run the pipeline. Cycles are
detected during the construction of this tree, so I've removed `Reader
ImportParents` from the pipeline.
3. For the package pathresolver, we do not support parallelism yet (we
could add support for it in the future, but the gains will be small).
4. When `-N1`, the pipeline remains unchanged, so performance should be
the same as in the main branch (except there is a small performance
degradation due to adding the `-threaded` flag).
5. I've introduced `PipelineOptions`, which are options that are used to
pass options to the effects in the pipeline.
6. `PathResolver` constraint has been removed from the `upTo*` functions
in the pipeline due to being redundant.
7. I've added a lot of `NFData` instances. They are needed to force the
full evaluation of `Stored.ModuleInfo` in each of the threads.
2. The `Cache` effect uses
[`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html)
as opposed to
[`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html).
Perhaps we should provide different versions.
3. I've added a `Cache` handler that accepts a setup function. The setup
is triggered when a miss is detected. It is used to lazily compile the
modules in parallel.
# Tests
1. I've adapted the smoke test suite to ignore the progress log in the
stderr.
5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to
laziness, a crash happening in this file was not being caught. The
problem is that in this file we have a lambda function with different
number of patterns in their clauses, which we currently do not support
(https://github.com/anoma/juvix/issues/1706).
6. I've had to comment out the definition
```
x : Box ((A : Type) → A → A) := box λ {A a := a};
```
From the test as it was causing a crash
(https://github.com/anoma/juvix/issues/2247).
# Future Work
1. It should be investigated how much performance we lose by fully
evaluating the `Stored.ModuleInfo`, since some information in it will be
discarded. It may be possible to be more fine-grained when forcing
evaluation.
8. The scanning of imports to build the import tree is sequential. Now,
we build the import tree from the entry point module and only the
modules that are imported from it are in the tree. However, we have
discussed that at some point we should make a distinction between
`juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix`
as a build tool it makes sense to typecheck/compile (to stored core) all
modules in the project. When/if we do this, scanning imports in all
modules in parallel becomes trivial.
9. The implementation of the `ParallelTemplate` uses low level
primitives such as
[forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO).
At some point it should be refactored to use safer functions from the
[`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html)
module.
10. The number of cores and worker threads that we spawn is determined
by the command line. Ideally, we could use to import tree to compute an
upper bound to the ideal number of cores to use.
11. We could add an animation that displays which modules are being
compiled in parallel and which have finished being compiled.
# Benchmarks
On some benchmarks, I include the GHC runtime option
[`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9),
which sometimes makes a good impact on performance. Thanks to
@paulcadman for pointing this out. I've figured a good combination of
`-N` and `-A` through trial and error (but this oviously depends on the
cpu and juvix projects).
## Typecheck the standard library
### Clean run (88% faster than main):
```
hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix'
Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms]
Range (min … max): 432.6 ms … 455.9 ms 10 runs
Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms]
Range (min … max): 584.7 ms … 670.6 ms 10 runs
Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms]
Range (min … max): 816.0 ms … 853.6 ms 10 runs
Summary
juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran
1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix
1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix
```
### Cached run (43% faster than main):
```
hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix'
Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms]
Range (min … max): 231.5 ms … 251.3 ms 11 runs
Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms]
Range (min … max): 216.1 ms … 253.1 ms 12 runs
Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms]
Range (min … max): 316.9 ms … 351.8 ms 10 runs
Summary
juvix -N 4 typecheck Stdlib/Prelude.juvix ran
1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix
```
## Typecheck the test suite of the containers library
At the moment this is the biggest juvix project that we have.
### Clean run (105% faster than main)
```
hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix'
Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864
Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s]
Range (min … max): 0.991 s … 1.023 s 10 runs
Benchmark 2: juvix -N 4 typecheck Main.juvix
Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s]
Range (min … max): 1.535 s … 1.660 s 10 runs
Benchmark 3: juvix-main typecheck Main.juvix
Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s]
Range (min … max): 2.048 s … 2.077 s 10 runs
Summary
juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran
1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix
2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix
```
### Cached run (54% faster than main)
```
hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix'
Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432
Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms]
Range (min … max): 535.2 ms … 570.6 ms 10 runs
Benchmark 2: juvix -N 4 typecheck Main.juvix
Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms]
Range (min … max): 601.6 ms … 655.3 ms 10 runs
Benchmark 3: juvix-main typecheck Main.juvix
Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms]
Range (min … max): 731.1 ms … 890.0 ms 10 runs
Summary
juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran
1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix
1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix
```
* Implements code generation through Rust.
* CLI: adds two `dev` compilation targets:
1. `rust` for generating Rust code
2. `native-rust` for generating a native executable via Rust
* Adds end-to-end tests for compilation from Juvix to native executable
via Rust.
* A target for RISC0 needs to be added in a separate PR building on this
one.
The following benchmark compares juvix 0.6.0 with polysemy and a new
version (implemented in this pr) which replaces polysemy by effectful.
# Typecheck standard library without caching
```
hyperfine --warmup 2 --prepare 'juvix-polysemy clean' 'juvix-polysemy typecheck Stdlib/Prelude.juvix' 'juvix-effectful typecheck Stdlib/Prelude.juvix'
Benchmark 1: juvix-polysemy typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 3.924 s ± 0.143 s [User: 3.787 s, System: 0.084 s]
Range (min … max): 3.649 s … 4.142 s 10 runs
Benchmark 2: juvix-effectful typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 2.558 s ± 0.074 s [User: 2.430 s, System: 0.084 s]
Range (min … max): 2.403 s … 2.646 s 10 runs
Summary
juvix-effectful typecheck Stdlib/Prelude.juvix ran
1.53 ± 0.07 times faster than juvix-polysemy typecheck Stdlib/Prelude.juvix
```
# Typecheck standard library with caching
```
hyperfine --warmup 1 'juvix-effectful typecheck Stdlib/Prelude.juvix' 'juvix-polysemy typecheck Stdlib/Prelude.juvix' --min-runs 20
Benchmark 1: juvix-effectful typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 1.194 s ± 0.068 s [User: 0.979 s, System: 0.211 s]
Range (min … max): 1.113 s … 1.307 s 20 runs
Benchmark 2: juvix-polysemy typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 1.237 s ± 0.083 s [User: 0.997 s, System: 0.231 s]
Range (min … max): 1.061 s … 1.476 s 20 runs
Summary
juvix-effectful typecheck Stdlib/Prelude.juvix ran
1.04 ± 0.09 times faster than juvix-polysemy typecheck Stdlib/Prelude.juvix
```
This PR adds an parser, pretty printer, evaluator, repl and quasi-quoter
for Nock terms.
## Parser / Pretty Printer
The parser and pretty printer handle both standard Nock terms and
'pretty' Nock terms (where op codes and paths can be named). Standard
and pretty Nock forms can be mixed in the same term.
For example instead of `[0 2]` you can write `[@ L]`.
See
a6028b0d92/src/Juvix/Compiler/Nockma/Language.hs (L79)
for the correspondence between pretty Nock and Nock operators.
In pretty Nock, paths are represented as strings of `L` (for head) and
`R` (for tail) instead of the number encoding in standard nock. The
character `S` is used to refer to the whole subject, i.e it is sugar for
`1` in standard Nock.
See
a6028b0d92/src/Juvix/Compiler/Nockma/Language.hs (L177)
for the correspondence between pretty Nock path and standard Nock
position.
## Quasi-quoter
A quasi-quoter is added so Nock terms can be included in the source, e.g
`[nock| [@ LL] |]`.
## REPL
Launch the repl with `juvix dev nockma repl`.
A Nock `[subject formula]` cell is input as `subject / formula` , e.g:
```
nockma> [1 0] / [@ L]
1
```
The subject can be set using `:set-stack`.
```
nockma> :set-stack [1 0]
nockma> [@ L]
1
```
The subject can be viewed using `:get-stack`.
```
nockma> :set-stack [1 0]
nockma> :get-stack
[1 0]
```
You can assign a Nock term to a variable and use it in another
expression:
```
nockma> r := [@ L]
nockma> [1 0] / r
1
```
A list of assignments can be read from a file:
```
$ cat stack.nock
r := [@ L]
$ juvix dev nockma repl
nockma> :load stack.nock
nockma> [1 0] / r
1
```
* Closes https://github.com/anoma/juvix/issues/2557
---------
Co-authored-by: Jan Mas Rovira <janmasrovira@gmail.com>
Co-authored-by: Lukasz Czajka <lukasz@heliax.dev>
* Closes#2392
Changes checklist
-----------------
* [X] Abstract out data types for stored module representation
(`ModuleInfo` in `Juvix.Compiler.Store.Language`)
* [X] Adapt the parser to operate per-module
* [X] Adapt the scoper to operate per-module
* [X] Adapt the arity checker to operate per-module
* [X] Adapt the type checker to operate per-module
* [x] Adapt Core transformations to operate per-module
* [X] Adapt the pipeline functions in `Juvix.Compiler.Pipeline`
* [X] Add `Juvix.Compiler.Pipeline.Driver` which drives the per-module
compilation process
* [x] Implement module saving / loading in `Pipeline.Driver`
* [x] Detect cyclic module dependencies in `Pipeline.Driver`
* [x] Cache visited modules in memory in `Pipeline.Driver` to avoid
excessive disk operations and repeated hash re-computations
* [x] Recompile a module if one of its dependencies needs recompilation
and contains functions that are always inlined.
* [x] Fix identifier dependencies for mutual block creation in
`Internal.fromConcrete`
- Fixed by making textually later definitions depend on earlier ones.
- Now instances are used for resolution only after the textual point of
their definition.
- Similarly, type synonyms will be unfolded only after the textual point
of their definition.
* [x] Fix CLI
* [x] Fix REPL
* [x] Fix highlighting
* [x] Fix HTML generation
* [x] Adapt test suite
This patch dramatically increases the efficiency of `juvix dev root`,
which was unnecessarily parsing all dependencies included in the
`Package.juvix` file. Other commands that do not require the `Package`
will also be faster.
It also refactors some functions so that the `TaggedLock` effect is run
globally.
I've added `singletons-base` as a dependency so we can have `++` on the
type level. We've tried to define a type family ourselves but inference
was not working properly.
This implements a basic version of the algorithm from: Luc Maranget,
[Compiling pattern matching to good decision
trees](http://moscova.inria.fr/~maranget/papers/ml05e-maranget.pdf). No
heuristics are used - the first column is always chosen.
* Closes#1798
* Closes#1225
* Closes#1926
* Adds a global `--no-coverage` option which turns off coverage checking
in favour of generating runtime failures
* Changes the representation of Match patterns in JuvixCore to achieve a
more streamlined implementation
* Adds options to the Core pipeline
* import all non-compile axioms with alphanum names in entrypoint
This commit adds `__attribute__((input_name(<name>)))` to the type
signature of all axioms that do not have a compile block. This indicates
to the compiler that this function should be added to the input table of
the WASM binary.
Adds a test that an imported function can be called from Juvix.
* test: Run node command in same directory as WASM output
* Don't generate importName for non-alphnum axioms
* Add a tutorial on Juvix module to JS interop via Wasm
* Export all functions with alphanum names in entrypoint
Set the export_name attribute on every function signature that has a
fully alpha numeric name.
* Adds a non-WASI target to compile command
WASM libraries we want to run in the browser and in Anoma VM do not have
access to the WASI runtime. For this usecase we need a target runtime
that does not use the WASI runtime.
A `wasi-` prefix is added to existing compile targets and introduce a new
standalone target. This target does not have IO functions like
`putStrLn` and `exit` calls `__builtin_trap` which corresponds to the
`unreachable` WASM instruction.
In the non-exhaustive case error a debug message is emitted to tell the
user which function had the error. This is now abstracted to a `debug`
function in the runtime which calls `putStrLn` in the case of WASI and
is no-op in the case of non-WASI.
Tests are added which check that exported names can be called with the
correct name and result.
* Moves walloc to a common directory
* Renaming MiniJuvix to Juvix
* Make Ormolu happy
* Make Hlint happy
* Remove redundant imports
* Fix shell tests and add target ci to our Makefile
* Make pre-commit happy
* Embed stdlib in minijuvix library
We add a new step at the beginning of the pipeline called Setup that
registers the modules in the standard library with the Files effect. The
standard library is then used when the Scoper queries the Files effect
for modules as it resolves import statements.
Use of the standard library can be disabled using the global
`--no-stdlib` command-line option.
* CI: Checkout submodules recursively for stdlib
* Add a new `--no-stdlib` option to shell check
* Poke CI
* CI: Checkout submodules in the test job
* [WIP] EntryPoint now has options. --no-termination is a new global opt.
* Add TerminationChecking to the pipeline
* Add TerminationChecking to the pipeline
* Keep GlobalOptions in App
* Fix reviewer's comments
* delete unnecessary parens
Co-authored-by: Jan Mas Rovira <janmasrovira@gmail.com>
* [cbackend] Adds an AST for C
This should cover enough C to implement the microjuvix backend.
* [cbackend] Add C serializer using language-c library
We may decide to write our own serializer for the C AST but this
demonstrates that the C AST is sufficient at least.
* [cbackend] Declarations will always be typed
* [cbackend] Add CPP support to AST
* [cbackend] Rename some names for clarity
* [cbackend] Add translation of InductiveDef to C
* [cbackend] Add CLI for C backend
* [cbackend] Add stdbool.h to file header
* [cbackend] Allow Cpp and Verbatim code inline
* [cbackend] Add a newline after printing C
* [cbackend] Support foreign blocks
* [cbackend] Add support for axioms
* [cbackend] Remove code examples
* [cbackend] wip FunctionDef including Expressions
* [parser] Support esacping '}' inside a foreign block
* [cbackend] Add support for patterns in functions
* [cbackend] Add foreign C support to HelloWorld.mjuvix
* hlint fixes
* More hlint fixes not picked up by pre-commit
* [cbackend] Remove CompileStatement from MonoJuvix
* [cbackend] Add support for compile blocks
* [cbackend] Move compileInfo extraction to MonoJuvixResult
* [minihaskell] Fix compile block support
* [chore] Remove ununsed isBackendSupported function
* [chore] Remove unused imports
* [cbackend] Use a Reader for pattern bindings
* [cbackend] Fix compiler warnings
* [cbackend] Add support for nested patterns
* [cbackend] Use functions to instantiate argument names
* [cbackend] Add non-exhaustive pattern error message
* [cbackend] Adds test for c to WASM compile and execution
* [cbackend] Add links to test dependencies in quickstart
* [cbackend] Add test with inductive types and patterns
* [cbackend] Fix indentation
* [cbackend] Remove ExpressionTyped case
https://github.com/heliaxdev/minijuvix/issues/79
* [lexer] Fix lexing of \ inside a foreign block
* [cbackend] PR review fixes
* [chore] Remove unused import
* [cbackend] Rename CJuvix to MiniC
* [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC
* [cbackend] Add test for polymorphic function
* [cbackend] Add module for string literals