This PR updates the juvix-stdlib submodule ref to point to the current
main ref of the juvix-stdlib repository.
Previously it was pointing to the ref of a branch.
Currently formatting a project is equivalent to running `juvix format`
on each individual file. Hence, the performance is quadratic wrt the
number of modules in the project. This pr fixes that and we now we only
process each module once.
# Benchmark (1236% faster 🚀)
Checking the standard library
```
hyperfine --warmup 1 'juvix format --check' 'juvix-main format --check'
Benchmark 1: juvix format --check
Time (mean ± σ): 450.6 ms ± 33.7 ms [User: 707.2 ms, System: 178.7 ms]
Range (min … max): 396.0 ms … 497.0 ms 10 runs
Benchmark 2: juvix-main format --check
Time (mean ± σ): 6.019 s ± 0.267 s [User: 9.333 s, System: 1.512 s]
Range (min … max): 5.598 s … 6.524 s 10 runs
Summary
juvix format --check ran
13.36 ± 1.16 times faster than juvix-main format --check
```
# Other changes:
1. The `EntryPoint` field `entryPointModulePath` is now optional.
2. I've introduced a new type `TopModulePathKey` which is analogous to
`TopModulePath` but wihout location information. It is used in hashmap
keys where the location in the key is never used. This is useful as we
can now get a `TopModulePathKey` from a `Path Rel File`.
3. I've refactored the `_formatInput` field in `FormatOptions` so that
it doesn't need to be a special case anymore.
4. I've introduced a new effect `Forcing` that allows to individually
force fields of a record type with a convenient syntax.
5. I've refactored some of the constraints in scoping so that they only
require `Reader Package` instead of `Reader EntryPoint`.
6. I've introduced a new type family so that local modules are no longer
required to have `ModuleId` from their type. Before, they were assigned
one, but it was never used.
# Future work:
1. For project-wise formatting, the compilation is done in parallel, but
the formatting is still done sequentially. That should be improved.
* Closes#2394
* Removes the use of Uniplate in `letFunctionDefs` altogether, in favour
of handwritten traversal accumulating let definitions (implemented via
the new `HasLetDefs` typeclass).
Benchmark results
------------------
Using Uniplate:
```
heliax@imac bench % hyperfine --prepare 'juvix clean --global' -w 1 'juvix typecheck bench.juvix -N 1'
Benchmark 1: juvix typecheck bench.juvix -N 1
Time (mean ± σ): 1.399 s ± 0.023 s [User: 1.346 s, System: 0.041 s]
Range (min … max): 1.374 s … 1.452 s 10 runs
```
Using `HasLetDefs`:
```
heliax@imac bench % hyperfine --prepare 'juvix clean --global' -w 1 'juvix typecheck bench.juvix -N 1'
Benchmark 1: juvix typecheck bench.juvix -N 1
Time (mean ± σ): 1.098 s ± 0.015 s [User: 1.047 s, System: 0.040 s]
Range (min … max): 1.074 s … 1.120 s 10 runs
```
So it's roughly 1.1s vs. 1.4s, faster by 0.2s. About 14% improvement.
The benchmark file just imports the standard library:
```
module bench;
import Stdlib.Prelude open;
main : Nat := 0;
```
Both `juvix` binaries were compiled with optimizations, using `just
install`.
* Closes#2703
* Adds [peephole
optimization](https://en.wikipedia.org/wiki/Peephole_optimization) of
Cairo assembly.
* Adds a transformation framework for the CASM IR.
* Adds `--transforms`, `--run` and `--no-print` options to the `dev casm
read` command.
* Closes#2829
* Adds a transformation which converts `br` to `if` when the variable
branched on was assigned in the previous instruction. The transformation
itself doesn't check liveness and doesn't remove the assignment. Dead
code elimination should be run afterwards to remove the assignment.
* For Cairo, it only makes sense to convert `br` to `if` for equality
comparisons against zero. The assignment before `br` will always become
dead after converting `br` to `if`, because we convert to SSA before.
This PR adds support for the `anoma{encode, decode, sign, verify,
signDetached, verifyDetached}` functions to the Core evaluator.
## Encoding / Decoding
The serialization of values to `ByteString` for `anomaEncode` reuses the
Stored Core serialization. The Stored Core `Node` is extended to include
closures.
Anoma expects encoding of bytes as little-endian integers. In general
`ByteString -> Integer` is ambiguous because two `ByteString`s that
differ only by zero padding will map to the same integer. So we must
encode the length of the ByteString in the encoded integer alongside the
ByteString data so when it is decoded we can pad appropriately. We use
the same length encoding scheme that is used by Nockma jam.
## Verify
The Core evaluator implementation of `anomaVerify` crashes if the
verification fails. This matches the behaviour of Anoma node.
### `jvc` Support
You can now use `anoma-*` functions within `.jvc` core files.
* Closes https://github.com/anoma/juvix/issues/2808
- This PR adds a temporary compiler error for when the bug #2247
happens. We do not have plans to fix this bug until we move the
typechecker to Core, so it makes sense to add a better error message.
* Closes#2827
* Adds an optimization phase to the JuvixReg -> Casm pipeline, which
consists of repeated copy & constant propagation and dead code
elimination.
- Closes#2429
This pr introduces two enchancements to import statements:
1. They can have `using/hiding` list of symbols, with a behaviour
analogous to the open statement.
2. They can be public. When an import is marked as public, a local
module (or a series of nested local modules) is generated like this:
```
import A public;
-- equivalent to
import A;
module A;
open A public;
end;
```
It is easier to understand when there is an alias.
```
import A as X.Y public;
-- equivalent to
import A;
module X;
module Y;
open A public;
end;
end;
```
Public imports are allowed to be combined with `using/hiding` modifier
and open statements with the expected behaviour.
* Closes#2745
* Adds inlining of immediate values, i.e., values that don't require
computation or memory allocation.
* Non-immediate zero-argument functions / values should not be inlined,
because when not inlined they can be computed only once.
The stdlib composition function `∘` has fixity `composition` which means
it is right associative.
We will rename `∘` to `<<` in the stdlib and add a new function:
```
>> {a b c} : (a -> b) -> (b -> c) -> a -> c
```
for consistency with `<<` this should be left associative.
This is not a breaking change to package-base so we don't need to
increment the version number.
* Closes#2845
* Copy propagation is not correct without subsequent adjusting of live
variables. See the comments in #2845.
* Enables JuvixReg transformations in the test suite, which exposes the
bug.
* Adds a test in JuvixAsm crafted specifically to expose this bug.
* Closes#1614
* Implements the copy propagation transformation in JuvixReg and adds
tests for it.
* For this optimization to give any improvement, we need to run dead
code elimination afterwards (#2827).
- Closes#2800
I've added a field `_nameKindPretty :: NameKind` that is used to store
the nameKind which should be used when highlighting and printing.
There is no need for pragmas as we can set a color for each builtin
axiom statically in the haskell code.
I've also removed the fields `_infoHighlightDoc :: DocTable` and
`_infoHighlightNames :: [S.AName]` from the
`Store/Scoped/Data/InfoTable` as they were never being read.
* Types of arguments to `main` can now be field elements, numbers,
booleans and (nested) records and lists.
* Type of `main` result can now be a record of field elements, numbers
and booleans. Lists or nested records are not allowed for the result.
* Adds checks for the type of `main` in the Cairo pipeline.
* Requires updating
[juvix-cairo-vm](https://github.com/anoma/juvix-cairo-vm). The input can
be provided in a Json file via the `--program_input` option of
`juvix-cairo-vm`.
I didn't release 0.6.2 last week, but we can do it now. I've updated the
CHANGELOG with the work we did last week.
Other release changes for 0.6.2 were done in
https://github.com/anoma/juvix/pull/2795
It's failing because of rate limits exceeded when installing RISC0 VM.
We should disable the macOS CI until we figure out how to reliably get
around this problem.
* Adds a RISC0 backend which generates Rust code that can be compiled
with the official RISC0 toolchain.
* The RISC0 backend is a wrapper around the Rust backend.
* Adds the `risc0-rust` to the `compile` CLI command, which creates a
directory containing host and guest Rust sources for the RISC0 zkVM. The
generated code can be compiled/run using `cargo` from inside the created
directory (requires having RISC0 installed:
https://dev.risczero.com/api/zkvm/install).
* Closes#2689
* Adds the command `juvix isabelle program.juvix` which translates a
given file to an Isabelle/HOL theory.
* Currently, only a single module is translated.
* By default translates types and function signatures. Translating
function signatures can be disabled with the `--only-types` option.
Blocked by:
- https://github.com/anoma/juvix/issues/2763
---------
Co-authored-by: Jonathan Cubides <jonathan.cubides@uib.no>
This PR adds support for the Anoma stdlib `sign-detached` API.
```
builtin anoma-sign-detached
axiom anomaSignDetached : {A : Type}
-- message to sign
-> A
-- private key
-> Nat
-- signature
-> Nat;
```
This corresponds to the
[sign_detached](https://hexdocs.pm/enacl/enacl.html#sign_detached-2)
libsodium API.
This is requried to support to new Anoma nullifier format:
d6a61451ae
Previously resource nullifiers were defined using `anomaSign`:
```
nullifier (r : Resource) (secretKey : Nat) : Nat :=
anomaSign (anomaEncode (nullifierHeader, r)) secretKey;
```
They are now defined using `anomaSignDetached`:
```
nullifier (r : Resource) (secretKey : Nat) : Nat :=
let encodedResource : Nat := anomaEncode (nullifierHeader, r) in
anomaEncode (encodedResource , anomaSignDetached encodedResource secretKey);
```
This is so that a logic function can access the nullified resources
directly from the `nullifier` field.
## Evaluator Note
When decoding a public key, private key or signature from an integer
atom to a bytestring it's important to pad the bytestring to the
appropriate number of bytes. For example a private key must be 64 bytes
but the corresponding encoded integer may fit into 63 bytes or fewer
bytes (depending on leading zeros). This PR also fixes this issue by
adding a
[`atomToByteStringLen`](c68c7187b1/src/Juvix/Compiler/Nockma/Encoding/ByteString.hs (L14))
function with also accepts the expected size of the resulting
bytestring.
Anoma decodes integer atoms as bytes assuming a little endian layout.
ea25f88cea/lib/noun.ex (L182)
This commit adds functions `byteStringToIntegerLE` and
`integerToByteStringLE` that makes it clear that little endian encoding
is being used.
This pr introduces parallelism in the pipeline to gain performance. I've
included benchmarks at the end.
- Closes#2750.
# Flags:
There are two new global flags:
1. `-N / --threads`. It is used to set the number of capabilities.
According to [GHC
documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities):
_Set the number of Haskell threads that can run truly simultaneously (on
separate physical processors) at any given time_. When compiling in
parallel, we create this many worker threads. The default value is `-N
auto`, which sets `-N` to half the number of logical cores, capped at 8.
2. `--dev-show-thread-ids`. When given, the thread id is printed in the
compilation progress log. E.g.
![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034)
# Parallel compilation
1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the
concurrency related code. I think it is good to keep this code separated
from the actual compiler code.
2. I've added a progress log (only for the parallel driver) that outputs
a log of the compilation progress, similar to what stack/cabal do.
# Code changes:
1. I've removed the `setup` stage where we were registering
dependencies. Instead, the dependencies are registered when the
`pathResolver` is run for the first time. This way it is safer.
1. Now the `ImportTree` is needed to run the pipeline. Cycles are
detected during the construction of this tree, so I've removed `Reader
ImportParents` from the pipeline.
3. For the package pathresolver, we do not support parallelism yet (we
could add support for it in the future, but the gains will be small).
4. When `-N1`, the pipeline remains unchanged, so performance should be
the same as in the main branch (except there is a small performance
degradation due to adding the `-threaded` flag).
5. I've introduced `PipelineOptions`, which are options that are used to
pass options to the effects in the pipeline.
6. `PathResolver` constraint has been removed from the `upTo*` functions
in the pipeline due to being redundant.
7. I've added a lot of `NFData` instances. They are needed to force the
full evaluation of `Stored.ModuleInfo` in each of the threads.
2. The `Cache` effect uses
[`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html)
as opposed to
[`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html).
Perhaps we should provide different versions.
3. I've added a `Cache` handler that accepts a setup function. The setup
is triggered when a miss is detected. It is used to lazily compile the
modules in parallel.
# Tests
1. I've adapted the smoke test suite to ignore the progress log in the
stderr.
5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to
laziness, a crash happening in this file was not being caught. The
problem is that in this file we have a lambda function with different
number of patterns in their clauses, which we currently do not support
(https://github.com/anoma/juvix/issues/1706).
6. I've had to comment out the definition
```
x : Box ((A : Type) → A → A) := box λ {A a := a};
```
From the test as it was causing a crash
(https://github.com/anoma/juvix/issues/2247).
# Future Work
1. It should be investigated how much performance we lose by fully
evaluating the `Stored.ModuleInfo`, since some information in it will be
discarded. It may be possible to be more fine-grained when forcing
evaluation.
8. The scanning of imports to build the import tree is sequential. Now,
we build the import tree from the entry point module and only the
modules that are imported from it are in the tree. However, we have
discussed that at some point we should make a distinction between
`juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix`
as a build tool it makes sense to typecheck/compile (to stored core) all
modules in the project. When/if we do this, scanning imports in all
modules in parallel becomes trivial.
9. The implementation of the `ParallelTemplate` uses low level
primitives such as
[forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO).
At some point it should be refactored to use safer functions from the
[`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html)
module.
10. The number of cores and worker threads that we spawn is determined
by the command line. Ideally, we could use to import tree to compute an
upper bound to the ideal number of cores to use.
11. We could add an animation that displays which modules are being
compiled in parallel and which have finished being compiled.
# Benchmarks
On some benchmarks, I include the GHC runtime option
[`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9),
which sometimes makes a good impact on performance. Thanks to
@paulcadman for pointing this out. I've figured a good combination of
`-N` and `-A` through trial and error (but this oviously depends on the
cpu and juvix projects).
## Typecheck the standard library
### Clean run (88% faster than main):
```
hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix'
Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms]
Range (min … max): 432.6 ms … 455.9 ms 10 runs
Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms]
Range (min … max): 584.7 ms … 670.6 ms 10 runs
Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms]
Range (min … max): 816.0 ms … 853.6 ms 10 runs
Summary
juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran
1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix
1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix
```
### Cached run (43% faster than main):
```
hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix'
Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms]
Range (min … max): 231.5 ms … 251.3 ms 11 runs
Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms]
Range (min … max): 216.1 ms … 253.1 ms 12 runs
Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix
Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms]
Range (min … max): 316.9 ms … 351.8 ms 10 runs
Summary
juvix -N 4 typecheck Stdlib/Prelude.juvix ran
1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432
1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix
```
## Typecheck the test suite of the containers library
At the moment this is the biggest juvix project that we have.
### Clean run (105% faster than main)
```
hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix'
Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864
Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s]
Range (min … max): 0.991 s … 1.023 s 10 runs
Benchmark 2: juvix -N 4 typecheck Main.juvix
Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s]
Range (min … max): 1.535 s … 1.660 s 10 runs
Benchmark 3: juvix-main typecheck Main.juvix
Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s]
Range (min … max): 2.048 s … 2.077 s 10 runs
Summary
juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran
1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix
2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix
```
### Cached run (54% faster than main)
```
hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix'
Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432
Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms]
Range (min … max): 535.2 ms … 570.6 ms 10 runs
Benchmark 2: juvix -N 4 typecheck Main.juvix
Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms]
Range (min … max): 601.6 ms … 655.3 ms 10 runs
Benchmark 3: juvix-main typecheck Main.juvix
Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms]
Range (min … max): 731.1 ms … 890.0 ms 10 runs
Summary
juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran
1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix
1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix
```
* Implements code generation through Rust.
* CLI: adds two `dev` compilation targets:
1. `rust` for generating Rust code
2. `native-rust` for generating a native executable via Rust
* Adds end-to-end tests for compilation from Juvix to native executable
via Rust.
* A target for RISC0 needs to be added in a separate PR building on this
one.
This PR adds support for the `String` type, String literals and string
concatenation to the Nockma backend. Support for the builtins `show` and
`intToString` is not supported.
### Example
test079.juvix
```
module test079;
import Stdlib.Prelude open;
main (s : String) : String :=
s ++str " " ++str "✨ héllo" ++str " " ++str "world ✨";
```
args.nockma
```
[quote "Juvix!"]
```
```
$ juvix compile anoma test079.juvix
$ juvix dev nockma run test079.pretty.nockma --args args.nockma
"Juvix! ✨ héllo world ✨"
```
### String representation
A String is a sequence of UTF-8 encoded bytes. We interpret these bytes
as a sequence of bits to represent the string as an integer atom in
nockma.
For example:
The string `"a"` is UTF-8 encoded as `97` which is `0b1100001` in bits.
The string `"ab"` is UTF-8 encoded at the pair of bytes: `97 98` which
is `0b1100001 0b1100010`.
When we combine the bytes into a single sequence of bits we must take
care to pad each binary representation with zeros to each byte boundary.
So the binary representation of `"ab"` as an atom is `0b110000101100010`
or `24930` as an integer atom.
### String concatenation
We use the
[cat](ea25f88cea/hoon/anoma.hoon (L215))
function in the Anoma stdlib to concatenate the bytes representing two
strings.
We need to use the block parameter `3` in the Anoma call because we want
to treat the atoms representing the strings as sequences of bytes (= 2^3
bits).
To find the relevant Nock code to call `cat` with block parameter `3` we
use the urbit dojo as follows:
```
=> anoma !=(~(cat block 3))
[8 [9 10 0 7] 9 4 10 [6 7 [0 3] 1 3] 0 2]
```
### Stdlib intercept in Evaluator
The evaluator has support for strings using `AtomHint`s, so strings can
be printed and traced. The stdlib `cat` call is also intercepted because
evaluating the unjetted hoon version is slow.
### String support in pretty nockma
In a pretty nockma file or `nock` quasi-quote you can write double
quoted string literals, e.g "abc". These are automatically translated to
UTF-8 integer atoms as in the previous section.
This PR adds support for the Anoma stdlib `sign` and `verify` APIs.
```
builtin anoma-sign
axiom anomaSign : {A : Type}
-- message to sign
-> A
-- secret key
-> Nat
-- signed message
-> Nat;
builtin anoma-verify
axiom anomaVerify : {A : Type}
-- signed message to verify
-> Nat
-- public key
-> Nat
-- message with signature removed
-> A;
```
These correspond to the
[`sign`](https://hexdocs.pm/enacl/enacl.html#sign-2) and
[`sign_open`](https://hexdocs.pm/enacl/enacl.html#sign_open-2) APIs from
libsodium respectively.
If signature verification fails in `anomaVerify`, the Anoma program
exits. We copy this behaviour in the evaluator by throwing an error in
this case.
## Notes
The Haskell Ed25519 library does not support `sign_open`. Its
verification function returns Bool, i.e it checks that the signature is
valid. The signed message is simply the concatenation of the signature
(64 bytes) and the original message so I added a function to remove the
signature from a signed message.
This PR adds support for`anomaVerifyDetached` stdlib API via a Juvix
builtin.
It has signature:
```
builtin anoma-verify-detached
axiom anomaVerifyDetached : {A : Type}
--- signature
-> Nat
--- message
-> A
--- public key
-> Nat
-> Bool;
```
The [ed25519](https://hackage.haskell.org/package/ed25519) library is
used in the evaluator becuase Anoma uses ed25519 signatures
(https://hexdocs.pm/enacl/enacl.html).
---------
Co-authored-by: Jan Mas Rovira <janmasrovira@gmail.com>
This stdlib contains the cryptographic stdlib APIs, sign, verify,
sign-detatched (sic) and verify-detatched (sic)
It is obtained from
c28de348c1/lib/nock.ex (L469)
As noted in the comment the stdlibPaths are obtained using the Urbit
dojo. Run:
```
=> anoma !=(s)
```
where s is a stdlib symbol
eg:
```
=> anoma !=(add)
[9 20 0 63]
```