1
1
mirror of https://github.com/anoma/juvix.git synced 2025-01-05 22:46:08 +03:00
juvix/test/Base.hs

164 lines
4.8 KiB
Haskell
Raw Normal View History

2022-04-05 20:57:21 +03:00
module Base
( module Test.Tasty,
module Test.Tasty.HUnit,
module Juvix.Prelude,
2022-04-05 20:57:21 +03:00
module Base,
2022-12-20 15:05:40 +03:00
module Juvix.Extra.Paths,
2023-01-05 19:48:26 +03:00
module Juvix.Prelude.Env,
module Juvix.Compiler.Pipeline.Run,
module Juvix.Compiler.Pipeline.EntryPoint.IO,
2022-04-05 20:57:21 +03:00
)
where
2022-02-15 16:12:53 +03:00
import Control.Monad.Extra as Monad
Add C code generation backend (#68) * [cbackend] Adds an AST for C This should cover enough C to implement the microjuvix backend. * [cbackend] Add C serializer using language-c library We may decide to write our own serializer for the C AST but this demonstrates that the C AST is sufficient at least. * [cbackend] Declarations will always be typed * [cbackend] Add CPP support to AST * [cbackend] Rename some names for clarity * [cbackend] Add translation of InductiveDef to C * [cbackend] Add CLI for C backend * [cbackend] Add stdbool.h to file header * [cbackend] Allow Cpp and Verbatim code inline * [cbackend] Add a newline after printing C * [cbackend] Support foreign blocks * [cbackend] Add support for axioms * [cbackend] Remove code examples * [cbackend] wip FunctionDef including Expressions * [parser] Support esacping '}' inside a foreign block * [cbackend] Add support for patterns in functions * [cbackend] Add foreign C support to HelloWorld.mjuvix * hlint fixes * More hlint fixes not picked up by pre-commit * [cbackend] Remove CompileStatement from MonoJuvix * [cbackend] Add support for compile blocks * [cbackend] Move compileInfo extraction to MonoJuvixResult * [minihaskell] Fix compile block support * [chore] Remove ununsed isBackendSupported function * [chore] Remove unused imports * [cbackend] Use a Reader for pattern bindings * [cbackend] Fix compiler warnings * [cbackend] Add support for nested patterns * [cbackend] Use functions to instantiate argument names * [cbackend] Add non-exhaustive pattern error message * [cbackend] Adds test for c to WASM compile and execution * [cbackend] Add links to test dependencies in quickstart * [cbackend] Add test with inductive types and patterns * [cbackend] Fix indentation * [cbackend] Remove ExpressionTyped case https://github.com/heliaxdev/minijuvix/issues/79 * [lexer] Fix lexing of \ inside a foreign block * [cbackend] PR review fixes * [chore] Remove unused import * [cbackend] Rename CJuvix to MiniC * [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC * [cbackend] Add test for polymorphic function * [cbackend] Add module for string literals
2022-05-05 16:12:17 +03:00
import Data.Algorithm.Diff
import Data.Algorithm.DiffOutput
import Juvix.Compiler.Internal.Translation.FromInternal.Analysis.Termination
import Juvix.Compiler.Pipeline.EntryPoint.IO
import Juvix.Compiler.Pipeline.Loader.PathResolver
import Juvix.Compiler.Pipeline.Run
import Juvix.Data.Effect.TaggedLock
import Juvix.Extra.Paths hiding (rootBuildDir)
import Juvix.Prelude hiding (assert)
2023-01-05 19:48:26 +03:00
import Juvix.Prelude.Env
Parallel pipeline (#2779) This pr introduces parallelism in the pipeline to gain performance. I've included benchmarks at the end. - Closes #2750. # Flags: There are two new global flags: 1. `-N / --threads`. It is used to set the number of capabilities. According to [GHC documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities): _Set the number of Haskell threads that can run truly simultaneously (on separate physical processors) at any given time_. When compiling in parallel, we create this many worker threads. The default value is `-N auto`, which sets `-N` to half the number of logical cores, capped at 8. 2. `--dev-show-thread-ids`. When given, the thread id is printed in the compilation progress log. E.g. ![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034) # Parallel compilation 1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the concurrency related code. I think it is good to keep this code separated from the actual compiler code. 2. I've added a progress log (only for the parallel driver) that outputs a log of the compilation progress, similar to what stack/cabal do. # Code changes: 1. I've removed the `setup` stage where we were registering dependencies. Instead, the dependencies are registered when the `pathResolver` is run for the first time. This way it is safer. 1. Now the `ImportTree` is needed to run the pipeline. Cycles are detected during the construction of this tree, so I've removed `Reader ImportParents` from the pipeline. 3. For the package pathresolver, we do not support parallelism yet (we could add support for it in the future, but the gains will be small). 4. When `-N1`, the pipeline remains unchanged, so performance should be the same as in the main branch (except there is a small performance degradation due to adding the `-threaded` flag). 5. I've introduced `PipelineOptions`, which are options that are used to pass options to the effects in the pipeline. 6. `PathResolver` constraint has been removed from the `upTo*` functions in the pipeline due to being redundant. 7. I've added a lot of `NFData` instances. They are needed to force the full evaluation of `Stored.ModuleInfo` in each of the threads. 2. The `Cache` effect uses [`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html) as opposed to [`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html). Perhaps we should provide different versions. 3. I've added a `Cache` handler that accepts a setup function. The setup is triggered when a miss is detected. It is used to lazily compile the modules in parallel. # Tests 1. I've adapted the smoke test suite to ignore the progress log in the stderr. 5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to laziness, a crash happening in this file was not being caught. The problem is that in this file we have a lambda function with different number of patterns in their clauses, which we currently do not support (https://github.com/anoma/juvix/issues/1706). 6. I've had to comment out the definition ``` x : Box ((A : Type) → A → A) := box λ {A a := a}; ``` From the test as it was causing a crash (https://github.com/anoma/juvix/issues/2247). # Future Work 1. It should be investigated how much performance we lose by fully evaluating the `Stored.ModuleInfo`, since some information in it will be discarded. It may be possible to be more fine-grained when forcing evaluation. 8. The scanning of imports to build the import tree is sequential. Now, we build the import tree from the entry point module and only the modules that are imported from it are in the tree. However, we have discussed that at some point we should make a distinction between `juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix` as a build tool it makes sense to typecheck/compile (to stored core) all modules in the project. When/if we do this, scanning imports in all modules in parallel becomes trivial. 9. The implementation of the `ParallelTemplate` uses low level primitives such as [forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO). At some point it should be refactored to use safer functions from the [`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html) module. 10. The number of cores and worker threads that we spawn is determined by the command line. Ideally, we could use to import tree to compute an upper bound to the ideal number of cores to use. 11. We could add an animation that displays which modules are being compiled in parallel and which have finished being compiled. # Benchmarks On some benchmarks, I include the GHC runtime option [`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9), which sometimes makes a good impact on performance. Thanks to @paulcadman for pointing this out. I've figured a good combination of `-N` and `-A` through trial and error (but this oviously depends on the cpu and juvix projects). ## Typecheck the standard library ### Clean run (88% faster than main): ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms] Range (min … max): 432.6 ms … 455.9 ms 10 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms] Range (min … max): 584.7 ms … 670.6 ms 10 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms] Range (min … max): 816.0 ms … 853.6 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran 1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix 1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ### Cached run (43% faster than main): ``` hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms] Range (min … max): 231.5 ms … 251.3 ms 11 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms] Range (min … max): 216.1 ms … 253.1 ms 12 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms] Range (min … max): 316.9 ms … 351.8 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix ran 1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ## Typecheck the test suite of the containers library At the moment this is the biggest juvix project that we have. ### Clean run (105% faster than main) ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864 Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s] Range (min … max): 0.991 s … 1.023 s 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s] Range (min … max): 1.535 s … 1.660 s 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s] Range (min … max): 2.048 s … 2.077 s 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran 1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix 2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix ``` ### Cached run (54% faster than main) ``` hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432 Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms] Range (min … max): 535.2 ms … 570.6 ms 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms] Range (min … max): 601.6 ms … 655.3 ms 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms] Range (min … max): 731.1 ms … 890.0 ms 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran 1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix 1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix ```
2024-05-31 14:41:30 +03:00
import Parallel.ProgressLog
import System.Process qualified as P
2022-02-15 16:12:53 +03:00
import Test.Tasty
Add nockma evaluator (#2564) This PR adds an parser, pretty printer, evaluator, repl and quasi-quoter for Nock terms. ## Parser / Pretty Printer The parser and pretty printer handle both standard Nock terms and 'pretty' Nock terms (where op codes and paths can be named). Standard and pretty Nock forms can be mixed in the same term. For example instead of `[0 2]` you can write `[@ L]`. See https://github.com/anoma/juvix/blob/a6028b0d92e2dff02329ab7f441bf48ccdeb3eb3/src/Juvix/Compiler/Nockma/Language.hs#L79 for the correspondence between pretty Nock and Nock operators. In pretty Nock, paths are represented as strings of `L` (for head) and `R` (for tail) instead of the number encoding in standard nock. The character `S` is used to refer to the whole subject, i.e it is sugar for `1` in standard Nock. See https://github.com/anoma/juvix/blob/a6028b0d92e2dff02329ab7f441bf48ccdeb3eb3/src/Juvix/Compiler/Nockma/Language.hs#L177 for the correspondence between pretty Nock path and standard Nock position. ## Quasi-quoter A quasi-quoter is added so Nock terms can be included in the source, e.g `[nock| [@ LL] |]`. ## REPL Launch the repl with `juvix dev nockma repl`. A Nock `[subject formula]` cell is input as `subject / formula` , e.g: ``` nockma> [1 0] / [@ L] 1 ``` The subject can be set using `:set-stack`. ``` nockma> :set-stack [1 0] nockma> [@ L] 1 ``` The subject can be viewed using `:get-stack`. ``` nockma> :set-stack [1 0] nockma> :get-stack [1 0] ``` You can assign a Nock term to a variable and use it in another expression: ``` nockma> r := [@ L] nockma> [1 0] / r 1 ``` A list of assignments can be read from a file: ``` $ cat stack.nock r := [@ L] $ juvix dev nockma repl nockma> :load stack.nock nockma> [1 0] / r 1 ``` * Closes https://github.com/anoma/juvix/issues/2557 --------- Co-authored-by: Jan Mas Rovira <janmasrovira@gmail.com> Co-authored-by: Lukasz Czajka <lukasz@heliax.dev>
2024-01-11 15:04:38 +03:00
import Test.Tasty.HUnit hiding (assertFailure)
import Test.Tasty.HUnit qualified as HUnit
2022-02-15 16:12:53 +03:00
2022-04-05 20:57:21 +03:00
data AssertionDescr
= Single Assertion
2022-02-18 15:01:42 +03:00
| Steps ((String -> IO ()) -> Assertion)
2022-04-05 20:57:21 +03:00
data TestDescr = TestDescr
2022-04-07 19:10:53 +03:00
{ _testName :: String,
2022-12-20 15:05:40 +03:00
_testRoot :: Path Abs Dir,
2022-04-05 20:57:21 +03:00
-- | relative to root
2022-04-07 19:10:53 +03:00
_testAssertion :: AssertionDescr
2022-02-15 16:12:53 +03:00
}
newtype WASMInfo = WASMInfo
2022-12-20 15:05:40 +03:00
{ _wasmInfoActual :: Path Abs File -> IO Text
}
2022-04-07 19:10:53 +03:00
makeLenses ''TestDescr
2023-01-05 19:48:26 +03:00
data StdlibMode
= StdlibInclude
| StdlibExclude
deriving stock (Show, Eq)
2023-01-05 19:48:26 +03:00
data CompileMode
= WASI StdlibMode
| WASM WASMInfo
2022-02-15 16:12:53 +03:00
mkTest :: TestDescr -> TestTree
2022-04-07 19:10:53 +03:00
mkTest TestDescr {..} = case _testAssertion of
Single assertion -> testCase _testName (withCurrentDir _testRoot assertion)
2022-12-20 15:05:40 +03:00
Steps steps -> testCaseSteps _testName (withCurrentDir _testRoot . steps)
Add C code generation backend (#68) * [cbackend] Adds an AST for C This should cover enough C to implement the microjuvix backend. * [cbackend] Add C serializer using language-c library We may decide to write our own serializer for the C AST but this demonstrates that the C AST is sufficient at least. * [cbackend] Declarations will always be typed * [cbackend] Add CPP support to AST * [cbackend] Rename some names for clarity * [cbackend] Add translation of InductiveDef to C * [cbackend] Add CLI for C backend * [cbackend] Add stdbool.h to file header * [cbackend] Allow Cpp and Verbatim code inline * [cbackend] Add a newline after printing C * [cbackend] Support foreign blocks * [cbackend] Add support for axioms * [cbackend] Remove code examples * [cbackend] wip FunctionDef including Expressions * [parser] Support esacping '}' inside a foreign block * [cbackend] Add support for patterns in functions * [cbackend] Add foreign C support to HelloWorld.mjuvix * hlint fixes * More hlint fixes not picked up by pre-commit * [cbackend] Remove CompileStatement from MonoJuvix * [cbackend] Add support for compile blocks * [cbackend] Move compileInfo extraction to MonoJuvixResult * [minihaskell] Fix compile block support * [chore] Remove ununsed isBackendSupported function * [chore] Remove unused imports * [cbackend] Use a Reader for pattern bindings * [cbackend] Fix compiler warnings * [cbackend] Add support for nested patterns * [cbackend] Use functions to instantiate argument names * [cbackend] Add non-exhaustive pattern error message * [cbackend] Adds test for c to WASM compile and execution * [cbackend] Add links to test dependencies in quickstart * [cbackend] Add test with inductive types and patterns * [cbackend] Fix indentation * [cbackend] Remove ExpressionTyped case https://github.com/heliaxdev/minijuvix/issues/79 * [lexer] Fix lexing of \ inside a foreign block * [cbackend] PR review fixes * [chore] Remove unused import * [cbackend] Rename CJuvix to MiniC * [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC * [cbackend] Add test for polymorphic function * [cbackend] Add module for string literals
2022-05-05 16:12:17 +03:00
assertEqDiffText :: String -> Text -> Text -> Assertion
assertEqDiffText = assertEqDiff unpack
assertEqDiff :: (Eq a) => (a -> String) -> String -> a -> a -> Assertion
assertEqDiff show_ msg a b
Add C code generation backend (#68) * [cbackend] Adds an AST for C This should cover enough C to implement the microjuvix backend. * [cbackend] Add C serializer using language-c library We may decide to write our own serializer for the C AST but this demonstrates that the C AST is sufficient at least. * [cbackend] Declarations will always be typed * [cbackend] Add CPP support to AST * [cbackend] Rename some names for clarity * [cbackend] Add translation of InductiveDef to C * [cbackend] Add CLI for C backend * [cbackend] Add stdbool.h to file header * [cbackend] Allow Cpp and Verbatim code inline * [cbackend] Add a newline after printing C * [cbackend] Support foreign blocks * [cbackend] Add support for axioms * [cbackend] Remove code examples * [cbackend] wip FunctionDef including Expressions * [parser] Support esacping '}' inside a foreign block * [cbackend] Add support for patterns in functions * [cbackend] Add foreign C support to HelloWorld.mjuvix * hlint fixes * More hlint fixes not picked up by pre-commit * [cbackend] Remove CompileStatement from MonoJuvix * [cbackend] Add support for compile blocks * [cbackend] Move compileInfo extraction to MonoJuvixResult * [minihaskell] Fix compile block support * [chore] Remove ununsed isBackendSupported function * [chore] Remove unused imports * [cbackend] Use a Reader for pattern bindings * [cbackend] Fix compiler warnings * [cbackend] Add support for nested patterns * [cbackend] Use functions to instantiate argument names * [cbackend] Add non-exhaustive pattern error message * [cbackend] Adds test for c to WASM compile and execution * [cbackend] Add links to test dependencies in quickstart * [cbackend] Add test with inductive types and patterns * [cbackend] Fix indentation * [cbackend] Remove ExpressionTyped case https://github.com/heliaxdev/minijuvix/issues/79 * [lexer] Fix lexing of \ inside a foreign block * [cbackend] PR review fixes * [chore] Remove unused import * [cbackend] Rename CJuvix to MiniC * [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC * [cbackend] Add test for polymorphic function * [cbackend] Add module for string literals
2022-05-05 16:12:17 +03:00
| a == b = return ()
| otherwise = do
putStrLn (pack $ ppDiff (getGroupedDiff pa pb))
putStrLn "End diff"
Monad.fail msg
Add C code generation backend (#68) * [cbackend] Adds an AST for C This should cover enough C to implement the microjuvix backend. * [cbackend] Add C serializer using language-c library We may decide to write our own serializer for the C AST but this demonstrates that the C AST is sufficient at least. * [cbackend] Declarations will always be typed * [cbackend] Add CPP support to AST * [cbackend] Rename some names for clarity * [cbackend] Add translation of InductiveDef to C * [cbackend] Add CLI for C backend * [cbackend] Add stdbool.h to file header * [cbackend] Allow Cpp and Verbatim code inline * [cbackend] Add a newline after printing C * [cbackend] Support foreign blocks * [cbackend] Add support for axioms * [cbackend] Remove code examples * [cbackend] wip FunctionDef including Expressions * [parser] Support esacping '}' inside a foreign block * [cbackend] Add support for patterns in functions * [cbackend] Add foreign C support to HelloWorld.mjuvix * hlint fixes * More hlint fixes not picked up by pre-commit * [cbackend] Remove CompileStatement from MonoJuvix * [cbackend] Add support for compile blocks * [cbackend] Move compileInfo extraction to MonoJuvixResult * [minihaskell] Fix compile block support * [chore] Remove ununsed isBackendSupported function * [chore] Remove unused imports * [cbackend] Use a Reader for pattern bindings * [cbackend] Fix compiler warnings * [cbackend] Add support for nested patterns * [cbackend] Use functions to instantiate argument names * [cbackend] Add non-exhaustive pattern error message * [cbackend] Adds test for c to WASM compile and execution * [cbackend] Add links to test dependencies in quickstart * [cbackend] Add test with inductive types and patterns * [cbackend] Fix indentation * [cbackend] Remove ExpressionTyped case https://github.com/heliaxdev/minijuvix/issues/79 * [lexer] Fix lexing of \ inside a foreign block * [cbackend] PR review fixes * [chore] Remove unused import * [cbackend] Rename CJuvix to MiniC * [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC * [cbackend] Add test for polymorphic function * [cbackend] Add module for string literals
2022-05-05 16:12:17 +03:00
where
pa = lines $ show_ a
pb = lines $ show_ b
assertEqDiffShow :: (Eq a, Show a) => String -> a -> a -> Assertion
assertEqDiffShow = assertEqDiff show
Add C code generation backend (#68) * [cbackend] Adds an AST for C This should cover enough C to implement the microjuvix backend. * [cbackend] Add C serializer using language-c library We may decide to write our own serializer for the C AST but this demonstrates that the C AST is sufficient at least. * [cbackend] Declarations will always be typed * [cbackend] Add CPP support to AST * [cbackend] Rename some names for clarity * [cbackend] Add translation of InductiveDef to C * [cbackend] Add CLI for C backend * [cbackend] Add stdbool.h to file header * [cbackend] Allow Cpp and Verbatim code inline * [cbackend] Add a newline after printing C * [cbackend] Support foreign blocks * [cbackend] Add support for axioms * [cbackend] Remove code examples * [cbackend] wip FunctionDef including Expressions * [parser] Support esacping '}' inside a foreign block * [cbackend] Add support for patterns in functions * [cbackend] Add foreign C support to HelloWorld.mjuvix * hlint fixes * More hlint fixes not picked up by pre-commit * [cbackend] Remove CompileStatement from MonoJuvix * [cbackend] Add support for compile blocks * [cbackend] Move compileInfo extraction to MonoJuvixResult * [minihaskell] Fix compile block support * [chore] Remove ununsed isBackendSupported function * [chore] Remove unused imports * [cbackend] Use a Reader for pattern bindings * [cbackend] Fix compiler warnings * [cbackend] Add support for nested patterns * [cbackend] Use functions to instantiate argument names * [cbackend] Add non-exhaustive pattern error message * [cbackend] Adds test for c to WASM compile and execution * [cbackend] Add links to test dependencies in quickstart * [cbackend] Add test with inductive types and patterns * [cbackend] Fix indentation * [cbackend] Remove ExpressionTyped case https://github.com/heliaxdev/minijuvix/issues/79 * [lexer] Fix lexing of \ inside a foreign block * [cbackend] PR review fixes * [chore] Remove unused import * [cbackend] Rename CJuvix to MiniC * [cbackend] Rename MonoJuvixToC to MonoJuvixToMiniC * [cbackend] Add test for polymorphic function * [cbackend] Add module for string literals
2022-05-05 16:12:17 +03:00
2022-12-20 15:05:40 +03:00
assertCmdExists :: Path Rel File -> Assertion
assertCmdExists cmd =
2022-12-20 15:05:40 +03:00
assertBool ("Command: " <> toFilePath cmd <> " is not present on $PATH")
. isJust
=<< findExecutable cmd
testTaggedLockedToIO :: (MonadIO m) => Sem PipelineAppEffects a -> m a
testTaggedLockedToIO =
Replace `polysemy` by `effectful` (#2663) The following benchmark compares juvix 0.6.0 with polysemy and a new version (implemented in this pr) which replaces polysemy by effectful. # Typecheck standard library without caching ``` hyperfine --warmup 2 --prepare 'juvix-polysemy clean' 'juvix-polysemy typecheck Stdlib/Prelude.juvix' 'juvix-effectful typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix-polysemy typecheck Stdlib/Prelude.juvix Time (mean ± σ): 3.924 s ± 0.143 s [User: 3.787 s, System: 0.084 s] Range (min … max): 3.649 s … 4.142 s 10 runs Benchmark 2: juvix-effectful typecheck Stdlib/Prelude.juvix Time (mean ± σ): 2.558 s ± 0.074 s [User: 2.430 s, System: 0.084 s] Range (min … max): 2.403 s … 2.646 s 10 runs Summary juvix-effectful typecheck Stdlib/Prelude.juvix ran 1.53 ± 0.07 times faster than juvix-polysemy typecheck Stdlib/Prelude.juvix ``` # Typecheck standard library with caching ``` hyperfine --warmup 1 'juvix-effectful typecheck Stdlib/Prelude.juvix' 'juvix-polysemy typecheck Stdlib/Prelude.juvix' --min-runs 20 Benchmark 1: juvix-effectful typecheck Stdlib/Prelude.juvix Time (mean ± σ): 1.194 s ± 0.068 s [User: 0.979 s, System: 0.211 s] Range (min … max): 1.113 s … 1.307 s 20 runs Benchmark 2: juvix-polysemy typecheck Stdlib/Prelude.juvix Time (mean ± σ): 1.237 s ± 0.083 s [User: 0.997 s, System: 0.231 s] Range (min … max): 1.061 s … 1.476 s 20 runs Summary juvix-effectful typecheck Stdlib/Prelude.juvix ran 1.04 ± 0.09 times faster than juvix-polysemy typecheck Stdlib/Prelude.juvix ```
2024-03-21 15:09:34 +03:00
runM
Parallel pipeline (#2779) This pr introduces parallelism in the pipeline to gain performance. I've included benchmarks at the end. - Closes #2750. # Flags: There are two new global flags: 1. `-N / --threads`. It is used to set the number of capabilities. According to [GHC documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities): _Set the number of Haskell threads that can run truly simultaneously (on separate physical processors) at any given time_. When compiling in parallel, we create this many worker threads. The default value is `-N auto`, which sets `-N` to half the number of logical cores, capped at 8. 2. `--dev-show-thread-ids`. When given, the thread id is printed in the compilation progress log. E.g. ![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034) # Parallel compilation 1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the concurrency related code. I think it is good to keep this code separated from the actual compiler code. 2. I've added a progress log (only for the parallel driver) that outputs a log of the compilation progress, similar to what stack/cabal do. # Code changes: 1. I've removed the `setup` stage where we were registering dependencies. Instead, the dependencies are registered when the `pathResolver` is run for the first time. This way it is safer. 1. Now the `ImportTree` is needed to run the pipeline. Cycles are detected during the construction of this tree, so I've removed `Reader ImportParents` from the pipeline. 3. For the package pathresolver, we do not support parallelism yet (we could add support for it in the future, but the gains will be small). 4. When `-N1`, the pipeline remains unchanged, so performance should be the same as in the main branch (except there is a small performance degradation due to adding the `-threaded` flag). 5. I've introduced `PipelineOptions`, which are options that are used to pass options to the effects in the pipeline. 6. `PathResolver` constraint has been removed from the `upTo*` functions in the pipeline due to being redundant. 7. I've added a lot of `NFData` instances. They are needed to force the full evaluation of `Stored.ModuleInfo` in each of the threads. 2. The `Cache` effect uses [`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html) as opposed to [`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html). Perhaps we should provide different versions. 3. I've added a `Cache` handler that accepts a setup function. The setup is triggered when a miss is detected. It is used to lazily compile the modules in parallel. # Tests 1. I've adapted the smoke test suite to ignore the progress log in the stderr. 5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to laziness, a crash happening in this file was not being caught. The problem is that in this file we have a lambda function with different number of patterns in their clauses, which we currently do not support (https://github.com/anoma/juvix/issues/1706). 6. I've had to comment out the definition ``` x : Box ((A : Type) → A → A) := box λ {A a := a}; ``` From the test as it was causing a crash (https://github.com/anoma/juvix/issues/2247). # Future Work 1. It should be investigated how much performance we lose by fully evaluating the `Stored.ModuleInfo`, since some information in it will be discarded. It may be possible to be more fine-grained when forcing evaluation. 8. The scanning of imports to build the import tree is sequential. Now, we build the import tree from the entry point module and only the modules that are imported from it are in the tree. However, we have discussed that at some point we should make a distinction between `juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix` as a build tool it makes sense to typecheck/compile (to stored core) all modules in the project. When/if we do this, scanning imports in all modules in parallel becomes trivial. 9. The implementation of the `ParallelTemplate` uses low level primitives such as [forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO). At some point it should be refactored to use safer functions from the [`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html) module. 10. The number of cores and worker threads that we spawn is determined by the command line. Ideally, we could use to import tree to compute an upper bound to the ideal number of cores to use. 11. We could add an animation that displays which modules are being compiled in parallel and which have finished being compiled. # Benchmarks On some benchmarks, I include the GHC runtime option [`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9), which sometimes makes a good impact on performance. Thanks to @paulcadman for pointing this out. I've figured a good combination of `-N` and `-A` through trial and error (but this oviously depends on the cpu and juvix projects). ## Typecheck the standard library ### Clean run (88% faster than main): ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms] Range (min … max): 432.6 ms … 455.9 ms 10 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms] Range (min … max): 584.7 ms … 670.6 ms 10 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms] Range (min … max): 816.0 ms … 853.6 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran 1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix 1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ### Cached run (43% faster than main): ``` hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms] Range (min … max): 231.5 ms … 251.3 ms 11 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms] Range (min … max): 216.1 ms … 253.1 ms 12 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms] Range (min … max): 316.9 ms … 351.8 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix ran 1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ## Typecheck the test suite of the containers library At the moment this is the biggest juvix project that we have. ### Clean run (105% faster than main) ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864 Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s] Range (min … max): 0.991 s … 1.023 s 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s] Range (min … max): 1.535 s … 1.660 s 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s] Range (min … max): 2.048 s … 2.077 s 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran 1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix 2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix ``` ### Cached run (54% faster than main) ``` hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432 Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms] Range (min … max): 535.2 ms … 570.6 ms 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms] Range (min … max): 601.6 ms … 655.3 ms 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms] Range (min … max): 731.1 ms … 890.0 ms 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran 1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix 1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix ```
2024-05-31 14:41:30 +03:00
. ignoreProgressLog
. runReader testPipelineOptions
. runTaggedLock LockModeExclusive
testRunIO ::
forall a m.
(MonadIO m) =>
EntryPoint ->
Sem (PipelineEff PipelineAppEffects) a ->
m (ResolverState, PipelineResult a)
Parallel pipeline (#2779) This pr introduces parallelism in the pipeline to gain performance. I've included benchmarks at the end. - Closes #2750. # Flags: There are two new global flags: 1. `-N / --threads`. It is used to set the number of capabilities. According to [GHC documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities): _Set the number of Haskell threads that can run truly simultaneously (on separate physical processors) at any given time_. When compiling in parallel, we create this many worker threads. The default value is `-N auto`, which sets `-N` to half the number of logical cores, capped at 8. 2. `--dev-show-thread-ids`. When given, the thread id is printed in the compilation progress log. E.g. ![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034) # Parallel compilation 1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the concurrency related code. I think it is good to keep this code separated from the actual compiler code. 2. I've added a progress log (only for the parallel driver) that outputs a log of the compilation progress, similar to what stack/cabal do. # Code changes: 1. I've removed the `setup` stage where we were registering dependencies. Instead, the dependencies are registered when the `pathResolver` is run for the first time. This way it is safer. 1. Now the `ImportTree` is needed to run the pipeline. Cycles are detected during the construction of this tree, so I've removed `Reader ImportParents` from the pipeline. 3. For the package pathresolver, we do not support parallelism yet (we could add support for it in the future, but the gains will be small). 4. When `-N1`, the pipeline remains unchanged, so performance should be the same as in the main branch (except there is a small performance degradation due to adding the `-threaded` flag). 5. I've introduced `PipelineOptions`, which are options that are used to pass options to the effects in the pipeline. 6. `PathResolver` constraint has been removed from the `upTo*` functions in the pipeline due to being redundant. 7. I've added a lot of `NFData` instances. They are needed to force the full evaluation of `Stored.ModuleInfo` in each of the threads. 2. The `Cache` effect uses [`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html) as opposed to [`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html). Perhaps we should provide different versions. 3. I've added a `Cache` handler that accepts a setup function. The setup is triggered when a miss is detected. It is used to lazily compile the modules in parallel. # Tests 1. I've adapted the smoke test suite to ignore the progress log in the stderr. 5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to laziness, a crash happening in this file was not being caught. The problem is that in this file we have a lambda function with different number of patterns in their clauses, which we currently do not support (https://github.com/anoma/juvix/issues/1706). 6. I've had to comment out the definition ``` x : Box ((A : Type) → A → A) := box λ {A a := a}; ``` From the test as it was causing a crash (https://github.com/anoma/juvix/issues/2247). # Future Work 1. It should be investigated how much performance we lose by fully evaluating the `Stored.ModuleInfo`, since some information in it will be discarded. It may be possible to be more fine-grained when forcing evaluation. 8. The scanning of imports to build the import tree is sequential. Now, we build the import tree from the entry point module and only the modules that are imported from it are in the tree. However, we have discussed that at some point we should make a distinction between `juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix` as a build tool it makes sense to typecheck/compile (to stored core) all modules in the project. When/if we do this, scanning imports in all modules in parallel becomes trivial. 9. The implementation of the `ParallelTemplate` uses low level primitives such as [forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO). At some point it should be refactored to use safer functions from the [`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html) module. 10. The number of cores and worker threads that we spawn is determined by the command line. Ideally, we could use to import tree to compute an upper bound to the ideal number of cores to use. 11. We could add an animation that displays which modules are being compiled in parallel and which have finished being compiled. # Benchmarks On some benchmarks, I include the GHC runtime option [`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9), which sometimes makes a good impact on performance. Thanks to @paulcadman for pointing this out. I've figured a good combination of `-N` and `-A` through trial and error (but this oviously depends on the cpu and juvix projects). ## Typecheck the standard library ### Clean run (88% faster than main): ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms] Range (min … max): 432.6 ms … 455.9 ms 10 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms] Range (min … max): 584.7 ms … 670.6 ms 10 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms] Range (min … max): 816.0 ms … 853.6 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran 1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix 1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ### Cached run (43% faster than main): ``` hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms] Range (min … max): 231.5 ms … 251.3 ms 11 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms] Range (min … max): 216.1 ms … 253.1 ms 12 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms] Range (min … max): 316.9 ms … 351.8 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix ran 1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ## Typecheck the test suite of the containers library At the moment this is the biggest juvix project that we have. ### Clean run (105% faster than main) ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864 Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s] Range (min … max): 0.991 s … 1.023 s 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s] Range (min … max): 1.535 s … 1.660 s 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s] Range (min … max): 2.048 s … 2.077 s 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran 1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix 2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix ``` ### Cached run (54% faster than main) ``` hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432 Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms] Range (min … max): 535.2 ms … 570.6 ms 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms] Range (min … max): 601.6 ms … 655.3 ms 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms] Range (min … max): 731.1 ms … 890.0 ms 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran 1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix 1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix ```
2024-05-31 14:41:30 +03:00
testRunIO e =
testTaggedLockedToIO
. runIO defaultGenericOptions e
testDefaultEntryPointIO :: (MonadIO m) => Path Abs Dir -> Path Abs File -> m EntryPoint
Parallel pipeline (#2779) This pr introduces parallelism in the pipeline to gain performance. I've included benchmarks at the end. - Closes #2750. # Flags: There are two new global flags: 1. `-N / --threads`. It is used to set the number of capabilities. According to [GHC documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities): _Set the number of Haskell threads that can run truly simultaneously (on separate physical processors) at any given time_. When compiling in parallel, we create this many worker threads. The default value is `-N auto`, which sets `-N` to half the number of logical cores, capped at 8. 2. `--dev-show-thread-ids`. When given, the thread id is printed in the compilation progress log. E.g. ![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034) # Parallel compilation 1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the concurrency related code. I think it is good to keep this code separated from the actual compiler code. 2. I've added a progress log (only for the parallel driver) that outputs a log of the compilation progress, similar to what stack/cabal do. # Code changes: 1. I've removed the `setup` stage where we were registering dependencies. Instead, the dependencies are registered when the `pathResolver` is run for the first time. This way it is safer. 1. Now the `ImportTree` is needed to run the pipeline. Cycles are detected during the construction of this tree, so I've removed `Reader ImportParents` from the pipeline. 3. For the package pathresolver, we do not support parallelism yet (we could add support for it in the future, but the gains will be small). 4. When `-N1`, the pipeline remains unchanged, so performance should be the same as in the main branch (except there is a small performance degradation due to adding the `-threaded` flag). 5. I've introduced `PipelineOptions`, which are options that are used to pass options to the effects in the pipeline. 6. `PathResolver` constraint has been removed from the `upTo*` functions in the pipeline due to being redundant. 7. I've added a lot of `NFData` instances. They are needed to force the full evaluation of `Stored.ModuleInfo` in each of the threads. 2. The `Cache` effect uses [`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html) as opposed to [`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html). Perhaps we should provide different versions. 3. I've added a `Cache` handler that accepts a setup function. The setup is triggered when a miss is detected. It is used to lazily compile the modules in parallel. # Tests 1. I've adapted the smoke test suite to ignore the progress log in the stderr. 5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to laziness, a crash happening in this file was not being caught. The problem is that in this file we have a lambda function with different number of patterns in their clauses, which we currently do not support (https://github.com/anoma/juvix/issues/1706). 6. I've had to comment out the definition ``` x : Box ((A : Type) → A → A) := box λ {A a := a}; ``` From the test as it was causing a crash (https://github.com/anoma/juvix/issues/2247). # Future Work 1. It should be investigated how much performance we lose by fully evaluating the `Stored.ModuleInfo`, since some information in it will be discarded. It may be possible to be more fine-grained when forcing evaluation. 8. The scanning of imports to build the import tree is sequential. Now, we build the import tree from the entry point module and only the modules that are imported from it are in the tree. However, we have discussed that at some point we should make a distinction between `juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix` as a build tool it makes sense to typecheck/compile (to stored core) all modules in the project. When/if we do this, scanning imports in all modules in parallel becomes trivial. 9. The implementation of the `ParallelTemplate` uses low level primitives such as [forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO). At some point it should be refactored to use safer functions from the [`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html) module. 10. The number of cores and worker threads that we spawn is determined by the command line. Ideally, we could use to import tree to compute an upper bound to the ideal number of cores to use. 11. We could add an animation that displays which modules are being compiled in parallel and which have finished being compiled. # Benchmarks On some benchmarks, I include the GHC runtime option [`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9), which sometimes makes a good impact on performance. Thanks to @paulcadman for pointing this out. I've figured a good combination of `-N` and `-A` through trial and error (but this oviously depends on the cpu and juvix projects). ## Typecheck the standard library ### Clean run (88% faster than main): ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms] Range (min … max): 432.6 ms … 455.9 ms 10 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms] Range (min … max): 584.7 ms … 670.6 ms 10 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms] Range (min … max): 816.0 ms … 853.6 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran 1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix 1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ### Cached run (43% faster than main): ``` hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms] Range (min … max): 231.5 ms … 251.3 ms 11 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms] Range (min … max): 216.1 ms … 253.1 ms 12 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms] Range (min … max): 316.9 ms … 351.8 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix ran 1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ## Typecheck the test suite of the containers library At the moment this is the biggest juvix project that we have. ### Clean run (105% faster than main) ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864 Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s] Range (min … max): 0.991 s … 1.023 s 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s] Range (min … max): 1.535 s … 1.660 s 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s] Range (min … max): 2.048 s … 2.077 s 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran 1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix 2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix ``` ### Cached run (54% faster than main) ``` hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432 Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms] Range (min … max): 535.2 ms … 570.6 ms 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms] Range (min … max): 601.6 ms … 655.3 ms 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms] Range (min … max): 731.1 ms … 890.0 ms 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran 1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix 1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix ```
2024-05-31 14:41:30 +03:00
testDefaultEntryPointIO cwd mainFile =
testTaggedLockedToIO $
defaultEntryPointIO cwd mainFile
testDefaultEntryPointNoFileIO :: Path Abs Dir -> IO EntryPoint
testDefaultEntryPointNoFileIO cwd = testTaggedLockedToIO (defaultEntryPointNoFileIO cwd)
testRunIOEither ::
EntryPoint ->
Sem (PipelineEff PipelineAppEffects) a ->
IO (Either JuvixError (ResolverState, PipelineResult a))
Parallel pipeline (#2779) This pr introduces parallelism in the pipeline to gain performance. I've included benchmarks at the end. - Closes #2750. # Flags: There are two new global flags: 1. `-N / --threads`. It is used to set the number of capabilities. According to [GHC documentation](https://hackage.haskell.org/package/base-4.20.0.0/docs/GHC-Conc.html#v:setNumCapabilities): _Set the number of Haskell threads that can run truly simultaneously (on separate physical processors) at any given time_. When compiling in parallel, we create this many worker threads. The default value is `-N auto`, which sets `-N` to half the number of logical cores, capped at 8. 2. `--dev-show-thread-ids`. When given, the thread id is printed in the compilation progress log. E.g. ![image](https://github.com/anoma/juvix/assets/5511599/9359fae2-0be1-43e5-8d74-faa82cba4034) # Parallel compilation 1. I've added `src/Parallel/ParallelTemplate.hs` which contains all the concurrency related code. I think it is good to keep this code separated from the actual compiler code. 2. I've added a progress log (only for the parallel driver) that outputs a log of the compilation progress, similar to what stack/cabal do. # Code changes: 1. I've removed the `setup` stage where we were registering dependencies. Instead, the dependencies are registered when the `pathResolver` is run for the first time. This way it is safer. 1. Now the `ImportTree` is needed to run the pipeline. Cycles are detected during the construction of this tree, so I've removed `Reader ImportParents` from the pipeline. 3. For the package pathresolver, we do not support parallelism yet (we could add support for it in the future, but the gains will be small). 4. When `-N1`, the pipeline remains unchanged, so performance should be the same as in the main branch (except there is a small performance degradation due to adding the `-threaded` flag). 5. I've introduced `PipelineOptions`, which are options that are used to pass options to the effects in the pipeline. 6. `PathResolver` constraint has been removed from the `upTo*` functions in the pipeline due to being redundant. 7. I've added a lot of `NFData` instances. They are needed to force the full evaluation of `Stored.ModuleInfo` in each of the threads. 2. The `Cache` effect uses [`SharedState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-State-Static-Shared.html) as opposed to [`LocalState`](https://hackage.haskell.org/package/effectful-core-2.3.0.1/docs/Effectful-Writer-Static-Local.html). Perhaps we should provide different versions. 3. I've added a `Cache` handler that accepts a setup function. The setup is triggered when a miss is detected. It is used to lazily compile the modules in parallel. # Tests 1. I've adapted the smoke test suite to ignore the progress log in the stderr. 5. I've had to adapt `tests/positive/Internal/Lambda.juvix`. Due to laziness, a crash happening in this file was not being caught. The problem is that in this file we have a lambda function with different number of patterns in their clauses, which we currently do not support (https://github.com/anoma/juvix/issues/1706). 6. I've had to comment out the definition ``` x : Box ((A : Type) → A → A) := box λ {A a := a}; ``` From the test as it was causing a crash (https://github.com/anoma/juvix/issues/2247). # Future Work 1. It should be investigated how much performance we lose by fully evaluating the `Stored.ModuleInfo`, since some information in it will be discarded. It may be possible to be more fine-grained when forcing evaluation. 8. The scanning of imports to build the import tree is sequential. Now, we build the import tree from the entry point module and only the modules that are imported from it are in the tree. However, we have discussed that at some point we should make a distinction between `juvix` _the compiler_ and `juvix` _the build tool_. When using `juvix` as a build tool it makes sense to typecheck/compile (to stored core) all modules in the project. When/if we do this, scanning imports in all modules in parallel becomes trivial. 9. The implementation of the `ParallelTemplate` uses low level primitives such as [forkIO](https://hackage.haskell.org/package/base-4.20.0.0/docs/Control-Concurrent.html#v:forkIO). At some point it should be refactored to use safer functions from the [`Effectful.Concurrent.Async`](https://hackage.haskell.org/package/effectful-2.3.0.0/docs/Effectful-Concurrent-Async.html) module. 10. The number of cores and worker threads that we spawn is determined by the command line. Ideally, we could use to import tree to compute an upper bound to the ideal number of cores to use. 11. We could add an animation that displays which modules are being compiled in parallel and which have finished being compiled. # Benchmarks On some benchmarks, I include the GHC runtime option [`-A`](https://downloads.haskell.org/ghc/latest/docs/users_guide/runtime_control.html#rts-flag--A%20%E2%9F%A8size%E2%9F%A9), which sometimes makes a good impact on performance. Thanks to @paulcadman for pointing this out. I've figured a good combination of `-N` and `-A` through trial and error (but this oviously depends on the cpu and juvix projects). ## Typecheck the standard library ### Clean run (88% faster than main): ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 444.1 ms ± 6.5 ms [User: 1018.0 ms, System: 77.7 ms] Range (min … max): 432.6 ms … 455.9 ms 10 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 628.3 ms ± 23.9 ms [User: 1227.6 ms, System: 69.5 ms] Range (min … max): 584.7 ms … 670.6 ms 10 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 835.9 ms ± 12.3 ms [User: 788.5 ms, System: 31.9 ms] Range (min … max): 816.0 ms … 853.6 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 ran 1.41 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix 1.88 ± 0.04 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ### Cached run (43% faster than main): ``` hyperfine --warmup 1 'juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432' 'juvix -N 4 typecheck Stdlib/Prelude.juvix' 'juvix-main typecheck Stdlib/Prelude.juvix' Benchmark 1: juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 Time (mean ± σ): 241.3 ms ± 7.3 ms [User: 538.6 ms, System: 101.3 ms] Range (min … max): 231.5 ms … 251.3 ms 11 runs Benchmark 2: juvix -N 4 typecheck Stdlib/Prelude.juvix Time (mean ± σ): 235.1 ms ± 12.0 ms [User: 405.3 ms, System: 87.7 ms] Range (min … max): 216.1 ms … 253.1 ms 12 runs Benchmark 3: juvix-main typecheck Stdlib/Prelude.juvix Time (mean ± σ): 336.7 ms ± 13.3 ms [User: 269.5 ms, System: 67.1 ms] Range (min … max): 316.9 ms … 351.8 ms 10 runs Summary juvix -N 4 typecheck Stdlib/Prelude.juvix ran 1.03 ± 0.06 times faster than juvix -N 4 typecheck Stdlib/Prelude.juvix +RTS -A33554432 1.43 ± 0.09 times faster than juvix-main typecheck Stdlib/Prelude.juvix ``` ## Typecheck the test suite of the containers library At the moment this is the biggest juvix project that we have. ### Clean run (105% faster than main) ``` hyperfine --warmup 1 --prepare 'juvix clean' 'juvix -N 6 typecheck Main.juvix +RTS -A67108864' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A67108864 Time (mean ± σ): 1.006 s ± 0.011 s [User: 2.171 s, System: 0.162 s] Range (min … max): 0.991 s … 1.023 s 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 1.584 s ± 0.046 s [User: 2.934 s, System: 0.149 s] Range (min … max): 1.535 s … 1.660 s 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 2.066 s ± 0.010 s [User: 1.939 s, System: 0.089 s] Range (min … max): 2.048 s … 2.077 s 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A67108864 ran 1.57 ± 0.05 times faster than juvix -N 4 typecheck Main.juvix 2.05 ± 0.03 times faster than juvix-main typecheck Main.juvix ``` ### Cached run (54% faster than main) ``` hyperfine --warmup 1 'juvix -N 6 typecheck Main.juvix +RTS -A33554432' 'juvix -N 4 typecheck Main.juvix' 'juvix-main typecheck Main.juvix' Benchmark 1: juvix -N 6 typecheck Main.juvix +RTS -A33554432 Time (mean ± σ): 551.8 ms ± 13.2 ms [User: 1419.8 ms, System: 199.4 ms] Range (min … max): 535.2 ms … 570.6 ms 10 runs Benchmark 2: juvix -N 4 typecheck Main.juvix Time (mean ± σ): 636.7 ms ± 17.3 ms [User: 1006.3 ms, System: 196.3 ms] Range (min … max): 601.6 ms … 655.3 ms 10 runs Benchmark 3: juvix-main typecheck Main.juvix Time (mean ± σ): 847.2 ms ± 58.9 ms [User: 710.1 ms, System: 126.5 ms] Range (min … max): 731.1 ms … 890.0 ms 10 runs Summary juvix -N 6 typecheck Main.juvix +RTS -A33554432 ran 1.15 ± 0.04 times faster than juvix -N 4 typecheck Main.juvix 1.54 ± 0.11 times faster than juvix-main typecheck Main.juvix ```
2024-05-31 14:41:30 +03:00
testRunIOEither entry =
testTaggedLockedToIO
. runIOEither entry
testRunIOEitherTermination ::
EntryPoint ->
Sem (Termination ': PipelineEff PipelineAppEffects) a ->
IO (Either JuvixError (ResolverState, PipelineResult a))
testRunIOEitherTermination entry =
testRunIOEither entry
. evalTermination iniTerminationState
Add nockma evaluator (#2564) This PR adds an parser, pretty printer, evaluator, repl and quasi-quoter for Nock terms. ## Parser / Pretty Printer The parser and pretty printer handle both standard Nock terms and 'pretty' Nock terms (where op codes and paths can be named). Standard and pretty Nock forms can be mixed in the same term. For example instead of `[0 2]` you can write `[@ L]`. See https://github.com/anoma/juvix/blob/a6028b0d92e2dff02329ab7f441bf48ccdeb3eb3/src/Juvix/Compiler/Nockma/Language.hs#L79 for the correspondence between pretty Nock and Nock operators. In pretty Nock, paths are represented as strings of `L` (for head) and `R` (for tail) instead of the number encoding in standard nock. The character `S` is used to refer to the whole subject, i.e it is sugar for `1` in standard Nock. See https://github.com/anoma/juvix/blob/a6028b0d92e2dff02329ab7f441bf48ccdeb3eb3/src/Juvix/Compiler/Nockma/Language.hs#L177 for the correspondence between pretty Nock path and standard Nock position. ## Quasi-quoter A quasi-quoter is added so Nock terms can be included in the source, e.g `[nock| [@ LL] |]`. ## REPL Launch the repl with `juvix dev nockma repl`. A Nock `[subject formula]` cell is input as `subject / formula` , e.g: ``` nockma> [1 0] / [@ L] 1 ``` The subject can be set using `:set-stack`. ``` nockma> :set-stack [1 0] nockma> [@ L] 1 ``` The subject can be viewed using `:get-stack`. ``` nockma> :set-stack [1 0] nockma> :get-stack [1 0] ``` You can assign a Nock term to a variable and use it in another expression: ``` nockma> r := [@ L] nockma> [1 0] / r 1 ``` A list of assignments can be read from a file: ``` $ cat stack.nock r := [@ L] $ juvix dev nockma repl nockma> :load stack.nock nockma> [1 0] / r 1 ``` * Closes https://github.com/anoma/juvix/issues/2557 --------- Co-authored-by: Jan Mas Rovira <janmasrovira@gmail.com> Co-authored-by: Lukasz Czajka <lukasz@heliax.dev>
2024-01-11 15:04:38 +03:00
assertFailure :: (MonadIO m) => String -> m a
assertFailure = liftIO . HUnit.assertFailure
-- | The same as `P.readProcess` but instead of inheriting `stderr` redirects it
-- to the child's `stdout`.
readProcess :: FilePath -> [String] -> Text -> IO Text
readProcess = readProcessCwd' Nothing Nothing
readProcessWithEnv :: [(String, String)] -> FilePath -> [String] -> Text -> IO Text
readProcessWithEnv env = readProcessCwd' (Just env) Nothing
readProcessCwd :: FilePath -> FilePath -> [String] -> Text -> IO Text
readProcessCwd cwd = readProcessCwd' Nothing (Just cwd)
readProcessCwd' :: Maybe [(String, String)] -> Maybe FilePath -> FilePath -> [String] -> Text -> IO Text
readProcessCwd' menv mcwd cmd args stdinText =
withTempDir'
( \dirPath -> do
(_, hin) <- openTempFile dirPath "stdin"
(_, hout) <- openTempFile dirPath "stdout"
hPutStr hin stdinText
hSeek hin AbsoluteSeek 0
(_, _, _, ph) <-
P.createProcess_
"readProcess"
(P.proc cmd args)
{ P.std_in = P.UseHandle hin,
P.std_out = P.UseHandle hout,
P.std_err = P.UseHandle hout,
P.cwd = mcwd,
P.env = menv
}
P.waitForProcess ph
hSeek hout AbsoluteSeek 0
r <- hGetContents hout
hClose hin
hClose hout
return r
)