Update benchmarks and describe the results

2024-11-22 22:14:21 +03:00 · 2022-03-20 08:48:33 +01:00 · 2022-03-20 08:48:33 +01:00 · adccd8fa7e
commit adccd8fa7e
parent a23a78913d
8 changed files with 97 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -15,9 +15,8 @@ with the existing Haskell ecosystem.

 Main features:

-1. Very fast (benchmark results with GHC 9.0.2:
-   [countdown](https://raw.githubusercontent.com/haskell-effectful/effectful/master/benchmarks/bench_countdown_1000.png),
-   [filesize](https://raw.githubusercontent.com/haskell-effectful/effectful/master/benchmarks/bench_filesize_1000.png)).
+1. Very fast
+   ([benchmarks](https://github.com/haskell-effectful/effectful/tree/master/benchmarks)).

 2. Easy to use API (no boilerplate code and dealing with arcane types).

--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -0,0 +1,95 @@
+# Benchmarks
+
+## Introduction
+
+The benchmark suite of `effectful` compares the performance of the most popular
+extensible effects libraries in several scenarios. It implements two benchmarks:
+
+- **countdown** - a microbenchmark that effectively measures the performance of
+  monadic binds and effect dispatch.
+
+- **filesize** - a more down-to-earth benchmark that does various things,
+  including I/O.
+
+Each benchmark has two flavours that affect the number of effects available in
+the context:
+
+- **shallow** - only effects necessary for the benchmark.
+
+- **deep** - necessary effects plus 5 redundant effects placed in the context
+  before the relevant ones and 5 after (10 in total). This simulates a typical
+  scenario in which the code uses only a portion of the total number of effects
+  available to the application.
+
+Moreover, the benchmarked code was annotated with `NOINLINE` pragmas to prevent
+GHC from inlining it and/or specializing away type class constraints related to
+effects. This is crucial for getting realistic results: in any non-trivial,
+multi-module application the compiler cannot do this, since it would
+essentially amount to whole-program specialization.
+
+The code was compiled with GHC 9.0.2 and run on a Ryzen 9 5950X.
+
+## Results
+
+### Countdown
+
+<img src="https://raw.githubusercontent.com/haskell-effectful/effectful/master/benchmarks/bench_countdown_1000.png">
+
+Analysis:
+
+1. `effectful` takes the lead. Its static dispatch is on par with the reference
+   implementation that uses the `ST` monad, so it adds no extra
+   overhead. Its dynamic dispatch is also the fastest.
+
+2. `cleff` uses implementation techniques very similar to those of `effectful`
+   and is only a bit behind in the shallow version, but falls further behind in
+   the deep one. This is because it uses `IntMap` for effect dispatch
+   underneath, so lookup is not quite constant time with respect to the size of
+   the effect stack. For comparison, `effectful` uses arrays.
+
+3. `freer-simple` does surprisingly well for a solution that's based on free
+   monads.
+   
+4. `mtl` comes next, and unfortunately this is where the conventional wisdom
+   that it is fast crumbles. The deep version is **50 times** slower than the
+   reference implementation!
+   
+   This is a direct consequence of how type classes are compiled. To be more
+   precise, during compilation type class constraints are translated by the
+   compiler into regular arguments. These arguments are class dictionaries,
+   i.e. records containing all the methods of the type class.
+   
+   Now, because usage of `mtl` style effects requires the monad to be
+   polymorphic, such functions are passed a dictionary of `Monad` methods at
+   runtime and have to call them through it. **In particular, this applies to
+   the monadic bind**. That's the crux of the problem: bind is called between
+   every pair of monadic operations, so turning it into an unknown function
+   call has a disastrous effect on performance.
+   
+   Why is the result for the deep stack so much worse than for the shallow one
+   though? It's because in reality, each call to bind performs *O(n)* function
+   calls, where *n* is the number of monad transformers on the stack. That's
+   because the implementation of bind for every monad transformer refers to the
+   bind of the monad it transforms.
+   
+   Compare that to `effectful`, where monadic binds are known function calls and
+   can be eliminated by the compiler. What is more, the only data passed via
+   class constraints are the dictionaries of `:>`, each represented by a
+   single `Int` pointing at the place in the stack where the relevant effect is
+   located.
+
+5. `fused-effects` exhibits behavior similar to `mtl`. This comes as no
+   surprise, as it uses the same implementation techniques. It augments them
+   with additional machinery for convenience, which seems to add even more
+   overhead.
+
+6. `polysemy` is based on free monads, just like `freer-simple`, and performs
+   similarly, though with a much higher initial overhead.
+
+### Filesize
+
+<img src="https://raw.githubusercontent.com/haskell-effectful/effectful/master/benchmarks/bench_filesize_1000.png">
+
+The results are similar to those of the *countdown* benchmark. It's worth
+noting though that the introduction of other effects and I/O makes the
+performance differences between libraries much less pronounced.
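
Note on item 4 of the countdown analysis in `benchmarks/README.md`: the dictionary-passing argument can be illustrated with a minimal sketch that uses explicit records instead of real type classes. It is not part of this commit. `MonadDict`, `dTick`, `Count`, `Layer` and `twoBinds` are hypothetical names invented for the illustration; GHC's actual `Monad` dictionaries are not user-visible and have more fields, and `dTick` exists only to count how many bind implementations run.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical, explicit stand-in for the dictionary GHC passes for a
-- `Monad m` constraint. `dTick` is instrumentation only: it bumps a counter
-- without going through bind.
data MonadDict m = MonadDict
  { dPure :: forall a. a -> m a
  , dBind :: forall a b. m a -> (a -> m b) -> m b
  , dTick :: forall a. m a -> m a
  }

-- Base monad: threads a counter of executed bind implementations.
newtype Count a = Count { runCount :: Int -> (a, Int) }

countDict :: MonadDict Count
countDict = MonadDict
  { dPure = \a -> Count (\n -> (a, n))
  , dBind = \m k -> Count $ \n ->
      let (a, n') = runCount m (n + 1)  -- the base bind counts itself
      in  runCount (k a) n'
  , dTick = \m -> Count (\n -> runCount m (n + 1))
  }

-- One transformer layer: a reader over a trivial environment is enough here.
newtype Layer m a = Layer { runLayer :: () -> m a }

-- The layer's bind is defined in terms of the bind of the monad it
-- transforms, so every extra layer adds one more call per bind.
layerDict :: MonadDict m -> MonadDict (Layer m)
layerDict inner = MonadDict
  { dPure = \a -> Layer (\_ -> dPure inner a)
  , dBind = \m k -> Layer $ \r ->
      dTick inner (dBind inner (runLayer m r) (\a -> runLayer (k a) r))
  , dTick = \m -> Layer (\r -> dTick inner (runLayer m r))
  }

-- Code that is polymorphic in the monad can only reach bind through the
-- dictionary it was handed. This program performs exactly two binds.
twoBinds :: MonadDict m -> m Int
twoBinds d = dBind d (dPure d 1) (\x -> dBind d (dPure d (x + 1)) (dPure d))

-- (result, number of bind implementations executed) at stack depth 0, 1, 2.
depth0, depth1, depth2 :: (Int, Int)
depth0 = runCount (twoBinds countDict) 0
depth1 = runCount (runLayer (twoBinds (layerDict countDict)) ()) 0
depth2 =
  runCount (runLayer (runLayer (twoBinds (layerDict (layerDict countDict))) ()) ()) 0

main :: IO ()
main = mapM_ print [depth0, depth1, depth2]
```

Under these assumptions the program prints `(2,2)`, `(2,4)` and `(2,6)`: the same two binds run one, two and three bind implementations each as layers are added, which is the *O(n)* behaviour the analysis describes. The sketch says nothing about absolute timings, and GHC may still specialize real `mtl` code when it can see the concrete monad.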
--- a/benchmarks/bench_countdown_1000.png
+++ b/benchmarks/bench_countdown_1000.png
--- a/benchmarks/bench_countdown_2000.png
+++ b/benchmarks/bench_countdown_2000.png
--- a/benchmarks/bench_countdown_3000.png
+++ b/benchmarks/bench_countdown_3000.png
--- a/benchmarks/bench_filesize_1000.png
+++ b/benchmarks/bench_filesize_1000.png
--- a/benchmarks/bench_filesize_2000.png
+++ b/benchmarks/bench_filesize_2000.png
--- a/benchmarks/bench_filesize_3000.png
+++ b/benchmarks/bench_filesize_3000.png