They are now in the FileSystem.Handle module, which corresponds to the source
module of the same name. They are also arranged by space complexity so that we
can apply RTS memory restrictions when running them.
Longer benchmarks now use a shorter file.
"allocated" is much more stable for regression comparisons as it stays the same
whereas "time" varies with factors such as CPU frequency, other processes
running on the machine, context switches, etc.
bytesCopied is a measure of long lived data being retained across GCs, which is
also a good measure of performance.
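
As a rough illustration of what these two counters measure, the sketch below
reads them through GHC.Stats from base. It is only an assumption that the
benchmark harness collects them in a similar way; the names Usage, snapshot and
measure are hypothetical. The counters are available only when the program is
run with +RTS -T.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- | Hypothetical record: bytes allocated and bytes copied by the GC while an
-- action ran.
data Usage = Usage
    { usageAllocated :: Word64
    , usageCopied :: Word64
    } deriving (Show)

-- | Read the RTS counters. A GC is forced first so that the counters, which
-- the RTS updates at collection time, are current.
snapshot :: IO RTSStats
snapshot = performGC >> getRTSStats

-- | Run an action and report the change in both counters. Returns Nothing
-- when RTS statistics are not enabled (the program was not run with +RTS -T).
measure :: IO a -> IO (Maybe Usage)
measure action = do
    enabled <- getRTSStatsEnabled
    if not enabled
        then action >> return Nothing
        else do
            before <- snapshot
            _ <- action
            after <- snapshot
            return $ Just Usage
                { usageAllocated = allocated_bytes after - allocated_bytes before
                , usageCopied = copied_bytes after - copied_bytes before
                }
```

For example, `measure (print (sum [1 .. 1000000 :: Int]))` in a program started
with +RTS -T should report roughly the same usageAllocated on every run, while
wall-clock time will differ from run to run.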
Some of the benchmarks were an order of magnitude off due to missing INLINE
pragmas on type class operations. Now all of them are within reasonable limits.
Benchmarks affected for serial streams:
* Functor, Applicative, Monad, transformers
We need to do a similar exercise for other types of streams and for
folds/parsers as well.
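
For reference, an INLINE pragma on a type class operation is written inside the
instance body. The sketch below uses a hypothetical wrapper type Wrap, not one
of the library's stream types, just to show where the pragmas go.

```haskell
-- | Hypothetical wrapper around an underlying monad, for illustration only.
newtype Wrap m a = Wrap { unWrap :: m a }

instance Functor m => Functor (Wrap m) where
    {-# INLINE fmap #-}
    fmap f (Wrap m) = Wrap (fmap f m)

instance Applicative m => Applicative (Wrap m) where
    {-# INLINE pure #-}
    pure = Wrap . pure
    {-# INLINE (<*>) #-}
    Wrap f <*> Wrap x = Wrap (f <*> x)

instance Monad m => Monad (Wrap m) where
    {-# INLINE (>>=) #-}
    Wrap m >>= k = Wrap (m >>= unWrap . k)
```

Without the pragmas GHC may leave these methods as calls through the class
dictionary, which can hide them from later optimizations in tight loops.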
* Add 3 interesting cases for each concatMap case
* For mapM, map concurrently on a serial stream so that we measure the
concurrency overhead of mapM only and not both concurrent generation + mapM
* For Async streams add some benchmarks involving the `async` combinator.
* Add a benchmark for `foldrS`
* Benchmark modules now correspond to source modules. The Prelude module in
source corresponds to several modules, one for each stream type.
* Benchmarks are in the same order/groupings as they appear in source
* All benchmarks are now divided according to space complexity
* Refactoring removes a lot of code duplication, especially in the stream
generation and elimination functions.
* RTS options are now set entirely in the shell script that runs the
benchmarks.
* RTS options can be set on a per benchmark basis. RTS options work correctly
now.
* The set of streaming/infinite stream benchmarks is now complete and we can
run all such benchmarks conveniently.
* Benchmark "quick"/"speed" options can now be specified on a per benchmark
basis. Longer benchmarks can have fewer iterations/quick run time.
* Benchmarks are organized into several groups, each of which can be run
separately. Comparison groups are also defined for convenient comparisons of
different modules (e.g. arrays or streamD/K).
* The benchmark namespaces are grouped in a consistent manner. Benchmark
executables have a consistent naming based on module names.
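
The idea of choosing RTS options per benchmark from its space-complexity group
can be sketched as follows. The real selection lives in the shell script; this
is a Haskell rendering of the same idea, and the group names and limits used
here are illustrative assumptions. Only the -K (maximum stack size) and -M
(maximum heap size) RTS flags are taken as given.

```haskell
import Data.List (isInfixOf)

-- | Hypothetical mapping from a benchmark name to RTS arguments. The group
-- names ("o-1-space" etc.) and the sizes are assumptions for illustration.
rtsOptionsFor :: String -> [String]
rtsOptionsFor name = ["+RTS"] ++ limits ++ ["-RTS"]
  where
    limits
        -- constant space: restrict both stack and heap
        | "o-1-space" `isInfixOf` name = ["-K36K", "-M16M"]
        -- heap grows with input: allow a larger heap
        | "o-n-heap" `isInfixOf` name = ["-K36K", "-M128M"]
        -- stack grows with input: allow a larger stack
        | "o-n-stack" `isInfixOf` name = ["-K1M", "-M16M"]
        -- unknown group: no restriction
        | otherwise = []
```

A benchmark that claims constant space but exceeds its limit then fails with a
heap or stack overflow instead of passing silently.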
Add: intersperseSuffix_, delay, timeIndexed
Change the APIs: times, absTimes, relTimes, timestamped
The new APIs have a default clock granularity of 10 ms.
Add: times, relTimes, timestamped
Unimplemented skeletons: durations, ticks, timeout
Changes to the original currentTime combinator: remove the delay from the first
event, cap the granularity to 1 ms to guarantee reasonable CPU usage.
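
The two changes to currentTime can be illustrated with a small sketch using
only base. It is not the library's implementation: the names
defaultGranularity, clampGranularity and sampleTimes are hypothetical, and it
assumes that "cap" means samples are never taken less than 1 ms apart.

```haskell
import Control.Concurrent (threadDelay)
import Data.Word (Word64)
import GHC.Clock (getMonotonicTimeNSec)

-- | Default granularity in microseconds (10 ms).
defaultGranularity :: Int
defaultGranularity = 10000

-- | Never wait less than 1 ms between samples, so that a tiny requested
-- granularity cannot turn into a busy loop.
clampGranularity :: Int -> Int
clampGranularity = max 1000

-- | Take n monotonic clock readings (in nanoseconds). The first reading is
-- taken immediately; each later one follows a delay of the clamped
-- granularity (given in microseconds).
sampleTimes :: Int -> Int -> IO [Word64]
sampleTimes granularity n
    | n <= 0 = return []
    | otherwise = do
        first <- getMonotonicTimeNSec -- no delay before the first event
        rest <- mapM (const next) [2 .. n]
        return (first : rest)
  where
    next = threadDelay (clampGranularity granularity) >> getMonotonicTimeNSec
```

Calling `sampleTimes defaultGranularity 5` yields five readings spaced at least
10 ms apart, with the first one available without waiting.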
CPS performs much better for parsing operations like "sequence" and
"choice" on large containers. Given that the applicative "sequence" does
not scale, I expect the Monad instance will not scale for the direct
implementation either.
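
To make the term concrete, here is a minimal sketch of a parser written in
continuation-passing style, with "sequence" built from its Applicative
instance. ParserK, satisfy and parse are hypothetical names chosen for this
example; the sketch shows the shape of a CPS parser and makes no claim about
the library's actual types or about the measured numbers.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Char (isDigit)

-- | Hypothetical CPS parser: instead of returning a result, it passes the
-- parsed value and the remaining input to a continuation.
newtype ParserK a = ParserK
    { runParserK :: forall r. String -> (a -> String -> Maybe r) -> Maybe r }

instance Functor ParserK where
    fmap f (ParserK p) = ParserK (\s k -> p s (k . f))

instance Applicative ParserK where
    pure a = ParserK (\s k -> k a s)
    -- Run the first parser, then the second on the leftover input, and
    -- hand the combined result to the final continuation.
    ParserK pf <*> ParserK pa =
        ParserK (\s k -> pf s (\f s1 -> pa s1 (k . f)))

-- | Consume one character if it satisfies the predicate.
satisfy :: (Char -> Bool) -> ParserK Char
satisfy p = ParserK step
  where
    step (c : cs) k | p c = k c cs
    step _ _ = Nothing

-- | Run a parser, discarding any leftover input.
parse :: ParserK a -> String -> Maybe a
parse p s = runParserK p s (\a _ -> Just a)
```

With these definitions, `parse (sequenceA (replicate 3 (satisfy isDigit)))
"123x"` evaluates to `Just "123"`.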