- Add encodeUtf8_, encodeLatin1_, encodeUtf8', encodeLatin1',
decodeUtf8_, decodeUtf8', decodeUtf8Arrays_, decodeUtf8Arrays'.
- encodeLatin1, encodeUtf8 don't fail when they encounter
invalid characters. Their stricter variants encodeLatin1',
encodeUtf8' do.
- encodeUtf8_, encodeLatin1_ drop the invalid characters.
- decodeUtf8 replaces any invalid character encountered with the
Unicode replacement character, whereas decodeUtf8' fails.
- decodeUtf8Arrays no longer fails on invalid characters;
it replaces them with the Unicode replacement character.
- Deprecate decodeUtf8Lax, encodeLatin1Lax, encodeUtf8Lax.
- Update Changelog.md.
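The lenient/dropping/strict distinction described above can be sketched with
plain lists, using Latin-1 as the simplest case. The function names below are
illustrative stand-ins for the behavior of encodeLatin1, encodeLatin1_ and
encodeLatin1', not the streamly API itself:

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Lenient (like encodeLatin1): never fails; out-of-range code points
-- are simply truncated to 8 bits here (illustrative semantics only).
encodeLenient :: String -> [Word8]
encodeLenient = map (fromIntegral . ord)

-- Dropping variant (like encodeLatin1_): invalid characters are dropped.
encodeDrop :: String -> [Word8]
encodeDrop = map (fromIntegral . ord) . filter ((<= 255) . ord)

-- Strict variant (like encodeLatin1'): fail on the first invalid character.
encodeStrict :: String -> Either Char [Word8]
encodeStrict s = case filter ((> 255) . ord) s of
    (c : _) -> Left c
    []      -> Right (map (fromIntegral . ord) s)

main :: IO ()
main = do
    print (encodeDrop "h\233\955")   -- '\955' (U+03BB) is out of Latin-1 range
    print (encodeStrict "h\233")
    print (encodeStrict "h\955")
```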
"Streamly.Internal.Data.Strict" is replaced by:
Streamly.Internal.Data.Tuple.Strict
Streamly.Internal.Data.Maybe.Strict
Streamly.Internal.Data.Either.Strict
This commit also has some formatting changes to imports.
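A strict pair in the style of Streamly.Internal.Data.Tuple.Strict might look
like the following (the constructor name is assumed here); the point of the
strict fields is to keep fold accumulators fully evaluated, avoiding thunk
buildup:

```haskell
import Data.List (foldl')

-- Sketch of a strict pair: bang annotations make both fields strict.
data Tuple' a b = Tuple' !a !b deriving (Eq, Show)

-- With strict fields the accumulator stays evaluated at each step.
sumAndCount :: [Int] -> Tuple' Int Int
sumAndCount = foldl' step (Tuple' 0 0)
  where
    step (Tuple' s n) x = Tuple' (s + x) (n + 1)

main :: IO ()
main = print (sumAndCount [1 .. 10])
```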
This commit results in worse performance because we now double-buffer:
once in ParserD and once in ParserK. This can potentially be fixed but
would require bigger changes to unify the backtracking buffer management
for ParserD and ParserK.
Use --stream-size for all the array benchmarks now
Replace the following files:
- benchmark/Streamly/Benchmark/Data/Array.hs
- benchmark/Streamly/Benchmark/Data/ArrayOps.hs
- benchmark/Streamly/Benchmark/Data/Prim/Array.hs
- benchmark/Streamly/Benchmark/Data/Prim/ArrayOps.hs
- benchmark/Streamly/Benchmark/Data/SmallArray.hs
- benchmark/Streamly/Benchmark/Data/SmallArrayOps.hs
- benchmark/Streamly/Benchmark/Memory/Array.hs
- benchmark/Streamly/Benchmark/Memory/ArrayOps.hs
With:
- benchmark/Streamly/Benchmark/Array.hs
- benchmark/Streamly/Benchmark/ArrayOps.hs
Now they are in the FileSystem.Handle benchmark module, corresponding to the
source module with the same name. Also, the benchmarks are now arranged by
space complexity so that we can apply RTS memory restrictions when running
them. Longer benchmarks now use a shorter file.
"allocated" is much more stable for regression comparisons as it stays the same
whereas "time" varies based on various factors like cpu frequency, other things
running on the computer, context switches etc.
bytesCopied is a measure of long-lived data being retained across GCs, which
is also a good measure of performance.
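Both counters can be read from GHC's own RTS statistics when the program is
run with +RTS -T; a minimal sketch using GHC.Stats (how the benchmark harness
actually collects these numbers may differ):

```haskell
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

main :: IO ()
main = do
    enabled <- getRTSStatsEnabled
    if enabled
        then do
            s <- getRTSStats
            -- Total bytes allocated over the program's lifetime.
            putStrLn ("allocated bytes: " ++ show (allocated_bytes s))
            -- Bytes copied during GC: a proxy for long-lived retained data.
            putStrLn ("copied bytes:    " ++ show (copied_bytes s))
        else putStrLn "RTS stats not enabled; run with +RTS -T"
```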
Some of the benchmarks were an order of magnitude off due to missing INLINE
pragmas on type class operations. Now all of them are within reasonable
limits. Benchmarks affected for serial streams:
* Functor, Applicative, Monad, transformers
We need to do a similar exercise for other types of streams and for
folds/parsers as well.
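The fix amounts to INLINE pragmas on instance methods so that GHC can inline
and specialize them at call sites. A toy illustration (the Stream type below
is made up for the example, not streamly's):

```haskell
-- A toy stream type; without the INLINE pragma, calls to fmap go
-- through the dictionary and can block fusion/specialization.
newtype Stream a = Stream [a]

instance Functor Stream where
    {-# INLINE fmap #-}
    fmap f (Stream xs) = Stream (map f xs)

main :: IO ()
main = let Stream ys = fmap (+ 1) (Stream [1, 2, 3 :: Int]) in print ys
```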
* Add three interesting cases for each concatMap variant
* For mapM, map concurrently on a serial stream so that we measure only the
concurrency overhead of mapM, and not concurrent generation plus mapM
* For Async streams add some benchmarks involving the `async` combinator.
* Add a benchmark for `foldrS`
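`foldrS` is a right fold whose result is itself a stream; a list-based
analogue of that shape (the name and signature below are illustrative, not
the streamly combinator):

```haskell
-- Right fold returning a "stream" (here, a lazily built list): the step
-- function sees one element plus the lazily constructed rest of the result.
foldrList :: (a -> [b] -> [b]) -> [b] -> [a] -> [b]
foldrList f z = go
  where
    go []       = z
    go (x : xs) = f x (go xs)

main :: IO ()
main = print (foldrList (\x rest -> x : 0 : rest) [] [1, 2, 3 :: Int])
```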
* Benchmark modules now correspond to source modules. The Prelude module in
the source corresponds to several benchmark modules, one for each stream type.
* Benchmarks in the same order/groupings as they appear in source
* All benchmarks now have division according to space complexity
* Refactoring reduces a lot of code duplication, especially in the stream
generation and elimination functions.
* The RTS options are now completely set in the shell script to run the
benchmarks.
* RTS options can be set on a per-benchmark basis, and they now work
correctly.
* The set of streaming/infinite stream benchmarks is now complete and we can
run all such benchmarks conveniently.
* Benchmark "quick"/"speed" options can now be specified on a per benchmark
basis. Longer benchmarks can have fewer iterations/quick run time.
* Benchmarks are grouped in several groups which can be run on a per group
basis. Comparison groups are also defined for convenient comparisons of
different modules (e.g. arrays or streamD/K).
* The benchmark namespaces are grouped in a consistent manner. Benchmark
executables have a consistent naming based on module names.