Some of the benchmarks were an order of magnitude off due to missing INLINE
pragmas on type class operations. Now all of them are within reasonable limits.
Benchmarks affected for serial streams:
* Functor, Applicative, Monad, transformers
We need to do a similar exercise for other types of streams and for
folds/parsers as well.
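As an illustration of the kind of change described above, here is a minimal sketch of INLINE pragmas on type class methods. The `Stream` newtype and `runStream` are hypothetical stand-ins built on lists, not the library's actual stream representation.

```haskell
-- Hypothetical list-backed stream type, used only to show where the
-- INLINE pragmas go. It is not the real stream representation.
newtype Stream a = Stream { runStream :: [a] }

instance Functor Stream where
    {-# INLINE fmap #-}
    fmap f (Stream xs) = Stream (map f xs)

instance Applicative Stream where
    {-# INLINE pure #-}
    pure x = Stream [x]
    {-# INLINE (<*>) #-}
    Stream fs <*> Stream xs = Stream [f x | f <- fs, x <- xs]

instance Monad Stream where
    {-# INLINE (>>=) #-}
    Stream xs >>= f = Stream (concatMap (runStream . f) xs)
```

Without such pragmas GHC may leave the method calls as out-of-line dictionary calls, which can prevent the surrounding loop from being optimized.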
test for maximumBy, maximum
test for head, sum, prod, max, min
toList, length tests
tests for partial folds and few full folds
Fix indentation, remove warnings
* Document the precise behavior, some changes were made to the earlier behavior
* Make some changes to implementation according to (newly) documented behavior
* takeByTime: perform the time check before generating the element so that we
do not drop an element after generation.
* takeByTime now yields at least one element if the duration is non-zero
* dropByTime does not check the time after the drop duration is over
* Add inspection tests
* Make the tests run for a shorter duration; the earlier tests took too long
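The check-before-generate behavior can be sketched with a simplified model over a list of actions. Everything here is an assumption made for illustration: `takeByTimeModel`, the injected `now` clock, and `demo` are hypothetical names, and the real combinator works on streams rather than lists.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Simplified model: run actions while the elapsed time is below the
-- duration. 'now' is an injected clock so the behavior can be tested.
takeByTimeModel :: IO Double -> Double -> [IO a] -> IO [a]
takeByTimeModel now duration actions = do
    start <- now
    let go [] = pure []
        go (act : rest) = do
            t <- now
            -- The time is checked BEFORE running the action, so an
            -- element is never generated and then thrown away.
            if t - start < duration
                then (:) <$> act <*> go rest
                else pure []
    go actions

-- | Demo with a fake clock that advances by one unit per generated element.
demo :: Double -> IO ([Int], Double)
demo duration = do
    clock <- newIORef (0 :: Double)
    let tick x = modifyIORef' clock (+ 1) >> pure x
    xs <- takeByTimeModel (readIORef clock) duration (map tick [1 .. 10])
    t <- readIORef clock
    pure (xs, t)
```

In this model the first check happens at elapsed time zero, so any non-zero duration yields at least one element, and the clock value returned by `demo` shows that no extra element was generated after the duration expired.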
* We are using "unfold" for unfolding; "runUnfold" is longer and sounds a bit
weird. So, similar to "unfold", we should use "fold" for folding.
* Another reason is that "runFold" ends with the sound of "unfold", which can
be confusing.
* Reimplement groupsOf for better fusion
* Use A.writeNUnsafe for better fusion with groupsOf
Now we can use chunksOf/groupsOf to implement arraysOf with equivalent
performance. This commit hides arraysOf from the Array module and exports it
via Streamly.Internal because in one case (writeS) it is still faster. This can
be resolved once the GHC issue is resolved. Even though writeS is not an officially
exported API, this case indicates that there may potentially be more such cases
which do not fuse well, so we are keeping this as a backup for such cases for
now.
concat is the conventional naming for flattening, and we do not need this name
for something else (e.g. flattening array of arrays). So we can reuse this for
consistency.
Each container type (e.g. Handle/Socket/File) may have similar nested/stream
level operations. We need a standardized way of naming the combinators related
to streams of containers. Also, we cannot have a separate module for such
combinators for each container type. Therefore it makes sense to put them in
the same module.
* Earlier, ParallelT was unaffected by the `maxBuffer` directive; now `maxBuffer`
can limit the buffer of a ParallelT stream as well. When the buffer becomes
full, the producer threads block.
* ParallelT streams no longer have an unlimited buffer by default. Now the
buffer for parallel streams is limited to 1500 by default, the same as other
concurrent stream types.
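The blocking behavior of a limited buffer can be sketched with a channel guarded by a semaphore. This is only an illustration using `base`; `BoundedBuffer`, `push`, `pop` and `demo` are hypothetical names and this is not how the library implements `maxBuffer`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Monad (forM_, replicateM)

-- | A channel plus a semaphore counting the free slots in the buffer.
data BoundedBuffer a = BoundedBuffer (Chan a) QSem

newBoundedBuffer :: Int -> IO (BoundedBuffer a)
newBoundedBuffer limit = BoundedBuffer <$> newChan <*> newQSem limit

-- | Blocks the producer when no free slot is left.
push :: BoundedBuffer a -> a -> IO ()
push (BoundedBuffer chan slots) x = waitQSem slots >> writeChan chan x

-- | Takes an element and frees one slot, unblocking a waiting producer.
pop :: BoundedBuffer a -> IO a
pop (BoundedBuffer chan slots) = do
    x <- readChan chan
    signalQSem slots
    pure x

-- | One producer pushes ten elements through a buffer of size two.
demo :: IO [Int]
demo = do
    buf <- newBoundedBuffer 2
    _ <- forkIO (forM_ [1 .. 10] (push buf))
    replicateM 10 (pop buf)
```

The producer in `demo` can run at most two elements ahead of the consumer; with the default described above the limit would be 1500 instead of 2.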
The implementation of fromStreamDArraysOf is now 3x more efficient compared to
the earlier implementation. This makes byte stream level operations almost as
efficient as array level operations.
Other than this, the following changes are included in this commit:
* Add insertAfterEach
* Add writeArraysPackedUpto to Handle IO
* Implement `wc -l` example more efficiently using arrays
* Add benchmark for lines/unlines using arrays
* Add tests for splitArraysOn
* Rename some array/file/handle APIs
* Error handling when the group size in grouping operations is 0 or negative
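A minimal sketch of validating the group size, using plain lists. The name `chunksOfChecked` and the choice of reporting the problem through `Either` are assumptions for illustration; the commit does not say how the library reports the error.

```haskell
-- | Split a list into groups of n elements, rejecting a group size that
-- is zero or negative instead of looping forever.
chunksOfChecked :: Int -> [a] -> Either String [[a]]
chunksOfChecked n xs
    | n <= 0 =
        Left ("chunksOfChecked: group size must be positive, got " ++ show n)
    | otherwise = Right (go xs)
  where
    go [] = []
    go ys =
        let (chunk, rest) = splitAt n ys
         in chunk : go rest
```

Without the guard, a size of 0 would make `splitAt` return an empty chunk and the unchanged input, so the recursion would never terminate.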
Unsafe use of unsafeInlineIO caused each array allocated in the toArrayN fold
to share the same memory.
This fix uses the IO monad to make sure that the code is not pure and therefore
we always allocate new memory. All such usages of unsafePerformIO have been
fixed. The remaining ones have been reviewed and found to be safe.
After perf measurement these seem to perform the same as a scan followed by a
map; therefore we have not exposed them, but have kept them for perf comparison
and in case they are needed later.
* Deprecate `scanx`, `foldx`, `foldxM`
* Remove deprecated APIs `scan`, `foldl`, `foldlM`
* Fix the signature of foldrM
* Implement some custom folds in terms of foldr
* Document folds and scans better
* Reorganize the documentation in Prelude
* Add foldrS and foldrT for transforming folds
* Add toRevList
* Add benchmarks and tests for the new folds
APIs
----
Removed:
merge
lazy left scans: scanl, scanlM, scanl1, scanl1M
Renamed:
generate and generateM renamed to fromIndices and fromIndicesM
Added:
replicate
mergeByM, mergeAsyncBy, mergeAsyncByM
`intFrom`, `intFromTo`, `intFromThen`, `intFromThenTo`,
`intFromStep`, `fracFrom`, `fracFromThen`, `fracFromThenTo`,
`numFromStep`
Added StreamD version of replicateM and a rewrite rule for replicateMSerial.
Added but not exposed:
postscanl and prescanl ops
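For readers unfamiliar with these scan variants, here is a sketch of the usual meaning of the names, written on lists. It assumes that postscanl omits the initial accumulator value and prescanl omits the final one; `postscanlList` and `prescanlList` are hypothetical list analogues, not the stream implementations.

```haskell
-- | Like scanl, but does not emit the initial accumulator value.
postscanlList :: (b -> a -> b) -> b -> [a] -> [b]
postscanlList f z = drop 1 . scanl f z

-- | Like scanl, but does not emit the final accumulator value.
-- 'init' is safe here because scanl always returns a non-empty list.
prescanlList :: (b -> a -> b) -> b -> [a] -> [b]
prescanlList f z = init . scanl f z
```

For example, where `scanl (+) 0 [1, 2, 3]` gives `[0, 1, 3, 6]`, the post variant gives `[1, 3, 6]` and the pre variant gives `[0, 1, 3]`.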
Rewrote mergeByS in StreamK, simplified quite a bit and got some perf
improvement too.
Added @since notations to new APIs.
Fixed lines exceeding 80 columns.
Tests
-----
Added tests for the new enumeration APIs.
Improved some tests by generating values randomly using quickcheck forAll. We
can improve more tests similarly.
Removed some redundant transformOps tests.
Reorganized test code in groups so as to keep similar functionality together
and added header lines so that we can find relevant code easily.
Benchmarks
----------
Added benchmarks for enumeration primitives added above. Added benchmarks for
scan and fold mixed ops. Added benchmark for concatMap. Fixed foldr and foldrM
benchmarks to use a (+) operation instead of a list operation for fair
comparison with other folds.
Kept only one benchmark each for deleteBy, insertBy, isPrefixOf and
isSubsequenceOf.
Documentation
-------------
Updated documentation, added examples for the new primitives as well as many
old ones. Especially the documentation of folds and scans was rewritten.
Reordered and re-organized the groups of APIs in the doc.
Refactoring
-----------
Some related and unrelated refactoring.
Hlint
-----
Fixed some hlint hints introduced recently.
TBD
---
Some APIs need concurrent versions. I have added "XXX" notes for those.
Some more tests have to be added.
Some more benchmarks have to be added.
1) Add foldr1, findIndices, lookup, find, sequence functions
in StreamD, earlier they were present only in StreamK.
2) Change the above functions to use S. in Prelude instead
of K.
3) Add tests for (!!), insertBy, splitAt, the.
1) Define the instances manually instead of deriving. Deriving picks up
incorrect implementations of methods that have been redefined in the newtypes.
2) Fix the Traversable instance.
3) Define the instances only for Serial stream types. Other streams cannot be
instantiated for Foldable monads.
4) Add test cases
5) Add benchmarks
This causes up to 30% regression in async stream generation benchmarks and up
to 200% regression in async nested benchmarks, mostly due to an additional
function call that cannot be inlined.
The main change is a single line change in StreamK.hs in foldxM routine.
Major changes in this commit are due to:
1) Added strictness tests for all foldl and scanl routines
2) Refactoring to enable independent benchmarking for StreamK, to measure the
impact of the change.
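A strictness test of the kind mentioned above can be sketched on list folds from `base`. The helper names (`throwsWhenForced`, `lastArg`, `strictFoldThrows`, `lazyFoldThrows`) are hypothetical, and the actual tests in the commit target the stream folds and scans rather than lists.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (foldl')

-- | True if evaluating the value to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
    r <- try (evaluate x)
    pure $ case r of
        Left e -> const True (e :: SomeException)
        Right _ -> False

-- | A step function that ignores the accumulator.
lastArg :: Int -> Int -> Int
lastArg _ x = x

-- | A strict left fold forces the intermediate accumulator, which is
-- 'undefined' after the first step, so evaluation throws.
strictFoldThrows :: IO Bool
strictFoldThrows = throwsWhenForced (foldl' lastArg 0 [undefined, 1])

-- | A lazy left fold never inspects the intermediate accumulator.
lazyFoldThrows :: IO Bool
lazyFoldThrows = throwsWhenForced (foldl lastArg 0 [undefined, 1])
```

The same input therefore distinguishes the two behaviors: a fold that claims to be strict must throw on it, and a lazy one must not.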
maxYields is used to limit the concurrent executions of a stream when
it is immediately followed by a "take" limiting the size of the stream.
Also fix the maxBuffer implementation of aheadly.
Also exposed the "filterM" API.
Some benchmarks are affected with this. The most affected benchmarks are:
StreamK ops:
Before:
serially/generation/foldMapWith mean 1.940 ms
serially/generation/foldMapWithM mean 3.891 ms
After:
serially/generation/foldMapWith mean 2.874 ms
serially/generation/foldMapWithM mean 5.003 ms
StreamD ops:
zip/zipM are affected
- monadic stream generation functions are now concurrent
- monadic stream transformation (mapM, sequence) functions are now concurrent
- fixed a race which caused "blocked indefinitely on MVar" in rare cases