
Streamly: Idiomatic Haskell with the Performance of C


Streamly is a Haskell library that provides building blocks for writing safe, scalable, modular, and high-performance software. Streamly offers:

  • The type safety of Haskell.
  • The performance of C programs.
  • Powerful building blocks for modular code.
  • Idiomatic functional programming.
  • Declarative, fearless concurrency.
  • Ecosystem libraries for quick development.

Please read the Streamly Setup and Usage Guide and the Streamly Quick Overview to get a taste of the library. Streamly comes with comprehensive documentation; please visit the Haskell Streamly website to browse it.

Performance with Modularity

Usually you have to pick one of the two: performance or modularity. With Haskell Streamly you can write highly modular code and still achieve performance close to that of an equivalent (imperative) C program. Streamly aggressively exploits the GHC optimizations that enable stream fusion (case-of-case and spec-constr) to deliver both modularity and performance at the same time.

Streamly offers excellent performance even for byte-at-a-time stream operations using efficient abstractions like Unfolds and Folds. Byte-at-a-time stream operations can simplify programming because the developer does not have to deal explicitly with chunking and re-combining data.
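
For illustration, here is a minimal sketch of such a modular pipeline, assuming the Streamly.Data.Stream and Streamly.Data.Fold modules of recent streamly-core releases; each stage is a small reusable combinator, yet the composition fuses into a single loop:

```haskell
import Data.Function ((&))

import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream as Stream

-- Sum the squares of the even numbers up to a million, written as a
-- pipeline of independent stages. With stream fusion the composition
-- compiles down to one tight loop with no intermediate structures.
main :: IO ()
main =
    Stream.enumerateFromTo (1 :: Int) 1000000
        & Stream.filter even
        & fmap (\x -> x * x)
        & Stream.fold Fold.sum
        >>= print
```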

GHC Plugin for Stream Fusion

Streamly usually performs very well without any compiler plugins. However, we have fixed some deficiencies in GHC's optimizer using a compiler plugin. We hope to fold these optimizations into GHC in the future; until then we recommend using this plugin for performance-sensitive applications.
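
As a sketch, assuming the fusion-plugin package is listed in your project's dependencies, the plugin can be enabled per module with a GHC pragma (or project-wide via ghc-options):

```haskell
{-# OPTIONS_GHC -fplugin=Fusion.Plugin #-}

-- Fusion.Plugin is provided by the fusion-plugin package; it helps GHC
-- complete stream fusion in cases the stock optimizer misses.
module Main (main) where

import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream as Stream

main :: IO ()
main =
    Stream.fold Fold.sum (Stream.enumerateFromTo (1 :: Int) 1000000)
        >>= print
```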

Performance Benchmarks

We measured several Haskell streaming implementations using various micro-benchmarks. Please see the streaming benchmarks page for a detailed comparison of Streamly against other streaming libraries.

Our results show that Streamly is the fastest effectful streaming implementation on almost all the measured microbenchmarks. In many cases it runs up to 100x faster, and in some cases even 1000x faster than some of the tested alternatives. In some composite operation benchmarks Streamly turns out to be significantly faster than Haskell's list implementation.

Note: If you can write a program in some other way or with some other language that runs significantly faster than what Streamly offers, please let us know and we will improve.

Applications

Streamly comes equipped with a very powerful set of abstractions to accomplish many kinds of programming tasks: it provides support for programming with streams and arrays, for reading from and writing to the file system and the network, for time-domain (reactive) programming, and for reacting to file system events using fsnotify.

Please view Streamly's documentation for more information about Streamly's features.

Design

Design Goals

Our goals for Streamly from the very beginning have been:

  1. To achieve simplicity by unifying abstractions.
  2. To offer high performance.

These goals are hard to achieve simultaneously because they are usually inversely related. We have spent many years trying to get the abstractions right without compromising performance.

Unfold is an example of an abstraction that we created to achieve high performance when mapping streams on streams; it allows stream generation to be optimized well by the compiler through stream fusion. A Fold with termination capability is another example: it modularizes stream elimination operations without breaking stream fusion. Terminating folds can perform many simple parsing tasks that do not require backtracking. In Streamly, Parsers are a natural extension of terminating Folds; Parsers add backtracking capability to Folds. Unification leads to simpler abstractions and lower cognitive overhead without compromising performance.
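
As an illustration of a terminating fold performing a simple parse, here is a sketch assuming Fold.takeEndBy_ and Fold.toList from Streamly.Data.Fold:

```haskell
import Data.Char (isDigit)

import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream as Stream

-- A non-backtracking parse with a terminating fold: collect the leading
-- run of digits and stop at the first non-digit without consuming the
-- rest of the stream.
main :: IO ()
main = do
    digits <-
        Stream.fold
            (Fold.takeEndBy_ (not . isDigit) Fold.toList)
            (Stream.fromList "12345 and the rest")
    putStrLn digits -- prints "12345"
```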

Concurrency Design

Streamly uses lock-free synchronization to achieve concurrency with low overhead. The number of tasks performed concurrently is determined automatically based on the rate at which the consumer is consuming the results. In other words, you do not need to manage thread pools or decide how many threads to use for a particular task. For CPU-bound tasks Streamly tries to keep the number of threads close to the number of available CPUs; for IO-bound tasks it will use more threads.

The parallelism available during program execution can be exploited with very little overhead even when the task size is very small, because Streamly automatically switches between serial and batched execution of tasks on the same CPU, whichever is more efficient. Please see our concurrency benchmarks for more detailed performance measurements and for a comparison with the async package.
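
The sketch below, assuming the Streamly.Data.Stream.Prelude module from streamly 0.9 or later, runs an IO-bound action on the elements of a stream concurrently; the number of worker threads is regulated automatically by the consumption rate and can be bounded with maxThreads:

```haskell
import Control.Concurrent (threadDelay)

import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream.Prelude as Stream

main :: IO ()
main = do
    results <-
        Stream.fold Fold.toList
            $ Stream.parMapM (Stream.maxThreads 16) fetch
            $ Stream.fromList [1 .. 100 :: Int]
    print (sum results)
  where
    -- A stand-in for a network request or any other IO-bound task.
    fetch :: Int -> IO Int
    fetch i = threadDelay 10000 >> return (i * 2)
```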

Credits

Many authors and libraries have influenced or inspired this library in a significant way.

Please see the credits directory for a full list of contributors, credits and licenses.

Licensing

Streamly is an open source project available under a liberal BSD-3-Clause license.

Contributing to Streamly

As an open project, we welcome contributions.

Getting Support

Professional support is available for Streamly: please contact support@composewell.com.

You can also join our community chat channel on Gitter.