# Why Haskell Matters
This is an early draft of a work-in-progress article.
## Motivation
Haskell was started as an academic research language in the late 1980s.
It was never one of the most popular languages in the software industry.
So why should we concern ourselves with it?
Instead of answering this question directly, I'd like to first take a closer look at the reception of
Haskell in the software development community:
### A strange development over time
In a 2017 talk on [the Haskell journey](https://www.youtube.com/watch?v=re96UgMk6GQ)
since its beginnings in the 1980s, Simon Peyton Jones speaks about the
rather unusual life story of Haskell.
First he shows a chart representing the typical life cycle of research languages. They are often created by
a single researcher (who is also the single user) and most of them will be abandoned
after just a few years:
![most research languages](img/language-1.png)
A more successful research language gains some interest in a larger community
but will still not escape the ivory tower and typically will die within ten years:
![successful research languages](img/language-2.png)
On the other hand we have popular programming languages that are quickly adopted by
large numbers of users and thus reach "the threshold of immortality".
That is, the base of existing code grows so large that the language will
be in use for decades:
![popular languages](img/language-3.png)
In the next chart he rather jokingly depicts the sad fate of languages designed by committees.
They simply never take off:
![committee languages](img/language-4.png)
Finally he presents a chart showing the Haskell timeline:
![the Haskell timeline](img/language-5.png)
The development shown in this chart is rather unexpected and unusual:
Haskell started as a research language and was even
designed by a committee;
so in all probability it should have been abandoned long before the millennium!
But instead it gained some momentum in the early years followed by a rather quiet phase during
the decade of OO hype (Java was released in 1995).
And then again we see a continuous growth of interest since about 2005.
I'm writing this in early 2020 and we still see this trend.
### Being used versus being discussed
Then Simon Peyton Jones points out another interesting characteristic of the reception of Haskell
in recent years.
In statistics that rank programming languages by actual usage, Haskell is typically not among the 30 most used languages.
But in statistics that instead rank programming languages by the volume of discussions on the internet
Haskell typically scores much better (often in the top ten).
## So why does Haskell remain such a hot topic in the software development community?
A very short answer might be:
Haskell has a number of features that are clearly different from those of most other programming languages.
Many of these features have proven to be powerful tools to solve basic problems of software development elegantly.
Therefore, over time, other programming languages have adopted parts of these concepts (e.g. pattern matching or type classes).
In discussions about such concepts their Haskell origin is mentioned,
and the differences between the Haskell concepts and those of other languages are discussed.
Sometimes people are encouraged to have a closer look at the source of these concepts to get a deeper understanding of
their original intentions. That's why we see a growing number of developers working in
Python, TypeScript, Scala, Rust, C++, C#, or Java starting to dive into Haskell.
A further essential point is that Haskell is still an experimental laboratory for research in areas such as
compiler construction, programming language design, theorem provers, type systems, etc.
So inevitably Haskell will come up in discussions about these approaches.
In the next section we want to study some of the most distinguishing features of Haskell.
# So what exactly are those magic powers of Haskell?
> Haskell doesn't solve different problems than other languages.
> But it solves them differently.
>
> -- unknown

In this section we will examine the most outstanding features of the Haskell language.
I'll try to keep the learning curve moderate and so I'll start with some very basic concepts.
Even though I'll try to keep the presentation self-contained, it's not intended to be an introduction to the Haskell language
(have a look at [Learn You a Haskell](http://www.learnyouahaskell.com/) if you are looking for an enjoyable tutorial).
## Functions are First-class
> In computer science, a programming language is said to have first-class functions if it treats functions as
> first-class citizens. This means the language supports **passing functions as arguments to other functions**,
> **returning them as the values from other functions**, and **assigning them to variables or storing them in data
> structures.**[1] Some programming language theorists require **support for anonymous functions** (function literals)
> as well.[2] In languages with first-class functions, the names of functions do not have any special status;
> they are treated like ordinary variables with a function type.
>
> quoted from [Wikipedia](https://en.wikipedia.org/wiki/First-class_function)

We'll go through this one by one:
### Functions can be assigned to variables exactly as any other values
Let's have a look at how this is done in Haskell. First we define some simple values:
```haskell
-- define constant `aNumber` with a value of 42.
aNumber :: Integer
aNumber = 42
-- define constant `aString` with a value of "hello world"
aString :: String
aString = "Hello World"
```
In the first line we see a type signature that defines the constant `aNumber` to be of type `Integer`.
In the second line we define the value of `aNumber` to be `42`.
In the same way we define the constant `aString` to be of type `String`.
Next we define a function `square` that takes an integer argument and returns the square value of the argument:
```Haskell
square :: Integer -> Integer
square x = x * x
```
Defining a function works in exactly the same way as defining any other value.
The only thing special is that we declare the type to be a **function type** by using the `->` notation.
So `:: Integer -> Integer` represents a function from `Integer` to `Integer`.
In the second line we define function `square` to compute `x * x` for any `Integer` argument `x`.
Ok, seems not too difficult, so let's define another function `double` that doubles its input value:
```haskell
double :: Integer -> Integer
double n = 2 * n
```
### Support for anonymous functions
Anonymous functions, also known as lambda expressions, can be defined in Haskell like this:
```Haskell
\x -> x * x
```
This expression denotes an anonymous function that takes a single argument `x` and returns the square of that argument.
The backslash is read as λ (the Greek letter lambda).
You can use such expressions anywhere you would use any other function. For example you could apply the
anonymous function `\x -> x * x` to a number just like the named function `square`:
```haskell
-- use named function:
result = square 5
-- use anonymous function:
result' = (\x -> x * x) 5
```
We will see more useful applications of anonymous functions in the following section.
### Functions can be returned as values from other functions
#### function composition
Do you remember *function composition* from your high-school math classes?
Function composition is an operation that takes two functions `f` and `g` and produces a function `h` such that
`h(x) = g(f(x))`.
The resulting composite function is denoted `h = g ∘ f` where `(g ∘ f)(x) = g(f(x))`.
Intuitively, composing functions is a chaining process in which the output of function `f` is used as input of function `g`.
So from a programmer's perspective the `∘` operator is a function that
takes two functions as arguments and returns a new composite function.
In Haskell this operator is represented as the dot operator `.`:
```haskell
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)
```
The brackets around the dot are required as we want to use a non-alphabetical symbol as an identifier.
In Haskell such identifiers can be used as infix operators (as we will see below).
Apart from that, `(.)` is defined just like any other function.
Please also note how close the syntax is to the original mathematical definition.
Using this operator we can easily create a composite function that first doubles
a number and then computes the square of that doubled number:
```haskell
squareAfterDouble :: Integer -> Integer
squareAfterDouble = square . double
```
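To see the composed function in action, here is a small usage sketch (the constant name `usageExample` is made up for this illustration):
```haskell
-- composing first doubles the argument and then squares the result:
-- squareAfterDouble 3 == square (double 3) == square 6 == 36
usageExample :: Integer
usageExample = squareAfterDouble 3   -- yields 36
```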
#### Currying and Partial Application
In this section we look at another interesting example of functions producing
other functions as return values.
We start by defining a function `add` that takes two `Integer` arguments and computes their sum:
```haskell
-- function adding two numbers
add :: Integer -> Integer -> Integer
add x y = x + y
```
This looks quite straightforward. But there is still one interesting detail to note:
the type signature of `add` is not something like
```haskell
add :: (Integer, Integer) -> Integer
```
Instead it is:
```haskell
add :: Integer -> Integer -> Integer
```
What does this signature actually mean?
It can be read as "A function taking an Integer argument and returning a function of type `Integer -> Integer`".
Sounds weird? But that's exactly what Haskell does internally.
So if we call `add 2 3`, first `add` is applied to `2`, which returns a new function of type `Integer -> Integer`, which is then applied to `3` (that is, `add 2 3` is equivalent to `(add 2) 3`).
This technique is called [**Currying**](https://wiki.haskell.org/Currying).
Currying is widely used in Haskell as it allows another cool thing: **partial application**.
In the next code snippet we define a function `add5` by partially applying the function `add` to only one argument:
```haskell
-- partial application: applying add to 5 returns a function of type Integer -> Integer
add5 :: Integer -> Integer
add5 = add 5
```
The trick is as follows: `add 5` returns a function of type `Integer -> Integer` which will add `5` to any Integer argument.
Partial application thus allows us to write functions that return functions as result values.
This technique is frequently used to
[provide functions with configuration data](https://github.com/thma/LtuPatternFactory#dependency-injection--parameter-binding-partial-application).
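As a small illustration of that idea (the `greet` function and its arguments are invented for this sketch and not taken from the linked article), the "configuration" can be passed as the first argument and then fixed by partial application:
```haskell
-- the first argument acts as configuration data
greet :: String -> String -> String
greet greeting name = greeting ++ ", " ++ name ++ "!"

-- partially applying the configuration yields specialized functions
greetEnglish :: String -> String
greetEnglish = greet "Hello"

greetFrench :: String -> String
greetFrench = greet "Bonjour"

-- greetEnglish "Haskell" evaluates to "Hello, Haskell!"
```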
### Functions can be passed as arguments to other functions
I could keep this section short by telling you that we have already seen an example of this:
the function composition operator `(.)`.
It **accepts two functions as arguments** and returns a new one as in:
```haskell
squareAfterDouble :: Integer -> Integer
squareAfterDouble = square . double
```
But I have another instructive example at hand.
Let's imagine we have to implement a function that doubles any odd Integer:
```haskell
ifOddDouble :: Integer -> Integer
ifOddDouble n =
  if odd n
    then double n
    else n
```
The Haskell code is straightforward: the new ingredients are the `if ... then ... else ...` expression and the
predicate `odd` from the Haskell standard library,
which returns `True` if an integral number is odd.
Now let's assume that we also need another function that computes the square of any odd number:
```haskell
ifOddSquare :: Integer -> Integer
ifOddSquare n =
  if odd n
    then square n
    else n
```
As vigilant developers we immediately detect a violation of the
[Don't repeat yourself principle](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself), as
both functions only vary in the growth function they use: `double` versus `square`.
So we are looking for a way to refactor this code into a solution that keeps the original
structure but allows us to vary the growth function that is applied.
What we need is a function that takes a growth function (of type `(Integer -> Integer)`)
as first argument, an `Integer` as second argument
and returns an `Integer`. The specified growth function will be applied in the `then` clause:
```haskell
ifOdd :: (Integer -> Integer) -> Integer -> Integer
ifOdd growthFunction n =
  if odd n
    then growthFunction n
    else n
```
With this approach we can refactor `ifOddDouble` and `ifOddSquare` as follows:
```haskell
ifOddDouble :: Integer -> Integer
ifOddDouble n = ifOdd double n

ifOddSquare :: Integer -> Integer
ifOddSquare n = ifOdd square n
```
Now imagine that we have to implement two new functions `ifEvenDouble` and `ifEvenSquare` that
will work only on even numbers (for these we can use the standard library predicate `even`).
Instead of repeating ourselves we come up with a function
`ifPredGrow` that takes a predicate function of type `(Integer -> Bool)` as its first argument,
a growth function of type `(Integer -> Integer)` as its second argument and an `Integer` as its third argument,
returning an `Integer`.
The predicate function will be used to determine whether the growth function has to be applied:
```haskell
ifPredGrow :: (Integer -> Bool) -> (Integer -> Integer) -> Integer -> Integer
ifPredGrow predicate growthFunction n =
  if predicate n
    then growthFunction n
    else n
```
Using this [higher order function](https://en.wikipedia.org/wiki/Higher-order_function)
that even takes two functions as arguments we can write the two new functions and
further refactor the existing ones without breaking the DRY principle:
```haskell
ifEvenDouble :: Integer -> Integer
ifEvenDouble n = ifPredGrow even double n

ifEvenSquare :: Integer -> Integer
ifEvenSquare n = ifPredGrow even square n

ifOddDouble'' :: Integer -> Integer
ifOddDouble'' n = ifPredGrow odd double n

ifOddSquare'' :: Integer -> Integer
ifOddSquare'' n = ifPredGrow odd square n
```
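To get a feeling for these higher-order functions, here are a few sample applications with their expected results (the constant names are made up for this illustration):
```haskell
sampleEvenDouble :: Integer
sampleEvenDouble = ifEvenDouble 4    -- 4 is even, so it is doubled: 8

sampleEvenSquare :: Integer
sampleEvenSquare = ifEvenSquare 5    -- 5 is odd, so it is returned unchanged: 5

sampleOddSquare :: Integer
sampleOddSquare = ifOddSquare'' 3    -- 3 is odd, so it is squared: 9
```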
## Pattern matching (part 1)
With the things that we have learnt so far, we can now start to implement some more interesting functions.
So what about implementing the recursive [factorial function](https://en.wikipedia.org/wiki/Factorial)?
The factorial function can be defined as follows:
```
0! = 1
n! = n * (n-1)!
for all n ∈ ℕ₀
```
With our current knowledge of Haskell we can implement this as follows:
```haskell
import Numeric.Natural (Natural)

factorial :: Natural -> Natural -- Natural is a data type representing all non-negative integers
factorial n =
  if n == 0
    then 1
    else n * factorial (n - 1)
```
We are using the data type `Natural` to denote the set of non-negative integers.
This works, but using pattern matching we can define the same function in a way that is much closer to the original mathematical definition:
```haskell
-- definition of factorial using pattern matching
fac :: Natural -> Natural
fac 0 = 1
fac n = n * fac (n - 1)
```
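To see the pattern-matching definition at work, here is how an evaluation unfolds step by step (the constant name is made up for this illustration):
```haskell
facOfThree :: Natural
facOfThree = fac 3
-- fac 3
--   = 3 * fac 2
--   = 3 * (2 * fac 1)
--   = 3 * (2 * (1 * fac 0))
--   = 3 * (2 * (1 * 1))
--   = 6
```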
## Dealing with Lists
Working with lists or other kinds of collections is a typical task in many of the problem domains that software developers
have to deal with.
Support for lists is provided by the Haskell base library and there is also some syntactic sugar built into the
language that makes working with lists quite a pleasant experience.
Let's start by defining a list containing some Integer numbers:
```haskell
someNumbers :: [Integer]
someNumbers = [49,64,97,54,19,90,934,22,215,6,68,325,720,8082,1,33,31]
```
The type signature in the first line declares `someNumbers` as a list of Integers. The brackets `[` and `]` around the type `Integer`
denote the list type.
In the second line we define the actual list value. Again the square brackets are used to form the list.
The bracket notation is syntactic sugar for the actual list construction based on the empty list `[]` and the
*cons* operator `(:)` (which is an infix operator like `(.)` from the previous section).
For example, `[1,2,3]` is syntactic sugar for `1 : 2 : 3 : []`.
The cons operator `(:)` prepends an element to an existing list.
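To make the desugaring concrete, the following two definitions (names chosen just for this illustration) denote exactly the same list:
```haskell
sugared :: [Integer]
sugared = [1, 2, 3]

desugared :: [Integer]
desugared = 1 : 2 : 3 : []   -- (:) prepends an element to a list, [] is the empty list
```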
There is a nice feature called *arithmetic sequences* which allows you to create sequences of numbers quite easily:
```haskell
upToHundred :: [Integer]
upToHundred = [1..100]
oddsUpToHundred :: [Integer]
oddsUpToHundred = [1,3..100]
```
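A few smaller sequences (again, the names are chosen just for this illustration) show how the notation expands; the second element of a sequence like `[1,3..100]` determines the step size:
```haskell
oneToFive :: [Integer]
oneToFive = [1..5]        -- [1,2,3,4,5]

evensToTen :: [Integer]
evensToTen = [2,4..10]    -- [2,4,6,8,10]

countdown :: [Integer]
countdown = [5,4..1]      -- [5,4,3,2,1]
```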
---
---
This is my scrap book (don't look at it)
- Functions are first-class citizens (higher-order functions: functions can create new functions and take other functions as arguments)
- Abstraction over resource management and execution (=> declarative)
- Immutability ("Variables do not Vary")
- Side effects have to be made explicit in function signatures.
  That is, if no side effects are declared, the compiler prevents any from occurring!
  This makes side-effect-free programming possible ("purity")
- Evaluation in Haskell is "non-strict" (aka "lazy"). This allows, for example, countably infinite sets (e.g. all prime numbers) to be described very elegantly.
  It also lets you build your own control structures (great for DSLs)
- Static and strong typing (there is no casting)
- Type inference. The compiler can determine the type signatures of functions on its own. (An explicit signature is still possible and often very helpful as documentation and for gaining clarity about the code.)
- Polymorphism (e.g. for "operator overloading", generic container data types, etc., based on "type classes")
- Algebraic data types (sum types + product types). ADTs help to avoid typical errors known from OO polymorphism. They make it possible to have the compiler generate code for many operations on data types completely automatically.
- Pattern matching allows very clear handling of ADTs
- Elegance: many algorithms can be expressed very compactly and close to the problem domain.
- Data encapsulation through modules
- Fewer bugs thanks to
  - purity, no side effects
  - strong typing, no NPEs!
  - high abstraction; programs often read like a declarative specification of the algorithm
  - very good testability through composability
- das "ports & adapters" Beispiel: https://github.com/thma/RestaurantReservation
- TDD / DDD
- Memory management (very fast GC)
- Modular programs. There is a very simple but effective module system and a large number of curated libraries.
  ("In 5 years of Haskell I have not had to debug a single time")
- Performance: no VM, but highly optimized machine code. With a little fine-tuning, speeds comparable to hand-optimized C code can often be achieved.
## toc for code chapters
- Values
- Functions
- Lists
- Laziness
- List comprehensions
- Custom control structures
- Algebraic data types
  - sum types: traffic light states
  - product types (int, int)
    example: a tree with nodes of (int, traffic light state), then map over the traffic light states
  - deriving (Show, Read) for simple serialization
  - Homoiconicity (kind of)
- The Maybe data type
  - total functions
  - chaining of Maybe operations
    (to avoid the "dreadful staircase")
    => Monoidal Operations
- explicit side effects -> the IO monad
- Type classes
  - Polymorphism
    e.g. Num a, Eq a
  - Show, Read => homoiconicity for serialization
  - Automatic deriving
    (Functor, with the tree example)
- Testability
- TDD, assembling higher-order functions, type class dispatch (https://jproyo.github.io/posts/2019-03-17-tagless-final-haskell.html)