write-you-a-haskell/009_datatypes.md

377 lines
7.0 KiB
Markdown
Raw Permalink Normal View History

Squashed commit of the following: commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 21:02:57 2015 -0500 type provenance commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 20:13:06 2015 -0500 provenance prototype commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 19:35:08 2015 -0500 fix misc typos commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:15:58 2015 -0500 license badge commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:07:28 2015 -0500 fix resolution for llvm cfg graphs commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:12:39 2015 -0500 added codegen dsl stub commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:01:14 2015 -0500 llvm cfg graphs commit a199d721503985954060e7670c1d2f5e1a65dd11 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 10:56:54 2015 -0500 source code font commit c7db0c5d67b73d8633f08be093971877e2d6ede0 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 09:59:37 2015 -0500 change phrasing around recursion commit 6903700db482524233262e722df54b1066218250 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 18:20:06 2015 -0500 contributors.md commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 17:35:41 2015 -0500 added llvm logo commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 15:50:28 2015 -0500 initial llvm chapter commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 12:21:00 2015 -0500 system-f typing rules commit 2a7d5c7f137cf352eeae64836df634c98118f594 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Thu Jan 15 15:21:14 2015 -0500 flesh out system-f commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 22:22:14 2015 -0500 started on extended parser commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 17:25:02 2015 -0500 creative commons license commit f09d210be253a05fc8ad0827cd72ffa32404e2ba Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 16:54:10 2015 -0500 higher res images commit 8555eadfea8843f5683621e6652857e4259fa896 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 14:48:44 2015 -0500 cover page commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Tue Jan 13 17:31:01 2015 -0500 initial happy/alex parser
2015-01-19 05:04:01 +03:00
<p class="halfbreak">
</p>
Datatypes
=========
2015-11-21 17:52:39 +03:00
Algebraic data types
--------------------
**Algebraic datatypes** are a family of constructions arising out of two
2015-12-12 20:46:28 +03:00
operations, *products* (``a * b``) and *sums* (``a + b``) (sometimes also called
*coproducts*). A product encodes multiple arguments to constructors and sums
encode choice between constructors.
2015-11-21 17:52:39 +03:00
```haskell
2015-12-12 20:46:28 +03:00
{-# LANGUAGE TypeOperators #-}
2015-11-21 17:52:39 +03:00
data Unit = Unit -- 1
data Empty -- 0
data (a * b) = Product a b -- a * b
data (a + b) = Inl a | Inr b -- a + b
data Exp a b = Exp (a -> b) -- a^b
data Rec f = Rec (f (Rec f)) -- \mu
```
2015-12-12 20:46:28 +03:00
The two constructors ``Inl`` and ``Inr`` are the left and right *injections* for
the sum. These allows us to construct sums.
```haskell
Inl :: a -> a + b
Inr :: b -> a + b
```
Likewise for the product there are two function ``fst`` and ``snd`` which are
*projections* which de construct products.
```haskell
2015-12-12 21:42:06 +03:00
fst :: a * b -> a
snd :: a * b -> b
2015-12-12 20:46:28 +03:00
```
Once a language is endowed with the capacity to write a single product or a
single sum, all higher order products can written in terms of sums of products.
For example a 3-tuple can be written in terms of the composite of two 2-tuples.
And indeed any n-tuple or record type can be written in terms of compositions of
products.
```haskell
type Prod3 a b c = a*(b*c)
data Prod3' a b c
= Prod3 a b c
prod3 :: Prod3 Int Int Int
prod3 = Product 1 (Product 2 3)
```
Or a sum type of three options can be written in terms of two sums:
```haskell
type Sum3 a b c = (a+b)+c
data Sum3' a b c
= Opt1 a
| Opt2 b
| Opt3 c
sum3 :: Sum3 Int Int Int
sum3 = Inl (Inl 2)
```
```haskell
data Option a = None | Some a
```
```haskell
type Option' a = Unit + a
some :: Unit + a
some = Inl Unit
none :: a -> Unit + a
none a = Inr a
```
In Haskell the convention for the sum and product notation is as follows:
2015-12-31 10:28:43 +03:00
Notation Haskell Type
--------- ------------
``a * b`` ``(a,b)``
``a + b`` ``Either a b``
``Inl`` ``Left``
``Inr`` ``Right``
``Empty`` ``Void``
``Unit`` ``()``
2015-12-12 20:46:28 +03:00
2016-01-11 05:25:18 +03:00
Isorecursive Types
-------------------
$$
\mathtt{Nat} = \mu \alpha. 1 + \alpha
$$
2015-12-12 20:46:28 +03:00
```haskell
roll :: Rec f -> f (Rec f)
roll (Rec f) = f
unroll :: f (Rec f) -> Rec f
unroll f = Rec f
```
Peano numbers:
```haskell
type Nat = Rec NatF
data NatF s = Zero | Succ s
zero :: Nat
zero = Rec Zero
succ :: Nat -> Nat
succ x = Rec (Succ x)
```
Lists:
```haskell
type List a = Rec (ListF a)
data ListF a b = Nil | Cons a b
nil :: List a
nil = Rec Nil
cons :: a -> List a -> List a
cons x y = Rec (Cons x y)
```
Memory Layout
-------------
Just as the type-level representation
![](img/memory_layout.png)
```cpp
typedef union {
int a;
float b;
} Sum;
typedef struct {
int a;
float b;
} Prod;
```
```cpp
int main()
{
Prod x = { .a = 1, .b = 2.0 };
Sum sum1 = { .a = 1 };
2015-12-13 21:58:00 +03:00
Sum sum2 = { .b = 2.0 };
2015-12-12 20:46:28 +03:00
}
```
```cpp
#include <stddef.h>
typedef struct T
{
enum { NONE, SOME } tag;
union
{
void *none;
int some;
} value;
} Option;
```
```cpp
int main()
{
Option a = { .tag = NONE, .value = { .none = NULL } };
Option b = { .tag = SOME, .value = { .some = 3 } };
}
```
2015-12-14 02:19:34 +03:00
* [Algebraic Datatypes in C (Part 1)](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter10/adt.c)
* [Algebraic Datatypes in C (Part 2)](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter10/adt2.c)
Pattern Scrutiny
----------------
2015-12-12 20:46:28 +03:00
In Haskell:
```haskell
data T
= Add T T
| Mul T T
| Div T T
| Sub T T
| Num Int
eval :: T -> Int
eval x = case x of
Add a b -> eval a + eval b
Mul a b -> eval a * eval b
Div a b -> eval a / eval b
Sub a b -> eval a - eval b
2015-12-12 20:46:28 +03:00
Num a -> a
```
In C:
```cpp
typedef struct T {
enum { ADD, MUL, DIV, SUB, NUM } tag;
union {
struct {
struct T *left, *right;
} node;
int value;
};
} Expr;
int eval(Expr t)
{
switch (t.tag) {
case ADD:
return eval(*t.node.left) + eval(*t.node.right);
break;
case MUL:
return eval(*t.node.left) * eval(*t.node.right);
break;
case DIV:
return eval(*t.node.left) / eval(*t.node.right);
break;
case SUB:
return eval(*t.node.left) - eval(*t.node.right);
break;
case NUM:
return t.value;
break;
}
}
```
2015-12-14 02:19:34 +03:00
* [Pattern Matching](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter10/eval.c)
Squashed commit of the following: commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 21:02:57 2015 -0500 type provenance commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 20:13:06 2015 -0500 provenance prototype commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 19:35:08 2015 -0500 fix misc typos commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:15:58 2015 -0500 license badge commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:07:28 2015 -0500 fix resolution for llvm cfg graphs commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:12:39 2015 -0500 added codegen dsl stub commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:01:14 2015 -0500 llvm cfg graphs commit a199d721503985954060e7670c1d2f5e1a65dd11 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 10:56:54 2015 -0500 source code font commit c7db0c5d67b73d8633f08be093971877e2d6ede0 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 09:59:37 2015 -0500 change phrasing around recursion commit 6903700db482524233262e722df54b1066218250 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 18:20:06 2015 -0500 contributors.md commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 17:35:41 2015 -0500 added llvm logo commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 15:50:28 2015 -0500 initial llvm chapter commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 12:21:00 2015 -0500 system-f typing rules commit 2a7d5c7f137cf352eeae64836df634c98118f594 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Thu Jan 15 15:21:14 2015 -0500 flesh out system-f commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 22:22:14 2015 -0500 started on extended parser commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 17:25:02 2015 -0500 creative commons license commit f09d210be253a05fc8ad0827cd72ffa32404e2ba Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 16:54:10 2015 -0500 higher res images commit 8555eadfea8843f5683621e6652857e4259fa896 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 14:48:44 2015 -0500 cover page commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Tue Jan 13 17:31:01 2015 -0500 initial happy/alex parser
2015-01-19 05:04:01 +03:00
Syntax
------
GHC.Generics
------------
```haskell
class Generic a where
type family Rep a :: * -> *
to :: a -> Rep a x
from :: Rep a x -> a
```
Constructor Models
----------- -------
``V1`` Void: used for datatypes without constructors
``U1`` Unit: used for constructors without arguments
``K1`` Constants, additional parameters.
``:*:`` Products: encode multiple arguments to constructors
``:+:`` Sums: encode choice between constructors
``L1`` Left hand side of a sum.
``R1`` Right hand side of a sum.
``M1`` Meta-information (constructor names, etc.)
```haskell
newtype M1 i c f p = M1 (f p)
newtype K1 i c p = K1 c
data U p = U
```
```haskell
data (:*:) a b p = a p :*: b p
data (:+:) a b p = L1 (a p) | R1 (b p)
```
2015-12-14 02:19:34 +03:00
Implementation:
```haskell
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE DefaultSignatures #-}
import GHC.Generics
-- Auxiliary class
class GEq' f where
geq' :: f a -> f a -> Bool
instance GEq' U1 where
geq' _ _ = True
instance (GEq c) => GEq' (K1 i c) where
geq' (K1 a) (K1 b) = geq a b
instance (GEq' a) => GEq' (M1 i c a) where
geq' (M1 a) (M1 b) = geq' a b
instance (GEq' a, GEq' b) => GEq' (a :+: b) where
geq' (L1 a) (L1 b) = geq' a b
geq' (R1 a) (R1 b) = geq' a b
geq' _ _ = False
instance (GEq' a, GEq' b) => GEq' (a :*: b) where
geq' (a1 :*: b1) (a2 :*: b2) = geq' a1 a2 && geq' b1 b2
--
class GEq a where
geq :: a -> a -> Bool
default geq :: (Generic a, GEq' (Rep a)) => a -> a -> Bool
geq x y = geq' (from x) (from y)
```
```haskell
-- Base equalities
instance GEq Char where geq = (==)
instance GEq Int where geq = (==)
instance GEq Float where geq = (==)
```
```haskell
-- Equalities derived from structure of (:+:) and (:*:) for free
instance GEq a => GEq (Maybe a)
instance (GEq a, GEq b) => GEq (a,b)
```
```haskell
main :: IO ()
main = do
print $ geq 2 (3 :: Int) -- False
print $ geq 'a' 'b' -- False
print $ geq (Just 'a') (Just 'a') -- True
print $ geq ('a','b') ('a', 'b') -- True
```
* [Generics](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter10/generics.hs)
2016-03-09 03:34:42 +03:00
Pattern Expansion
2016-01-11 05:25:18 +03:00
------------------
2016-03-09 03:34:42 +03:00
<!--
2016-01-11 05:25:18 +03:00
Consider the task of expanding
```haskell
f :: Maybe (Either a b) -> Int
2016-01-11 05:25:18 +03:00
f (Just (Left x)) = 1
f (Just (Right x)) = 2
f Nothing = 3
```
```haskell
f :: Maybe (Either a b) -> Int
f = case ds of _ {
Nothing -> I# 3;
Just ds1 ->
case ds1 of _ {
Left x -> I# 1;
Right x -> I# 2
}
}
```
2016-03-09 03:34:42 +03:00
-->
2016-01-11 05:25:18 +03:00
Squashed commit of the following: commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 21:02:57 2015 -0500 type provenance commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 20:13:06 2015 -0500 provenance prototype commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 19:35:08 2015 -0500 fix misc typos commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:15:58 2015 -0500 license badge commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 15:07:28 2015 -0500 fix resolution for llvm cfg graphs commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:12:39 2015 -0500 added codegen dsl stub commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 13:01:14 2015 -0500 llvm cfg graphs commit a199d721503985954060e7670c1d2f5e1a65dd11 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 10:56:54 2015 -0500 source code font commit c7db0c5d67b73d8633f08be093971877e2d6ede0 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sun Jan 18 09:59:37 2015 -0500 change phrasing around recursion commit 6903700db482524233262e722df54b1066218250 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 18:20:06 2015 -0500 contributors.md commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 17:35:41 2015 -0500 added llvm logo commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 15:50:28 2015 -0500 initial llvm chapter commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Sat Jan 17 12:21:00 2015 -0500 system-f typing rules commit 2a7d5c7f137cf352eeae64836df634c98118f594 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Thu Jan 15 15:21:14 2015 -0500 flesh out system-f commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 22:22:14 2015 -0500 started on extended parser commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 17:25:02 2015 -0500 creative commons license commit f09d210be253a05fc8ad0827cd72ffa32404e2ba Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 16:54:10 2015 -0500 higher res images commit 8555eadfea8843f5683621e6652857e4259fa896 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Wed Jan 14 14:48:44 2015 -0500 cover page commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3 Author: Stephen Diehl <stephen.m.diehl@gmail.com> Date: Tue Jan 13 17:31:01 2015 -0500 initial happy/alex parser
2015-01-19 05:04:01 +03:00
Full Source
-----------
\clearpage