Squashed commit of the following:

commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 21:02:57 2015 -0500

    type provenance

commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 20:13:06 2015 -0500

    provenance prototype

commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 19:35:08 2015 -0500

    fix misc typos

commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 15:15:58 2015 -0500

    license badge

commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 15:07:28 2015 -0500

    fix resolution for llvm cfg graphs

commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 13:12:39 2015 -0500

    added codegen dsl stub

commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 13:01:14 2015 -0500

    llvm cfg graphs

commit a199d721503985954060e7670c1d2f5e1a65dd11
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 10:56:54 2015 -0500

    source code font

commit c7db0c5d67b73d8633f08be093971877e2d6ede0
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sun Jan 18 09:59:37 2015 -0500

    change phrasing around recursion

commit 6903700db482524233262e722df54b1066218250
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sat Jan 17 18:20:06 2015 -0500

    contributors.md

commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sat Jan 17 17:35:41 2015 -0500

    added llvm logo

commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sat Jan 17 15:50:28 2015 -0500

    initial llvm chapter

commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Sat Jan 17 12:21:00 2015 -0500

    system-f typing rules

commit 2a7d5c7f137cf352eeae64836df634c98118f594
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Thu Jan 15 15:21:14 2015 -0500

    flesh out system-f

commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Wed Jan 14 22:22:14 2015 -0500

    started on extended parser

commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Wed Jan 14 17:25:02 2015 -0500

    creative commons license

commit f09d210be253a05fc8ad0827cd72ffa32404e2ba
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Wed Jan 14 16:54:10 2015 -0500

    higher res images

commit 8555eadfea8843f5683621e6652857e4259fa896
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Wed Jan 14 14:48:44 2015 -0500

    cover page

commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date:   Tue Jan 13 17:31:01 2015 -0500

    initial happy/alex parser
This commit is contained in:
Stephen Diehl 2015-01-18 21:04:01 -05:00
parent b81a2c9b5f
commit 73b43dcf89
101 changed files with 4067 additions and 166 deletions


@ -1,14 +1,19 @@
<div class="pagetitle">
![](img/titles/introduction.png)
</div>
******
<p class="halfbreak">
</p>
> When the limestone of imperative programming is worn away, the granite of
> functional programming will be observed.
<!--
> *When the limestone of imperative programming is worn away, the granite of
> functional programming will be observed.*
>
> <cite>— Simon Peyton Jones</cite>
<p class="halfbreak">
</p>
-->
Introduction
============
@ -151,42 +156,71 @@ refines the space of allowable behavior and degree of expressible programs for
the language. Types are the world's most popular formal method for analyzing
programs.
$$
\begin{aligned}
1 &: \t{Nat} \\
(\lambda x . x) &: \forall a. a \to a \\
(\lambda x y . x) &: \forall a b. a \to b \to a \\
\end{aligned}
$$
In a language like Python all expressions have the same type at compile time,
and all syntactically valid programs can be evaluated. In the case where the
program is nonsensical the runtime will bubble up exceptions. The Python
interpreter makes no attempt to analyze the given program for soundness before
running it.
In more sophisticated languages types and terms will commingle either with
explicit annotations on binders, or even as first class values themselves.
```bash
>>> True & "false"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'bool' and 'str'
```
$$
\t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v
$$
By comparison Haskell will do quite a bit of work to try to ensure that the
program is well-defined before running it. The language that we use to
prescribe and analyze the static semantics of the program is that of *static
types*.
In all the languages which we will implement the types present during compilation are
*erased*. Although they are present in the evaluation semantics, the runtime
cannot dispatch on types of values at runtime. Types by definition only exist at
compile-time in the static semantics of the language.
```bash
Prelude> True && "false"
<interactive>:2:9:
Couldn't match expected type `Bool' with actual type `[Char]'
In the second argument of `(&&)', namely `"false"'
In the expression: True && "false"
In an equation for `it': it = True && "false"
```
Catching minor type mismatch errors is the simplest example of usage, although
they occur extremely frequently as we humans are quite fallible in our reasoning
about even the simplest of program constructions! Although this is just the tip
of the iceberg, the gradual trend over the last 20 years has been toward more
*expressive types* in modern type systems, which are capable of guaranteeing a
large variety of program correctness properties:
* Preventing resource allocation errors.
* Enforcing security access for program logic.
* Side effect management.
* Preventing buffer overruns.
* Ensuring cryptographic properties for network protocols.
* Modeling and verifying theorems in mathematics and logic.
* Preventing data races and deadlocks in concurrent systems.
Type systems can never capture all possible behavior of the program, although
more sophisticated type systems are increasingly able to model a larger space of
behavior; this is one of the most exciting areas of modern computer science
research. Put most bluntly, **static types let you be dumb** and offload the
checking that you would otherwise have to do in your head to a system that can
do the reasoning for you and work with you to interactively build your program.
Functional Compilers
--------------------
A compiler is typically divided into parts, a *frontend* and a *backend*. These
are loose terms but the frontend typically deals with converting the human
representation of the code into some canonicalized form while the backend
converts the canonicalized form into another form that is suitable for
evaluation.
A *compiler* is a program for turning a high-level representation of ideas in a
human readable language into another form. A compiler is typically divided into
parts, a *frontend* and a *backend*. These are loose terms but the frontend
typically deals with converting the human representation of the code into some
canonicalized form while the backend converts the canonicalized form into
another form that is suitable for evaluation.
The high level structure of our functional compiler is described by the
following *block diagram*. Each box describes a *phase*, which is a sequence of
transformations composed to transform the input program.
<p class="center">
![](img/pipeline1.png)
</p>
* **Source** - The frontend textual source language.
* **Parsing** - Source is parsed into an abstract syntax tree.
@ -200,15 +234,13 @@ A *pass* may transform the input program from one form into another or alter the
internal state of the compiler context. The high level description of the forms
our final compiler will go through is the following sequence:
<p class="center">
![](img/pipeline2.png)
</p>
Internal forms used during compilation are called *intermediate
representations*, and typically any non-trivial language will involve several.
Lexing
------
Parsing
-------
The source code is simply the raw sequence of text that specifies the program.
Lexing splits the text stream into a sequence of *tokens*. Only the presence of
@ -222,22 +254,22 @@ let f x = x + 1
For instance the previous program might generate a token stream like the
following:
Token Value
----- -----
reserved let
var f
var x
reservedOp =
var x
reservedOp +
integer 1
```haskell
[
TokenLet,
TokenSym "f",
TokenSym "x",
TokenEq,
TokenSym "x",
TokenAdd,
TokenNum 1
]
```
Parsing
-------
A datatype for the *abstract syntax tree* (AST) is constructed by traversal of
the input stream and generation of the appropriate syntactic construct using a
parser.
We can then scan the token stream and dispatch on predefined patterns of
tokens called *productions*, recursively building up the datatype for the
*abstract syntax tree* (AST) by traversal of the input stream and generation of
the appropriate syntactic construct.
```haskell
type Name = String
@ -319,6 +351,19 @@ Let "f" []
(Lit (LitInt 1))))
```
Transformation
--------------
The typed core representation is often suitable for evaluation, but quite often
different intermediate representations are more amenable to certain
optimizations and make semantic properties of the language explicit.
These kinds of intermediate forms will often attach information about free
variables, allocations, and usage directly onto the AST to make it available to
later passes.
The most important form we will use is called the *Spineless Tagless G-Machine*
(STG), an abstract machine that makes many of the properties of lazy
evaluation explicit directly in the AST.
Code Generation
---------------
@ -356,11 +401,11 @@ resulting module.
```perl
f:
movl %edi, -4(%rsp)
movl -4(%rsp), %edi
addl $1, %edi
movl %edi, %eax
ret
movl %edi, -4(%rsp)
movl -4(%rsp), %edi
addl $1, %edi
movl %edi, %eax
ret
```
@ -370,11 +415,11 @@ instructions defined by the processor specification.
```perl
0000000000000000 <f>:
0: 89 7c 24 fc mov %edi,-0x4(%rsp)
4: 8b 7c 24 fc mov -0x4(%rsp),%edi
8: 81 c7 01 00 00 00 add $0x1,%edi
e: 89 f8 mov %edi,%eax
10: c3 retq
0: 89 7c 24 fc mov %edi,-0x4(%rsp)
4: 8b 7c 24 fc mov -0x4(%rsp),%edi
8: 81 c7 01 00 00 00 add $0x1,%edi
e: 89 f8 mov %edi,%eax
10: c3 retq
```
\pagebreak


@ -1,6 +1,6 @@
<div class="pagetitle">
![](img/titles/basics.png)
******
</div>
<!--
<blockquote>
@ -33,14 +33,12 @@ add :: Integer -> Integer -> Integer
add x y = x + y
```
```haskell
add (x,y) = x + y
```
In Haskell all functions are pure, the only thing a function may do is return a
value.
In Haskell all functions are pure, the only thing a function may do is return a value.
All functions in Haskell are curried, for example a function of three arguments takes up to three arguments and for
anything less than three it yields a partially applied function which when given additional arguments yields a
All functions in Haskell are curried. For example a function of three arguments
takes up to three arguments, and for anything less than three it yields a
partially applied function which, when given additional arguments, yields
another function or the resulting value if saturated.
```haskell
@ -51,7 +49,8 @@ h :: Int -> Int
h = g 2 3
```
Haskell supports higher-order functions, functions which take functions and yield other functions.
Haskell supports higher-order functions, functions which take functions and
yield other functions.
```haskell
compose f g = \x -> f (g x)
@ -86,10 +85,7 @@ constructors also generates special set of functions known as *selectors* which
extract the values of a specific field from the record.
```haskell
data Prod = Prod
{ a :: Int
, b :: Bool
}
data Prod = Prod { a :: Int , b :: Bool }
-- a :: Prod -> Int
-- b :: Prod -> Bool
@ -158,7 +154,7 @@ Tuples are allowed (with compiler support) up to 15 fields in GHC.
Pattern matching
----------------
Pattern matching allows us to discriminate on the constructor(s) of a datatype,
Pattern matching allows us to discriminate on the constructors of a datatype,
mapping separate cases to separate code paths.
```haskell
@ -216,12 +212,17 @@ Recursion
In Haskell all iteration over data structures is performed by recursion.
Entering a function in Haskell does not create a new stack frame; the logic of
the function is simply entered with the arguments on the stack and yields its result
to the register. The resulting logic is compiled identically to ``while`` loops
in other languages, via a ``jmp`` instruction instead of a ``call``.
to the register. In the case where a function returns an invocation of itself
in the *tail position*, the resulting logic is compiled identically to
``while`` loops in other languages, via a ``jmp`` instruction instead of a
``call``.
```haskell
factorial 0 = 1
factorial n = n * factorial (n - 1)
sum :: [Int] -> Int
sum ys = go ys 0
where
go (x:xs) i = go xs (i+x)
go [] i = i
```
Functions can be defined to recurse mutually on each other.
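For instance, a classic pair of mutually recursive predicates (our own example, not from the text) in which each function calls the other until the argument reaches zero:

```haskell
-- Mutually recursive definitions: isEven calls isOdd and vice versa.
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)
```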
@ -421,19 +422,19 @@ all monad instances must satisfy.
**Law 1**
```haskell
return a >>= f f a
return a >>= f = f a
```
**Law 2**
```haskell
m >>= return m
m >>= return = m
```
**Law 3**
```haskell
(m >>= f) >>= g m >>= (\x -> f x >>= g)
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
```
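These laws can be spot-checked (though of course not proven) at a concrete instance such as ``Maybe``; the helper names below are our own:

```haskell
-- Sample values to exercise the three monad laws at the Maybe instance.
f :: Int -> Maybe Int
f x = Just (x + 1)

g :: Int -> Maybe Int
g x = Just (x * 2)

m :: Maybe Int
m = Just 3

law1, law2, law3 :: Bool
law1 = (return 3 >>= f) == f 3                         -- left identity
law2 = (m >>= return) == m                             -- right identity
law3 = ((m >>= f) >>= g) == (m >>= (\x -> f x >>= g))  -- associativity
```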
Haskell has a level of syntactic sugar for monads known as do-notation. In this
@ -441,9 +442,9 @@ form binds are written sequentially in block form which extract the variable
from the binder.
```haskell
do { a <- f ; m } f >>= \a -> do { m }
do { f ; m } f >> do { m }
do { m } m
do { a <- f ; m } = f >>= \a -> do { m }
do { f ; m } = f >> do { m }
do { m } = m
```
So for example the following are equivalent.
@ -504,6 +505,16 @@ discards the left while ``<*`` discards the right. For example in a monadic
parser combinator library the ``*>`` would parse with first parser argument but
return the second.
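At the ``Maybe`` instance, for instance, both sides are sequenced but only one result is kept (bindings named by us):

```haskell
import Control.Applicative ((<*), (*>))

keepLeft :: Maybe Int
keepLeft = Just 1 <* Just 2        -- keeps the left result

keepRight :: Maybe Int
keepRight = Just 1 *> Just 2       -- keeps the right result

shortCircuits :: Maybe Int
shortCircuits = Nothing *> Just 2  -- the discarded side's effect still counts
```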
Monoids
-------
```haskell
class Monoid a where
mempty :: a
mappend :: a -> a -> a
mconcat :: [a] -> a
```
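For example, lists form a monoid under concatenation with the empty list as the identity; a small sketch (bindings named by us):

```haskell
import Data.Monoid

-- The list instance: mempty is [] and mappend is (++).
combined :: [Int]
combined = mappend [1, 2] [3, 4]

flattened :: String
flattened = mconcat ["mon", "oid"]
```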
Deriving
--------
@ -638,9 +649,7 @@ evalStack m = execWriterT (evalStateT (unStack m) 0)
As illustrated by the following stack diagram:
<p class="center">
![](img/stack.png)
</p>
Using mtl and ``GeneralizedNewtypeDeriving`` we can produce the same stack but with
a simpler forward facing interface to the transformer stack. Under the hood mtl


@ -1,6 +1,6 @@
<div class="pagetitle">
![](img/titles/parsing.png)
******
</div>
<!--
> The tools we use have a profound (and devious!) influence on our thinking habits, and, therefore, on our thinking abilities.
@ -29,7 +29,7 @@ NanoParsec
So now let's build our own toy parser combinator library which we'll call
**NanoParsec** just to get the feel of how these things are built.
~~~~ {.haskell slice="chapter3/parsec.hs" lower=1 upper=8}
~~~~ {.haskell slice="chapter3/parsec.hs" lower=0 upper=7}
~~~~
Structurally a parser is a function which takes an input stream of characters
@ -37,7 +37,7 @@ and yields a parse tree by applying the parser logic over sections of the
character stream (called *lexemes*) to build up a composite data structure for
the AST.
~~~~ {.haskell slice="chapter3/parsec.hs" lower=8 upper=9}
~~~~ {.haskell slice="chapter3/parsec.hs" lower=8 upper=8}
~~~~
Running the function will result in traversing the stream of characters yielding
@ -89,7 +89,7 @@ and concatenates the result. Together these give rise to both the Alternative
and MonadPlus class instances which encode the logic for trying multiple parse
functions over the same stream and handling failure and rollover.
The core operator introduced here is (``(<|>)``) operator for combining two
The core operator introduced here is the (``<|>``) operator for combining two
optional paths of parser logic, switching to the second path if the first fails
with the zero value.
@ -174,10 +174,6 @@ $ runhaskell parsec.hs
7
```
See:
* [Monads for functional programming](http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf)
**Generalizing String**
The limitations of the String type are well-known, but what is particularly nice


@ -1,12 +1,11 @@
<div class="pagetitle">
![](img/titles/lambda_calculus.png)
</div>
******
<!--
> That language is an instrument of human reason, and not merely a
> medium for the expression of thought, is a truth generally admitted.
> *That language is an instrument of human reason, and not merely a medium for
> the expression of thought, is a truth generally admitted.*
>
> <cite>— George Boole</cite>
-->
<p class="halfbreak">
</p>
@ -19,14 +18,15 @@ composition, function abstraction of a single variable. The **lambda calculus**
consists very simply of three terms and all valid recursive combinations
thereof:
![](img/lambda.png)
The terms are typically referred to in code by the following
contractions.
- **Var** - A variable
- **Lam** - A lambda abstraction
- **App** - An application
<p class="center">
![](img/lambda.png)
</p>
$$
\begin{aligned}
e :=\ & x & \trule{Var} \\
& \lambda x . e & \trule{Lam} \\
& e\ e & \trule{App} \\
\end{aligned}
$$
@ -183,15 +183,8 @@ choose the Haskell convention which denotes lambda by the backslash (``\``) to
the body with (``->``), and application by spaces. Named variables are simply
alphanumeric sequences of characters.
Logical notation:
$\mathtt{const} = \lambda x y . x$
Haskell notation:
```haskell
const = \x y -> x
```
* **Logical notation**: $\mathtt{const} = \lambda x y . x$
* **Haskell notation**: ``const = \x y -> x``
In addition, other terms like literal numbers or booleans can be added, and these
make writing expository examples a little easier. We will also add a


@ -1,12 +1,15 @@
<div class="pagetitle">
![](img/titles/type_systems.png)
</div>
******
<p class="halfbreak">
</p>
> [A type system is a] tractable syntactic method for proving the
> *[A type system is a] tractable syntactic method for proving the
> absence of certain program behaviors by classifying phrases
> according to the kinds of values they compute.
> according to the kinds of values they compute.*
>
> <cite>- Benjamin Pierce</cite>
> <cite> Benjamin Pierce</cite>
<p class="halfbreak">
</p>
@ -194,6 +197,11 @@ $$
\tau_1 \to \tau_2 \to \tau_3 \to \tau_4 \quad = \quad \tau_1 \to (\tau_2 \to (\tau_3 \to \tau_4))
$$
In all the languages which we will implement the types present during
compilation are *erased*. Although types are possibly present in the evaluation
semantics, the runtime cannot dispatch on types of values at runtime. Types by
definition only exist at compile-time in the static semantics of the language.
Small-Step Semantics
--------------------
@ -603,18 +611,16 @@ the space of all programs and draw a large line around the universe of discourse
programs that we are willing to consider, since these are the only programs that
we can prove properties for.
> Well-typed programs don't go wrong, but not every program that never goes wrong
> *Well-typed programs don't go wrong, but not every program that never goes wrong
> is well-typed. It's easy to exhibit programs that don't go wrong but are
> ill-typed in ... any ... decidable type system. Many such programs are useful,
> which is why dynamically-typed languages like Erlang and Lisp are justly
> popular.
> popular.*
>
> <cite>-Simon Peyton Jones</cite>
> <cite>Simon Peyton Jones</cite>
<!--
<p class="center">
![](img/abysmal_pain.png)
</p>
-->
Power always comes at a price. Using one system you can do more things. In


@ -1,10 +1,10 @@
<div class="pagetitle">
![](img/titles/evaluation.png)
</div>
******
> Well-typed programs cannot "go wrong".
> *Well-typed programs cannot "go wrong".*
>
> <cite>-Robin Milner</cite>
> <cite>— Robin Milner</cite>
<p class="halfbreak">
</p>
@ -22,8 +22,8 @@ lambda expression are necessarily evaluated before a lambda is reduced. A
language in which the arguments are not necessarily evaluated before a lambda is
reduced is non-strict.
Alternatively expressed, diverging terms are represented by the *bottom* value,
written as $\bot$. A function $f$ is non-strict if:
Alternatively expressed, diverging terms are represented up to equivalence by
the *bottom* value, written as $\bot$. A function $f$ is non-strict if:
$$
f \bot \neq \bot
$$
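A small Haskell sketch of this definition (our own example): a function that never inspects its argument maps even a diverging term to a defined value, so under non-strict evaluation the diverging argument is never forced.

```haskell
-- 'bottom' is a diverging term: evaluating it never terminates.
bottom :: Int
bottom = bottom

-- 'constOne' never inspects its argument, so constOne bottom
-- evaluates to 1: constOne is non-strict.
constOne :: Int -> Int
constOne _ = 1
```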
@ -81,9 +81,9 @@ how the subexpression ``(2 + 2)`` is evaluated to normal form before being
bound.
```haskell
(λx. λy. y x) (2 + 2) λx. x + 1
=> (λy. y 4) λx. x + 1
=> (λy. x + 1) 4
(\x. \y. y x) (2 + 2) (\x. x + 1)
=> (\y. y 4) (\x. x + 1)
=> (\x. x + 1) 4
=> 4 + 1
=> 5
```
@ -169,16 +169,15 @@ For example, the same expression we looked at for call-by-value has the same
normal form but arrives at it by a different sequence of reductions:
```haskell
(λx. λy. y x) (2 + 2) λx. x + 1
=> (λy.y (2 + 2)) λx. x + 1
=> (λx.x + 1) (2 + 2)
(\x. \y. y x) (2 + 2) (\x. x + 1)
=> (\y. y (2 + 2)) (\x. x + 1)
=> (\x. x + 1) (2 + 2)
=> (2 + 2) + 1
=> 4 + 1
=> 5
```
Call-by-name is non-strict, although very few languages use this model,
[Frege](https://github.com/Frege/frege) being the most notable example.
Call-by-name is non-strict, although very few languages use this model.
Call-by-need
------------


@ -1,8 +1,11 @@
<div class="pagetitle">
![](img/titles/hindley_milner.png)
</div>
******
<p class="halfbreak">
</p>
> There is nothing more practical than a good theory.
> *There is nothing more practical than a good theory.*
>
> <cite>— James C. Maxwell</cite>
@ -20,7 +23,7 @@ program give rise to a set of constraints that when solved always have a unique
*principal type*.
The simplest Hindley Milner type system is defined by a very short set of rules.
The first four rules describe the judgements by which we can each syntactic
The first four rules describe the judgements by which we can map each syntactic
construct (``Lam``, ``App``, ``Var``, ``Let``) to their expected types. We'll
elaborate on these rules shortly.
@ -100,6 +103,51 @@ let S f g x = f x (g x);
As before ``let rec`` expressions will expand out in terms of the fixpoint
operator and are just syntactic sugar.
Polymorphism
------------
We will add an additional construct to our language that will admit a new form
of *polymorphism*. Polymorphism is the property of a term to
simultaneously admit several distinct types for the same function
implementation.
For instance the polymorphic signature for the identity function maps an input
of type $\alpha$ to an output of the same type:
$$
\begin{aligned}
\mathtt{id}\ & ::\ \forall \alpha. \alpha \to \alpha \\
\mathtt{id}\ & =\ \lambda x : \alpha.\ x
\end{aligned}
$$
Now instead of having to duplicate the functionality for every possible type
(i.e. implementing idInt, idBool, ...) our type system admits any
instantiation that is subsumed by the polymorphic type signature.
$$
\begin{aligned}
& \t{id}_\t{Int} = \t{Int} \to \t{Int} \\
& \t{id}_\t{Bool} = \t{Bool} \to \t{Bool} \\
\end{aligned}
$$
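In Haskell this is simply the ordinary polymorphic identity function; the compiler instantiates the quantified type variable at each use site (a small sketch with our own names):

```haskell
-- One implementation serves every instantiation of the type variable.
identity :: a -> a
identity x = x

atInt :: Int
atInt = identity 1       -- a instantiated at Int

atBool :: Bool
atBool = identity True   -- a instantiated at Bool
```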
A rather remarkable fact of universal quantification is that many properties
about inhabitants of a type are guaranteed by construction; these are the
so-called *free theorems*. For instance the only (nonpathological)
implementation that can inhabit a function of type ``(a, b) -> a`` is an
implementation precisely identical to that of ``fst``.
A slightly less trivial example is that of the ``fmap`` function of type
``Functor f => (a -> b) -> f a -> f b``. The second functor law states that:
```haskell
forall f g. fmap f . fmap g = fmap (f . g)
```
However it is impossible to write down a (nonpathological) function for ``fmap``
that is well-typed and does not have this property. We get the theorem for free!
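As a quick sanity check (not a proof), one can test this law at a particular instance such as ``Maybe``:

```haskell
-- Spot-checking fmap f . fmap g == fmap (f . g) on sample Maybe values.
lawHolds :: Bool
lawHolds = all check [Nothing, Just 1, Just 42]
  where
    f = (+ 1) :: Int -> Int
    g = (* 2)
    check x = (fmap f . fmap g) x == fmap (f . g) x
```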
Types
-----
@ -121,11 +169,10 @@ typeInt = TCon "Int"
typeBool = TCon "Bool"
```
However we will add an additional construct that will admit a new form of
*polymorphism* for our language. *Type schemes* model polymorphic types, they
indicate that the type variables bound in quantifier are polymorphic across the
enclosed type and can be instantiated with any type consistent with the
signature.
*Type schemes* model polymorphic types; they indicate that the type variables
bound in the quantifier are polymorphic across the enclosed type and can be
instantiated with any type consistent with the signature. Intuitively, they
indicate that the implementation of the function is independent of the types
its type variables are instantiated with.
```haskell
data Scheme = Forall [TVar] Type
@ -229,8 +276,8 @@ $$
\begin{aligned}
\FTV{\alpha} &= \{ \alpha \} \\
\FTV{\tau_1 \rightarrow \tau_2} &= \FTV{\tau_1} \cup \FTV{\tau_2} \\
\FTV{\t{Int}} &= \emptyset \\
\FTV{\t{Bool}} &= \emptyset \\
\FTV{\t{Int}} &= \varnothing \\
\FTV{\t{Bool}} &= \varnothing \\
\FTV{\forall x. t} &= \FTV{t} - \{ x \} \\
\end{aligned}
$$
@ -342,9 +389,9 @@ s := [n_0 / m_0, n_1 / m_1, ..., n_k / m_k] \\
$$
Two terms are said to be *unifiable* if there exists a unifying substitution set
between them. A substitution set is said to be **confluent** if the application
of substitutions is independent of the order applied, i.e. if we always arrive
at the same normal form regardless of the order of substitution chosen.
between them. A substitution set is said to be *confluent* if the application of
substitutions is independent of the order applied, i.e. if we always arrive at
the same normal form regardless of the order of substitution chosen.
The notation we'll adopt for unification is, read as two types $\tau, \tau'$ are
unifiable by a substitution $s$.
@ -1206,7 +1253,7 @@ Finally tab completion for our shell will use the interpreter's typing
environment keys to complete on the set of locally defined variables. Repline
supports prefix based tab completion where the prefix of the current command
will be used to determine what to tab complete. In the case where we start with
the command ":load" we will instead tab complete on filenames in the current
the command ``:load`` we will instead tab complete on filenames in the current
working directory.
```haskell


@ -1,6 +1,6 @@
<div class="pagetitle">
![](img/titles/protohaskell.png)
******
</div>
<!--
> Functional languages are unnatural to use. [...] The important question is
@ -115,6 +115,11 @@ Things we will not implement are:
* Defaulting rules
* Exceptions
* Parallelism
* Software Transactional Memory
* Foreign Function Interface
Now if one feels so inclined one could of course implement these features on top
of our final language, but they are left as an exercise to the reader!
This of course raises the question of whether or not our language is "a Haskell".
In the strictest sense, it will not be since it doesn't fully conform to either
@ -131,16 +136,16 @@ Intermediate Forms
The passes between each of the phases make up the main *compilation pipeline*.
<p class="center">
![](img/proto_pass.png)
</p>
For *ProtoHaskell* our pipeline consists of the transitions between four
intermediate forms of the program.
<p class="center">
![](img/protohaskell.png)
</p>
* The **Source**, the textual representation of the program from a file or user
input. This is stored in a ``Text`` type.
@ -149,8 +154,7 @@ intermediate forms of the program.
after type inference.
* The **PHOAS**, the type-erased Core is transformed into Haskell expressions
by mapping lambda expressions in our language directly into Haskell lambda
expressions and then evaluated using the Haskell runtime. This is simplest way
of implementing a small interpreter.
expressions and then evaluated using the Haskell runtime.
Pass Rep Haskell Type
-------- -------- ---------
@ -269,6 +273,8 @@ Compiling module: prelude.fun
3
λ> :type (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
λ> :set -ddump-rn
λ> :load test.fun
```
Command line conventions will follow the Haskell's naming conventions. There
@ -900,6 +906,11 @@ Data Declarations
Data declarations are named blocks of various *ConDecl* constructors for each of
the fields or constructors of a user-defined datatype.
```haskell
data qname [var] where
[tydecl]
```
```haskell
data Unit where
Unit :: Unit
@ -926,6 +937,10 @@ if there is a sequence of where statements these are also attached directly to
the declaration, and will later be desugared away into local let statements
across the body of the function.
```haskell
qname [pat] = rhs [where decls]
```
```haskell
const x y = x
```
Fixity declarations are exceedingly simple; they store the precedence of the
declaration along with its associativity (Left, Right, Non-Associative) and the
infix symbol.
```haskell
[infixl|infixr|infix] [integer] ops;
```
```haskell
infixl 4 +;
```
@ -1009,17 +1028,17 @@ FixityDecl
Typeclass Declarations
----------------------
Consider a very simplified ``Num`` class. Typeclass declarations consist simply
of the list of typeclass constraints, the name of the class, and the type
variable ( single parameter only ). The body of the class is simply a sequence
of scoped ``FunDecl`` declarations with only the ``matchType`` field.
Typeclass declarations consist simply of the list of typeclass constraints, the
name of the class, and the type variable (single parameter only). The body of
the class is simply a sequence of scoped ``FunDecl`` declarations with only the
``matchType`` field.
```haskell
class [context] => classname [var] where
[body]
```
For example:
Consider a very simplified ``Num`` class.
```haskell
class Num a where

008_extended_parser.md

@ -0,0 +1,591 @@
<div class="pagetitle">
![](img/titles/extended_parser.png)
</div>
<p class="halfbreak">
</p>
Extended Parser
===============
Up until now we've been using parser combinators to build our parsers. Parser
combinators form a top-down parser, formally in the $\mathtt{LL}(k)$ family of
parsers. The parser proceeds top-down, with a sequence of $k$ characters used to
dispatch on the leftmost production rule. Combined with backtracking (i.e. the ``try``
combinator) this is both an extremely powerful and simple model
to implement, as we saw before with our simple 100 line parser library.
However there is a family of grammars, those that include left-recursion, that
$\mathtt{LL}(k)$ parsers can be inefficient at and are often incapable of parsing.
Left-recursive rules are those in which the left-most symbol of the rule
recurses on itself. For example:
$$
\begin{aligned}
e ::=\ e\ \t{op}\ \t{atom}
\end{aligned}
$$
We demonstrated before that we could handle these cases using the
parser combinator ``chainl1`` function, and while this is possible, it
can in many cases be an inefficient use of the parser stack and lead to ambiguous
cases.
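To recall how ``chainl1`` sidesteps left recursion, here is its standard definition (as found in most combinator libraries, not the version from our parser chapter) sketched on top of base's ``ReadP`` combinators, parsing a left-associative subtraction grammar:

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP hiding (chainl1)

-- chainl1: parse one p, then repeatedly parse an operator and another p,
-- folding the results to the left. This recognizes the left-recursive
-- grammar e ::= e op atom without the parser itself recursing on the left.
chainl1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a
chainl1 p op = p >>= rest
  where
    rest a = (do f <- op; b <- p; rest (f a b)) <|> return a

number :: ReadP Int
number = fmap read (munch1 isDigit)

subOp :: ReadP (Int -> Int -> Int)
subOp = char '-' >> return (-)

expr :: ReadP Int
expr = number `chainl1` subOp

-- Complete parses only: "1-2-3" folds as ((1 - 2) - 3), i.e. left associative.
parseExpr :: String -> [Int]
parseExpr s = [r | (r, "") <- readP_to_S expr s]
```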
The other major family of parsers, $\mathtt{LR}$, is not plagued with the same
concerns over left recursion. On the other hand $\mathtt{LR}$ parsers are
exceedingly more complicated to implement, relying on a rather sophisticated
method known as Tomita's algorithm to do the heavy lifting. The tooling
around the construction of the *production rules*, in a form that can be handled
by the algorithm, is often handled by a DSL that generates the code for the parser.
While the tooling is fairly robust, there is a level of indirection between us
and the code that can often be a bit brittle to extend with custom logic.
The most common form of this toolchain is the Lex/Yacc lexer and parser
generator which compile into efficient C parsers for $\mathtt{LR}$ grammars.
Haskell's **Happy** and **Alex** are roughly the Haskell equivalent of these
tools.
Toolchain
---------
Our parser logic will be spread across two different modules.
* Lexer.x
* Parser.y
The code in each of these modules is a hybrid of the specific Alex/Happy grammar
syntax and arbitrary Haskell logic that is spliced in. Code delineated by braces
(``{}``) is regular Haskell, while code outside is parser/lexer logic.
```haskell
-- **Begin Haskell Syntax**
{
{-# OPTIONS_GHC -w #-}
module Lexer (
Token(..),
scanTokens
) where
import Syntax
}
-- **End Haskell Syntax**
-- **Begin Alex Syntax**
%wrapper "basic"
$digit = 0-9
$alpha = [a-zA-Z]
$eol = [\n]
-- **End Alex Syntax**
```
These files will be used during the code generation of the two modules ``Lexer``
and ``Parser``. The toolchain is accessible in several ways: first via the
command-line tools ``alex`` and ``happy``, which will generate the resulting
modules when passed the appropriate input file.
```bash
$ alex Lexer.x # Generates Lexer.hs
$ happy Parser.y # Generates Parser.hs
```
Or inside of the cabal file using the ``build-tools`` field.
```haskell
Build-depends: base, array
build-tools: alex, happy
other-modules:
Parser,
Lexer
```
So the resulting structure of our interpreter will have the following set of
files.
* **Lexer.hs**
* **Parser.hs**
* Eval.hs
* Main.hs
* Syntax.hs
Alex
----
Our lexer module will export our Token definition and a function for converting
an arbitrary String into a *token stream* or a list of Tokens.
```haskell
{
module Lexer (
Token(..),
scanTokens
) where
import Syntax
}
```
The tokens are simply an enumeration of the unique possible tokens in our
grammar.
```haskell
data Token
  = TokenLet
  | TokenTrue
  | TokenFalse
  | TokenIn
  | TokenLambda
  | TokenNum Int
  | TokenSym String
  | TokenArrow
  | TokenEq
  | TokenAdd
  | TokenSub
  | TokenMul
  | TokenLParen
  | TokenRParen
  | TokenEOF
  deriving (Eq,Show)

scanTokens :: String -> [Token]
scanTokens = alexScanTokens
```
The token definition is a list of rules mapping atomic character
and alphabetical sequences to constructors for our ``Token`` datatype.
```haskell
%wrapper "basic"
$digit = 0-9
$alpha = [a-zA-Z]
$eol = [\n]
tokens :-
-- Whitespace insensitive
$eol ;
$white+ ;
-- Comments
"#".* ;
-- Syntax
let { \s -> TokenLet }
True { \s -> TokenTrue }
False { \s -> TokenFalse }
in { \s -> TokenIn }
$digit+ { \s -> TokenNum (read s) }
"->" { \s -> TokenArrow }
\= { \s -> TokenEq }
\\ { \s -> TokenLambda }
[\+] { \s -> TokenAdd }
[\-] { \s -> TokenSub }
[\*] { \s -> TokenMul }
\( { \s -> TokenLParen }
\) { \s -> TokenRParen }
$alpha [$alpha $digit \_ \']* { \s -> TokenSym s }
```
Happy
-----
We'll parse into a small untyped lambda calculus for our frontend language.
```haskell
module Syntax where
type Name = String
data Expr
  = Lam Name Expr
  | App Expr Expr
  | Var Name
  | Lit Lit
  | Op Binop Expr Expr
  deriving (Eq,Show)

data Lit
  = LInt Int
  | LBool Bool
  deriving (Show, Eq, Ord)

data Binop = Add | Sub | Mul | Eql
  deriving (Eq, Ord, Show)
```
The token constructors are each assigned to a name that will be used in our
production rules.
```haskell
-- Lexer structure
%tokentype { Token }
-- Token Names
%token
let { TokenLet }
true { TokenTrue }
false { TokenFalse }
in { TokenIn }
NUM { TokenNum $$ }
VAR { TokenSym $$ }
'\\' { TokenLambda }
'->' { TokenArrow }
'=' { TokenEq }
'+' { TokenAdd }
'-' { TokenSub }
'*' { TokenMul }
'(' { TokenLParen }
')' { TokenRParen }
```
The parser itself will live inside of a custom monad of our choosing. In this
simple case we'll just add error handling with the ``Except`` monad.
```haskell
-- Parser monad
%monad { Except String } { (>>=) } { return }
%error { parseError }
```
And finally our production rules; the toplevel entry point for our parser will
be the ``expr`` rule. Notice how naturally we can write left-recursive rules
for our infix operators.
```haskell
-- Entry point
%name expr
-- Operators
%left '+' '-'
%left '*'
%%
Expr : let VAR '=' Expr in Expr { App (Lam $2 $6) $4 }
| '\\' VAR '->' Expr { Lam $2 $4 }
| Form { $1 }
Form : Form '+' Form { Op Add $1 $3 }
| Form '-' Form { Op Sub $1 $3 }
| Form '*' Form { Op Mul $1 $3 }
| Fact { $1 }
Fact : Fact Atom { App $1 $2 }
| Atom { $1 }
Atom : '(' Expr ')' { $2 }
| NUM { Lit (LInt $1) }
| VAR { Var $1 }
| true { Lit (LBool True) }
     | false                    { Lit (LBool False) }
```
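To see what the let rule produces, note that the production ``{ App (Lam $2 $6) $4 }`` desugars a let binding into function application during parsing. A self-contained check (with ``Expr`` and ``Lit`` reproduced from ``Syntax.hs``, minus ``Op``, so the snippet stands alone):

```haskell
-- Expr and Lit reproduced from Syntax.hs so the check is self-contained.
type Name = String

data Expr
  = Lam Name Expr
  | App Expr Expr
  | Var Name
  | Lit Lit
  deriving (Eq, Show)

data Lit = LInt Int | LBool Bool deriving (Eq, Show)

-- Per the production { App (Lam $2 $6) $4 }, the input
-- "let x = 1 in x" parses as the application (\x -> x) 1.
letExample :: Expr
letExample = App (Lam "x" (Var "x")) (Lit (LInt 1))

main :: IO ()
main = print letExample
```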
Type Provenance
---------------
We will use a technique of tracking the "flow" of type information through our
typechecker, associating the inferred types back to their position information
in the source.
Indentation
-----------
Haskell's syntax uses indentation blocks to delineate sections of code. This
use of indentation sensitive layout to convey the structure of logic is
sometimes called the *offside rule* in parsing literature. At the beginning of
"laidout" block the first declaration or definition can start in any column, and
the parser marks that indentation level. Every subsequent top-level declaration
must have the same indentation.
```haskell
-- Start of layout ( Column: 0 )
fib :: Int -> Int
fib x = truncate $ ( 1 / sqrt 5 ) * ( phi ^ x - psi ^ x ) -- (Column: > 0)
-- Start of new layout ( Column: 2 )
where
-- Indented block ( Column: > 2 )
phi = ( 1 + sqrt 5 ) / 2
psi = ( 1 - sqrt 5 ) / 2
```
The Parsec monad is parameterized over a type variable ``u`` which stands for
the user state baked into the monad, allowing us to embed custom parser state
inside of our rules. To adapt our parser to handle indentation-sensitive
whitespace we will store the current indentation level in this state.
```haskell
-- Indentation sensitive Parsec monad.
type IParsec a = Parsec Text ParseState a
data ParseState = ParseState
  { indents :: Column
  } deriving (Show)

initParseState :: ParseState
initParseState = ParseState 0
```
Inside of the Parsec the internal position state (SourcePos) is stored during
each traversal, and is accessible inside of rule logic via ``getPosition``
function.
```haskell
data SourcePos = SourcePos SourceName !Line !Column
getPosition :: Monad m => ParsecT s u m SourcePos
```
In terms of this function we can write logic that will query the current column
count and then either succeed or fail to match a pattern based on the current
indentation level. The ``laidout`` combinator will capture the current
indentation state and push it into the ``indents`` field of the parser state.
```haskell
laidout :: Parsec s ParseState a -> Parsec s ParseState a
laidout m = do
  cur <- indents <$> getState
  pos <- sourceColumn <$> getPosition
  modifyState $ \st -> st { indents = pos }
  res <- m
  modifyState $ \st -> st { indents = cur }
  return res
```
We then have specific logic which guards the parser match by comparing the
current indentation level to the stored indentation level.
```haskell
indentCmp
  :: (Column -> Column -> Bool)
  -> Parsec s ParseState ()
indentCmp cmp = do
  col <- sourceColumn <$> getPosition
  current <- indents <$> getState
  guard (col `cmp` current)
We can then write two combinators in terms of this function which match on
either a positive or an identical indentation difference.
```haskell
indented :: IParsec ()
indented = indentCmp (>) <?> "Block (indented)"
align :: IParsec ()
align = indentCmp (==) <?> "Block (same indentation)"
```
On top of these we write our two combinators for handling block syntax, which
match a sequence of vertically aligned patterns as a list.
```haskell
block, block1 :: Parser a -> Parser [a]
block p = laidout (many (align >> p))
block1 p = laidout (many1 (align >> p))
```
GHC uses an optional layout rule for several constructs, allowing us to
equivalently manually delimit indentation sensitive syntax with braces. The most
common is for do-notation. So for example:
```haskell
example = do { a <- m; b }
example = do
a <- m
b
```
To support this in Parsec style we implement a ``maybeBraces`` function.
```haskell
maybeBraces :: Parser a -> Parser [a]
maybeBraces p = braces (endBy p semi) <|> block p
maybeBraces1 :: Parser a -> Parser [a]
maybeBraces1 p = braces (endBy1 p semi) <|> block1 p
```
Error Reporting
---------------
Parsec's default error reporting leaves a bit to be desired, but does in fact
contain most of the information needed to deliver better messages packed inside
the ParseError structure.
```haskell
showSyntaxError :: L.Text -> ParseError -> String
showSyntaxError s err = L.unpack $ L.unlines
  [ " "
  , " " <> lineContents
  , " " <> ((L.replicate col " ") <> "^")
  , (L.pack $ show err)
  ]
  where
    lineContents = (L.lines s) !! line
    pos  = errorPos err
    line = sourceLine pos - 1
    col  = fromIntegral $ sourceColumn pos - 1
Now when we enter an invalid expression the error reporting will point us
directly to the adjacent lexeme that caused the problem as is common in many
languages.
```bash
λ> \x -> x +
\x -> x +
^
"<interactive>" (line 1, column 11):
unexpected end of input
expecting "(", character, literal string, "[", integer, "if" or identifier
```
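The caret-placement arithmetic can be checked in isolation. This is a stripped-down sketch of the rendering logic, assuming a plain 1-based (line, column) pair instead of Parsec's ``ParseError``:

```haskell
-- Render the offending source line with a caret under the error column.
-- The (line, column) pair here is a stand-in for Parsec's errorPos.
showCaret :: String -> (Int, Int) -> String
showCaret src (ln, col) = unlines
  [ "  " ++ (lines src !! (ln - 1))
  , "  " ++ replicate (col - 1) ' ' ++ "^"
  ]

main :: IO ()
main = putStr (showCaret "\\x -> x +" (1, 10))
```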
Extensible Operators
--------------------
Haskell famously allows the definition of custom infix operators, an extremely
useful language feature, although this poses a bit of a challenge to parse!
There are two ways to do this and both depend on two properties of the
operators.
* Precedence
* Associativity
The first, which is the way GHC does it, is to parse all operators as left
associative and of the same precedence, and then before desugaring go back and
"fix" the parse tree using all the information we collected after finishing
parsing.
The second method is a bit of a hack, and involves simply storing the collected
operators inside of the Parsec state monad and then calling
``buildExpressionParser`` on the current state each time we want to parse an
infix operator expression.
```haskell
data FixitySpec = FixitySpec
  { fixityFix :: Fixity
  , fixityName :: String
  } deriving (Eq, Show)

data Assoc
  = L
  | R
  | N
  deriving (Eq,Ord,Show)

data Fixity
  = Infix Assoc Int
  | Prefix Int
  | Postfix Int
  deriving (Eq,Ord,Show)
```
In our parser:
```haskell
fixityPrec :: FixitySpec -> Int
fixityPrec (FixitySpec (Infix _ n) _) = n
fixityPrec (FixitySpec _ _) = 0

mkTable ops =
  map (map toParser) $
    groupBy ((==) `on` fixityPrec) $
      reverse $ sortBy (compare `on` fixityPrec) $ ops

toParser (FixitySpec ass tok) = case ass of
  Infix L _ -> infixOp tok (op (Name tok)) Ex.AssocLeft
  Infix R _ -> infixOp tok (op (Name tok)) Ex.AssocRight
  Infix N _ -> infixOp tok (op (Name tok)) Ex.AssocNone
```
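The grouping step in ``mkTable`` can be checked on its own: operators are sorted by precedence (highest first) and grouped into tiers, which is the shape ``buildExpressionParser`` expects for its operator table. A self-contained sketch, using plain (name, precedence) pairs in place of ``FixitySpec``:

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)

-- Sort operators by precedence descending, then group equal precedences
-- into tiers, mirroring the structure of mkTable above.
precGroups :: [(String, Int)] -> [[String]]
precGroups ops =
  map (map fst) $
    groupBy ((==) `on` snd) $
      reverse $ sortBy (compare `on` snd) ops

main :: IO ()
main = print (precGroups [("+", 6), ("-", 6), ("*", 5), ("==", 4)])
```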
```haskell
data ParseState = ParseState
  { indents :: Column
  , fixities :: [FixitySpec]
  } deriving (Show)

initParseState :: ParseState
initParseState = ParseState 0 defaultOps

defaultOps :: [FixitySpec]
defaultOps = [
    FixitySpec (Infix L 4) ">"
  , FixitySpec (Infix L 4) "<"
  , FixitySpec (Infix L 4) "/="
  , FixitySpec (Infix L 4) "=="
  , FixitySpec (Infix R 5) ":"
  , FixitySpec (Infix L 6) "+"
  , FixitySpec (Infix L 6) "-"
  , FixitySpec (Infix L 5) "*"
  , FixitySpec (Infix L 5) "/"
  ]

addOperator :: FixitySpec -> Parsec s ParseState ()
addOperator fixdecl = do
  modifyState $ \st -> st { fixities = fixdecl : (fixities st) }
```
Now when parsing an infix operator declaration we simply modify the state and
add the operator.
```haskell
fixityspec :: Parser FixitySpec
fixityspec = do
  fix  <- fixity
  prec <- precedence
  op   <- operator
  semi
  let spec = FixitySpec (fix prec) op
  addOperator spec
  return spec
  where
    fixity = Infix L <$ reserved "infixl"
         <|> Infix R <$ reserved "infixr"
         <|> Infix N <$ reserved "infix"

precedence :: Parser Int
precedence = do
  n <- natural
  if n <= 10
    then return (fromInteger n)
    else empty <?> "Invalid operator precedence"

fixitydecl :: Parser Decl
fixitydecl = (do
    spec <- fixityspec
    return $ FixityDecl spec)
  <?> "operator fixity definition"
```
Full Source
-----------
* [Happy Parser](https://github.com/sdiehl/write-you-a-haskell/tree/master/chapter9/happy)
Resources
---------
The tooling and documentation for Alex and Happy is well-developed as it is used
extensively inside of GHC:
* [Alex User Guide](https://www.haskell.org/alex/doc)
* [Happy User Guide](https://www.haskell.org/happy/doc/html/)
* [A Tool for Generalized LR Parsing In Haskell](http://www.benmedlock.co.uk/Functional_GLR_Parsing.pdf)
* [Haskell Syntax Definition](https://www.haskell.org/onlinereport/haskell2010/haskellch10.html#x17-17500010)
GHC itself uses Alex and Happy for its parser infrastructure. The resulting
parser is rather sophisticated.
* [Lexer.x](https://github.com/ghc/ghc/blob/master/compiler/parser/Lexer.x)
* [Parser.y](https://github.com/ghc/ghc/blob/master/compiler/parser/Parser.y)
\clearpage

<p class="halfbreak">
</p>
Datatypes
=========
Syntax
------
GHC.Generics
------------
```haskell
class Generic a where
  type Rep a :: * -> *
  from :: a -> Rep a x
  to :: Rep a x -> a
```
Constructor  Models
-----------  ----------------------------------------------------
``V1``       Void: used for datatypes without constructors
``U1``       Unit: used for constructors without arguments
``K1``       Constants, additional parameters.
``:*:``      Products: encode multiple arguments to constructors
``:+:``      Sums: encode choice between constructors
``L1``       Left hand side of a sum.
``R1``       Right hand side of a sum.
``M1``       Meta-information (constructor names, etc.)
```haskell
newtype M1 i c f p = M1 (f p)
newtype K1 i c p = K1 c
data U1 p = U1
```
```haskell
data (:*:) a b p = a p :*: b p
data (:+:) a b p = L1 (a p) | R1 (b p)
```
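Deriving ``Generic`` lets GHC build the sum-of-products representation from these constructors automatically, and ``from``/``to`` round-trip values through it. A minimal sketch (the ``Animal`` type is illustrative):

```haskell
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics

-- A small sum type; its Rep is built from the U1/:+:/M1 pieces above.
data Animal = Dog | Cat deriving (Show, Generic)

-- Round-trip a value through its generic representation.
main :: IO ()
main = print (to (from Dog) :: Animal)
```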
Full Source
-----------
\clearpage

<p class="halfbreak">
</p>
Renamer
=======
Uniplate
--------
Full Source
-----------
\clearpage

<p class="halfbreak">
</p>
Pattern Matching
================
Full Source
-----------
\clearpage

<div class="pagetitle">
![](img/titles/systemf.png)
</div>
<p class="halfbreak">
</p>
> *Conventional programming languages are growing ever more enormous, but not
> stronger. Inherent defects at the most basic level cause them to be both fat
> and weak...*
>
> <cite>— John Backus</cite>
<p class="halfbreak">
</p>
System F
========
The *second order lambda calculus* or System F of Girard and Reynolds is an
extension of the simply typed lambda calculus that adds an additional level of
lambda abstraction and type application.
$$
\begin{aligned}
\tau ::=\ & \tau \to \tau \\
& \alpha \\
& \forall \overline \alpha . \tau \\
\end{aligned}
$$
$$
\begin{aligned}
e ::=\ & x \\
& \lambda x:\tau . e \\
& \Lambda \alpha . e \\
& e\ e \\
& e\ \tau \\
\end{aligned}
$$
$$
\begin{aligned}
\Gamma ::=\ & \varnothing \\
& \Gamma,\ x : \tau \\
& \Gamma,\ \alpha \\
\end{aligned}
$$
Examples
--------
In GHC Core's language the identity function is implemented precisely in this
form, with the only exception being that it uses a backslash for both the
value-level and type-level lambda. The distinction is that a $\Lambda$ is
parameterized by a type variable (indicated by ``@ t``) while a $\lambda$ is
parameterized over a value variable.
```haskell
\ (@ t) (x :: t) -> x
```
So for some examples:
**Identity function**:
In System F notation:
$$
\begin{aligned}
\mathtt{id}\ & ::\ \forall \alpha. \alpha \to \alpha \\
\mathtt{id}\ & =\ \Lambda \alpha. \lambda x : \alpha. x
\end{aligned}
$$
In GHC Core Notation:
```haskell
id :: forall a. a -> a
id = \ (@ a) (x :: a) -> x
```
**Compose function**:
In System F notation:
$$
\begin{aligned}
\mathtt{compose}\ & ::\ \forall \alpha \beta \gamma. (\beta \to \gamma) \to (\alpha \to \beta) \to \alpha \to \gamma \\
\mathtt{compose}\ & =\
\Lambda \alpha \beta \gamma.\
\ \lambda \ (f: \beta \to \gamma)
\ (g: \alpha \to \beta)
\ (x: \alpha).
\ f (g\ x)
\end{aligned}
$$
In GHC Core Notation:
```haskell
compose :: forall a b c. (b -> c) -> (a -> b) -> a -> c
compose =
\ (@ b) (@ c) (@ a)
(f :: b -> c) (g :: a -> b) (x :: a)
-> f (g x)
```
Datatypes
---------
Previously when working with our Hindley-Milner type system we've had to "bolt
on" the primitive datatypes into the language itself. Now in System F primitive
datatypes are actually definable.
$$ \t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v $$
Rules
-----
$$
\begin{array}{cl}
\infrule{x:\tau \in \Gamma}{\Gamma \vdash x:\tau} & \trule{T-Var} \\ \\
\infrule{\Gamma, x : \tau_1 \vdash e : \tau_2}{\Gamma \vdash \lambda x : \tau_1 . e : \tau_1 \rightarrow \tau_2 } & \trule{T-Lam} \\ \\
\infrule{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \andalso \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash e_1 e_2 : \tau_2} & \trule{T-App} \\ \\
\infrule{\Gamma, \alpha \vdash e : \tau}{\Gamma \vdash \Lambda \alpha. e : \forall \alpha. \tau } & \trule{T-TAbs} \\ \\
\infrule{\Gamma \vdash e : \forall \alpha. \tau_1}{\Gamma \vdash e\ \tau_2 : [\alpha / \tau_2] \tau_1} & \trule{T-TApp} \\ \\
\end{array}
$$
Prenex Restriction
------------------
System F is strictly more general than our previous Hindley-Milner type system,
in the sense that every Hindley-Milner term can be expressed in System F.
Implicitly though we've made an assumption that quantifiers can only occur at
the front of the type signature, in the *prenex position*. Under this
restriction we can't write a function which takes a polymorphic function as an
argument or returns a polymorphic function as a result.
When Haskell's typechecker infers a type signature it places all quantifiers of
type variables at the outermost position, such that no quantifiers appear
within the body of the type expression. This is called the *prenex
restriction*. It rules out an entire class of type signatures that would
otherwise be expressible within System F, but has the benefit of making
inference tractable.
```haskell
-- Allowed
a :: forall a. (a -> a) -> a -> a
a f x = f x
-- Forbidden
a :: (forall a. a -> a) -> b -> b
a f x = f x
-- Forbidden
a :: a -> (forall b. b -> b)
a x = (\x -> x)
-- Forbidden
a :: (forall a. a -> a) -> (forall b. b -> b)
a x = x
```
The concept of *polymorphism rank* falls out of this notion. Simply put, the
level of nesting of the quantifier inside the type specifies the rank of the
type signature.
* Rank 0: ``t``
* Rank 1: ``forall a. a -> t``
* Rank 2: ``(forall a. a -> t) -> t``
* Rank 3: ``((forall a. a -> t) -> t) -> t``
The term *rank-n polymorphism* indicates a type system permitting polymorphism
of arbitrary rank, allowing the quantifier to appear anywhere and exposing the
entire expressible space of System F types.
In GHC's implementation, for a lambda-bound or case-bound variable the user
must provide an explicit annotation, or GHC's type inference will assume that
the type has no ``forall``s in it and must have a principal rank-1 type, which
may not exist.
```haskell
{-# LANGUAGE RankNTypes #-}
-- No annotation: cannot unify Bool with Char
example1 f = (f True, f 'c')
-- Type checks!
example2 :: (forall a. a -> a) -> (Bool, Char)
example2 f = (f True, f 'c')
```
The language extension ``-XRankNTypes`` loosens the prenex restriction such that
we may explicitly place quantifiers within the body of the type. The bad news is
that type inference in this relaxed system is undecidable in general, so we're
required to explicitly annotate functions which use ``RankNTypes``, or they are
otherwise inferred as rank-1 and may not typecheck at all.
As noted before to fully implement the dictionaries for monad typeclasses we
will need at least rank-2 polymorphism so that the functions specified in the
``DMonad m`` can be instantiated at arbitrary types for ``a`` and ``b``.
```haskell
{-# LANGUAGE RankNTypes #-}
data DMonad m = DMonad
  { return :: forall a. a -> m a
  , bind   :: forall a b. m a -> (a -> m b) -> m b
  }

data Maybe a = Nothing | Just a

bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe (Just x) f = f x
bindMaybe Nothing f = Nothing

returnMaybe :: a -> Maybe a
returnMaybe x = Just x

-- Maybe monad explicit dictionary.
maybe :: DMonad Maybe
maybe = DMonad
  { bind = bindMaybe
  , return = returnMaybe
  }
```
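Code can then be written against the explicit dictionary rather than the ``Monad`` class. A runnable sketch of the same idea (the fields are renamed ``dreturn``/``dbind`` here only to avoid shadowing the Prelude names, and ``addPair`` is an illustrative helper):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A rank-2 dictionary: each field is itself polymorphic.
data DMonad m = DMonad
  { dreturn :: forall a. a -> m a
  , dbind   :: forall a b. m a -> (a -> m b) -> m b
  }

-- The Maybe instance as an explicit record value.
maybeD :: DMonad Maybe
maybeD = DMonad
  { dreturn = Just
  , dbind   = \m f -> case m of { Just x -> f x; Nothing -> Nothing }
  }

-- Monadic code parameterized over the dictionary instead of a class.
addPair :: DMonad m -> m Int -> m Int -> m Int
addPair d mx my =
  dbind d mx (\x -> dbind d my (\y -> dreturn d (x + y)))

main :: IO ()
main = print (addPair maybeD (Just 1) (Just 2))
```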
Type Variables
--------------
Categories of types
* Polytypes $\sigma$
* Rho-types $\rho$
* Monotypes $\tau$
Categories of type variables
* Meta type variables: $\tau_1, \tau_2$
* Bound type variables: $a, b$
* Skolem type variables
Substitution
------------
Erasure
-------
The type erasure:
$$
\begin{aligned}[lcl]
& \t{erase}(x) &=&\ x \\
& \t{erase}(e_1\ e_2) &=&\ \t{erase}(e_1)\ \t{erase}(e_2) \\
& \t{erase}(\lambda x:t.e) &=&\ \lambda x . \t{erase}(e) \\
& \t{erase}(\Lambda \alpha . e) &=&\ \t{erase}(e) \\
& \t{erase}(e\ t) &=&\ \t{erase}(e) \\
\end{aligned}
$$
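The erasure equations transcribe directly into a function over a small System F AST. A sketch (the AST types and constructor names below are illustrative, not the book's):

```haskell
-- A minimal System F AST and its untyped target.
data Ty = TVar String | TArr Ty Ty deriving (Eq, Show)

data SysF
  = Var String
  | App SysF SysF
  | Lam String Ty SysF   -- \x:t. e
  | TyLam String SysF    -- /\a. e
  | TyApp SysF Ty        -- e t
  deriving (Eq, Show)

data Untyped
  = UVar String
  | UApp Untyped Untyped
  | ULam String Untyped
  deriving (Eq, Show)

-- One clause per erasure equation above: drop annotations, type
-- abstractions, and type applications.
erase :: SysF -> Untyped
erase (Var x)     = UVar x
erase (App a b)   = UApp (erase a) (erase b)
erase (Lam x _ e) = ULam x (erase e)
erase (TyLam _ e) = erase e
erase (TyApp e _) = erase e

-- erase (/\a. \x:a. x) == \x. x
main :: IO ()
main = print (erase (TyLam "a" (Lam "x" (TVar "a") (Var "x"))))
```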
Evaluation
----------
unbound-generics
----------------
Up until now we've been writing our own binding implementation. There is
however a better way: the ``unbound-generics`` library mechanizes
capture-avoiding substitution and alpha-equivalence for us.
\clearpage

<div class="pagetitle">
![](img/titles/llvm.png)
</div>
<p class="halfbreak">
</p>
<!--
> *Conventional programming languages are growing ever more enormous, but not
> stronger. Inherent defects at the most basic level cause them to be both fat
> and weak...*
>
> <cite>— John Backus</cite>
-->
<p class="halfbreak">
</p>
LLVM
====
LLVM is a statically typed intermediate representation and an associated
toolchain for manipulating, optimizing and converting this intermediate form
into native code.
So for example consider a simple function which masks two of its arguments
against a third and xors the results. Written in IR it would be formed as such:
~~~~ {.haskell include="chapter27/example.ll"}
~~~~
Running this through the LLVM toolchain we can lower our high-level IR into
assembly for multiple different architectures and CPUs, all from the same
platform-agnostic intermediate representation.
**x86-64**
```perl
test1:
.cfi_startproc
andl %edx, %esi
andl %edx, %edi
xorl %esi, %edi
movl %edi, %eax
ret
```
**ARM**
```perl
test1:
and r1, r2, r1
and r0, r2, r0
eor r0, r0, r1
mov pc, lr
```
**PowerPC**
```perl
.L.test1:
.cfi_startproc
and 4, 5, 4
and 3, 5, 3
xor 3, 3, 4
blr
.long 0
.quad 0
```
An uncommonly large number of hardware manufacturers and software vendors
(Adobe, AMD, Apple, ARM, Google, IBM, Intel, Mozilla, Qualcomm, Samsung,
Xilinx) have converged on the LLVM toolchain as a vendor-agnostic way to talk
about generating machine code.
What's even more impressive is that many of the advances in compiler
optimizations and static analysis have been mechanized in the form of
optimization passes so that all compilers written on top of the LLVM platform
can take advantage of the same advanced optimizers that would often previously
have to be developed independently.
Types
-----
**Primitive**
```perl
i1 ; Boolean type
i8 ; char
i32 ; 32 bit integer
i64 ; 64 bit integer
float ; 32 bit
double ; 64 bit
```
**Arrays**
```perl
[10 x float] ; Array of 10 floats
[10 x [20 x i32]] ; Array of 10 arrays of 20 integers.
```
**Structs**
```perl
{float, i64} ; structure
{float, {double, i3}} ; nested structure
<{float, [2 x i3]}> ; packed structure
```
**Vectors**
```perl
<4 x double>
<8 x float>
```
**Pointers**
```perl
float* ; Pointer to a float
[25 x float]* ; Pointer to an array
```
The traditional ``void*`` pointer in C is a ``i8*`` pointer in LLVM with the
appropriate casts.
**Constants**
```perl
[i1 true, i1 false] ; constant bool array
<i32 42, i32 10> ; constant vector
float 1.23421e+2 ; floating point constant
null ; null pointer constant
```
The ``zeroinitializer`` constant can be used to instantiate the appropriate
zero value of any type.
```perl
<8 x float> zeroinitializer ; Zero vector
```
**Named Types**
```perl
%vec4 = type <4 x i32>
%pair = type { i32, i32 }
```
Recursive type declarations are supported.
```perl
%f = type { %f*, i32 }
```
**Platform Information**
```perl
target datalayout = "
e-
p : 64 : 64 : 64-
i1 : 8 : 8-
i8 : 8 : 8-
i16 : 16 : 16-
i32 : 32 : 32-
i64 : 64 : 64-
f32 : 32 : 32-
f64 : 64 : 64-
v64 : 64 : 64-
v128 : 128 : 128-
a0 : 0 : 64-
s0 : 64 : 64-
f80 : 128 : 128-
n8 : 16 : 32 : 64-
S128
"
target triple = "x86_64-unknown-linux-gnu"
```
Specifications are delimited by the minus sign ``-``.
* The ``e`` indicates the platform is little-endian.
* The ``i<n>`` indicates the bitsize and alignment of the integer type.
* The ``f<n>`` indicates the bitsize and alignment of the floating point type.
* The ``p<n>`` indicates the bitsize and alignment of the pointer type.
* The ``v<n>`` indicates the bitsize and alignment of the vector type.
* The ``a<n>`` indicates the bitsize and alignment of the aggregate type.
* The ``n<n>`` indicates the widths of the CPU registers.
* The ``S<n>`` indicates the alignment of the stack.
Variables
---------
Symbols used in an LLVM module are either global or local. Global symbols begin
with ``@`` and local symbols begin with ``%``. All symbols must be defined or
forward declared.
Instructions in LLVM are either numbered sequentially (``%0``, ``%1``, ...) or
given explicit variable names (``%a``, ``%foo``, ...). For example the
arguments to the following function are named values, while the result of the
add instruction is unnamed.
```perl
define i32 @add(i32 %a, i32 %b) {
%1 = add i32 %a, %b
ret i32 %1
}
```
Instructions
------------
```perl
%result = add i32 10, 20
```
**Logical**
* ``shl``
* ``lshr``
* ``ashr``
* ``and``
* ``or``
* ``xor``
**Binary Operators**
* ``add``
* ``fadd``
* ``sub``
* ``fsub``
* ``mul``
* ``fmul``
* ``udiv``
* ``sdiv``
* ``fdiv``
* ``urem``
* ``srem``
* ``frem``
**Comparison**
op unsigned signed floating
-- --------- -------- --------
lt ULT SLT OLT
gt UGT SGT OGT
le ULE SLE OLE
ge UGE SGE OGE
eq EQ EQ OEQ
ne NE NE ONE
```perl
%c = udiv i32 %a, %b
%d = sdiv i32 %a, %b
%e = fmul float %a, %b
%f = fdiv float %a, %b
```
```perl
%g = icmp eq i32 %a, %b
%i = icmp slt i32 %a, %b
%j = icmp ult i32 %a, %b
%k = fcmp olt float, %a, %b
```
Data
----
```perl
i1 1
i32 299792458
float 7.29735257e-3
double 6.62606957e-34
```
Blocks
------
Function definitions in LLVM introduce a sequence of labeled *basic blocks*
containing any number of instructions and a final *terminator* instruction which
indicates how control flow yields after the instructions of the basic block are
evaluated.
```perl
define i1 @foo() {
entry:
br label %next
next:
br label %return
return:
ret i1 0
}
```
A basic block has either zero (for entry block) or a fixed number of
*predecessors*. A graph with basic blocks as nodes and the predecessors of each
basic block as edges constitutes a *control flow graph*. LLVM's opt command can
be used to dump this graph using graphviz.
```bash
$ opt -view-cfg module.ll
$ dot -Tpng module.dot -o module.png
```
We say a basic block A *dominates* a different block B in the control flow
graph if it's impossible to reach B without passing through A; equivalently,
A is the *dominator* of B.
All logic in LLVM is written in *static single assignment* (SSA) form. Each
variable is assigned precisely once, and every variable is defined before it is
used. Updating any existing variable reference creates a new reference for the
resulting output.
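As a small illustration (the names ``%x1`` and ``%x2`` are invented for this sketch), mutating a variable twice in a C-like source becomes two distinct SSA names rather than one updated slot:

```perl
; x = 1; x = x + 1;  becomes:
%x1 = add i32 0, 1      ; first definition of x
%x2 = add i32 %x1, 1    ; the "update" is a fresh name
```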
Control Flow
------------
* Unconditional Branch
* Conditional Branch
* Switch
* Return
* Phi
\clearpage
**Return**
The ``ret`` instruction simply exits the current function, yielding the given
value to the virtual stack.
```perl
define i1 @foo() {
ret i1 0
}
```
![](chapter27/cfg/ret.png)
\clearpage
**Unconditional Branch**
The unconditional branch ``br`` simply jumps to any basic block local to the
function.
```perl
define i1 @foo() {
entry:
br label %next
next:
br label %return
return:
ret i1 0
}
```
![](chapter27/cfg/branch.png)
\clearpage
**Conditional Branch**
The conditional branch ``br`` jumps to one of two basic blocks based on whether
a test condition is ``true`` or ``false``. This corresponds to the logic of a
traditional "if statement".
```perl
define i32 @foo() {
start:
br i1 true, label %left, label %right
left:
ret i32 10
right:
ret i32 20
}
```
![](chapter27/cfg/cbranch.png)
\clearpage
**Switch**
The ``switch`` instruction jumps to one of any number of branches, based on
matching a value against a jump table mapping values to basic blocks.
```perl
define i32 @foo(i32 %a) {
entry:
switch i32 %a, label %default [
i32 0, label %f
i32 1, label %g
i32 2, label %h
]
f:
ret i32 1
g:
ret i32 2
h:
ret i32 3
default:
ret i32 0
}
```
![](chapter27/cfg/switch.png)
\clearpage
**Phi**
A ``phi`` node selects a value based on the predecessor of the current block.
```perl
define i32 @foo() {
start:
br i1 true, label %left, label %right
left:
%plusOne = add i32 0, 1
br label %merge
right:
br label %merge
merge:
%join = phi i32 [ %plusOne, %left ], [ -1, %right ]
ret i32 %join
}
```
![](chapter27/cfg/phi.png)
\clearpage
**Loops**
The traditional ``while`` and ``for`` loops can be written in terms of the
simpler conditional branching constructs. For example in C we would write:
```cpp
int count(int n)
{
int i = 0;
while(i < n)
{
i++;
}
return i;
}
```
Whereas in LLVM we write:
```perl
define i32 @count(i32 %n) {
entry:
br label %loop
loop:
%i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]
%nextvar = add i32 %i, 1
%cmptmp = icmp ult i32 %i, %n
%booltmp = zext i1 %cmptmp to i32
%loopcond = icmp ne i32 %booltmp, 0
br i1 %loopcond, label %loop, label %afterloop
afterloop:
ret i32 %i
}
```
![](chapter27/cfg/for.png)
\clearpage
**Select**
Selects the first value if the test value is true, the second if false.
```perl
%x = select i1 true, i8 10, i8 20 ; gives 10
%y = select i1 false, i8 10, i8 20 ; gives 20
```
Calls
-----
* ``ccc``: The C calling convention
* ``fastcc``: The fast calling convention
```perl
%result = call i32 @exp(i32 7)
```
Memory
------
LLVM uses the traditional load/store model:
* ``load``: Load a typed value from a given reference
* ``store``: Store a typed value in a given reference
* ``alloca``: Allocate a pointer to memory on the virtual stack
```perl
%ptr = alloca i32
store i32 3, i32* %ptr
%val = load i32* %ptr
```
Specific pointer alignment can be specified:
```perl
%ptr = alloca i32, align 1024
```
For allocating in main memory we use an external reference to the C stdlib
memory allocator, which gives us back an ``i8*`` pointer.
```perl
%ptr = call i8* @malloc(i32 %objectsize)
```
For structures:
```perl
extractvalue {i32, float} %a, 0 ; gives i32
extractvalue {i32, {float, double}} %a, 1, 1 ; gives double
extractvalue [2 x i32] %a, 0 ; yields i32
```
```perl
%x = insertvalue {i32, float} %b, float %val, 1         ; inserts %val into field 1 of %b
%y = insertvalue {i32, float} zeroinitializer, i32 1, 0 ; gives {i32 1, float 0.0}
```
GetElementPtr
-------------
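The ``getelementptr`` instruction computes the address of a sub-element of an aggregate type without accessing memory; the indices step through the pointer, then struct fields, then array elements in turn. A sketch (the ``%pair`` type and names are illustrative, using the typed-pointer syntax of this era of LLVM):

```perl
%pair = type { i32, [4 x i32] }

; address of element 2 of the array field of a %pair*
; indices: 0 steps through the pointer, 1 selects the array
; field, 2 selects the element
%elt = getelementptr %pair* %p, i32 0, i32 1, i32 2
```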
Casts
-----
* ``trunc``
* ``zext``
* ``sext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``
* ``fptrunc``
* ``fpext``
* ``ptrtoint``
* ``inttoptr``
* ``bitcast``
```perl
trunc i32 257 to i8 ; yields i8 1
zext i32 257 to i64 ; yields i64 257
sext i8 -1 to i16 ; yields i16 65535
bitcast <2 x i32> %a to i64 ; yields i64 %a
```
Toolchain
---------
```bash
$ llc example.ll -o example.s             # compile to native assembly
$ lli example.ll                          # execute in the interpreter
$ opt -S example.bc -o example.ll         # bitcode to readable IR
$ opt example.ll -o example.bc            # IR to bitcode
$ opt -O3 example.ll -o example.opt.ll -S # run optimizer
```
Individual modules can be linked together.
```bash
$ llvm-link a.ll b.ll -o c.ll -S
```
Link time optimization.
```bash
$ clang -O4 -emit-llvm a.c -c -o a.bc
$ clang -O4 -emit-llvm b.c -c -o b.bc
$ llvm-link a.bc b.bc -o all.bc
$ opt -std-compile-opts -std-link-opts -O3 all.bc -o optimized.bc
```
The clang project is a C compiler that targets LLVM as its intermediate
representation. In cases where we'd like to know how some specific C
construct maps into LLVM IR we can ask clang to dump its internal IR using the
``-emit-llvm`` flag.
```perl
# clang -emit-llvm -S add.c -o -
int add(int x)
{
return x+1;
}
```
```perl
; ModuleID = 'add.c'
define i32 @add(i32 %x) nounwind uwtable {
entry:
%x.addr = alloca i32, align 4
store i32 %x, i32* %x.addr, align 4
%0 = load i32* %x.addr, align 4
%add = add nsw i32 %0, 1
ret i32 %add
}
```
LLVM uses a C++ API under the hood of all these tools. If you need to work
directly with the API it can be useful to be able to expand the LLVM IR into
the equivalent C++ code.
```bash
$ llc example.ll -march=cpp -o -
```
~~~~ {.haskell include="chapter27/example.ll"}
~~~~
```cpp
Function* func_test1 = mod->getFunction("test1");
if (!func_test1) {
func_test1 = Function::Create(
/*Type=*/FuncTy_0,
/*Linkage=*/GlobalValue::ExternalLinkage,
/*Name=*/"test1", mod);
func_test1->setCallingConv(CallingConv::C);
}
AttrListPtr func_test1_PAL;
func_test1->setAttributes(func_test1_PAL);
{
Function::arg_iterator args = func_test1->arg_begin();
Value* int32_x = args++;
int32_x->setName("x");
Value* int32_y = args++;
int32_y->setName("y");
Value* int32_z = args++;
int32_z->setName("z");
BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "",func_test1,0);
BinaryOperator* int32_a = BinaryOperator::Create(
Instruction::And, int32_z, int32_x, "a", label_1);
BinaryOperator* int32_b = BinaryOperator::Create(
Instruction::And, int32_z, int32_y, "b", label_1);
BinaryOperator* int32_c = BinaryOperator::Create(
Instruction::Xor, int32_a, int32_b, "c", label_1);
ReturnInst::Create(mod->getContext(), int32_c, label_1);
}
```
llvm-general
------------
The LLVM bindings for Haskell are split across two packages:
* **llvm-general-pure** is a pure Haskell representation of the LLVM IR.
* **llvm-general** provides the FFI bindings to LLVM required for constructing the C representation of
the LLVM IR and for performing optimization and compilation.
``llvm-general-pure`` does not require the LLVM libraries to be available on the system.
GHCi is known to have linking issues with the FFI, which can lead to errors when
working with ``llvm-general``. If you end up with errors like the following, you are
likely using ``GHCi`` or ``runhaskell`` and it is unable to link against
your LLVM library. Compile with standalone ``ghc`` instead.
```bash
Loading package llvm-general-3.3.8.2
... linking
... ghc: /usr/lib/llvm-3.3/lib/libLLVMSupport.a: unknown symbol `_ZTVN4llvm14error_categoryE'
ghc: unable to load package `llvm-general-3.3.8.2'
```
Code Generation (LLVM)
======================
Resources
---------
* [LLVM Language Reference](http://llvm.org/docs/LangRef.html)
* [Implementing a JIT Compiled Language with Haskell and LLVM](http://www.stephendiehl.com/llvm/)
\clearpage

CONTRIBUTORS.md (new file):
Contributors
============
* Matthew Pickering
* Nick Sinopoli
* Nicolas Trangez
* Ingo Blechschmidt
* Rein Henrichs
* Ian Connolly
* Ben James
* Abe Voelker
* Paulo Tanimoto
* Brandon Williams
* Dmitry Ivanov

@@ -22,6 +22,7 @@ all: $(OBJ) top
$(PANDOC) --filter ${FILTER} -f $(IFORMAT) $(FLAGS) -o $@ $<
pdf: $(FILTER)
# $(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md contributing.md
$(PANDOC) --filter ${FILTER} -f $(IFORMAT) --template $(TEMPLATE_TEX) --latex-engine=xelatex $(FLAGS) -o WYAH.pdf title.md 0*.md
epub: $(FILTER)

@@ -18,6 +18,7 @@
[![Build Status](https://travis-ci.org/sdiehl/write-you-a-haskell.svg)](https://travis-ci.org/sdiehl/write-you-a-haskell)
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/sdiehl/write-you-a-haskell?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=body_badge)
[![MIT License](http://img.shields.io/badge/license-mit-blue.svg)](https://github.com/sdiehl/write-you-a-haskell/blob/master/LICENSE)
Read Online:

chapter10/.gitkeep (new empty file)
chapter10/generics.hs (new file):
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE DefaultSignatures #-}
import GHC.Generics
-- Auxiliary class
class GEq' f where
geq' :: f a -> f a -> Bool
instance GEq' U1 where
geq' _ _ = True
instance (GEq c) => GEq' (K1 i c) where
geq' (K1 a) (K1 b) = geq a b
instance (GEq' a) => GEq' (M1 i c a) where
geq' (M1 a) (M1 b) = geq' a b
instance (GEq' a, GEq' b) => GEq' (a :+: b) where
geq' (L1 a) (L1 b) = geq' a b
geq' (R1 a) (R1 b) = geq' a b
geq' _ _ = False
instance (GEq' a, GEq' b) => GEq' (a :*: b) where
geq' (a1 :*: b1) (a2 :*: b2) = geq' a1 a2 && geq' b1 b2
--
class GEq a where
geq :: a -> a -> Bool
default geq :: (Generic a, GEq' (Rep a)) => a -> a -> Bool
geq x y = geq' (from x) (from y)
instance GEq Char where geq = (==)
instance GEq Int where geq = (==)
instance GEq Float where geq = (==)

chapter11/.gitkeep (new empty file)
chapter12/.gitkeep (new empty file)
@@ -0,0 +1,24 @@
type Name = String
type TypeVar = String
type TypeCon = String
data Expr
= Lam Type Name Expr -- \x -> a
| Var Name -- x
| App Expr Expr -- a b
| TLam Name Expr -- /\ a . b
| TApp Expr Type -- a [ b ]
| Lit Literal -- 1
| Let Name Expr Expr -- let x = v in a
data Type
= TForall [Name] Type
| TArr Type Type
| TCon TypeCon
| TVar TypeVar
deriving (Show)
data Literal
= LitInt Integer
| LitChar Char
deriving (Eq, Ord, Show)

9
chapter27/cfg/branch.dot Normal file
View File

@ -0,0 +1,9 @@
digraph "CFG for 'foo' function" {
graph [ dpi = 72 ];
label="Unconditional Branch";
Node0x103c3c0 [shape=record,label="{start:\l br label %next\l}"];
Node0x103c3c0 -> Node0x1038a90;
Node0x1038a90 [shape=record,label="{next: \l br label %return\l}"];
Node0x1038a90 -> Node0x1038b90;
Node0x1038b90 [shape=record,label="{return: \l ret i1 false\l}"];
}

9
chapter27/cfg/branch.ll Normal file
View File

@ -0,0 +1,9 @@
define i1 @foo() {
start:
br label %next
next:
br label %return
return:
ret i1 0
}

chapter27/cfg/branch.png (new binary file, 6.1 KiB)

chapter27/cfg/cbranch.dot (new file):
digraph "CFG for 'foo' function" {
graph [ dpi = 72 ];
label="Conditional Branch";
Node0x1f793c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{<s0>T|<s1>F}}"];
Node0x1f793c0:s0 -> Node0x1f75a20;
Node0x1f793c0:s1 -> Node0x1f75b40;
Node0x1f75a20 [shape=record,label="{left: \l ret i32 10\l}"];
Node0x1f75b40 [shape=record,label="{right: \l ret i32 20\l}"];
}

chapter27/cfg/cbranch.ll (new file):
define i32 @foo() {
start:
br i1 true, label %left, label %right
left:
ret i32 10
right:
ret i32 20
}

chapter27/cfg/cbranch.png (new binary file, 6.7 KiB)

chapter27/cfg/for.dot (new file):
digraph "CFG for 'printstar' function" {
graph [ dpi = 72 ];
label="For loop";
Node0x11ea390 [shape=record,label="{entry:\l br label %loop\l}"];
Node0x11ea390 -> Node0x11e72d0;
Node0x11e72d0 [shape=record,label="{loop: \l %i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]\l %nextvar = add i32 %i, 1\l %cmptmp = icmp ult i32 %i, %n\l %booltmp = zext i1 %cmptmp to i32\l %loopcond = icmp ne i32 %booltmp, 0\l br i1 %loopcond, label %loop, label %afterloop\l|{<s0>T|<s1>F}}"];
Node0x11e72d0:s0 -> Node0x11e72d0;
Node0x11e72d0:s1 -> Node0x11e7540;
Node0x11e7540 [shape=record,label="{afterloop: \l ret i32 %i\l}"];
}

chapter27/cfg/for.ll (new file):
define i32 @printstar(i32 %n) {
entry:
br label %loop
loop:
%i = phi i32 [ 1, %entry ], [ %nextvar, %loop ]
%nextvar = add i32 %i, 1
%cmptmp = icmp ult i32 %i, %n
%booltmp = zext i1 %cmptmp to i32
%loopcond = icmp ne i32 %booltmp, 0
br i1 %loopcond, label %loop, label %afterloop
afterloop:
ret i32 %i
}

chapter27/cfg/for.png (new binary file, 15 KiB)

chapter27/cfg/phi.dot (new file):
digraph "CFG for 'foo' function" {
graph [ dpi = 72 ];
label="Phi node";
Node0x175c3c0 [shape=record,label="{start:\l br i1 true, label %left, label %right\l|{<s0>T|<s1>F}}"];
Node0x175c3c0:s0 -> Node0x1758a20;
Node0x175c3c0:s1 -> Node0x1758b40;
Node0x1758a20 [shape=record,label="{left: \l %plusOne = add i32 0, 1\l br label %merge\l}"];
Node0x1758a20 -> Node0x17593e0;
Node0x1758b40 [shape=record,label="{right: \l br label %merge\l}"];
Node0x1758b40 -> Node0x17593e0;
Node0x17593e0 [shape=record,label="{merge: \l %join = phi i32 [ %plusOne, %left ], [ -1, %right ]\l ret i32 %join\l}"];
}

chapter27/cfg/phi.ll (new file):
define i32 @foo() {
start:
br i1 true, label %left, label %right
left:
%plusOne = add i32 0, 1
br label %merge
right:
br label %merge
merge:
%join = phi i32 [ %plusOne, %left], [ -1, %right]
ret i32 %join
}

chapter27/cfg/phi.png (new binary file, 15 KiB)

chapter27/cfg/ret.dot (new file):
digraph "CFG for 'foo' function" {
graph [ dpi = 72 ];
label="Return";
Node0x259a2b0 [shape=record,label="{%0:\l ret i1 false\l}"];
}

chapter27/cfg/ret.ll (new file):
define i1 @foo() {
ret i1 0
}

chapter27/cfg/ret.png (new binary file, 2.1 KiB)

chapter27/cfg/switch.dot (new file):
digraph "CFG for 'foo' function" {
graph [ dpi = 72 ];
label="Switch";
Node0x1027390 [shape=record,label="{entry:\l switch i32 %a, label %default [\l i32 0, label %f\l i32 1, label %g\l i32 2, label %h\l ]\l|{<s0>def|<s1>0|<s2>1|<s3>2}}"];
Node0x1027390:s0 -> Node0x1024700;
Node0x1027390:s1 -> Node0x1024300;
Node0x1027390:s2 -> Node0x1024440;
Node0x1027390:s3 -> Node0x1023920;
Node0x1024300 [shape=record,label="{f: \l ret i32 1\l}"];
Node0x1024440 [shape=record,label="{g: \l ret i32 2\l}"];
Node0x1023920 [shape=record,label="{h: \l ret i32 3\l}"];
Node0x1024700 [shape=record,label="{default: \l ret i32 0\l}"];
}

chapter27/cfg/switch.ll (new file):
define i32 @foo(i32 %a) {
entry:
switch i32 %a, label %default [ i32 0, label %f
i32 1, label %g
i32 2, label %h ]
f:
ret i32 1
g:
ret i32 2
h:
ret i32 3
default:
ret i32 0
}

chapter27/cfg/switch.png (new binary file, 12 KiB)

chapter27/dsl/Codegen.hs (new file):
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Codegen where
import Data.Word
import Data.String
import Data.List
import Data.Function
import qualified Data.Map as Map
import Control.Monad.State
import Control.Applicative
import LLVM.General.AST
import LLVM.General.AST.Global
import qualified LLVM.General.AST as AST
import qualified LLVM.General.AST.Constant as C
import qualified LLVM.General.AST.Attribute as A
import qualified LLVM.General.AST.CallingConvention as CC
import qualified LLVM.General.AST.FloatingPointPredicate as FP
-------------------------------------------------------------------------------
-- Module Level
-------------------------------------------------------------------------------
newtype LLVM a = LLVM { unLLVM :: State AST.Module a }
deriving (Functor, Applicative, Monad, MonadState AST.Module )
runLLVM :: AST.Module -> LLVM a -> AST.Module
runLLVM = flip (execState . unLLVM)
emptyModule :: String -> AST.Module
emptyModule label = defaultModule { moduleName = label }
addDefn :: Definition -> LLVM ()
addDefn d = do
defs <- gets moduleDefinitions
modify $ \s -> s { moduleDefinitions = defs ++ [d] }
define :: Type -> String -> [(Type, Name)] -> Codegen a -> LLVM ()
define retty label argtys body = addDefn $
GlobalDefinition $ functionDefaults {
name = Name label
, parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False)
, returnType = retty
, basicBlocks = bls
}
where
bls = createBlocks $ execCodegen $ do
enter <- addBlock entryBlockName
void $ setBlock enter
body
external :: Type -> String -> [(Type, Name)] -> LLVM ()
external retty label argtys = addDefn $
GlobalDefinition $ functionDefaults {
name = Name label
, parameters = ([Parameter ty nm [] | (ty, nm) <- argtys], False)
, returnType = retty
, basicBlocks = []
}
---------------------------------------------------------------------------------
-- Types
-------------------------------------------------------------------------------
-- IEEE 754 double
double :: Type
double = FloatingPointType 64 IEEE
-------------------------------------------------------------------------------
-- Names
-------------------------------------------------------------------------------
type Names = Map.Map String Int
uniqueName :: String -> Names -> (String, Names)
uniqueName nm ns =
case Map.lookup nm ns of
Nothing -> (nm, Map.insert nm 1 ns)
Just ix -> (nm ++ show ix, Map.insert nm (ix+1) ns)
instance IsString Name where
fromString = Name . fromString
-------------------------------------------------------------------------------
-- Codegen State
-------------------------------------------------------------------------------
type SymbolTable = [(String, Operand)]
data CodegenState
= CodegenState {
currentBlock :: Name -- Name of the active block to append to
, blocks :: Map.Map Name BlockState -- Blocks for function
, symtab :: SymbolTable -- Function scope symbol table
, blockCount :: Int -- Count of basic blocks
, count :: Word -- Count of unnamed instructions
, names :: Names -- Name Supply
} deriving Show
data BlockState
= BlockState {
idx :: Int -- Block index
, stack :: [Named Instruction] -- Stack of instructions
, term :: Maybe (Named Terminator) -- Block terminator
} deriving Show
-------------------------------------------------------------------------------
-- Codegen Operations
-------------------------------------------------------------------------------
newtype Codegen a = Codegen { runCodegen :: State CodegenState a }
deriving (Functor, Applicative, Monad, MonadState CodegenState )
sortBlocks :: [(Name, BlockState)] -> [(Name, BlockState)]
sortBlocks = sortBy (compare `on` (idx . snd))
createBlocks :: CodegenState -> [BasicBlock]
createBlocks m = map makeBlock $ sortBlocks $ Map.toList (blocks m)
makeBlock :: (Name, BlockState) -> BasicBlock
makeBlock (l, (BlockState _ s t)) = BasicBlock l s (maketerm t)
where
maketerm (Just x) = x
maketerm Nothing = error $ "Block has no terminator: " ++ (show l)
entryBlockName :: String
entryBlockName = "entry"
emptyBlock :: Int -> BlockState
emptyBlock i = BlockState i [] Nothing
emptyCodegen :: CodegenState
emptyCodegen = CodegenState (Name entryBlockName) Map.empty [] 1 0 Map.empty
execCodegen :: Codegen a -> CodegenState
execCodegen m = execState (runCodegen m) emptyCodegen
fresh :: Codegen Word
fresh = do
i <- gets count
modify $ \s -> s { count = 1 + i }
return $ i + 1
instr :: Instruction -> Codegen (Operand)
instr ins = do
n <- fresh
let ref = (UnName n)
blk <- current
let i = stack blk
modifyBlock (blk { stack = i ++ [ref := ins] } )
return $ local ref
terminator :: Named Terminator -> Codegen (Named Terminator)
terminator trm = do
blk <- current
modifyBlock (blk { term = Just trm })
return trm
-------------------------------------------------------------------------------
-- Block Stack
-------------------------------------------------------------------------------
entry :: Codegen Name
entry = gets currentBlock
addBlock :: String -> Codegen Name
addBlock bname = do
bls <- gets blocks
ix <- gets blockCount
nms <- gets names
let new = emptyBlock ix
(qname, supply) = uniqueName bname nms
modify $ \s -> s { blocks = Map.insert (Name qname) new bls
, blockCount = ix + 1
, names = supply
}
return (Name qname)
setBlock :: Name -> Codegen Name
setBlock bname = do
modify $ \s -> s { currentBlock = bname }
return bname
getBlock :: Codegen Name
getBlock = gets currentBlock
modifyBlock :: BlockState -> Codegen ()
modifyBlock new = do
active <- gets currentBlock
modify $ \s -> s { blocks = Map.insert active new (blocks s) }
current :: Codegen BlockState
current = do
c <- gets currentBlock
blks <- gets blocks
case Map.lookup c blks of
Just x -> return x
Nothing -> error $ "No such block: " ++ show c
-------------------------------------------------------------------------------
-- Symbol Table
-------------------------------------------------------------------------------
assign :: String -> Operand -> Codegen ()
assign var x = do
lcls <- gets symtab
modify $ \s -> s { symtab = [(var, x)] ++ lcls }
getvar :: String -> Codegen Operand
getvar var = do
syms <- gets symtab
case lookup var syms of
Just x -> return x
Nothing -> error $ "Local variable not in scope: " ++ show var
-------------------------------------------------------------------------------
-- References
local :: Name -> Operand
local = LocalReference
global :: Name -> C.Constant
global = C.GlobalReference
externf :: Name -> Operand
externf = ConstantOperand . C.GlobalReference
-- Arithmetic and Constants
fadd :: Operand -> Operand -> Codegen Operand
fadd a b = instr $ FAdd a b []
fsub :: Operand -> Operand -> Codegen Operand
fsub a b = instr $ FSub a b []
fmul :: Operand -> Operand -> Codegen Operand
fmul a b = instr $ FMul a b []
fdiv :: Operand -> Operand -> Codegen Operand
fdiv a b = instr $ FDiv a b []
fcmp :: FP.FloatingPointPredicate -> Operand -> Operand -> Codegen Operand
fcmp cond a b = instr $ FCmp cond a b []
cons :: C.Constant -> Operand
cons = ConstantOperand
uitofp :: Type -> Operand -> Codegen Operand
uitofp ty a = instr $ UIToFP a ty []
toArgs :: [Operand] -> [(Operand, [A.ParameterAttribute])]
toArgs = map (\x -> (x, []))
-- Effects
call :: Operand -> [Operand] -> Codegen Operand
call fn args = instr $ Call False CC.C [] (Right fn) (toArgs args) [] []
alloca :: Type -> Codegen Operand
alloca ty = instr $ Alloca ty Nothing 0 []
store :: Operand -> Operand -> Codegen Operand
store ptr val = instr $ Store False ptr val Nothing 0 []
load :: Operand -> Codegen Operand
load ptr = instr $ Load False ptr Nothing 0 []
-- Control Flow
br :: Name -> Codegen (Named Terminator)
br val = terminator $ Do $ Br val []
cbr :: Operand -> Name -> Name -> Codegen (Named Terminator)
cbr cond tr fl = terminator $ Do $ CondBr cond tr fl []
phi :: Type -> [(Operand, Name)] -> Codegen Operand
phi ty incoming = instr $ Phi ty incoming []
ret :: Operand -> Codegen (Named Terminator)
ret val = terminator $ Do $ Ret (Just val) []

chapter27/dsl/JIT.hs (new file):
module JIT where
import Data.Int
import Data.Word
import Foreign.Ptr ( FunPtr, castFunPtr )
import Control.Monad.Error
import LLVM.General.Target
import LLVM.General.Context
import LLVM.General.CodeModel
import LLVM.General.Module as Mod
import qualified LLVM.General.AST as AST
import LLVM.General.PassManager
import LLVM.General.Transforms
import LLVM.General.Analysis
import qualified LLVM.General.ExecutionEngine as EE
foreign import ccall "dynamic" haskFun :: FunPtr (IO Double) -> (IO Double)
run :: FunPtr a -> IO Double
run fn = haskFun (castFunPtr fn :: FunPtr (IO Double))
jit :: Context -> (EE.MCJIT -> IO a) -> IO a
jit c = EE.withMCJIT c optlevel model ptrelim fastins
where
optlevel = Just 0 -- optimization level
model = Nothing -- code model ( Default )
ptrelim = Nothing -- frame pointer elimination
fastins = Nothing -- fast instruction selection
passes :: PassSetSpec
passes = defaultCuratedPassSetSpec { optLevel = Just 3 }
runJIT :: AST.Module -> IO (Either String AST.Module)
runJIT mod = do
withContext $ \context ->
jit context $ \executionEngine ->
runErrorT $ withModuleFromAST context mod $ \m ->
withPassManager passes $ \pm -> do
-- Optimization Pass
{-runPassManager pm m-}
optmod <- moduleAST m
s <- moduleLLVMAssembly m
putStrLn s
EE.withModuleInEngine executionEngine m $ \ee -> do
mainfn <- EE.getFunction ee (AST.Name "main")
case mainfn of
Just fn -> do
res <- run fn
putStrLn $ "Evaluated to: " ++ show res
Nothing -> return ()
-- Return the optimized module
return optmod

chapter27/dsl/Main.hs (new file):
import JIT
import Codegen
import qualified LLVM.General.AST as AST
import qualified LLVM.General.AST.Float as F
import qualified LLVM.General.AST.Constant as C
{-
; ModuleID = 'my cool jit'
define double @main() {
entry:
%1 = fadd double 1.000000e+01, 2.000000e+01
ret double %1
}
-}
initModule :: AST.Module
initModule = emptyModule "my cool jit"
logic = do
define double "main" [] $ do
let a = cons $ C.Float (F.Double 10)
let b = cons $ C.Float (F.Double 20)
res <- fadd a b
ret res
main = do
let ast = runLLVM initModule logic
runJIT ast
return ast

chapter27/dsl/dsl.cabal (new file):
name: dsl
version: 0.1.0.0
--synopsis:
--description:
license: MIT
license-file: LICENSE-MIT
author: Stephen Diehl
maintainer: stephen.m.diehl@gmail.com
copyright: 2015 Stephen Diehl
Category: Compilers
build-type: Simple
cabal-version: >=1.10
library
default-language: Haskell2010
build-depends:
base >= 4.6
, haskeline >= 0.7.1.2
, llvm-general == 3.4.4.*
, llvm-general-pure == 3.4.4.*
, mtl >= 2.2
, transformers

chapter27/example.cpp (new file):
// Generated by llvm2cpp - DO NOT MODIFY!
#include <llvm/LLVMContext.h>
#include <llvm/Module.h>
#include <llvm/DerivedTypes.h>
#include <llvm/Constants.h>
#include <llvm/GlobalVariable.h>
#include <llvm/Function.h>
#include <llvm/CallingConv.h>
#include <llvm/BasicBlock.h>
#include <llvm/Instructions.h>
#include <llvm/InlineAsm.h>
#include <llvm/Support/FormattedStream.h>
#include <llvm/Support/MathExtras.h>
#include <llvm/Pass.h>
#include <llvm/PassManager.h>
#include <llvm/ADT/SmallVector.h>
#include <llvm/Analysis/Verifier.h>
#include <llvm/Assembly/PrintModulePass.h>
#include <algorithm>
using namespace llvm;
Module* makeLLVMModule();
int main(int argc, char**argv) {
Module* Mod = makeLLVMModule();
verifyModule(*Mod, PrintMessageAction);
PassManager PM;
PM.add(createPrintModulePass(&outs()));
PM.run(*Mod);
return 0;
}
Module* makeLLVMModule() {
// Module Construction
Module* mod = new Module("example.ll", getGlobalContext());
// Type Definitions
std::vector<Type*>FuncTy_0_args;
FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
FuncTy_0_args.push_back(IntegerType::get(mod->getContext(), 32));
FunctionType* FuncTy_0 = FunctionType::get(
/*Result=*/IntegerType::get(mod->getContext(), 32),
/*Params=*/FuncTy_0_args,
/*isVarArg=*/false);
// Function Declarations
Function* func_test1 = mod->getFunction("test1");
if (!func_test1) {
func_test1 = Function::Create(
/*Type=*/FuncTy_0,
/*Linkage=*/GlobalValue::ExternalLinkage,
/*Name=*/"test1", mod);
func_test1->setCallingConv(CallingConv::C);
}
AttrListPtr func_test1_PAL;
func_test1->setAttributes(func_test1_PAL);
// Global Variable Declarations
// Constant Definitions
// Global Variable Definitions
// Function Definitions
// Function: test1 (func_test1)
{
Function::arg_iterator args = func_test1->arg_begin();
Value* int32_x = args++;
int32_x->setName("x");
Value* int32_y = args++;
int32_y->setName("y");
Value* int32_z = args++;
int32_z->setName("z");
BasicBlock* label_1 = BasicBlock::Create(mod->getContext(), "",func_test1,0);
// Block (label_1)
BinaryOperator* int32_a = BinaryOperator::Create(Instruction::And, int32_z, int32_x, "a", label_1);
BinaryOperator* int32_b = BinaryOperator::Create(Instruction::And, int32_z, int32_y, "b", label_1);
BinaryOperator* int32_c = BinaryOperator::Create(Instruction::Xor, int32_a, int32_b, "c", label_1);
ReturnInst::Create(mod->getContext(), int32_c, label_1);
}
return mod;
}

chapter27/example.ll (new file):
define i32 @test1(i32 %x, i32 %y, i32 %z) {
%a = and i32 %z, %x
%b = and i32 %z, %y
%c = xor i32 %a, %b
ret i32 %c
}

@@ -22,7 +22,7 @@ main :: IO ()
main = runInputT defaultSettings loop
where
loop = do
minput <- getInputLine "Stlc> "
minput <- getInputLine "Happy> "
case minput of
Nothing -> outputStrLn "Goodbye."
Just input -> (liftIO $ process input) >> loop

chapter9/assign/.gitignore (new file):
Lexer.hs
Parser.hs
Main

chapter9/assign/Eval.hs (new file):
module Eval (eval) where
import Syntax
import Control.Monad.State
import qualified Data.Map as Map
data Value
= VInt Int
| VUnit
instance Show Value where
show (VInt x) = show x
type Eval = StateT Env IO
type Env = [(String, Value)]
eval1 :: Expr -> Eval Value
eval1 expr = case expr of
Num a -> return (VInt a)
Var a -> do
env <- get
case lookup a env of
Just val -> return val
Nothing -> error "Not in scope"
Print a -> do
a' <- eval1 a
liftIO $ print a'
return VUnit
Assign ref val -> do
modify $ \s -> (ref, VInt val) : s
return VUnit
eval :: [Expr] -> IO ()
eval xs = evalStateT (mapM_ eval1 xs) []

chapter9/assign/Lexer.x (new file):
{
module Lexer (
Token(..),
scanTokens
) where
import Syntax
}
%wrapper "basic"
$digit = 0-9
$alpha = [a-zA-Z]
$eol = [\n]
tokens :-
-- Whitespace insensitive
$eol ;
$white+ ;
print { \s -> TokenPrint }
$digit+ { \s -> TokenNum (read s) }
\= { \s -> TokenEq }
$alpha [$alpha $digit \_ \']* { \s -> TokenSym s }
{
data Token
= TokenNum Int
| TokenSym String
| TokenPrint
| TokenEq
| TokenEOF
deriving (Eq,Show)
scanTokens = alexScanTokens
}

chapter9/assign/Main.hs (new file):
import Eval (eval)
import Parser (parseExpr)
import System.Environment
process :: String -> IO ()
process input = do
let ast = parseExpr input
case ast of
Right ast -> eval ast
Left err -> do
putStrLn "Parser Error:"
print err
main :: IO ()
main = do
args <- getArgs
case args of
[] -> putStrLn "Usage: assign <input file>"
[fname] -> do
contents <- readFile fname
process contents

chapter9/assign/Makefile (new file):
all:
alex Lexer.x
happy Parser.y
ghc --make Main -o Main
clean:
rm -f *.o *.hi Parser.hs Lexer.hs Main

chapter9/assign/Parser.y (new file):
{
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Parser (
parseExpr,
) where
import Lexer
import Syntax
import Control.Monad.Except
}
%name expr
%tokentype { Token }
%monad { Except String } { (>>=) } { return }
%error { parseError }
%token
int { TokenNum $$ }
var { TokenSym $$ }
print { TokenPrint }
'=' { TokenEq }
%%
terms
: term { [$1] }
| term terms { $1 : $2 }
term
: var { Var $1 }
| var '=' int { Assign $1 $3 }
| print term { Print $2 }
{
parseError :: [Token] -> Except String a
parseError (l:ls) = throwError (show l)
parseError [] = throwError "Unexpected end of Input"
parseExpr :: String -> Either String [Expr]
parseExpr input =
let tokenStream = scanTokens input in
runExcept (expr tokenStream)
}

@@ -0,0 +1,8 @@
module Syntax where
data Expr
= Var String
| Num Int
| Print Expr
| Assign String Int
deriving (Eq,Show)

@@ -0,0 +1,19 @@
name: assign
version: 0.1.0.0
build-type: Simple
extra-source-files: README.md
cabal-version: >=1.10
executable assign
build-depends:
base >= 4.6 && <4.7
, containers >= 0.5 && <0.6
, mtl >= 2.2
default-language: Haskell2010
main-is: Main.hs
Build-depends: base, array
build-tools: alex, happy
other-modules:
Parser,
Lexer

@@ -0,0 +1,6 @@
x = 4
print x
y = 5
print y
y = 6
print y

chapter9/happy/.gitignore (new file):
Lexer.hs
Parser.hs
Main

chapter9/happy/Eval.hs (new file):
module Eval where
import Syntax
import Control.Monad.Identity
import qualified Data.Map as Map
data Value
= VInt Integer
| VBool Bool
| VClosure String Expr (Eval.Scope)
instance Show Value where
show (VInt x) = show x
show (VBool x) = show x
show VClosure{} = "<<closure>>"
type Evaluate t = Identity t
type Scope = Map.Map String Value
eval :: Eval.Scope -> Expr -> Identity Value
eval env expr = case expr of
Lit (LInt x) -> return $ VInt (fromIntegral x)
Lit (LBool x) -> return $ VBool x
Var x -> return $ env Map.! x
Lam x body -> return (VClosure x body env)
App a b -> do
x <- eval env a
y <- eval env b
apply x y
Op op a b -> do
x <- eval env a
y <- eval env b
return $ binop op x y
binop :: Binop -> Value -> Value -> Value
binop Add (VInt a) (VInt b) = VInt (a+b)
binop Sub (VInt a) (VInt b) = VInt (a-b)
binop Mul (VInt a) (VInt b) = VInt (a*b)
binop Eql (VInt a) (VInt b) = VBool (a==b)
extend :: Scope -> String -> Value -> Scope
extend env v t = Map.insert v t env
apply :: Value -> Value -> Evaluate Value
apply (VClosure v t0 e) t1 = eval (extend e v t1) t0
apply _ _ = error "Tried to apply non-closure"
emptyScope :: Scope
emptyScope = Map.empty
runEval :: Expr -> Value
runEval x = runIdentity (eval emptyScope x)

chapter9/happy/Lexer.x (new file):
{
module Lexer (
Token(..),
scanTokens
) where
import Syntax
}
%wrapper "basic"
$digit = 0-9
$alpha = [a-zA-Z]
$eol = [\n]
tokens :-
-- Whitespace insensitive
$eol ;
$white+ ;
-- Comments
"#".* ;
-- Syntax
let { \s -> TokenLet }
True { \s -> TokenTrue }
False { \s -> TokenFalse }
in { \s -> TokenIn }
$digit+ { \s -> TokenNum (read s) }
"->" { \s -> TokenArrow }
\= { \s -> TokenEq }
\\ { \s -> TokenLambda }
[\+] { \s -> TokenAdd }
[\-] { \s -> TokenSub }
[\*] { \s -> TokenMul }
\( { \s -> TokenLParen }
\) { \s -> TokenRParen }
$alpha [$alpha $digit \_ \']* { \s -> TokenSym s }
{
data Token
= TokenLet
| TokenTrue
| TokenFalse
| TokenIn
| TokenLambda
| TokenNum Int
| TokenSym String
| TokenArrow
| TokenEq
| TokenAdd
| TokenSub
| TokenMul
| TokenLParen
| TokenRParen
| TokenEOF
deriving (Eq,Show)
scanTokens = alexScanTokens
}

chapter9/happy/Main.hs (new file):
import Eval
import Parser (parseExpr, parseTokens)
import Control.Monad.Trans
import System.Console.Haskeline
process :: String -> IO ()
process input = do
let tokens = parseTokens input
print tokens
let ast = parseExpr input
case ast of
Left err -> do
putStrLn "Parser Error:"
print err
Right ast -> print $ runEval ast
main :: IO ()
main = runInputT defaultSettings loop
where
loop = do
minput <- getInputLine "Happy> "
case minput of
Nothing -> outputStrLn "Goodbye."
Just input -> (liftIO $ process input) >> loop

chapter9/happy/Makefile (new file):
all:
alex Lexer.x
happy Parser.y
ghc --make Main -o Main
clean:
rm -f *.o *.hi Parser.hs Lexer.hs Main

chapter9/happy/Parser.y (new file):
{
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Parser (
parseExpr,
parseTokens,
) where
import Lexer
import Syntax
import Control.Monad.Except
}
-- Entry point
%name expr
-- Lexer structure
%tokentype { Token }
-- Parser monad
%monad { Except String } { (>>=) } { return }
%error { parseError }
-- Token Names
%token
let { TokenLet }
true { TokenTrue }
false { TokenFalse }
in { TokenIn }
NUM { TokenNum $$ }
VAR { TokenSym $$ }
'\\' { TokenLambda }
'->' { TokenArrow }
'=' { TokenEq }
'+' { TokenAdd }
'-' { TokenSub }
'*' { TokenMul }
'(' { TokenLParen }
')' { TokenRParen }
-- Operators
%left '+' '-'
%left '*'
%%
Expr : let VAR '=' Expr in Expr { App (Lam $2 $6) $4 }
| '\\' VAR '->' Expr { Lam $2 $4 }
| Form { $1 }
Form : Form '+' Form { Op Add $1 $3 }
| Form '-' Form { Op Sub $1 $3 }
| Form '*' Form { Op Mul $1 $3 }
| Fact { $1 }
Fact : Fact Atom { App $1 $2 }
| Atom { $1 }
Atom : '(' Expr ')' { $2 }
| NUM { Lit (LInt $1) }
| VAR { Var $1 }
| true { Lit (LBool True) }
| false { Lit (LBool False) }
{
parseError :: [Token] -> Except String a
parseError (l:ls) = throwError (show l)
parseError [] = throwError "Unexpected end of Input"
parseExpr :: String -> Either String Expr
parseExpr input =
let tokenStream = scanTokens input in
runExcept (expr tokenStream)
parseTokens :: String -> [Token]
parseTokens = scanTokens
}

chapter9/happy/Syntax.hs (new file):
module Syntax where
type Name = String
data Expr
= Lam Name Expr
| App Expr Expr
| Var Name
| Lit Lit
| Op Binop Expr Expr
deriving (Eq,Show)
data Lit
= LInt Int
| LBool Bool
deriving (Show, Eq, Ord)
data Binop = Add | Sub | Mul | Eql
deriving (Eq, Ord, Show)

@@ -0,0 +1,27 @@
name: happyParser
version: 0.1.0.0
license: MIT
license-file: LICENSE
author: Stephen Diehl
maintainer: stephen.m.diehl@gmail.com
build-type: Simple
extra-source-files: README.md
cabal-version: >=1.10
executable happyParser
build-depends:
base >= 4.6 && <4.7
, pretty >= 1.1 && <1.2
, parsec >= 3.1 && <3.2
, containers >= 0.5 && <0.6
, haskeline >= 0.7
, mtl
, transformers
default-language: Haskell2010
main-is: Main.hs
Build-depends: base, array
build-tools: alex, happy
other-modules:
Parser,
Lexer

chapter9/layout/Layout.hs (new file):
{-# LANGUAGE FlexibleContexts #-}
module Layout (
-- * Layout combinators
IParsec,
laidout,
indented,
align,
runIndentParser,
) where
import Data.Text.Lazy
import Text.Parsec (ParseError)
import Text.Parsec.Pos
import Text.Parsec.Prim hiding (State)
import Control.Monad.Identity
import Control.Applicative ((<$>))
-- Indentation sensitive Parsec monad.
type IParsec a = Parsec Text ParseState a
data ParseState = ParseState
{ indents :: Column
} deriving (Show)
initParseState :: ParseState
initParseState = ParseState 0
indentCmp
:: (Column -> Column -> Bool)
-> IParsec ()
indentCmp cmp = do
col <- sourceColumn <$> getPosition
current <- indents <$> getState
guard (col `cmp` current)
withIndent :: Monad m => Column -> Column -> ParsecT s ParseState m b -> ParsecT s ParseState m b
withIndent cur pos m = do
modifyState $ \st -> st { indents = pos }
res <- m
modifyState $ \st -> st { indents = cur }
return res
laidout :: Parsec s ParseState a -> Parsec s ParseState a
laidout m = do
cur <- indents <$> getState
pos <- sourceColumn <$> getPosition
res <- withIndent cur pos m
return res
indented :: IParsec ()
indented = indentCmp (>) <?> "Block (indented)"
align :: IParsec ()
align = indentCmp (==) <?> "Block (same indentation)"
runIndentParser
:: Stream Text Identity a
=> SourceName
-> IParsec a
-> Text
-> Either ParseError a
runIndentParser filePath p = runParser p initParseState filePath
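The combinators above compare the current source column against a reference column stored in the parser state: `laidout` records the column of a block's first token, then `align` demands equal columns and `indented` strictly greater ones. The same discipline can be sketched without Parsec (hypothetical base-only helpers, not the chapter's API): a block opens at the column of its first line, deeper lines are its indented continuations, and a line at or before the reference column closes it.

```haskell
-- Base-only sketch of the layout rule: group each line's
-- more-indented successors under it as a block.
indentOf :: String -> Int
indentOf = length . takeWhile (== ' ')

-- Pair every "aligned" header line with its indented continuation lines.
blocks :: [String] -> [(String, [String])]
blocks []       = []
blocks (l : ls) = (l, cont) : blocks rest
  where
    ref          = indentOf l           -- reference column, as in laidout
    (cont, rest) = span (\x -> indentOf x > ref) ls
```

For example, `blocks ["let x = 1", "  y = 2", "let z = 3"]` groups `"  y = 2"` under the first `let` and starts a fresh block at the second, mirroring how `align` delimits declarations at the same indentation.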


@ -0,0 +1,42 @@
module Eval where
import Syntax
import Control.Monad.Identity
import qualified Data.Map as Map
data Value
= VInt Integer
| VBool Bool
| VClosure String Expr TermEnv
type TermEnv = Map.Map String Value
type Interpreter t = Identity t
emptyTmenv :: TermEnv
emptyTmenv = Map.empty
instance Show Value where
show (VInt n) = show n
show (VBool n) = show n
show VClosure{} = "<<closure>>"
eval :: TermEnv -> Expr -> Interpreter Value
eval env expr = case expr of
Var _ x -> do
let Just v = Map.lookup x env
return v
Lam _ x body ->
return (VClosure x body env)
App _ fun arg -> do
VClosure x body clo <- eval env fun
argv <- eval env arg
let nenv = Map.insert x argv clo
eval nenv body
-- Literal case, so eval is total over the Expr constructors it can meet.
Lit _ n -> return (VInt (fromIntegral n))
runEval :: TermEnv -> String -> Expr -> (Value, TermEnv)
runEval env nm ex =
let res = runIdentity (eval env ex) in
(res, Map.insert nm res env)
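Note that the `Var` case uses the irrefutable `let Just v = Map.lookup x env`, which crashes at runtime on an unbound variable; that is only safe because the type checker rejects unbound variables before evaluation ever runs. A total variant (hypothetical `lookupVar'` sketch, not the chapter's code) would surface the error instead:

```haskell
import qualified Data.Map as Map

-- Total variant of the lookup in eval: report unbound variables
-- instead of failing a Just pattern match. The chapter instead relies
-- on the type checker to rule this case out before eval runs.
lookupVar' :: (Ord k, Show k) => k -> Map.Map k v -> Either String v
lookupVar' x env =
  case Map.lookup x env of
    Nothing -> Left ("Not in scope: " ++ show x)
    Just v  -> Right v
```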


@ -0,0 +1,205 @@
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Infer (
inferTop,
TypeError(..),
Env,
) where
import Control.Monad.Except
import Control.Monad.Reader
import Control.Monad.Writer
import Control.Monad.State
import qualified Data.Map as Map
import qualified Data.Set as Set
import Syntax
import Type
setLoc :: Loc -> Type -> Type
setLoc l (TVar _ a) = TVar l a
setLoc l (TCon _ a) = TCon l a
setLoc l (TArr _ a b) = TArr l a b
getLoc :: Type -> Loc
getLoc (TVar l _) = l
getLoc (TCon l _) = l
getLoc (TArr l _ _) = l
-------------------------------------------------------------------------------
-- Substitution
-------------------------------------------------------------------------------
type Unifier = (Subst, [Constraint])
type Constraint = (Type, Type)
type Env = [(Name, Type)]
extend :: (Name, Type) -> Env -> Env
extend xt env = xt : env
newtype Subst = Subst (Map.Map TVar Type)
deriving (Eq, Ord, Show, Monoid)
class Substitutable a where
apply :: Subst -> a -> a
ftv :: a -> Set.Set TVar
instance Substitutable Type where
apply _ (TCon l a) = TCon l a
apply (Subst s) t@(TVar l a) = Map.findWithDefault t a s
apply s (TArr l t1 t2) = TArr l (apply s t1) (apply s t2)
ftv TCon{} = Set.empty
ftv (TVar _ a) = Set.singleton a
ftv (TArr _ t1 t2) = ftv t1 `Set.union` ftv t2
instance Substitutable Constraint where
apply s (t1, t2) = (apply s t1, apply s t2)
ftv (t1, t2) = ftv t1 `Set.union` ftv t2
instance Substitutable a => Substitutable [a] where
apply = map . apply
ftv = foldr (Set.union . ftv) Set.empty
data TypeError
= UnificationFail Type Loc Type Loc
| InfiniteType TVar Type
| UnboundVariable String
| Ambigious [Constraint]
| UnificationMismatch [Type] [Type]
-------------------------------------------------------------------------------
-- Environment
-------------------------------------------------------------------------------
inEnv :: (Name, Type) -> Check a -> Check a
inEnv (x,t) = local (extend (x,t))
lookupVar :: Name -> Check Type
lookupVar x = do
env <- ask
case lookup x env of
Nothing -> throwError $ UnboundVariable x
Just s -> return s
letters :: [String]
letters = [1..] >>= flip replicateM ['a'..'z']
fresh :: Loc -> Check Type
fresh l = do
s <- get
put s{count = count s + 1}
return $ TVar l (TV (letters !! count s))
-------------------------------------------------------------------------------
-- Type Checker
-------------------------------------------------------------------------------
-- | Inference state
data InferState = InferState { count :: Int }
type Check =
WriterT [Constraint]
(StateT InferState
(ExceptT TypeError (Reader Env)))
check :: Expr -> Check Type
check expr = case expr of
Var l n -> do
t <- lookupVar n
return $ setLoc l t
App l a b -> do
ta <- check a
tb <- check b
tr <- fresh l
unify ta (TArr l tb tr)
return tr
Lam l n a -> do
tv <- fresh l
ty <- inEnv (n, tv) (check a)
return (TArr l (setLoc l tv) ty)
Lit l _ -> return $ TCon l "Int"
runCheck :: Env -> Check a -> Either TypeError (a, [Constraint])
runCheck env =
flip runReader env
. runExceptT
. flip evalStateT (InferState 0)
. runWriterT
inferTop :: Env -> Expr -> Either TypeError Type
inferTop env x = do
(ty, cs) <- runCheck env (check x)
s <- runSolve cs
return (apply s ty)
-------------------------------------------------------------------------------
-- Constraint Solving
-------------------------------------------------------------------------------
type Solve = StateT Unifier (Except TypeError)
runSolve :: [Constraint] -> Either TypeError Subst
runSolve cs = runExcept (evalStateT solver st)
where st = (emptySubst, cs)
-- Unification solver
solver :: Solve Subst
solver = do
(su, cs) <- get
case cs of
[] -> return su
((t1, t2): cs0) -> do
(su1, cs1) <- unifies t1 t2
put (su1 `compose` su, cs1 ++ (apply su1 cs0))
solver
-- | Empty unifier
emptyUnifer :: Unifier
emptyUnifer = (emptySubst, [])
-- | The empty substitution
emptySubst :: Subst
emptySubst = mempty
-- | Compose substitutions
compose :: Subst -> Subst -> Subst
(Subst s1) `compose` (Subst s2) = Subst $ Map.map (apply (Subst s1)) s2 `Map.union` s1
bind :: TVar -> Type -> Solve Unifier
bind a t
| eqLoc t a = return (emptySubst, [])
| occursCheck a t = throwError $ InfiniteType a t
| otherwise = return $ (Subst $ Map.singleton a t, [])
eqLoc :: Type -> TVar -> Bool
eqLoc (TVar _ a) b = a == b
eqLoc _ _ = False
occursCheck :: Substitutable a => TVar -> a -> Bool
occursCheck a t = a `Set.member` ftv t
unifies :: Type -> Type -> Solve Unifier
unifies t1 t2 | t1 == t2 = return emptyUnifer
unifies (TVar _ v) t = v `bind` t
unifies t (TVar _ v) = v `bind` t
unifies (TArr _ t1 t2) (TArr _ t3 t4) = unifyMany [t1, t2] [t3, t4]
unifies t1 t2 = throwError $ UnificationFail t1 (getLoc t1) t2 (getLoc t2)
unifyMany :: [Type] -> [Type] -> Solve Unifier
unifyMany [] [] = return emptyUnifer
unifyMany (t1 : ts1) (t2 : ts2) =
do (su1,cs1) <- unifies t1 t2
(su2,cs2) <- unifyMany (apply su1 ts1) (apply su1 ts2)
return (su2 `compose` su1, cs1 ++ cs2)
unifyMany t1 t2 = throwError $ UnificationMismatch t1 t2
-- | Unify two types
unify :: Type -> Type -> Check ()
unify t1 t2 = tell [(t1, t2)]
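The solver above threads a `(Subst, [Constraint])` pair: each step unifies one equation, composes the resulting most-general unifier into the accumulated substitution, and rewrites the remaining constraints. Stripped of source locations and the monad stack, the core algorithm can be sketched as follows (hypothetical minimal `Ty`/`Sub` types, not the chapter's):

```haskell
import qualified Data.Map as Map

-- Location-free sketch of the constraint solver: unify one equation
-- at a time, composing each unifier into the accumulator and applying
-- it to the remaining constraints.
data Ty = TV String | TC String | Ty :-> Ty deriving (Eq, Show)
infixr 5 :->

type Sub = Map.Map String Ty

applyS :: Sub -> Ty -> Ty
applyS s t@(TV a)  = Map.findWithDefault t a s
applyS _ (TC c)    = TC c
applyS s (a :-> b) = applyS s a :-> applyS s b

-- Mirrors compose: map s1 over s2's range, prefer s1 on clashes.
composeS :: Sub -> Sub -> Sub
composeS s1 s2 = Map.map (applyS s1) s2 `Map.union` s1

solve :: Sub -> [(Ty, Ty)] -> Either String Sub
solve su []               = Right su
solve su ((t1, t2) : cs) = do
  s1 <- mgu t1 t2
  solve (composeS s1 su) [ (applyS s1 a, applyS s1 b) | (a, b) <- cs ]

mgu :: Ty -> Ty -> Either String Sub
mgu t1 t2 | t1 == t2    = Right Map.empty
mgu (TV a) t            = bindV a t
mgu t (TV a)            = bindV a t
mgu (a :-> b) (c :-> d) = solve Map.empty [(a, c), (b, d)]
mgu t1 t2 = Left ("Cannot unify " ++ show t1 ++ " with " ++ show t2)

bindV :: String -> Ty -> Either String Sub
bindV a t
  | t == TV a  = Right Map.empty
  | occurs a t = Left ("Infinite type: " ++ a)  -- occurs check
  | otherwise  = Right (Map.singleton a t)

occurs :: String -> Ty -> Bool
occurs a (TV b)    = a == b
occurs _ (TC _)    = False
occurs a (x :-> y) = occurs a x || occurs a y
```

Solving `a -> Int ~ Bool -> b` yields the substitution `{a := Bool, b := Int}`, while `a ~ a -> Int` trips the occurs check, the analogue of `InfiniteType` above.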


@ -0,0 +1,73 @@
module Lexer where
import Text.Parsec
import Text.Parsec.Text.Lazy
import qualified Data.Text.Lazy as L
import qualified Text.Parsec.Token as Tok
import qualified Text.Parsec.Expr as Ex
import Data.Functor.Identity
type Op a = Ex.Operator L.Text () Identity a
type Operators a = Ex.OperatorTable L.Text () Identity a
reservedNames :: [String]
reservedNames = [
"let",
"in",
"fix",
"rec",
"if",
"then",
"else"
]
reservedOps :: [String]
reservedOps = [
"->",
"\\",
"+",
"*",
"-",
"="
]
lexer :: Tok.GenTokenParser L.Text () Identity
lexer = Tok.makeTokenParser $ Tok.LanguageDef
{ Tok.commentStart = "{-"
, Tok.commentEnd = "-}"
, Tok.commentLine = "--"
, Tok.nestedComments = True
, Tok.identStart = letter
, Tok.identLetter = alphaNum <|> oneOf "_'"
, Tok.opStart = oneOf ":!#$%&*+./<=>?@\\^|-~"
, Tok.opLetter = oneOf ":!#$%&*+./<=>?@\\^|-~"
, Tok.reservedNames = reservedNames
, Tok.reservedOpNames = reservedOps
, Tok.caseSensitive = True
}
reserved :: String -> Parser ()
reserved = Tok.reserved lexer
reservedOp :: String -> Parser ()
reservedOp = Tok.reservedOp lexer
identifier :: Parser String
identifier = Tok.identifier lexer
parens :: Parser a -> Parser a
parens = Tok.parens lexer
semiSep :: Parser a -> Parser [a]
semiSep = Tok.semiSep lexer
semi :: Parser String
semi = Tok.semi lexer
contents :: Parser a -> Parser a
contents p = do
Tok.whiteSpace lexer
r <- p
eof
return r

chapter9/provenance/Main.hs Normal file

@ -0,0 +1,149 @@
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Syntax
import Infer
import Parser
import Pretty
import Data.Monoid
import qualified Data.Map as Map
import qualified Data.Text.Lazy as L
import qualified Data.Text.Lazy.IO as L
import Control.Monad.Identity
import Control.Monad.State.Strict
import Data.List (isPrefixOf)
import System.Exit
import System.Environment
import System.Console.Repline
-------------------------------------------------------------------------------
-- Types
-------------------------------------------------------------------------------
data IState = IState
{ tyctx :: Env -- Type environment
}
initState :: IState
initState = IState []
type Repl a = HaskelineT (StateT IState IO) a
hoistErr :: Show e => Either e a -> Repl a
hoistErr (Right val) = return val
hoistErr (Left err) = do
liftIO $ print err
abort
-------------------------------------------------------------------------------
-- Execution
-------------------------------------------------------------------------------
exec :: Bool -> L.Text -> Repl ()
exec update source = do
-- Get the current interpreter state
st <- get
-- Parser ( returns AST )
mod <- hoistErr $ parseExpr source
-- Type Inference ( returns Typing Environment )
tyctx' <- hoistErr $ inferTop (tyctx st) mod
liftIO $ putStrLn (pptype tyctx')
return ()
showOutput :: String -> IState -> Repl ()
showOutput arg st = do
case lookup "it" (tyctx st) of
Just val -> liftIO $ putStrLn $ ppsignature (arg, val)
Nothing -> return ()
cmd :: String -> Repl ()
cmd source = exec True (L.pack source)
-------------------------------------------------------------------------------
-- Commands
-------------------------------------------------------------------------------
-- :browse command
browse :: [String] -> Repl ()
browse _ = do
st <- get
undefined
{-liftIO $ mapM_ putStrLn $ ppenv (tyctx st)-}
-- :load command
load :: [String] -> Repl ()
load args = do
contents <- liftIO $ L.readFile (unwords args)
exec True contents
-- :type command
typeof :: [String] -> Repl ()
typeof args = do
st <- get
let arg = unwords args
case lookup arg (tyctx st) of
Just val -> liftIO $ putStrLn $ ppsignature (arg, val)
Nothing -> exec False (L.pack arg)
-- :quit command
quit :: a -> Repl ()
quit _ = liftIO $ exitSuccess
-------------------------------------------------------------------------------
-- Interactive Shell
-------------------------------------------------------------------------------
-- Prefix tab completer
defaultMatcher :: MonadIO m => [(String, CompletionFunc m)]
defaultMatcher = [
(":load" , fileCompleter)
--, (":type" , values)
]
-- Default tab completer
comp :: (Monad m, MonadState IState m) => WordCompleter m
comp n = do
let cmds = [":load", ":type", ":browse", ":quit"]
ctx <- gets tyctx
let defs = fmap fst ctx
return $ filter (isPrefixOf n) (cmds ++ defs)
options :: [(String, [String] -> Repl ())]
options = [
("load" , load)
, ("browse" , browse)
, ("quit" , quit)
, ("type" , Main.typeof)
]
-------------------------------------------------------------------------------
-- Entry Point
-------------------------------------------------------------------------------
completer :: CompleterStyle (StateT IState IO)
completer = Prefix (wordCompleter comp) defaultMatcher
shell :: Repl a -> IO ()
shell pre = flip evalStateT initState
$ evalRepl "Poly> " cmd options completer pre
-------------------------------------------------------------------------------
-- Toplevel
-------------------------------------------------------------------------------
main :: IO ()
main = do
args <- getArgs
case args of
[] -> shell (return ())
[fname] -> shell (load [fname])
["test", fname] -> shell (load [fname] >> browse [] >> quit ())
_ -> putStrLn "invalid arguments"


@ -0,0 +1,77 @@
{-# LANGUAGE OverloadedStrings #-}
module Parser (
parseExpr,
parseModule
) where
import Text.Parsec
import Text.Parsec.Text.Lazy (Parser)
import qualified Text.Parsec.Expr as Ex
import qualified Text.Parsec.Token as Tok
import qualified Data.Text.Lazy as L
import Lexer
import Syntax
import Control.Applicative ((<$>))
integer :: Parser Integer
integer = Tok.integer lexer
variable :: Parser Expr
variable = do
x <- identifier
l <- sourceLine <$> getPosition
return (Var (Located l) x)
number :: Parser Expr
number = do
n <- integer
l <- sourceLine <$> getPosition
return (Lit (Located l) (fromIntegral n))
lambda :: Parser Expr
lambda = do
reservedOp "\\"
args <- many identifier
reservedOp "->"
body <- expr
l <- sourceLine <$> getPosition
return $ foldr (Lam (Located l)) body args
aexp :: Parser Expr
aexp = parens expr
<|> lambda
<|> number
<|> variable
expr :: Parser Expr
expr = do
es <- many1 aexp
l <- sourceLine <$> getPosition
return (foldl1 (App (Located l)) es)
type Binding = (String, Expr)
val :: Parser Binding
val = do
ex <- expr
return ("it", ex)
top :: Parser Binding
top = do
x <- val
optional semi
return x
modl :: Parser [Binding]
modl = many top
parseExpr :: L.Text -> Either ParseError Expr
parseExpr input = parse (contents expr) "<stdin>" input
parseModule :: FilePath -> L.Text -> Either ParseError [(String, Expr)]
parseModule fname input = parse (contents modl) fname input
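Two folds do the AST shaping here: `lambda` turns a multi-argument binder into nested single-argument lambdas with `foldr`, and `expr` turns a run of atomic expressions into left-nested applications with `foldl1`. A self-contained sketch of both (mirroring the chapter's `Expr`/`Loc`, with `NoLoc` standing in for the parsed position):

```haskell
-- Sketch of the two folds used by the parser: foldr nests lambdas to
-- the right, foldl1 nests application to the left. NoLoc stands in
-- for the source position the real parser records.
data Loc = NoLoc | Located Int deriving (Eq, Show)

data Expr
  = Var Loc String
  | App Loc Expr Expr
  | Lam Loc String Expr
  deriving (Eq, Show)

-- "\x y -> body"  becomes  Lam "x" (Lam "y" body)
curried :: [String] -> Expr -> Expr
curried args body = foldr (Lam NoLoc) body args

-- "f x y"  becomes  App (App f x) y
applied :: [Expr] -> Expr
applied = foldl1 (App NoLoc)
```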


@ -0,0 +1,82 @@
{-# Language FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
{-# Language TypeSynonymInstances #-}
module Pretty (
ppdecl,
ppexpr,
ppsignature,
pptype
) where
import Type
import Syntax
import Infer
import Text.PrettyPrint
parensIf :: Bool -> Doc -> Doc
parensIf True = parens
parensIf False = id
class Pretty p where
ppr :: Int -> p -> Doc
pp :: p -> Doc
pp = ppr 0
instance Pretty Name where
ppr _ x = text x
instance Pretty TVar where
ppr _ (TV x) = text x
instance Pretty Type where
ppr p (TArr _ a b) = (parensIf (isArrow a) (ppr p a)) <+> text "->" <+> ppr p b
where
isArrow TArr{} = True
isArrow _ = False
ppr p (TVar _ a) = ppr p a
ppr _ (TCon _ a) = text a
instance Pretty Expr where
ppr p (Var _ a) = ppr p a
ppr p (App _ a b) = parensIf (p > 0) $ ppr (p+1) a <+> ppr p b
ppr p (Lam _ a b) = text "\\" <> ppr p a <+> text "->" <+> ppr p b
ppr _ (Lit _ a) = int a
instance Pretty Loc where
ppr p (NoLoc) = ""
ppr p (Located n) = int n
instance Show TypeError where
show (UnificationFail a la b lb) =
concat [
"Cannot unify types: \n\t"
, pptype a
, "\n\tIntroduced at: "
, (pploc la)
, "\nwith \n\t"
, pptype b
, "\n\tIntroduced at: "
, (pploc lb)
]
show (InfiniteType (TV a) b) =
concat ["Cannot construct the infinite type: ", a, " = ", pptype b]
show (Ambigious cs) =
concat ["Cannot match expected type: '" ++ pptype a ++ "' with actual type: '" ++ pptype b ++ "'\n" | (a,b) <- cs]
show (UnboundVariable a) = "Not in scope: " ++ a
pploc :: Loc -> String
pploc = render . ppr 0
pptype :: Type -> String
pptype = render . ppr 0
ppexpr :: Expr -> String
ppexpr = render . ppr 0
ppsignature :: (String, Type) -> String
ppsignature (a, b) = a ++ " : " ++ pptype b
ppdecl :: (String, Expr) -> String
ppdecl (a, b) = "let " ++ a ++ " = " ++ ppexpr b
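The `parensIf`/`isArrow` pair in the `Type` instance is what keeps arrows printing right-associatively: `Int -> Int -> Int` needs no parentheses, while an arrow in argument position, `(Int -> Int) -> Int`, does. A base-only sketch of the same rule on plain `String`s (hypothetical minimal `Ty`, no `Text.PrettyPrint`):

```haskell
-- Base-only sketch of the arrow parenthesization rule: parenthesize
-- the domain of an arrow only when it is itself an arrow.
data Ty = TCon String | TArr Ty Ty deriving (Eq, Show)

parensIf :: Bool -> String -> String
parensIf True s  = "(" ++ s ++ ")"
parensIf False s = s

ppty :: Ty -> String
ppty (TCon c)   = c
ppty (TArr a b) = parensIf (isArrow a) (ppty a) ++ " -> " ++ ppty b
  where
    isArrow TArr{} = True
    isArrow _      = False
```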


@ -0,0 +1,15 @@
module Syntax (
Expr(..),
Name,
Loc(..),
) where
type Name = String
data Loc = NoLoc | Located Int
deriving (Show, Eq, Ord)
data Expr
= Var Loc Name
| App Loc Expr Expr
| Lam Loc Name Expr
| Lit Loc Int


@ -0,0 +1,15 @@
module Type (
Type(..),
TVar(..),
) where
import Syntax (Loc, Name)
data Type
= TVar Loc TVar
| TCon Loc Name
| TArr Loc Type Type
deriving (Show, Eq, Ord)
newtype TVar = TV String
deriving (Show, Eq, Ord)


@ -1,3 +1,6 @@
Contributing
============
As always, I rely on the perpetual kindness and goodwill of Haskellers (like
you!) to help correct grammar, clarify, and fix errors.
@ -11,6 +14,10 @@ repo and then submit a pull request on Github. There should be no need to
compile the text locally. I will try to merge the changes quickly and rebuild
the text daily.
If you would like to add your name to
[CONTRIBUTORS.md](https://github.com/sdiehl/write-you-a-haskell/blob/master/CONTRIBUTORS.md)
submit this along with your pull request.
*Complex Fixes*
If you'd like to submit a change to the publishing software around the text,
@ -59,6 +66,28 @@ syntax highlighting.
*Math Typesetting*
Equations can be included in display form:
```latex
$$
\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega
$$
```
$$
\int_\Omega \mathrm{d}\omega = \oint_{\partial \Omega} \omega
$$
Or in inline form (like $a^2 + b^2 = c^2$) with single dollar signs. Note that
there must be no spaces around the dollar signs, otherwise Pandoc will not
parse it properly.
```latex
$a^2 + b^2 = c^2$
```
For most definitions, the ``aligned`` block is used:
```latex
$$
\begin{aligned}
@ -105,7 +134,7 @@ Typography
LaTeX
-----
The $\latex$ styling is sourced from the ``template.latex`` file, which is an
The LaTeX styling is sourced from the ``template.latex`` file, which is an
extension of Pandoc's default template with some custom modifications.
Images


@ -23,6 +23,18 @@ pre code {
font: 15px/19px Inconsolata, Monaco,"Lucida Console",Terminal,"Courier New",Courier;
}
.figure {
text-align: center;
}
.pagetitle .figure {
text-align: left !important;
}
.pagetitle .figure img {
height: 36px;
}
table th {
border-right: 1em solid transparent;
}

img/Haskell-Logo.ps Normal file

@ -0,0 +1,96 @@
%!PS-Adobe-3.0
%%Creator: cairo 1.10.2 (http://cairographics.org)
%%CreationDate: Wed Jan 14 11:07:43 2015
%%Pages: 1
%%BoundingBox: 0 0 490 349
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%DocumentMedia: 173x123mm 490 348 0 () ()
%%EndComments
%%BeginProlog
/languagelevel where
{ pop languagelevel } { 1 } ifelse
2 lt { /Helvetica findfont 12 scalefont setfont 50 500 moveto
(This print job requires a PostScript Language Level 2 printer.) show
showpage quit } if
/q { gsave } bind def
/Q { grestore } bind def
/cm { 6 array astore concat } bind def
/w { setlinewidth } bind def
/J { setlinecap } bind def
/j { setlinejoin } bind def
/M { setmiterlimit } bind def
/d { setdash } bind def
/m { moveto } bind def
/l { lineto } bind def
/c { curveto } bind def
/h { closepath } bind def
/re { exch dup neg 3 1 roll 5 3 roll moveto 0 rlineto
0 exch rlineto 0 rlineto closepath } bind def
/S { stroke } bind def
/f { fill } bind def
/f* { eofill } bind def
/n { newpath } bind def
/W { clip } bind def
/W* { eoclip } bind def
/BT { } bind def
/ET { } bind def
/pdfmark where { pop globaldict /?pdfmark /exec load put }
{ globaldict begin /?pdfmark /pop load def /pdfmark
/cleartomark load def end } ifelse
/BDC { mark 3 1 roll /BDC pdfmark } bind def
/EMC { mark /EMC pdfmark } bind def
/cairo_store_point { /cairo_point_y exch def /cairo_point_x exch def } def
/Tj { show currentpoint cairo_store_point } bind def
/TJ {
{
dup
type /stringtype eq
{ show } { -0.001 mul 0 cairo_font_matrix dtransform rmoveto } ifelse
} forall
currentpoint cairo_store_point
} bind def
/cairo_selectfont { cairo_font_matrix aload pop pop pop 0 0 6 array astore
cairo_font exch selectfont cairo_point_x cairo_point_y moveto } bind def
/Tf { pop /cairo_font exch def /cairo_font_matrix where
{ pop cairo_selectfont } if } bind def
/Td { matrix translate cairo_font_matrix matrix concatmatrix dup
/cairo_font_matrix exch def dup 4 get exch 5 get cairo_store_point
/cairo_font where { pop cairo_selectfont } if } bind def
/Tm { 2 copy 8 2 roll 6 array astore /cairo_font_matrix exch def
cairo_store_point /cairo_font where { pop cairo_selectfont } if } bind def
/g { setgray } bind def
/rg { setrgbcolor } bind def
/d1 { setcachedevice } bind def
%%EndProlog
%%Page: 1 1
%%BeginPageSetup
%%PageMedia: 173x123mm
%%PageBoundingBox: 0 0 490 349
%%EndPageSetup
q 0 0 490 349 rectclip q
1 g
8 w
0 J
1 j
[] 0.0 d
4 M q 1 0 0 -1 0 348.156311 cm
4.016 344.156 m 117.402 174.078 l 4.016 4 l 89.055 4 l 202.441 174.078
l 89.055 344.156 l h
4.016 344.156 m S Q
q 1 0 0 -1 0 348.156311 cm
117.402 344.156 m 230.789 174.078 l 117.402 4 l 202.441 4 l 429.211
344.156 l 344.172 344.156 l 273.309 237.859 l 202.441 344.156 l h
117.402 344.156 m S Q
q 1 0 0 -1 0 348.156311 cm
391.418 244.945 m 353.625 188.254 l 485.906 188.25 l 485.906 244.945 l
h
391.418 244.945 m S Q
q 1 0 0 -1 0 348.156311 cm
334.727 159.906 m 296.93 103.215 l 485.906 103.211 l 485.906 159.906 l
h
334.727 159.906 m S Q
Q Q
showpage
%%Trailer
%%EOF


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 300 ];
rankdir=LR
node [shape=box]
Source -> Parsing -> Desugar -> "Type Checking" -> Transformation -> Compilation


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 300 ];
rankdir=LR
node [shape=box]
Frontend -> "Core Language" -> "Compiler IR" -> "Machine Code"


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 300 ];
rankdir=LR
node [shape=box]
Source -> Parsing -> Desugar -> Inference -> Transformation -> Compilation


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 300 ];
rankdir=LR
node [shape=box]
Parse -> Rename -> Typecheck -> Desugar -> ToCore -> Evaluate


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 300 ];
rankdir=LR
node [shape=box]
Source -> Frontend -> Core -> PHOAS


@ -1,4 +1,5 @@
digraph G {
graph [ dpi = 72 ];
rankdir=TB
node [shape=box]
WriterT -> IO [label = " execWriterT"]

img/titles/llvm.png Normal file

img/titles/systemf.png Normal file

@ -1,5 +1,16 @@
\documentclass[$if(fontsize)$$fontsize$,$endif$$if(lang)$$lang$,$endif$$if(papersize)$$papersize$,$endif$$for(classoption)$$classoption$$sep$,$endfor$]{$documentclass$}
\usepackage{geometry}
\usepackage{xcolor}
\usepackage{graphicx}
\usepackage[labelformat=empty]{caption}
\usepackage{afterpage}
\newcommand\blankpage{%
\null
\thispagestyle{empty}%
\addtocounter{page}{-1}%
\newpage}
$if(fontfamily)$
\usepackage{$fontfamily$}
$else$
@ -34,11 +45,20 @@ $if(sansfont)$
\setsansfont{$sansfont$}
$endif$
$if(monofont)$
\setmonofont[Mapping=tex-ansi]{$monofont$}
%\setmonofont[Mapping=tex-ansi]{$monofont$}
% custom override
$endif$
$if(mathfont)$
\setmathfont(Digits,Latin,Greek){$mathfont$}
$endif$
\usepackage{fontspec}
\setmainfont[Ligatures=Common,
ItalicFont={Adobe Garamond Pro Italic}]
{Adobe Garamond Pro}
\setmonofont[Ligatures=NoCommon]{Source Code Pro}
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
@ -106,7 +126,6 @@ $if(graphics)$
\fi
}
\makeatother
\setkeys{Gin}{width=\ScaleWidthIfNeeded,height=\ScaleHeightIfNeeded,keepaspectratio}%
$endif$
\ifxetex
@ -160,11 +179,88 @@ $endfor$
\begin{document}
$if(title)$
% ----------
% Title page
% ----------
\begin{titlepage}
\definecolor{titlepagecolor}{cmyk}{1,.60,0,.40}
\definecolor{namecolor}{cmyk}{1,.50,0,.10}
\newgeometry{left=7.5cm} %defines the geometry for the titlepage
\pagecolor{titlepagecolor}
\begin{figure}
\centering
\includegraphics[width=4cm]{img/Haskell-Logo.png}
\includegraphics[height=3in,width=3in]{img/Haskell-Logo.ps}
\end{figure}
\maketitle
\color{white}
\makebox[0pt][l]{\rule{1.3\textwidth}{1pt}}
\par
\noindent
{\huge \textsf{Write You a Haskell}}
\par
\noindent
{\textit{\textsf{Building a modern functional compiler from first principles}}}
%\textbf{\textsf{Something}} \textcolor{namecolor}{\textsf{Else}}
\vfill
\noindent
%{\huge \textsf{Write You a Haskell}}
\vskip\baselineskip
\noindent
{\huge \textsf{Stephen Diehl}}
\\
\textsf{January 2015 (Draft)}
\end{titlepage}
\pagecolor{white}
\restoregeometry % restores the geometry
% Filler page
%\null
%\thispagestyle{empty}
%\addtocounter{page}{-1}
%\newpage
% Subtitle page
%\vfill{3in}
%\begin{centering}
%{\HUGE \textsf{Write You a Haskell}}
%\end{centering}
%\thispagestyle{empty}
%\addtocounter{page}{-1}
%\newpage
% Copyright page
\thispagestyle{empty}
\addtocounter{page}{-1}
\begin{minipage}[b]{0.9\textwidth}
\footnotesize\raggedright
\setlength{\parskip}{0.5\baselineskip}
\begin{textsf}
{\textbf{Write You a Haskell}}\\
by Stephen Diehl
\par
Copyright \copyright\ 2013-2015. \\
\href{http://www.stephendiehl.com}{www.stephendiehl.com}
\par
This written work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License. You may
reproduce and edit this work with attribution for all non-commercial purposes.
\par
The included source is released under the terms of the MIT License.
\par
Git commit: f09d210be253a05fc8ad0827cd72ffa32404e2ba
\end{textsf}
\end{minipage}
\vspace*{2\baselineskip}
\cleardoublepage
$endif$
$if(abstract)$
\begin{abstract}

Some files were not shown because too many files have changed in this diff.